Hey Trivedi,
The comparison between PostgreSQL and Dgraph for handling hierarchical data, as in a multi-level comments section, is fair to an extent, but specific system factors must be taken into consideration. Both databases have strengths in different areas due to their underlying structures and use cases.
The observed performance difference could come from several things: data storage, compute design and indexing, the queries themselves, hardware setup, and Docker resource limits. On non-Linux systems those limits exist no matter what you do. Docker will never run "natively" on macOS: containers need a Linux kernel, so on a Mac they always run inside a Linux VM, and only on Linux do they run directly against the host kernel. The operating system and hardware matter here too. You're on an M1 Mac, and that can affect the results. Dgraph doesn't have a macOS build, and your container is likely not running the recent (and not yet fully tested) ARM build. If that's the case, the image is being emulated on top of the VM, which adds further overhead. PostgreSQL, on the other hand, supports Apple ARM. Since you ran Dgraph in a Docker container, PostgreSQL should be run the same way for the comparison to be fair.
Furthermore, when dealing with more than 1 million nodes, it’s beneficial to have additional memory. Dgraph would need to expand its use of RAM for larger queries, which could also affect performance.
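Once both databases are running in containers with the same CPU and memory limits, the measurement itself is worth doing carefully too. Here is a rough sketch in Python of how I would time each query, with a few warm-up runs so that cold caches don't skew the numbers. The names `time_query` and `summarize` are made up for this example, and the stand-in callable at the bottom is a placeholder: you would swap in a function that runs your actual comments query against PostgreSQL or Dgraph through whatever client you use.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_query(run_query: Callable[[], object],
               warmup: int = 3,
               repeats: int = 10) -> List[float]:
    """Run `run_query` `warmup` times unmeasured, then return the
    wall-clock duration in seconds of each of `repeats` measured runs."""
    for _ in range(warmup):
        run_query()  # warm caches and connections; not measured
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Report min, median, and max, so one slow outlier does not
    decide the comparison."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    # Placeholder workload (assumption): replace with a function that
    # executes the real query against one of the two databases.
    def stand_in_query() -> int:
        return sum(range(100_000))

    print(summarize(time_query(stand_in_query)))
```

Comparing the medians from the same harness, on the same machine, with the same container limits, should tell you more than a single run of each.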
Even though PostgreSQL performed better in this specific task, the choice of database should also take into account ease of use, community support, project maturity (which is a win for Postgres), and specific features. Tuning and correct setup matter just as much when comparing different databases.
It is nice to see comparisons like this done under fair conditions, since they not only give valuable insight into the performance characteristics of different systems but also pave the way for improvements. Keep in mind, though, that the suitability of a database depends on the specific use case, and performance is just one of many factors.
Since you're probably just evaluating the product rather than building anything serious yet, maybe you can call Postgres the winner for your case. If you're comparing a graph database with Postgres purely on performance, you're not really interested in graph characteristics, right? Otherwise you would be comparing against Neo4j, TigerGraph, or even RedisGraph.
Cheers.