Sparse adjacency matrices instead of posting lists?

the-alchemist · November 12, 2018, 10:36pm

Based on some reading and going through the code, looks like Dgraph uses posting lists/adjacency lists. It appears this vocabulary comes from Facebook?

We’re interested in doing some research to see if Dgraph would benefit from sparse adjacency matrices, like RedisGraph. Has anyone tried this out, or even built a proof of concept? Is this even worth trying?
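To make the comparison concrete, here is a minimal sketch of the two layouts in plain Python. It assumes a toy graph with small integer uids; the helper names (`build_posting_lists`, `build_csr`, `csr_neighbors`) are hypothetical and are not taken from Dgraph or RedisGraph. The sparse matrix is shown in compressed sparse row (CSR) form, which is only one of several possible sparse formats.

```python
from collections import defaultdict

# Hypothetical toy graph: (source uid, predicate, target uid) triples.
edges = [(1, "friend", 2), (1, "friend", 3), (2, "friend", 3), (3, "follows", 1)]


def build_posting_lists(edges):
    """Map (predicate, source uid) -> sorted list of target uids."""
    lists = defaultdict(list)
    for src, pred, dst in edges:
        lists[(pred, src)].append(dst)
    return {key: sorted(uids) for key, uids in lists.items()}


def build_csr(edges, pred, n):
    """Build a CSR adjacency matrix (row_ptr, col_idx) for one predicate.

    Rows and columns are uids 0..n-1; every stored entry is implicitly 1.
    """
    rows = [[] for _ in range(n)]
    for src, p, dst in edges:
        if p == pred:
            rows[src].append(dst)
    row_ptr, col_idx = [0], []
    for targets in rows:
        col_idx.extend(sorted(targets))
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx


def csr_neighbors(row_ptr, col_idx, src):
    """Return the column indices stored in row `src`."""
    return col_idx[row_ptr[src]:row_ptr[src + 1]]


posting = build_posting_lists(edges)
row_ptr, col_idx = build_csr(edges, "friend", n=4)
```

In this sketch, one row of the CSR matrix holds the same uids as one posting list. The difference is that the matrix packs all rows of a predicate into two shared arrays, while each posting list is stored under its own key.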

Any ideas appreciated, thanks!

mrjn · November 13, 2018, 6:18pm

I don’t know too much about the storage concepts behind matrices, but here’s my take, so correct me if I’m wrong.

RedisGraph has the design benefit of being served completely out of memory. Dgraph, being a database, cannot make that assumption. From my understanding, adjacency matrices would need to be stored completely in memory to be served. Also, updates would require regenerating the entire matrix.
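A rough illustration of the update concern, assuming the matrix is held in a compressed format such as CSR (other sparse formats behave differently, and this says nothing about how RedisGraph actually handles updates). The functions `csr_insert` and `posting_insert` are hypothetical helpers written for this sketch.

```python
import bisect


def csr_insert(row_ptr, col_idx, src, dst):
    """Insert edge src -> dst into a CSR matrix in place.

    Returns the number of row_ptr entries that had to be rewritten.
    """
    start, end = row_ptr[src], row_ptr[src + 1]
    pos = bisect.bisect_left(col_idx, dst, start, end)
    if pos < end and col_idx[pos] == dst:
        return 0  # edge already present, nothing to do
    col_idx.insert(pos, dst)  # shifts every later column index
    for r in range(src + 1, len(row_ptr)):
        row_ptr[r] += 1  # every later row offset moves by one
    return len(row_ptr) - (src + 1)


def posting_insert(posting, key, dst):
    """Insert dst into the single posting list stored under `key`."""
    uids = posting.setdefault(key, [])
    pos = bisect.bisect_left(uids, dst)
    if pos == len(uids) or uids[pos] != dst:
        uids.insert(pos, dst)


# CSR form of the edges 1->2, 1->3, 2->3 over uids 0..3.
row_ptr, col_idx = [0, 0, 2, 3, 3], [2, 3, 3]
touched = csr_insert(row_ptr, col_idx, 0, 1)

posting = {("friend", 1): [2, 3]}
posting_insert(posting, ("friend", 0), 1)
```

Under these assumptions a single new edge in row 0 rewrites every later row offset, whereas the posting-list insert only touches the one list it belongs to.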

One more thing: Dgraph also supports facets, which get incorporated into the posting lists. Matrices don’t allow for that.

the-alchemist · November 13, 2018, 7:05pm

Thank you, Manish, for the quick reply!

I think you’re right: RedisGraph stores everything in-memory.

However, I think there’s still some opportunity for research. I found a variety of papers/code, in R and C++, for efficient computation of matrices on external storage:

Perhaps random access could be enabled with a good on-disk matrix format and roaring bitmaps? (Roaring bitmaps support efficient random access.)
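As a sketch of why random access is cheap in a roaring-style layout: values are split into chunks by their high 16 bits, so a membership test is two binary searches. The class below is a simplified, hypothetical `TinyRoaring` for 32-bit uids that uses sorted-array containers only. Real roaring implementations also switch to bitmap containers for dense chunks, which this sketch omits.

```python
import bisect


class TinyRoaring:
    """Simplified roaring-style set of 32-bit uids (array containers only)."""

    def __init__(self):
        self.keys = []        # sorted high-16-bit chunk keys
        self.containers = []  # one sorted list of low-16-bit values per key

    def add(self, uid):
        hi, lo = uid >> 16, uid & 0xFFFF
        i = bisect.bisect_left(self.keys, hi)
        if i == len(self.keys) or self.keys[i] != hi:
            self.keys.insert(i, hi)
            self.containers.insert(i, [])
        chunk = self.containers[i]
        j = bisect.bisect_left(chunk, lo)
        if j == len(chunk) or chunk[j] != lo:
            chunk.insert(j, lo)

    def __contains__(self, uid):
        hi, lo = uid >> 16, uid & 0xFFFF
        i = bisect.bisect_left(self.keys, hi)   # find the chunk
        if i == len(self.keys) or self.keys[i] != hi:
            return False
        chunk = self.containers[i]
        j = bisect.bisect_left(chunk, lo)       # find the value in the chunk
        return j < len(chunk) and chunk[j] == lo


row = TinyRoaring()  # could stand for one matrix row or one posting list
for uid in (5, 70000, 5):
    row.add(uid)
```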

Our long-term goal is to enable integration of GraphBLAS, which would enable GPU-accelerated operations in Dgraph.
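For context on what a GraphBLAS-style operation looks like: breadth-first search can be written as repeated products of a frontier vector with the adjacency matrix over a boolean (OR, AND) semiring, masked by the set of unvisited vertices. The sketch below spells that out in plain Python over CSR arrays. It does not call any GraphBLAS library, and `bfs_levels` is a hypothetical name.

```python
def bfs_levels(row_ptr, col_idx, source):
    """Return the BFS level of every vertex, or -1 if unreachable.

    Each loop iteration computes next = (frontier x A) masked by the
    unvisited vertices, which is one boolean vector-matrix product.
    """
    n = len(row_ptr) - 1
    level = [-1] * n
    frontier = {source}
    depth = 0
    while frontier:
        for v in frontier:
            level[v] = depth
        nxt = set()
        for v in frontier:                      # nonzeros of the frontier
            for u in col_idx[row_ptr[v]:row_ptr[v + 1]]:
                if level[u] == -1:              # mask: keep unvisited only
                    nxt.add(u)
        frontier = nxt
        depth += 1
    return level


# CSR form of the edges 1->2, 1->3, 2->3 over uids 0..3.
levels = bfs_levels([0, 0, 2, 3, 3], [2, 3, 3], source=1)
```

Expressing traversal this way is what lets a matrix library process a whole frontier at once, which is the part that could in principle be offloaded to a GPU.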

Could you point us to where in the code we should focus? We’re looking at dgraph/algo/ and dgraph/posting/.

Perhaps facets could be represented as “special” edges?

P.S. I’d post more URLs but the forum doesn’t allow new users to post >2 links, heh.

mrjn · November 13, 2018, 11:29pm

Many packages are involved in running queries. The Posting, Worker and Query packages are the main ones.
