In-memory subgraph loading

Typically, if the graph is huge, say at internet scale, we need what a user owns or is directly connected to far more often than distant connections between nodes that are not directly related.

In my case, each user owns a certain set of nodes with directed edges (all with the same predicate) between them. Some of these nodes will then be connected to nodes of other users (via different predicates).

My use case is that each user logs in and mostly works with his own nodes (vertices). At times he may decide to connect to other users’ nodes, but compared to the time spent working on his own nodes, the effort spent connecting to others’ nodes is low. The only heavy work is figuring out what work from a user’s nodes needs to be transmitted to other users’ nodes.

So my thinking is: if Dgraph could be ‘hinted’ somehow that a certain subgraph (based on certain criteria) should be loaded into memory, for example into ElastiCache, that could speed up work substantially.
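To make the idea a bit more concrete, here is a rough sketch in plain Python of what such a hint might select. None of this is Dgraph API: the edge list, the `OWNER` map, `subgraph_for`, and `SubgraphCache` are all made-up names, and the cache is just a dict standing in for something like ElastiCache. The criterion assumed here is simply "nodes owned by this user".

```python
# Hypothetical data: (source, predicate, target) triples. All names are made up.
EDGES = [
    ("a1", "next", "a2"),
    ("a2", "next", "a3"),
    ("b1", "next", "b2"),
    ("a3", "shared_with", "b1"),  # cross-user edge with a different predicate
]
OWNER = {"a1": "alice", "a2": "alice", "a3": "alice", "b1": "bob", "b2": "bob"}


def subgraph_for(user, edges, owner):
    """Return (nodes, internal_edges, boundary_edges) for one user.

    internal edges connect two nodes of the same user;
    boundary edges touch exactly one of the user's nodes.
    """
    nodes = {node for node, who in owner.items() if who == user}
    internal, boundary = [], []
    for src, pred, dst in edges:
        if src in nodes and dst in nodes:
            internal.append((src, pred, dst))
        elif src in nodes or dst in nodes:
            boundary.append((src, pred, dst))
    return nodes, internal, boundary


class SubgraphCache:
    """Stand-in for an external cache; a plain dict keyed by user."""

    def __init__(self):
        self._store = {}

    def preload(self, user, edges, owner):
        # The 'hint': compute the user's subgraph once, e.g. at login.
        self._store[user] = subgraph_for(user, edges, owner)

    def get(self, user):
        # Returns None if the user's subgraph was never preloaded.
        return self._store.get(user)
```

Calling `preload("alice", EDGES, OWNER)` at login would keep alice's three nodes and the two `next` edges between them in memory, with the single `shared_with` edge kept separately as the boundary to bob's nodes.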

It would certainly help me, but I also see this being helpful to other users building on Dgraph. They may wish to examine queries that are frequently made against a subgraph and see whether those queries can be converted into ‘hints’ to Dgraph. I understand that the initial loading may sometimes be very resource-consuming, but it may help in the long term.

Does that make sense to anyone?


Any such mechanisms will come at the cost of complexity. And if something is adding complexity, its benefits should be very clear, and the use cases should be common. I don’t see that to be the case here.

Dgraph does load posting lists (PLs) into memory on request, so any successive queries could run entirely in RAM. If you have allocated enough RAM, those PLs would stay in memory forever. Alternatively, you could put Dgraph on tmpfs, which should ensure all data is in RAM and Dgraph never hits the disk.
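As a toy illustration of the load-on-request behaviour described above (not Dgraph's actual implementation), the sketch below caches posting lists on first access and serves later reads from memory. The class name, the `read_from_disk` callback, the least-recently-used eviction policy, and counting capacity in entries rather than bytes are all assumptions made for the example.

```python
from collections import OrderedDict


class PostingListCache:
    """Load-on-request cache with LRU eviction (illustrative only)."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity              # max number of cached lists
        self.read_from_disk = read_from_disk  # hypothetical slow-path loader
        self.entries = OrderedDict()
        self.disk_reads = 0

    def get(self, key):
        if key in self.entries:
            # Hit: served from RAM; mark as most recently used.
            self.entries.move_to_end(key)
            return self.entries[key]
        # Miss: go to disk once, then keep the list in memory.
        self.disk_reads += 1
        value = self.read_from_disk(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Over capacity: drop the least recently used entry.
            self.entries.popitem(last=False)
        return value
```

With a capacity at least as large as the working set, each posting list is read from disk exactly once and never evicted, which is the "enough RAM" case. With too little capacity, the same access pattern keeps falling back to disk.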

In any case, the first step is to actually run Dgraph with this data and see whether you witness any performance issues.
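One simple way to check is to time the same query several times and compare the first (cold) run against the later (warm) ones. The helper below is a minimal sketch. `measure` and `run_query` are hypothetical names, and `run_query` is assumed to be any zero-argument callable that issues your query through whatever client you use.

```python
import statistics
import time


def measure(run_query, repeats=5):
    """Time run_query `repeats` times; report the first run and the median of the rest."""
    if repeats < 2:
        raise ValueError("repeats must be at least 2")
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return {
        "first": timings[0],                             # likely includes loading from disk
        "median_rest": statistics.median(timings[1:]),   # later, presumably warm, runs
    }
```

If `first` is much larger than `median_rest` and the warm number is acceptable, the on-request loading is probably already doing what the hint was meant to do.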
