Does Dgraph have scalability problems with graphs that have a single heavy predicate?

My question is mostly theoretical at this point, after reading the paper published about Dgraph.

The paper says that all the data for a single predicate is stored on the same shard. It also says that there is predicate-splitting logic for cases when a single predicate has too many values. At the same time, the paper doesn’t say that splits can be sharded separately (i.e., that different splits can be moved to different shards).

If my understanding of the paper is correct, I would expect that at some point we reach a scalability limit bounded by the disk size of the biggest Alpha. This will happen if some predicate is used on all nodes, its values are fairly large, and the graph contains many nodes.

Can you please comment on why this decision was made and what your vision is for this problem?

Yes, your understanding is correct. We currently don’t split shards, so if you have one big predicate, it has to reside on one disk. But querying it would still be fast: Badger supports a value log (vlog), so big values live in the log and we only follow a pointer into the vlog. We would face problems if you query all of the data at once, because we won’t be able to read it into memory. So things like traversals and indexes could face issues.
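For context, here is a minimal sketch of the value-log mechanism mentioned above, using the open-source github.com/dgraph-io/badger/v3 package. The threshold, key format, and directory are illustrative choices of mine, not Dgraph’s actual settings or key encoding:

```go
package main

import (
	"fmt"
	"log"

	badger "github.com/dgraph-io/badger/v3"
)

func main() {
	// Values larger than ValueThreshold are written to the value log (vlog);
	// the LSM tree then stores only a small pointer to them, which keeps
	// point lookups cheap even when the values themselves are big.
	opts := badger.DefaultOptions("/tmp/badger-demo").
		WithValueThreshold(1 << 10) // illustrative: 1 KiB threshold

	db, err := badger.Open(opts)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	bigValue := make([]byte, 1<<20) // 1 MiB value -> stored in the vlog

	if err := db.Update(func(txn *badger.Txn) error {
		return txn.Set([]byte("predicate/0x1"), bigValue) // hypothetical key
	}); err != nil {
		log.Fatal(err)
	}

	// Reading the key only follows the vlog pointer for this one entry;
	// it does not force the whole predicate into memory.
	err = db.View(func(txn *badger.Txn) error {
		item, err := txn.Get([]byte("predicate/0x1"))
		if err != nil {
			return err
		}
		return item.Value(func(v []byte) error {
			fmt.Println("value size:", len(v))
			return nil
		})
	})
	if err != nil {
		log.Fatal(err)
	}
}
```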
We are thinking that we can split a predicate into multiple groups after a certain size to avoid this issue. We are considering sharding on the basis of nodes, but we are researching other avenues as well.
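Purely as an illustration of the node-based splitting idea (this is hypothetical; Dgraph does not currently split a predicate across groups, and the real design may differ), assigning a node’s postings to a split could be as simple as hashing the UID:

```go
package main

import "fmt"

// shardForUID picks which split of a predicate a given node belongs to.
// Hypothetical sketch only: a simple modulo on the UID keeps all postings
// of one node in a single split, so a traversal starting from that node
// touches one split rather than all of them.
func shardForUID(uid uint64, numSplits uint64) uint64 {
	return uid % numSplits
}

func main() {
	const numSplits = 4
	for _, uid := range []uint64{0x1, 0x2, 0xabc, 0xffff} {
		fmt.Printf("uid %#x -> split %d of predicate\n", uid, shardForUID(uid, numSplits))
	}
}
```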

We did it this way so that it’s faster to traverse within a predicate, and we want to maintain query performance even with splits.

Thanks @harshil_goel for the explanation. Can you please clarify a little more on “So things like traversals and indexes could face issues”? I don’t think I fully understand why traversals would be slow. Is it because, in theory, a single-predicate traversal could return too many results, which would lead to too many second-level reads? I would really appreciate a more detailed explanation here, so I can make a fully informed decision.

While performing a query, we build a list of UIDs at every level. So if you have a large predicate, we could end up reading all of its UIDs, depending on the query.
Basically, if you have a query like

{ q(func: has(predicate)) @filter(eq(predicate, "something")) { uid } }

In this query, we would build two UID lists: one with the result of has(predicate) and one with the UIDs where eq(predicate, "something") holds. With large predicates, this starts to get slow. Whether we end up reading too much data varies from query to query.
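To make the cost concrete, here is a rough sketch (my own illustration, not Dgraph’s actual code) of intersecting two sorted UID lists, which is essentially what has to happen between the has(predicate) result and the eq(predicate, "something") result; both lists must be materialized before the merge:

```go
package main

import "fmt"

// intersectSorted merges two ascending UID lists and keeps only the UIDs
// present in both. The work, and the memory needed to hold a and b, grows
// with the size of the predicate, which is why very large predicates make
// such queries slow.
func intersectSorted(a, b []uint64) []uint64 {
	out := make([]uint64, 0)
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

func main() {
	hasList := []uint64{0x1, 0x2, 0x3, 0x5, 0x8} // result of has(predicate)
	eqList := []uint64{0x2, 0x5, 0x9}            // result of eq(predicate, "something")
	fmt.Println(intersectSorted(hasList, eqList)) // [2 5]
}
```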

Also, any query that requires too many reads would slow down.

In general, I would advise you to build a dataset and benchmark your use case. We have been fixing a lot of performance issues. If a certain use case is slow, we can see whether there’s an easy fix available right now.

Okay, I think I get your point. Can you please confirm that I understand it fully? You’re saying that the biggest problem is the speed of reading from the HDD, not the intersection operation, the size of the UID-list structure, or the network latency potentially needed to execute this query? Basically, that means if I can (or can afford to) map the filesystem on every Alpha into RAM, even this query will be very fast?

On your other point about testing: yes, I will definitely do that. Right now I’m stuck on the dataset migration here: The --upsertPredicate flag is ignored during JSON live upload. I’d appreciate it if you could help me with that too.

It’s not the reading from disk; it’s loading the entire list of UIDs into memory. If you have a lot of RAM, we would be able to load the data into memory, but performing operations on it could still be slow. Dgraph works over lists of UIDs, on which we may do unions, intersections, or sorting.
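As a rough back-of-the-envelope calculation (my own arithmetic and made-up numbers, not a Dgraph figure): an uncompressed list of one billion UIDs at 8 bytes each is already around 7.5 GB before any union or intersection work even starts:

```go
package main

import "fmt"

func main() {
	// Hypothetical numbers for illustration only.
	const numUIDs = 1_000_000_000 // one billion nodes carrying the predicate
	const bytesPerUID = 8         // a uint64 UID, uncompressed

	totalBytes := float64(numUIDs * bytesPerUID)
	fmt.Printf("uncompressed UID list: %.1f GB\n", totalBytes/(1<<30))
	// Output: uncompressed UID list: 7.5 GB
}
```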

We already keep a lot of data in RAM using mmap, but keeping all of the data in RAM via the filesystem would certainly be faster.

Sorry for the delay on the --upsertPredicate issue. I see that you have found a solution.

Oh, interesting! But I thought you had an optimization (a compression algorithm) implemented to send huge lists of UIDs efficiently?

Are you talking about roaring bitmaps? That used to be there, but we had to remove it because of stability issues. We will reintroduce it soon.

We do have some algorithms that optimize how we store UIDs and perform these operations, but as the scale increases, those operations get slower too.
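For reference, here is a small sketch of the kind of compressed UID-set operations roaring bitmaps enable, using the open-source github.com/RoaringBitmap/roaring/roaring64 package as a stand-in. This is not the implementation Dgraph used or plans to reintroduce; it only illustrates the idea:

```go
package main

import (
	"fmt"

	"github.com/RoaringBitmap/roaring/roaring64"
)

func main() {
	hasList := roaring64.NewBitmap() // UIDs that have the predicate
	eqList := roaring64.NewBitmap()  // UIDs where the predicate equals "something"

	for uid := uint64(1); uid <= 1_000_000; uid++ {
		hasList.Add(uid)
		if uid%10 == 0 {
			eqList.Add(uid)
		}
	}

	// The intersection runs directly on the compressed representation,
	// without materializing a plain []uint64 for either side.
	both := hasList.Clone()
	both.And(eqList)

	fmt.Println("cardinality:", both.GetCardinality())
	fmt.Println("compressed size in bytes:", both.GetSerializedSizeInBytes())
}
```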

Yes, I was referring to roaring bitmaps. Thanks for the answer! I’ll run my tests on a full-scale dataset.