A given predicate always belongs to the same group, i.e. it always lives on the same node (and its replicas).
So this node/server will eventually fill up, and there is no way to horizontally scale the storage.
Has any work been done on splitting a predicate across multiple groups? (https://docs.dgraph.io/design-concepts/#group)
If not, what solutions are offered that avoid vertical scaling (which has limits and can be expensive)?
Vik - Thanks for starting the thread and capturing all of my notes in one succinct sentence.
Here are my unabridged questions and notes from Slack, tuned up a little. Please let me know if I misunderstand the magnitude of a predicate being restricted to a single server. Are the stored predicates so small that it doesn’t matter? If so, how would one calculate a server size anyway?
- When you run out of space on a server, due to one predicate - how do you add a new node and rebalance the shards?
- How do you calculate an estimated size for a predicate group, such that you can size a server to handle the anticipated growth of that predicate group?
- How do you calculate the size of a node to accommodate the size of the predicate?
- Specifically, I’m looking for more details on the sharding for part of a cluster design - If I design small economy sized nodes and need to add more in the future, how do the predicate groups split themselves to accommodate more data than initially planned for? How do you calculate an estimation of how large the predicate group will be?
The information for this is scattered; here’s a list of references I’ve found:
From the Dgraph documentation on Group design (https://docs.dgraph.io/design-concepts/#group), dating back to the 0.8.0 release on July 17, 2017: “In a future version, if a group gets too big, it could be split further. In this case, a single Predicate essentially gets divided across two groups.”
Sydney 5 Min agenda, June 22, 2016: “Distributed: Automatically distribute data to and serve from provided servers. Handle shard splits, shard merges, and shard movement.” (dgraph/g.slide at 7851bb75ca753e6fe98ad504dc53cc629ed0573a · dgraph-io/dgraph · GitHub)
From GitHub Issue on Dgraph vocabulary (Q: Servers, replicas, groups, shards, and tablets. Do I understand this correctly? · Issue #2171 · dgraph-io/dgraph · GitHub) “A predicate is just the middle part in a <Subject> <Predicate> <Object>/ObjectValue . RDF NQuad. Dgraph shards data by a predicate. Therefore all data for a predicate must reside on the same server.” If you were modeling Twitter followers, several hops out - wouldn’t the predicate “follows” eventually fill the server?
From the Dgraph documentation on High Availability (https://docs.dgraph.io/deploy/#ha-cluster-setup), “Over time, the data would be evenly split across all the groups. So, it’s important to ensure that the number of Dgraph alphas is a multiple of the replication setting. For e.g., if you set --replicas=3 in Zero, then run three Dgraph alphas for no sharding, but 3x replication. Run six Dgraph alphas, for sharding the data into two groups, with 3x replication.” - but how does that actually work?
I can’t seem to find more details on it.
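The arithmetic in that quoted passage can be written down as a small sketch. It only restates the documentation's examples (three alphas with `--replicas=3` give one group, six give two). The function name `group_layout` and the alpha indices are hypothetical, the real assignment is made by Zero, and raising an error for a non-multiple is a simplification chosen here, not documented Dgraph behavior.

```python
from typing import List


def group_layout(num_alphas: int, replicas: int) -> List[List[int]]:
    """Partition alpha indices into groups of `replicas` members each.

    Illustrative only: Zero decides the real membership of each group.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if num_alphas % replicas != 0:
        # Simplification: the docs only say a multiple is important.
        raise ValueError("number of alphas should be a multiple of replicas")
    return [
        list(range(start, start + replicas))
        for start in range(0, num_alphas, replicas)
    ]


if __name__ == "__main__":
    print(len(group_layout(3, 3)))  # one group: no sharding, 3x replication
    print(len(group_layout(6, 3)))  # two groups: data sharded, 3x replication
```

This says nothing about which predicates land in which group, which is the open question above.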
From the Go Time #108: Graph Databases podcast on Dgraph (Go Time #108: Graph Databases (podcast) - YouTube) - “partition data set on the predicate… there are many predicates - like name or age or friends, that information is separate so they could be on multiple machines”… “separating the data like that, no matter how much data you’re fetching the number of network requests you’re sending is proportional to the number of predicates you’re fetching not the amount of data.” - Francesc
Can a predicate overrun the memory of a single node?
From the GoSF Badger talk (KV Database: Badger Talk at GoSF Meetup - 21 Feb (2018) - YouTube), talking about storing only keys vs storing values with the keys… this makes sense and certainly reduces the size necessary to store the information, but again - how does the predicate sharding work, especially if additional nodes are added to the cluster to support more data? Is there a slide on this, or a talk someplace?
During the talk, the following stats were mentioned:
16-byte keys, 10-byte value pointer to SSD, and 64MB table - 2.5 mil keys per table
Does this mean for every 2.5 million keys (predicate + subject = group), 64MB is needed? Is this the estimation for how large to build a server? 25 million keys would require 640MB…
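The figures from the talk can be checked with a few lines of arithmetic. This is only a sketch of the numbers quoted above, assuming 64MB means 64 × 1024 × 1024 bytes and that a table holds nothing but the 16-byte key and the 10-byte value pointer. It ignores table overhead, compression, and the values themselves, so it is not a server-sizing formula. The function names are made up for illustration.

```python
KEY_BYTES = 16                   # key size quoted in the talk
VALUE_POINTER_BYTES = 10         # pointer to the value on SSD
TABLE_BYTES = 64 * 1024 * 1024   # assumption: 64MB read as 64 MiB


def keys_per_table() -> int:
    """Keys that fit in one table if it stores only key + pointer."""
    return TABLE_BYTES // (KEY_BYTES + VALUE_POINTER_BYTES)


def table_bytes_for(num_keys: int) -> int:
    """Bytes of whole tables needed to hold `num_keys` entries."""
    per_table = keys_per_table()
    tables = -(-num_keys // per_table)  # ceiling division
    return tables * TABLE_BYTES


if __name__ == "__main__":
    print(keys_per_table())                            # 2581110, roughly 2.5 million
    print(table_bytes_for(25_000_000) // (1024 * 1024))  # 640 (MiB)
```

Under these assumptions 25 million keys do come out to ten tables, or 640MB, which matches the estimate in the question.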
June 2016 - Discussion on dynamic sharding - #3 by mohitranka. RAFT is labeled as a solution to moving shards around, but I don’t think that includes splitting predicates.
From the Dgraph documentation on New Servers & Discovery (https://docs.dgraph.io/design-concepts/#new-server-and-discovery) I see this about transferring data to new machines, but is that a replica transferred to balance total data, or splitting a shard to add total capacity to the cluster?
From the Dgraph documentation on Client Implementation (https://docs.dgraph.io/clients/#implementation):
“For multi-node setups, predicates are assigned to the group that first sees that predicate. Dgraph also automatically moves predicate data to different groups in order to balance predicate distribution. This occurs automatically every 10 minutes.”
We still do not shard the predicate across multiple groups but we plan to add this feature into Dgraph in the year of 2020. We are also working on releasing a paper that provides details of the internals of Dgraph, keep an eye out.
Do you have any suggestions for back-of-napkin math on how to know what the right size of RAM and SSD drive is for the amount of predicates?
As far as I understand, RAM requirements depend on how often you query (read or write) into the database.
Disk requirements are tricky to calculate beforehand, it really depends on your dataset. We perform various encoding and compression of the data that we store and that could vary from dataset to dataset. Though, you could perform a small experiment with a decently large dataset (say 2GB) and extrapolate your disk requirements.
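The experiment suggested above amounts to a linear extrapolation, which might look like the sketch below. The function name, the `headroom` factor, and the sample numbers are all hypothetical. It assumes the ratio of on-disk size to input size measured on the sample holds for the full dataset, which may not be true given that encoding and compression vary by dataset.

```python
def extrapolate_disk_gb(
    sample_input_gb: float,
    sample_disk_gb: float,
    full_input_gb: float,
    headroom: float = 1.5,
) -> float:
    """Scale the disk usage measured on a sample up to the full dataset.

    `headroom` is an arbitrary safety margin, not a Dgraph recommendation.
    """
    if sample_input_gb <= 0:
        raise ValueError("sample_input_gb must be positive")
    disk_per_input_gb = sample_disk_gb / sample_input_gb
    return disk_per_input_gb * full_input_gb * headroom


if __name__ == "__main__":
    # Hypothetical: a 2 GB sample took 3 GB on disk; the full dataset is 200 GB.
    print(extrapolate_disk_gb(2, 3, 200))
```

Repeating the measurement at two or three sample sizes would show whether the ratio is actually stable before relying on it.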
Does this reflect how Predicates, Groups, and Shards relate on a node?
I am not sure I follow everything in the slides, but more details are available here: https://docs.dgraph.io/deploy/#understanding-dgraph-cluster. As far as I understand, we have groups, and each group can contain one or more predicates, which are replicated to each alpha in the group according to the replication factor.
We still do not shard the predicate across multiple groups but we plan to add this feature into Dgraph in the year of 2020
Any update on this? Don’t recall seeing it in the most recent upgrades.
@matthewmcneely this was discussed in the roadmap discussion for 2021 and has been put on hold for now to make room for other priority features in Dgraph. We considered the request from users, but we haven’t received many requests for use cases requiring large predicates (requiring more than 1 TB in storage).
@hardik - In the not-too-distant future, my team and I are going to be building a search engine that will be caching all the publicly available web pages on the net (i.e. like Google / Bing do). In addition, the plan is to try to add in the URLs from the Way Back Machine.
From here, we could tentatively say that we’d need to consider 2 billion websites. Assuming each website has an average of 100 URLs (which may be high or low), we’re talking potentially 200 billion URLs.
Since almost all the URLs will be greater than 5 bytes long, we’ll be talking at least a TB in storage for that. Moreover, if we’re using the above estimates, we’re talking about 200 billion nodes of a single type, and every node will have many different predicates.
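The estimate above is easy to reproduce. The sketch below uses only the figures from this post (2 billion websites, 100 URLs each, at least 5 bytes per URL); the constant and function names are made up for illustration, and it counts raw URL bytes only, not whatever Dgraph would actually store.

```python
WEBSITES = 2_000_000_000   # from the post
AVG_URLS_PER_SITE = 100    # from the post; may be high or low
MIN_URL_BYTES = 5          # lower bound on URL length from the post


def total_urls(websites: int = WEBSITES, per_site: int = AVG_URLS_PER_SITE) -> int:
    """Number of URLs, i.e. nodes of a single type."""
    return websites * per_site


def min_url_storage_bytes(url_count: int, min_bytes: int = MIN_URL_BYTES) -> int:
    """Lower bound on raw bytes needed for the URL strings alone."""
    return url_count * min_bytes


if __name__ == "__main__":
    urls = total_urls()
    print(urls)                          # 200 billion
    print(min_url_storage_bytes(urls))   # 10**12 bytes, about 1 TB
```

Since real URLs are far longer than 5 bytes, the single URL predicate would exceed the 1 TB mark mentioned earlier in the thread by a wide margin.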
We’ll obviously be storing the web cache in other software and doing most of the page analysis using other tools, but the plan is to feed all that data analysis into Dgraph and build our search engine on top of that.
While this is obviously going to be somewhat at the extreme end, it is a use-case that we’ll be developing in the not-too-distant future.
@eugaia We will bring this up again for consideration, given this use case.
@hardik - Great, thanks.
In case you didn’t see it before (and you might find the suggestions useful), I wrote some ideas I had about sharding predicates here: Idea - Shard predicate / Edge across groups - #3 by eugaia.
In addition to these thoughts, I’ve had a few more:
Sharding by type and/or predicate
It may be a nice option to shard all the fields of a type collectively in such a way that predicate requests are only sent to the appropriate tablets and not to all tablets (where possible to know in advance).
So, if there’s say a @shard directive, it could be placed either on a type or a field definition.
Sharding using ‘x mod(n)’
Using mod(n) of a number x (e.g. hash or just the UID), where n is the number of shards, could be an easy way to know which tablet to send the requests to.
Sharding by UID or hash of UID
Potentially, if you use something like x mod(n), it could be done on just the UID, though the distribution of nodes may be better if you use a hash of the UID instead.
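To make the two options concrete, here is a minimal sketch of both. It is not how Dgraph works today (predicates are not sharded across groups); the function names are hypothetical, and SHA-256 is just one convenient, stable hash picked for the example.

```python
import hashlib
from collections import Counter


def shard_by_uid(uid: int, n: int) -> int:
    """Pick a shard directly from the UID: x mod(n) with x = UID."""
    return uid % n


def shard_by_hash(uid: int, n: int) -> int:
    """Pick a shard from a hash of the UID: x mod(n) with x = hash(UID)."""
    digest = hashlib.sha256(uid.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big") % n


if __name__ == "__main__":
    # Hypothetical skewed input: every UID is a multiple of 4, with 4 shards.
    uids = range(0, 4000, 4)
    print(Counter(shard_by_uid(u, 4) for u in uids))   # everything in shard 0
    print(Counter(shard_by_hash(u, 4) for u in uids))  # spread across shards
```

Plain `UID mod n` is cheaper and keeps the mapping obvious, but it inherits any pattern in how UIDs were allocated. Hashing first removes that dependence. Either way, changing `n` moves most keys to a different shard, so adding shards later would need a migration or a different scheme such as consistent hashing.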