Discussion on dynamic sharding

As Dgraph matures as a distributed system, it will need the ability to increase or decrease hardware resources, e.g. nodes or shards in a cluster, as the load in the production environment changes.

Dgraph currently uses a modulo-based hash function to determine which shard a predicate (and its associated entities) resides on, based on the total number of nodes in the cluster (assumption?). When the number of nodes in the cluster changes, so does the shard associated with a given predicate.

This means that on teardown or bootstrapping of a node in the cluster, the hash for every existing predicate needs to be recomputed and data needs to be moved between shards accordingly, resulting in network congestion (albeit asynchronously) and possibly “downtime” before the cluster is fully operational.
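
To illustrate my assumption, here is a minimal sketch of such a modulo scheme, and of how many predicates would change shards when a cluster grows from 3 to 4 nodes (the hash function, predicate names, and node counts are all made up for illustration; this is not Dgraph’s actual code):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor picks a node for a predicate under a simple hash-mod scheme.
func shardFor(predicate string, numNodes uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(predicate))
	return h.Sum32() % numNodes
}

func main() {
	preds := []string{"name", "follows", "age", "email", "friend", "type"}
	moved := 0
	for _, p := range preds {
		before, after := shardFor(p, 3), shardFor(p, 4)
		fmt.Printf("%-7s node %d -> node %d\n", p, before, after)
		if before != after {
			moved++
		}
	}
	// With hash-mod placement, growing from N to N+1 nodes remaps roughly
	// N/(N+1) of the keys, so most predicates would have to be shipped.
	fmt.Printf("%d of %d predicates would move\n", moved, len(preds))
}
```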

What approaches does Dgraph plan to use to minimize the internal network traffic when bootstrapping or tearing down a node?

Note:
1. The write-up above is based on my understanding of and assumptions about Dgraph, which are likely to be erroneous. Please correct them as you see fit.
2. This may be a little far in the future from the point of view of Dgraph’s product roadmap.

Hey @mohitranka,

That’s how the loader works. Note that at loading time, we have little information about the cluster, so we do mod sharding for all the predicates, depending on how many servers you want to start with.

Once loading is done and you bring up the live instance, there’d be a rebalancing of the predicates depending upon various heuristics. Currently, that doesn’t happen, but we’re moving towards that feature with v0.4, via RAFT, a consensus algorithm.

What would happen with this algorithm is that all the nodes would share a global map, which would store information about which predicate is on which server. They’d use this map to figure out how to redirect parts of the queries.

If you add more servers to the cluster, we’ll move some of these predicate shards to the new server, and update our global map. So, all the servers would then know about the move and update the query redirection accordingly.
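
Roughly, the idea could be sketched like this (illustrative only, not the actual implementation; the type, method names, and server addresses are made up):

```go
// Illustrative sketch of the shared global map: predicate -> server currently
// serving it. In the real system the map would be replicated through RAFT;
// here it is a plain in-memory map, to show query redirection and a move.
package main

import "fmt"

// GlobalMap maps each predicate to the address of the server holding it.
type GlobalMap map[string]string

// route tells a node where to redirect the part of a query touching predicate.
func (g GlobalMap) route(predicate string) (string, bool) {
	addr, ok := g[predicate]
	return addr, ok
}

// move re-points a predicate after its data has been copied to newServer.
// In practice the update would go through the RAFT log so every server agrees.
func (g GlobalMap) move(predicate, newServer string) {
	g[predicate] = newServer
}

func main() {
	g := GlobalMap{
		"name":    "server-0:7080",
		"follows": "server-1:7080",
	}

	// A new server joins: move one predicate shard and update the map.
	g.move("follows", "server-2:7080")

	if addr, ok := g.route("follows"); ok {
		fmt.Println("redirect the query part for follows to", addr)
	}
}
```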

Once our RAFT functionality is up, we’d no longer use the hash mechanism to determine which predicate goes to which server. All of that would be dictated via the global map.

If a node goes down, we’ll have to move its data to other nodes in the cluster. That’s all the excess network traffic I can think of off the top of my head. Is there something in particular you’re thinking about?
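
As a sketch of what that reshuffling could look like (again purely illustrative, with made-up names; the actual reassignment would be driven by heuristics and replicated via RAFT):

```go
// Illustrative: when a server dies, hand its predicates to the remaining
// servers round-robin and update the global map accordingly.
package main

import "fmt"

func redistribute(global map[string]string, dead string, alive []string) {
	i := 0
	for pred, srv := range global {
		if srv != dead {
			continue
		}
		// The predicate's data would be copied to its new home here;
		// then the map entry is rewritten so queries get redirected.
		global[pred] = alive[i%len(alive)]
		i++
	}
}

func main() {
	global := map[string]string{
		"name":    "server-0:7080",
		"follows": "server-1:7080",
		"age":     "server-1:7080",
	}
	redistribute(global, "server-1:7080", []string{"server-0:7080", "server-2:7080"})
	fmt.Println(global)
}
```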


This clears it up for me: RAFT will replace hashing completely and decide which nodes to save the data to, based on quorum and replication factor. Thanks.

The same applies to adding more nodes. Basically, I was keen to understand how the balancing of data and node load will happen as the available hardware resources change, and I suppose the answer is that Dgraph is going to delegate it to RAFT.

Yeah, that’s right.

Just to clarify: we’ll continue to use the hash purely for bulk data loading before any live instances are brought up, largely targeting first-time usage of Dgraph. Once the Dgraph server runs, it would generate this global map, which would then become the go-to place for the distribution of predicates among servers.

Any further rearranging would happen via this global map handled by RAFT.
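
To make that handoff concrete, here is a small sketch of seeding the global map from the loader’s hash-mod layout (illustrative only; the server names, predicate names, and placement function are assumptions, not the actual code):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor is the same hash-mod placement the bulk loader is assumed to use.
func shardFor(predicate string, numNodes uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(predicate))
	return h.Sum32() % numNodes
}

func main() {
	servers := []string{"server-0:7080", "server-1:7080", "server-2:7080"}
	preds := []string{"name", "follows", "age"}

	// Seed the global map from the bulk-load (hash-mod) placement.
	global := map[string]string{}
	for _, p := range preds {
		global[p] = servers[shardFor(p, uint32(len(servers)))]
	}
	fmt.Println(global)

	// From here on, rebalancing only rewrites entries in this map (replicated
	// through RAFT); the hash is never consulted again.
}
```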

