Deploying Dgraph for 5 billion nodes and 20 billion edges

Hi, we are looking to deploy an object graph with 5 billion nodes and 20 billion edges. We still need to settle a few criteria before making the decision. Here are some of our concerns with respect to Dgraph:

  1. The largest active production deployment we can find is Factset's, with 160 million nodes and 2 billion edges, but we are looking at an even bigger deployment. Are there any customers with clusters supporting greater specs?
  2. What is the size of the metadata associated with each node and edge? A rough estimate as a function of node size would help.
  3. Say I deploy a 3-node cluster with no replicas for each shard. If one node goes down permanently, will Dgraph Zero initiate recovery by resharding onto the remaining 2 nodes, or do we need to manually bring up a third node from backup data?
  4. Is the storage layer customizable or pluggable?
    It would be really helpful if you could answer the above questions.

Hey @meghaag! Welcome to Dgraph and thanks for your interest!

We have been testing Dgraph with a 1TB dataset. You might want to check the progress there. CC: @ashishgoswami

Thanks @Anurag for the quick response, but the above thread still looks like a work in progress. It would be great if you could share the largest production deployment so far.

Hi @megha,

Thanks for your interest in Dgraph.
Would a call be helpful here? We could answer your questions and also understand your use case and requirements better.

cc: @dmai @dereksfoster99 @omar @anand

Can you share the email address to which the Zoom/Hangouts invite should be sent?