Data Distribution Across Servers

rwer81 · December 22, 2020, 12:50pm

Hello everyone,
I am new to Dgraph. I read the documentation but didn't find an exact answer.
We use HGraphDB (a TinkerPop implementation on HBase) and have run into some problems, so we want to migrate to another graph database. But I have some questions.
We have petabytes of linked data with many different vertex and edge types. However, a large portion of the edges are of a single edge type. For example, out of 1 PB, 800 TB of edge data has the same type and the other 200 TB is spread across other types. I read a document that says Dgraph stores all edges of the same type on the same server. I don't know whether that is true.
Can anyone explain this or share related documentation? If it is true, how can I handle it?

Thank you so much.

MichelDiz · December 22, 2020, 1:21pm

Yes and no. Each edge type (predicate) is stored in a group. A group can hold one or more edge types, and each group is stored on a single Alpha. But if you have replication, this data will be replicated across the cluster according to your cluster config.

In the future we are going to work on even more sharding, at the predicate/edge level. But this will be something for 2021 Q4, I guess; I don't know exactly.

Read the paper, it might help: "Dgraph Whitepapers: In-Depth Insights and Analysis".

rwer81 · December 22, 2020, 1:48pm

Thanks Michel,
So you mean Dgraph stores all edges of the same type on a single Alpha. In my case, the 800 TB of data that consists of one edge type will be stored on a single Alpha.
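
To make the concern concrete, here is a toy model of whole-predicate placement. It is only a sketch: the predicate names, sizes, and the greedy "least-loaded group" rule are assumptions for illustration, not Dgraph's actual balancing logic. The point is just that when a predicate cannot be split, the largest one sets a floor on the most loaded group.

```python
def place_predicates(sizes_tb, num_groups):
    """Assign each whole predicate to the currently least-loaded group.

    Hypothetical placement rule for illustration; a predicate is never split.
    """
    load = {group: 0.0 for group in range(num_groups)}
    placement = {}
    # Place the largest predicates first so the skew is easy to see.
    for pred, size in sorted(sizes_tb.items(), key=lambda kv: -kv[1]):
        group = min(load, key=load.get)
        placement[pred] = group
        load[group] += size
    return placement, load


# Hypothetical edge types matching the 800 TB / 200 TB split from the question.
sizes = {"follows": 800, "likes": 100, "owns": 60, "located_in": 40}
placement, load = place_predicates(sizes, num_groups=3)

for group, total in load.items():
    print(f"group {group}: {total} TB")
```

However many groups are added in this model, one of them still carries the full 800 TB, because the dominant edge type stays in one piece.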

MichelDiz · December 22, 2020, 1:55pm

Yes, I think the edge sharding would be good for you in the future. There is an issue you can follow about it.

One thing you can do now is create pseudo-homonyms (several predicates with near-identical names standing in for one edge type), so each new predicate would go to a different group. But that would be hard to handle in queries.
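
A minimal sketch of that workaround, assuming the application picks the predicate name itself by hashing the source node's UID. The base name `follows`, the `_N` suffix scheme, the shard count, and the helper names are all hypothetical; which group each resulting predicate lands in is decided by the cluster, not by this code.

```python
import hashlib


def shard_predicate(base, subject_uid, num_shards):
    """Pick one of num_shards pseudo-homonym predicates for a given subject.

    A stable hash (not Python's randomized hash()) keeps the mapping
    identical across processes and restarts.
    """
    digest = hashlib.sha256(subject_uid.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % num_shards
    return f"{base}_{shard}"


def all_shard_predicates(base, num_shards):
    """Every predicate name that a read may have to cover."""
    return [f"{base}_{i}" for i in range(num_shards)]


def traversal_block(base, num_shards):
    """Build the body of a query block that expands every shard predicate."""
    return "\n".join(
        f"  {pred} {{ uid }}" for pred in all_shard_predicates(base, num_shards)
    )


NUM_SHARDS = 4
write_pred = shard_predicate("follows", "0x2a", NUM_SHARDS)
query_body = traversal_block("follows", NUM_SHARDS)

print(write_pred)
print(query_body)
```

Writes only need the one predicate chosen for the subject, but any traversal that does not know the subject in advance has to list every `follows_N` predicate, which is the query-side cost mentioned above.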

rwer81 · December 22, 2020, 2:01pm

Yes, you are right.
Thank you so much.
