Dgraph shards data by predicate: all data for a given predicate lives in the same group. You can see which predicates are in each group on the Cluster page of Ratel.
Cluster info:
Alpha Nodes: 7
Zero Nodes: 1
Shard Replica: 1
(Essentially 7 groups with no replication)
That was my understanding too, and I can confirm it from Ratel. But what I don't understand is…
This cluster was stood up about 72 hours ago.
Based on the K8s logs, predicates have been on the move on two Alpha nodes the entire time (see the red circle in the attached screen capture).
Same here. Dgraph Zero always thinks the shards are not balanced enough, so it keeps moving them. I set --rebalance_interval to effectively disable automatic rebalancing.
You can still balance the shards manually in Ratel.
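If you want to check how uneven the groups really are before moving anything by hand, here is a rough Python sketch that summarizes the JSON from Zero's /state HTTP endpoint. It is only an illustration under assumptions: the address localhost:6080 is a guess at your Zero's HTTP port, the size field is assumed to be called onDiskBytes (with space as a fallback, since the name may differ between versions), and summarize_groups and fetch_state are names made up for this example.

```python
import json
from urllib.request import urlopen


def fetch_state(url="http://localhost:6080/state"):
    """Fetch and parse Zero's cluster state (the URL is an assumption)."""
    with urlopen(url) as resp:
        return json.load(resp)


def summarize_groups(state):
    """Map each group id to its predicates and their total reported size."""
    summary = {}
    for group_id, group in state.get("groups", {}).items():
        tablets = group.get("tablets") or {}
        total = 0
        for tablet in tablets.values():
            # Field name is an assumption; sizes may arrive as strings.
            total += int(tablet.get("onDiskBytes", tablet.get("space", 0)))
        summary[group_id] = {"predicates": sorted(tablets), "bytes": total}
    return summary


def print_summary(summary):
    """Print one line per group so imbalance is easy to spot."""
    for group_id in sorted(summary, key=int):
        info = summary[group_id]
        print(f"group {group_id}: {len(info['predicates'])} predicates, "
              f"{info['bytes']} bytes")
```

Calling print_summary(summarize_groups(fetch_state())) against a running Zero should print one line per group, which makes it easier to decide which predicates are worth moving.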
@Valdanito thanks so much for suggesting the --rebalance_interval trick.
I completely rebuilt my cluster, setting --rebalance_interval 1200000h when starting the Zeros.
My cluster is looking beautiful now, although the predicates are a bit imbalanced.
For now, my biggest unresolved issue is bulk loading more than 350 GB (compressed size) of data. Below is the command I run. Let me know if you have similar tricks for it.
I don't know whether your problem is that the imported file is too large. You can try the --badger flag, which is available in both the alpha and bulk commands.
Using this compression setting (Snappy) provides a good compromise between the need for a high compression ratio and efficient CPU usage.
I think it depends on the performance of your machine, the size of the dataset, and which indexes are used. There are no definite numbers. You need to run the import several times to get a feel for it.
I can't find the documentation for --num_go_routines. As I recall, higher values increase both import speed and memory consumption.