Alpha data skew

musiciansLyf · April 26, 2021, 3:00am

What I want to do

When looking at the resource usage of each pod in k8s, I found a serious skew in the resource usage of the alpha pods. What causes this kind of skew, and how can I avoid it?

What I did

Imported more than 20 million nodes (I have no way to count the exact amount of data).

Dgraph metadata

dgraph version

Dgraph version   : v20.11.2
Dgraph codename  : tchalla-2
Commit timestamp : 2021-02-23 13:07:17 +0530
Branch           : HEAD
Go version       : go1.15.5
jemalloc enabled : true

MichelDiz · April 26, 2021, 1:35pm

The difference in storage size might be related to unbalanced predicates. Automatic rebalancing happens over time, but you can speed it up by moving a tablet manually. See https://dgraph.io/docs/deploy/dgraph-zero/#endpoints
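To make the tablet-move suggestion concrete, here is a minimal sketch of how one might check per-group tablet sizes from Zero's `/state` output and build a `/moveTablet` request URL. The port 6080 is Zero's default HTTP port. The size field name `onDiskBytes`, the sample predicates, and the sample sizes are assumptions for illustration only, so check the JSON your own Zero returns before relying on them.

```python
from urllib.parse import urlencode


def tablet_bytes_per_group(state, size_field="onDiskBytes"):
    """Sum the reported tablet sizes for each group in a Zero /state document.

    size_field is an assumption: the key that holds a tablet's size may
    differ between Dgraph versions, so it is left configurable.
    """
    totals = {}
    for group_id, group in state.get("groups", {}).items():
        tablets = group.get("tablets") or {}
        # Sizes are often serialized as strings, so convert before summing.
        totals[group_id] = sum(int(t.get(size_field, 0)) for t in tablets.values())
    return totals


def move_tablet_url(zero_http, predicate, group_id):
    """Build the Zero /moveTablet URL that moves one predicate to a group."""
    query = urlencode({"tablet": predicate, "group": group_id})
    return f"{zero_http}/moveTablet?{query}"


# Hypothetical /state excerpt; predicates and sizes are made up.
sample_state = {
    "groups": {
        "1": {"tablets": {"name": {"onDiskBytes": "900"},
                          "friend": {"onDiskBytes": "600"}}},
        "2": {"tablets": {"age": {"onDiskBytes": "100"}}},
    }
}

print(tablet_bytes_per_group(sample_state))
print(move_tablet_url("http://localhost:6080", "friend", 2))
```

Requesting the printed URL (for example with curl) asks Zero to move the `friend` tablet to group 2. Note that this only helps when the skew is in storage across groups, not when it is in memory on one replica.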

Cheers.

iluminae · April 26, 2021, 2:11pm

@MichelDiz - @musiciansLyf is showing the output of kubectl top pods but cut off the headers - the final column is memory, not storage.

@musiciansLyf I assume this is the leader of your group with 3 pod replicas - it will always do more work.

MichelDiz · April 26, 2021, 2:30pm

In that case, the solution is simple. Just put a load balancer in front of the pods. This works well with HTTP requests. gRPC is quite hard to deal with in K8s. NGINX works nicely for gRPC balancing (Docker).

Also, if you are using Live Loader, you can pass the exposed addresses of all the Alphas to it, and Live Loader will balance the load for you. All the addresses need to be exposed, though. Otherwise, run it from an init pod or sidecar.
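As a rough sketch of the Live Loader suggestion, the snippet below builds the comma-separated Alpha address list for a StatefulSet and the matching `dgraph live` command line. The StatefulSet name `dgraph-alpha` comes from the pod name mentioned in this thread. The headless service name, the `default` namespace, the Zero address, and the file name `data.rdf.gz` are hypothetical placeholders to replace with your own.

```python
def alpha_grpc_addresses(statefulset, headless_service, namespace, replicas, port=9080):
    """Return the in-cluster DNS address of every Alpha pod's gRPC port.

    Uses the standard StatefulSet pod DNS pattern:
    <pod>.<headless-service>.<namespace>.svc.cluster.local
    """
    return [
        f"{statefulset}-{i}.{headless_service}.{namespace}.svc.cluster.local:{port}"
        for i in range(replicas)
    ]


def live_loader_command(rdf_file, alpha_addresses, zero_address):
    """Build the argv for `dgraph live` with all Alphas listed in --alpha."""
    return [
        "dgraph", "live",
        "--files", rdf_file,
        "--alpha", ",".join(alpha_addresses),  # comma-separated Alpha list
        "--zero", zero_address,
    ]


# Service name, namespace, Zero address and file name are assumptions.
addresses = alpha_grpc_addresses("dgraph-alpha", "dgraph-alpha", "default", replicas=3)
command = live_loader_command(
    "data.rdf.gz",
    addresses,
    "dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080",
)
print(" ".join(command))
```

These DNS names only resolve inside the cluster, which is why the loader would need to run from a pod (for example an init pod or sidecar) unless the Alphas are exposed externally.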

musiciansLyf · April 27, 2021, 7:50am

Yes, dgraph-alpha-2 is the alpha leader. Can you please tell me how to check the actual usage of each PVC?

musiciansLyf · April 27, 2021, 7:51am

But there were no queries or mutations at the time, so why would its memory usage be so high?

MichelDiz · April 27, 2021, 11:08am

You said “Imported more than 20 million nodes”. The RAM takes some time to be freed.

musiciansLyf · April 27, 2021, 12:13pm

Uhhh, the nodes were imported about a month ago, but the memory usage is still this high now.

MichelDiz · April 27, 2021, 12:36pm

So I don’t know what it is. Take the pods down and restart them. If it increases again, something else might be happening.

musiciansLyf · April 28, 2021, 2:50am

After I restarted, the memory consumption seems to be the same. I will keep observing the usage. Is there a tool that can show a breakdown of memory consumption?

musiciansLyf · April 29, 2021, 5:54am

When can the fragmentation strategy support fragmentation by predicate?

MichelDiz · April 29, 2021, 2:44pm

Dunno, it is on the roadmap tho.

musiciansLyf · April 30, 2021, 6:13am

Okay, please post an update once it is supported.

MichelDiz · April 30, 2021, 10:42am

To be clear, when you say this, you mean “sharding at the predicate level”, right? You won’t see this in the short term, probably mid-2022.

musiciansLyf · April 30, 2021, 1:46pm

Alright, got it.
