Hi all,
I would like to know whether the following error really causes any data loss:
“draft.go:467] Lastcommit 10591 > current 10575. This would cause some commits to be lost.”
I found the above error in the Dgraph Alpha log while live-loading “A bigger dataset” on a 3-node Dgraph cluster built on GCP. https://tour.dgraph.io/moredata/1/
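To put a number on the message, here is a minimal Go sketch that pulls the two values out of such a line and reports their difference. The regular expression and the `commitGap` helper are assumptions based only on the single log line quoted above, not on Dgraph's source, and the difference alone does not say whether data was actually lost.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// commitGapRe matches the message quoted above. The pattern is an assumption
// derived from that one log line, not from Dgraph's logging code.
var commitGapRe = regexp.MustCompile(`Lastcommit (\d+) > current (\d+)`)

// commitGap returns lastCommit - current for a matching log line.
// The second return value is false if the line does not match.
func commitGap(line string) (uint64, bool) {
	m := commitGapRe.FindStringSubmatch(line)
	if m == nil {
		return 0, false
	}
	last, err1 := strconv.ParseUint(m[1], 10, 64)
	cur, err2 := strconv.ParseUint(m[2], 10, 64)
	if err1 != nil || err2 != nil || last < cur {
		return 0, false
	}
	return last - cur, true
}

func main() {
	line := "draft.go:467] Lastcommit 10591 > current 10575. This would cause some commits to be lost."
	if gap, ok := commitGap(line); ok {
		fmt.Printf("gap: %d\n", gap) // prints "gap: 16"
	}
}
```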
What do you mean by “switching Leader in live loading”?
How many Alpha instances do you have? Why “-idx=1001”? Do you have more than 1001 Alphas?
Please don’t use “-b 2000” while you’re using “--badger.vlog=disk”. I think you may have HDD storage, so you will get lower performance, and increasing the batch value may cause issues. Leave it at the default, or try SSDs or NVMe.
What do you mean by “switching Leader in live loading”?
I found the error after a new leader had been elected during live loading.
I think this is triggered by high-load conditions.
Here is the log around the error. I modified it slightly: I combined the logs from the nodes and added the node name to each line.
How many Alpha instances do you have? Why “-idx=1001”? Do you have more than 1001 Alphas?
I have 3 Alphas.
node1: Zero idx:1, Alpha idx:1001
node2: Zero idx:2, Alpha idx:1002
node3: Zero idx:3, Alpha idx:1003
Please don’t use “-b 2000” while you’re using “--badger.vlog=disk”. I think you may have HDD storage, so you will get lower performance, and increasing the batch value may cause issues. Leave it at the default, or try SSDs or NVMe.
The reason I used -b 2000 was to see how Dgraph behaves under high load.
However, in normal operation I will use --badger.vlog=mmap and will not use -b 2000.
I believed --badger.vlog=disk would give me more safety, because the vlog is a WAL, and in an RDBMS like PostgreSQL the WAL must be flushed to storage.
Can you share your specs?
on GCP:
n1-standard-2 (vCPU x 2, RAM 7.5 GB), Standard disk (it should be HDD, not SSD)
In my opinion (this is a personal comment): if you are going to use HDDs, you will need to increase the amount of memory and, consequently, the lru_mb cache. HDDs are very slow: the fastest of them, at 15k RPM, delivers about 400 IOPS, while the most basic SSD delivers about 5K IOPS and an NVMe drive around 120K IOPS, with up to 10 million IOPS for reads. In theory, DDR4 RAM can give you 1.7 million write IOPS. What SSDs, NVMe, and RAM have in common is low latency and fast access.
You see? More memory relieves physical storage bottlenecks.
Dgraph is a DB designed to get the most out of SSDs or NVMe. If you use HDDs, you have to compensate for that, and compensate a lot, because in this scenario you are tripling Dgraph’s work. With less memory and more work, you will have problems, as with any DB.
As for load-testing Dgraph, I think you would be better off creating a test with clients. Like this guy
This is the best way to test Dgraph. Live Load needs some adjustments to keep up with some of the changes in Dgraph in recent times, so I do not recommend using it for that purpose or increasing its default values.