After much pain and anxiety over Alphas getting out of sync with each other (missing data), we found that the root cause was definitely a failure on our part to ensure correct ulimit settings.
I hope the config below has resolved our issue for the long term.
We run Dgraph (as root) on three VMs running Ubuntu 22.04 LTS, each with 32 CPUs and 128 GB RAM. Here is the config we have set for the Alphas:
/etc/security/limits.conf:
root soft nofile 1000001
root hard nofile 1000001
[email protected]:~> ulimit -n
1000001
Previously, the open-file limit (ulimit -n) for root was 1024.
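One thing I still want to verify is that the running Alpha process itself has the new limit, not just my login shell. As far as I understand, /etc/security/limits.conf is applied by PAM at login, so a process started some other way may not inherit it. Below is a minimal Python sketch for checking this on Linux by reading /proc/<pid>/limits. The function names (`parse_max_open_files`, `process_nofile`) and the example PID are my own placeholders, not anything from Dgraph.

```python
import resource
from pathlib import Path


def parse_max_open_files(limits_text):
    """Return (soft, hard) as strings from the text of a /proc/<pid>/limits file."""
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # Remaining columns are: soft limit, hard limit, units
            fields = line[len(prefix):].split()
            return fields[0], fields[1]
    raise ValueError("no 'Max open files' row found")


def process_nofile(pid):
    """Read the nofile limits that a running process actually has (Linux only)."""
    return parse_max_open_files(Path(f"/proc/{pid}/limits").read_text())


if __name__ == "__main__":
    # Limits of this Python process, inherited from the shell that started it.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"this process: soft={soft} hard={hard}")
    # For an Alpha, pass its real PID (12345 is a placeholder):
    # print(process_nofile(12345))
```

If the values reported for the Alpha's PID still show 1024, the limit would need to be raised wherever that process is actually launched from.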
Does this look appropriate?