Note: This is a WIP. I will frequently update this space as tests are done
This page captures the longevity tests done with the bulk and live loader tools.
Environment
I am using GCP machines with the following specs:
- Arch:
Linux paras-1 5.3.0-1026-gcp #28~18.04.1-Ubuntu SMP Sat Jun 6 00:09:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
The machines are outfitted with 1 TB of disk space, 64 GB of memory, and 16 cores.
- ulimit -n 1048576
I have raised the max open files ulimit to 1048576 so that the bulk and live loaders do not run into the "too many open files" error.
- DataSets
a. 1.344B RDFs: the 21 million RDF data set concatenated 64 times. It is 56 GB.
b. 2.688B RDFs: two copies of (a). It is 112 GB.
- Cluster
1 Zero, 1 Alpha (1Z, 1A) for Live
1 Zero (1Z) for Bulk
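The 1.344B-RDF data set above is the 21M set repeated 64 times. A minimal sketch of that concatenation, using a tiny gzipped stand-in file (the real file name, e.g. 21million.rdf.gz, is an assumption):

```shell
# Tiny stand-in for the real 21M RDF file so this sketch runs anywhere;
# substitute the actual gzipped data set (e.g. 21million.rdf.gz).
printf '_:a <name> "Alice" .\n' | gzip > sample.rdf.gz

# Concatenated gzip members decompress as one stream, so appending
# 64 copies yields a single valid .gz file with 64x the triples --
# the same trick scales the 21M set to 1.344B RDFs.
rm -f big.rdf.gz
for i in $(seq 1 64); do
  cat sample.rdf.gz >> big.rdf.gz
done

# Verify the triple count (one triple per line in this sample).
gunzip -c big.rdf.gz | wc -l
```

Concatenating the result with itself once more gives the 2.688B-RDF set the same way.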
Artifacts
For the bulk test, I collect CPU, heap, and block profiles of the bulk process, along with the Zero and bulk logs. top output is captured as well.
For the live test, I collect CPU and heap profiles of the Zero and Alpha, along with the Zero, Alpha, and live logs. top output is captured as well.
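Dgraph serves Go's pprof endpoints over its HTTP port, so the profiles above can be pulled with curl while a test runs. A sketch, assuming the default HTTP ports (6080 for Zero, 8080 for Alpha) and a cluster running on localhost:

```shell
# CPU profile: samples for 30 seconds before returning.
curl -s "http://localhost:8080/debug/pprof/profile?seconds=30" -o alpha.cpu.pprof
# Heap and block profiles return immediately.
curl -s "http://localhost:8080/debug/pprof/heap"  -o alpha.heap.pprof
curl -s "http://localhost:8080/debug/pprof/block" -o alpha.block.pprof
# Same endpoints on Zero's HTTP port.
curl -s "http://localhost:6080/debug/pprof/heap" -o zero.heap.pprof
# One-shot snapshot of process resource usage.
top -b -n 1 > top.out
```

The saved profiles can be inspected later with go tool pprof (e.g. `go tool pprof alpha.cpu.pprof`).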
Test Results
Dgraph version v20.07.0-beta.Jun22
- Bulk Loader, 672M RDFs = PASS
- Bulk Loader, 1.344B RDFs = FAIL (REDUCE phase blocked, hogging memory)
- Bulk Loader, 2.688B RDFs = FAIL (REDUCE phase blocked, hogging memory)
Dgraph version v20.03.3
- Bulk Loader, 672M RDFs = PASS
- Bulk Loader, 1.344B RDFs = FAIL (REDUCE phase blocked, hogging memory)
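For reference, a sketch of the loader invocations behind runs like these; the file names, schema file, and shard counts here are assumptions based on the stock Dgraph CLI, not the exact commands used:

```shell
# Start a Zero (the bulk loader needs one for UID leases).
dgraph zero --my=localhost:5080 &

# Bulk loader: runs a MAP then a REDUCE phase; the failures above
# occur in REDUCE.
dgraph bulk -f 1344million.rdf.gz -s 21million.schema \
  --map_shards=1 --reduce_shards=1 --zero=localhost:5080

# Live loader: also needs a running Alpha (the 1Z, 1A cluster).
dgraph alpha --my=localhost:7080 --zero=localhost:5080 &
dgraph live -f 1344million.rdf.gz -s 21million.schema \
  --alpha=localhost:9080 --zero=localhost:5080
```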
Notes
- Results may vary depending on the data set, the amount of disk space, and memory.
- Always make sure to bump up the ulimit -n to a high value.
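The ulimit bump in the note above can be checked and applied per shell session, e.g.:

```shell
# Show the current soft and hard limits for open file descriptors.
ulimit -Sn
ulimit -Hn

# Raise the soft limit for this shell (and its children, i.e. any
# loader process started from it). The soft limit cannot exceed the
# hard limit; raising the hard limit needs root, or a persistent
# entry in /etc/security/limits.conf.
ulimit -n 1048576 || echo "hard limit too low; raise it as root first"
```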