version: v21.03.2
Compressed data size before ingestion: 375GB
Total output size: 894GB
Zero: 32 cores, 256 GB RAM, 2TB SSD
Run: 31 hours for the map phase, 11 hours for the reduce phase
Bulk loader command:
dgraph bulk -f /coldstartinput/upload/pending_predicates -s /coldstartinput/upload/rdf_schema/patient.rdf --out /coldstartoutput/out --replace_out --num_go_routines=20 --reducers=7 --format=rdf --store_xids --map_shards=20 --reduce_shards=10 > check.log &
- The bulk loader slows down dramatically during the last 30% of the run
- I can see heavy swapping while this happens
Questions:
- Would running swapoff help here?
- How do I turn swap off? I'm not sure whether the cloud operator allows this.
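For reference, this is a minimal sketch of how I would check swap usage on the machine, assuming a Linux host that exposes /proc/meminfo. The variable names are just for illustration. The swapoff/swapon lines are left commented out because they need root, and I don't know whether the cloud image permits them; with swap disabled, the kernel may also kill the bulk loader outright if memory runs out.

```shell
# Read total and free swap (in kB) from /proc/meminfo (Linux only).
swap_total_kb=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
swap_free_kb=$(awk '/^SwapFree:/ {print $2}' /proc/meminfo)

# Swap currently in use is the difference between the two.
swap_used_kb=$((swap_total_kb - swap_free_kb))
echo "swap used: ${swap_used_kb} kB of ${swap_total_kb} kB"

# Disabling and re-enabling all swap devices requires root.
# Uncomment only if the host allows it:
# sudo swapoff -a
# sudo swapon -a
```

If SwapTotal is already 0, no swap is configured and the slowdown has another cause.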