I used sample data to test the import on my personal computer.
I used three configurations, respectively:
- live loader with `--new_uids=false`
- live loader with `--new_uids=false` against an alpha in ludicrous mode
- live loader with `--new_uids=true` against an alpha in ludicrous mode

All other parameters were left at their default values. The invocations are sketched below.
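Concretely, the three loader runs looked roughly like this (a sketch, not my exact commands; the file path and the alpha/zero addresses are assumptions, and the flag names follow the v20.03 `dgraph live` CLI):

```sh
# 1) Plain alpha, loader keeps the uids from the data
dgraph live -f 1million.rdf.gz -a localhost:9080 -z localhost:5080 --new_uids=false

# 2) Alpha started with --ludicrous_mode, same loader flags
dgraph live -f 1million.rdf.gz -a localhost:9080 -z localhost:5080 --new_uids=false

# 3) Alpha in ludicrous mode, loader assigns fresh uids
dgraph live -f 1million.rdf.gz -a localhost:9080 -z localhost:5080 --new_uids=true
```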
In each case I imported the same data three times in a row and found that the import rate dropped severely. On the fourth import, the loader logs began to show a lot of timeouts.
I want to understand the following two questions:
- Will the import rate continue to decline? If I want to sustain an import rate of 50,000 per second, what is the relationship between hardware configuration and data volume?
- After the data is imported, I find that the original “id” is gone. Is there a mapping between the id in the source data and the uid in the final query, or can I only find a node through its attributes (see the sketch after this list)?
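To make the second question concrete: right now the only lookup that seems possible is by a predicate value, something like this DQL sketch (the `name` predicate, its value, and the index it would need are placeholders for illustration):

```
{
  byAttribute(func: eq(name, "some value")) {
    uid   # the uid Dgraph assigned, not the id that was in my data
    name
  }
}
```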
Dataset: 1million.rdf.gz from the dgraph-io/benchmarks repository on GitHub.
System: macOS Catalina 10.15.5
CPU: 2.2 GHz quad-core Intel Core i7
Memory: 16 GB 1600 MHz DDR3
Command for the alpha in my docker-compose file: `dgraph alpha --my=alpha:7080 --lru_mb=2048 --zero=zero:5080 --ludicrous_mode`
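For completeness, a minimal docker-compose sketch around that command (the image tag, service layout, and published ports are assumptions, not copied from my file):

```yaml
version: "3.5"
services:
  zero:
    image: dgraph/dgraph:v20.03.3   # any v20.03+ tag with ludicrous mode
    command: dgraph zero --my=zero:5080
  alpha:
    image: dgraph/dgraph:v20.03.3
    command: dgraph alpha --my=alpha:7080 --lru_mb=2048 --zero=zero:5080 --ludicrous_mode
    ports:
      - "8080:8080"   # HTTP
      - "9080:9080"   # gRPC (used by the live loader)
```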