I tried out the Go client with the Dgraph loader and did some initial analysis on the 22 million RDF dataset. The numbers are similar to the HTTP client's, though server-side processing should hopefully be faster since RDF parsing happens on the client in this case.
```
Processing ../../../benchmarks/data/22million.rdf.gz
Number of mutations run   : 22054
22053051 RDFs per second  : 74530
Number of RDFs processed  : 22053051
Time spent                : 4m57.26096832s
RDFs processed per second : 74252
```
There are a couple of things that still need to be done to get a more accurate analysis:
- Datetime is not supported by the Go client right now. We need to add support for it and verify that the backup matches the source.
- There is some copying going on from rdf.NQuad to graph.Nquad and then back to rdf.NQuad. That round trip can be avoided.
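To illustrate the second point, here is a toy sketch of the round trip. The struct types below are hypothetical, simplified stand-ins for the real rdf.NQuad and graph.Nquad types in Dgraph's packages; the point is only that each hop copies the value, and parsing directly into the wire type would skip both copies:

```go
package main

import "fmt"

// Hypothetical stand-ins for rdf.NQuad and graph.Nquad (not the real types).
type rdfNQuad struct{ Subject, Predicate, ObjectId string }
type graphNQuad struct{ Subject, Predicate, ObjectId string }

// Each direction of the current path copies the whole value.
func toGraph(n rdfNQuad) graphNQuad { return graphNQuad(n) }
func toRDF(n graphNQuad) rdfNQuad   { return rdfNQuad(n) }

func main() {
	n := rdfNQuad{"alice", "follows", "bob"}

	// rdf.NQuad -> graph.Nquad -> rdf.NQuad: data survives, but every
	// quad is copied twice along the way.
	back := toRDF(toGraph(n))
	fmt.Println(back == n) // prints "true"
}
```

With 22 million quads, having the client-side parser emit the wire type directly would drop both per-quad copies.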
I wonder if we should spend more time on this given the present numbers. What do you think @mrjn?