Using Go client for Dgraphloader

I tried using the Go client in Dgraphloader and did some initial analysis on the 22 million RDF dataset. The numbers are similar to those from the HTTP client, though server-side processing should be faster, since RDF parsing happens on the client in this case.

Processing ../../../benchmarks/data/22million.rdf.gz
Number of mutations run   : 22054: 22053051 RDFs per second:   74530
Number of RDFs processed  : 22053051
Time spent                : 4m57.26096832s
RDFs processed per second : 74252

There are a couple of things that still need to be done to get a more accurate analysis:

  1. Datetime is not supported by the Go client right now. We need to add support for it and verify that the backup matches the source.
  2. There is some copying going on: rdf.NQuad → graph.Nquad and then back to rdf.NQuad. That round trip can be avoided.
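To illustrate the copying in item 2, here is a rough sketch using stand-in types. `parsedNQuad` and `wireNQuad` are hypothetical placeholders for rdf.NQuad and graph.Nquad, and their fields are invented for the example. The real types and the eventual fix may look quite different.

```go
package main

import "fmt"

// parsedNQuad is a hypothetical stand-in for the parser's output type.
type parsedNQuad struct{ Subject, Predicate, Object string }

// wireNQuad is a hypothetical stand-in for the type the client sends.
type wireNQuad struct{ Subject, Predicate, Object string }

// roundTrip mirrors the path described above: parsed -> wire -> parsed.
// Each hop allocates a new slice and copies every element.
func roundTrip(in []parsedNQuad) []parsedNQuad {
	wire := make([]wireNQuad, len(in))
	for i, nq := range in {
		wire[i] = wireNQuad(nq)
	}
	out := make([]parsedNQuad, len(wire))
	for i, nq := range wire {
		out[i] = parsedNQuad(nq)
	}
	return out
}

// direct hands the parsed slice on unchanged, with no intermediate copy.
func direct(in []parsedNQuad) []parsedNQuad { return in }

func main() {
	batch := []parsedNQuad{{"alice", "follows", "bob"}}

	a := roundTrip(batch)
	b := direct(batch)

	fmt.Println(a[0] == b[0])                           // true
	fmt.Println(&a[0] == &batch[0], &b[0] == &batch[0]) // false true
}
```

Both paths yield the same data, but the round trip does two allocations and two full copies per batch, which adds up over 22 million RDFs.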

I wonder if we should spend more time on this given the present numbers. What do you think @mrjn?

The code is at https://github.com/dgraph-io/dgraph/blob/loader-goclient/cmd/dgraphloader/main.go
