RocksDB PrepareForBulkLoad

To try to speed up the loader, I turned on the bulk load option from RocksDB (PrepareForBulkLoad). However, it seems to me that there is little gain here.

Here is my PR:
https://github.com/dgraph-io/dgraph/compare/feature/bulkload

Here are the CPU and memory profiles.

https://drive.google.com/open?id=0B67vVtF8Pcy1Vk9MMXVDWllHQUE

I didn’t touch my machine while it did all the profiling. I did four runs in total: (mem/cpu) x (bulk load or not).

It seems to me that the differences are within noise levels and I don’t think we should try to conclude anything from it, except that RocksDB’s bulkload has little impact on dgraphloader. In fact, runtime.cgocall seems to take less time without bulkload than with bulkload…

Previously, I also did some runs with assigner and loader, and it seems that bulkload actually slows down everything significantly. I’m not sure why that didn’t show up in the profiling of only the loader.

Unless anyone objects @minions, my take is not to use bulk load and there is no need to submit / review the PR.

How about using 20 compaction threads, as they were using in their test case?

Test 3.

Also, can we look into this:
https://www.facebook.com/notes/facebook-engineering/scalable-memory-allocation-using-jemalloc/480222803919

Good point. I will try these out. It seems that jemalloc is already being used for the embedded build (search for cgo_flags_jemalloc.go).

Will post cpu profiles tomorrow.

Quick update: No significant difference in running time when I tried max_background_compactions=20, 10, 1 (default).

And no significant difference between jemalloc and stdmalloc. Note that jemalloc is the default malloc for the embedded build. To disable it, build with go build -tags="embed stdmalloc". I ran the two different loader binaries (with different file sizes) and saw no significant difference in running time.

Note: On my machine, running time can fluctuate quite a bit. It varies between 7.5+ min and 8+ min for loading only rdf-films.gz. It is hard to draw any conclusion if the improvement is small. Perhaps I should run some benchmarks instead.