Bulk loader OOM

Hello, I have 200 GB of rdf.gz files and am trying to use the bulk loader to build my Dgraph instance (latest version from Docker).
I am already using --xidmap to save my xids. Below is my command:

dgraph bulk -f=<...> -s=test.schema --ignore_errors --map_shards=32 --reduce_shards=3 --xidmap=adxid -z= --log_dir=logd --badger.compression_level=16

My machine has 128 GB of memory and a 4 TB disk.
The MAP phase worked fine, but the REDUCE phase failed with an OOM error.

[05:47:03Z] REDUCE 02h24m35s 25.18% edge_count:748.6M edge_speed:1.771M/sec plist_count:402.2M plist_speed:951.7k/sec. Num Encoding: 0
fatal error: runtime: out of memory

The xidmap file takes 44 GB.
I noticed that someone has already offered some solutions, such as "Bulk loader xidmap memory optimization", but I don't think the current version has a --limitMemory option.

My guess is that the REDUCE phase reads all of the map output files into memory. Is that right?

So, can anybody help me? How can I avoid the OOM error and successfully create this graph?
It is urgent, please :fearful:

(P.S. I already tried a single machine with a single zero and alpha; that didn't work either.)

Hi @jokk33, have you tried the troubleshooting options?

Thanks for replying. I am not sure about that.
--badger.vlog=disk and --lru_mb are parameters for dgraph alpha.
But I just stopped alpha and ran dgraph bulk inside the dgraph zero Docker container.
How can I pass those parameters to dgraph bulk?

This is a known issue. The team is working on the analysis of this issue.
Meanwhile, can you try running the live loader from a machine separate from the one running alpha?
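
For reference, a live loader invocation from a separate machine might look roughly like the sketch below. The host names and ports (alpha-host:9080, zero-host:5080) are placeholders I made up, not values from this thread, and the file path stays elided as in the original command. The snippet only prints the command so you can check it before running it.

```shell
# Dry-run sketch: build the live loader command and print it instead of running it.
# ALPHA_ADDR and ZERO_ADDR are hypothetical placeholders; replace them with your own hosts.
ALPHA_ADDR="alpha-host:9080"
ZERO_ADDR="zero-host:5080"

# The data path is left elided (<...>) exactly as in the original bulk command.
LIVE_CMD="dgraph live --files <...> --schema test.schema --alpha ${ALPHA_ADDR} --zero ${ZERO_ADDR}"

echo "$LIVE_CMD"
```

The live loader sends mutations to a running alpha, so it is generally slower than the bulk loader but does not have a REDUCE phase.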

Also, can you try increasing the number of shards with the bulk loader's map_shards and reduce_shards flags? This may help if you are willing to have multiple alpha groups.
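
As a rough sketch, this is the original command with only the two shard counts changed. The values 64 and 6 are illustrative assumptions, not tested recommendations, and I can't promise they avoid the OOM. Everything else is copied from the original post, including the elided file path and the blank -z value. The snippet only prints the command.

```shell
# Dry-run sketch: print the bulk loader command with larger shard counts.
# 64 and 6 are illustrative values only, not tested recommendations.
MAP_SHARDS=64
REDUCE_SHARDS=6   # one output directory per reduce shard, i.e. one per alpha group

# All other flags are copied unchanged from the original post.
BULK_CMD="dgraph bulk -f=<...> -s=test.schema --ignore_errors --map_shards=${MAP_SHARDS} --reduce_shards=${REDUCE_SHARDS} --xidmap=adxid -z= --log_dir=logd --badger.compression_level=16"

echo "$BULK_CMD"
```

Keep in mind that the number of reduce shards determines how many alpha groups the final cluster needs, so with 6 reduce shards you would have to run 6 groups.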

