Bulk loader still OOM during reduce phase

What I want to do

Import an existing data set that consists of around 350M RDF quads. There are a lot of uids but I don’t have precise statistics on those.

There are also a large number of unique predicates… I don’t have an exact count, but when I tried to create a schema file that referenced them all (including types), it was about 30GiB uncompressed.

What I did

dgraph bulk --schema bulk-atl/out.schema --files bulk-atl --num_go_routines 1 --mapoutput_mb 1024 --zero dgraph-zero-2.dgraph-zero:5080 --tmp tmp --partition_mb 2

as well as several variants of the above with different map output sizes (down to 64MB) and the default --partition_mb value.
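One of those variants, for example, looked roughly like this (same flags as above, just a smaller map output size):

dgraph bulk --schema bulk-atl/out.schema --files bulk-atl --num_go_routines 1 --mapoutput_mb 64 --zero dgraph-zero-2.dgraph-zero:5080 --tmp tmp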

I have found earlier reports of this that were addressed in recent builds with the addition of jemalloc, but my build already has that and the import fails nonetheless.

It would be important to note that I am running in a Kubernetes cluster, monopolizing progressively larger nodes, to where my most recent tests have been running on VMs with 256GiB of RAM; however, this doesn’t seem to matter, as the process dies with

runtime: out of memory: cannot allocate 4194304-byte block (83409764352 in use)

when usage is at around 100GiB.

I understand that this may be a systems-management issue, but while I pursue that potential problem, I did want to get some comments on the feasibility and resource requirements of importing a data set like mine.

I have not yet tried increasing the number of map or reduce shards as suggested in other discussions, since the help docs state that doing so can only increase memory usage, but for completeness I will try that while waiting on a response.
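For reference, the sharded variant I plan to try would look roughly like this (assuming the v21.03 --map_shards and --reduce_shards flags; untested on my side, and my understanding is that --reduce_shards should match the number of Alpha groups in the target cluster):

dgraph bulk --schema bulk-atl/out.schema --files bulk-atl --map_shards 4 --reduce_shards 2 --zero dgraph-zero-2.dgraph-zero:5080 --tmp tmp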

Thanks for any help.

Dgraph metadata

dgraph bulk identifying logs

Dgraph version   : v21.03.1
Dgraph codename  : rocket-1
Dgraph SHA-256   : a00b73d583a720aa787171e43b4cb4dbbf75b38e522f66c9943ab2f0263007fe
Commit SHA-1     : ea1cb5f35
Commit timestamp : 2021-06-17 20:38:11 +0530
Branch           : HEAD
Go version       : go1.16.2
jemalloc enabled : true

For Dgraph official documentation, visit https://dgraph.io/docs.
For discussions about Dgraph, visit https://discuss.dgraph.io.
For fully-managed Dgraph Cloud, visit https://dgraph.io/cloud.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2021 Dgraph Labs, Inc.

___ Begin jemalloc statistics ___
Version: "5.2.1-0-gea6b3e973b477b8061e0076bb257dbd7f3faa756"
Build-time option settings
config.cache_oblivious: true
config.debug: false
config.fill: true
config.lazy_lock: false
config.malloc_conf: "background_thread:true,metadata_thp:auto"
config.opt_safety_checks: false
config.prof: true
config.prof_libgcc: true
config.prof_libunwind: false
config.stats: true
config.utrace: false
config.xmalloc: false
Run-time option settings
opt.abort: false
opt.abort_conf: false
opt.confirm_conf: false
opt.retain: true
opt.dss: "secondary"
opt.narenas: 128
opt.percpu_arena: "disabled"
opt.oversize_threshold: 8388608
opt.metadata_thp: "auto"
opt.background_thread: true (background_thread: true)
opt.dirty_decay_ms: 10000 (arenas.dirty_decay_ms: 10000)
opt.muzzy_decay_ms: 0 (arenas.muzzy_decay_ms: 0)
opt.lg_extent_max_active_fit: 6
opt.junk: "false"
opt.zero: false
opt.tcache: true
opt.lg_tcache_max: 15
opt.thp: "default"
opt.prof: false
opt.prof_prefix: "jeprof"
opt.prof_active: true (prof.active: false)
opt.prof_thread_active_init: true (prof.thread_active_init: false)
opt.lg_prof_sample: 19 (prof.lg_sample: 0)
opt.prof_accum: false
opt.lg_prof_interval: -1
opt.prof_gdump: false
opt.prof_final: false
opt.prof_leak: false
opt.stats_print: false
opt.stats_print_opts: ""
Profiling settings
prof.thread_active_init: false
prof.active: false
prof.gdump: false
prof.interval: 0
prof.lg_sample: 0
Arenas: 129
Quantum size: 16
Page size: 4096
Maximum thread-cached size class: 32768
Number of bin size classes: 36
Number of thread-cache bin size classes: 41
Number of large size classes: 196
Allocated: 4314440, active: 4362240, metadata: 11125960 (n_thp 0), resident: 15384576, mapped: 23236608, retained: 5599232
Background threads: 4, num_runs: 8, run_interval: 684055750 ns

Sorry to hear you are having this trouble - just wanted to share my bulk-load experience here:

  • ~3 billion triples
  • 4 input rdf files
    • 5-10GiB gzip compressed each
    • pulled from minio
  • 4 output groups
  • 16-core, 32GiB machine the bulk loader was run on; it never got above 25GiB of memory.
  • finishes in about 1h20m
  • v21.03.1
  • I think default flags other than the output groups

Do you really mean 30Gb just in the schema?? That doesn’t seem right. That would make it seem like every triple is a unique predicate; this just seems REALLY high for the number of predicates and the size of the schema alone. We work with what I think is a really large schema, but it is only 145Kb at 4,298 lines in GraphQL format, including the auth rules. I can’t imagine a 30Gb schema file.

Yes… I mean 30Gb.

For background, I am exporting a dataset from mongo, and for simplicity, rather than chopping up all sub-documents into their own node/dgraph type, we have decided to perform our initial import and analysis by simply flattening the documents into nodes.

i.e. a source document like

{
   "this": {"prop": 1, "attribute": 2},
   "is":  {"prop": 1, "attribute": 2},
   "a":  {"prop": 1, "attribute": 2},
   "doc":  {"prop": 1, "attribute": 2}
}

would become a node with the predicates:

  • this.prop
  • this.attribute
  • is.prop
  • is.attribute
  • a.prop
  • a.attribute
  • doc.prop
  • doc.attribute

If we combine this with the fact that arrays in documents are also flattened into the node, the number of predicates really explodes (each type has its own set of predicates, with only a handful shared across types).
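To illustrate the kind of flattening my export does (sketched here in Go purely for illustration; the names and structure are made up, and my real export code is different):

package main

import (
	"encoding/json"
	"fmt"
)

// flatten walks a decoded JSON document and emits one predicate per leaf
// value, e.g. {"this": {"prop": 1}} becomes the predicate "this.prop".
func flatten(prefix string, v interface{}, out map[string]interface{}) {
	switch val := v.(type) {
	case map[string]interface{}:
		for k, child := range val {
			key := k
			if prefix != "" {
				key = prefix + "." + k
			}
			flatten(key, child, out)
		}
	default:
		out[prefix] = val
	}
}

func main() {
	doc := `{"this": {"prop": 1, "attribute": 2}, "is": {"prop": 1, "attribute": 2}}`
	var parsed map[string]interface{}
	if err := json.Unmarshal([]byte(doc), &parsed); err != nil {
		panic(err)
	}
	preds := map[string]interface{}{}
	flatten("", parsed, preds)
	for p, v := range preds {
		// In the real export these end up as RDF, e.g. <_:doc1> <this.prop> "1" .
		fmt.Printf("<_:doc1> <%s> %q .\n", p, fmt.Sprint(v))
	}
}

Every distinct key path (and, in my case, array element) produces its own predicate name, which is where the explosion comes from.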

The reason my schema file doesn’t include all of these is that the data structures keeping track of them in my mongo export logic were causing my workstation to start swapping heavily and eventually run out of virtual memory at around 80GiB, so I decided to just punt and add to the schema manually as our evaluation dictates.

Anyway, I now realize that this approach was excessively naive and will definitely need to be reconsidered, but I was hoping the dataset would still be useful for analyzing the performance/functionality of graph lookups where they do happen (we do have foreign-key-style references in the mongo docs, and those are what will convert to node references in dgraph).

If I get the feedback that dgraph will struggle with disproportionately large numbers of predicates… especially in the bulk-load situation, I can work on restructuring the input data to be more usable.

Impressive - I read that as 30GB of RDF, not schema. For comparison, on our production system (with the 3bn triples) we currently have 8,422 predicates (and growing), and see no issues in sight on that front.


Maybe @MichelDiz has experience with such a large schema? I don’t think there is a hard limit on predicate count or schema size, but just processing a schema that would be over 30Gb is :exploding_head:

I still can’t fathom how many predicates that is. Our whole dataset only uses 0.5Gb.


To be clear, and in case it matters, the schema definition I am actually using is not that large.
The one I am using omits most predicates and just includes a small set (10) assigned to about 57 different types; it is only 6.9k (562 lines).
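For a rough idea of its shape (the predicate and type names below are just illustrative, not the real ones):

this.prop: int .
this.attribute: int .

type SourceDoc {
  this.prop
  this.attribute
}

…repeated for the rest of the small predicate set across the 57 types.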

However, those predicates omitted from the schema are still being used on nodes/rdf specs.

When I was experimenting manually with a small data set, I was able to populate nodes like this; the data just wasn’t indexed, and I had to explicitly list the predicates I wanted to see in queries (expand(_all_) doesn’t work… which is fine for me).
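For example, a query against one of these flattened nodes has to spell the predicates out explicitly, roughly like this (reusing the illustrative predicate names from above):

{
  docs(func: has(this.prop), first: 10) {
    uid
    this.prop
    this.attribute
    is.prop
  }
}

whereas, as I understand it, expand(_all_) only expands predicates listed in the node’s type definitions, which matches what I’m seeing.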

The reason I mentioned the size of the schema that tried to include everything was just to give a general idea of the number of different predicates.

Also, I might mention that there may have been a defect in my export code that wrote predicates (and the types that referenced them) twice, but even so, that leaves me with a schema of about 15GB.


Right, understood that part. But a schema file is still built containing every predicate, with either a default or first-found type. I assume this is part of the bulk loader process.

OK. Thanks.

Nope, I don’t, and I’m as impressed as you guys. I can’t imagine such a big schema lol.

So, you say here that you have 256GiB of RAM available for the instance that is running the bulk load? So it uses less than 50% of the available RAM and still runs OOM?

Can you share the data via DM, if it’s reproducible? I may send this to an engineer.

Pretty much… the machine/instance/VM has 256GiB of RAM, though my k8s configs only grant 210GiB of RAM to the container.

Still, the process only gets up to around 100GiB of usage before it reports out of memory and dumps a Go stack trace.

When k8s kills the process for stepping out of bounds, it more or less just goes away.

I provided sample data via DM… and I’ll mention that I am working on making the set of predicates more efficient by leveraging facets instead of mangling predicate names.

Thanks for sharing the dataset over DM. Your dataset has about 109 million unique predicate names in the schema. That’s way too many and explains the OOM that’s happening here.

If you can store the “unique” part of the edge as a value (e.g., <0x1> <id> "abc-123-567" .) instead of as part of the predicate name, then that’d greatly reduce the schema entries here.
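For illustration (the predicate names and values below are made up), instead of mangling the unique id into the predicate name:

<0x1> <order.abc-123-567.total> "42" .

store the id as a regular (indexable) value, so the predicate set stays small:

<0x1> <order.id> "abc-123-567" .
<0x1> <order.total> "42" .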


Here’s a heap profile of the bulk loader with your data set during the map phase:

File: dgraph
Build ID: 6fee491f94a8999d0e85fe9ea8ccb3321f7ce0d0
Type: inuse_space
Time: Jul 23, 2021 at 2:07pm (PDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 44.24GB, 99.88% of 44.30GB total
Dropped 73 nodes (cum <= 0.22GB)
Showing top 10 nodes out of 12
      flat  flat%   sum%        cum   cum%
      19GB 42.88% 42.88%       19GB 42.88%  github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*schemaStore).validateType
   14.15GB 31.95% 74.83%    14.15GB 31.95%  github.com/dgraph-io/dgraph/x.NamespaceAttr
   10.83GB 24.45% 99.28%    10.83GB 24.45%  github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*shardMap).shardFor
    0.27GB   0.6% 99.88%     0.27GB   0.6%  bytes.makeSlice
         0     0% 99.88%     0.27GB   0.6%  bytes.(*Buffer).Write
         0     0% 99.88%     0.27GB   0.6%  bytes.(*Buffer).grow
         0     0% 99.88%     0.27GB   0.6%  github.com/dgraph-io/dgraph/chunker.(*rdfChunker).Chunk
         0     0% 99.88%    43.99GB 99.31%  github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*loader).mapStage.func1
         0     0% 99.88%     0.27GB   0.6%  github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*loader).mapStage.func2
         0     0% 99.88%       19GB 42.89%  github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*mapper).createPostings

Most of the memory usage is due to the large number of predicates in the schema.
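In case you want to grab the same kind of profile on your side: assuming v21.03, where the bulk loader serves pprof on its --http address (default localhost:8080), a heap profile like the one above can be pulled while the load is running with something like:

go tool pprof http://localhost:8080/debug/pprof/heap

and then running top at the (pprof) prompt.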