Where is the bulk uploader /out folder?

Question:

  • Where is the /out folder created? The bulk upload seems to have run successfully.
    – We run the bulk uploader from the /dgraph directory on Zero
    – We launch the bulk uploader multiple times as our data pipeline keeps writing .rdf.gz data files
    – The /out folder was not found in /dgraph, our data files folder, or our schema folder

Below are the steps we followed:

  • Made sure at least one Zero was running
  • Brought up one Alpha, which was blocked by its init container (thanks to the Helm chart)
  • Executed the bulk uploader command from the /dgraph folder on Zero
  • We used a cronjob that wakes up every minute and launches the bulk loader command if there are any files in the ${files_in_ready_state} folder
     dgraph bulk -f ${files_in_ready_state} -s ${schemaFile} --format=rdf --xidmap xid --map_shards=3 --reduce_shards=3 --zero=dgraph-dgraph-zero:5080
    
  • Each file was about 250 MB in size
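
A minimal sketch of the cron wrapper described in the steps above, assuming the job only needs to check for .rdf.gz files before launching the loader. The helper names has_ready_files and bulk_command are hypothetical; the dgraph flags are the ones from the command above. The second helper only prints the command so it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch of a cron wrapper: launch the bulk loader only when data files are ready.
# has_ready_files and bulk_command are hypothetical helpers, not part of dgraph.

# Return 0 if the directory contains at least one .rdf.gz file, 1 otherwise.
has_ready_files() {
    for f in "$1"/*.rdf.gz; do
        # If nothing matches, the glob stays literal and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Print the bulk loader command line for review instead of running it.
bulk_command() {
    # $1 = folder with ready data files, $2 = schema file
    printf 'dgraph bulk -f %s -s %s --format=rdf --xidmap xid --map_shards=3 --reduce_shards=3 --zero=dgraph-dgraph-zero:5080\n' "$1" "$2"
}
```

A cron entry could call has_ready_files on the ready folder and, only on success, run the command that bulk_command prints.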

Bulk uploader log file

Starting Bulk uploader
/dgraph
Checking for files to process in /coldstart/upload/pending_vertices folder
Checking for files to process in /coldstart/upload/pending_edges folder
Moving file /coldstart/upload/pending_edges/1644023377.rdf.gz to /coldstart/upload/ready/1644023377.rdf.gz

I0205 02:30:05.379012     107 init.go:110]

Dgraph version   : v21.03.1
Dgraph codename  : rocket-1
Dgraph SHA-256   : a00b73d583a720aa787171e43b4cb4dbbf75b38e522f66c9943ab2f0263007fe
Commit SHA-1     : ea1cb5f35
Commit timestamp : 2021-06-17 20:38:11 +0530
Branch           : HEAD
Go version       : go1.16.2
jemalloc enabled : true

For Dgraph official documentation, visit https://dgraph.io/docs.
For discussions about Dgraph     , visit http://discuss.dgraph.io.
For fully-managed Dgraph Cloud   , visit https://dgraph.io/cloud.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2021 Dgraph Labs, Inc.


Encrypted input: false; Encrypted output: false
{
        "DataFiles": "/coldstart/upload/ready",
        "DataFormat": "rdf",
        "SchemaFile": "/coldstart/upload/rdf_schema/patient_with_intermediate.rdf",
        "GqlSchemaFile": "",
        "OutDir": "./out",
        "ReplaceOutDir": false,
        "TmpDir": "tmp",
        "NumGoroutines": 2,
        "MapBufSize": 2147483648,
        "PartitionBufSize": 4194304,
        "SkipMapPhase": false,
        "CleanupTmp": true,
        "NumReducers": 1,
        "Version": false,
        "StoreXids": false,
        "ZeroAddr": "dgraph-dgraph-zero:5080",
        "HttpAddr": "localhost:8080",
        "IgnoreErrors": false,
        "CustomTokenizers": "",
        "NewUids": false,
        "ClientDir": "xid",
        "Encrypted": false,
        "EncryptedOut": false,
        "MapShards": 3,
        "ReduceShards": 3,
        "Namespace": 18446744073709551615,
        "EncryptionKey": null,
        "Badger": {
                "Dir": "",
                "ValueDir": "",
                "SyncWrites": false,
                "NumVersionsToKeep": 1,
                "ReadOnly": false,
                "Logger": {},
                "Compression": 1,
                "InMemory": false,
                "MetricsEnabled": true,
                "NumGoroutines": 8,
                "MemTableSize": 67108864,
                "BaseTableSize": 2097152,
                "BaseLevelSize": 10485760,
                "LevelSizeMultiplier": 10,
                "TableSizeMultiplier": 2,
                "MaxLevels": 7,
                "VLogPercentile": 0,
                "ValueThreshold": 1048576,
                "NumMemtables": 5,
                "BlockSize": 4096,
                "BloomFalsePositive": 0.01,
                "BlockCacheSize": 20132659,
                "IndexCacheSize": 46976204,
                "NumLevelZeroTables": 5,
                "NumLevelZeroTablesStall": 15,
                "ValueLogFileSize": 1073741823,
                "ValueLogMaxEntries": 1000000,
                "NumCompactors": 4,
                "CompactL0OnClose": false,
                "LmaxCompaction": false,
                "ZSTDCompressionLevel": 0,
                "VerifyValueChecksum": false,
                "EncryptionKey": "",
                "EncryptionKeyRotationDuration": 864000000000000,
                "BypassLockGuard": false,
                "ChecksumVerificationMode": 0,
                "DetectConflicts": true,
                "NamespaceOffset": -1
        }
}
Connecting to zero at dgraph-dgraph-zero:5080
Predicate "\x00\x00\x00\x00\x00\x00\x00\x00dgraph.drop.op" already exists in schema
Predicate "\x00\x00\x00\x00\x00\x00\x00\x00dgraph.graphql.p_query" already exists in schema
Predicate "\x00\x00\x00\x00\x00\x00\x00\x00dgraph.graphql.schema" already exists in schema
Predicate "\x00\x00\x00\x00\x00\x00\x00\x00dgraph.graphql.xid" already exists in schema
Predicate "\x00\x00\x00\x00\x00\x00\x00\x00xid" already exists in schema
___ Begin jemalloc statistics ___
Version: "5.2.1-0-gea6b3e973b477b8061e0076bb257dbd7f3faa756"
Build-time option settings
  config.cache_oblivious: true
  config.debug: false
  config.fill: true
  config.lazy_lock: false
  config.malloc_conf: "background_thread:true,metadata_thp:auto"
  config.opt_safety_checks: false
  config.prof: true
  config.prof_libgcc: true
  config.prof_libunwind: false
  config.stats: true
  config.utrace: false
  config.xmalloc: false
Run-time option settings
  opt.abort: false
  opt.abort_conf: false
  opt.confirm_conf: false
  opt.retain: true
  opt.dss: "secondary"
  opt.narenas: 32
  opt.percpu_arena: "disabled"
  opt.oversize_threshold: 8388608
  opt.metadata_thp: "auto"
  opt.background_thread: true (background_thread: true)
  opt.dirty_decay_ms: 10000 (arenas.dirty_decay_ms: 10000)
  opt.muzzy_decay_ms: 0 (arenas.muzzy_decay_ms: 0)
  opt.lg_extent_max_active_fit: 6
  opt.junk: "false"
  opt.zero: false
  opt.tcache: true
  opt.lg_tcache_max: 15
  opt.thp: "default"
  opt.prof: false
  opt.prof_prefix: "jeprof"
  opt.prof_active: true (prof.active: false)
  opt.prof_thread_active_init: true (prof.thread_active_init: false)
  opt.lg_prof_sample: 19 (prof.lg_sample: 0)
  opt.prof_accum: false
  opt.lg_prof_interval: -1
  opt.prof_gdump: false
  opt.prof_final: false
  opt.prof_leak: false
  opt.stats_print: false
  opt.stats_print_opts: ""
Profiling settings
  prof.thread_active_init: false
  prof.active: false
  prof.gdump: false
  prof.interval: 0
  prof.lg_sample: 0
Arenas: 33
Quantum size: 16
Page size: 4096
Maximum thread-cached size class: 32768
Number of bin size classes: 36
Number of thread-cache bin size classes: 41
Number of large size classes: 196
Allocated: 88264, active: 126976, metadata: 5647480 (n_thp 0), resident: 5705728, mapped: 16904192, retained: 4067328
Background threads: 3, num_runs: 3, run_interval: 190320333 ns
--- End jemalloc statistics ---
badger 2022/02/05 02:30:05 INFO: All 0 tables opened in 0s
badger 2022/02/05 02:30:05 INFO: Discard stats nextEmptySlot: 0
badger 2022/02/05 02:30:05 INFO: Set nextTxnTs to 0
I0205 02:30:05.471063     107 xidmap.go:145] Loaded up 0 xid to uid mappings
Processing file (1 out of 50): /coldstart/upload/ready/1644023377.rdf.gz
[02:31:57Z] MAP 01m52s nquad_count:33.98M err_count:0.000 nquad_speed:303.3k/sec edge_count:38.73M edge_speed:345.8k/sec jemalloc: 192 MiB
I0205 02:33:08.632923     107 xidmap.go:365] Writing xid map to DB
[02:33:09Z] MAP 03m04s nquad_count:35.37M err_count:0.000 nquad_speed:192.2k/sec edge_count:40.32M edge_speed:219.1k/sec jemalloc: 0 B
badger 2022/02/05 02:33:22 INFO: [3] [E] LOG Compact 0->6 (5, 0 -> 25 tables with 1 splits). [00001 00002 00003 00004 00005 . .] -> [00006 00007 00008 00009 00010 00011 00012 00013 00014 00015 00016 00017 00018 00019 00020 00021 00022 00023 00024 00025 00026 00027 00028 00029 00031 .], took 2.715s
[02:33:22Z] MAP 03m17s nquad_count:35.37M err_count:0.000 nquad_speed:179.5k/sec edge_count:40.32M edge_speed:204.7k/sec jemalloc: 84 MiB
I0205 02:33:28.635704     107 xidmap.go:367] Finished writing xid map to DB
badger 2022/02/05 02:33:28 INFO: Lifetime L0 stalled for: 0s
badger 2022/02/05 02:33:28 INFO:
Level 0 [ ]: NumTables: 05. Size: 42 MiB of 0 B. Score: 1.00->100.00 StaleData: 0 B Target FileSize: 64 MiB
Level Done
Shard tmp/map_output/000 -> Reduce tmp/shards/shard_0/000
Shard tmp/map_output/002 -> Reduce tmp/shards/shard_2/002
Shard tmp/map_output/001 -> Reduce tmp/shards/shard_1/001
badger 2022/02/05 02:33:28 INFO: All 0 tables opened in 0s
badger 2022/02/05 02:33:28 INFO: Discard stats nextEmptySlot: 0
badger 2022/02/05 02:33:28 INFO: Set nextTxnTs to 0
badger 2022/02/05 02:33:28 INFO: All 0 tables opened in 0s
badger 2022/02/05 02:33:28 INFO: Discard stats nextEmptySlot: 0
badger 2022/02/05 02:33:28 INFO: Set nextTxnTs to 0
badger 2022/02/05 02:33:28 INFO: DropAll called. Blocking writes...
badger 2022/02/05 02:33:28 INFO: Writes flushed. Stopping compactions now...
badger 2022/02/05 02:33:29 INFO: Deleted 0 SSTables. Now deleting value logs...
badger 2022/02/05 02:33:29 INFO: Value logs deleted. Creating value log file: 1
badger 2022/02/05 02:33:29 INFO: Deleted 1 value log files. DropAll done.
Num Encoders: 2
[02:33:29Z] REDUCE 03m24s 0.00% edge_count:0.000 edge_speed:0.000/sec plist_count:0.000 plist_speed:0.000/sec. Num Encoding MBs: 0. jemalloc: 320 MiB
[02:33:30Z] REDUCE 03m25s 0.00% edge_count:0.000 edge_speed:0.000/sec plist_count:0.000 plist_speed:0.000/sec. Num Encoding MBs: 518. jemalloc: 1.6 GiB
Final Histogram of buffer sizes:
 -- Histogram:
Min value: 62037129
Max value: 271945775
Count: 4
50p: 536870912.00
75p: 536870912.00
90p: 536870912.00
[33554432, 67108864) 1 25.00% 25.00%
[268435456, 536870912) 3 75.00% 100.00%
 --

[02:33:31Z] REDUCE 03m26s 0.00% edge_count:0.000 edge_speed:0.000/sec plist_count:0.000 plist_speed:0.000/sec. Num Encoding MBs: 834. jemalloc: 2.2 GiB
[02:33:32Z] REDUCE 03m27s 0.15% edge_count:59.96k edge_speed:19.99k/sec plist_count:59.96k plist_speed:19.99k/sec. Num Encoding MBs: 834. jemalloc: 2.3 GiB
[02:33:33Z] REDUCE 03m28s 5.30% edge_count:2.137M edge_speed:534.1k/sec plist_count:929.6k plist_speed:232.4k/sec. Num Encoding MBs: 834. jemalloc: 2.4 GiB
[02:33:34Z] REDUCE 03m29s 7.87% edge_count:3.173M edge_speed:634.6k/sec plist_count:1.966M plist_speed:393.2k/sec. Num Encoding MBs: 834. jemalloc: 2.4 GiB
[02:33:35Z] REDUCE 03m30s 14.58% edge_count:5.878M edge_speed:979.6k/sec plist_count:3.178M plist_speed:529.7k/sec. Num Encoding MBs: 834. jemalloc: 2.4 GiB
[02:33:36Z] REDUCE 03m31s 19.71% edge_count:7.949M edge_speed:1.135M/sec plist_count:3.442M plist_speed:491.7k/sec. Num Encoding MBs: 315. jemalloc: 1.8 GiB
Finishing stream id: 1
[02:33:37Z] REDUCE 03m32s 20.78% edge_count:8.380M edge_speed:1.047M/sec plist_count:3.735M plist_speed:466.8k/sec. Num Encoding MBs: 315. jemalloc: 2.5 GiB
Finishing stream id: 11
Finishing stream id: 12
badger 2022/02/05 02:33:37 INFO: Table created: 2 at level: 6 for stream: 11. Size: 1.0 MiB
badger 2022/02/05 02:33:37 INFO: Table created: 3 at level: 6 for stream: 12. Size: 1.3 MiB
badger 2022/02/05 02:33:38 INFO: Table created: 1 at level: 6 for stream: 1. Size: 36 MiB
Finishing stream id: 2
[02:33:38Z] REDUCE 03m33s 21.18% edge_count:8.541M edge_speed:949.0k/sec plist_count:3.845M plist_speed:427.2k/sec. Num Encoding MBs: 256. jemalloc: 2.8 GiB
badger 2022/02/05 02:33:38 INFO: Table created: 4 at level: 6 for stream: 2. Size: 1.8 MiB
Finishing stream id: 3
Finishing stream id: 4
Finishing stream id: 5
Finishing stream id: 6
Finishing stream id: 7
badger 2022/02/05 02:33:38 INFO: Table created: 8 at level: 6 for stream: 6. Size: 72 KiB
badger 2022/02/05 02:33:38 INFO: Table created: 7 at level: 6 for stream: 5. Size: 87 KiB
badger 2022/02/05 02:33:38 INFO: Table created: 6 at level: 6 for stream: 4. Size: 78 KiB
badger 2022/02/05 02:33:39 INFO: Table created: 9 at level: 6 for stream: 7. Size: 90 KiB
Finishing stream id: 8
Finishing stream id: 9
[02:33:39Z] REDUCE 03m34s 22.06% edge_count:8.896M edge_speed:889.5k/sec plist_count:4.200M plist_speed:420.0k/sec. Num Encoding MBs: 256. jemalloc: 2.8 GiB
badger 2022/02/05 02:33:39 INFO: Table created: 11 at level: 6 for stream: 9. Size: 85 KiB
badger 2022/02/05 02:33:39 INFO: Table created: 10 at level: 6 for stream: 8. Size: 3.7 MiB
badger 2022/02/05 02:33:39 INFO: Table created: 5 at level: 6 for stream: 3. Size: 24 MiB
[02:33:40Z] REDUCE 03m35s 23.95% edge_count:9.657M edge_speed:877.8k/sec plist_count:4.961M plist_speed:451.0k/sec. Num Encoding MBs: 256. jemalloc: 2.8 GiB
[02:33:41Z] REDUCE 03m36s 25.66% edge_count:10.35M edge_speed:862.2k/sec plist_count:5.652M plist_speed:470.9k/sec. Num Encoding MBs: 256. jemalloc: 2.7 GiB
[02:33:42Z] REDUCE 03m37s 27.03% edge_count:10.90M edge_speed:838.5k/sec plist_count:6.205M plist_speed:477.3k/sec. Num Encoding MBs: 256. jemalloc: 2.7 GiB
[02:33:43Z] REDUCE 03m38s 29.05% edge_count:11.71M edge_speed:835.9k/sec plist_count:6.757M plist_speed:482.2k/sec. Num Encoding MBs: 256. jemalloc: 2.8 GiB
Finishing stream id: 10
Finishing stream id: 14
Finishing stream id: 15
[02:33:44Z] REDUCE 03m39s 29.11% edge_count:11.74M edge_speed:782.6k/sec plist_count:6.775M plist_speed:451.7k/sec. Num Encoding MBs: 0. jemalloc: 2.3 GiB
badger 2022/02/05 02:33:44 INFO: Table created: 13 at level: 6 for stream: 14. Size: 91 KiB
badger 2022/02/05 02:33:44 INFO: Table created: 14 at level: 6 for stream: 15. Size: 89 KiB
badger 2022/02/05 02:33:44 INFO: Table created: 12 at level: 6 for stream: 10. Size: 11 MiB
Finishing stream id: 16
badger 2022/02/05 02:33:45 INFO: Table created: 15 at level: 6 for stream: 16. Size: 15 MiB
[02:33:45Z] REDUCE 03m40s 29.11% edge_count:11.74M edge_speed:733.7k/sec plist_count:6.775M plist_speed:423.4k/sec. Num Encoding MBs: 0. jemalloc: 2.1 GiB
Finishing stream id: 17
Finishing stream id: 18
badger 2022/02/05 02:33:45 INFO: Table created: 17 at level: 6 for stream: 18. Size: 692 KiB
badger 2022/02/05 02:33:45 INFO: Table created: 16 at level: 6 for stream: 17. Size: 21 MiB
Writing split lists back to the main DB now
badger 2022/02/05 02:33:45 INFO: copying split keys to main DB Streaming about 0 B of uncompressed data (0 B on disk)
badger 2022/02/05 02:33:45 INFO: Number of ranges found: 1
badger 2022/02/05 02:33:45 INFO: Sent range 0 for iteration: [, ) of size: 0 B
badger 2022/02/05 02:33:45 INFO: copying split keys to main DB Sent data of size 0 B
badger 2022/02/05 02:33:46 INFO: Table created: 18 at level: 6 for stream: 13. Size: 10 MiB
badger 2022/02/05 02:33:46 INFO: Resuming writes
badger 2022/02/05 02:33:46 INFO: All 0 tables opened in 0s
badger 2022/02/05 02:33:46 INFO: Discard stats nextEmptySlot: 0
badger 2022/02/05 02:33:46 INFO: Set nextTxnTs to 0
badger 2022/02/05 02:33:46 INFO: All 0 tables opened in 0s
badger 2022/02/05 02:33:46 INFO: Discard stats nextEmptySlot: 0
badger 2022/02/05 02:33:46 INFO: Set nextTxnTs to 0
badger 2022/02/05 02:33:46 INFO: DropAll called. Blocking writes...
badger 2022/02/05 02:33:46 INFO: Writes flushed. Stopping compactions now...
badger 2022/02/05 02:33:46 INFO: Deleted 0 SSTables. Now deleting value logs...
badger 2022/02/05 02:33:46 INFO: Value logs deleted. Creating value log file: 1
badger 2022/02/05 02:33:46 INFO: Deleted 1 value log files. DropAll done.
Num Encoders: 2
[02:33:46Z] REDUCE 03m41s 29.11% edge_count:11.74M edge_speed:690.5k/sec plist_count:6.775M plist_speed:398.5k/sec. Num Encoding MBs: 0. jemalloc: 1.6 GiB
[02:33:47Z] REDUCE 03m42s 29.11% edge_count:11.74M edge_speed:652.1k/sec plist_count:6.775M plist_speed:376.4k/sec. Num Encoding MBs: 519. jemalloc: 2.9 GiB
Final Histogram of buffer sizes:
 -- Histogram:
Min value: 142021035
Max value: 273380596
Count: 3
50p: 268435456.00
75p: 536870912.00
90p: 536870912.00
[134217728, 268435456) 1 33.33% 33.33%
[268435456, 536870912) 2 66.67% 100.00%
 --

[02:33:48Z] REDUCE 03m43s 29.11% edge_count:11.74M edge_speed:617.8k/sec plist_count:6.775M plist_speed:356.6k/sec. Num Encoding MBs: 655. jemalloc: 3.2 GiB
[02:33:49Z] REDUCE 03m44s 29.11% edge_count:11.74M edge_speed:586.9k/sec plist_count:6.775M plist_speed:338.7k/sec. Num Encoding MBs: 655. jemalloc: 3.0 GiB
[02:33:50Z] REDUCE 03m45s 30.55% edge_count:12.32M edge_speed:586.6k/sec plist_count:7.356M plist_speed:350.3k/sec. Num Encoding MBs: 655. jemalloc: 3.3 GiB
[02:33:51Z] REDUCE 03m46s 34.95% edge_count:14.09M edge_speed:640.6k/sec plist_count:8.468M plist_speed:384.9k/sec. Num Encoding MBs: 655. jemalloc: 3.2 GiB
[02:33:52Z] REDUCE 03m47s 38.12% edge_count:15.37M edge_speed:668.2k/sec plist_count:9.743M plist_speed:423.6k/sec. Num Encoding MBs: 655. jemalloc: 3.2 GiB
[02:33:53Z] REDUCE 03m48s 40.71% edge_count:16.42M edge_speed:684.0k/sec plist_count:10.79M plist_speed:449.6k/sec. Num Encoding MBs: 655. jemalloc: 3.0 GiB
[02:33:54Z] REDUCE 03m49s 49.54% edge_count:19.98M edge_speed:799.1k/sec plist_count:11.58M plist_speed:463.2k/sec. Num Encoding MBs: 394. jemalloc: 2.6 GiB
[02:33:55Z] REDUCE 03m50s 50.29% edge_count:20.28M edge_speed:779.9k/sec plist_count:11.84M plist_speed:455.5k/sec. Num Encoding MBs: 135. jemalloc: 2.2 GiB
Finishing stream id: 21
[02:33:56Z] REDUCE 03m51s 54.56% edge_count:22.00M edge_speed:814.8k/sec plist_count:12.05M plist_speed:446.5k/sec. Num Encoding MBs: 0. jemalloc: 2.6 GiB
badger 2022/02/05 02:33:56 INFO: Table created: 1 at level: 6 for stream: 21. Size: 11 MiB
Finishing stream id: 24
Finishing stream id: 27
Finishing stream id: 28
Finishing stream id: 29
badger 2022/02/05 02:33:57 INFO: Table created: 4 at level: 6 for stream: 28. Size: 59 KiB
badger 2022/02/05 02:33:57 INFO: Table created: 5 at level: 6 for stream: 29. Size: 69 KiB
badger 2022/02/05 02:33:57 INFO: Table created: 3 at level: 6 for stream: 27. Size: 2.6 MiB
[02:33:57Z] REDUCE 03m52s 54.56% edge_count:22.00M edge_speed:785.6k/sec plist_count:12.05M plist_speed:430.5k/sec. Num Encoding MBs: 0. jemalloc: 2.4 GiB
badger 2022/02/05 02:33:57 INFO: Table created: 2 at level: 6 for stream: 24. Size: 27 MiB
Finishing stream id: 20
Finishing stream id: 22
[02:33:58Z] REDUCE 03m53s 54.56% edge_count:22.00M edge_speed:758.6k/sec plist_count:12.05M plist_speed:415.7k/sec. Num Encoding MBs: 0. jemalloc: 2.4 GiB
badger 2022/02/05 02:33:58 INFO: Table created: 7 at level: 6 for stream: 22. Size: 1.7 MiB
badger 2022/02/05 02:33:58 INFO: Table created: 6 at level: 6 for stream: 20. Size: 12 MiB
Finishing stream id: 23
Finishing stream id: 25
Finishing stream id: 26
Finishing stream id: 30
Finishing stream id: 31
badger 2022/02/05 02:33:59 INFO: Table created: 8 at level: 6 for stream: 23. Size: 21 MiB
badger 2022/02/05 02:33:59 INFO: Table created: 9 at level: 6 for stream: 25. Size: 2.9 MiB
badger 2022/02/05 02:33:59 INFO: Table created: 11 at level: 6 for stream: 30. Size: 84 KiB
badger 2022/02/05 02:33:59 INFO: Table created: 12 at level: 6 for stream: 31. Size: 9.5 KiB
badger 2022/02/05 02:33:59 INFO: Table created: 10 at level: 6 for stream: 26. Size: 254 KiB
Finishing stream id: 32
Finishing stream id: 33
Finishing stream id: 34
Finishing stream id: 35
Finishing stream id: 36
Writing split lists back to the main DB now
badger 2022/02/05 02:33:59 INFO: copying split keys to main DB Streaming about 0 B of uncompressed data (0 B on disk)
badger 2022/02/05 02:33:59 INFO: Number of ranges found: 2
badger 2022/02/05 02:33:59 INFO: Sent range 0 for iteration: [, 040000000000000000001c456e636f756e7465725479706554636f64652e656e636f756e74657200000000000483172b0000000000000001fffffffffffffff6) of size: 0 B
badger 2022/02/05 02:33:59 INFO: Sent range 1 for iteration: [040000000000000000001c456e636f756e7465725479706554636f64652e656e636f756e74657200000000000483172b0000000000000001fffffffffffffff6, ) of size: 0 B
badger 2022/02/05 02:33:59 INFO: copying split keys to main DB Sent data of size 1.1 MiB
[02:33:59Z] REDUCE 03m54s 54.56% edge_count:22.00M edge_speed:733.3k/sec plist_count:12.05M plist_speed:401.8k/sec. Num Encoding MBs: 0. jemalloc: 2.3 GiB
badger 2022/02/05 02:33:59 INFO: Table created: 14 at level: 6 for stream: 33. Size: 99 KiB
badger 2022/02/05 02:33:59 INFO: Table created: 15 at level: 6 for stream: 34. Size: 1.5 MiB
badger 2022/02/05 02:33:59 INFO: Table created: 16 at level: 6 for stream: 35. Size: 1.9 MiB
badger 2022/02/05 02:33:59 INFO: Table created: 19 at level: 6 for stream: 37. Size: 87 KiB
badger 2022/02/05 02:33:59 INFO: Table created: 18 at level: 6 for stream: 40. Size: 515 KiB
badger 2022/02/05 02:33:59 INFO: Table created: 20 at level: 6 for stream: 39. Size: 172 KiB
badger 2022/02/05 02:33:59 INFO: Table created: 13 at level: 6 for stream: 32. Size: 5.3 MiB
badger 2022/02/05 02:33:59 INFO: Table created: 17 at level: 6 for stream: 36. Size: 11 KiB
badger 2022/02/05 02:33:59 INFO: Resuming writes
badger 2022/02/05 02:33:59 INFO: All 0 tables opened in 0s
badger 2022/02/05 02:33:59 INFO: Discard stats nextEmptySlot: 0
badger 2022/02/05 02:33:59 INFO: Set nextTxnTs to 0
badger 2022/02/05 02:33:59 INFO: All 0 tables opened in 0s
badger 2022/02/05 02:33:59 INFO: Discard stats nextEmptySlot: 0
badger 2022/02/05 02:33:59 INFO: Set nextTxnTs to 0
badger 2022/02/05 02:33:59 INFO: DropAll called. Blocking writes...
badger 2022/02/05 02:33:59 INFO: Writes flushed. Stopping compactions now...
badger 2022/02/05 02:33:59 INFO: Deleted 0 SSTables. Now deleting value logs...
badger 2022/02/05 02:33:59 INFO: Value logs deleted. Creating value log file: 1
badger 2022/02/05 02:33:59 INFO: Deleted 1 value log files. DropAll done.
Num Encoders: 2
[02:34:00Z] REDUCE 03m55s 54.56% edge_count:22.00M edge_speed:709.6k/sec plist_count:12.05M plist_speed:388.8k/sec. Num Encoding MBs: 259. jemalloc: 2.5 GiB
[02:34:01Z] REDUCE 03m56s 54.56% edge_count:22.00M edge_speed:687.5k/sec plist_count:12.05M plist_speed:376.7k/sec. Num Encoding MBs: 517. jemalloc: 3.5 GiB
[02:34:02Z] REDUCE 03m57s 54.56% edge_count:22.00M edge_speed:666.6k/sec plist_count:12.05M plist_speed:365.3k/sec. Num Encoding MBs: 1033. jemalloc: 4.0 GiB
Final Histogram of buffer sizes:
 -- Histogram:
Min value: 27462396
Max value: 273001538
Count: 6
50p: 536870912.00
75p: 536870912.00
90p: 536870912.00
[16777216, 33554432) 1 16.67% 16.67%
[268435456, 536870912) 5 83.33% 100.00%
 --

[02:34:03Z] REDUCE 03m58s 54.56% edge_count:22.00M edge_speed:647.0k/sec plist_count:12.05M plist_speed:354.5k/sec. Num Encoding MBs: 1320. jemalloc: 4.4 GiB
[02:34:04Z] REDUCE 03m59s 58.39% edge_count:23.54M edge_speed:672.7k/sec plist_count:13.17M plist_speed:376.3k/sec. Num Encoding MBs: 1320. jemalloc: 4.5 GiB
[02:34:05Z] REDUCE 04m00s 61.05% edge_count:24.62M edge_speed:683.8k/sec plist_count:14.24M plist_speed:395.6k/sec. Num Encoding MBs: 1320. jemalloc: 4.5 GiB
[02:34:06Z] REDUCE 04m01s 64.43% edge_count:25.98M edge_speed:702.1k/sec plist_count:15.60M plist_speed:421.7k/sec. Num Encoding MBs: 1320. jemalloc: 4.4 GiB
[02:34:07Z] REDUCE 04m02s 70.94% edge_count:28.60M edge_speed:752.7k/sec plist_count:16.62M plist_speed:437.4k/sec. Num Encoding MBs: 1062. jemalloc: 4.0 GiB
[02:34:08Z] REDUCE 04m03s 72.53% edge_count:29.24M edge_speed:749.8k/sec plist_count:17.26M plist_speed:442.6k/sec. Num Encoding MBs: 1062. jemalloc: 3.9 GiB
[02:34:09Z] REDUCE 04m04s 74.02% edge_count:29.85M edge_speed:746.1k/sec plist_count:17.86M plist_speed:446.6k/sec. Num Encoding MBs: 1062. jemalloc: 4.2 GiB
[02:34:10Z] REDUCE 04m05s 77.45% edge_count:31.23M edge_speed:761.7k/sec plist_count:18.10M plist_speed:441.5k/sec. Num Encoding MBs: 802. jemalloc: 3.5 GiB
Finishing stream id: 39
badger 2022/02/05 02:34:11 INFO: Table created: 1 at level: 6 for stream: 39. Size: 8.4 MiB
Finishing stream id: 42
[02:34:11Z] REDUCE 04m06s 79.69% edge_count:32.13M edge_speed:765.0k/sec plist_count:18.23M plist_speed:434.1k/sec. Num Encoding MBs: 802. jemalloc: 4.3 GiB
badger 2022/02/05 02:34:11 INFO: Table created: 2 at level: 6 for stream: 42. Size: 12 MiB
Finishing stream id: 43
badger 2022/02/05 02:34:12 INFO: Table created: 3 at level: 6 for stream: 43. Size: 21 MiB
[02:34:12Z] REDUCE 04m07s 81.99% edge_count:33.06M edge_speed:768.8k/sec plist_count:18.37M plist_speed:427.1k/sec. Num Encoding MBs: 802. jemalloc: 3.9 GiB
[02:34:13Z] REDUCE 04m08s 84.25% edge_count:33.97M edge_speed:772.0k/sec plist_count:18.50M plist_speed:420.4k/sec. Num Encoding MBs: 545. jemalloc: 3.4 GiB
Finishing stream id: 40
badger 2022/02/05 02:34:13 INFO: Table created: 4 at level: 6 for stream: 40. Size: 16 MiB
[02:34:14Z] REDUCE 04m09s 85.96% edge_count:34.66M edge_speed:770.2k/sec plist_count:18.60M plist_speed:413.3k/sec. Num Encoding MBs: 545. jemalloc: 3.0 GiB
Finishing stream id: 41
Finishing stream id: 44
Finishing stream id: 45
Finishing stream id: 46
badger 2022/02/05 02:34:14 INFO: Table created: 7 at level: 6 for stream: 45. Size: 112 KiB
badger 2022/02/05 02:34:14 INFO: Table created: 6 at level: 6 for stream: 44. Size: 78 KiB
badger 2022/02/05 02:34:14 INFO: Table created: 8 at level: 6 for stream: 46. Size: 90 KiB
[02:34:15Z] REDUCE 04m10s 88.00% edge_count:35.48M edge_speed:771.3k/sec plist_count:19.02M plist_speed:413.5k/sec. Num Encoding MBs: 545. jemalloc: 3.2 GiB
badger 2022/02/05 02:34:15 INFO: Table created: 5 at level: 6 for stream: 41. Size: 41 MiB
[02:34:16Z] REDUCE 04m11s 90.65% edge_count:36.55M edge_speed:777.6k/sec plist_count:20.09M plist_speed:427.3k/sec. Num Encoding MBs: 545. jemalloc: 3.2 GiB
[02:34:17Z] REDUCE 04m12s 92.87% edge_count:37.45M edge_speed:780.1k/sec plist_count:20.98M plist_speed:437.2k/sec. Num Encoding MBs: 286. jemalloc: 2.7 GiB
[02:34:18Z] REDUCE 04m13s 100.00% edge_count:40.32M edge_speed:822.9k/sec plist_count:21.36M plist_speed:436.0k/sec. Num Encoding MBs: 0. jemalloc: 2.3 GiB
Finishing stream id: 47
Finishing stream id: 48
Finishing stream id: 49
Finishing stream id: 50
badger 2022/02/05 02:34:18 INFO: Table created: 10 at level: 6 for stream: 48. Size: 129 KiB
badger 2022/02/05 02:34:18 INFO: Table created: 11 at level: 6 for stream: 49. Size: 496 KiB
badger 2022/02/05 02:34:18 INFO: Table created: 12 at level: 6 for stream: 50. Size: 91 KiB
Finishing stream id: 51
[02:34:19Z] REDUCE 04m14s 100.00% edge_count:40.32M edge_speed:806.4k/sec plist_count:21.36M plist_speed:427.3k/sec. Num Encoding MBs: 0. jemalloc: 2.7 GiB
Finishing stream id: 52
Finishing stream id: 54
badger 2022/02/05 02:34:20 INFO: Table created: 13 at level: 6 for stream: 51. Size: 20 MiB
badger 2022/02/05 02:34:20 INFO: Table created: 9 at level: 6 for stream: 47. Size: 60 MiB
badger 2022/02/05 02:34:20 INFO: Table created: 15 at level: 6 for stream: 54. Size: 88 KiB
Finishing stream id: 55
Writing split lists back to the main DB now
badger 2022/02/05 02:34:20 INFO: copying split keys to main DB Streaming about 0 B of uncompressed data (0 B on disk)
badger 2022/02/05 02:34:20 INFO: Number of ranges found: 1
badger 2022/02/05 02:34:20 INFO: Sent range 0 for iteration: [, ) of size: 0 B
badger 2022/02/05 02:34:20 INFO: copying split keys to main DB Sent data of size 0 B
[02:34:20Z] REDUCE 04m15s 100.00% edge_count:40.32M edge_speed:790.6k/sec plist_count:21.36M plist_speed:418.9k/sec. Num Encoding MBs: 0. jemalloc: 1.8 GiB
badger 2022/02/05 02:34:20 INFO: Table created: 17 at level: 6 for stream: 53. Size: 2.5 MiB
badger 2022/02/05 02:34:20 INFO: Table created: 16 at level: 6 for stream: 55. Size: 3.6 MiB
badger 2022/02/05 02:34:20 INFO: Table created: 14 at level: 6 for stream: 52. Size: 30 MiB
badger 2022/02/05 02:34:20 INFO: Resuming writes
badger 2022/02/05 02:34:20 INFO: Lifetime L0 stalled for: 0s
badger 2022/02/05 02:34:20 INFO:
Level 0 [ ]: NumTables: 01. Size: 2.7 KiB of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB
Level 1 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 2 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 3 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 4 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 5 [B]: NumTables: 00. Size: 0 B of 13 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 4.0 MiB
Level 6 [ ]: NumTables: 18. Size: 126 MiB of 126 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 8.0 MiB
Level Done
badger 2022/02/05 02:34:20 INFO: Lifetime L0 stalled for: 0s
badger 2022/02/05 02:34:20 INFO:
Level 0 [ ]: NumTables: 01. Size: 2.7 KiB of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB

Level Done
badger 2022/02/05 02:34:20 INFO: Lifetime L0 stalled for: 0s
badger 2022/02/05 02:34:21 INFO:
Level 0 [ ]: NumTables: 01. Size: 2.8 KiB of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB

Level Done
badger 2022/02/05 02:34:21 INFO: Lifetime L0 stalled for: 0s
badger 2022/02/05 02:34:21 INFO:
Level 0 [ ]: NumTables: 00. Size: 0 B of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB

Level Done
badger 2022/02/05 02:34:21 INFO: Lifetime L0 stalled for: 0s
badger 2022/02/05 02:34:21 INFO:
Level 0 [ ]: NumTables: 01. Size: 1.1 MiB of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB

Level Done
badger 2022/02/05 02:34:21 INFO: Lifetime L0 stalled for: 0s
badger 2022/02/05 02:34:21 INFO:
Level 0 [ ]: NumTables: 00. Size: 0 B of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB

Level Done
[02:34:21Z] REDUCE 04m15s 100.00% edge_count:40.32M edge_speed:779.5k/sec plist_count:21.36M plist_speed:413.0k/sec. Num Encoding MBs: 0. jemalloc: 0 B
Total: 04m15s

We now see our /out directory; we had to specify its location explicitly. Below is our command:

dgraph bulk -f ${files_in_ready_state} -s ${schemaFile} --format=rdf --xidmap xid --store_xids --out /coldstart/out --map_shards=3 --reduce_shards=3 --zero=dgraph-dgraph-zero:5080
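
To confirm that a run actually left its output behind, a small check along these lines may help. It is only a sketch and assumes the bulk loader writes one p directory per reduce shard under the output folder (out/0/p, out/1/p, and so on); check_bulk_output is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: confirm the output folder holds one p directory per reduce shard.
# Assumes the <out>/<shard>/p layout; check_bulk_output is a hypothetical helper.

check_bulk_output() {
    # $1 = folder passed to --out, $2 = value passed to --reduce_shards
    i=0
    while [ "$i" -lt "$2" ]; do
        if [ ! -d "$1/$i/p" ]; then
            echo "missing: $1/$i/p"
            return 1
        fi
        i=$((i + 1))
    done
    echo "ok: $2 shard directories under $1"
}
```

For the command above this would be called as check_bulk_output /coldstart/out 3 right after the loader exits, while the job's container is still alive.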

I verified multiple times: unless the --out parameter is specified, the /out directory isn’t created.
I’m using v21.03.1.

Can someone confirm this?

We resolved the mystery around the missing /out folder. Updating in case anyone falls into the same trap.

  • We were using a cronjob to run the bulk loader
    – We could have used ‘kubectl exec -it’ to get into the Zero pod and launched the bulk loader manually
    – Running the bulk uploader as a CronJob for big data sets works out to be very reliable
    – Just one caution: the /out folder will be deleted once the cronjob finishes execution
  • The fix was to supply an external folder to the bulk loader with the --out parameter. In our case, we used an Azure_File volume mount
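
For anyone hitting the same thing: the config dump in the log shows "OutDir": "./out", a relative path, so the output lands under the working directory of whichever container runs the command. The sketch below only illustrates that path resolution. It assumes /dgraph was the working directory (as the log suggests), and resolve_out_dir is a hypothetical helper, not part of dgraph.

```shell
#!/bin/sh
# Sketch: show where a given --out value ends up for a given working directory.
# resolve_out_dir is a hypothetical helper used only for illustration.

resolve_out_dir() {
    # $1 = working directory of the bulk loader process, $2 = --out value
    case "$2" in
        /*)  printf '%s\n' "$2" ;;                    # absolute path: used as given
        ./*) printf '%s/%s\n' "${1%/}" "${2#./}" ;;   # drop the leading ./
        *)   printf '%s/%s\n' "${1%/}" "$2" ;;        # plain relative path
    esac
}

resolve_out_dir /dgraph ./out            # prints /dgraph/out (inside the job's own container)
resolve_out_dir /dgraph /coldstart/out   # prints /coldstart/out (the mounted volume)
```

With the default, the data sits on the job container's own filesystem and goes away with it; an absolute path on a mounted volume survives the job.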