Bulk load failed with 10000-thread limit

Hello,
I encountered the following exception during a bulk load:

REDUCE 16h10m38s [49.63%] edge_count:10.86G edge_speed:372.7k/sec plist_count:2.717G plist_speed:93.27k/sec
runtime: program exceeds 10000-thread limit
fatal error: thread exhaustion

runtime stack:
runtime.throw(0x1312620, 0x11)
        /usr/local/go/src/runtime/panic.go:605 +0x95
runtime.checkmcount()
        /usr/local/go/src/runtime/proc.go:525 +0xa4
runtime.mcommoninit(0xc44e512380)
        /usr/local/go/src/runtime/proc.go:545 +0x9f
runtime.allocm(0xc420407900, 0x0, 0xca00000000)
        /usr/local/go/src/runtime/proc.go:1344 +0x99
runtime.newm(0x0, 0xc420407900)
        /usr/local/go/src/runtime/proc.go:1637 +0x39
runtime.startm(0xc420407900, 0x1a84300)
        /usr/local/go/src/runtime/proc.go:1728 +0x13f
runtime.handoffp(0xc420407900)
        /usr/local/go/src/runtime/proc.go:1755 +0x55
runtime.retake(0x1ba4f743bcbce6, 0xd0000002e)
        /usr/local/go/src/runtime/proc.go:3985 +0x135
runtime.sysmon()
        /usr/local/go/src/runtime/proc.go:3913 +0x1fe
runtime.mstart2()
        /usr/local/go/src/runtime/proc.go:1182 +0x11e
runtime.mstart()
        /usr/local/go/src/runtime/proc.go:1152 +0x64

goroutine 1 [runnable]:
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*reducer).run(0xc59b1d0340)
        /home/pawan/go/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/reduce.go:38 +0x1e9
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*loader).reduceStage(0xc4203d8ab0)
        /home/pawan/go/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/loader.go:294 +0x12c
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.run()
        /home/pawan/go/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/run.go:163 +0xa7f
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.init.0.func1(0xc4200c2fc0, 0xc42012cb00, 0x0, 0x10)
        /home/pawan/go/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/run.go:44 +0x52
github.com/dgraph-io/dgraph/vendor/github.com/spf13/cobra.(*Command).execute(0xc4200c2fc0, 0xc42012c900, 0x10, 0x10, 0xc4200c2fc0, 0xc42012c900)

Why would there be more than 10000 threads?
There is no SSD for the bulk load, and because the RDF file is as large as 461G, it took more than 16 hours.

Thank you very much.

@chen Can you please share the goroutine dump? Unless goroutines are getting blocked, this many threads shouldn’t be created.

It seems I missed the goroutine dump.

I have now tried with a quarter of the original RDF data.
I only have 3 servers, so for the bulk load the parameters are --reduce_shards 3 --shufflers 3 --map_shards 3

I do not know whether reduce_shards can be as large as 40, and if so, how to start the servers with 40 shards.
My plan is to scp the 3 shards to different machines and start the servers one by one to form a 3-server cluster.

Please help me out.

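Your HDD is slow in terms of disk seeks. Each goroutine blocked in a disk read holds an OS thread, so with slow seeks the thread count piles up. That’s why Go is hitting this limit. You can do a couple of things: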

  1. Raise the thread limit above Go's default of 10,000. You can add this call to the main function of the bulk loader code:

https://golang.org/pkg/runtime/debug/#SetMaxThreads

  2. Modify the bulk loader codebase to set a higher ValueLogThreshold, say 2048 bytes, so that most of your values are stored in the LSM tree and not in the value log.
  3. You could
