Bulk loader fails: imported data cannot be retrieved

One million nodes have been imported and cannot be retrieved!

version:image

Cluster date:

But I can only retrieve what I just created

Please share the steps you took.

Steps:

  1. Start the zero node and run the bulk loader

  2. Copy the p directory from the out directory to the first alpha node

  3. Start the first alpha node, which prints the following log:

Creating snapshot at index

  4. Start the other two alpha nodes; the first node then prints the following log twice:

Stream snapshot: OK

Have you started from scratch?

What configs did you use for the bulk load?

That’s normal.

Yes, I deleted all the files in the w folder before starting the zero node.

I am loading an rdf.gz file that contains about one million nodes. The schema is:

```
<type>:string @index(hash) .
<source>:[uid] @reverse .
<relation>:[uid] @reverse .
<namespace>:string @index(hash) .
<source_id>:default .
<tenant_id>:string @index(hash) .
<create_time>:default .
<dgraph.cors>:[string] @index(exact) @upsert .
<dgraph.type>:[string] @index(exact) .
<identity_id>:string @index(hash) .
<source_type>:default .
<dgraph.drop.op>:string .
<dgraph.graphql.xid>:string @index(exact) @upsert .
<dgraph.graphql.schema>:string .
<dgraph.graphql.p_query>:string .
<dgraph.graphql.p_sha256hash>:string @index(exact) .
<dgraph.graphql.schema_history>:string .
<dgraph.graphql.schema_created_at>:datetime .
type <dgraph.graphql> {
	dgraph.graphql.schema
	dgraph.graphql.xid
}
type <dgraph.graphql.history> {
	dgraph.graphql.schema_history
	dgraph.graphql.schema_created_at
}
type <dgraph.graphql.persisted_query> {
	dgraph.graphql.p_query
	dgraph.graphql.p_sha256hash
}
```

After a while, only the first alpha node could serve queries; the other two still could not. I inspected the p folder on the other two nodes, and it contained no imported data.

In the end, I copied the directory generated by the bulk loader directly to each alpha node, and after that the data could be retrieved.

This is necessary when using the bulk loader.
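For anyone who wants to script that copy step, here is a minimal Python sketch. It assumes the bulk loader output sits in a local directory and that every alpha's data directory is reachable as a local path (for example through mounted volumes). The function name `distribute_p_dir` and the paths in the usage comment are hypothetical, not part of Dgraph. Run it before starting any alpha node.

```python
import shutil
from pathlib import Path


def distribute_p_dir(bulk_p_dir, alpha_dirs):
    """Copy the bulk loader's p directory into every alpha's data directory.

    bulk_p_dir: path to the p directory produced by the bulk loader.
    alpha_dirs: data directories of all alphas in the group (hypothetical layout).
    Returns the list of destination p directories that were written.
    """
    src = Path(bulk_p_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"bulk loader output not found: {src}")

    copied = []
    for alpha_dir in alpha_dirs:
        dest = Path(alpha_dir) / "p"
        if dest.exists():
            # Remove any leftover p directory so old and new data do not mix.
            shutil.rmtree(dest)
        shutil.copytree(src, dest)
        copied.append(dest)
    return copied


# Example call (paths are assumptions; adjust them to your own layout):
# distribute_p_dir("out/0/p", ["/data/alpha1", "/data/alpha2", "/data/alpha3"])
```

The sketch deliberately replaces any existing p directory on each alpha, matching the "start from scratch" advice earlier in the thread.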

Well, since I don’t have a large amount of data, I used the "for small datasets" method.

Reference: https://dgraph.io/docs/deploy/fast-data-loading/bulk-loader/