Bulk loader fail

musiciansLyf · January 5, 2021, 12:26pm

One million nodes have been imported and cannot be retrieved!

musiciansLyf · January 5, 2021, 12:28pm

version:

musiciansLyf · January 5, 2021, 12:29pm

Cluster date:

musiciansLyf · January 5, 2021, 12:30pm

But I can only retrieve the data I created after the import.

MichelDiz · January 5, 2021, 4:11pm

Please, share the steps taken.

musiciansLyf · January 6, 2021, 5:41am

Steps:

1. Start the Zero node and load the data with the bulk loader.
2. Copy the p directory from the out directory to the first Alpha node.
3. Start the first Alpha node, which prints the following log:

Creating snapshot at index

4. Start the other two Alpha nodes; the first node then prints the following log twice:

Stream snapshot: OK

MichelDiz · January 6, 2021, 5:45am

Have you started from scratch?

What configs did you use in the bulk load?

That’s normal.

musiciansLyf · January 6, 2021, 6:19am

Yes, I deleted all the files in the w folder before starting the Zero node.

I am using an rdf.gz file, from which about one million nodes are loaded. The schema is:

```
<type>:string @index(hash) .
<source>:[uid] @reverse .
<relation>:[uid] @reverse .
<namespace>:string @index(hash) .
<source_id>:default .
<tenant_id>:string @index(hash) .
<create_time>:default .
<dgraph.cors>:[string] @index(exact) @upsert .
<dgraph.type>:[string] @index(exact) .
<identity_id>:string @index(hash) .
<source_type>:default .
<dgraph.drop.op>:string .
<dgraph.graphql.xid>:string @index(exact) @upsert .
<dgraph.graphql.schema>:string .
<dgraph.graphql.p_query>:string .
<dgraph.graphql.p_sha256hash>:string @index(exact) .
<dgraph.graphql.schema_history>:string .
<dgraph.graphql.schema_created_at>:datetime .
type <dgraph.graphql> {
	dgraph.graphql.schema
	dgraph.graphql.xid
}
type <dgraph.graphql.history> {
	dgraph.graphql.schema_history
	dgraph.graphql.schema_created_at
}
type <dgraph.graphql.persisted_query> {
	dgraph.graphql.p_query
	dgraph.graphql.p_sha256hash
}
```

musiciansLyf · January 6, 2021, 6:23am

After a while, only the first Alpha node can retrieve the imported data; the other nodes still can't. I inspected the p folder of the other two nodes, and it contained no imported data.
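A check like the one described above can be scripted by comparing the on-disk size of each Alpha's p directory. This is only a rough sketch: the function names are made up for illustration, and it assumes every Alpha's data directory is reachable from the machine running the script.

```python
import os


def dir_size_bytes(path):
    """Return the total size of all files under path (0 if path is missing)."""
    total = 0
    # os.walk yields nothing for a non-existent directory, so the result is 0.
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def report_p_dirs(p_dirs):
    """Map each p directory path to its total size in bytes."""
    return {p: dir_size_bytes(p) for p in p_dirs}
```

A p directory that is far smaller than the one produced by the bulk loader is a hint that the imported data never reached that node.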

musiciansLyf · January 6, 2021, 6:29am

In the end, I copied the generated p directory directly to each Alpha node, and after that the data could be retrieved from all of them.
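For anyone repeating this, the copy step can be sketched as a small script. It is a minimal illustration, not an official tool: `copy_p_to_alphas` and the directory layout are hypothetical, and it assumes the Alpha data directories are local paths and that no Alpha has been started yet.

```python
import shutil
from pathlib import Path


def copy_p_to_alphas(bulk_p_dir, alpha_dirs):
    """Copy the bulk loader's p directory into each Alpha data directory.

    bulk_p_dir: the p directory produced by the bulk loader.
    alpha_dirs: one data directory per Alpha replica (hypothetical layout).
    Returns the list of destination p directories that were created.
    """
    src = Path(bulk_p_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"bulk output not found: {src}")
    copied = []
    for alpha_dir in alpha_dirs:
        dest = Path(alpha_dir) / "p"
        # Refuse to overwrite an existing p directory; a stale one should be
        # removed deliberately by the operator, not silently by this script.
        if dest.exists():
            raise FileExistsError(f"{dest} already exists")
        shutil.copytree(src, dest)
        copied.append(dest)
    return copied
```

With a single output shard this might be called as `copy_p_to_alphas("out/0/p", ["alpha1", "alpha2", "alpha3"])`, where the three Alpha directory names are placeholders.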

Valdanito · January 6, 2021, 7:49am

This is necessary when using the bulk loader.

musiciansLyf · January 6, 2021, 8:19am

Well, because I don't have a large amount of data, I used the "for small datasets" method.

Reference: https://dgraph.io/docs/deploy/fast-data-loading/bulk-loader/
