Bulk upload command

porsche · February 4, 2022, 4:07am

I want to confirm that our bulk upload process is correct. The steps are below.

Our cluster config:
Zeros: 3 (48 GB, 1 X 1.98 SSD disk)
Alpha: 9 (56 GB, 2 X 1.98 SSD disk)
Groups: 3

All the servers have one extra managed file disk mounted (ReadWriteMany) for file copy.

Prepare the RDF data files and schema
Bring up the Zeros
Block the Alphas with an initContainer flag

Launch the bulk uploader from one of the Zeros

dgraph bulk -f /coldstart/upload/pending -s /coldstart/upload/rdf-schema/my_schema.rdf --format=rdf --store_xids --xidmap xid --map_shards=3 --reduce_shards=3 --http localhost:8000 --zero=localhost:5080
 
parameters and flags
----------------------------
 RDF data files location: /coldstart/upload/pending
 Schema file: /coldstart/upload/rdf-schema/my_schema.rdf 
 format: rdf
 --store_xids    (is this required to store the xid?)
 --xidmap xid    (is this the attribute name to store?)
 --map_shards=3 
 --reduce_shards=3 
 --http localhost:8000 
 --zero=localhost:5080

Remarks
------------
We will launch the bulk uploader multiple times until all the data files are uploaded.

Below is a typical type for our objects:

 type Student {
  studentId: String! @id
  courses: [Course] @hasInverse(field: student)
  xid: String!  @search(by: [hash])
 }

Below is a sample of our RDF data:

 <_:my.org/Student/10101/Course/201/Event/1> <Course.eventId> "1" .
 <_:my.org/Student/10101/Course/201/Event/1> <Course.timestamp> "2022-01-01T00:00:02.298240" .
 <_:my.org/Student/10101/Course/201/Event/1> <Course.student> <_:my.org/Student/10101> .
 <_:my.org/Student/10101> <Student.studentId> "10101" . 
 <_:my.org/Student/10101> <Student.courses> <_:my.org/Student/10101/Course/201/Event/1> .
 <_:my.org/Student/10101/Course/201/Event/1> <Course.codeId> <_:my.org/CourseTcode/201> .
 <_:my.org/CourseTcode/201> <CourseTcode.course> <_:my.org/Student/10101/Course/201/Event/1> .
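
The naming pattern in the sample above (one node label per student, per course event, and per course code) can be generated mechanically. The following is a rough Python sketch under that assumption. The function name `course_event_nquads` and its parameters are hypothetical, and values are interpolated without escaping, so it assumes identifiers contain no quotes or backslashes.

```python
def course_event_nquads(student_id, course_code, event_id, timestamp,
                        org="my.org"):
    """Build the seven RDF lines for one course event, mirroring the sample.

    Hypothetical helper; node labels follow the pattern used in this post.
    """
    student = f"<_:{org}/Student/{student_id}>"
    event = f"<_:{org}/Student/{student_id}/Course/{course_code}/Event/{event_id}>"
    tcode = f"<_:{org}/CourseTcode/{course_code}>"
    return [
        f'{event} <Course.eventId> "{event_id}" .',
        f'{event} <Course.timestamp> "{timestamp}" .',
        f'{event} <Course.student> {student} .',
        f'{student} <Student.studentId> "{student_id}" .',
        f'{student} <Student.courses> {event} .',
        f'{event} <Course.codeId> {tcode} .',
        f'{tcode} <CourseTcode.course> {event} .',
    ]


lines = course_event_nquads("10101", "201", "1", "2022-01-01T00:00:02.298240")
print("\n".join(lines))
```

Generating both directions of each relationship in one place (for example `Course.student` and `Student.courses`) keeps the pairs from drifting apart across data files.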
