Bulk upload command

I wanted to make sure our bulk upload process is correct. Below are the steps:

Our cluster config:
Zeros: 3 (48 GB RAM, 1 × 1.98 SSD disk)
Alphas: 9 (56 GB RAM, 2 × 1.98 SSD disks)
Groups: 3

All the servers also mount one extra managed file disk (ReadWriteMany) for copying the data files.
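
As a quick sanity check, the shared volume can be verified from any pod; the pod name below is a placeholder, and we assume the volume is mounted at /coldstart as in the paths further down:

    # Hypothetical check that the ReadWriteMany volume is visible from a pod.
    kubectl exec dgraph-alpha-0 -- df -h /coldstart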

  1. Prepare the RDF data files and schema
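
    A sketch of the staging layout we assume on the shared volume (file names are illustrative; the bulk loader accepts plain or gzipped N-Quad files):

     ls /coldstart/upload/pending
     # students-000.rdf.gz  students-001.rdf.gz  ...
     ls /coldstart/upload/rdf-schema
     # my_schema.rdf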

  2. Bring up the Zeros
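
    A minimal sketch of what each Zero's startup could look like (hostnames are placeholders; the --raft "idx=..." superflag is the v21.03+ syntax, older releases use --idx instead):

     # The first Zero bootstraps the cluster; the others join it via --peer.
     dgraph zero --my=zero-0:5080 --replicas=3 --raft="idx=1"
     dgraph zero --my=zero-1:5080 --replicas=3 --raft="idx=2" --peer=zero-0:5080
     dgraph zero --my=zero-2:5080 --replicas=3 --raft="idx=3" --peer=zero-0:5080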

  3. Block the Alphas with an initContainer
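
    The blocking initContainer is typically just a wait loop. A sketch of its command, assuming the sentinel-file pattern from Dgraph's sample manifests (/dgraph/doneinit is the conventional path there; adjust to yours):

     # Wait until a sentinel file is created before the Alpha is allowed to start.
     until [ -f /dgraph/doneinit ]; do
       echo "waiting for bulk load to finish..."
       sleep 10
     done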

  4. Launch the bulk loader from one of the Zeros

    dgraph bulk -f /coldstart/upload/pending -s /coldstart/upload/rdf-schema/my_schema.rdf --format=rdf --store_xids --xidmap xid --map_shards=3 --reduce_shards=3 --http localhost:8000 --zero=localhost:5080
     
    Parameters and flags
    --------------------
     RDF data files location: /coldstart/upload/pending
     Schema file: /coldstart/upload/rdf-schema/my_schema.rdf
     --format=rdf
     --store_xids    (is this required to store the xid?)
     --xidmap xid    (is this the attribute name to store it under?)
     --map_shards=3
     --reduce_shards=3
     --http localhost:8000
     --zero=localhost:5080
    
    Remarks
    -------
    We will launch the bulk loader multiple times until all the data files are loaded.
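
    One step worth spelling out for after the load finishes: the bulk loader writes one p directory per reduce shard under out/, and each shard's directory has to be copied to the Alphas of the matching group before they are unblocked. The destination path below is a placeholder:

     # out/0/p, out/1/p and out/2/p correspond to groups 1, 2 and 3.
     for shard in 0 1 2; do
       cp -r out/${shard}/p /path/to/alpha-group-$((shard + 1))/p
     done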
    
  5. Below is a typical type for our objects

     type Student {
       studentId: String! @id
       courses: [Course] @hasInverse(field: student)
       xid: String! @search(by: [hash])
     }
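
    Worth double-checking here: in recent Dgraph versions the bulk loader's -s flag expects Dgraph schema syntax, while GraphQL SDL like the type above is passed via -g/--graphql_schema instead. A sketch of what the -s file might contain for this type, assuming the Type.field predicate naming used in the RDF below:

     cat /coldstart/upload/rdf-schema/my_schema.rdf
     # Student.studentId: string @index(hash) @upsert .
     # Student.courses: [uid] .
     # Course.student: uid .
     # xid: string @index(hash) .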
    
  6. Below is a sample of our RDF data

     <_:my.org/Student/10101/Course/201/Event/1> <Course.eventId> "1" .
     <_:my.org/Student/10101/Course/201/Event/1> <Course.timestamp> "2022-01-01T00:00:02.298240" .
     <_:my.org/Student/10101/Course/201/Event/1> <Course.student> <_:my.org/Student/10101> .
     <_:my.org/Student/10101> <Student.studentId> "10101" . 
     <_:my.org/Student/10101> <Student.courses> <_:my.org/Student/10101/Course/201/Event/1> .
     <_:my.org/Student/10101/Course/201/Event/1> <Course.codeId> <_:my.org/CourseTcode/201> .
      <_:my.org/CourseTcode/201> <CourseTcode.course> <_:my.org/Student/10101/Course/201/Event/1> .
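
    A statement missing its terminating " ." will typically make the bulk loader reject the file, so a quick pre-flight check on the N-Quad files can help; the glob is illustrative, and gzipped files would need zcat first:

     # Count non-blank lines that do not end with " ." (should print 0).
     # Assumes one statement per line, as in the sample above.
     grep -v '^[[:space:]]*$' /coldstart/upload/pending/*.rdf | grep -vc ' \.$'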