How to ingest nodes and create forward and reverse

tahseen · July 4, 2024, 9:24pm

Hello everyone,

I am currently researching graph database solutions for a large dataset, approximately 1TB. The data model includes entities such as:

Schools
    Schools have classes
    Classes have subjects
    Classes have students
    Subjects have teachers

The challenge is that each data source is different. My initial plan was to preprocess and join the data before ingestion into Dgraph and then start the insertion process. However, I came across a tutorial on YouTube (https://www.youtube.com/watch?v=wWvRjYmiWgw) which recommends creating all nodes and predicates first, and then establishing relationships/edges. I would appreciate expert opinions on this approach.
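To make the "nodes first, then edges" idea concrete, here is a minimal sketch in Python. It assumes Dgraph's JSON mutation format with blank-node ids (`_:label`) and relies on the same blank-node label resolving to the same node within a single load. The helper names (`node_record`, `edge_record`), the predicate names (`name`, `classes`), and the sample ids are hypothetical, not taken from the tutorial.

```python
import json

def blank_id(kind, source_id):
    """Derive a stable blank-node id from the entity kind and its source key."""
    return f"_:{kind.lower()}_{source_id}"

def node_record(kind, source_id, **fields):
    """A node with its scalar predicates only; no relationships yet."""
    return {"uid": blank_id(kind, source_id), "dgraph.type": kind, **fields}

def edge_record(from_kind, from_id, predicate, to_kind, to_id):
    """An edge between two nodes that were (or will be) emitted separately."""
    return {
        "uid": blank_id(from_kind, from_id),
        predicate: [{"uid": blank_id(to_kind, to_id)}],
    }

# Pass 1: nodes from each source, emitted independently.
nodes = [
    node_record("School", "s1", name="Hypothetical High"),
    node_record("Class", "c1", name="Grade 9A"),
]

# Pass 2: relationships, referring to nodes only by blank-node id.
edges = [edge_record("School", "s1", "classes", "Class", "c1")]

print(json.dumps(nodes + edges, indent=2))
```

Because the edge records refer to nodes only by id, each source can be converted on its own without first being joined to the others.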

Additionally, I am seeking advice on the best way to ingest large datasets, considering my data is in JSON format.

Thank you.

matthewmcneely · July 7, 2024, 5:04pm

Hey @tahseen,

I think either approach would work. One advantage of joining the data before loading is that you skip the separate edge-stitching pass, which takes extra time and is another place where errors can creep in.
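As a rough illustration of the pre-joined approach, the sketch below nests classes under their school before anything is written out. The input shapes (`school_id`, `class_id`), the predicate names, and the `join_schools` helper are assumptions made up for this example; real sources will differ.

```python
import json
from collections import defaultdict

# Hypothetical records from two different sources.
schools = [{"school_id": "s1", "name": "Hypothetical High"}]
classes = [
    {"class_id": "c1", "school_id": "s1", "name": "Grade 9A"},
    {"class_id": "c2", "school_id": "s1", "name": "Grade 9B"},
]

def join_schools(schools, classes):
    """Nest each class under its school so the edge is part of the node data."""
    by_school = defaultdict(list)
    for c in classes:
        by_school[c["school_id"]].append({
            "uid": f"_:class_{c['class_id']}",
            "dgraph.type": "Class",
            "name": c["name"],
        })
    return [
        {
            "uid": f"_:school_{s['school_id']}",
            "dgraph.type": "School",
            "name": s["name"],
            "classes": by_school.get(s["school_id"], []),
        }
        for s in schools
    ]

joined = join_schools(schools, classes)
print(json.dumps(joined, indent=2))
```

The trade-off is that the join has to happen outside Dgraph, which for a dataset of this size may mean an external sort or a staging database rather than in-memory dictionaries.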

For a terabyte of data, you’ll definitely want to use the Bulk Loader. Have a look at the vlg repo (dgraph-io/vlg on GitHub), specifically the notes on data loading in notes/3. Data Importing.md. Other parts of that repo may also be useful as you work on your import.
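Since the source data is JSON, one way to prepare it for a bulk load is to stream the converted records into a set of gzipped JSON files rather than one huge file. This is only a sketch: the `write_chunks` helper, the file naming, and the chunk size are assumptions, and it relies on the Bulk Loader accepting `.json.gz` input, which is worth checking against the docs for your Dgraph version.

```python
import gzip
import json
import os
import tempfile
from itertools import islice

def write_chunks(records, out_dir, chunk_size=100_000):
    """Write an iterable of records as gzipped JSON arrays, chunk_size per file."""
    os.makedirs(out_dir, exist_ok=True)
    it = iter(records)
    paths = []
    while True:
        chunk = list(islice(it, chunk_size))  # only one chunk held in memory
        if not chunk:
            break
        path = os.path.join(out_dir, f"part-{len(paths):05d}.json.gz")
        with gzip.open(path, "wt", encoding="utf-8") as fh:
            json.dump(chunk, fh)
        paths.append(path)
    return paths

# Tiny demonstration with five hypothetical records and two records per file.
demo_dir = tempfile.mkdtemp()
demo_records = ({"uid": f"_:n{i}", "name": f"node {i}"} for i in range(5))
demo_paths = write_chunks(demo_records, demo_dir, chunk_size=2)
print(demo_paths)
```

Taking a generator as input means the converter never needs the whole dataset in memory, which matters at this scale.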
