Hello everyone,
I am currently researching graph database solutions for a large dataset, approximately 1TB. The data model includes entities such as:
Schools
Schools have classes
Classes have subjects
Classes have students
Subjects have teachers
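To make the model concrete, here is roughly what one pre-joined record could look like in Dgraph's JSON mutation format. All predicate names (`School.name`, `Class.students`, etc.) are placeholders I chose for illustration, and the `_:` values are blank-node labels that Dgraph resolves to real uids on insert:

```python
import json

# Hypothetical example of a single joined record in Dgraph's JSON
# mutation format. Every predicate name here is a made-up placeholder;
# "uid": "_:..." entries are blank nodes assigned real uids on insert.
school = {
    "uid": "_:school1",
    "dgraph.type": "School",
    "School.name": "Springfield High",
    "School.classes": [
        {
            "uid": "_:class1",
            "dgraph.type": "Class",
            "Class.name": "Grade 10A",
            "Class.students": [
                {"uid": "_:student1", "dgraph.type": "Student", "Student.name": "Alice"}
            ],
            "Class.subjects": [
                {
                    "uid": "_:subject1",
                    "dgraph.type": "Subject",
                    "Subject.name": "Math",
                    "Subject.teachers": [
                        {"uid": "_:teacher1", "dgraph.type": "Teacher", "Teacher.name": "Mr. Kim"}
                    ],
                }
            ],
        }
    ],
}

print(json.dumps(school, indent=2))
```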
The challenge is that each data source is different. My initial plan was to preprocess and join the data across sources, and then insert the joined records into Dgraph. However, I came across a tutorial on YouTube (https://www.youtube.com/watch?v=wWvRjYmiWgw) which recommends creating all nodes and predicates first, and only then establishing the relationships/edges. I would appreciate expert opinions on which approach is better.
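For comparison, my understanding of the nodes-first approach is that every entity would be emitted as a flat object keyed by an external ID, and the edges would be emitted afterwards referencing those IDs (to be applied as upserts). A minimal sketch of how I imagine preparing such payloads, where the `xid` predicate and all field names are my own conventions rather than anything Dgraph requires:

```python
# Sketch of two-phase payload preparation: phase 1 emits standalone
# nodes keyed by an external ID ("xid" is a convention, not built in);
# phase 2 emits edges that reference both endpoints by xid, to be
# resolved later via upserts.

raw_schools = [
    {"school_id": "s1", "name": "Springfield High"},
]
raw_classes = [
    {"class_id": "c1", "name": "Grade 10A", "school_id": "s1"},
]

def node_payloads(schools, classes):
    """Phase 1: every entity becomes a flat node carrying its xid."""
    nodes = []
    for s in schools:
        nodes.append({"dgraph.type": "School", "xid": s["school_id"],
                      "School.name": s["name"]})
    for c in classes:
        nodes.append({"dgraph.type": "Class", "xid": c["class_id"],
                      "Class.name": c["name"]})
    return nodes

def edge_payloads(classes):
    """Phase 2: edges reference endpoints by xid only."""
    return [
        {"predicate": "School.classes",
         "from_xid": c["school_id"],
         "to_xid": c["class_id"]}
        for c in classes
    ]

nodes = node_payloads(raw_schools, raw_classes)
edges = edge_payloads(raw_classes)
print(len(nodes), len(edges))  # 2 nodes, 1 edge
```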
Additionally, I am seeking advice on the best way to ingest a dataset of this size, given that my data is in JSON format.
Thank you.