Moved from GitHub dgraph/4920
Posted by marvin-hansen:
Experience Report
A buddy asked me about a quick and simple way to set up a DB that can be integrated into an online system, so I suggested Dgraph. Setup was easy, the GraphQL schema was easy, but the data import didn’t work out.
What you wanted to do
I wanted to load a ~15 GB data sample from a ~200 GB data set, all in CSV format.
Because Dgraph doesn’t support CSV, the first attempt was to convert CSV into RDF.
Converting ~15 GB into RDF, well, good luck. We got plenty of out-of-memory errors with various tools.
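For anyone hitting the same out-of-memory wall: one way around it is to stream the CSV row by row and emit one N-Quad line per cell, so memory use stays flat regardless of file size. The following is only a minimal sketch, not what we ran. It assumes a header row, a column named `id` whose values are safe to use as blank-node labels, and predicates named after the CSV columns; the function names `escape_literal` and `csv_to_nquads` are made up for illustration.

```python
import csv
import io
from typing import Iterator, TextIO


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes, and line breaks for an N-Quads string literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_nquads(src: TextIO, id_column: str) -> Iterator[str]:
    """Yield one N-Quad line per non-empty cell, reading the CSV lazily."""
    for row in csv.DictReader(src):
        # Assumption: the id value is a valid blank-node label (e.g. alphanumeric).
        subject = f"_:{row[id_column]}"
        for column, value in row.items():
            # Skip the id column itself, surplus unnamed fields, and empty cells.
            if column is None or column == id_column or not value:
                continue
            yield f'{subject} <{column}> "{escape_literal(value)}" .'


# Tiny in-memory sample; for a real file, pass an open file handle instead.
sample = io.StringIO('id,name,city\n1,Alice,Berlin\n2,"Bob ""B""",\n')
lines = list(csv_to_nquads(sample, id_column="id"))
for line in lines:
    print(line)
```

Because the generator never holds more than one row, the output can be written straight to a file or a gzip stream. Every value is emitted as a plain string literal; typed values and edges between nodes would need extra handling.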
Next best: convert to the JSON mutation format, so let’s do some Python scripting. We had to split the sample file into disjoint files to make it manageable…
https://docs.dgraph.io/mutations/#json-mutation-format
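Roughly, the splitting looks like the sketch below: read the CSV lazily and emit one JSON array per fixed-size batch of rows, each of which can go into its own file. This is an illustration under assumptions, not our actual script. It assumes a header row and an `id` column, uses a blank-node style `uid` of the form `_:<id>`, and keeps every value as a string; the name `csv_to_json_chunks` and the batch size are hypothetical.

```python
import csv
import io
import json
from typing import Iterator, TextIO


def csv_to_json_chunks(src: TextIO, rows_per_chunk: int, id_column: str) -> Iterator[str]:
    """Yield JSON arrays, each holding at most rows_per_chunk records."""
    batch = []
    for row in csv.DictReader(src):
        node = dict(row)
        # Assumption: a blank-node uid derived from the row's id column.
        node["uid"] = f"_:{row[id_column]}"
        batch.append(node)
        if len(batch) == rows_per_chunk:
            yield json.dumps(batch)
            batch = []
    # Emit the final, possibly smaller, batch.
    if batch:
        yield json.dumps(batch)


# Tiny in-memory sample; for a real file, pass an open file handle instead.
sample = io.StringIO("id,name\n1,Alice\n2,Bob\n3,Carol\n")
chunks = list(csv_to_json_chunks(sample, rows_per_chunk=2, id_column="id"))
for index, chunk in enumerate(chunks):
    print(index, chunk)
```

Only one batch is in memory at a time, so the batch size is the knob that trades file count against memory. It still means writing and maintaining a script per data set, which is the whole complaint.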
What you actually did
We selected a different DB because converting the data simply took way too long and, quite frankly, doesn’t scale well to >100 GB. Nobody has time to script every damn data import from a standard file format that is supported virtually everywhere.
Please support the existing standards out there to make everyone’s life better.
Why that wasn’t great, with examples
It’s self-evident.
What would be a truly great solution?
Support import from CSV files…
A simple command-line tool would be great.
Truly great would be a simple UI console that can import, export, and query data. For example, Heidi does just that. Nothing fancy, just import, export, query.