Moved from GitHub dgraph/3819
Posted by dmsolow:
JSON Lines (http://jsonlines.org/) is a commonly used format for storing a large number of JSON objects in a file, one object per line. It’s better than a single JSON array of objects because the file can be read object by object without loading the entire thing into memory.
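To illustrate the memory point, here is a minimal Python sketch of reading such a file one object at a time. The helper name `iter_json_lines` and the sample records are made up for illustration; nothing here is Dgraph-specific.

```python
import io
import json

def iter_json_lines(stream):
    """Yield one parsed object per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)

# An in-memory stand-in for a file opened with open("out.json").
sample = io.StringIO('{"name": "Alice"}\n{"name": "Bob"}\n')

# Only one line is parsed at a time; the whole file is never held in memory.
names = [obj["name"] for obj in iter_json_lines(sample)]
```

With a single JSON array, a standard parser would typically need to read the whole document before returning the first object.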
Popular big data processing frameworks like Apache Spark write JSON Lines natively (df.write.json("out.json") writes a JSON Lines file for each partition).
Support would probably be trivial to add to Dgraph, and it would help people integrate Dgraph into existing ETL workflows.
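Until then, a possible stopgap is to convert JSON Lines into a single JSON array before loading. The sketch below assumes the loader you use accepts a plain JSON array of objects; the function name `json_lines_to_array` is hypothetical, and it streams the input so the conversion itself stays memory-friendly.

```python
import io
import json

def json_lines_to_array(src, dst):
    """Stream JSON Lines from src into dst as a single JSON array."""
    dst.write("[")
    first = True
    for line in src:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        obj = json.loads(line)  # parse to validate each record
        if not first:
            dst.write(",")
        dst.write("\n")
        dst.write(json.dumps(obj))
        first = False
    dst.write("\n]\n")

# In-memory stand-ins for the input and output files.
src = io.StringIO('{"name": "Alice"}\n{"name": "Bob"}\n')
dst = io.StringIO()
json_lines_to_array(src, dst)
converted = json.loads(dst.getvalue())
```

This adds an extra pass over the data for every partition file, which is the kind of step native support would remove.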