What I want to do
I’m building a frontend where hundreds of concurrent users work with data spreadsheets at the same time.
One feature lets users paste or upload CSV data (typically 1,000–10,000 rows × 5 columns, i.e. up to roughly 50,000 triples per upload) into a table for further sorting, filtering, etc.
Imagine a bunch of users try to upload such data at the same time, from a JavaScript frontend.
- What are our options for integrating this quickly into Dgraph?
Could we send it directly via DQL mutations (it’s maybe 500 KB–1 MB of data!), or should we, for example, create some kind of worker that offloads the CSV import to a Python live loader?
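To make the first option concrete, here is a minimal sketch of what "directly via DQL mutations" could look like: split the parsed CSV rows into batches so each mutation stays small, then commit each batch in its own transaction. The names `Row`, `toMutationBatches`, the batch size, and the `"CsvRow"` type are my own assumptions, not anything from a Dgraph client API.

```typescript
// Sketch: turn parsed CSV rows into batched JSON payloads that a
// Dgraph set-JSON mutation could accept. Helper names are hypothetical.

type Row = Record<string, string>;

// Split rows into chunks so each DQL mutation stays small, rather
// than sending one 500 KB–1 MB mutation per upload.
function toMutationBatches(rows: Row[], batchSize: number): object[][] {
  const batches: object[][] = [];
  for (let i = 0; i < rows.length; i += batchSize) {
    batches.push(
      rows.slice(i, i + batchSize).map((row) => ({
        "dgraph.type": "CsvRow", // assumed type name for this sketch
        ...row,
      }))
    );
  }
  return batches;
}
```

Each batch could then be committed in its own transaction, e.g. with the dgraph-js client: `mu.setSetJson(batch); await txn.mutate(mu); await txn.commit();` — that keeps any single failure or retry scoped to one batch instead of the whole upload.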
- How do we ensure consistent uploads without overloading Dgraph with concurrent inserts?
If multiple users do this at the same time, I assume we need some kind of queue so the importer isn’t overwhelmed. How could we devise a bullet-proof system that holds up under many concurrent CSV uploads while still feeling snappy and near real-time to end users?
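For the queuing part, one option is a small semaphore-style queue in front of the import path that caps how many uploads hit Dgraph at once. This is only a sketch under my own assumptions (`UploadQueue`, `maxConcurrent` are made-up names); a bullet-proof version would use a durable queue (e.g. Redis-backed) rather than process memory, so uploads survive a crash.

```typescript
// Sketch of an in-process queue that limits concurrent CSV imports.
// Anything over the limit waits its turn instead of hammering Dgraph.

type Task<T> = () => Promise<T>;

class UploadQueue {
  private running = 0;
  private waiting: Array<() => void> = [];

  constructor(private maxConcurrent: number) {}

  async run<T>(task: Task<T>): Promise<T> {
    if (this.running >= this.maxConcurrent) {
      // Park this upload until a slot frees up.
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    }
    this.running++;
    try {
      return await task();
    } finally {
      this.running--;
      // Wake exactly one parked upload, if any.
      this.waiting.shift()?.();
    }
  }
}
```

Because each user’s upload is just an awaited promise, the frontend can still show per-upload progress immediately (queued → importing → done), which keeps things feeling near real-time even when the actual writes are serialized behind the cap.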
Thanks for any thoughts and ideas.