What I want to do
I’m building a frontend where hundreds of concurrent users work with data spreadsheets at the same time.
One feature lets users paste or upload CSV data (typically 1,000–10,000 rows × 5 columns, i.e. up to roughly 50,000 triples per upload) into a table for further sorting, filtering, etc.
Imagine a bunch of users try to upload such data at the same time, from a JavaScript frontend.
- What are our options for integrating this quickly into Dgraph?
Could we send it directly via DQL mutations (it’s maybe 500 KB–1 MB of data!), or should we, for example, create some kind of worker that offloads the CSV import to a Python live loader?
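To make the first option concrete, here is a minimal sketch of what "directly via DQL mutations" could look like: split the parsed CSV rows into batches so each mutation stays small, then commit each batch in its own transaction. The names `Row`, `toMutationBatches`, the batch size, and the `"CsvRow"` type are my own assumptions, not anything from a Dgraph client API.

```typescript
// Sketch: turn parsed CSV rows into batched JSON payloads that a
// Dgraph set-JSON mutation could accept. Helper names are hypothetical.

type Row = Record<string, string>;

// Split rows into chunks so each DQL mutation stays small, rather
// than sending one 500 KB–1 MB mutation per upload.
function toMutationBatches(rows: Row[], batchSize: number): object[][] {
  const batches: object[][] = [];
  for (let i = 0; i < rows.length; i += batchSize) {
    batches.push(
      rows.slice(i, i + batchSize).map((row) => ({
        "dgraph.type": "CsvRow", // assumed type name for this sketch
        ...row,
      }))
    );
  }
  return batches;
}
```

Each batch could then be committed in its own transaction, e.g. with the dgraph-js client: `mu.setSetJson(batch); await txn.mutate(mu); await txn.commit();` — that keeps any single failure or retry scoped to one batch instead of the whole upload.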
- How do we ensure consistent uploads without overloading Dgraph with concurrent inserts?
If multiple users do this at the same time, I assume we need some kind of queue so the importer isn’t overwhelmed. How could we devise a bullet-proof system that holds up under many concurrent CSV uploads while still feeling snappy and near real-time to end users?
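For the queuing part, one option is a small semaphore-style queue in front of the import path that caps how many uploads hit Dgraph at once. This is only a sketch under my own assumptions (`UploadQueue`, `maxConcurrent` are made-up names); a bullet-proof version would use a durable queue (e.g. Redis-backed) rather than process memory, so uploads survive a crash.

```typescript
// Sketch of an in-process queue that limits concurrent CSV imports.
// Anything over the limit waits its turn instead of hammering Dgraph.

type Task<T> = () => Promise<T>;

class UploadQueue {
  private running = 0;
  private waiting: Array<() => void> = [];

  constructor(private maxConcurrent: number) {}

  async run<T>(task: Task<T>): Promise<T> {
    if (this.running >= this.maxConcurrent) {
      // Park this upload until a slot frees up.
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    }
    this.running++;
    try {
      return await task();
    } finally {
      this.running--;
      // Wake exactly one parked upload, if any.
      this.waiting.shift()?.();
    }
  }
}
```

Because each user’s upload is just an awaited promise, the frontend can still show per-upload progress immediately (queued → importing → done), which keeps things feeling near real-time even when the actual writes are serialized behind the cap.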
Thanks for any thoughts and ideas.