Batch insertion in dgraph

shahaan_ali · November 18, 2019, 12:26pm

I am loading data into Dgraph, and the data is in JSON form. I have to loop over the JSON array, read each object, apply certain filters/checks to it, and then insert the details into Dgraph. This is going to take a great deal of time, because I will usually have anywhere from 100 to more than 1,000 objects in the JSON.

What would be the fastest way to do this? What about batch insertion in Dgraph? How should I handle the scenario described above?

I am a beginner, so any help will be appreciated.
Thank you

prashant-shahi · November 18, 2019, 3:38pm

Hello @shahaan_ali,

There are multiple ways that you could go about it.

What would be the fastest way to do this?

Using Bulk Loader.

What about batch insertion in dgraph?

For batch insertion, you could use either Bulk Loader or Live Loader. Both loaders support RDF and JSON datasets.

If you are starting a new Dgraph cluster, then I would recommend using Bulk Loader; otherwise, you would have to use Live Loader for an already running cluster.
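Since both loaders read a JSON file, the filtering step from the question could run first and write its result to a file for either loader to pick up. A minimal sketch, assuming a hypothetical keep_record filter and a caller-chosen output path; neither comes from this thread:

```python
import json


def keep_record(record):
    # Hypothetical filter: keep only records with a non-empty id and name.
    # Replace this with whatever checks the real data needs.
    return bool(record.get("id")) and bool(record.get("name"))


def write_loader_file(records, path):
    """Filter records and write the survivors to path as one JSON array.

    Returns the number of records written.
    """
    kept = [r for r in records if keep_record(r)]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(kept, f, ensure_ascii=False, indent=2)
    return len(kept)
```

The resulting file would then be passed to Bulk Loader or Live Loader as the data file, alongside the schema.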

How to handle the mentioned scenario?

For each JSON entry (and each nested entry), you would need to set the uid predicate to a value that uniquely identifies that entity.

Also, if you are using the type system in your schema, you can set a node's type through the dgraph.type predicate for that node. A node can have multiple types.
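As a rough illustration of the two points above, the sketch below adds a uid and a dgraph.type to each entry and to its nested entries. The blank-node form (_:label) is Dgraph's placeholder for a node whose uid has not been assigned yet. The type names (Person, City, Institute, Workplace), the helper names, and the choice to skip empty fields are assumptions made for this example, not something taken from the thread.

```python
import re


def blank_node(type_name, raw_id):
    # Build a blank-node label such as "_:Person.100023".
    # Assumption: keep a conservative character set and replace the rest.
    safe = re.sub(r"[^A-Za-z0-9_.-]", "_", str(raw_id))
    return "_:" + type_name + "." + safe


def annotate(record, type_name, nested_types):
    """Return a copy of record with uid and dgraph.type set.

    nested_types maps a key holding a list of child objects to the
    (hypothetical) type name used for those children.
    """
    node = {"uid": blank_node(type_name, record["id"]), "dgraph.type": type_name}
    for key, value in record.items():
        if value in ("", [], None):
            continue  # assumption: skip empty fields instead of storing them
        if key in nested_types:
            node[key] = [annotate(child, nested_types[key], {}) for child in value]
        else:
            node[key] = value
    return node


# Hypothetical mapping from the sample's nested keys to type names.
NESTED = {"cityn": "City", "edun": "Institute", "workn": "Workplace"}

sample = {
    "id": "zeeshaaawn.qmughal",
    "name": "Zen Sahal",
    "city": "",
    "cityn": [],
    "edun": [{"id": "f.gcollegeformenh-9i", "name": "F.G College for men "}],
    "workn": [{"id": "private", "name": "Private"}],
}
person = annotate(sample, "Person", NESTED)
```

Prefixing the label with the type name keeps a person whose id is "private" from sharing a blank node with a workplace whose id is "private".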

shahaan_ali · November 19, 2019, 7:35am

Hey,
As far as I understand from reviewing the provided links, Bulk Loader loads data by running a command that takes the schema, the data file, and other parameters. My concern is different: I get the data from a running script and want to save it from code. If you could provide a Python-based solution for the scenario I described, I would really appreciate it.

Below is an example of the JSON I am getting; it may contain a large number of objects in this form.

Thanks

[{
	'id': '100023',
	'name': 'Faiz Ali',
	'profile': 'https://www.facebook.com/profile.php?id=100023',
	'picture_link': 'https://scontent.fisb1-1.fna.fbcdn.net/v/t1.0-1/p40x40/30440726_18089952_o.jpg?_nc_cat=108&_nc_oc=AQmprMoO4Nk65PO95Y1SQjmOsqofdKDV80uVmdkg0aC6IEIpmiWWSJn7oHpNqLuyxzg&_nc_ht=scontent.fisb1-1.fna&oh=512fe306a0d4cafcb480e097c8bd04d4&oe=5E41EAD0',
	'city': '',
	'edu': '',
	'work': '',
	'cityn': [],
	'edun': [],
	'workn': []
}, {
	'id': 'zeeshaaawn.qmughal',
	'name': 'Zen Sahal',
	'profile': 'https://www.facebook.com/zghal?fref=pb&__tn__=%2Cd-a-R&eid=ARD00TXVW6zsNuDDsHw_dBTXq9ngx0nPKbHj&hc_location=profile_browser',
	'picture_link': 'https://scontent.fisb1-1.fna.fbcdn.net/v/t1.0-1/p40x40/7235887_457342973969956864_o.jpg?_nc_cat=110&_nc_oc=AQk0_2dMsm_5fJqb7W3Za0HQt-t_Vbo_o02iWYIv8_BSLJdIP-V3Xv_x3Th04aMiP0o&_nc_ht=scontent.fisb1-1.fna&oh=eed9858a9a1f256aba3df5c4d364b841&oe=5E4F0936',
	'city': '',
	'edu': 'F.G College for men',
	'work': 'Private',
	'cityn': [],
	'edun': [{
		'id': 'f.gcollegeformenh-9i',
		'name': 'F.G College for men '
	}],
	'workn': [{
		'id': 'private',
		'name': 'Private'
	}]
}]

prashant-shahi · November 19, 2019, 3:27pm

Here is a Python script that should add Dgraph meta information (uid and type system) to your existing JSON dataset.

I have also included a schema file matching the provided sample data.

Do let me know how the above solution works for you, or if you need any help.
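For inserting from code rather than through the loaders, one possible approach is to send the objects in batches with the pydgraph client, one transaction per batch. This is a sketch under assumptions, not the script referred to above: the batch size of 500, the address localhost:9080, and the helper names are all placeholders chosen for illustration.

```python
def chunked(items, size):
    """Yield successive lists of at most size items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def insert_in_batches(client, nodes, batch_size=500):
    """Send nodes to Dgraph in batches, one transaction per batch.

    client is expected to behave like pydgraph.DgraphClient.
    Returns the number of batches committed.
    """
    committed = 0
    for batch in chunked(nodes, batch_size):
        txn = client.txn()
        try:
            txn.mutate(set_obj=batch)  # the batch is serialized to JSON
            txn.commit()
            committed += 1
        finally:
            txn.discard()  # does nothing once the commit has succeeded
    return committed


def connect(address="localhost:9080"):
    # Imported here so the helpers above can be used without pydgraph installed.
    import pydgraph

    stub = pydgraph.DgraphClientStub(address)
    return pydgraph.DgraphClient(stub), stub
```

A typical call would be client, stub = connect(), then insert_in_batches(client, nodes), then stub.close(). One caveat: a blank-node label is only resolved within a single mutation, so an entity that appears in two different batches would be created twice unless its existing uid is looked up first or an upsert is used.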
