Support for Spatial 2D, 3D geometries in Dgraph

muralaris · August 23, 2021, 6:28am

@mrjn: I have the following questions to help us make a decision.

Does Dgraph have extensive support for geospatial data storage and retrieval? Does it support 2D and 3D geometries, and also coordinate reference systems (CRS), as Neo4j does (see Spatial values - Neo4j Cypher Manual)?
If so, how does its performance compare with Neo4j for spatial data storage and retrieval?
Does Dgraph support spatial indexing using S2 geometry?

Do let us know.
-Murali

iluminae · August 23, 2021, 3:27pm

Geographic information (lat, long) is supported, and specifically I see index handling for the MultiPolygon, Polygon, and Point types (see the geo tutorial).
Dgraph has a very different architecture from Neo4j, so it is probably hard to compare apples to apples here. But the data is indexed with a geo index, so geo lookups should be as fast as those on many of the other indices available.
The geometry data is indexed with an S2 index implementation. (code)
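
To make the above a bit more concrete, here is a minimal Python sketch of what a geo-indexed predicate and a proximity query could look like. The predicate names `location` and `name`, the query block name `places`, and the helper functions are made up for illustration. The `@index(geo)` schema directive and the `near` function are taken from the Dgraph docs, and coordinates follow GeoJSON order (longitude, latitude). The sketch only builds the schema line, the JSON mutation body, and the query string; sending them to a cluster is left out.

```python
import json

# Hypothetical predicate name; any predicate of type `geo` works the same way.
GEO_PREDICATE = "location"

# DQL schema line: a geo predicate with a geo index on it.
SCHEMA = f"{GEO_PREDICATE}: geo @index(geo) ."


def point(lon: float, lat: float) -> dict:
    """Return a GeoJSON Point; GeoJSON order is [longitude, latitude]."""
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("coordinates out of range")
    return {"type": "Point", "coordinates": [lon, lat]}


def place_mutation(name: str, lon: float, lat: float) -> str:
    """Build a JSON mutation body for one node carrying a geo value."""
    return json.dumps({"set": [{"name": name, GEO_PREDICATE: point(lon, lat)}]})


def near_query(lon: float, lat: float, meters: int) -> str:
    """Build a DQL query for nodes within `meters` of the given point."""
    return (
        "{\n"
        f"  places(func: near({GEO_PREDICATE}, [{lon}, {lat}], {meters})) {{\n"
        "    name\n"
        f"    {GEO_PREDICATE}\n"
        "  }\n"
        "}"
    )


if __name__ == "__main__":
    print(SCHEMA)
    print(place_mutation("Golden Gate Park", -122.4862, 37.7694))
    print(near_query(-122.4862, 37.7694, 1000))
```

Polygon and MultiPolygon values would use the same GeoJSON shape with a different `type` and nested coordinate arrays.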

muralaris · August 24, 2021, 11:19am

Thanks for the quick response.
@iluminae @mrjn
Question 1:
We have a requirement to store a connected graph of historical (immutable) data.
Which model do you suggest for the best balance between query and ingest performance?

Can the historical attribute values be stored as nodes, with the historical dates as the edges/relationships?
Or
is there any other model you recommend?

Question 2:
What sizing config on Dgraph Cloud would handle data with about 3 billion nodes and 3.5 billion relationships?

Total nodes: 3.2 billion (each node has about 3-4 attributes; one special node type has about 200 attributes, one of which would be of type ‘geo’)
Relationships: 3-3.5 billion edges (1-2 attributes each)

Please let us know your thoughts.
-Murali

iluminae · August 24, 2021, 8:16pm

I won't really know how to design your schema, but let me give you some advice on how Dgraph manages storage internally, which may help you draw conclusions about ingestion and performance:

Dgraph stores data by tablet, which is synonymous with a predicate (an ‘attribute’ key as you have written above; sometimes getting the jargon to all match up is half the battle).
Therefore, Dgraph does not store anything per ‘node’ or ‘edge’. A node is just a unique ID that some triples share as a subject.
So, if you have 1M predicates ‘on a node’ vs. 3 predicates ‘on a node’, Dgraph does not care, and will be equally performant at query time (specifically when querying X things ‘on a node’ in either pattern).
Conversely, if you have a huge graph with billions of values, and it only has 5 different predicates (attribute keys, if you will) in total, that will give you terrible performance, since the storage is by predicate.
As an extension of the above, indices will also be huge, in proportion to the predicate being indexed.
A huge database with billions of (key, value)s should be well balanced across a good number of predicates. What is a good number? That may take some work to find out.
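
As a rough illustration of the points above, here is a toy Python model that buckets triples by predicate and reports how much of the data lands in the largest bucket. This is only a sketch of the idea as described here, not how Dgraph implements tablets; the sample triples and the function names `group_by_predicate` and `largest_tablet_share` are made up.

```python
from collections import defaultdict

# Toy triples: (subject uid, predicate, value). All names are hypothetical.
TRIPLES = [
    ("0x1", "name", "Alice"),
    ("0x1", "age", 30),
    ("0x2", "name", "Bob"),
    ("0x2", "friend", "0x1"),
    ("0x3", "name", "Carol"),
]


def group_by_predicate(triples):
    """Bucket triples by predicate: one bucket per 'tablet' in this toy model."""
    tablets = defaultdict(list)
    for subject, predicate, value in triples:
        # A node is not stored anywhere as a unit; its uid just recurs as a subject.
        tablets[predicate].append((subject, value))
    return dict(tablets)


def largest_tablet_share(tablets):
    """Return the fraction of all triples held by the biggest bucket."""
    total = sum(len(rows) for rows in tablets.values())
    if total == 0:
        return 0.0
    return max(len(rows) for rows in tablets.values()) / total


if __name__ == "__main__":
    tablets = group_by_predicate(TRIPLES)
    for predicate, rows in sorted(tablets.items()):
        print(predicate, len(rows))
    print("largest share:", largest_tablet_share(tablets))
```

Running the same kind of count over a sample of your real data, before loading it, is one cheap way to see whether a handful of predicates would end up holding most of the values.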

I highly suggest you read the whitepaper before designing a database of this magnitude. It is certainly possible to do (I have ~4Bn triples in my current production Dgraph), but you should not go in without understanding exactly how Dgraph performs operations, so that you can best design your database.

Good luck!

muralaris · September 9, 2021, 11:40am

Thanks. I shall go through the same.
