I have a graph of customer_id, device_id, phone_number, email nodes, where each customer_id node also has a label attribute. Each customer_id can have 0 or more device_ids, phone_numbers, or emails with has_device, has_phone_number, has_email relationships.
I now want to query a single customer_id and return all other nodes in the graph that are connected in any way to this customer_id. In other words, I want to get the entire subgraph/cluster of nodes that a single customer_id is connected to.
Currently I am using the following query which works:
I am brand new to Dgraph and GraphQL, and so I am not sure if this query is the right way to do this, or if it is very inefficient, etc. Could you please recommend whether or not this query is ok for what I want, or are there better ways?
Thank you @kevin.obrien.
You don’t need to create separate nodes for email and phone_number. You can store them as predicates on the customer_id or device_id nodes. Then you can search for customer_id = 123456 and get all the devices connected to it, along with the values of all its predicates such as label, email and phone_number.
I don’t understand why emails and phone_numbers would be treated differently to device_ids. They are all different types of nodes connected to customer_ids with has_x relationships. Could you write a sample query so I understand exactly what you mean please? Thanks
I think @kevin.obrien wants to do a graph search and needs all connected nodes, not just those connected at level 1. The current query in the question works because it runs the same query recursively on all the output nodes. But it can quickly time out or return nothing if the subgraph is very large. See here.
@kevin.obrien, from the query, I am trying to understand: do you want customers who share the same phone number, email or device?
device_id represents a device, which is a separate entity, whereas email and phone_number are just values associated with the customer or the device.
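A sample of what this modelling could look like, as a sketch only. It assumes email and phone_number are stored as value predicates directly on the customer node, that customer_id is an indexed string predicate, and that device_id is the value predicate on device nodes. None of these schema details are confirmed in the thread.

```
{
  # Look up one customer by its ID (assumes an index on customer_id)
  customer(func: eq(customer_id, "123456")) {
    customer_id
    label
    # Plain values stored on the customer node itself
    email
    phone_number
    # Devices remain separate nodes reached through an edge
    has_device {
      device_id
    }
  }
}
```

Because the values are not nodes here, this query does not walk from one customer to another through a shared email or phone number; finding such customers would need a separate lookup on the value.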
@Anurag, yes, I want all connected nodes, not just those at level 1. In general terms, I want to return all nodes (customers, devices, phone numbers, email addresses) that are reachable by a path from the queried node.
@Neeraj, actually in my case, there are customers sharing the same phone number or email addresses (fake, fraudulent accounts) and this is one of the problems we are trying to tackle.
Hi @kevin.obrien, your query is correct and should work fine unless the graph expands very quickly to a very large number of nodes, in which case you might want to restrict the depth using the depth parameter.
The query below returns one of the two subgraphs, depending on whether $name1 is node g or node h:
# $name1 is declared as a query variable so it can be passed in
query subgraph($name1: string) {
  q(func: eq(name, $name1)) @recurse {
    name
    relation
    ~relation
  }
}
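Applied to the graph described in the question, the depth restriction mentioned above might look like the following sketch. It assumes the edge names from the question (has_device, has_phone_number, has_email), that each edge is declared with @reverse in the schema so the ~ edges can be followed, and that each node type carries its value in a predicate of the same name (customer_id, device_id, phone_number, email). The ID and the depth of 5 are placeholders.

```
{
  # depth bounds how far the traversal expands; loop: false (the default)
  # avoids walking the same edge again
  cluster(func: eq(customer_id, "123456")) @recurse(depth: 5, loop: false) {
    uid
    # Value predicates to return on whichever nodes have them
    customer_id
    label
    device_id
    phone_number
    email
    # Edges to follow, in both directions
    has_device
    ~has_device
    has_phone_number
    ~has_phone_number
    has_email
    ~has_email
  }
}
```

Without the reverse edges, the traversal would only move outward from customers and would never step back from a shared phone number or email to the other customers that use it.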
Well, @kevin.obrien, the way you are doing it is the expected one for this scenario. Another way would be to write the same query in several blocks, but the data order could be wrong. Or you can write a single static query representing the whole tree you have.
A static query would be something like 1ms faster (that is, not worth it) in most cases. But it would certainly be much faster with gigantic schemas. It is a trade-off.
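For illustration, a static query spells out each level of the tree by hand instead of using @recurse. The sketch below is hypothetical: it assumes the edge names from the question, @reverse on each edge, and value predicates named after the node types. It reaches only the customers that share a device, phone number or email directly with the queried customer.

```
{
  cluster(func: eq(customer_id, "123456")) {
    customer_id
    label
    has_device {
      device_id
      # Other customers using the same device
      ~has_device {
        customer_id
        label
      }
    }
    has_phone_number {
      phone_number
      # Other customers using the same phone number
      ~has_phone_number {
        customer_id
        label
      }
    }
    has_email {
      email
      # Other customers using the same email
      ~has_email {
        customer_id
        label
      }
    }
  }
}
```

Going further out means nesting more levels by hand, which is why the recursive form is the more practical choice when the depth of the cluster is not known in advance.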
TIP:
To make your query smaller, you can use the Type Trick. Taking @Anurag’s sample, you can do this: