"query_edge_limit" in K-shortest path queries(using DQL)

Here is data sample in our application

{
  set {
    _:a <name> "Alice" .
    _:a <dgraph.type> "Person" .
    _:b <name> "Bob" .
    _:b <dgraph.type> "Person" .
    _:c <name> "Tom" .
    _:c <dgraph.type> "Person" .
    _:d <name> "Mallory" .
    _:d <dgraph.type> "Person" .
    _:a  <work_in> "Baidu".
    _:google <company_name> "Google" .
    _:apple <company_name> "Apple" .
    _:amazon <company_name> "amazon" .
    _:google <dgraph.type> "Company" .
    _:apple <dgraph.type> "Company" .
    _:amazon <dgraph.type> "Company" .
    _:a <work_in> _:google .
    _:a <work_in> _:amazon .
    _:b <work_in> _:google .
    _:c <work_in> _:apple .
    _:c <work_in> _:amazon
    .....
  }
}

We assume that there are many people and many corporate entities, and we want to find out whether two people have ever worked in the same company Or have a common colleague.

Assuming that each company may have thousands to tens of thousands of employees (approximately this number in practice)

Here is the query we use to find if two Person(0x2, 0x5)

{
 path as shortest(from: 0x2, to: 0x5, numpaths:8, depth:4) {
  work_in
  ~work_in
 }
 path(func: uid(path)) {
   name
 }
}

So the shortest path is probably like this

Person -> Company -> Person 

Or

Person -> Company -> Person -> Company ->Person

We set the depth to 4 just to meet the above 4-hop requirement.

In practical applications, we often encounter query_edge_limit errors.

Later I found there is a query_edge_litmit can be set in alaph nodes(It seems that this parameter is not mentioned in the document, I hope we can put this parameter in “More about alpha” in the future)

--query_edge_limit uint            Limit for the maximum number of edges that can be returned in a query. This applies to shortest path and recursive queries. (default 1000000)

After we increase this parameter, the operating efficiency is still very slow.

This kind of query from one point to another should be very efficient. Why we need to query so much edges?

I want to know the principle of this K-shortest queries function, and whether this type of search can be optimized?

1 Like