Is there a good way to implement graph algorithms?

I have been using Dgraph for a while and all is fine, but I really want to find a way to apply graph algorithms to my data.
I know one algorithm (K shortest paths) is already built in. But how can I apply other algorithms?
The only way I know is to use pydgraph to connect to Dgraph, write a query to get JSON data, then process the results and apply functions (like PageRank) written by myself to get the output. Something like this:

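import json
import numpy as np
import pydgraph
from scipy import sparse

client = pydgraph.DgraphClient(client_stub)  # client_stub: an existing pydgraph.DgraphClientStub
def d_Pagerank(relation, beta):
    # NOTE: the query body was truncated in the original post; `name` and
    # `friend { name }` are inferred from the fields read below.
    query = """{{
    bladerunner(func: has({})) {{
        name
        friend {{ name }}
    }}
    }}""".format(relation)
    res = client.txn(read_only=True).query(query)
    res = json.loads(res.json)['bladerunner']
    # One (source, target) name pair per node; only the first friend is used
    res = np.array([[i['name'], i['friend'][0]['name']] for i in res])
    k1, k2, k3 = np.unique(res, return_inverse=True, return_index=True)
    k3 = k3.reshape(res.shape)  # k3 maps every name to its index in k1
    weights = np.ones(len(k3))  # one weight per edge (was one per unique node)
    G = sparse.csr_matrix((weights, (k3[:, 0], k3[:, 1])), shape=(len(k1), len(k1)))

    return pagerank(G, beta)  # pagerank is my own function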

This way I can successfully get results:

CPU times: user 5.63 ms, sys: 1.94 ms, total: 7.57 ms
Wall time: 11 ms
array([0.25, 0.25, 0.25, 0.25])
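For readers without the `pagerank` helper used above (it is not shown in the post), here is a minimal pure-Python sketch of power-iteration PageRank over an edge list. The name `pagerank_sketch`, the treatment of `beta` as the damping factor, and the 4-node cycle used as input are all assumptions for illustration, not the code or data from the post. A 4-node cycle is simply one graph for which every node ends up with rank 0.25.

```python
def pagerank_sketch(edges, beta=0.85, tol=1e-10, max_iter=100):
    """Power-iteration PageRank over (source, target) pairs.

    beta is assumed to be the damping factor (probability of following a link).
    """
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without out-edges is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - beta) / n + beta * dangling / n for v in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += beta * rank[src] / len(out_links[src])
        converged = sum(abs(new_rank[v] - rank[v]) for v in nodes) < tol
        rank = new_rank
        if converged:
            break
    return rank


# Hypothetical input: a directed 4-node cycle a -> b -> c -> d -> a.
cycle = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
ranks = pagerank_sketch(cycle, beta=0.85)
print(ranks)  # every node gets a rank very close to 0.25
```

This keeps the whole graph in one process, so it has the same memory limits as the NumPy version above.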

But it will kill me if I have to write each algorithm myself (I need to consider memory and distribution problems) :tired_face:

I noticed that someone has already pushed an image that includes some algorithms (how-about-doing-some-graph-compute-in-a-query). But I am afraid it is too old and may not be stable.

Another option is to build spark-dgraph-connector. I think this should be a decent way to apply algorithms from Spark to Dgraph data. However, I am totally new to Spark and don’t know how to use such a connector. Also, I have no one I can turn to… :broken_heart:

Therefore, I really want to know whether anyone can help me out with how to implement those algorithms in Dgraph?

Hi @jokk33,
I am unable to follow your question. Currently in Dgraph you can do a shortest path query and do basic traversal across nodes using @recurse.

Which algorithms do you want to apply?

Remember Dgraph is a database and not a graph processing library, so natively supporting all graph algorithms has never been the intention. Let’s try to break down your problem and understand which queries you can write in Dgraph and which ones you would need to run on the client side.

My bad, sorry for not being clear.
I mean I want to run some basic graph algorithms on Dgraph data, including path finding, centrality, community detection, link prediction, etc.

I totally understand that Dgraph is a database and not a graph processing library, so I want to ask whether there is a good way to perform calculations after querying data from Dgraph. Take the above-mentioned G-Research/spark-dgraph-connector (a connector for Apache Spark and PySpark to Dgraph databases): I think that would be a good way, connecting to Dgraph and running algorithms like PageRank from Spark. But I am not familiar with it, so I am trying to find some examples of applying algorithms to data from Dgraph.

To be more precise, take PageRank as an example. I can easily query data from Dgraph, load it into Python, write a PageRank function and then do the calculation.
This way I get the correct result. It is useful for small data, but if my data is big, the calculation will run out of memory because it is not distributed. With Spark it should work (distributed calculation), like this

All in all, I just want to know whether there is an alternative way to do the query and the calculation together, or whether there is any tutorial on using such a connector?


Hmm… Thanks for the updated information. I don’t see many clients supporting Dgraph natively as an ingestion source for their graph processing algorithms, and I do see a need for that. The Apache Spark connector which you pointed out was developed by our community.

We can look into extending support via such clients, but it would take some time before we have a solution.

In the meantime, for some of the algorithms you mentioned, we can write queries that help you solve them natively in Dgraph.

Can you tell me:

  1. What data and schema are you looking at?
  2. Which exact queries do you want to run (with respect to your data)? E.g. if you have data about people and friends, we can potentially run community detection using some queries.


My real data will be a little more complicated, but an example of community detection and PageRank would save my life!!
Could you please teach me how to run a community detection query or a PageRank query for the sample data below:

  set {
    # friends
    _:a <friend> _:b .
    _:b <friend> _:c .
    _:c <friend> _:d .
    _:d <friend> _:a .
    _:a <friend> _:e .
    _:a <friend> _:f .
    _:f <friend> _:e .
    _:f <friend> _:g .
    _:a <name> "Alice" .
    _:b <name> "Bob" .
    _:c <name> "Tom" .
    _:d <name> "Mallory" .
    _:e <name> "John" .
    _:f <name> "Drake" .
    _:g <name> "Travis" .
  }

You can adjust the sample graph structure; I just need to know how I can write such queries.
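As a client-side baseline for the simplest form of community detection (connected components), the sample data above can be grouped with a small union-find. This is only a sketch: it treats `friend` edges as undirected, it runs in plain Python rather than inside Dgraph, and the helper name `connected_components` is made up for this example.

```python
# The friend edges from the sample data, written with the node names.
edges = [
    ("Alice", "Bob"), ("Bob", "Tom"), ("Tom", "Mallory"), ("Mallory", "Alice"),
    ("Alice", "John"), ("Alice", "Drake"), ("Drake", "John"), ("Drake", "Travis"),
]


def connected_components(edges):
    """Group nodes into connected components, ignoring edge direction."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


components = connected_components(edges)
print(len(components))  # the sample graph forms a single component
```

Because every node in the sample is reachable from Alice, the result is one component of seven people. Adding an unrelated pair of friends would produce a second one.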

For community detection, do you want all connected components, or all nodes connected to a given starting node?

The latter is easy with a recurse query. Take a look at this post.
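A rough sketch of the recurse approach from Python, not taken from the linked post: build a `@recurse` query for a starting name, then flatten the nested JSON that comes back into a set of names. The helper names `recurse_query` and `collect_names`, the block name `reachable`, and the default depth are made up for this example. Looking a node up with `eq(name, ...)` also assumes the `name` predicate is indexed in your schema.

```python
def recurse_query(start_name, edge="friend", depth=10):
    """Build a DQL @recurse query that follows `edge` from the named node."""
    return """{
  reachable(func: eq(name, "%s")) @recurse(depth: %d, loop: false) {
    name
    %s
  }
}""" % (start_name, depth, edge)


def collect_names(nodes, edge="friend"):
    """Flatten the nested @recurse result into a set of names."""
    names = set()
    stack = list(nodes)
    while stack:
        node = stack.pop()
        if "name" in node:
            names.add(node["name"])
        stack.extend(node.get(edge, []))
    return names


print(recurse_query("Alice", depth=5))

# With a pydgraph client as in the first post, the result would be read as:
#   res = client.txn(read_only=True).query(recurse_query("Alice"))
#   names = collect_names(json.loads(res.json)["reachable"])
```

Note that the recursion only follows `friend` in the stored direction, so nodes that point at the start node without being reachable from it are not returned.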

For pagerank, I will have to understand more about that :slight_smile:

did it work? :upside_down_face: