Is there any way to calculate the overlap rate of edges between every pair of nodes?

The data looks like this:

 set {
    _:x1 <gid> "test1" .
    _:x2 <gid> "test2" .
    _:x3 <gid> "test3" .
    _:x4 <gid> "test4" .
    _:x5 <gid> "test5" .

    _:y1 <userid> "u1" .
    _:y2 <userid> "u2" .
    _:y3 <userid> "u3" .
    _:y4 <userid> "u4" .
    _:y5 <userid> "u5" .
    _:y6 <userid> "u6" .
    _:y7 <userid> "u7" .
    _:y8 <userid> "u8" .
    _:y9 <userid> "u9" .
    _:x1 <impress> _:y1 .
    _:x1 <impress> _:y2 .
    _:x1 <impress> _:y3 .
    _:x1 <impress> _:y4 .
    _:x1 <impress> _:y5 .
    _:x2 <impress> _:y1 .
    _:x2 <impress> _:y2 .
    _:x2 <impress> _:y3 .
    _:x2 <impress> _:y8 .
    _:x3 <impress> _:y1 .
    _:x3 <impress> _:y2 .
    _:x3 <impress> _:y8 .
    _:x4 <impress> _:y6 .
    _:x4 <impress> _:y7 .
    _:x5 <impress> _:y6 .
    _:x5 <impress> _:y7 .
 }

The overlap rate is defined as intersection/union.
How can I write a query that calculates the overlap rate for every pair of nodes (x1 to x5)?
For example:
x1 connects to [y1,y2,y3,y4,y5]
x2 connects to [y1,y2,y3,y8]
The union of x1 and x2 is [y1,y2,y3,y4,y5,y8]
The intersection of x1 and x2 is [y1,y2,y3]
So the rate is 3/6.
The question is: how can I write a looping query that calculates the rate between every pair of nodes? The expected result would be:
[x1,x2]: 3/6
[x1,x3]: 2/6
[x1,x4]: 0/7

[x4,x5]: 2/2

Hi, it’s surprisingly difficult to do graph-y stuff in Dgraph. I’m working on writing up some tips and tricks that I’ve found useful for graph operations (indegrees, outdegrees, etc.).

Here’s what I have so far for your problem (I had to index gid as terms, and add @reverse to impress):

  {
    # B: users reached by test1
    var(func: allofterms(gid, "test1")) {
      # A as uid
      B as impress { uid }
    }
    # D: users reached by test2
    var(func: allofterms(gid, "test2")) {
      # C as uid
      D as impress { uid }
    }
    # Union: every user in B or D
    var(func: uid(B, D)) {
      U as count(userid)
    }
    # Intersection: users in B that are also in D
    var(func: uid(B)) @filter(uid(D)) {
      I as count(userid)
    }
    var() { II as sum(val(I)) }
    var() { UU as sum(val(U)) }
    q() {
      aa: math(II*1.0/UU*1.0)
      intersection: sum(val(I))
      union: sum(val(U))
    }
  }

I get the following values as a result:

"data": {
    "q": [
      {
        "aa": 0.5
      },
      {
        "intersection": 3
      },
      {
        "union": 6
      }
    ]
}

The *1.0 is a trick to promote the counts to floats, so that the division is a float division rather than an integer one.
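
The query above only handles one pair (test1 and test2). To cover every pair, one option is to fetch each gid's impress list once and do the pairing client-side. Here is a rough sketch of that idea in Python. Note the assumptions: the adjacency is hard-coded from the sample data in the question rather than fetched from Dgraph, and the names `impress`, `overlap_rate` and `pairwise_overlap` are made up for illustration.

```python
from itertools import combinations

# Adjacency copied by hand from the sample data in the question.
# In practice this would be built from a Dgraph query that returns
# each gid together with the userids on its impress edges.
impress = {
    "test1": {"u1", "u2", "u3", "u4", "u5"},
    "test2": {"u1", "u2", "u3", "u8"},
    "test3": {"u1", "u2", "u8"},
    "test4": {"u6", "u7"},
    "test5": {"u6", "u7"},
}


def overlap_rate(a, b):
    """Return (intersection size, union size) for two sets."""
    return len(a & b), len(a | b)


def pairwise_overlap(adjacency):
    """Map every unordered pair of keys to its (intersection, union) sizes."""
    return {
        (x, y): overlap_rate(adjacency[x], adjacency[y])
        for x, y in combinations(sorted(adjacency), 2)
    }


rates = pairwise_overlap(impress)
for (x, y), (inter, union) in rates.items():
    # Guard against two nodes that both have no edges at all.
    ratio = inter / union if union else 0.0
    print(f"[{x},{y}]: {inter}/{union} = {ratio:.2f}")
```

Keeping the intersection and union sizes as a pair, instead of only the ratio, makes it easy to compare against the 3/6, 2/6, 0/7 and 2/2 values listed in the question. The number of pairs grows quadratically with the number of gid nodes, so this is only a reasonable approach for fairly small sets.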