Query to detect a missing edge

Hello, I’m trying to write a specific query, and just cannot work out how to achieve this one. Any help appreciated!

For every Edge1 in my database, this exact pattern should exist. However in some cases Edge3 can be missing, so I’m trying to find all Node3-Node4 pairs where Edge3 is missing (or even better, add the link in directly).

Ideally I need to iterate through all Edge1’s in a single query, as there are many millions of them and I do not want to issue a query for each edge.

This is the best query I managed to come up with:

upsert {
  query {
    A as var(func:has(Node1.Edge1)) {~Node4.Edge4 {B as uid}}
	
	var(func:uid(A)) {
	  Node1.Edge1 {
		~Node3.Edge2 {
          C as uid
          ~Node4.Edge3 @filter(uid(B))
           {D as uid}
		}
	  }
    }
  }

  mutation @if(not eq(len(D), 1)) {
    set {
      uid(B) <Node4.Edge3> uid(C) .
    }
  }
}

However this has the issue that is will cartesian all B’s and C’s together from all results, it is not limited to just this one loop/pattern.

I also tried standard queries to just return the B/C pairs with missing Edge3’s, but all had this cartesian problem whatever I tried.

Hi, the problem looked interesting, so I decided to have some fun with it! Did you manage to solve it?

Ideally I need to iterate through all Edge1’s in a single query (& add the link in directly)

I don’t think it’s possible using upsert, as there isn’t really any loops in queries, (and for millions of nodes at once, server resource usage may be an issue). I’d solve problems like this through multiple upserts.

upsert {
  query {
    d(func:has(Node1.Edge1)) {~Node4.Edge4 {D as uid}}
    d1(func:uid(D), first: 1) @filter(NOT has(Node4.Edge3)) { D1 as uid}
    c(func:uid(D1)) { Node4.Edge4 { Node1.Edge1 { ~Node3.Edge2 { C as uid }}}}
    c2(func:uid(C)) @filter(NOT has(~Node4.Edge3)) {C2 as uid}
  }
  mutation @if(ge(len(C2), 1) AND eq(len(D1), 1)) {
    set {
      uid(D1) <Node4.Edge3> uid(C2) .
    }
  }
}

Probably can be improved through codegen, by batching multiple queries & mutation blocks at once, and playing with first: 2, offset: 1 etc.

Ideally I need to iterate through all Edge1’s in a single query

However it may be possible to just do a query. I modified the query above and used uid_in to do some filtering. Not sure it will work on your real data.

  query {
    var(func:has(Node1.Edge1)) {~Node4.Edge4 {D as uid}}
    var(func:uid(D), first: 1000) @filter(NOT has(Node4.Edge3)) { D1 as uid}
    c(func:uid(D1)) { uid Node4.Edge4 { Node1.Edge1 { ~Node3.Edge2 @filter(NOT uid_in(~Node4.Edge3, uid(D1))) { uid }}}}
  }