Using @cascade in an upsert query can result in *all* nodes being updated, despite "first: 1"

What I want to do

I want to update only one node returned in the query block of the upsert while using @cascade, but the request mutates (sets or deletes) all matching nodes instead.

What I did

This happens 100% of the time, but only with @cascade.

Set initial values:

{
  set {
    <_:blank-0> <a> "Value 0" .
    <_:blank-1> <a> "Value 1" .
    <_:blank-2> <a> "Value 2" .
  }
}

Attempt to mutate one row:

upsert {
  query {
    tmp as var(func: has(a), first: 1) @normalize @cascade {
      a: a
    }
  }
  
  mutation {
    set {
      uid(tmp) <a> "Changed value" .
    }
  }
}

Read values to see that they were all changed:

query {
  tmp(func: has(a)) {
    a
  }
}

returns:

  "data": {
    "tmp": [
      {
        "a": "Changed value"
      },
      {
        "a": "Changed value"
      },
      {
        "a": "Changed value"
      }
    ],
  ...
  }
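For anyone reproducing this, counting the affected nodes programmatically is less error-prone than eyeballing the output. The following is only a minimal Python sketch and not part of the original report. It assumes the JSON response has been saved as a string, and `count_with_value` is a hypothetical helper name. The sample below reproduces only the "data" part of the response shown above.

```python
import json

def count_with_value(response_text, block, predicate, value):
    """Count nodes in one query block whose predicate equals the given value."""
    data = json.loads(response_text).get("data", {})
    return sum(1 for node in data.get(block, []) if node.get(predicate) == value)

# Sample shaped like the response above (elided parts are left out).
sample = '''
{"data": {"tmp": [
  {"a": "Changed value"},
  {"a": "Changed value"},
  {"a": "Changed value"}
]}}
'''

changed = count_with_value(sample, "tmp", "a", "Changed value")
print(changed)  # prints 3
```

With the behavior reported here the count is 3; an upsert that honored first: 1 would leave it at 1.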

Clean up before running the test again:

upsert {
  query {
    tmp as var(func: has(a), first: 1)
  }
  
  mutation {
    delete {
      uid(tmp) <a> * .
    }
  }
}

If you remove @cascade from the upsert, you’ll see that only one node is updated (although obviously that changes the semantics of the query).

Dgraph metadata

dgraph version
Dgraph version   : v21.03.0
Dgraph codename  : rocket
Dgraph SHA-256   : b4e4c77011e2938e9da197395dbce91d0c6ebb83d383b190f5b70201836a773f
Commit SHA-1     : a77bbe8ae
Commit timestamp : 2021-04-07 21:36:38 +0530
Branch           : HEAD
Go version       : go1.16.2
jemalloc enabled : true

What about this query?

upsert {
  query {
    q(func: has(a), first: 1) @normalize @cascade {
      tmp as uid
      a
    }
  }
  
  mutation {
    set {
      uid(tmp) <a> "Changed value" .
    }
  }
}

Yeah, that upsert does the same thing: it updates all of the nodes.

Quick question: what is the reason for using @cascade and @normalize? This looks like a bug (I haven’t tested it myself), but the reason is important.

I’m using @cascade to exclude nodes that don’t have all of the predicates I’m querying, or whose predicates don’t contain specific values (I’m doing some @filter(uid_in(pred, uid(abc))) filtering). The goal is to find the first matching node and then apply a mutation to it in the same request.
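The expectation described here can be spelled out with a minimal Python sketch. It is a plain in-memory model of the intended result, not Dgraph's implementation, and the node data, the predicate "b", and the function name `first_complete_nodes` are all made up for illustration. Nodes lacking any requested predicate are dropped first (the @cascade part), and only then are the first n survivors kept (the first: 1 part).

```python
def first_complete_nodes(nodes, predicates, first):
    """Keep nodes that have every requested predicate, then take the first `first`."""
    complete = [n for n in nodes if all(p in n for p in predicates)]
    return complete[:first]

# Hypothetical data: the first node lacks predicate "b", so it is filtered out.
nodes = [
    {"uid": "0x1", "a": "Value 0"},
    {"uid": "0x2", "a": "Value 1", "b": "x"},
    {"uid": "0x3", "a": "Value 2", "b": "y"},
]

targets = first_complete_nodes(nodes, ["a", "b"], first=1)
print([n["uid"] for n in targets])  # prints ['0x2']
```

Under this model only the single node in `targets` should ever reach the mutation block.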

By the way, I just found a workaround:

upsert {
  query {
    q(func: has(a), first: 1) @normalize @cascade {
      tmp as uid
      a
    }
    
    x as var(func: uid(tmp), first: 1)
  }
  
  mutation {
    set {
      uid(x) <a> "Changed value" .
    }
  }
}

Also, you never need @normalize in an upsert query.

The first parameter in both blocks has no effect. The problem was between the DQL query and the mutation; passing the variable between query blocks should be fine.

I use @normalize because I also read the predicates returned by the request and need (well, want) them in a particular format, the same format that I use for regular ol’ read requests.

Okay, but you used “var”, so you couldn’t see any result from it. Anyway, I’m glad you solved it.

Cheers.

Sorry, didn’t see this reply until now. Sure, in this specific case I didn’t need a result from it, but I do in my real use case, in a much larger query with more information than I thought would be useful to show the bug.

I don’t think I’ve solved the issue. I’m not sure what I did is 100% correct, but I am pretty sure the originally reported behavior is incorrect: it deletes and updates data other than what was intended (kind of frightening, in fact). If it is the intended, correct behavior, I think it’d be worth putting a warning about it in the @cascade documentation (font size 72pt+).

FWIW: I just noticed a bug in the cleanup query I pasted in the initial post:

upsert {
  query {
    tmp as var(func: has(a), first: 1)
  }
  
  mutation {
    delete {
      uid(tmp) <a> * .
    }
  }
}

^ That query correctly deletes one node as you’d expect.

upsert {
  query {
    tmp as var(func: has(a), first: 1) @cascade
  }
  
  mutation {
    delete {
      uid(tmp) <a> * .
    }
  }
}

^ That query deletes all nodes.