Would like support of eq/le/lt/ge/gt in mutation conditional upsert other than existing len function only

Moved from GitHub dgraph/5281

Posted by dilipkumar2k6:

What version of Dgraph are you using?

dgraph:v20.03.0

Have you tried reproducing the issue with the latest release?

Already using latest version

What is the hardware spec (RAM, OS)?

MAC OS, 32GB RAM

Steps to reproduce the issue (command/config used to run Dgraph).

# Insert Node A with timestamp
{
	set {
    _:nodeA <dgraph.type> "NodeA" .
    _:nodeA <nodeAId> "sb1" .
    _:nodeA <timestamp> "2020-04-20T22:15:19.794Z" .
  }
}
# Query nodeA too make sure it exist, use same query as upsert to test that its working
{
        var(func: eq(nodeAId, "sb1")) {
            existingTimestamp as timestamp
          }
       nodeA (func: le(val(existingTimestamp), "2020-04-20T22:15:19.794Z")) {
        uid
        timestamp
      }
}

# Run upsert on NodeB
upsert {
        query {
	var(func: eq(nodeBId, "eb12")) {
            nodeBId as uid
          }

          var(func: eq(nodeAId, "sb1")) {
            existingTimestamp as timestamp
          }
        }
        # create NodeB  node if doesn't exist based on nodeA timestamp
        mutation @if(le(val(existingTimestamp), "2020-04-20T22:15:19.794Z")) {
          set {
            uid(nodeBId) <createdAt> "2020-04-23T00:41:48.497Z" .
            uid(nodeBId) <dgraph.type> "NodeB" .
          }
        }
      }

Expected behaviour and actual result.

conditional upsert should work find on field other than uid.
Sample example given at following document is using uid only. In my case, I want to apply val on the predicate uid and then compare with corresponding value, but its not working.
https://dgraph.io/docs/master/mutations/#example-of-multiple-mutation-blocks

2 Likes

MichelDiz commented :

I think it works only with Length function. Correct me If I’m wrong. But I’m pretty sure about it, as I saw that code borning.

So move your Check to an intermediate block. e.g:

upsert {
  query {
    var(func: eq(nodeBId, "eb12")) {
      nodeBId as uid
    }

    v as var(func: eq(nodeAId, "sb1"))
    @filter(le(timestamp, "2020-04-20T22:15:19.794Z"))
  }

  # create NodeB  node if doesn't exist based on nodeA timestamp
  mutation  @if(eq(len(v), 1)) {
    set {
      uid(nodeBId) <createdAt> "2020-04-23T00:41:48.497Z" .
      uid(nodeBId) <dgraph.type> "NodeB" .
    }
  }
}

There is something curious about its mutation. You say “create NodeB node if doesn’t exist based on nodeA timestamp”. But you are using an existing UID from “eb12”. I was confused by this.

Upsert Block will create a node if it doesn’t exist. I didn’t understand why you are collecting an existing UID and saying that you are going to create a new one.

dilipkumar2k6 commented :

@MichelDiz Thx for your reply. I only shared snippet of logic. We have need to check for existing timestamp to decide if needs to create new node or not.

Moving timestamp check to intermediate block option should work, will try.

I think it works only with Length function. Correct me If I'm wrong.

Yes, we also observed this. Can we fix it? This will help to write neat code and also bring more power on conditional upsert.

shaneu commented :

I am also facing a similar issue. Consider a type File

type File {
  fileId
  timestamp
  lines
  ... other fields elided ...
}

If I wanted to upsert values based on events that might be out of order, I’d like to be able to perform a query as follows

// incoming event message
{
  "fileId": "1234a",
  "timestamp": "2020-04-23T22:15:19.794Z",
  "lines": 45
}
upsert {
  query {
    var(func: eq(fileId, "${fileId}")) @filter(lt(timestamp, "${timestamp}")) {
      fileUid as uid
      existingLines as lines
    }
  }

  mutation @if(not(eq(val(existingLines), "${lines}"))) {
    set {
      uid(fileId) <lines> "${lines}" .
    }
  }
}

Now the funny thing is the above will actually update the value of <lines>, but not for the reason we’d like. The reason the above will mutate the field is because eq(val(existingLines), "${lines}") will always evaluate too false, and not will flip it too true. Of course this isn’t the desired behavior.

I understand that len seems to be the only operator one can use on a variable in an @if statement, but I think supporting the full range of comparison operations that are available in @filter and queries would provide more power for conditional mutations, and lead to cleaner code.

As I see it now, the only option to achieve the behavior I’m looking for would be to add a query for every conditional upsert I want to make like var(func: eq(fileId, "${fileId}")) @filter (le(timestamp, "${timestamp}") AND not(eq(lines, "${lines}"))), but I’d have to perform a similar check for every field i’d like to conditionally upsert, which outside of this contrived example would lead to a very large and complex upsert operation.

Please let us know if there is any way to support a more robust set of functions for conditional mutations, thank you in advance for you consideration.

MichelDiz commented :

I understand perfectly that it is a logical leap to think that the mutation block parameter field would accept inequality functions and other functions in GQL+-. However, it sounds a little strange from the design point of view. If it weren’t, I think Aman (who worked on that code) would easily support it. And he wouldn’t need to create a new function (length was created just for Upsert Block) just for that.

See, parameters at root is used to evaluate the entity itself. A mutation does not have an entity, it is used to create an entity not to find one. So it doesn’t make sense for a parameter in something that doesn’t yet exist.

If we are going to adopt any GQL+- function in the Upsert mutation block. The first parameter will always be external (using val) reference. It’s okay to support that. But it could be done in the query field itself. And use length.

That’s my two cents about what happened and what I think.

If you really want this, you should upvote (like) the main comment (the issue itself). To show interest. Without movement, your issue goes unnoticed or not prioritized.

@dilipkumar2k6 I think you should update your issue showing that you understand that there is no support for this and wishes for it. This makes it easier for the engineer, so he doesn’t have to read every conversation and understand what’s going on. No need to erase what was there before, just add a note.

By the way, this conversation should have been done at http://discuss.dgraph.io/. Because there we have a better search engine than Github, this would help other users to understand and participate in the discussion.

Anyway, I will accept it in the backlog.

Cheers.

dilipkumar2k6 commented :

@MichelDiz thx for your response. I updated issue to reflect what you suggested.
On side note; my team would like to do PR for this feature if that’s feasible option and manageable effort.

MichelDiz commented :

Oh, that’s really great! Your PR will be welcome!

@lgalatin FYI about the contribution from @dilipkumar2k6. As I have accepted it, maybe we should have a separated thing or remove it from Jira.

lgalatin commented :

Once @dilipkumar2k6 PR is submitted and reviewed and accepted, the Jira ticket will automatically close. We will not work on it (thank you @dilipkumar2k6 ).

insanitybit commented :

I believe I have a use case for this. I have nodes with fields that should only increase/ decrease.

When I get new data I have to first check “Is the node’s property higher than the new data, or lower” and then perform the logic to update if necessary.

There are also properties that should never update - I want to avoid performing a write if the property is already set.