Is it possible to mutate with query like in upsert?

Upsert syntax is really useful however it inserts data if query result is empty.
Is there a similar syntax to support bulk mutations based on query results?

Second question: Is it safe to bulk update 1 million predicates in one query or shall we execute updates in batches?

PS: RDMS users are very familiar with single UPDATE statement combined with a query to update entire table at once. It would be nice to have a similar functionality with Dgraph.

For example: Mark all messages read.
SQL: UPDATE Messages SET isRead = true WHERE msgOwner = XXX

You can just do an upsert

upsert  {
    query{
        q(func: type(message)) @filter(eq(msgOwner, "xxx")) { v as uid }
    }
    mutate {
        uid(v) <isRead> "true" . 
    }
}

Hi @chewxy
I don’t want to create new record. If query result is empty this inserts new record.
I want same functionality without an insert.

It won’t insert new records. To insert new records the mutation would have something that looks like

_:VARIABLENAME <isRead> true .

The subtle difference is the _: . That creates a new node. But if you have the UIDs already, then it just creates/sets the isRead predicate

Below query inserts a new record. I tested and confirmed.

upsert  {
    query{
        q(func: type(message)) @filter(eq(message-read, false))
        {
          v as uid 
        }
    }
    mutation {
      set {
      	uid(v) <dgraph.type> "message" .
      	uid(v) <isRead> "true" . 
      }
    }
}

I found a workaround with conditional mutation.

upsert  {
    query{
        q(func: type(message)) @filter(eq(message-read, false))
        {
          v as uid 
        }
    }
    mutation @if(gt(len(v),0))  {
      set {
      	uid(v) <dgraph.type> "message" .
      	uid(v) <isRead> "true" . 
      }
    }
}

However, the upsert is not a good naming as it confuses query writer.
I would like to have an update mode without giving an if mutation condition ensuring there is no insert.

Um, no. It should work. The following are my schema and initial dataset:

Schema:

<isread>: bool .
<raw>: string .

type <message> {
	isread
	raw
}

Initial Dataset:

{

  set {
   _:a <dgraph.type> "message" .
   _:a <isread> "false" .
   _:a <raw> "Hello there" .
  }

}

Upsert that I made:

$ curl -H "Content-Type: application/rdf" -X POST localhost:8080/mutate?commitNow=true -d $'upsert  {
    query{
        q(func: type(message)) @filter(eq(isread, false))
        {
          v as uid 
        }
    }
    mutation {
      set {
      uid(v) <isread> "true" . 
      }
    }
}' | jq

Response:

{
  "data": {
    "code": "Success",
    "message": "Done",
    "queries": {
      "q": [
        {
          "uid": "0x7"
        }
      ]
    },
    "uids": {}
  },
  "extensions": {
    "server_latency": {
      "parsing_ns": 31365,
      "processing_ns": 2097074,
      "encoding_ns": 34441,
      "assign_timestamp_ns": 834844,
      "total_ns": 3248930
    },
    "txn": {
      "start_ts": 4751,
      "commit_ts": 4752,
      "preds": [
        "1-isread"
      ]
    }
  }
}

And when I query, I find that the values have been updated

  q(func: type(message)){
    expand(_all_)
  }

Regarding the name “Upsert Block”

I see what the confusion is now. I sorta agree with you that it’s not a great name.

It’s called an “upsert block” because you use it to update/insert predicates (also nodes). In fact, inserting nodes is a special case, requiring the special format _:varname for the subject. It’s basically a combination of query and mutation blocks.

Now, there’s no concept of “record” in Dgraph. See this blog post for more information of how data is stored in graphs in Dgraph.

We can of course have vague notions of "record"s. In a mutation

set {
    _:foo <pred1> "pred1 value for foo" .
    _:foo <pred2>  "pred2 value for foo" .

    _:bar <pred1> "pred1 value for bar" .
    _:bar <pred2> "pred2 value for bar" .
}

We can mentally group foo as a “record” and bar as a “record”. But really they’re not real records in the same way as they are in a SQL database.

But we’re in a graph database regime, so we need to think of things in terms of graphs.

The upsert query you used inserts a new predicate value.
You query by type that is why you can’t see the new data.
Do a query without dgraph.type filter and you will see the inserted data.

Here, my initial value (note the UID):

My upsert:

Results:

No new predicates were inserted.

p/s: these are the same queries as the one above. Feel free to copy/paste and try it on your own

Hi @chewxy
I think we have a misunderstanding here. Apologize for my poor definition in the question.
The upsert only inserts data if the filter condition does not match any record.
So in your case your filter matches some records and you don’t get any inserts.
If there was no message with isread = false in the initial case,
your upsert query would insert data. This is how it works.

The confusion comes in play when we just want to update the data based on some query.
To do so, we can use conditional mutation as a workaround like I posted above.
I believe, it would be better if we have an update syntax exactly like upsert but ensuring no insert at all.

Oh yes you are right. In that case you will want to use the @if directive, which I reckon you’ve discovered.

Again, I agree with you that the name is unfortunate.

On the latter bit I will raise an issue.

1 Like