Improve Facets mutation handling

Moved from GitHub dgraph/1996

Posted by willcj33:

Currently, if I write a facet like so <n1> <pred> <n2> (fromMerge:true) . and then afterwards, another process “upserts” the same two nodes and edge like so <n1> <pred> <n2> ., the fromMerge boolean facet is removed.

If I don’t alter the facets in any way, would it make sense to not delete them? It would make managing edge facets easier on us, or at least allow an option to NOT overwrite the current facets.

This is currently an issue for us because when the fromMerge is written, there are specific things that only ever happen in that context that no other process that upserts items knows about. Thus I continue to lose that facet, which is used for specific and critical actions down the road.

Using 1.0.1 and the Golang Client.

MichelDiz commented :

I believe you should instantiate the data from Facet. For the idea behind the Facet is to have a “status” of that information. As an extra, but not as a immutable. So if you change it, the status should be changed too. Or in that case, deleted. if you want to keep or refresh the Facet you should control this in the application as if it were a pipe.

That’s my two cents.

pawanrawal commented :

This is more complicated than I expected. My advice would be to fetch the facets first before upserting for now. Implementing this properly will take some time.

manishrjain commented :

So, if we have round brackets, then merge. If no round brackets, delete facets.

jimanvlad commented :

Is this likely to be implemented base don @manishrjain 's suggestion above?

manishrjain commented :

We’ll definitely work on this, no exact timeline yet.

MichelDiz commented :

I think an “append” operation should solve the case. Perhaps in cases of KV overwriting we should have an alert preventing the mutation. Or maybe just overwrite it.

Who self-assign? @srfrog , @codexnull , @gitlw , @mangalaman93, @martinmr ?

AlexanderCHarrington commented :

Has there been any decision on this?

lelvisl commented :

Some updates???

campoy commented :

Hi all,

Sorry for the delay! I’m trying to better understand this feature request.
If I understood correctly when you insert the same edge multiple times you’d like to sometimes specify that the previous facets should be merged with the new ones.

What we have now

Currently, if we have a predicate met between harry and voldemort which would have been created with the following mutation:

{
    set {
        _:harry <name> "Harry" .
        _:voldemort <name> "Voldemort" .
        _:harry <met> _:voldemort (inYear=2001) .
    }
}

Currently, in order to add a new facet, e.g. likes=false, we need to first fetch the previous facets attached to the predicate so we can then merge them on the client side and send them all back in a new mutation that would look something like the following:

{
    set {
        <0xb> <met> <0xc> (inYear=2001, likes=false) .
    }
}

So if we now run the following query, we’ll get both the inYear and likes facets:

{
  q(func:eq(name, "Harry")) {
    uid
    name
    met @facets {
      uid
      name
    }
  }
}

Which returns:

{
  "extensions": {...},
  "data": {
    "q": [
      {
        "uid": "0xb",
        "name": "Harry",
        "met": [
          {
            "uid": "0xc",
            "name": "Voldemort",
            "met|inYear": 2001,
            "met|likes": false
          }
        ]
      }
    ]
  }
}

Proposal

Fetching all facets before adding a new one can be sometimes complicated, requires transactions, and seems to not be the easiest way to add one more facet to an existing predicate.

We would like to be able to provide syntax for the following cases:

  1. we want to remove all previous facets
  2. we want to replace all previous facets with a new set
  3. we want to keep the existing facets and add a new set

Currently, (1) is simply done by not passing any facets.
Similarly, (2) is done by passing a list of facets in the mutation.

So we need to provide a new syntax for (3). Since the N-Triples specification does not provide any support for facets, I would recommend simply adding a + right before the parentheses grouping the facets.

So given the previous dataset, where only the inYear facet exists between Harry (0xb) and Voldemort (0xc), we have four different cases:

  • <0xb> <met> <0xc> .: the resulting predicate has no facets, as before.
  • <0xb> <met> <0xc> (likes=false) .: the resulting predicate has only the likes facet, as before.
  • <0xb> <met> <0xc> +(likes=false) .: the resulting predicate has both inYear and likes facets.
  • <0xb> <met> <0xc> +() .: the resulting predicate keeps the same facets as before, just inYear.

And now, in JSON

For completion, the corresponding JSON mutation would look like the following:

{
    "set": [
      {
        "name": "Harry",
        "met": {
          "name": "Voldemort",
          "met|inYear": 2001
        }
      }
    ]
}

In order to add a new likes facet, the mutation would now look like the following:

{
    "set": [
      {
        "uid": "0xb",
        "met": {
          "uid": "0xc",
          "met|likes": false
        }
      }
    ]
}

This mutation would act as <0xb> <met> <0xc> (likes=false) ., removing the previous facets.
So how do we indicate that the previous facets should be kept in JSON format?

We have two options, the first one is simply not to support this use case in JSON. This might be a decent temporary solution but eventually, we’ll have to fix it.

The first solution that comes to mind is to use the same syntax we have now for facets pred|facet but combine it in a way that makes it so the change would be backwards compatible. Facet names can’t contain the character @, so let’s use it to add a pseudo facet @keep with a boolean value (set to false by default).

When this is set to true, the previous facets will be conserved similarly to the +() syntax for RDF.

So in order to add a new likes facet, we could write:

{
    "set": [
      {
        "uid": "0xf",
        "met": {
          "uid": "0x10",
          "met|likes": false,
          "met|@keep": true
        }
      }
    ]
}

This would act exactly as <0xb> <met> <0xc> +(likes=false) ., and the resulting predicate would have both likes and inYear facets.

What do you all think?

mangalaman93 commented :

A few questions -

  • Why not use + in json format as well? (That also leads to the side note, below)
  • What about Upsert on facets?
  • What about technical challenges in doing this? I’m wondering this makes sense, why didn’t we do this in the first place? Were there technical challenges that we faced?

Side Note -
As far as I know, we do not have well defined rules for variables names, predicate names, facet names etc. We should materialize a definition and update the implementation to follow those rules.

MichelDiz commented :

RFC: Improve Facets with Upsert Block

I think that the upsert would be very useful.

e.g - Here we copy rating, something, and somethingElse and add 2 new values to the facet.

upsert {
      query {
        me(func: eq(email, "aman@dgraph.io")) {
          v as uid
          edge as someEdge @facets(r as rating, t as something, e as somethingElse)
        }
      }
      mutation {
          set {
            uid(v) <someEdge> uid(edge) (weight=0.1, updated_at=2019-06-02T13:01:09, val(r), val(t), val(e)) .
          }
      }
    }

We could have a “Spread operator” (Recurse) to save time. In some cases that there are many values in facets.
e.g

upsert {
      query {
        me(func: eq(email, "aman@dgraph.io")) {
          v as uid
          edge as someEdge @facets(r as ...)
        }
      }
      mutation {
          set {
            uid(v) <someEdge> uid(edge) (weight=0.1, updated_at=2019-06-02T13:01:09, val(r)) .
          }
      }
    }

The three dots would be the operator (r as ...). It will copy and paste the previous values together with the new values.