Facets format in mutation requests and query responses

Problem Statement:

Up to v1.1.0, facets had an identical JSON format in mutation requests and query responses. But we did not have the ability to fetch facets on scalar list predicates (see this issue).
To support fetching facets on scalar list predicates, we changed the query response format in versions after v1.1.0. The response format is now uniform when fetching facets on any predicate type.
For all predicate types, responses now place the facets of a predicate outside the predicate (compare the facet responses up to v1.1.0 and after v1.1.0, shown below).

NOTE: In this document, request refers to JSON mutation requests and response refers to JSON query responses.

Request/Response format comparisons up to and after v1.1.0

Scalar predicate:

  1. RDF mutation request:
_:1 <name> "Alice" (since="birth") .
  1. Equivalent JSON mutation request:
{
  "set": {
    "uid": "_:1",
    "name": "Alice",
    "name|since": "birth"
  }
}
  1. Facet query:
{
   q(func: has(name)) {
      name @facets
   }
}
  1. Response up to v1.1.0:
{
  "data": {
    "q": [
      {
        "name|since": "birth",
        "name": "Alice"
      }
    ]
  }
}
  1. Response after v1.1.0:
{
  "data": {
    "q": [
      {
        "name|since": "birth",
        "name": "Alice"
      }
    ]
  }
}

Scalar list predicate:

  1. RDF mutation request:
_:1 <nickname> "Joshua" (kind="official") .
_:1 <nickname> "David" .
_:1 <nickname> "Josh" (kind="friends") .
  1. Equivalent JSON mutation request:
{
  "set": [
    {
      "uid": "_:1",
      "nickname": "Joshua",
      "nickname|kind": "official"
    },
    {
      "uid": "_:1",
      "nickname": "David"
    },
    {
      "uid": "_:1",
      "nickname": "Josh",
      "nickname|kind": "friends"
    }
  ]
}
  1. Facet query:
{
   q(func: has(nickname)) {
      nickname @facets
   }
}
  1. Response up to v1.1.0:
    We did not support fetching facets on scalar list predicates up to v1.1.0. Hence the response below has no facets.
{
  "data": {
    "q": [
      {
        "nickname": [
          "David",
          "Josh",
          "Joshua"
        ]
      }
    ]
  }
}
  1. Response after v1.1.0:
{
  "data": {
    "q": [
      {
        "nickname|kind": {
          "1": "friends",
          "2": "official"
        },
        "nickname": [
          "David",
          "Josh",
          "Joshua"
        ]
      }
    ]
  }
}
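
As a rough illustration of how a client might consume the post-v1.1.0 scalar list response above, the following Go sketch pairs each list value with the facet stored under its index. The type and function names (`node`, `nicknameWithKind`, `pairFacets`, `decodeNicknames`) are made up for this example and are not part of any Dgraph client; the sketch only assumes the index keys refer to positions in the returned list.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// node mirrors one entry of "q" in the response above (hypothetical type).
type node struct {
	Nickname     []string          `json:"nickname"`
	NicknameKind map[string]string `json:"nickname|kind"`
}

// nicknameWithKind pairs a list value with its optional "kind" facet.
type nicknameWithKind struct {
	Value string
	Kind  string // empty when the value carries no "kind" facet
}

// pairFacets joins each value with the facet stored under its list index.
func pairFacets(n node) ([]nicknameWithKind, error) {
	out := make([]nicknameWithKind, len(n.Nickname))
	for i, v := range n.Nickname {
		out[i].Value = v
	}
	for key, kind := range n.NicknameKind {
		i, err := strconv.Atoi(key)
		if err != nil || i < 0 || i >= len(out) {
			return nil, fmt.Errorf("bad facet index %q", key)
		}
		out[i].Kind = kind
	}
	return out, nil
}

// decodeNicknames parses one response node and pairs values with facets.
func decodeNicknames(raw []byte) ([]nicknameWithKind, error) {
	var n node
	if err := json.Unmarshal(raw, &n); err != nil {
		return nil, err
	}
	return pairFacets(n)
}

func main() {
	raw := []byte(`{
		"nickname|kind": {"1": "friends", "2": "official"},
		"nickname": ["David", "Josh", "Joshua"]
	}`)
	pairs, err := decodeNicknames(raw)
	if err != nil {
		panic(err)
	}
	for _, p := range pairs {
		fmt.Printf("%s kind=%q\n", p.Value, p.Kind)
	}
}
```

Values without a facet (here "David") simply have no entry in the index map, so the sketch leaves their `Kind` empty.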

UID predicate:

  1. RDF mutation request:
_:1 <name> "San Francisco" .
_:1 <state> _:2 (capital=false) .
_:2 <name> "California" .
  1. Equivalent JSON mutation request:
{
  "set": {
    "uid": "_:1",
    "name": "San Francisco",
    "state": {
      "uid": "_:2",
      "name": "California",
      "state|capital": false
    }
  }
}
  1. Facet query:
{
   q(func: has(state)) {
      name
      state @facets {
         name
      }
   }
}
  1. Response up to v1.1.0:
{
  "data": {
    "q": [
      {
        "name": "San Francisco",
        "state": {
          "name": "California",
          "state|capital": false
        }
      }
    ]
  }
}
  1. Response after v1.1.0:
{
  "data": {
    "q": [
      {
        "name": "San Francisco",
        "state": {
          "name": "California"
        },
        "state|capital": false
      }
    ]
  }
}

UID list predicate:

  1. RDF mutation request:
_:1 <name> "Alice" .
_:1 <speaks> _:2 (fluent=true) .
_:1 <speaks> _:3 (fluent=false) .
_:2 <name> "Spanish" .
_:3 <name> "Chinese" .
  1. Equivalent JSON mutation request:
{
  "set": {
    "uid": "_:1",
    "name": "Alice",
    "speaks": [
      {
        "uid": "_:2",
        "name": "Spanish",
        "speaks|fluent": true
      },
      {
        "uid": "_:3",
        "name": "Chinese",
        "speaks|fluent": false
      }
    ]
  }
}
  1. Facet query:
{
  q(func: uid(0x1)) {
    name
    speaks @facets {
      name
    }
  }
}
  1. Response up to v1.1.0:
{
  "data": {
    "q": [
      {
        "name": "Alice",
        "speaks": [
          {
            "name": "Spanish",
            "speaks|fluent": true
          },
          {
            "name": "Chinese",
            "speaks|fluent": false
          }
        ]
      }
    ]
  }
}
  1. Response after v1.1.0:
{
  "data": {
    "q": [
      {
        "name": "Alice",
        "speaks": [
          {
            "name": "Spanish"
          },
          {
            "name": "Chinese"
          }
        ],
        "speaks|fluent": {
          "0": true,
          "1": false
        }
      }
    ]
  }
}
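
To make the mismatch between the two formats concrete, here is a minimal Go sketch of the UID list case above. It assumes a client models the mutation request with the facet inside the child node and therefore needs a second type to decode the post-v1.1.0 response. All type and function names (`Language`, `PersonRequest`, `PersonResponse`, `decodeResponse`, `toRequestShape`) are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// Language is the child node as written in a mutation request:
// the facet lives inside the child.
type Language struct {
	UID    string `json:"uid,omitempty"`
	Name   string `json:"name"`
	Fluent *bool  `json:"speaks|fluent,omitempty"`
}

// PersonRequest matches the JSON mutation request format.
type PersonRequest struct {
	UID    string     `json:"uid,omitempty"`
	Name   string     `json:"name"`
	Speaks []Language `json:"speaks"`
}

// PersonResponse matches the post-v1.1.0 query response format:
// facets sit on the parent, keyed by the index of the child.
type PersonResponse struct {
	Name         string          `json:"name"`
	Speaks       []Language      `json:"speaks"`
	SpeaksFluent map[string]bool `json:"speaks|fluent"`
}

// decodeResponse parses one node of a query response.
func decodeResponse(raw []byte) (PersonResponse, error) {
	var r PersonResponse
	err := json.Unmarshal(raw, &r)
	return r, err
}

// toRequestShape copies the parent-level facets back into the children.
func toRequestShape(r PersonResponse) (PersonRequest, error) {
	out := PersonRequest{Name: r.Name, Speaks: append([]Language(nil), r.Speaks...)}
	for key, fluent := range r.SpeaksFluent {
		i, err := strconv.Atoi(key)
		if err != nil || i < 0 || i >= len(out.Speaks) {
			return PersonRequest{}, fmt.Errorf("bad facet index %q", key)
		}
		f := fluent // copy so each child gets its own pointer
		out.Speaks[i].Fluent = &f
	}
	return out, nil
}

func main() {
	raw := []byte(`{
		"name": "Alice",
		"speaks": [{"name": "Spanish"}, {"name": "Chinese"}],
		"speaks|fluent": {"0": true, "1": false}
	}`)
	resp, err := decodeResponse(raw)
	if err != nil {
		panic(err)
	}
	req, err := toRequestShape(resp)
	if err != nil {
		panic(err)
	}
	b, _ := json.Marshal(req)
	fmt.Println(string(b))
}
```

The conversion step in `toRequestShape` is exactly the kind of glue code a single shared format would make unnecessary.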

Issues raised by the above changes:

The above changes to the response format have created other issues. Request and response formats are no longer compatible, which means users have to maintain two Go structs on the client side. This has hurt our user experience. We currently have 3 GitHub issues open about this. I have quoted the users here:

Issue: #4798

I realize the facets response format has recently changed, but I don’t understand how I can now unserialize it into my data structures : facets are now independent objects attached to the parent node while my Go facet properties are defined into the child node.

Issue: #4581

The JSON input and output are not permutable

Issue: #4907

Thank you for helping me raising the issue upon.
I don’t really see the issue and reason why we change the facets response to the way it is now. But at a dgraph’s user perspective, I think it would make more sense to have same data struct for creating or querying facets value.
And putting facets value inside the object would give a better understandable view to anyone even who new to dgraph just like me. Despite the fact that how it’s physically stored behind.

From the above GitHub issues, our users' expectations are:

  • Have the same request/response format for facets.
  • Have backward compatibility with previous versions for facet responses (if possible).

Probable solutions

Solution #1 - Use the new response format (after v1.1.0) for both facet requests and responses:

The request format for each type would then look as follows:

  1. Scalar predicate
    Current request format:
{
  "set": {
    "uid": "_:1",
    "name": "Alice",
    "name|since": "birth"
  }
}

New request format:

{
  "set": {
    "uid": "_:1",
    "name": "Alice",
    "name|since": "birth"
  }
}
  1. Scalar list predicate
    Current request format:
{
  "set": [
    {
      "uid": "_:1",
      "nickname": "Joshua",
      "nickname|kind": "official"
    },
    {
      "uid": "_:1",
      "nickname": "David"
    },
    {
      "uid": "_:1",
      "nickname": "Josh",
      "nickname|kind": "friends"
    }
  ]
}

New request format:

{
  "set": {
      "uid": "_:1",
      "nickname": ["Joshua", "David", "Josh"],
      "nickname|kind": {
         "0": "official",
         "2": "friends"
      }
   }
}
  1. UID predicate
    Current request format:
{
  "set": {
    "uid": "_:1",
    "name": "San Francisco",
    "state": {
      "uid": "_:2",
      "name": "California",
      "state|capital": false
    }
  }
}

New request format:

{
  "set": {
    "uid": "_:1",
    "name": "San Francisco",
    "state": {
      "uid": "_:2",
      "name": "California"
    },
    "state|capital": false
  }
}
  1. UID list predicate
    Current request format:
{
  "set": {
    "uid": "_:1",
    "name": "Alice",
    "speaks": [
      {
        "uid": "_:2",
        "name": "Spanish",
        "speaks|fluent": true
      },
      {
        "uid": "_:3",
        "name": "Chinese",
        "speaks|fluent": false
      }
    ]
  }
}

New request format:

{
  "set": {
    "uid": "_:1",
    "name": "Alice",
    "speaks": [
      {
        "uid": "_:2",
        "name": "Spanish"
      },
      {
        "uid": "_:3",
        "name": "Chinese"
      }
    ],
    "speaks|fluent": {
      "0": true,
      "1": false
    }
  }
}
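
For clients that already build requests in the current format, the move to the Solution #1 request format could in principle be done mechanically. The Go sketch below lifts `pred|facet` keys out of the children of a UID list predicate and into index-keyed maps on the parent. It is only an illustration on generic JSON maps; `liftFacets` and `liftFacetsJSON` are hypothetical helpers, not Dgraph APIs.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// liftFacets moves facet keys ("pred|facet") from the children of a
// UID list predicate into index-keyed maps on the parent node.
func liftFacets(parent map[string]interface{}, pred string) {
	children, ok := parent[pred].([]interface{})
	if !ok {
		return // not a list predicate, nothing to do
	}
	prefix := pred + "|"
	for i, c := range children {
		child, ok := c.(map[string]interface{})
		if !ok {
			continue
		}
		for key, val := range child {
			if !strings.HasPrefix(key, prefix) {
				continue
			}
			byIndex, ok := parent[key].(map[string]interface{})
			if !ok {
				byIndex = map[string]interface{}{}
				parent[key] = byIndex
			}
			byIndex[strconv.Itoa(i)] = val
			delete(child, key) // deleting during range is safe in Go
		}
	}
}

// liftFacetsJSON applies liftFacets to a JSON-encoded node.
func liftFacetsJSON(raw []byte, pred string) (string, error) {
	var node map[string]interface{}
	if err := json.Unmarshal(raw, &node); err != nil {
		return "", err
	}
	liftFacets(node, pred)
	out, err := json.Marshal(node)
	return string(out), err
}

func main() {
	old := []byte(`{
		"uid": "_:1",
		"name": "Alice",
		"speaks": [
			{"uid": "_:2", "name": "Spanish", "speaks|fluent": true},
			{"uid": "_:3", "name": "Chinese", "speaks|fluent": false}
		]
	}`)
	out, err := liftFacetsJSON(old, "speaks")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```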

Pros:

  1. This will have same request and response format.
  2. This supports fetching facets on all types of predicates.

Cons:

  1. This changes the request format for facets, so it is a breaking change. However, we could make it backward compatible by also supporting the old request format.
  2. The response format will not be backward compatible with versions before v1.2.0, which is already the case today.
  3. This is not the most preferred solution. Users want facets to be present inside the node.

Solution #2 - Use the old format (up to v1.1.0) for both facet requests and responses:

This is the most preferred solution (Manish, Michel and our users are in favour of it).
The response format for each type would look as follows:

  1. Scalar predicate
    Current response format:
{
  "data": {
    "q": [
      {
        "name|since": "birth",
        "name": "Alice"
      }
    ]
  }
}

New response Format:

{
  "data": {
    "q": [
      {
        "name|since": "birth",
        "name": "Alice"
      }
    ]
  }
}
  1. Scalar list predicate
    Current response format:
{
  "data": {
    "q": [
      {
        "nickname|kind": {
          "1": "friends",
          "2": "official"
        },
        "nickname": [
          "David",
          "Josh",
          "Joshua"
        ]
      }
    ]
  }
}

New response Format:

Need to decide one format here. 
  1. UID predicate
    Current response format:
{
  "data": {
    "q": [
      {
        "name": "San Francisco",
        "state": {
          "name": "California"
        },
        "state|capital": false
      }
    ]
  }
}

New response Format:

{
  "data": {
    "q": [
      {
        "name": "San Francisco",
        "state": {
          "name": "California",
          "state|capital": false
        }
      }
    ]
  }
}
  1. UID list predicate
    Current response format:
{
  "data": {
    "q": [
      {
        "name": "Alice",
        "speaks": [
          {
            "name": "Spanish"
          },
          {
            "name": "Chinese"
          }
        ],
        "speaks|fluent": {
          "0": true,
          "1": false
        }
      }
    ]
  }
}

New response Format:

{
  "data": {
    "q": [
      {
        "name": "Alice",
        "speaks": [
          {
            "name": "Spanish", 
            "speaks|fluent": true
          },
          {
            "name": "Chinese",
            "speaks|fluent": false
          }
        ]
      }
    ]
  }
}

Pros:

  1. Again, the request and response formats are the same.
  2. The response format will be the same for all versions except those from v1.2.0 to now.

Cons:

  1. This breaks our current response format, so it is a breaking change.
  2. As of now, this has no way to represent a scalar list response. We could come up with an alternative and treat scalar lists as an exceptional case, but then the request and response structures would again differ, which defeats the whole purpose of this exercise.

Solution #3 - Use the old format (up to v1.1.0) for facet requests and responses, except for scalar lists (hybrid approach):

As we saw in solution #2, the request and response formats are the same for all predicate types except scalar lists. Solution #2 has no way to represent the request/response format for scalar lists such that the two are compatible with each other. Hence we can take a middle way and represent the request/response format for scalar lists in the format proposed in solution #1.
As a result, the response format for scalar lists does not change.

Current request Format:

{
  "set": [
    {
      "uid": "_:1",
      "nickname": "Joshua",
      "nickname|kind": "official"
    },
    {
      "uid": "_:1",
      "nickname": "David"
    },
    {
      "uid": "_:1",
      "nickname": "Josh",
      "nickname|kind": "friends"
    }
  ]
}

New request Format:

{
  "set": {
      "uid": "_:1",
      "nickname": ["Joshua", "David", "Josh"],
      "nickname|kind": {
         "0": "official",
         "2": "friends"
      }
   }
}
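
A client could assemble the new scalar list request format with a small helper like the Go sketch below. It is only an illustration: `scalarListSet` and its parameters are hypothetical, and the sketch assumes facet indexes refer to positions in the values list as sent in the request.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// scalarListSet builds a "set" mutation for a scalar list predicate.
// facets maps a position in values to the facet value for that entry.
func scalarListSet(uid, pred, facet string, values []string, facets map[int]string) (string, error) {
	byIndex := make(map[string]string, len(facets))
	for i, v := range facets {
		if i < 0 || i >= len(values) {
			return "", fmt.Errorf("facet index %d out of range", i)
		}
		byIndex[strconv.Itoa(i)] = v
	}
	set := map[string]interface{}{
		"uid": uid,
		pred:  values,
	}
	if len(byIndex) > 0 {
		// Facet keys use the "predicate|facet" naming from the examples above.
		set[pred+"|"+facet] = byIndex
	}
	out, err := json.Marshal(map[string]interface{}{"set": set})
	return string(out), err
}

func main() {
	req, err := scalarListSet("_:1", "nickname", "kind",
		[]string{"Joshua", "David", "Josh"},
		map[int]string{0: "official", 2: "friends"})
	if err != nil {
		panic(err)
	}
	fmt.Println(req)
}
```

Note that in the examples in this post the request indexes follow the order of the values in the request ("0" for Joshua, "2" for Josh), while the query response earlier lists the values as David, Josh, Joshua with facets under "1" and "2". A client should therefore not assume that an index used in a request matches the index returned by a later query.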

Pros:

  1. Same request and response format.
  2. Has all properties from solution #2 and fixes issues with solution #2.

Cons:

  1. This breaks both the request and the response format: the request format breaks only for scalar lists, while the response format breaks for all types except scalar lists.

Solution #4 - Leave this in its current state, with two different formats for requests and responses.

Pros:

  1. This does not change anything, so there is no breaking change.

Cons:

  1. If we leave requests and responses in the current format, users have to maintain two different Go structs on the client side, which is not a good user experience.

Summary

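|             | Breaking request format    | Breaking response format   | Facets inside node | Way to represent scalar list response | Same request/response format |
|-------------|----------------------------|----------------------------|--------------------|---------------------------------------|------------------------------|
| Solution #1 | Scalar list, UID, UID list | x                          | x                  | ✓                                     | ✓                            |
| Solution #2 | x                          | Scalar list, UID, UID list | ✓                  | x                                     | ✓                            |
| Solution #3 | Scalar list                | UID, UID list              | ✓                  | ✓                                     | ✓                            |
| Solution #4 | x                          | x                          | x                  | ✓                                     | x                            |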

The hybrid approach (solution #3) sounds best because people were happy with how facets worked before. They only did not work for scalar lists so if we can find a way for them to work for scalar lists like you do in your solution, that should be good enough.

I agree

This format looks good, but wouldn’t it be nice to have the list index in the list itself, for didactic reasons and to avoid possible typos?

e.g:

{
  "set": {
      "uid": "_:1",
      "nickname": [{"0":"Joshua", "1": "David", "2":"Josh"}],
      "nickname|kind": {
         "0": "official",
         "2": "friends"
      }
   }
}

If it is not possible, that is fine, but the documentation needs to be specific about it.

@MichelDiz, since values in a scalar list predicate are automatically assigned indexes in increasing order, I think mentioning the indexes explicitly can be avoided.

BTW, see this user post: Order by facet on scalar predicate. It is on the same topic, but it is more of a new feature request based on list types with facets.

Between the proposed solutions, I think the best one is #3 for the same reason that pawan said.

I tried to write the different cases we have as the most Goish structures I can to see what kind of json output it will make in a way where there is only one way to manage facets.
I also wanted to illustrate all cases with multiple facets.

For this proposal to work, I need a new reserved keyword:
Just as the uid key defines a dgraph node, the object key will define an object with facets.

Case Scalar

type Person struct {
    UID      string `json:"uid,omitempty"`
    Name     string `json:"person.name"`
    Nickname *struct {
        // "object" is the proposed reserved key holding the value itself.
        Object  string  `json:"object"`
        Kind    *string `json:"person.nickname|kind,omitempty"`
        InUsage *bool   `json:"person.nickname|inusage,omitempty"`
    } `json:"person.nickname,omitempty"`
}

{
    "uid": "_:1",
    "person.name": "emile",
    "person.nickname": {
        "object": "milmil",
        "person.nickname|kind": "friends",
        "person.nickname|inusage": true
    }
}

Case Scalar list

type Person struct {
    UID       string `json:"uid,omitempty"`
    Name      string `json:"person.name"`
    Nicknames []struct {
        // Each list entry carries its value under "object" plus its own facets.
        Object  string  `json:"object"`
        Kind    *string `json:"person.nickname|kind,omitempty"`
        InUsage *bool   `json:"person.nickname|inusage,omitempty"`
    } `json:"person.nickname,omitempty"`
}

{
    "uid": "_:1",
    "person.name": "emile",
    "person.nickname": [
        {
            "object": "milmil",
            "person.nickname|kind": "friends",
            "person.nickname|inusage": true
        },
        {
            "object": "whatever",
            "person.nickname|kind": "nobody",
            "person.nickname|inusage": false
        }
    ]
}
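
As a quick check that the scalar list shape above round-trips through Go's `encoding/json`, here is a small self-contained sketch. It only exercises the JSON (un)marshalling of the proposed shape; the `object` key is the keyword proposed in this post, not something Dgraph is known to support, and `NicknameValue`, `decodePerson` and `encodePerson` are names invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NicknameValue is one scalar list entry with its facets attached.
type NicknameValue struct {
	Object  string  `json:"object"` // proposed reserved key for the value
	Kind    *string `json:"person.nickname|kind,omitempty"`
	InUsage *bool   `json:"person.nickname|inusage,omitempty"`
}

// Person uses the same struct for requests and responses.
type Person struct {
	UID       string          `json:"uid,omitempty"`
	Name      string          `json:"person.name"`
	Nicknames []NicknameValue `json:"person.nickname,omitempty"`
}

// decodePerson parses the proposed JSON shape into a Person.
func decodePerson(raw []byte) (Person, error) {
	var p Person
	err := json.Unmarshal(raw, &p)
	return p, err
}

// encodePerson serializes a Person back into the proposed JSON shape.
func encodePerson(p Person) (string, error) {
	out, err := json.Marshal(p)
	return string(out), err
}

func main() {
	raw := []byte(`{
		"uid": "_:1",
		"person.name": "emile",
		"person.nickname": [
			{"object": "milmil", "person.nickname|kind": "friends", "person.nickname|inusage": true},
			{"object": "whatever", "person.nickname|kind": "nobody", "person.nickname|inusage": false}
		]
	}`)
	p, err := decodePerson(raw)
	if err != nil {
		panic(err)
	}
	out, err := encodePerson(p)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Using pointer fields for the facets lets `omitempty` distinguish a missing facet from a facet whose value is `false` or an empty string.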

Case UID

type Person struct {
    UID      string `json:"uid,omitempty"`
    Name     string `json:"person.name"`
    Nickname *struct {
        // For a UID predicate, "object" holds the child node.
        Object  Nickname `json:"object"`
        Kind    *string  `json:"person.nickname|kind,omitempty"`
        InUsage *bool    `json:"person.nickname|inusage,omitempty"`
    } `json:"person.nickname,omitempty"`
}

type Nickname struct {
    UID      string `json:"uid,omitempty"`
    Nickname string `json:"nickname.nickname"`
}

{
    "uid": "_:1",
    "person.name": "emile",
    "person.nickname": {
        "object": {
            "uid": "_:2",
            "nickname.nickname": "mimil"
        },
        "person.nickname|kind": "friends",
        "person.nickname|inusage": true
    }
}

Case UID list

type Person struct {
    UID      string `json:"uid,omitempty"`
    Name     string `json:"person.name"`
    Nickname []struct {
        // For a UID list predicate, each "object" holds one child node.
        Object  Nickname `json:"object"`
        Kind    *string  `json:"person.nickname|kind,omitempty"`
        InUsage *bool    `json:"person.nickname|inusage,omitempty"`
    } `json:"person.nickname,omitempty"`
}

type Nickname struct {
    UID      string `json:"uid,omitempty"`
    Nickname string `json:"nickname.nickname"`
}

{
    "uid": "_:1",
    "person.name": "emile",
    "person.nickname": [
        {
            "object": {
                "uid": "_:2",
                "nickname.nickname": "mimil"
            },
            "person.nickname|kind": "friends",
            "person.nickname|inusage": true
        },
        {
            "object": {
                "uid": "_:3",
                "nickname.nickname": "whatever"
            },
            "person.nickname|kind": "nobody",
            "person.nickname|inusage": false
        }
    ]
}

Hybrid approach so facets can be marshaled into structs. That is the most important feature to have in this issue imo.

tl;dr; Solution #3

My code heavily relies on facets on edges working like they did in 1.1.0, and I’d like to upgrade :stuck_out_tongue:

Scalar list format

I don’t currently use scalar lists, however I strongly agree on the updated mutation format, that solution #3 brings (ignoring the facet response for now).

When looking at this mutation:

{
  "set": [
    {
      "uid": "_:1",
      "nickname": "Joshua"
    },
    {
      "uid": "_:1",
      "nickname": "David"
    }
  ]
}

I would obviously expect the second node to overwrite the first one, since this is how it works with uid nodes. Having Scalar Lists have a “list mutation” format:

{
  "set": [
    {
      "uid": "_:1",
      "nickname": ["Joshua"]
    },
    {
      "uid": "_:1",
      "nickname": ["David"]
    }
  ]
}

seems good to me, as it makes clear at first glance that the mutation adds to a list.
I understand that list delete mutations would also change. Current deletion format:

{
   "uid": "0xd", #UID of the list.
   "testList": "Apple"
}

New delete format for scalar lists:

{
   "uid": "0xd", #UID of the list.
   "testList": ["Apple"]
}

Do I understand correctly?

Scalar list facets

I like the way it is solved in solution #3. I think this topic was exhausted in issue #4081, which arrived at the current format for scalar list facets.


However I would just like to digress and say, that representing scalar lists the same way as edges/uid-lists would be one solution to the problem, albeit a bit hacky one:

{
  "name": [
    {
      "name": "Alice Smith",
      "since": "1990-01-01"
    },
    {
      "name": "Alice Baker",
      "since": "2019-01-01"
    }
  ]
}

Mutation would look like this:

{
  "set": [
  {
    "uid": "_:1",
    "name": [
      {
        "name": "Alice Smith",
        "since": "1990-01-01"
      }
    ]
  },
  {
    "uid": "_:1",
    "name": [
      {
        "name": "Alice Baker",
        "since": "2019-01-01"
      }
    ]
  }
  ]
}

Now that I look at it, it could cause confusion if it’s actually an edge or not. The mutation parser would also not know this, when using dgraph as schemaless. However in that case, why have scalar list facets at all and not represent them as nodes, with all the power that it gives us… Modeling it as nodes would also solve the problem of the person that MichelDiz mentioned a few posts up.

Hi @ppp225, thanks for providing the above feedback. I have clarified a few things below:

Overwriting does not currently happen for scalar lists. That's why both values for nickname are accepted.

A better way to write the above mutation in the new format would be:

{
  "set": [
    {
      "uid": "_:1",
      "nickname": ["Joshua", "David"]
    }
  ]
}

In the case of deletion, both mutations are valid. The first mutation deletes just “Apple”, while the second form can delete multiple values from “testList”.

Yes, your understanding is correct: the parser identifies facet fields by the “|” in the name.
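
As a rough sketch of that rule (not Dgraph's actual parser code), a key can be split on the first “|” into a predicate name and a facet name. The helper name `splitFacetKey` is made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFacetKey splits a JSON key of the form "predicate|facet".
// ok is false when the key does not look like a facet key.
func splitFacetKey(key string) (pred, facet string, ok bool) {
	i := strings.Index(key, "|")
	if i <= 0 || i == len(key)-1 {
		return "", "", false // no separator, or empty predicate/facet name
	}
	return key[:i], key[i+1:], true
}

func main() {
	for _, key := range []string{"nickname|kind", "nickname", "speaks|fluent"} {
		pred, facet, ok := splitFacetKey(key)
		fmt.Printf("%q -> pred=%q facet=%q isFacet=%v\n", key, pred, facet, ok)
	}
}
```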

In some cases this leads to creating extra nodes, which some users want to avoid.

If I get a vote, it would be #3. I never wanted this feature and it’s causing big problems for me, as I am reliant on a super critical bugfix which is only available in a version where the breaking change to facets is introduced.

From my perspective the feature was introduced hastily, which is very concerning when you’re trying to build robust services relying on dgraph stability. I would appreciate deeply if the excellent development team could improve on preserving backwards compatibility when developing new and exciting features.

Hi Robin, thanks for the comment. We have gone with #3.
Can you tell us what is the bug fix you are mentioning here? If possible, we can cherry-pick it to the major version you are using currently and do a patch release.

Thanks for feedback in general. I will convey this to our product team.

Hi Ashish. I believe you were the one who led the bug fix. It was the inconsistency bug that I believe was related to some regressions introduced while optimizations to recursive queries were being worked on. I had an internal discussion with the dev team over Slack.

I also requested in slack to have the bug fix be cherry picked into a stable commit prior to all these facet changes, so this would really be great.

Thank you so much for the follow up and on the great work you’re doing.