Groupby does not support predicate of list type

Moved from GitHub dgraph/4170

Posted by honglicheng:

kind: [string]

_:A <kind> "dog" .
_:A <kind> "animal" .
_:B <kind> "cat" .
_:B <kind> "animal" .
@groupby(kind){
    count(uid)
}

I want to get:

{kind: "dog", count: 1}, {kind: "cat", count: 1}, {kind: "animal", count: 2}

But I only get:

{kind: "dog", count: 1}, {kind: "cat", count: 1}

It seems that only the first item in the list is used for grouping.
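The difference between the expected and the observed grouping can be sketched client-side in Python. This only illustrates the intended semantics and is not how Dgraph implements @groupby. The `nodes` dict and both function names are hypothetical, and "first" is assumed to mean the first value in the order written above.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the two blank nodes above;
# each node carries a list-valued `kind` predicate.
nodes = {
    "A": ["dog", "animal"],
    "B": ["cat", "animal"],
}


def group_by_all_values(nodes):
    """Count nodes per value, treating every list element as a group key."""
    counts = Counter()
    for kinds in nodes.values():
        # set() so that a node is counted at most once per distinct value
        counts.update(set(kinds))
    return dict(counts)


def group_by_first_value(nodes):
    """Mimic the reported behaviour: only the first list element is a key."""
    return dict(Counter(kinds[0] for kinds in nodes.values() if kinds))


expected = group_by_all_values(nodes)   # what the query was hoped to return
observed = group_by_first_value(nodes)  # what the report describes
```

Here `expected` contains a count of 2 for "animal", while `observed` has no "animal" key at all, which matches the two results listed above.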

prashant-shahi commented :

Hello @honglicheng,
Could you mention the Dgraph version and the complete query that you used?

honglicheng commented :

version: 1.1.0

{
    q(func: has(kind)) @groupby(kind){
        count(uid)
    }
}

MichelDiz commented :

Hi there,
Internally, we have been discussing improvements to the “group by” function. This case is somewhat different (though not by much). Thinking about this issue, I came to the conclusion that if we improve “group by”, the query should look something like this:

{
  var(func: has(kind)) @groupby(kind) {
    T as count(uid)
  }

  foreach(func: foreach(in: T, title: kind)) {
    name
    age
    total : val(T)
  }
}

Desired Result

{
  "data": {
    "q": [
      {
        "dog": [
          {
            "total": 1
          },
          {
            "uid": "0x1",
            "name": "Bingo",
            "age": "3"
          }
        ]
      },
      {
        "animal": [
          {
            "total": 2
          },
          {
            "uid": "0x1",
            "name": "Bingo",
            "age": "3"
          },
          {
            "uid": "0x3",
            "name": "Angry Purr",
            "age": "1"
          }
        ]
      },
      {
        "cat": [
          {
            "total": 1
          },
          {
            "uid": "0x3",
            "name": "Angry Purr",
            "age": "1"
          }
        ]
      }
    ]
  }
}

I just ran into this very confusing little gem. Is there any talk about changing @groupby here in 2021? This seems like an invalid result for an operation on lists; I would rather have seen an error than a seemingly random (though consistent) result.

Can we get some grouping that uses all the elements of a list? My use case is counting how many nodes of each type are in my system, basically q(func: has(ns.type)) @groupby(ns.type) { count(uid) }. Currently, though, each node is grouped only under whichever type comes “first” in its list, even if the node also has a type that is the key of another group.

@ibrahim @pawan this seems doable (if we hash the list by doing a concatenative hash across the elements or something). Thoughts?

Just to share: we had to stay away from @groupby for this reason. Instead, we first run a query to get the types, then construct a query like this to get the count for each type:

query{
   FirstType (func: eq(proj.type,"FirstType")) {cnt: count(uid)}
   SecondType(func: eq(proj.type,"SecondType")){cnt: count(uid)}
   ThirdType (func: eq(proj.type,"ThirdType")) {cnt: count(uid)}
}
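Constructing that query by hand gets tedious as the number of types grows, so it can be generated. Below is a minimal Python sketch under some assumptions: `build_count_query` is a hypothetical helper, the block names `t0`, `t1`, ... are made up to avoid type names that are not valid block names, and `json.dumps` is assumed to produce quoting that is acceptable as a string literal in the query (worth checking for values with unusual characters).

```python
import json


def build_count_query(predicate, type_names):
    """Build a query string with one count block per type value."""
    blocks = []
    for i, name in enumerate(type_names):
        # json.dumps adds the surrounding double quotes and escapes the value
        # (assumption: this escaping is accepted by the query parser).
        literal = json.dumps(name)
        # Hypothetical block names t0, t1, ... keyed by position in the list.
        blocks.append(
            f"  t{i}(func: eq({predicate}, {literal})) {{ cnt: count(uid) }}"
        )
    return "{\n" + "\n".join(blocks) + "\n}"


query = build_count_query("proj.type", ["FirstType", "SecondType", "ThirdType"])
print(query)
```

Since the blocks are named by index, block `t0` in the response corresponds to the first entry of the list that was passed in, `t1` to the second, and so on.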

I don’t know how groupby works internally. I’ll have to dig deeper before I can comment on this.