Is it possible to select distinct non-uid predicate?

ondrowan · August 2, 2018, 2:13pm

Given following schema:

user: string .
group_owner: uid @reverse .
group_type: string .

and following dataset:

{
    set {
        _:user1 <user> "user1" .
        _:user2 <user> "user2" .
        _:a1 <group_type> "a" .
        _:a1 <group_owner> _:user1 .
        _:a2 <group_type> "a" .
        _:a2 <group_owner> _:user2 .
        _:b1 <group_type> "b" .
        _:b1 <group_owner> _:user1 .
        _:b2 <group_type> "b" .
        _:b2 <group_owner> _:user2 .
        _:c <group_type> "c" .
    }
}

Is it possible to return distinct group_type from groups owned by multiple users and then paginate them? I tried the following:

{
  var(func: uid(0x10a90e, 0x10a90f)) {
    ~group_owner {
    	gt as group_type
    }  
  }
  
  q(func: uid(gt), first: 2) {
    group_type
  }
}

Which lists even duplicates. I also tried to @groupby(group_type) and then count/max, but that doesn’t seem to be a way forward if I want to also paginate the results. Is this even currently possible in Dgraph?

MichelDiz · August 2, 2018, 7:58pm

To do reverse you need to set your schema as @reverse

here => group_owner: uid @reverse .

Your mutation is confusing to read just do:

{
    set {
        _:user1 <user> "user1" .
        _:user2 <user> "user2" .
        _:a1 <group_type> "a1" .
        _:a1 <group_owner> _:user1 .
        _:a2 <group_type> "a2" .
        _:a2 <group_owner> _:user2 .
        _:b1 <group_type> "b1" .
        _:b1 <group_owner> _:user1 .
        _:b2 <group_type> "b2" .
        _:b2 <group_owner> _:user2 .
        _:c <group_type> "c1" .
    }
}

Actually I did not quite understand what u wanted, I spent some time trying to understand and simulate at the same time. But I do not think I’ve reached to any conclusion.

This was what I tried:

{
  GET_USER as var(func: has(user))

 test(func: uid(GET_USER)) {
    uid
    user
    ~group_owner (first:2) {
    	group_type    } 
}
}

other Q.

{
  test(func: has(group_owner)){
    uid
    group_type
    group_owner (first:2) { 
       uid 
       user
    }
  }
}

ondrowan · August 2, 2018, 8:15pm

I wanted to achieve the same as SELECT DISTINCT would accomplish in SQL. I want to select all groups where given users are group_owner, take all group_type from these groups and return ones with unique name. So with my original dataset (don’t change it, else it doesn’t make sense), I want to return only a, b, c. With the query from original post, it returns a, a, b, b, c.

MichelDiz · August 2, 2018, 11:14pm

Can you check this discussion to see if help you? Get output of uid(a,b,...) as intersection not union or digg more about SQL here on discuss.

From your example " _:c <group_type> “c1” ." isn’t linked with any user. So you’ll never get C.

In fact from your example I can not think of anything in relation to what you say. In other cases I would try to use “uid_in”, but in your Set I can not use it as you need. Then a set closer to reality would help to follow the logic and analyze whether can or can’t do.

So, your q returns this:

"data": {
    "q": [
      {
        "group_type": "a2"
      },
      {
        "group_type": "b1"
      },
      {
        "group_type": "b2"
      },
      {
        "group_type": "a1"
      }
    ]

The q I presentend returns this (the obvious):

"test": [
      {
        "uid": "0x2723",
        "user": "user1",
        "~group_owner": [
          {
            "group_type": "b1"
          },
          {
            "group_type": "a1"
          }
        ]
      },
      {
        "uid": "0x2724",
        "user": "user2",
        "~group_owner": [
          {
            "group_type": "a2"
          },
          {
            "group_type": "b2"
          }
        ]
      }
    ]
  }

Cheers

ondrowan · August 3, 2018, 2:30pm

Something is being lost in translation, so I’ll try to rephrase the question.

I have nodes with people. Each person can have assigned one or more skills through skill_owner. Skill name is always a string.

person: string @index(exact) .
skill_name: string .
skill_owner: uid @reverse .

Let’s assign skills to 2 people. Michel with Java and Dgraph. Ondrej with Python and Dgraph.

{
  set {
    _:michel <person> "Michel" .

    _:m_skill_1 <skill_name> "Java" .
    _:m_skill_1 <skill_owner> _:michel .
    
    _:m_skill_2 <skill_name> "Dgraph" .
    _:m_skill_2 <skill_owner> _:michel .
    
    _:ondrej <person> "Ondrej" .
    
    _:o_skill_1 <skill_name> "Python" .
    _:o_skill_1 <skill_owner> _:ondrej .
    
    _:o_skill_2 <skill_name> "Dgraph" .
    _:o_skill_2 <skill_owner> _:ondrej .
  }
}

Now I’m assigning a team and need to know what unique skills it’s gonna have. My other requirement is that those unique skills have to be paginable.

In SQL the query I need would be:

SELECT DISTINCT skill.name
FROM skill
LEFT JOIN person
ON person.id = skill.owner
WHERE person.name = 'Ondrej' OR person.name = 'Michel'
LIMIT 10

Which returns:

Dgraph
Java
Python

In Dgraph, I tried this:

{
  var(func: eq(person, ["Ondrej", "Michel"])) {
    ~skill_owner {
    	s as skill_name
    }  
  }
  
  q(func: uid(s), first: 10) {
    skill_name
  }
}

Which returns:

"data": {
  "q": [
    {
      "skill_name": "Java"
    },
    {
      "skill_name": "Dgraph"
    },
    {
      "skill_name": "Python"
    },
    {
      "skill_name": "Dgraph"
    }
  ]
},

As you can see, it returns Dgraph twice. I need only unique skill names, so I can paginate them properly.

MichelDiz · August 3, 2018, 3:47pm

Now I get it!

So:


{
  set {
    
 # First we set Unique nodes for Skills
    _:Skill_Java <skill_name> "Java" .
    _:Skill_Dgraph <skill_name> "Dgraph" .
    _:Skill_Python <skill_name> "Python" .

 #Now we assign the users to that skills using your way to relate those
    _:michel <person> "Michel" .
    _:Skill_Java <skill_owner> _:michel .
    _:Skill_Dgraph <skill_owner> _:michel .
    
    _:ondrej <person> "Ondrej" .
    _:Skill_Python <skill_owner> _:ondrej .
    _:Skill_Dgraph <skill_owner> _:ondrej .

  }
}

And the same q:

{
  var(func: eq(person, ["Ondrej", "Michel"])) {
    ~skill_owner {
    	s as skill_name
    }  
  }
  
  q(func: uid(s), first: 10) {
    skill_name
  }
}

{
  "data": {
    "q": [
      {
        "skill_name": "Dgraph"
      },
      {
        "skill_name": "Python"
      },
      {
        "skill_name": "Java"
      }
    ]
  }

ondrowan · August 7, 2018, 12:50pm

Thanks. That was my backup plan, but hoped it’d be possible to do this with strings as well. Are there are any plans to introduce this functionality in the future?

MichelDiz · August 7, 2018, 2:01pm

Can you exemplify what you mean? We have several types of indexing for strings, how would that use, that you mention?

ondrowan · August 7, 2018, 2:23pm

With your solution, it means I have to create unique nodes with skill names and use uids in the query to get the uniqueness. It means a slight complication which could be prevented if I could just list all unique strings (without creating any extra nodes) as I originally wanted. The same as SELECT DISTINCT would do in SQL.

MichelDiz · August 7, 2018, 4:57pm

Okay, I see. But this is how the paradigm of graphs databases works. I believe that this logic you want would never be implemented other than via user application (means your app).

Graphs DBs are completely distinct from what SQL would be. The idea is that something that isn’t related via edges is a “new/diff thing/obj”. That is, you would always have two “dgraphs” in the query response. Dgraph understands that each result is unique, even though the string is the same.

This could even be implemented, but has no strength to be. For it is another paradigm. This could even generate a lot of useless chunks in the way. Making a little hell to clean the DB or unnecessary disk usage.

Topic		Replies	Views
Remove Duplicates from Results GraphQL graphql , dgraph	3	3066	September 1, 2020
Select distinct values / deduplicate Dgraph kind:question , dql	2	1505	April 27, 2021
Count distinct values of predicate: @groupby + count performance question Dgraph kind:question , dgraph	1	544	March 13, 2023
Type-defined predicates seem to accept all types of UIDs, not just the defined type Dgraph kind:question	3	444	July 16, 2022
Querying on predicate by UID Users	4	1571	July 29, 2017

Is it possible to select distinct non-uid predicate?

Related topics