Labelled subgraph matching in DQL

jackwaudby · March 12, 2021, 10:22am

Hi,

I’m trying to perform subgraph matching in DQL, where a query counts the number of times a subgraph matches some pattern.

For example, consider a social network in which people form friendships and comment on each other’s posts (example graph below). I want to count the number of times a person has replied to a post made by a friend (pattern attached).

Is it possible to express such queries in DQL?

_:person1 <name> "1" .
_:person1 <dgraph.type> "Person" .
_:person2 <name> "2" .
_:person2 <dgraph.type> "Person" .
_:person3 <name> "3" .
_:person3 <dgraph.type> "Person" .
_:person4 <name> "4" .
_:person4 <dgraph.type> "Person" .
_:person5 <name> "5" .
_:person5 <dgraph.type> "Person" .
_:person1 <knows> _:person2 .
_:person1 <knows> _:person3 .
_:person1 <knows> _:person4 .
_:person2 <knows> _:person3 .
_:person3 <knows> _:person4 .
_:post1 <name> "1" .
_:post1 <dgraph.type> "Post" .
_:post2 <name> "2" .
_:post2 <dgraph.type> "Post" .
_:comment1 <name> "1" .
_:comment1 <dgraph.type> "Comment" .
_:comment2 <name> "2" .
_:comment2 <dgraph.type> "Comment" .
_:comment3 <name> "3" .
_:comment3 <dgraph.type> "Comment" .
_:comment4 <name> "4" .
_:comment4 <dgraph.type> "Comment" .
_:comment5 <name> "5" .
_:comment5 <dgraph.type> "Comment" .
_:comment6 <name> "6" .
_:comment6 <dgraph.type> "Comment" .
_:comment3 <reply_of> _:comment2 .
_:comment4 <reply_of> _:comment3 .
_:comment5 <reply_of> _:comment4 .
_:comment1 <reply_of> _:post1 .
_:comment2 <reply_of> _:post1 .
_:comment6 <reply_of> _:post2 .

Thanks,

Jack

littleone · March 12, 2021, 1:57pm

I don’t think Dgraph supports this, and even if it does, it will be expensive.
In my opinion, you should add a redundant predicate to make it filterable. Perhaps you could share your schema definition; I am not sure what you want to do.

iluminae · March 13, 2021, 6:42am

It seems like you would be able to do this pretty easily by using vars as filters:

query {
  me as var(func: type(Person)) @filter(eq(name, "1"))
  var(func: uid(me)) {
    knows {
      # has_creator isn't in your sample RDF; it would also need a @reverse index
      # postsFromFriends is a variable holding the UIDs of posts made by people I know
      postsFromFriends as ~has_creator
    }
  }
  var(func: uid(me)) {
    made_comment { # or whatever the edge to replies is
      postsILikedByFriends as reply_of @filter(uid(postsFromFriends))
    }
  }
  q(func: uid(postsILikedByFriends)) {
    count(uid)
  }
}

(if I understand the question correctly)
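
For reference, a schema along these lines is roughly what the query above assumes. This is only a sketch: has_creator and made_comment are not in the sample RDF, so their names and directions (post to person, person to comment) are assumptions taken from the query, and whether each edge should be a list or a single uid is a guess. The one part the query clearly relies on is @reverse on has_creator, since it traverses ~has_creator.

```
# Sketch only: has_creator and made_comment are assumed, not part of the sample RDF.
name: string @index(exact) .
knows: [uid] .
reply_of: [uid] .
# Assumed direction: Post -> Person. @reverse is what makes ~has_creator usable.
has_creator: [uid] @reverse .
# Assumed direction: Person -> Comment.
made_comment: [uid] .
```

As far as I know, the exact index on name only matters if eq(name, ...) is moved to the query root; inside @filter, as in the query above, it should work without it.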

jackwaudby · March 15, 2021, 4:50pm

Hi @iluminae

Thanks for your response and pointing out I’d missed the has_creator edges from the sample RDF - I will amend this.

You start your query by filtering for the person with name 1 and then matching the pattern from there. Doing so would miss matches of the pattern that start from, for example, person 2. Is there a way to write a DQL query that gets all matches of the pattern, without having to manually filter for a specific person each time?

Jack

jackwaudby · March 15, 2021, 4:52pm

Hi @littleone,

I’m more interested in whether such a query is possible than in the associated costs, for now anyway.

Thanks for the pointers, I’ll add the schema to the sample graph.

Jack

verneleem · March 15, 2021, 5:14pm

I don’t believe so right now without having

Maybe you could give your use case and feedback on that topic to give it more life.

iluminae · March 15, 2021, 10:21pm

Sorry, I was working from this sentence:

So I started with a “person”, assuming you knew which single person you wanted to start from. In Dgraph, and in graph DBs more generally, you must identify a part of the graph to start with; the more nodes you start with, the greater your fan-out will be when you follow relationships. You can certainly start that first block with a wider query, but you would have to change the multiple var block pattern I chose there, as the filters are rather specific to that use case.

Maybe something like this can get you moving in the right direction? (might need some work to fit the real dataset when you add the edges that were missing above)

query {
  q(func: type(Person)) { # start with everyone, but the more filtering here the better
    knows {
      postsFromFriends as ~has_creator # posts made by people I know
    }
    made_comment { # comments I made...
      reply_of @filter(uid(postsFromFriends)) { # ...which were replies to posts made by my friends
        c as count(uid) # count them
      }
      s as sum(val(c)) # sum the counts
    }
    commentsOnFriendsPosts: sum(val(s)) # here is your actual result per person
  }
  }
}
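
On the fan-out point, here is a small sketch (not run against the real dataset) of narrowing the starting set instead of starting from every Person. It picks people by name and reports how many knows edges each one would fan out over. The names "1" and "2" come from the sample RDF; the block name people and the alias friendCount are made up for illustration.

```
{
  # Start from a chosen subset of people rather than all of them.
  people(func: type(Person)) @filter(eq(name, ["1", "2"])) {
    name
    # Number of outgoing knows edges, i.e. the first hop of fan-out.
    friendCount: count(knows)
  }
}
```

The same @filter could be put on the root of the query above to keep the traversal small while you check that the counts look right.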
