I’m trying to perform subgraph matching in DQL, where a query counts the number of times a subgraph matches some pattern.
For example, consider a social network in which people make friends and comment on each other’s posts (example graph below). I want to count the number of times a person has replied to a post made by a friend (pattern attached).
I guess Dgraph doesn’t support this, and even if it does, it will be expensive.
In my opinion, you should add a redundant predicate to make it filterable. Perhaps you could share your schema definition; I am not sure what you want to do.
Seems like you would be able to do this pretty easily using vars as filters:
query {
  me as var(func: type(Person)) @filter(eq(name, "1"))
  var(func: uid(me)) {
    knows {
      # has_creator isn't in your sample RDF, but it should have a @reverse index as well
      # postsFromFriends is a variable that will hold the UIDs of posts made by people I know
      postsFromFriends as ~has_creator
    }
  }
  var(func: uid(me)) {
    made_comment { # or whatever the edge to replies is
      postsILikedByFriends as reply_of @filter(uid(postsFromFriends))
    }
  }
  q(func: uid(postsILikedByFriends)) {
    count(uid)
  }
}
Thanks for your response and for pointing out that I’d missed the has_creator edges from the sample RDF. I will amend this.
You start your query by filtering for the person with name 1 and then matching the pattern from there. Doing so would miss matches of the pattern that start from, for example, person 2. Is there a way to write a DQL query that gets all matches of the pattern, without having to manually filter for a specific person each time?
I started with a single person because I assumed you knew which person you wanted to start from. In Dgraph, and in graph databases more generally, you must identify a part of the graph to start from; the more nodes you start with, the larger the fan-out when you follow relationships. You can certainly start that first block with a wider query, but you would have to change the multiple var block pattern I used there, as the filters are rather specific to that use case.
Maybe something like this can get you moving in the right direction? (It might need some work to fit the real dataset once you add the edges that were missing above.)
query {
  q(func: type(Person)) { # start with everyone, but the more filtering here the better
    knows {
      postsFromFriends as ~has_creator # posts made by people I know
    }
    made_comment { # comments I made...
      reply_of @filter(uid(postsFromFriends)) { # ...which were replies to posts made by my friends
        c as count(uid) # count them
      }
      s as sum(val(c)) # sum the counts
    }
    commentsOnFriendsPosts: sum(val(s)) # here is your actual result per person
  }
}
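To sanity-check the per-person counts a query like this returns, it may help to brute-force the pattern on a small sample outside Dgraph. The following is a minimal Python sketch, not part of the original answers. The edge names (knows, has_creator, made_comment, reply_of) come from this thread, but the edge directions, the toy data, and the function name `count_replies_to_friends_posts` are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy graph. Edge names follow the thread; directions are assumed.
knows = {("p1", "p2"), ("p2", "p1"), ("p2", "p3")}                # person -> friend
has_creator = {("post1", "p2"), ("post2", "p3"), ("post3", "p1")}  # post -> author
made_comment = {("p1", "c1"), ("p1", "c2"), ("p2", "c3"), ("p3", "c4")}  # person -> comment
reply_of = {("c1", "post1"), ("c2", "post2"), ("c3", "post2"), ("c4", "post3")}  # comment -> post


def count_replies_to_friends_posts(knows, has_creator, made_comment, reply_of):
    """Count, per person, the comments that reply to a post created by a friend."""
    friends = defaultdict(set)
    for person, friend in knows:
        friends[person].add(friend)

    # Assumes each post has one creator and each comment replies to one post.
    author_of = dict(has_creator)
    replied_post = dict(reply_of)

    counts = defaultdict(int)
    for person, comment in made_comment:
        post = replied_post.get(comment)
        author = author_of.get(post)
        if author in friends[person]:
            counts[person] += 1
    return dict(counts)


counts = count_replies_to_friends_posts(knows, has_creator, made_comment, reply_of)
print(counts)
```

Each match is evaluated against the commenter's own friend set, so people with no matching replies simply do not appear in the result. Comparing these numbers with the query output on the same sample is one way to see whether the query counts what you intend.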