Intersect version of uid(…)

EnricoMi · July 28, 2020, 1:25pm

I want to get the intersection of a set of uid vars. This is similar to “Get output of uid(a,b,...) as intersection not union”, but I did not want to reopen that lengthy discussion.

I can turn uid(…), which does the union of its vars, into an intersection with a @filter as follows:

pred1 as var(func: has(<dgraph.graphql.schema>))
pred2 as var(func: has(<dgraph.graphql.xid>))
pred3 as var(func: has(<dgraph.type>))

result (func: uid(pred1,pred2,pred3)) @filter(has(<dgraph.graphql.schema>) AND has(<dgraph.graphql.xid>) AND has(<dgraph.type>)) {
  uid
  <dgraph.graphql.schema>
  <dgraph.graphql.xid>
  <dgraph.type>
}

The uid vars pred1, pred2 and pred3 already provide the sets of uids. Applying a @filter that repeats the has operations seems redundant, and it might be slower than a dedicated uid_intersect(pred1,pred2,pred3) would be:

pred1 as var(func: has(<dgraph.graphql.schema>))
pred2 as var(func: has(<dgraph.graphql.xid>))
pred3 as var(func: has(<dgraph.type>))

result (func: uid_intersect(pred1,pred2,pred3)) {
  uid
  <dgraph.graphql.schema>
  <dgraph.graphql.xid>
  <dgraph.type>
}

A uid_intersect(…) would also make this query much handier and more readable. Imagine more vars here.

Any thoughts on that?

MichelDiz · July 28, 2020, 4:58pm

Have you tried cascade? I think (from what I got in your question) that it might work for you.

pred1 as var(func: has(<dgraph.graphql.schema>))
pred2 as var(func: has(<dgraph.graphql.xid>))
pred3 as var(func: has(<dgraph.type>))

result (func: uid(pred1,pred2,pred3)) @cascade {
  uid
  <dgraph.graphql.schema>
  <dgraph.graphql.xid>
  <dgraph.type>
}

amaster507 · July 29, 2020, 4:14am

I have a use case for this:

I want to do filtering at different levels. For instance I need to answer questions like, “Show me contacts (that have an address that have cities that are in X, Y, Z) and (has events that are in the seven days OR has tasks that have occurrences that are not completed and due in the next seven days.)” To fulfill requests like this, I have to run the filtering logic at a higher level in my UI: get all of the UIDs that satisfy the separate parts, apply the conjunction logic, and then pass the resulting UIDs to the filter of the query where I actually fetch the full graph with pagination. Right now my UI has to do much of this logic, and it would be nice to offload at least the last part to the database layer.
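A rough sketch of the UI-side combination step described above, assuming the three UID sets have already been fetched by separate queries. The set contents and the helper names union and intersect are made up for illustration and are not part of any Dgraph client:

```javascript
// Hypothetical UID sets, as if returned by three separate filter queries.
const inCities = new Set(["0x1", "0x2", "0x3", "0x4"]); // contacts with an address in X, Y or Z
const hasEvents = new Set(["0x2", "0x5"]);              // contacts with matching events
const hasOpenTasks = new Set(["0x3", "0x6"]);           // contacts with open task occurrences that are due

// Union of any number of sets (the OR part).
function union(...sets) {
  const out = new Set();
  for (const s of sets) for (const uid of s) out.add(uid);
  return out;
}

// Intersection of one or more sets (the AND part).
function intersect(first, ...rest) {
  return new Set([...first].filter((uid) => rest.every((s) => s.has(uid))));
}

// inCities AND (hasEvents OR hasOpenTasks)
const matching = intersect(inCities, union(hasEvents, hasOpenTasks));

// Comma-separated list that could be passed on to the final, paginated query.
const uidList = [...matching].sort().join(", ");
console.log(uidList);
```

This is exactly the part that a server-side intersection would let the UI skip.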

The cascade is getting better, especially with parameterized cascade coming soon to the main releases. But the cascade cannot support nested logic like OR and only covers some AND logic.

EnricoMi · July 29, 2020, 8:54am

Yes, @cascade would work for that example, but not for this:

pred1 as var(func: has(<dgraph.graphql.schema>))
pred2 as var(func: has(<dgraph.graphql.xid>))
pred3 as var(func: has(<dgraph.type>))

result (func: uid(pred1,pred2,pred3)) @cascade {
  uid
}

And I need the intersection no matter what is in the body of result.
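To illustrate why the body matters, here is a deliberately simplified JavaScript model of @cascade, assuming it keeps a node only when every field requested in the block body is present. The node data, the shortened field names and the cascade helper are hypothetical:

```javascript
// Hypothetical nodes in the union of pred1, pred2 and pred3.
// Field names are shortened stand-ins for the three dgraph.* predicates.
const nodes = [
  { uid: "0x1", schema: "s", xid: "x", type: "t" },
  { uid: "0x2", schema: "s", type: "t" },
  { uid: "0x3", type: "t" },
];

// Simplified model of @cascade: drop every node that lacks one of the
// fields requested in the block body.
function cascade(candidates, bodyFields) {
  return candidates.filter((node) => bodyFields.every((field) => field in node));
}

// Body lists all three predicates: only the node that has all of them survives.
const fullBody = cascade(nodes, ["uid", "schema", "xid", "type"]).map((n) => n.uid);

// Body lists only uid: every node has a uid, so nothing is removed.
const uidOnly = cascade(nodes, ["uid"]).map((n) => n.uid);

console.log(fullBody, uidOnly);
```

Under this model, the result is the intersection only as long as the body happens to request all of the predicates.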

EnricoMi · August 6, 2020, 5:45pm

@MichelDiz so do you agree that uid_intersect would be a concise addition to the GraphQL± language?

MichelDiz · August 11, 2020, 4:46pm

Sorry for the late reply. You have marked me with the wrong nick.

Well, I think cascade works well. I can’t see how this would work any differently from cascade. See, if the query doesn’t have anything to compare against, it would have to make some arbitrary comparison, and the result might not be what you expect, as it would be with explicit “params”.

Let’s take your last query into account. What would be the rules? The node with the most predicates? So all the other nodes would be parameterized by it? Or would it infer all possible nodes and gather all possible predicates? But in that case, maybe you would never have any result.

For me, the cascade is the best solution for now.

Cheers.

EnricoMi · August 13, 2020, 7:33pm

Sorry, I am confused. Let’s look at this query again:

pred1 as var(func: has(<dgraph.graphql.schema>))
pred2 as var(func: has(<dgraph.graphql.xid>))
pred3 as var(func: has(<dgraph.type>))

result (func: uid(pred1,pred2,pred3)) @filter(has(<dgraph.graphql.schema>) AND has(<dgraph.graphql.xid>) AND has(<dgraph.type>)) {
  uid
}

I read this query as:

find the set of uids that have <dgraph.graphql.schema>, memorize them as pred1
find the set of uids that have <dgraph.graphql.xid>, memorize them as pred2
find the set of uids that have <dgraph.type>, memorize them as pred3
for all uids in pred1 union pred2 union pred3 that have all three predicates, return the uid

This cannot be simplified with @cascade.

The uid in conjunction with the @filter(has(…) AND has(…) AND has(…)) implements the intersection of the uid sets pred1, pred2 and pred3, but a shorter notation would be great:

pred1 as var(func: has(<dgraph.graphql.schema>))
pred2 as var(func: has(<dgraph.graphql.xid>))
pred3 as var(func: has(<dgraph.type>))

result (func: uid_intersect(pred1,pred2,pred3)) {
  uid
}

Which I read as:

find the set of uids that have <dgraph.graphql.schema>, memorize them as pred1
find the set of uids that have <dgraph.graphql.xid>, memorize them as pred2
find the set of uids that have <dgraph.type>, memorize them as pred3
for all uids in pred1 intersect pred2 intersect pred3, return the uid
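The two readings above can be compared with a small JavaScript sketch that models uid vars as plain sets. The store contents and the has helper are hypothetical, and uid_intersect is the proposed function, not an existing one:

```javascript
// Hypothetical store: uid -> predicates present on that node.
const store = new Map([
  ["0x1", new Set(["dgraph.graphql.schema", "dgraph.graphql.xid", "dgraph.type"])],
  ["0x2", new Set(["dgraph.graphql.schema", "dgraph.type"])],
  ["0x3", new Set(["dgraph.type"])],
  ["0x4", new Set(["dgraph.graphql.xid", "dgraph.type"])],
]);

// Models has(<pred>): the set of uids that carry the predicate.
const has = (pred) =>
  new Set([...store].filter(([, preds]) => preds.has(pred)).map(([uid]) => uid));

const pred1 = has("dgraph.graphql.schema");
const pred2 = has("dgraph.graphql.xid");
const pred3 = has("dgraph.type");

// Reading 1: uid(pred1,pred2,pred3) is the union, then @filter re-checks all three predicates.
const unionAll = new Set([...pred1, ...pred2, ...pred3]);
const wanted = ["dgraph.graphql.schema", "dgraph.graphql.xid", "dgraph.type"];
const viaFilter = [...unionAll]
  .filter((uid) => wanted.every((p) => store.get(uid).has(p)))
  .sort();

// Reading 2: the proposed uid_intersect(pred1,pred2,pred3) works on the sets alone.
const viaIntersect = [...pred1].filter((uid) => pred2.has(uid) && pred3.has(uid)).sort();

console.log(viaFilter, viaIntersect);
```

Both readings give the same uids, but the second one never looks at the predicates again; it only needs the three uid sets.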

I am not sure what you refer to with “What would be the rules?”. And I think @cascade is not applicable in this more general query. Please provide some more explanation.

Thanks,
Enrico

MichelDiz · August 13, 2020, 8:06pm

I know that smaller queries doing magic is great. But there are other concerns about this. A big one is that the team is focussed right now on fixing bugs and GraphQL specs. Nothing beyond this will be worked on in the mid term.

Another point is about how the query system works. A variable doesn’t carry the predicate used in the has func, as it is just a map of uids, unless it is a value variable, which you get by expanding the body of the has block, e.g.:

pred1 as var(func: has(<dgraph.graphql.schema>)) {
   realPred1 as <dgraph.graphql.schema>
}

This intersect func needs a lot of context to work the way you want. It needs to infer each block with a complex contextualization that would have to come in the map. I agree that making things easier to write is good, but right now, in my opinion, cascade does the job.

As for the rules that need to be applied: in this case, you answered that they would be the predicates used as parameters in the has func. Those are not implicit in the variable; the predicates used as parameters are not embedded in it. It might be necessary to create a new type of variable for this, because making variables more contextualized would make other queries use more memory unnecessarily.

It is necessary to discuss the pros and cons when going for simplification.

You said before that it works, but that it isn’t desirable because you have to write more lines in the query. That is a small cost compared to waiting for support for this.

Anyway, feel free to open a request at Dgraph - Discuss Dgraph. My comments aren’t a blocker. I’m just giving you suggestions and discussing the topic.

Cheers.

EnricoMi · August 13, 2020, 8:52pm

I understand that, depending on how the query language is implemented, a simple change might be easy and straightforward or a heavy refactoring. This seems to be the latter, and it adds only a little expressiveness to the language. Thanks for the insights.

abhimanyusinghgaur · August 14, 2020, 5:38pm

@EnricoMi, off-topic, but one thing is for sure with these special predicates: there will only be one node containing both these predicates. So, at present, you should be getting only one uid in your result for the given query.

Loic_Veillard · May 20, 2022, 7:38pm

I have another use case for this:

For example, let’s say I’m doing an upsert and I have a set of valid tasks, uid(validTasksId), where validTasksId is a DQL var that I fill in the query of my upsert.

Now imagine I’m receiving modifications to 10 tasks from my frontend, where 5 are included in the validTasksId set and 5 are not.

request = [{id: "0x1", title: "new"}, {id: "0x2", title: "new2"}, ..., {id: "0x10", title: "new10"}]

And I modify them in JSON format like this:

request.map(task => ({
  'Person.tasks': {
    // here is where I want the uid_intersection(`${task.id}`, validTasksId)
    'uid': task.id,
    // I want this to happen only if task.id is included in uid(validTasksId)
    'Task.title': task.title,
  },
}));

TLDR: In my request I have these 10 tasks, but in the upsert I only want to send the 5 that are defined as “legal” by my query prior to the mutation. I’m not able to do request.filter(task => task.id included in valid ids) because this is an upsert, and I only have the id array as a DQL variable.
