Organizations can create items. Items can be public, and items can be shared with other organizations, which should implicitly allow the other organization to see the entire tree from the shared item up (its ancestors): its parents, grandparents, etc., up to the root node.
Items belonging to the organization are easy, as are public items and items explicitly shared with the organization. However, I’m struggling with how to most effectively model the implicit ancestor access.
The best idea I’ve had so far is to add a hasImplicitItems: [Item!] predicate to the organization (with a visibleTo: [Organization!] @hasInverse(field: hasImplicitItems) on the item) and, every time an item is shared with another organization, walk up the tree and create a relationship between each ancestor and the target organization. The biggest issues I see are:
Potentially large numbers of relationships as the dataset grows
Actually walking the tree in an efficient manner (though I can’t see a way around that walk no matter how the actual relationship is tracked and stored)
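To make the share-time walk concrete, here is a minimal in-memory sketch in Python. It does not use Dgraph at all: the Item class, its visible_to set (standing in for the visibleTo / hasImplicitItems edge pair), and the share function are hypothetical names for illustration. It assumes each item has a single parent and that implicit edges are only ever added by this function, which is what makes the early stop safe.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    """Hypothetical in-memory stand-in for an Item node."""
    id: str
    parent: Optional["Item"] = None
    # Mirrors the proposed visibleTo / hasImplicitItems edge pair.
    visible_to: set = field(default_factory=set)


def share(item: Item, org_id: str) -> list:
    """Give org_id implicit access to every ancestor of item.

    Returns the ids of the ancestors that gained a new edge.
    """
    touched = []
    node = item.parent
    while node is not None:
        if org_id in node.visible_to:
            # Assumption: an earlier share already marked this ancestor
            # and everything above it, so the walk can stop here.
            break
        node.visible_to.add(org_id)
        touched.append(node.id)
        node = node.parent
    return touched


root = Item("root")
a = Item("a", parent=root)
b = Item("b", parent=a)
c = Item("c", parent=a)

first = share(b, "org2")   # marks a and root
second = share(c, "org2")  # stops at a, nothing new to write
```

With the early stop, the number of new edges per organization is bounded by the number of distinct ancestors rather than by the number of shares, although the worst case for a first share is still one edge per level of the tree.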
Storing the root node on each item and finding the shortest path between them wouldn’t work, as it would potentially exclude some of the branches. I feel like traversing the tree every time and trying to find a path from the node being accessed to a node the organization has explicit access to is also a non-starter.
Anyway, would love to hear any thoughts or suggestions… thanks!
Hi @dhartweg and welcome to the Dgraph community.
I should say that what you described is a really interesting use case.
I think your idea amounts to denormalizing the data instead of looking it up in your tree structure at query time (which is a common approach when you are not using a relational database).
As far as I know, a large number of relationships should not be a problem, especially if you are using Dgraph Cloud. They are working with a 1 TB dataset.
If you are going to query your data a lot more than you write it, I think the solution you described is the best fit.
Thanks for the welcome! I’ve been looking for a long time to find a backend for my side project and haven’t been able to find anything that checks all of the boxes… but I think Dgraph might!
Anyway, thanks for taking a look and validating that large datasets aren’t an issue. Going to keep exploring the best way to store those relationships and (very likely) recurse through the entire tree to do so.
I have similar needs. Based on the raft of Dgraph GraphQL auth/schema/validation issues I’ve found while exploring the Discuss forum, I’m not sure the Dgraph-generated GraphQL mutations are fit for purpose in any situation where there are data integrity requirements.
All the workarounds seem to be either to use DQL directly in your backend, losing one of the key selling points of Dgraph as a direct interface, or to use lambdas, which are an extra thing to maintain, and I haven’t found anyone mentioning how those lambda implementations can be tested.
Hi @dhartweg. Your solution would require re-evaluating hasImplicitItems each time the tree changes. That’s a pain.
I have the exact same problem in my side project. I solve it by just going down the tree from the requested item (with a max depth), then using @normalize to flatten the result into an array, and then programmatically looking for a shared item in it. That tells me whether the requested item is accessible by the requester.
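The programmatic part of that check can be sketched roughly as follows. This is plain Python over an in-memory adjacency map, not the actual DQL query: the children map, the shared_with_org set, and both function names are made up for illustration, and descendants only plays the role of the flattened @normalize result.

```python
def descendants(children: dict, start: str, max_depth: int) -> list:
    """Flatten the subtree under start into a list, at most max_depth levels deep."""
    found = []
    frontier = [start]
    for _ in range(max_depth):
        # Move one level down the tree.
        frontier = [child for node in frontier for child in children.get(node, [])]
        if not frontier:
            break
        found.extend(frontier)
    return found


def can_access(children: dict, shared_with_org: set, item_id: str, max_depth: int = 10) -> bool:
    """True if item_id, or any descendant within max_depth, is shared with the org."""
    if item_id in shared_with_org:
        return True
    return any(d in shared_with_org for d in descendants(children, item_id, max_depth))


# Hypothetical tree: root -> a -> b -> c, plus a sibling branch root -> x.
children = {"root": ["a", "x"], "a": ["b"], "b": ["c"]}
shared = {"c"}  # items explicitly shared with the requesting organization

deep_enough = can_access(children, shared, "root", max_depth=3)
too_shallow = can_access(children, shared, "root", max_depth=2)
other_branch = can_access(children, shared, "x")
```

Note the effect of the depth limit: an ancestor that sits more than max_depth levels above the shared item is reported as inaccessible, so the limit has to be at least as large as the deepest tree you expect.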
For sure, it would either require re-evaluating hasImplicitItems when part of the tree changes, or preventing anything from being un-shared once it had been shared, or some other tradeoff with what folks can actually do.
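One way to picture the re-evaluation option is to rebuild the implicit set from the explicit shares whenever something is un-shared or moved. The sketch below is plain Python with hypothetical names (parent_of, explicit_shares, implicit_items) and assumes a single parent per item; it says nothing about how the resulting edges would be written back to Dgraph.

```python
def implicit_items(parent_of: dict, explicit_shares: set) -> set:
    """Return every ancestor of any item explicitly shared with one organization."""
    implicit = set()
    for item_id in explicit_shares:
        node = parent_of.get(item_id)
        # Stop early once we reach an ancestor an earlier walk already covered.
        while node is not None and node not in implicit:
            implicit.add(node)
            node = parent_of.get(node)
    return implicit


# Hypothetical tree: root has children a and d; a has children b and c.
parent_of = {"a": "root", "b": "a", "c": "a", "d": "root"}

both_shared = implicit_items(parent_of, {"b", "c"})
after_unsharing_b = implicit_items(parent_of, {"c"})
nothing_shared = implicit_items(parent_of, set())
```

Un-sharing b changes nothing here because c still keeps a and root visible, which is exactly why simply deleting the ancestors' edges on un-share would be wrong and some form of recomputation (or reference counting) is needed.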
Calculating, storing and maintaining the relationships is definitely a pain, but I feel like doing what could potentially be a large amount of work on every query would be less sustainable in the long run. Not to mention, can we even use DQL with authorization? If not (which I believe is the case right now), we would need two separate avenues to get the full picture of the Items an Organization has access to.
For me, keeping the source (the data, i.e. the graph) clean is key. Then errors can only exist in the logic around it. But when the source gets polluted, no amount of code can fix it. Yes, that will require more processing time (if that’s what you mean by work). But I will take my chances there.
As far as I know, auth is completely up to you in DQL…
Great points! I’m going to be on the lookout for ways to either not have to deal with the implicit access or otherwise deal with it (maybe a completely separate path where one explicitly states they want to search through those items).
You’re right, you roll your own auth in DQL. I was talking about it from a GraphQL API standpoint, since that’s the route I have planned for most (if not all) of the data.
If you could write an auth rule using DQL instead of only with GraphQL, something like this request might be possible. But right now a deep tree structure would be almost impossible to write in GraphQL. "Almost" because it could be done to a limited depth, but not to an unlimited depth, and definitely not efficiently.