A very common use case is building multi-tenant applications, which segregate data on a per-site, per-team, or per-user basis. The problem is that indexes are always global, and now that transactions have been introduced there is a greater performance penalty for maintaining them globally.
For example, let’s say I’m building a Slack clone and I want each team’s messages to be kept separate (i.e. you will only ever see messages from your own team). Each team has only one thread, so I build a very simple schema as follows:
team_name: string @index(term) .
team_messages: uid .
message_value: string .
user_teams: uid @reverse .
Let’s say I have the userId (which I normally do, assuming the user is logged in), so to get a user’s teams and their messages I can simply run the following query:
{
  root(func: uid(<userId>)) {
    user_teams {
      team_name
      team_messages {
        message_value
      }
    }
  }
}
That’s all good, but I also want to be able to search for a message. I will only ever do this on a per-team basis, but my only option is to add a global index. So I need to add:
message_value: string @index(fulltext) .
Now, for every single message added across the entire platform, the global index is updated (even though I will never query it at the global level), and concurrent transactions will fail if they update message_value entries that map to the same index keys. Given that there could be a lot of messages being added across the entire platform, this could cause a real bottleneck.
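To make the contention concrete, here is a toy Python model. It is not Dgraph's actual implementation: it simply assumes that each fulltext index entry is keyed by predicate and token, and that two concurrent transactions conflict when they write the same index key. The helper names (`tokenize`, `global_index_keys`, `conflicts`) are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase word tokens; a stand-in for real fulltext analysis."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def global_index_keys(predicate, text):
    """Index keys a write touches when the index is shared platform-wide (assumed key shape)."""
    return {(predicate, token) for token in tokenize(text)}

def conflicts(keys_a, keys_b):
    """Assumption: two concurrent writes conflict if they touch any common index key."""
    return bool(keys_a & keys_b)

# Two unrelated teams post messages at the same time.
team_a_write = global_index_keys("message_value", "Nice message, thanks")
team_b_write = global_index_keys("message_value", "Another nice day")

# Index keys touched by both writes, even though the teams share no data.
shared = team_a_write & team_b_write
```

Under these assumptions, the two writes overlap on the key for the token "nice", so teams that never share data can still contend on the same index entry.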
A possible solution
If we could specify in the schema that an index applies only when the predicate is a child of the team_messages predicate (as we will only ever query with a fulltext filter /from/ team_messages), that might help Dgraph to manage performance.
team_messages > message_value: string @index(fulltext) .
And then when running the query I can only use it as a filter on team_messages (but that’s the only place I need it):
{
  root(func: uid(<userId>)) {
    user_teams {
      team_name
      team_messages @filter(anyoftext(message_value, "nice message")) {
        message_value
      }
    }
  }
}
Effectively, you would be creating a separate index each time the team_messages predicate is used.
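As a rough illustration of what such a scoped index could look like, the following Python sketch keys each index entry by the parent team as well as the token. This is only a toy model of the idea, not a proposal for how Dgraph would store it; `ScopedIndex`, `add`, and `any_of_text` are hypothetical names.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase word tokens; a stand-in for real fulltext analysis."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

class ScopedIndex:
    """Toy model: one logical fulltext index per parent node (here, per team)."""

    def __init__(self):
        # (team_id, token) -> set of message ids
        self._postings = defaultdict(set)

    def add(self, team_id, message_id, text):
        """Index a message under its team and return the index keys touched."""
        keys = {(team_id, token) for token in tokenize(text)}
        for key in keys:
            self._postings[key].add(message_id)
        return keys

    def any_of_text(self, team_id, query):
        """Message ids within one team that match any token of the query."""
        hits = set()
        for token in tokenize(query):
            hits |= self._postings.get((team_id, token), set())
        return hits

index = ScopedIndex()
keys_a = index.add("team_a", "m1", "Nice message, thanks")
keys_b = index.add("team_b", "m2", "Another nice day")
```

In this model the two writes touch disjoint keys even though both messages contain "nice", and `index.any_of_text("team_a", "nice message")` only ever considers messages belonging to team_a.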