Is it possible to join two indexes of the same type and search through them together, using the index? If not, I’ll make a feature request for it, since it looks very easy to implement: you just create a new index under the hood. Sure, you duplicate values, and maybe more sophisticated algorithms could speed that up, but that ‘shallow’ solution is already more than enough.
//
To explain better what I mean by joint indexes:
E.g. you have two different predicates, hotel.rating and restaurant.rating.
Schema:
hotel.rating: int @index(int) .
restaurant.rating: int @index(int) .
Now you want to find all businesses that have a rating above 4 stars, and to start you want 25 businesses listed. You have to do ugly things: get 12 restaurants and 13 hotels and mix them in the result.
Things get uglier if you want the top 10 businesses. Getting 5 restaurants and 5 hotels, sorting them again, and delivering that as the top 10 does not work: what if the top 10 are actually all hotels? To solve that, you have to fetch not 5 of each but 10 of each, and then sort. You have to overfetch. This is not cool.
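To make the overfetch concrete, here is a minimal Python sketch of the client-side merge described above. It works on plain in-memory lists rather than a real Dgraph client; `Business` and `top_k_merged` are hypothetical names made up for this illustration, and the data is invented.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Business(NamedTuple):
    uid: str
    kind: str    # "hotel" or "restaurant"
    rating: int


def top_k_merged(hotels: Iterable[Business],
                 restaurants: Iterable[Business],
                 k: int) -> List[Business]:
    """Return the k best-rated businesses across both result sets.

    Each input is assumed to hold up to k rows already fetched from its
    own index, so up to 2*k rows are read in order to return k.
    """
    return heapq.nlargest(k, [*hotels, *restaurants], key=lambda b: b.rating)


# Hypothetical data: every hotel outranks every restaurant.
hotels = [Business(f"h{i}", "hotel", 5) for i in range(10)]
restaurants = [Business(f"r{i}", "restaurant", 4) for i in range(10)]

# Fetching 10 of each and merging gives the right answer (all hotels).
top10 = top_k_merged(hotels, restaurants, k=10)

# Fetching only 5 of each would wrongly let 5 restaurants into the top 10.
wrong_top10 = top_k_merged(hotels[:5], restaurants[:5], k=10)
```

With a single merged index, the first call would shrink to one ordered read of 10 rows and the merge step would disappear.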
So what I basically want is for the hotel.rating and restaurant.rating predicates to be duplicated into one single business.rating, so that my schema looks like this:
hotel.rating: int @index(int) .
restaurant.rating: int @index(int) .
business.rating: int @index(int) .
# which is just a merge of the two above
This is a very simple thing.
I know I could solve it myself by always adding a second predicate, business.rating, to restaurant nodes, which is just a duplicate of restaurant.rating. Whenever I mutate or compute a new rating, I change both of them. This is a good solution. I could also change only the restaurant.rating predicate and let a post-mutation hook update business.rating for me; this would be a good solution too.
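The manual dual-write workaround can be sketched like this in Python. The helper `rating_mutation` is a hypothetical name, and the payload shape (a `uid` key plus one key per predicate, wrapped in a `set` list) is my understanding of Dgraph's JSON mutation format, so treat it as an assumption and check it against the docs for your version.

```python
import json


def rating_mutation(uid: str, kind: str, rating: int) -> dict:
    """Build a JSON set-mutation payload that writes the rating twice.

    The per-type predicate keeps type-specific queries on their own
    index; business.rating is the hand-maintained merged copy.
    """
    if kind not in ("hotel", "restaurant"):
        raise ValueError(f"unknown business kind: {kind!r}")
    return {
        "uid": uid,
        f"{kind}.rating": rating,
        "business.rating": rating,
    }


# Hypothetical node 0x1 is a restaurant whose rating was recomputed to 5.
payload = rating_mutation("0x1", "restaurant", 5)
print(json.dumps({"set": [payload]}))
```

Routing every rating write through one helper like this is what keeps the two predicates from drifting apart, which is exactly the bookkeeping the requested feature would take over.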
(BTW: just removing hotel.rating and restaurant.rating and using ONLY business.rating is NOT a solution, because if a user queried for the best 10 restaurants, Dgraph couldn’t answer that from the index alone anymore and would have to filter by type, which would slow things down.)
Having that handled automatically by Dgraph would be really cool and neat, and since this looks like such an easy feature, it should be no problem to implement. If it does not exist yet, it would be great to have, since it would make things more convenient.
Disclaimer: I know this wouldn’t matter anymore if a user wants not the top 10 restaurants but the top 10 restaurants within a 5 km radius. Then, to improve speed, you would first query for restaurants within 5 km and then filter them by rating. This whole restaurant/hotel thing is just an example that makes it easy to understand exactly what I mean, since my English is not great.
BTW2: this feature would also be extremely useful for merged fulltext search. E.g. you have a social media site about cats and dogs. You would want a separate index for dogs and for cats. If a user wants to search cats with fulltext search, e.g. “small cute cats eating”, then you don’t want to get results with dogs. So if you had ONLY a general search for both cats and dogs, you would need to filter each result node on whether it is about cats. This increases latency. So you want 3 fulltext indexes: one for cats, one for dogs, and one for both.
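Here is a toy Python sketch of the three-index idea. It uses plain token matching on whitespace-split words, which is much simpler than Dgraph's real fulltext index (no stemming, no stop words); `build_index` and `search` are hypothetical helpers and the posts are invented.

```python
from collections import defaultdict


def build_index(posts: dict) -> dict:
    """Map each lowercase token to the set of post ids containing it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for token in text.lower().split():
            index[token].add(post_id)
    return index


def search(index: dict, query: str) -> set:
    """Return ids of posts that contain every token of the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical posts, keyed by id.
cat_posts = {"c1": "small cute cats eating", "c2": "big cats sleeping"}
dog_posts = {"d1": "small cute dogs eating", "d2": "cute dogs playing"}

cat_index = build_index(cat_posts)
dog_index = build_index(dog_posts)
all_index = build_index({**cat_posts, **dog_posts})  # the merged copy

# Cats only: the dedicated index needs no type filter afterwards.
cats_only = search(cat_index, "small cute")

# The same question against the merged index needs an extra filter pass.
filtered = {pid for pid in search(all_index, "small cute") if pid in cat_posts}
```

Both paths return the same posts, but the second one first collects the dog posts too and then throws them away, which is the extra latency I mean.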
Yes, I know this would be a trade-off: server resources (disk to store the index + CPU to update the index) for a quicker search. But this is not a problem if I want to give my users the best possible experience no matter what the $$ cost is.
BTW3: a better use case for this feature: e.g. you have big texts that you index with a hash index for equality checks. You don’t want to store the same texts about cats twice just to get the same hash twice, once for cat search and once for general search.