How to do Negative Lookups in Dgraph?

Hi Team,

I want to highlight comments or posts that were liked by the logged-in user.

While going through the Highscalability blog, Reddit lessons learned…, I found that they use a Bloom filter.

I searched for Bloom filter in Dgraph, but couldn’t find a succinct answer to:

  1. How do I use a Bloom filter in Dgraph?
  2. If it’s part of Badger DB, how do I access that feature via Dgraph?
  3. Does Dgraph implement an even better approach?

If there is documentation I missed, please point me in that direction.

Hi @abhijit-kar

Badger keeps a Bloom filter for each SST. These are enabled by default (see the default value BloomFalsePositive: 0.01). Badger can load Bloom filters lazily instead of loading them all at startup, which would be expensive. There is also a PR that adds an option to disable them: "Support fully disabling the bloom filter" by damz (Pull Request #1319, dgraph-io/badger on GitHub).

However, these filters are internal to Badger and not accessible in Dgraph.
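
For readers unfamiliar with the idea, here is a minimal, self-contained sketch of what a Bloom filter gives you: a cheap "definitely not present" answer, with a small chance of false positives. This is a toy illustration only, not Badger's implementation; the class name, sizes, and hashing scheme are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: answers "definitely absent" or "possibly present"."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means the key was never added; True may be a false positive.
        return all(self.bits[pos] for pos in self._positions(key))


# Hypothetical post uids liked by one user.
liked = BloomFilter()
for post_id in ("0x1", "0x2", "0x3"):
    liked.add(post_id)
```

A negative lookup is then `liked.might_contain(post_id)` returning `False`, which guarantees the post was never added. A `True` result still needs to be confirmed against the real data.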

Thank you Naman, I understand what you are saying.

Then what would be the solution to the problem of highlighting the comments/posts a user has already liked?

Hi @abhijit-kar, I am looking into possible ways to do this (if any exist). As I understand it, you want to find which posts the user has liked out of a given list of, say, 100 posts.

name: string .
liked_posts: [uid] .

type User {
     name
     liked_posts
}

Hi Naman,

Getting all the posts a user has liked at once is easy.

What I am trying to achieve is this: say you are a logged-in user on Reddit and you visit the homepage, which lists the top posts.

You like a few, then refresh the page or open it in the Reddit mobile app.

Reddit shows the top posts and highlights the ones you have already liked.

That’s what I am trying to achieve.

Thanks Abhijit for elaborating.

Badger creates Bloom filters over its keys (<predicate, subject>), so with a schema like the one above we can’t achieve what you are after. We would have to fetch all of the user’s liked posts anyway and then do the filtering.

An efficient lookup on liked_posts would require a Bloom filter entry for each row, that is, for each <liked_posts, uid> pair as well, which Badger does not currently support.
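
The fetch-then-filter approach mentioned above can be sketched on the client side as follows. This is only an illustration under stated assumptions: the uids of the user's liked_posts are assumed to have been fetched from Dgraph already, and the function name, the `liked_by_me` field, and the sample data are all hypothetical.

```python
from typing import Dict, Iterable, List


def mark_liked(
    top_posts: Iterable[Dict[str, str]], liked_uids: Iterable[str]
) -> List[Dict[str, object]]:
    """Annotate each post with whether the logged-in user has liked it."""
    liked = set(liked_uids)  # set membership is O(1) on average
    return [{**post, "liked_by_me": post["uid"] in liked} for post in top_posts]


# Hypothetical data: the homepage listing and the user's liked_posts uids.
top_posts = [
    {"uid": "0x10", "title": "First post"},
    {"uid": "0x11", "title": "Second post"},
    {"uid": "0x12", "title": "Third post"},
]
liked_uids = ["0x11", "0x2a"]

annotated = mark_liked(top_posts, liked_uids)
```

The cost here is dominated by fetching every liked uid for the user, which is the drawback described above when a user has liked a very large number of posts.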

I hope that answers your question to some extent. Feel free to continue the discussion.