Full text search story

How good is full-text search in Dgraph? I'm trying to see whether it could replace Solr in the following hypothetical scenario.

News database:
MySQL, plus Solr for search. The data is duplicated and amounts to about 40 million news articles, with two million more added every day.
Each article has tags: location, type (crime, entertainment, etc.), and keywords (buried in the description, which is where we use Solr for search).

How capable is Dgraph's full-text search? How many languages can it tokenise?
Would it be better to skip full-text search and instead extract keywords from the text field and create edges for them? For example, location and technology (Java, Golang, etc.) could be created as edges (has_java, has_go, etc.). Would that be faster to search? Say I want to find all news articles from Hong Kong (location_hongkong) that also have a has_go edge: would that be faster than creating a full-text index?
How would you advise modelling this schema?

More on full-text search, including the list of languages supported: https://docs.dgraph.io/query-language/#full-text-search

We use bleve (GitHub: blevesearch/bleve, a modern text indexing library for Go) to create the full-text indexes, so we should be able to support whatever new features and languages are added to that library.

If you mostly intend to search by looking at tags and keywords, you’ll get much better performance by creating edges for them. Full-text indexes are expensive to create and take up a lot of space.
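
To make the tag-lookup idea concrete, here is a toy sketch in Go. It is not how Dgraph stores or traverses edges internally; `TagIndex` and its methods are hypothetical names for illustration only. It models each tag edge (such as `location_hongkong` or `has_go` from the question) as a sorted list of article IDs and answers "articles with both tags" by merging two lists, with no text analysis involved.

```go
package main

import (
	"fmt"
	"sort"
)

// TagIndex is a toy inverted index: tag -> sorted article IDs.
// Hypothetical stand-in for following a tag edge; not Dgraph's real storage.
type TagIndex map[string][]uint64

// Add records that an article carries a tag, keeping the ID list sorted
// and free of duplicates.
func (idx TagIndex) Add(tag string, articleID uint64) {
	ids := idx[tag]
	i := sort.Search(len(ids), func(i int) bool { return ids[i] >= articleID })
	if i < len(ids) && ids[i] == articleID {
		return // already present
	}
	ids = append(ids, 0)
	copy(ids[i+1:], ids[i:]) // shift the tail right by one slot
	ids[i] = articleID
	idx[tag] = ids
}

// Intersect returns the IDs of articles that carry both tags by walking
// the two sorted lists in a single pass.
func (idx TagIndex) Intersect(a, b string) []uint64 {
	x, y := idx[a], idx[b]
	var out []uint64
	for i, j := 0, 0; i < len(x) && j < len(y); {
		switch {
		case x[i] == y[j]:
			out = append(out, x[i])
			i++
			j++
		case x[i] < y[j]:
			i++
		default:
			j++
		}
	}
	return out
}

func main() {
	idx := TagIndex{}
	for _, id := range []uint64{1, 2, 5} {
		idx.Add("location_hongkong", id)
	}
	for _, id := range []uint64{2, 3, 5} {
		idx.Add("has_go", id)
	}
	fmt.Println(idx.Intersect("location_hongkong", "has_go")) // prints [2 5]
}
```

The lookup cost here depends only on the lengths of the two ID lists, which is the intuition behind preferring tag edges when the set of tags is known up front.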

As for the schema, here's one possibility.

text: string .
keywords: string @index(term) .
location: geo @index(geo) .
type: string @index(term) .

Of course, I'm not sure what your exact use case is, but storing the article metadata as edges will be much more efficient than indexing a raw string and using search to find it.
