Question about String indexing

Luscha · June 28, 2020, 3:11pm

I was wondering about the efficiency of String indexing in Dgraph.
Currently I store every node in the database with a generated id (normally a hash of its properties), and I use it to perform upserts.
While integrating GraphQL, I started to feel this is useless overhead, particularly for nodes containing localized text.

Given the schema

interface Metadata {
  id: String! @id @search(by: [hash])
  createdAt: DateTime!
  modifiedAt: DateTime!
  generation: Int!
  version: Int!
}


type Text implements Metadata {
  text: String! @search(by: [trigram, term, fulltext])
  """
  en-US, ...
  """
  localization: [String!]!
}

Would changing the id field to the text itself affect performance?

type Text {
  text: String! @search(by: [trigram, term, fulltext]) @id
  """
  en-US, ...
  """
  localization: [String!]!
}

Does Dgraph slow down when a string indexed with trigram, term, and fulltext is also used as the unique id?

gja · June 29, 2020, 7:52am

Hi @Luscha,

As I understand it, each of these indexes is stored separately, so you should not see any slowdown. However, from a domain point of view, you might not want to use the text itself as an ID, because you would have a tough time updating the text.

Tejas

pawan · June 29, 2020, 11:33am

Mutations take more time as more indexes are added to a field, because every index has to be kept up to date. At the same time, queries that use those indexes become faster. Are you running all kinds of queries against your id field, such as regex, allofterms, and alloftext, so that you actually need all of those indexes?
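
To make the write-cost point concrete, here is a rough toy model in Python. It is only a sketch: the tokenizers below are simplified stand-ins written for illustration, not Dgraph's actual tokenizers, and fulltext is left out because it involves language-specific processing. It counts how many index entries a single value would produce under each index.

```python
def term_tokens(text):
    # Toy term tokenizer: lowercase, split on whitespace, deduplicate.
    return set(text.lower().split())


def trigram_tokens(text):
    # Every distinct run of three consecutive characters.
    return {text[i:i + 3] for i in range(len(text) - 2)}


def hash_tokens(text):
    # A hash index keeps one entry for the whole value.
    return {text}


# Hypothetical mapping from index name to toy tokenizer.
TOKENIZERS = {
    "hash": hash_tokens,
    "term": term_tokens,
    "trigram": trigram_tokens,
}


def index_entries(text, indexes):
    """Count the entries one write would touch, per index."""
    return {name: len(TOKENIZERS[name](text)) for name in indexes}


if __name__ == "__main__":
    print(index_entries("the quick brown fox", ["hash", "term", "trigram"]))
```

In this model every extra index adds its own set of entries to maintain on each mutation, which is the trade-off against faster queries.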

Luscha · July 1, 2020, 3:06pm

I would like to build something like a search engine, so I guess I need all of those indexes to run complete and reliable search queries.
My question was more about using an indexed string as the @id of a node than about the efficiency of the indexing itself:

Option 1:
Get the text → compute a hash → use the hash as the id → perform an upsert / query of the node.
Option 2:
Get the text → use the full text to find the node associated with it → perform an upsert / query of the node.
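
A minimal Python sketch of the two options, for illustration only. SHA-256 is an assumption (the thread does not say which hash is used), and `hashed_id` and `lookup_key` are hypothetical helper names; the actual upsert or query against Dgraph is not shown.

```python
import hashlib


def hashed_id(text):
    """Option 1: derive a deterministic id by hashing the text."""
    # SHA-256 is an assumption; any stable hash would serve the same role.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def lookup_key(text, use_hash):
    """Return the (field, value) pair to match on before the upsert or query."""
    if use_hash:
        return ("id", hashed_id(text))  # Option 1: exact match on the hash
    return ("text", text)               # Option 2: exact match on the text


if __name__ == "__main__":
    print(lookup_key("Hello world", use_hash=True))
    print(lookup_key("Hello world", use_hash=False))
```

Either way the node is found by an exact match on a string field; Option 1 just adds the hashing step on the client and a second stored field.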

pawan · July 2, 2020, 12:56pm

Using an indexed string as an @id should be fine for what you want to do here. The runtime of both approaches would be similar, since both require searching a string field.
