Using "regex function"/"trigram index" on sets (lists)

I can’t find the right way to use this index on arrays. On top of that, the regular expressions it accepts seem to have to be so specific that they lose their flexibility: they end up behaving like oneofterms or anyofterms, which defeats their purpose. For example:

I have this schema:

mutation {
    schema {
        group: string @index(exact,term) .
        name: string @index(exact,term) .
        array_str: [string] @index(trigram) .
    }
}

Then I add this data (I don’t know if there is a shortcut to massively add all the “array_str” elements, please tell me if there is one):

mutation {
  set {
    _:root <group> "test" .
    _:root <name> "root" .
    _:root <array_str> "first" .
    _:root <array_str> "second" .
    _:root <array_str> "third" .
    _:root <array_str> "fourth" .
  }
}

But making queries like the following won’t work at all:

{
  root(func:regexp(array_str, /.*rd$/))@filter(eq(group,"test") and regexp(array_str,/.*rd$/)){
    expand(_all_)
  }
}

RETURNS: Regular expression is too wide-ranging and can’t be executed efficiently.

{
  root(func:regexp(array_str, /[a-z]+rd$/))@filter(eq(group,"test") and regexp(array_str,/[a-z]+rd$/)){
    expand(_all_)
  }
}

RETURNS: Regular expression is too wide-ranging and can’t be executed efficiently.

{
  root(func:regexp(array_str, /third/))@filter(eq(group,"test") and regexp(array_str,/third/)){
    expand(_all_)
  }
}

RETURNS: NOTHING - Showing 0 nodes and 0 edges (I would expect the node I just inserted, since it contains “third”)
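
For reference, the expectation above matches how a trigram index is usually described: each stored value is split into 3-character substrings, a literal in the pattern is looked up through its trigrams, and the surviving candidates are checked against the real regex. The Python sketch below is a generic illustration of that idea and not Dgraph's implementation; the helper names (trigrams, build_index, search_literal) are hypothetical.

```python
import re
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(values):
    """Map each trigram to the set of values that contain it."""
    index = defaultdict(set)
    for value in values:
        for gram in trigrams(value):
            index[gram].add(value)
    return index


def search_literal(index, literal):
    """Find values containing literal: intersect postings, then verify."""
    grams = trigrams(literal)
    if not grams:
        # Fewer than 3 characters: the index cannot narrow the search.
        raise ValueError("literal too short for a trigram lookup")
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Final check with the actual pattern, since trigram hits are only candidates.
    pattern = re.compile(re.escape(literal))
    return sorted(v for v in candidates if pattern.search(v))


# Same values as the array_str predicate in the mutation above.
index = build_index(["first", "second", "third", "fourth"])
print(search_literal(index, "third"))
```

Under that model, a literal such as “third” has three trigrams to look up, while a two-character fragment such as “rd” has none, so the index alone cannot narrow the search.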

Any suggestions to make this work? It seems that right now this index doesn’t provide full regex functionality and is just another way of implementing “anyofterms” and “oneofterms”.

Thank you!

Hey @RafaARV,

This error is thrown if your regexp returns more than a million results. Executing a query like this can cause a huge memory spike, something better avoided.

I don’t know if there is a shortcut to massively add all the “array_str” elements, please tell me if there is one

Maybe not in this one, but from the upcoming v0.9 release onwards, we’re going to move all mutations to JSON, and then you will be able to set all the elements of an array directly.
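
As a rough idea of what setting the whole array in one go might look like, here is a small Python sketch that builds a JSON body for the same node. The exact shape of the JSON mutation (a top-level "set" list of objects keyed by predicate name) is an assumption, since the format isn't spelled out in this thread.

```python
import json

# Hypothetical JSON mutation body; the top-level "set" key and the
# object layout are assumptions, not a confirmed format.
mutation = {
    "set": [
        {
            "group": "test",
            "name": "root",
            # All list elements supplied together instead of one triple each.
            "array_str": ["first", "second", "third", "fourth"],
        }
    ]
}

payload = json.dumps(mutation, indent=2)
print(payload)
```

The point is only that the four separate <array_str> lines would collapse into a single list value.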

Make the regexp more specific so it generates less than a million results.

Thank you Manish,

This error is thrown if your regexp returns more than a million results.

Make the regexp more specific so it generates less than a million results.

But this was a clean database with only one node (“_:root”), so it is nowhere near a million results; that isn’t even possible yet with a single node in the database.
Rejecting a query based on a prediction alone is too arbitrary, because most regex operators would simply end up banned. For example:

  • Operators like “*” or “+” would be banned, since they can match an unbounded number of strings.

Shouldn’t the efficiency of queries rely on how well the user knows how to write efficient queries against their data, rather than simply taking away the option of running them?

Thank you!

There’s no prediction in the system. It runs the regexp, and only if it ends up matching more than a million results, would it throw this error. If there’s only one result, something weird is going on. Can you please file a bug?
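
As a rough sketch of the behaviour described here (run the pattern first, and fail only once the number of matches passes a limit), something like the following Python illustrates the idea. This is not Dgraph's code; the function name, the exception class and the max_results parameter are hypothetical, and only the one-million figure comes from this thread.

```python
import re


class RegexTooWideError(Exception):
    """Raised when a pattern matches more values than the allowed limit."""


def regexp_match(values, pattern, max_results=1_000_000):
    """Run pattern over values and fail only if matches exceed max_results."""
    compiled = re.compile(pattern)
    matches = []
    for value in values:
        if compiled.search(value):
            matches.append(value)
            # The limit is checked against actual matches, not predicted ones.
            if len(matches) > max_results:
                raise RegexTooWideError(
                    "Regular expression is too wide-ranging "
                    "and can't be executed efficiently."
                )
    return matches


# Same values as the array_str predicate in the original post.
values = ["first", "second", "third", "fourth"]
print(regexp_match(values, r".*rd$"))
print(regexp_match(values, r"[a-z]+rd$"))
```

With a check of this kind, neither pattern from the original post would fail on a single node, which is why the reported error looks like a bug.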
