Error: Regular expression is too wide-ranging and can’t be executed efficiently

Hi!

I’m seeing the error message in the title when doing a ‘starts with’ type query using a regex where fewer than 3 characters are provided.
E.g.:
query(func: regexp(description, /^AA.*$/i))

Is there any other way to achieve the same result in Dgraph (‘starts with’ matching with fewer than 3 chars)? Other than padding all the data with an arbitrary char, or running through all possible permutations for the 3rd char, are there any alternatives? Our result set is fairly small, roughly 1000 nodes in this instance, so we don’t think performance is much of a limiting factor at this stage.

I have seen a similar issue raised previously - but it’s pretty old and got closed, so checking if there really isn’t any other alternative. It seems like it would be a fairly common use case?

Thanks!


I would also like to know what exactly constitutes a “too wide-ranging” regex. Does it look at the results? Does it run the regex through some undocumented tester? I can write a very long regex that matches fewer results, so how can I know which regexes qualify?

I looked at the code for this. Regexp matching uses a trigram index, which breaks the input down into three-character units. For tokens shorter than 3 chars there is no trigram to look up, so the index can’t be used to evaluate the regex and we would have to match against everything. That is why regexp expressions with fewer than three chars are considered too wide-ranging.
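
To make the trigram point concrete, here is a toy sketch in Python. It is not Dgraph's actual implementation; the `trigrams` helper is a hypothetical name used only for illustration. It simply shows that a two-character literal yields no three-character units at all, so there is nothing to look up in a trigram index.

```python
def trigrams(text: str) -> set[str]:
    """Return every 3-character substring of text (a toy trigram tokenizer)."""
    # For strings shorter than 3 characters the range is empty,
    # so the resulting set is empty as well.
    return {text[i:i + 3] for i in range(len(text) - 2)}


# A 3-character literal gives the index one key to look up.
print(trigrams("AAB"))

# A 2-character literal gives no keys, so an index lookup cannot narrow
# the candidates and every stored value would have to be checked.
print(trigrams("AA"))
```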

We can add the option to do regexp search without using the index. That won’t scale well with a large amount of data and would not work as the root function, but it can work as a filter.
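
Conceptually, an unindexed regexp filter is just a linear scan over the nodes the root function already selected. The Python sketch below illustrates that idea only; `scan_filter` and the dict-shaped nodes are assumptions made for this example, not Dgraph code.

```python
import re


def scan_filter(candidates, pattern, flags=0):
    """Keep the candidate nodes whose description matches the pattern.

    Every candidate is checked, so the cost grows linearly with the
    number of candidates. That is fine for ~1000 nodes, poor for millions.
    """
    compiled = re.compile(pattern, flags)
    return [
        node for node in candidates
        if compiled.search(node.get("description", ""))
    ]


# Hypothetical nodes, standing in for what a root function returned.
nodes = [
    {"uid": "0x1", "description": "aardvark"},
    {"uid": "0x2", "description": "Baa"},
    {"uid": "0x3"},  # no description; treated as an empty string
]
matches = scan_filter(nodes, r"^AA.*$", re.IGNORECASE)
print([node["uid"] for node in matches])
```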


I see! Thanks for the explanation. I think the ability to search without using the index is useful (even if not always performant). I did manage to get a query working which uses the same regex but as a filter (rather than as the root function), so will run with this for our current implementation.
