Strange regexp behavior


(Sergei Myshliakov) #1

Hi,

Suppose I want to search users by second, first, and middle names, and I want the search result to be refreshed on every typed character.

In Neo4j I do:

val q = fn.length match {
  case 1 =>
    s"match (ud:UserData)<-[:usr_usd]-(u:User)-[:usr_acc]->(a:Account) where ud.sname=~'${fn(0)}.*' " +
    s"return u,ud,a limit $limit"
  case 2 =>
    s"match (ud:UserData)<-[:usr_usd]-(u:User)-[:usr_acc]->(a:Account) where ud.sname=~'${fn(0)}.*' " +
    s"and ud.fname=~'${fn(1)}.*' " +
    s"return u,ud,a limit $limit"
  case _ =>
    s"match (ud:UserData)<-[:usr_usd]-(u:User)-[:usr_acc]->(a:Account) where ud.sname=~'${fn(0)}.*' " +
    s"and ud.fname=~'${fn(1)}.*' and ud.mname=~'${fn(2)}.*' " +
    s"return u,ud,a limit $limit"
}

Is it possible to do that in Dgraph? I couldn’t find a solution.

Thanks in advance


(Michel Conrado) #2

I am not familiar with the Neo4j query language. To start with, everything you can do with regexp is described in the documentation. From what I understand of your question, you would need “name”, “second name” and “middle name” in distinct predicates and then do something like the query below. There is no regexp offset, so you need a predicate for each.

{
  directors(func: regexp(second.name@en, /^Spielberg.*$/))
  @filter(regexp(name@en, /^Steven.*$/) AND regexp(middle.name@en, /^some.*$/)) {
    name@en
  }
}
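For the search-as-you-type case, the same idea (one regexp per predicate) can be generated from the typed name parts. The following is only a minimal sketch: the object and function names are hypothetical, the predicates `sname`, `fname` and `mname` are assumed to exist with a trigram index, and the escaping only covers ASCII punctuation.

```scala
object NameSearch {
  // Assumed predicate names, in the order the user types the name parts.
  private val predicates = Seq("sname", "fname", "mname")

  // Backslash-escape ASCII punctuation (regex metacharacters and the '/'
  // delimiter) so that typed text is matched literally.
  private def escape(s: String): String =
    s.flatMap(c => if (c < 128 && !c.isLetterOrDigit) "\\" + c else c.toString)

  // Builds a query string with one prefix regexp per typed name part.
  def buildQuery(parts: Seq[String], limit: Int): String = {
    require(parts.nonEmpty, "at least one name part is needed")
    val conds = predicates.zip(parts).map { case (pred, part) =>
      s"regexp($pred, /^${escape(part)}.*$$/)"
    }
    // The first condition goes to the root function, the rest into @filter.
    val filter =
      if (conds.tail.isEmpty) "" else conds.tail.mkString(" @filter(", " AND ", ")")
    s"""{
       |  users(func: ${conds.head}, first: $limit)$filter {
       |    uid
       |    sname
       |    fname
       |    mname
       |  }
       |}""".stripMargin
  }
}
```

Calling `NameSearch.buildQuery(Seq("Spielberg", "Ste"), 10)` puts the `sname` condition at the root and the `fname` condition in `@filter`. Parts beyond the third are ignored.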

(Sergei Myshliakov) #3

Hi Michel,

Your query is not what I want. Here is my query for SQL:

def getUsers(fn: Array[String]) = {
  val q = fn.length match {
    case 1 =>
      val sname = fn(0)
      s"select * from users as u where u.sname like '${sname}%'"
    case 2 =>
      val sname = fn(0)
      val fname = fn(1)
      s"select * from users as u where u.sname like '${sname}%' and u.fname like '${fname}%'"
    case _ =>
      val sname = fn(0)
      val fname = fn(1)
      val mname = fn(2)
      s"select * from users as u where u.sname like '${sname}%' and u.fname like '${fname}%' and u.mname like '${mname}%'"
  }

  // here is the request to the db

}


(Michel Conrado) #4

Hi,
Can you show what you are actually doing in Dgraph?
Sharing it would make it easier to understand what this “strange regexp behavior” is. Examples of mutations, queries, and results would be nice.

Dgraph doesn’t have a “select *”, but something similar (though not with the same purpose) is “expand(_all_)”.

{
  q(func: eq(sname, "some name")) {
    uid
    expand(_all_) { expand(_all_) }
  }
}
{
  q(func: eq(sname, "some name")) @filter(eq(fname, "some name")) {
    uid
    expand(_all_)
  }
}
{
  q(func: eq(sname, "some name")) @filter(eq(fname, "some name") AND eq(mname , "some name") ) {
    uid
    expand(_all_)
  }
}

This can only be achieved with https://docs.dgraph.io/query-language/#regular-expressions


(Sergei Myshliakov) #5

Hi Michel,

I described what I “am actually doing” in my first message, but I can repeat it once again.
In my front end (browser) I have an input field for a user’s full name (second, first, middle) to search users, and a table to show the result (a list of users, or maybe an empty list). Further, I want to see the result update on every character I type. The query (Neo4j Cypher) in my first message is from a real project. I couldn’t find how to do that in Dgraph, so I asked whether it is possible. Sorry.

<<This can only be achieved with https://docs.dgraph.io/query-language/#regular-expressions>>

Yes, I read it. But regexp only works if patterns are longer than 2 characters, and this is not suitable for my case. What is the reason for not having a “normal” regexp?


(Michel Conrado) #6

Dgraph shows an error for that:

"message": ": regular expression is too wide-ranging and can't be executed efficiently"

It means that you can’t perform an expensive query at the root. Dgraph’s regexp design is based on indexing; that’s why it is limited.

If regexp is too wide-ranging (like: .*), or more than 1000000 values are
returned for trigram, execution is stopped, because of performance
reasons.
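To illustrate why patterns of one or two characters are a problem: a trigram index is keyed by 3-character substrings, so a literal shorter than three characters yields no trigram to look up. The sketch below is only an illustration of that idea with hypothetical names; it is not Dgraph's actual implementation.

```scala
object TrigramSketch {
  // All 3-character substrings of a literal; empty when the literal is too short.
  def trigrams(literal: String): List[String] =
    if (literal.length < 3) Nil else literal.sliding(3).toList

  // A typed prefix can narrow the search through a trigram index only if it
  // yields at least one trigram.
  def canUseTrigramIndex(prefix: String): Boolean = trigrams(prefix).nonEmpty
}
```

With this sketch, `TrigramSketch.trigrams("Spiel")` gives `List("Spi", "pie", "iel")`, while `TrigramSketch.trigrams("Sp")` is empty, which corresponds to the first two keystrokes in a search-as-you-type field.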

Check https://github.com/dgraph-io/dgraph/commit/a62fec22a47f6bd310783076881bca9a862f05d9

Also, there are changes coming to regexp that I think will land in v1.1: https://github.com/dgraph-io/dgraph/commit/a4756c206adc19b5ba640373407b30ef3a01f030

it has:

return x.Errorf("Attribute %v does not have trigram index for regex matching. "+
    "Please add a trigram index or use has/uid function with regexp() as filter.", attr)
}

I’m not sure, but maybe you will be able to use the “normal” regexp by first adding the nodes to the query context with “has” and then filtering them by regexp.
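A query of that shape (nodes selected with “has” at the root, then filtered by regexp) could be generated as in the sketch below. This is only an assumption about how the change would be used: whether a given Dgraph version accepts a short pattern in the filter depends on the commit linked above, the predicate `sname` is taken from the earlier queries, and the object and function names are hypothetical.

```scala
object HasThenRegexp {
  // Builds a query that selects all nodes having sname at the root and
  // applies the prefix regexp only as a filter.
  def buildQuery(prefix: String, limit: Int): String = {
    // The sketch avoids escaping by accepting letters and digits only.
    require(prefix.forall(_.isLetterOrDigit), "sketch only handles letters and digits")
    s"""{
       |  users(func: has(sname), first: $limit) @filter(regexp(sname, /^${prefix}.*$$/)) {
       |    uid
       |    sname
       |  }
       |}""".stripMargin
  }
}
```

Note that `has(sname)` touches every node with that predicate, so this trades index use for flexibility and may be slow on large data sets.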


(Sergei Myshliakov) #7

Hi Michel,

Thanks for your comprehensive answer. I have to think about it.

