Cannot regexp match predicates with newlines

Moved from GitHub dgraph/4694

Posted by insanitybit:

What version of Dgraph are you using?

Dgraph version : v1.1.0
Dgraph SHA-256 : 7d4294a80f74692695467e2cf17f74648c18087ed7057d798f40e1d3a31d2095
Commit SHA-1 : ef7cdb28
Commit timestamp : 2019-09-04 00:12:51 -0700
Branch : HEAD
Go version : go1.12.7

Name: pydgraph
Version: 2.0.2
Summary: Official Dgraph client implementation for Python
Home-page: GitHub - dgraph-io/pydgraph: Official Dgraph Python client
Author: Dgraph Labs
Author-email: contact@dgraph.io
License: Apache License, Version 2.0
Requires: grpcio, protobuf
Required-by:

Have you tried reproducing the issue with the latest release?

This is the latest ‘standalone’ release.

What is the hardware spec (RAM, OS)?

16GB RAM, 8-core Intel.

Ubuntu 18.04.

Steps to reproduce the issue (command/config used to run Dgraph).

docker run --rm -it -p 8000:8000 -p 8080:8080 -p 9080:9080 dgraph/standalone:latest

Create a node with a predicate ‘foo’ and a value ‘00\n0’.

Attempt to regexp match the newline character.

Here is an example query I have:

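import json

import pydgraph

# Assumed client setup (not shown in the original report); adjust the address as needed.
local_client = pydgraph.DgraphClient(pydgraph.DgraphClientStub("localhost:9080"))

node_key = "e3e70682-c209-4cac-629f-6fbed82c07cd"

# Doubled braces are literal braces inside an f-string.
query = f"""
    {{
        q0(func: eq(node_key, "{node_key}"))
        @filter(regexp(arguments, /(?sm)^00./))
        {{
            uid,
            expand(_all_)
        }}
    }}
    """
json.loads(local_client.txn(read_only=True).query(query).json)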

Or
@filter(regexp(arguments, /(?sm)^00\n/))
@filter(regexp(arguments, /00\n/))

etc
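
One client-side detail may be worth ruling out here. The query above is built in a non-raw Python f-string, so if the `\n` variants are pasted into that string, Python turns `\n` into an actual newline character before the query is sent. The sketch below only illustrates that difference; `regexp_filter` is a hypothetical helper and not part of pydgraph, and it says nothing about which form Dgraph's parser accepts.

```python
def regexp_filter(predicate: str, pattern: str) -> str:
    """Build a DQL @filter(regexp(...)) clause (hypothetical helper)."""
    return f"@filter(regexp({predicate}, /{pattern}/))"


# Non-raw string: Python replaces \n with a real newline character,
# so the regex literal sent to the server spans two lines.
plain = regexp_filter("arguments", "00\n")

# Raw string: the two characters backslash and 'n' are sent as written.
raw = regexp_filter("arguments", r"00\n")

print(repr(plain))
print(repr(raw))
```

Printing with `repr` makes it easy to see which of the two forms a given query string actually contains.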

Expected behaviour and actual result.

I expect to see the predicate returned, because ‘.’ should match newline with the s/m flags. Or, with an explicit \n I should also see a value returned, but do not.

I can regex just ‘00’ and get the value back, but otherwise I can find no way to actually regexp when newlines are involved.

MichelDiz commented :

I don’t know if this issue is valid. Strings in Dgraph don’t have the concept of a newline, unless you’ve escaped it, but that’s different.

Please, provide a sample so I can reproduce the issue on my side.

Cheers.

lgalatin commented :

Issue seems related to: #5131

insanitybit commented :

That issue seems related, I did try using ‘i’.

{
  set{
    <0x4e21> <example> "foo\nbar" .
  }
}
{
  q(func: has(example)){
    uid,
    example,
  }
}

Shows:

 "q": [
      {
        "uid": "0x4e21",
        "example": "foo\nbar"
      }
    ]

Things that work:

regexp(example, /.*bar/)
regexp(example, /foo.*/)

Things I have tried, that I might expect to work, that do not work:

regexp(example, /foo\\\n/)
regexp(example, /foo\n/)
regexp(example, /foo.*bar/)

martinmr commented :

This is not caused by the same issue as #5131. I ran the queries with the bug fix for that other ticket and newlines are still not recognized.

martinmr commented :

It is still unclear whether this is a bug or whether Dgraph has never supported newlines in regular expressions, but I am marking the ticket as accepted.

insanitybit commented :

I think that this example in particular should clearly work, no? That certainly feels like a bug, .* not matching a newline?
regexp(example, /foo.*bar/)
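
For comparison, here is a minimal sketch of how the patterns from this thread behave in Python's `re` module. This is not the engine Dgraph uses. Dgraph is written in Go, and the assumption here is that it relies on Go's `regexp` package, where `.` likewise does not match a newline unless the `s` flag is set. The sketch is only a baseline for what a typical regex engine does with the value "foo\nbar".

```python
import re

value = "foo\nbar"  # the stored string contains a real newline character

patterns = [r".*bar", r"foo.*", r"foo\n", r"foo.*bar", r"(?s)foo.*bar"]

# Map each pattern to whether Python's re finds it anywhere in the value.
results = {p: re.search(p, value) is not None for p in patterns}

for pattern, matched in results.items():
    print(f"/{pattern}/ -> {matched}")
```

In this engine only `/foo.*bar/` fails, and it matches once `(?s)` is added. By that baseline, `/foo.*bar/` not matching without the `s` flag is ordinary behaviour, while `/foo\n/` and the `(?sm)` variants not matching in Dgraph are the surprising results.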