@facets kept in nodes filtered by @ignorereflex

Moved from GitHub dgraph/3790

Posted by d4l3k:

  • What version of Dgraph are you using?
    v1.0.16

  • Have you tried reproducing the issue with latest release?
    Yes

  • What is the hardware spec (RAM, OS)?
    32 GB RAM, Arch Linux, running in docker-compose from official images

  • Steps to reproduce the issue (command/config used to run Dgraph).

{
  doc(func: eq(url, "https://www.goodreads.com/book/show/3")) @ignorereflex {
    uid
    ~likes {
      likes @facets(rating) {
        uid
        url
      }
    }
  }
}

Schema:

<likes>: uid @count @reverse .
<url>: string @index(hash) @upsert .

  • Expected behaviour and actual result.

Nodes removed by @ignorereflex should not return facet information.

In reality, you get an entry that contains only the facet, without the uid or any other predicate.

{
  "extensions": {
    "server_latency": {
      "parsing_ns": 17583,
      "processing_ns": 45060798,
      "encoding_ns": 11945009
    },
    "txn": {
      "start_ts": 5486919
    }
  },
  "data": {
    "doc": [
      {
        "uid": "0x8e1065",
        "~likes": [
          {
            "likes": [
              {  // this entry should not be present
                "likes|rating": 5
              },
              {
                "uid": "0x8e1067",
                "url": "https://www.goodreads.com/book/show/34262",
                "likes|rating": 5
              },
              {
                "uid": "0x8e1068",
                "url": "https://www.goodreads.com/book/show/2767052",
                "likes|rating": 5
              },
              {
                "uid": "0x8e1069",
                "url": "https://www.goodreads.com/book/show/41865",
                "likes|rating": 3
              },
              {
                "uid": "0x8e106a",
                "url": "https://www.goodreads.com/book/show/28187",
                "likes|rating": 5
              }
            ]
          },
          {
            "likes": [
              {
                "uid": "0x8e1068",
                "url": "https://www.goodreads.com/book/show/2767052"
              },
...
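
To check a decoded response for this symptom, a minimal Python sketch could look like the one below. It assumes facets are serialized as "predicate|facet" keys, as in the response above. The function names (is_facet_only, find_facet_only) and the trimmed sample are made up for illustration and are not part of any Dgraph client.

```python
def is_facet_only(node):
    """Return True if a result object carries facet keys and nothing else."""
    # Assumption: facets show up as "predicate|facet" keys in the JSON.
    return isinstance(node, dict) and bool(node) and all("|" in key for key in node)


def find_facet_only(value, path="data"):
    """Yield the path of every facet-only object in a decoded response."""
    if isinstance(value, dict):
        if is_facet_only(value):
            yield path
        for key, child in value.items():
            yield from find_facet_only(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from find_facet_only(child, f"{path}[{index}]")


# Trimmed copy of the "data" object from the response above.
sample = {
    "doc": [
        {
            "uid": "0x8e1065",
            "~likes": [
                {
                    "likes": [
                        {"likes|rating": 5},
                        {
                            "uid": "0x8e1067",
                            "url": "https://www.goodreads.com/book/show/34262",
                            "likes|rating": 5,
                        },
                    ]
                }
            ],
        }
    ]
}

suspects = list(find_facet_only(sample))
print(suspects)
```

This only reports where the facet-only entries are; it does not say which node they belonged to, since the uid is exactly what is missing.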

MichelDiz commented :

Hmm, I’m not sure about this. The example isn’t enough in this context. When you call the facet function, it runs regardless of other factors.

So if a node shows up with a facet and no other data, that node shouldn’t exist, unless you have a strong reason to keep an empty node that only carries facet information. Facets are extra information about the relationship between two nodes, stored on the edge. If the child node is empty and only the facet is present, it has no value or meaning. And if you requested the facet and it didn’t show up, you wouldn’t notice the problem. So this behaves well: it helps you clean unnecessary data out of the DB.

If you need it, though, the query below should fit your needs. It filters out the empty nodes and only shows nodes that have the “url” predicate:

{
  doc(func: eq(url, "https://www.goodreads.com/book/show/3")) @ignorereflex {
    uid
    ~likes {
      likes @facets(rating) @filter(has(url)) {
        uid
        url
      }
    }
  }
}

d4l3k commented :

The bug is that this node does in fact exist with data. Removing @ignorereflex causes this node to return “uid” and “url”.

The issue is that the node shouldn’t show up in the list at all, instead of appearing in this weird, useless partial state without even a UID returned.

Using your example with the filter returns the same results:

{
  doc(func: eq(url, "https://www.goodreads.com/book/show/3")) @ignorereflex {
    uid
    ~likes (first: 10) {
      likes(first: 10) @facets(rating) @filter(has(url)) {
        uid
        url
      }
    }
  }
}

{
  "extensions": {
    "server_latency": {
      "parsing_ns": 25348,
      "processing_ns": 84563801,
      "encoding_ns": 1461650
    },
    "txn": {
      "start_ts": 8423422
    }
  },
  "data": {
    "doc": [
      {
        "uid": "0x8e1065",
        "~likes": [
          {
            "likes": [
              {
                "likes|rating": 5
              },
              {
                "uid": "0x8e1067",
                "url": "https://www.goodreads.com/book/show/34262",
                "likes|rating": 5
              },
              {
                "uid": "0x8e1068",
                "url": "https://www.goodreads.com/book/show/2767052",
                "likes|rating": 5
              },
              {
                "uid": "0x8e1069",
                "url": "https://www.goodreads.com/book/show/41865",
                "likes|rating": 3
              },
              {
                "uid": "0x8e106a",
                "url": "https://www.goodreads.com/book/show/28187",
                "likes|rating": 5
              }
            ]
          },
...

MichelDiz commented :

Okay. Without @ignorereflex, are you able to see at least a uid on its own? If so, you need to delete the relation between the parent and that node.

I presume you already did, but just in case, please do it again.

e.g.

Assume 0xeeeee is the target/child and 0x8e1065 is the parent.

{
  delete {
     <0xeeeee> * * .
     <0xeeeee> <likes> <0x8e1065> . #to be sure, but shouldn't be needed. Just in case.
  }
}

Oh, correction!

I’ve just noticed that you have two levels (a kind of inception), so the target isn’t related to 0x8e1065 after all.

{
  doc(func: eq(url, "https://www.goodreads.com/book/show/3")) @ignorereflex {
    uid
    ~likes (first: 10) {
      uid # So instead of "0x8e1065", you need the uid from this level.
      likes(first: 10) @facets(rating) @filter(has(url)) {
        uid
        url
      }
    }
  }
}

If you delete it and it still happens, I would try to reproduce it with a similar data sample.

d4l3k commented :

This happens with every node pair I’ve found. It’s not just that edge/node. It works as expected without @ignorereflex. The bug is that combining @ignorereflex and @facets returns nodes that shouldn’t be returned (and only their facets).

Here’s the result, filtered to just the original node, for a different doc than in my previous tests, without @ignorereflex:

{
  doc(func: eq(url, "https://www.goodreads.com/book/show/4")) {
    uid
    ~likes (first: 10) {
      uid
      likes(first: 10) @facets(rating) @filter(eq(url, "https://www.goodreads.com/book/show/4")) {
        uid
        url
      }
    }
  }
}

{
  "extensions": {
    "server_latency": {
      "parsing_ns": 28956,
      "processing_ns": 22466880,
      "encoding_ns": 1062961
    },
    "txn": {
      "start_ts": 10005050
    }
  },
  "data": {
    "doc": [
      {
        "uid": "0x907b07",
        "~likes": [
          {
            "uid": "0x907b0a",
            "likes": [
              {
                "uid": "0x907b07",
                "url": "https://www.goodreads.com/book/show/4",
                "likes|rating": 3
              }
            ]
          },
          {
            "uid": "0x90a103",
            "likes": [
              {
                "uid": "0x907b07",
                "url": "https://www.goodreads.com/book/show/4",
                "likes|rating": 4
              }
            ]
          },
          {
            "uid": "0x90dc21",
            "likes": [
              {
                "uid": "0x907b07",
                "url": "https://www.goodreads.com/book/show/4",
                "likes|rating": 4
              }
            ]
          },
          {
            "uid": "0x90e3e1",
            "likes": [
              {
                "uid": "0x907b07",
                "url": "https://www.goodreads.com/book/show/4",
                "likes|rating": 5
              }
            ]
          },
...

And here’s the same query with @ignorereflex:

{
  doc(func: eq(url, "https://www.goodreads.com/book/show/4")) @ignorereflex {
    uid
    ~likes (first: 10) {
      uid
      likes(first: 10) @facets(rating) @filter(eq(url, "https://www.goodreads.com/book/show/4")) {
        uid
        url
      }
    }
  }
}

{
  "extensions": {
    "server_latency": {
      "parsing_ns": 22183,
      "processing_ns": 1329375365,
      "encoding_ns": 1085539
    },
    "txn": {
      "start_ts": 10006639
    }
  },
  "data": {
    "doc": [
      {
        "uid": "0x907b07",
        "~likes": [
          {
            "uid": "0x907b0a",
            "likes": [
              {
                "likes|rating": 3
              }
            ]
          },
          {
            "uid": "0x90a103",
            "likes": [
              {
                "likes|rating": 4
              }
            ]
          },
          {
            "uid": "0x90dc21",
            "likes": [
              {
                "likes|rating": 4
              }
            ]
          },
          {
            "uid": "0x90e3e1",
            "likes": [
              {
                "likes|rating": 5
              }
            ]
          },
...

From the docs: “The @ignorereflex directive forces the removal of child nodes that are reachable from themselves as a parent, through any path in the query result”

These nodes should be removed from the results entirely, instead of being left in this weird intermediate state with only the facets returned.
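
One way to read the sentence from the docs is sketched below as a toy Python model. It is not Dgraph's implementation. It assumes every node in the result carries a "uid", that a child is "reflexive" when its uid matches any ancestor on the current path, and that a child list left empty is omitted from the output. The function name ignore_reflex and the sample (trimmed from the response without @ignorereflex above) are for illustration only.

```python
def ignore_reflex(node, ancestors=frozenset()):
    """Return a copy of node without children whose uid is already on the path."""
    path = ancestors | {node["uid"]}
    result = {}
    for key, value in node.items():
        is_child_list = (
            isinstance(value, list)
            and value
            and all(isinstance(c, dict) and "uid" in c for c in value)
        )
        if not is_child_list:
            # Scalars and facets of a node that survives are kept as-is.
            result[key] = value
            continue
        # Drop the whole child, facets included, when it is its own ancestor.
        kept = [ignore_reflex(c, path) for c in value if c["uid"] not in path]
        if kept:  # assumption of this sketch: empty edges are omitted
            result[key] = kept
    return result


doc = {
    "uid": "0x907b07",
    "~likes": [
        {
            "uid": "0x907b0a",
            "likes": [
                {
                    "uid": "0x907b07",
                    "url": "https://www.goodreads.com/book/show/4",
                    "likes|rating": 3,
                }
            ],
        },
        {
            "uid": "0x90a103",
            "likes": [
                {
                    "uid": "0x907b07",
                    "url": "https://www.goodreads.com/book/show/4",
                    "likes|rating": 4,
                }
            ],
        },
    ],
}

expected = ignore_reflex(doc)
print(expected)
```

Under this reading, the "likes|rating" facet disappears together with the child it is attached to, which is the behaviour being asked for here.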

campoy commented :

This is indeed a bug. I was able to reproduce it with this dataset:

{
  set {
    _:f <name> "Francesc" .
    _:f <works_in> _:sf (since="2018") .
    _:m <name> "Manish" .
    _:m <works_in> _:sf (since="2019") .
    _:i <name> "Ibrahim" .
    _:i <works_in> _:bg (since="2019") .
    _:sf <name> "San Francisco" .
    _:bg <name> "Bangalore" .
  }
}

Given the schema below:

<name>: string @index(exact) .
<works_in>: [uid] @reverse .

We can get all the coworkers by doing the following request:

query coworkers($name: string = "Francesc") {
  q(func: eq(name, $name)) {
    name
    works_in {
      name
      ~works_in @facets {
        name
      }
    }
  }
}

This returns a list with both Francesc and Manish.

{
  "data": {
    "q": [
      {
        "name": "Francesc",
        "works_in": [
          {
            "name": "San Francisco",
            "~works_in": [
              {
                "name": "Manish",
                "~works_in|since": "2019"
              },
              {
                "name": "Francesc",
                "~works_in|since": "2018"
              }
            ]
          }
        ]
      }
    ]
  }
}

But if we add @ignorereflex, Francesc should disappear:

query coworkers($name: string = "Francesc") {
  q(func: eq(name, $name)) @ignorereflex {
    name
    works_in {
      name
      ~works_in @facets {
        name
      }
    }
  }
}

The issue is that, even though Francesc is removed, the since facet is wrongly kept:

{
  "data": {
    "q": [
      {
        "name": "Francesc",
        "works_in": [
          {
            "name": "San Francisco",
            "~works_in": [
              {
                "name": "Manish",
                "~works_in|since": "2019"
              },
              {
                "~works_in|since": "2018"
              }
            ]
          }
        ]
      }
    ]
  }
}
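
Until this is fixed on the server side, a possible client-side workaround is to drop the facet-only leftovers after decoding the response. The Python sketch below is only that, a sketch: it assumes facets are serialized as "predicate|facet" keys as in the output above, and the helper name prune_facet_only is hypothetical. Note that it would also drop legitimate entries if a query block requests facets and no other field.

```python
def _facet_only(item):
    """True for a non-empty object whose keys are all "predicate|facet" keys."""
    return isinstance(item, dict) and bool(item) and all("|" in key for key in item)


def prune_facet_only(value):
    """Return a copy of a decoded response without facet-only list entries."""
    if isinstance(value, list):
        pruned = [prune_facet_only(item) for item in value]
        return [item for item in pruned if not _facet_only(item)]
    if isinstance(value, dict):
        return {key: prune_facet_only(child) for key, child in value.items()}
    return value


# The "data" object from the response above.
data = {
    "q": [
        {
            "name": "Francesc",
            "works_in": [
                {
                    "name": "San Francisco",
                    "~works_in": [
                        {"name": "Manish", "~works_in|since": "2019"},
                        {"~works_in|since": "2018"},
                    ],
                }
            ],
        }
    ]
}

cleaned = prune_facet_only(data)
print(cleaned["q"][0]["works_in"][0]["~works_in"])
```

The input is left untouched, so the raw response is still available for comparison.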