Recursive query with dgraph


(Himanshu Barthwal) #1

Hi,
I am trying to fetch all the children of a given node. I read through the documentation and it seems like
recurse is what I want to use, but after much trial and error I am not able to figure it out.
In particular, I want to get all child nodes (via the isSubsidiary edge) of the node with entity_id=“1”.
I have the entity_id and isSubsidiary (with reverse edge) predicates in my dgraph instance, and the data looks as follows:

1<-2<-3<-4
^
|
5
^
|
6
(Please, don’t mind my ascii art skills.)
And I am expecting my result set to contain the entity_ids [1,2,3,4,5,6] as they all belong to the same tree (1’s children’s tree). Any help would be greatly appreciated.


(Daniel Mai) #2

Can you share the query you tried? A @recurse query should work for you.


(Himanshu Barthwal) #3

Consider the entity with entity_id=“0FPWZZ-E” as follows:

{
  all_subsidiaries(func: eq(entity_id, "0FPWZZ-E")) {
    ~isSubsidiaryOfFilermstPriorityNoOverlap {
      entity_id
    }
  }
}

The above query gives me the following response:

{
  "data": {
    "all_subsidiaries": [
      {
        "~isSubsidiaryOfFilermstPriorityNoOverlap": [
          {
            "entity_id": "0HCV3V-E"
          },
          {
            "entity_id": "0FQGTM-E"
          },
          {
            "entity_id": "0C755F-E"
          },
          {
            "entity_id": "0J6KQ0-E"
          },
          {
            "entity_id": "0H0H9K-E"
          },
          {
            "entity_id": "0FQD05-E"
          },
          {
            "entity_id": "003JLG-E"
          },
          {
            "entity_id": "0HHDMB-E"
          },
          {
            "entity_id": "0HG6GW-E"
          },
          {
            "entity_id": "0H3DZ4-E"
          },
          {
            "entity_id": "0J0SDQ-E"
          },
          {
            "entity_id": "0JG29M-E"
          },
          {
            "entity_id": "0DLJSM-E"
          },
          {
            "entity_id": "0FSW5G-E"
          },
          {
            "entity_id": "0HYRRF-E"
          },
          {
            "entity_id": "0DLC27-E"
          }
        ]
      }
    ]
  }
}

Now I try the following query and don’t get any data.

{
  all_subsidiaries(func: eq(entity_id, "0FPWZZ-E")) @recurse(depth:5, loop:false) {
    ~isSubsidiaryOfFilermstPriorityNoOverlap {
      entity_id
    }
  }
}

Clearly, the data is there but my recurse query is wrong. Any ideas?


(Daniel Mai) #4

https://docs.dgraph.io/query-language/#recurse-query

The @recurse query docs say you can only specify one level of predicates. At every node hop the query will traverse through all the listed predicates.

A single level of predicates means the query should be shaped like this:

{
  all_subsidiaries(func: eq(entity_id, "0FPWZZ-E")) @recurse(depth:5, loop:false) {
    ~isSubsidiaryOfFilermstPriorityNoOverlap
    entity_id
  }
}
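
To make the traversal concrete, here is a toy Python sketch (not Dgraph code) of what a single-level recurse block does on the small tree from the first post: at every hop it follows the same edge again from each node it has reached. The edge map and the names `reverse_edges` and `recurse` are hypothetical and only for illustration; the real engine also honours depth and loop, which this sketch ignores.

```python
from collections import deque

# child -> parent edges for the tree in the first post (isSubsidiary).
IS_SUBSIDIARY = {"2": "1", "3": "2", "4": "3", "5": "1", "6": "5"}


def reverse_edges(edges):
    """Build parent -> [children], i.e. the ~isSubsidiary direction."""
    children = {}
    for child, parent in edges.items():
        children.setdefault(parent, []).append(child)
    return children


def recurse(start, children):
    """Breadth-first walk that follows the same edge at every hop."""
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:  # guard against revisiting a node
                seen.append(child)
                queue.append(child)
    return seen


subsidiaries = recurse("1", reverse_edges(IS_SUBSIDIARY))
print(sorted(subsidiaries))  # ['1', '2', '3', '4', '5', '6']
```

Each node is handled the same way as the root, which is why the predicates are listed once at a single level rather than nested inside one another.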

(Himanshu Barthwal) #5

@dmai That worked. Thanks a lot!


(Himanshu Barthwal) #6

@dmai I have one more question regarding recurse query. I am trying to write a query to get to
the ultimate parent node from a child as follows:

{
  ultimate_parent(func: eq(entity_id, "05QXVL-E")) @recurse(loop:false) {
    isSubsidiaryOfFilermstPriorityNoOverlap
    entity_id
  }
}

The above query returns the path from “05QXVL-E” to “0FPWZZ-E”. But I only need “0FPWZZ-E”
in the result set and not the whole path. Is there a way to do that? Thanks for your help so far.


(Himanshu Barthwal) #7

@MichelDiz Would you be able to help me with the query above?


(Michel Conrado) #8

Not sure, but could be:

{
  var(func: eq(entity_id, "05QXVL-E")) @recurse(loop:false) {
    GetIT as uid
    isSubsidiaryOfFilermstPriorityNoOverlap
    entity_id
  }
  
  ultimate_parent(func: uid(GetIT)) @filter(Not eq(entity_id, "05QXVL-E")){
    uid
    expand(_all_)
  }
}

If not, please send us the output you want and the output you actually get.


(Himanshu Barthwal) #9

@MichelDiz Thanks for the quick response. It seems the query you posted above prints all nodes of the path except the starting one. But what I need is just the ultimate parent. That is, “0FPWZZ-E”. In particular, consider the following snippet:

{
  path_to_ultimate_parent(func: eq(entity_id, "05QXVL-E")) @recurse(loop:false) {
    isSubsidiaryOfFilermstPriorityNoOverlap
    entity_id
  }
}

It gives me the following result:

    "path_to_ultimate_parent": [
      {
        "isSubsidiaryOfFilermstPriorityNoOverlap": [
          {
            "isSubsidiaryOfFilermstPriorityNoOverlap": [
              {
                "isSubsidiaryOfFilermstPriorityNoOverlap": [
                  {
                    "isSubsidiaryOfFilermstPriorityNoOverlap": [
                      {
                        "entity_id": "0FPWZZ-E"
                      }
                    ],
                    "entity_id": "003JLG-E"
                  }
                ]
              }
            ],
            "entity_id": "0029WX-E"
          }
        ],
        "entity_id": "05QXVL-E"
      }
    ]

This is the path from the child to its ultimate parent. What I need is the following:

"ultimate_parent": [
  {
    "entity_id": "0FPWZZ-E"
  }
]

Please let me know if you have any more questions.


(Michel Conrado) #10

Sorry, that isn’t possible.

What you can do is either loop through the nested array levels on the client side, or do something like the query below and take the last value from the list. There is no guarantee that the order will always be the same, although the @normalize directive usually keeps the same order as the nested format of the recurse query.

{
  path_to_ultimate_parent(func: eq(entity_id, "05QXVL-E")) @recurse @normalize {
    expand(_all_) { expand(_all_) }
  }
}

PS. @normalize only works with “expand(_all_)” and with aliases.

Result

{
  "data": {
    "path_to_ultimate_parent": [
      {
        "entity_id": [
          "05QXVL-E",
          "0029WX-E",
          "003JLG-E",
          "0FPWZZ-E"
        ],
        "uid": "0xf9063"
      }
    ]
  }
}

PS. That’s the only way in my view. The other approaches can’t guarantee that the order will always be the same.
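
The client-side option could look roughly like the Python sketch below. It assumes the nested response shape shown in post #9 and that every chain appears fully nested in the response; `leaf_entity_ids` and `ultimate_parents_of` are made-up helper names, not part of any Dgraph client.

```python
EDGE = "isSubsidiaryOfFilermstPriorityNoOverlap"

# Response copied from post #9 (one level has no entity_id).
response = {
    "path_to_ultimate_parent": [
        {
            EDGE: [
                {
                    EDGE: [
                        {
                            EDGE: [
                                {
                                    EDGE: [{"entity_id": "0FPWZZ-E"}],
                                    "entity_id": "003JLG-E",
                                }
                            ]
                        }
                    ],
                    "entity_id": "0029WX-E",
                }
            ],
            "entity_id": "05QXVL-E",
        }
    ]
}


def leaf_entity_ids(node, edge=EDGE):
    """Follow the parent edge down the nesting; return the ids at the end."""
    parents = node.get(edge, [])
    if not parents:
        # No further parent edge: this node ends the chain.
        return [node.get("entity_id")]
    found = []
    for parent in parents:
        found.extend(leaf_entity_ids(parent, edge))
    return found


def ultimate_parents_of(resp, block="path_to_ultimate_parent"):
    """Collect the chain ends for every root returned in the block."""
    result = []
    for root in resp.get(block, []):
        result.extend(leaf_entity_ids(root))
    return result


print(ultimate_parents_of(response))  # ['0FPWZZ-E']
```

Because it walks the nesting itself, this does not depend on the ordering of a flattened list.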


(Himanshu Barthwal) #11

@MichelDiz, I think I have a query that does find out the ultimate parent.

{
  uid_for_ultimate_parent(func: eq(entity_id, "05QXVL-E")) @recurse @normalize {
    isSubsidiaryOfFilermstPriorityNoOverlap
    uid : entity_id
  }
}

Gives me the following result:

{
    "uid_for_ultimate_parent": [
      {
        "uid": "0FPWZZ-E"
      }
    ]
}

I have tried with a few other children and it seems to work. I have no clue why though. Any ideas?


(Michel Conrado) #12

This is weird. Sounds like a nice bug feature tho. Nice finding!

It seems to me that by using uid as the alias, you have somehow forced the value of entity_id into that field. The rule is that the UID must be unique per block, and maybe this ran up against that premise. You have relabeled entity_id as uid, so it was renamed to uid in every block. Then, when it came time to apply @normalize, Dgraph overwrote the ‘uid’ value block by block until it reached the final nested block.

That’s just a theory. If you use uid : uid you get the same behavior, but only the UID from the last nested block (the desired one) comes back. That’s nice!

Try this query:

{
  uid_for_ultimate_parent(func: eq(entity_id, "05QXVL-E")) @recurse @normalize {
    isSubsidiaryOfFilermstPriorityNoOverlap
    FG as uid : uid
  }

  finalQ(func: uid(FG)) {
    uid
    entity_id
  }
}

If this were indeed valid behavior in Dgraph, the second block would return only a single node. Instead, this is the result:

{
  "data": {
    "uid_for_ultimate_parent": [
      {
        "uid": "0xf9063"
      }
    ],
    "finalQ": [
      {
        "uid": "0xf9061"
      },
      {
        "uid": "0xf9062",
        "entity_id": "003JLG-E"
      },
      {
        "uid": "0xf9063",
        "entity_id": "0FPWZZ-E"
      },
      {
        "uid": "0xf9064",
        "entity_id": "05QXVL-E"
      },
      {
        "uid": "0xf9065",
        "entity_id": "0029WX-E"
      }
    ]
  }
}

(Himanshu Barthwal) #13

@MichelDiz I executed the following query:

{
  ultimate_parent_uids(func: eq(entity_id, "0847GF-E")) @recurse @normalize {
     isSubsidiaryOfFilermstPriorityNoOverlap
     parent_uid as uid : uid 
  }
  ultimate_parents(func:uid(parent_uid)) {
    uid
    entity_id
  }
}

And got the following result:

{
  "ultimate_parent_uids": [
    {
      "uid": "0x1b32457"
    }
  ],
  "ultimate_parents": [
    {
      "uid": "0x1b32457",
      "entity_id": "05L5R1-E"
    },
    {
      "uid": "0x23a18bd",
      "entity_id": "0847GF-E"
    }
  ]
}

I assume you are not surprised by the second result, but I surely am. Is there any workaround that lets me
operate on the ultimate parent uids, instead of the flattened path, in the second block?
If not, my app will have to do this in two steps (query the ultimate parents, then filter them on some predicate in a second dgraph request), which does not seem ideal. Do you think this could be an acceptable feature request?
Thanks for your help so far.


(Michel Conrado) #14

I’m not, because the variable only maps UIDs, and you can’t pass a value produced by @normalize as a UID map. So parent_uid will always yield a map of UIDs, not values, unless you wanted a value variable (but that would not work with the uid(X) function).

I think giving the @recurse directive a new feature would be good: for the “ultimate parent” case, returning the last node (or the last several nodes) of a recurse search. It would be nice to do this without tricks, but I’m not sure how useful it would be for others. Knowing that would be important before allocating a core dev to it. Today they are almost 99% working on critical things for next release, so features are in last place of importance right now, unless it’s easy to implement and people need it.

One more thing

I think, depending on your context, it would be worth adding a “tag” to identify which node is the “ultimate_parent”.

The idea would be to mark the “ultimate_parent” with a Boolean predicate. That way we can ensure that only it will return from a recursive query. It could even be a Facet.

e.g.:


{
  "entity_id": "0FPWZZ-E",
  "ultimate": "true"
}

And

{
  var(func: eq(entity_id, "05QXVL-E")) @recurse(loop:false) {
    GetIT as uid
    isSubsidiaryOfFilermstPriorityNoOverlap @filter(eq(ultimate, "true"))
    entity_id
  }

  ultimate_parent(func: uid(GetIT)) @filter(Not eq(entity_id, "05QXVL-E")) {
    uid
    expand(_all_)
  }
}

(Himanshu Barthwal) #15

Today they are almost 99% working on critical things for next release…

Makes sense.

The idea would be to mark the “ultimate_parent” with a Boolean predicate…

I am guessing that what you are suggesting here is to generate mutations, based on the result set of the ultimate parent uids query, that mark those nodes as is_ultimate_parent. Then the queries you suggested can use that new predicate to filter the nodes. Correct?


(Michel Conrado) #16

More or less that.

You could create a query that checks whether a node is the last of the chain or not. Maybe a bot or a background task hunting for nodes with this characteristic. If the node has no parent of its own, only children, it is indeed the ultimate parent at that moment. So you mutate it as "ultimate": "true".

You could use the upsert procedure, and soon we will have the new transaction (#3059). With the new txn this operation would be simpler.
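
A rough Python sketch of what such a background task might compute is below. It assumes the task has already fetched nodes as dicts containing uid and, where present, the parent edge; that input shape, the `build_ultimate_mutation` helper, and the sample uids (taken from the results earlier in the thread) are assumptions for illustration. Sending the payload to Dgraph is left out.

```python
import json

EDGE = "isSubsidiaryOfFilermstPriorityNoOverlap"


def build_ultimate_mutation(nodes, edge=EDGE):
    """Return set-mutation objects for nodes that have no parent edge."""
    return [
        {"uid": node["uid"], "ultimate": "true"}
        for node in nodes
        if not node.get(edge)  # no outgoing parent edge: end of the chain
    ]


# Hypothetical query result: one top-level node and one subsidiary.
nodes = [
    {"uid": "0xf9063", "entity_id": "0FPWZZ-E"},
    {"uid": "0xf9062", "entity_id": "003JLG-E", EDGE: [{"uid": "0xf9063"}]},
]

# JSON mutation body marking only the node without a parent.
payload = json.dumps({"set": build_ultimate_mutation(nodes)})
print(payload)
```

The tag is only valid at that moment: if a marked node later gains a parent, the task would also need to clear its "ultimate" value.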