Filtering by comparing branches of a query


#1

Hi, I’m wondering whether it is possible to filter by comparing branches of a query.

As an example, if I have a database of sensors that each collect two measurements at sporadic times and a schema where I have sensors that each have a list of edges for each measurement from one detector (measurement1) and another list of edges for each measurement from the other (measurement2). Each measurement has an edge to a dateTime (ts). Can I filter for sensors where there exists a measurement2 that occurred after a measurement1 with a value above a certain value (e.g. 50)?

I imagine that I could do this if I could filter at the root based on arbitrary elements nested in the query, but I couldn’t find documentation indicating that this is possible. If I were to imagine such a query in (invalid) graphql± it might look like the following. Is there another way to accomplish this with valid graphql±?

{
  me(func: type(Sensor)) @filter(exists(gt(meas2.ts, meas1.signal.ts))) {
    sensor.id
    primary.signal: measurement1 @filter(gt(value, 50)) (orderasc: ts, first: 1) {
      meas1.signal.ts: ts
      value
    }
    measurement1 {
      ts
      value
    }
    measurement2 {
      meas2.ts: ts
      value
    }
  }
}

(Michel Conrado) #2

If you have two distinct graph structures you can compare them by using a query block for each. Using multiple blocks, variables and so on. When you say “branches of a query” you mean distinct graph structures right?

Each database? Dgraph doesn’t have the concept of multiple database.

Yes, if you have a datetime edge you can use it to filter a target structure by values from another structure. As I’ve mentioned before.

I still didn’t get all of you meant. BTW you can’t mix structures in a single query block. Your example feels like is a single query trying to catch and compare two distinct and unrelated structures.

Maybe if you give dummy samples (JSON structures and desired results) would help to visualize what you mean.


#3

Thanks for the reply Michel!

Sorry for the confusing question. My question is about a single graph in a single database, but I want to compare entities within the body of the query at different levels. For example, in the JSON below there are 2 sensors, one in New York and another in Anchorage, each is collecting temperature and humidity. I want to identify the sensors where there exists a temperature reading above 31 (which both satisfy) and any humidity measurement that occurs after the first temperature reading above 31 (only the New York sensor satisfies). This requires comparing temperature measurements to humidity measurements to filter which sensor to return and I was not sure this would be possible in graphql. This is a common query pattern for my situation.

Here is an example of the JSON structure and the desired results:

JSON:

[
  {
    "uid": "0x1",
    "type": "Sensor",
    "location": "New York",
    "temperature_measurement": [
      {
        "uid": "0x2",
        "type": "Temperature",
        "ts": "2019-07-30T16:18:09.877394",
        "value": 32
      },
      {
        "uid": "0x3",
        "type": "Temperature",
        "ts": "2019-07-30T16:20:12.287947",
        "value": 51
      }
    ],
    "humidity_measurement": [
      {
        "uid": "0x4",
        "type": "Humidity",
        "ts": "2019-07-30T16:27:28.516389",
        "value": 60
      },
      {
        "uid": "0x5",
        "type": "Humidity",
        "ts": "2019-07-30T16:27:41.429091",
        "value": 70
      }
    ]
  },
  {
    "uid": "0x6",
    "type": "Sensor",
    "location": "Anchorage",
    "temperature_measurement": [
      {
        "uid": "0x7",
        "type": "Temperature",
        "ts": "2019-07-30T16:29:05.953131",
        "value": 10
      },
      {
        "uid": "0x8",
        "type": "Temperature",
        "ts": "2019-07-30T16:29:20.366124",
        "value": 33
      }
    ],
    "humidity_measurement": [
      {
        "uid": "0x9",
        "type": "Humidity",
        "ts": "2019-07-30T16:28:34.131180",
        "value": 20
      },
      {
        "uid": "0xa",
        "type": "Humidity",
        "ts": "2019-07-30T16:28:52.505992",
        "value": 30
      }
    ]
  }
]

Desired result from query is just the first element in the JSON list above:

[
  {
    "uid": "0x1",
    "type": "Sensor",
    "location": "New York",
    "temperature_measurement": [
      {
        "uid": "0x2",
        "type": "Temperature",
        "ts": "2019-07-30T16:18:09.877394",
        "value": 32
      },
      {
        "uid": "0x3",
        "type": "Temperature",
        "ts": "2019-07-30T16:20:12.287947",
        "value": 51
      }
    ],
    "humidity_measurement": [
      {
        "uid": "0x4",
        "type": "Humidity",
        "ts": "2019-07-30T16:27:28.516389",
        "value": 60
      },
      {
        "uid": "0x5",
        "type": "Humidity",
        "ts": "2019-07-30T16:27:41.429091",
        "value": 70
      }
    ]
  }
]

(Michel Conrado) #4

The query would be something like below. But the problem is, Dgraph only index by hour, day, month and year. From your sample the difference of time is seconds or minutes. There’s a way to do it but is a way complex query tho.

#These two var blocks are there just to catch the datetime value
#BTW this query is incomplete. It's just an idea of "how to start".
{
  G as var(func: eq(type, Sensor))
    {
    temperature_measurement @filter(gt(value, 31)){
     d as uid
    }
  }

    var(func: uid(d), first:1) {
      dF as ts
  }

   me(func: uid(G)){
    uid
    type
    location
    temperature_measurement {
      expand(_all_)
    }
    humidity_measurement @filter(gt( ts, val(dF) )) 
      {
      expand(_all_)
    }
  }
}

(Michel Conrado) #5

It would be something like this query


#6

Hi Michel,

Thanks!

My concern was that dF would be the first across all sensors, whereas I am interest in having one dF for each sensor that meets the criterion (temperature measurement over 31). In other words, one sensor might have their first temperature over 31 yesterday and another might have their first temperature over 31 2 years ago. I want to make sure that in the “me” query that val(dF) produces the correct value (yesterday vs 2 years ago) depending on the uid(G).

Would something like this be valid? Does the cascade make sense there?

{
  G as var(func: eq(type, Sensor))
    {
    temperature_measurement @filter(gt(value, 31)){
     tsvals as ts
    }
    minTS as min( val( tsvals ) )
  }

   me(func: uid(G)) @cascade {
    uid
    type
    location
    temperature_measurement {
      expand(_all_)
    }
    humidity_measurement @filter(gt( ts, val(minTS) )) 
      {
      expand(_all_)
    }
  }
}

(Michel Conrado) #7

No need for @cascade, so as it seems you don’t need a smaller time indexing, so that’s the whole query below.

Schema

<humidity_measurement>: uid @reverse .
<location>: string @index(exact) .
<temperature_measurement>: uid @reverse .
<ts>: datetime @index(hour) .
<type>: string @index(exact) .
<value>: int @index(int) .

Mutation

{ "set": [
  {
    "uid": "0x1",
    "type": "Sensor",
    "location": "New York",
    "temperature_measurement": [
      {
        "uid": "0x2",
        "type": "Temperature",
        "ts": "2019-07-15T16:18:09.877394",
        "value": 32
      },
      {
        "uid": "0x3",
        "type": "Temperature",
        "ts": "2019-06-20T16:20:12.287947",
        "value": 51
      }
    ],
    "humidity_measurement": [
      {
        "uid": "0x4",
        "type": "Humidity",
        "ts": "2019-07-21T16:27:28.516389",
        "value": 60
      },
      {
        "uid": "0x5",
        "type": "Humidity",
        "ts": "2019-06-20T16:27:41.429091",
        "value": 70
      }
    ]
  },
  {
    "uid": "0x6",
    "type": "Sensor",
    "location": "Anchorage",
    "temperature_measurement": [
      {
        "uid": "0x7",
        "type": "Temperature",
        "ts": "2019-06-20T16:29:05.953131",
        "value": 10
      },
      {
        "uid": "0x8",
        "type": "Temperature",
        "ts": "2019-06-20T16:29:20.366124",
        "value": 33
      }
    ],
    "humidity_measurement": [
      {
        "uid": "0x9",
        "type": "Humidity",
        "ts": "2019-06-20T16:28:34.131180",
        "value": 20
      },
      {
        "uid": "0xa",
        "type": "Humidity",
        "ts": "2019-06-20T16:28:52.505992",
        "value": 30
      }
    ]
  }
]}

Query

{
  G as var(func: eq(type, Sensor))
#This Block is
    #"I want to identify the sensors where there exists a temperature reading above 31 (which both satisfy) "
    {
    temperature_measurement @filter(gt(value, 31)){
     d as uid
    }
  }
#This Block is
    #I need to catch the time from the "humidity measurement that occurs after the first temperature reading above 31" so I need the "2019-07-15T16:18:09.877394" temp to use in the next block.
    var(func: uid(d), first:1, orderdesc: ts) {
      dF as ts
  }

      
#In this block we gonna filter the humidity by the "first temperature above 31"
   var(func: uid(G)){
    humidity_measurement @filter(gt( ts, val(dF) )) 
      {
      ~humidity_measurement {
        final as uid
     #   location
      }
    }
  }
      
#Now is the final query block. In above blocks we filtered as you needed so now we apply the results.

   myQuery(func: uid(final)){
    uid
    type
    location
    temperature_measurement {
      uid
      type
      ts
      value
    }
    humidity_measurement 
    # @filter(gt( ts, val(dF) )) #use this filter if needed
      {
      uid
      type
      ts
      value
    }
  }
}

Result

{
  "extensions": {
    "server_latency": {
      "parsing_ns": 20800,
      "processing_ns": 9674100,
      "encoding_ns": 706600
    },
    "txn": {
      "start_ts": 247
    }
  },
  "data": {
    "myQuery": [
      {
        "uid": "0x1",
        "type": "Sensor",
        "location": "New York",
        "temperature_measurement": [
          {
            "uid": "0x2",
            "type": "Temperature",
            "ts": "2019-07-15T16:18:09.877394Z",
            "value": 32
          },
          {
            "uid": "0x3",
            "type": "Temperature",
            "ts": "2019-06-20T16:20:12.287947Z",
            "value": 51
          }
        ],
        "humidity_measurement": [
          {
            "uid": "0x4",
            "type": "Humidity",
            "ts": "2019-07-21T16:27:28.516389Z",
            "value": 60
          },
          {
            "uid": "0x5",
            "type": "Humidity",
            "ts": "2019-06-20T16:27:41.429091Z",
            "value": 70
          }
        ]
      }
    ]
  }
}

#8

Thanks!

I will explore this based on your query. I do need proper sorting and filtering by dateTime also, but I’ll look into that afterwards.


(system) closed #9

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.