V21.03: After pagination+cascade change, queries are too slow to finish

@anand To boil down the solution you are suggesting:

  • run the query without @cascade so pagination takes effect, save the results in uid variables
  • run the query again with @cascade and use the uid vars as the filters

Note that my queries are very deep on average, so it would come out to something like this:

query q($start: int, $end: int, $limit: int = 0) {
  var_a as var(func: eq(<qa.type>, "Device"), first: $limit) {
    var_b as <qa.has_object> @filter(eq(<qa.type>, "Object"))  (first: $limit) {
      var_c as <qa.has_indicator> @filter(eq(<qa.type>, "Indicator")) (first: $limit)
    }
  }
  a(func: uid(var_a), first: $limit) @cascade(<qa.has_timerange>,<qa.has_object>,<qa.has_indicator>) {
    ...commonFields
    b: <qa.has_object> @filter(uid(var_b)) (first: $limit) {
      ...commonFields
      c: <qa.has_indicator> @filter(uid(var_c))  (first: $limit) {
        ...commonFields
      }
    }
  }
}
fragment commonFields {
  <qa.name>
  uid
  timeranges: <qa.has_timerange> @filter(
    le(<qa.timerange_start>, $end) AND ge(<qa.timerange_end>, $start)
  ){
    start: <qa.timerange_start>
    end: <qa.timerange_end>
  }
}

Which… I guess puts us back into the semantics of @cascade we had in v20.11. Execution time seems pretty slow at over 2s, though I do not have the execution time on v20.11 to compare against. The debug metrics returned do show a better number of UIDs hit: "_total": 6499. Sure, I could over-page the var block and tighten up the paging in the second one, but that will obviously make performance worse.

I do not know why this would be slower than the same dataset on v20.11 (again, that's subjective, since I do not have v20.11 numbers), so I am not putting too much weight on that.
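
To make the paging trade-off concrete, here is a toy Python sketch of the two orderings. It is not how Dgraph implements either one: the data, the uids, and the function names are all made up for illustration, and it assumes pagination simply truncates the candidate list in insertion order.

```python
# Toy model: each parent uid maps to its child uids. An empty list means
# @cascade would drop that parent. All data here is hypothetical.
PARENTS = {
    "0x1": [],
    "0x2": ["0xa"],
    "0x3": [],
    "0x4": ["0xb", "0xc"],
    "0x5": ["0xd"],
}

def cascade(uids):
    """Keep only parents that have at least one child (the @cascade effect)."""
    return [u for u in uids if PARENTS[u]]

def paginate_then_cascade(first, over_page=1):
    """Two-block workaround: page in the var block, then cascade over those uids."""
    candidates = list(PARENTS)[: first * over_page]  # var block with first:
    return cascade(candidates)[:first]               # second block: @cascade + first:

def cascade_then_paginate(first):
    """Cascade over the whole candidate set, then page the survivors."""
    return cascade(list(PARENTS))[:first]

print(paginate_then_cascade(2))               # ['0x2']
print(paginate_then_cascade(2, over_page=2))  # ['0x2', '0x4']
print(cascade_then_paginate(2))               # ['0x2', '0x4']
```

Paging first touches fewer nodes but can return a short page. Over-paging the first step fills the page again, at the cost of visiting more candidates, which is the performance concern above.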

Is this going to be the only solution for @cascade going forward, or have you as a team discussed a way to optimize @cascade in a future release?
