Divide aggregate sums by further edge

Hello!

I’ve been trying and trying to do something which makes perfect intuitive sense, but clearly my understanding of how to handle variables in dgraph is lacking. I’ll present as simple an example (with sample data) as I can to explain what my issue is.

Imagine I’m storing a series of transactions, which each have an amount, point to a location node (England, France or Spain), and also to payment type node (Cash or Card).

Here’s a simple set command to have some dummy data in play (if you use this, add a reverse index on “location” and “paymentType”):

{
  "set": [{
    "uid": "_:england",
    "locationName": "England"
  }, {
    "uid": "_:france",
    "locationName": "France"
  }, {
    "uid": "_:spain",
    "locationName": "Spain"
  }, {
    "uid": "_:cash",
    "paymentTypeName": "cash"
  }, {
    "uid": "_:card",
    "paymentTypeName": "card"
  }, {
    "amount": 200,
    "location": {
      "uid": "_:france"
    },
    "paymentType": {
    	"uid": "_:cash"
  	}
  }, {
    "amount": 150,
    "location": {
      "uid": "_:england"
    },
    "paymentType": {
    	"uid": "_:cash"
  	}
  }, {
    "amount": 400,
    "location": {
      "uid": "_:spain"
    },
    "paymentType": {
    	"uid": "_:card"
  	}
  }, {
    "amount": 220,
    "location": {
      "uid": "_:france"
    },
    "paymentType": {
    	"uid": "_:card"
  	}
  }, {
    "amount": 120,
    "location": {
      "uid": "_:spain"
    },
    "paymentType": {
    	"uid": "_:cash"
  	}
  }, {
    "amount":460,
    "location": {
      "uid": "_:spain"
    },
    "paymentType": {
    	"uid": "_:cash"
  	}
  }, {
    "amount": 310,
    "location": {
      "uid": "_:england"
    },
    "paymentType": {
    	"uid": "_:card"
  	}
  }, {
    "amount": 440,
    "location": {
      "uid": "_:england"
    },
    "paymentType": {
    	"uid": "_:cash"
  	}
  }, {
    "amount": 250,
    "location": {
      "uid": "_:france"
    },
    "paymentType": {
    	"uid": "_:cash"
  	}
  }, {
    "amount": 560,
    "location": {
      "uid": "_:france"
    },
    "paymentType": {
    	"uid": "_:cash"
  	}
  }, {
    "amount": 180,
    "location": {
      "uid": "_:spain"
    },
    "paymentType": {
    	"uid": "_:cash"
  	}
  }, {
    "amount": 320,
    "location": {
      "uid": "_:spain"
    },
    "paymentType": {
    	"uid": "_:card"
  	}
  }, {
    "amount": 260,
    "location": {
      "uid": "_:england"
    },
    "paymentType": {
    	"uid": "_:card"
  	}
  }]
}

So now we have a bunch of payments each associated with one locale and one payment type. Now I can get the totals by payment type or the totals by location like so:

{
  paymentTypesSearch as var(func: has(paymentTypeName)) {
    ~paymentType {
    	payment as amount
    }
  	paymentsTotal as sum(val(payment))
  }
  
  paymentsByType(func: uid(paymentTypesSearch)) {
    paymentTypeName
    total: val(paymentsTotal)
  }
  
  locationsSearch as var(func: has(locationName)) {
    ~location {
    	payment2 as amount
  	}
  	paymentsTotal2 as sum(val(payment2))
  }
  
  paymentsByLocation(func: uid(locationsSearch)) {
    locationName
    total: val(paymentsTotal2)
  } 
}

With this I get the results:

"data": {
    "paymentsByType": [
      {
        "paymentTypeName": "card",
        "total": 1510
      },
      {
        "paymentTypeName": "cash",
        "total": 2360
      }
    ],
    "paymentsByLocation": [
      {
        "locationName": "Spain",
        "total": 1480
      },
      {
        "locationName": "England",
        "total": 1160
      },
      {
        "locationName": "France",
        "total": 1230
      }
    ]
  }

Which is great. But how can I combine the two? How can I separate by both location and payment type? I’m sure I’m missing something basic here, but I simply can’t figure it out. So I mean something which provides the following output:

"paymentsByLocaleAndType": [
  {
    "locationName": "Spain"
    "total": 1480
    "byType":  [
      {
         "paymentTypeName": "card",
         "total": 720
      },
      {
         "paymentTypeName": "cash",
         "total": 760
      }
    ]
  },

  // ETC.
]

I’ve tried all kinds of things, like this:

{
  test(func: has(locationName)) {
    ~location {
        payment as amount
        paymentType {
          name
          total: sum(val(payment))
      }
    }
  }
}

But of course I get a Invalid variable aggregation. Check the levels.

I’m sure there’s a trick I’m missing. Can anyone help?

Regards

Michael

Hi @Awoogamuffin,

Welcome to Dgraph Community !!

Did you try using groupby to aggregate queries ?

I am able to get total payments by country and paymentType using the following query:

{
  getPaymentByCardAndCountry(func: has(locationName)) {
    locationName
    ~location @groupby(paymentType) {
    	payment as sum(amount)
    }
  	val(payment)
  } 
}

Although, with the limitation that groupby block cannot contain anything other than aggregation variables, the result is not completely same as expected. I do believe however that this does solve your problem up to certain extent.

Do let us know if this solves your problem.

1 Like

That’s great! Thanks!

I’d seen groupby before but it’s only when I have a real-life need for these things that I fully absorb them.

Though I can do this relatively easily, I don’t suppose there’s a way to get the paymentType name in the groupBy block, is there? I can just have a block with the paymentTypes and their uids then stick them together later, so it’s not a huge deal.

On a further note, imagine if the payments were separated into groups (maybe specific restaurants or towns etc.) how would I handle groupby in that case? Seeing as I can only use aggregate functions in a groupby block, how would I get from England → Bournemouth → payments (sort by payment type), but have the payment types grouped for all of England and not concern myself with the towns? Does that make sense?