Need help with dgraph query

Sai · September 30, 2019, 12:41pm

Hi ,

I would like to know how to write a query which helps me in getting the number of types of items bought by each customer. I would like to know this for and every customer.

schema

Customer

customer_name: String
did_order : [uid] (order)

Order

order.customer : uid (customer)
order.item : uid (item)
order.date

Item

item.name // examples :- apple or orange or pen …

each customer can place any number of orders, each order can have any number of items in it. There won’t be any duplicate items in each order and no facets or so to mention the quantity of the item purchased. Please assume the quantity to be one always. Finally, I would like to count the number of Distinct types of items bought by every customer. i.e :- please start the root with has(customer.name) instead of eq(customer.name,xxx).

P.S :- this is a scale-down of our use case which has abt 6 different types of entities and such. Please ignore any non-ideal schema declaration practices if any.

AugustHell · September 30, 2019, 5:28pm

Basic example:

{
	q(func: has(customer_name)) {
	customer_name
	did_order {
		order.date
		order.item {
			item.name
		}
	}
}

Do the tutorial to get a feeling of what you can do with dgraph

Sai · October 1, 2019, 7:04am

Hi August,

Thank you for the response. I have completed the tutorial(in fact many times already: sweat_smile:). My query is to get the “number of distinct items purchased by each customer”. The query posted by you gives me the list of all items purchased by each customer. But it is missing the aggregation(count) part and distinct part. To help you in visualizing my requirement better, I will give an example.

consider 2 customers who placed 2 orders each. lets say that customer A ordered item X and item Y in first order and item Y and item Z in other order, then the distinct items ordered by him are:- item X, item Y, and item Z and the count is 3.

Let’s say that customer B bought item Z in both of his orders, then the number of distinct items ordered by him is 1.
final response required is:-
A : 3
B: 1

for 1 customer I can do this

{
	var(func: eq(customer_name,"A")) {
	customer_name
	did_order {
		order.date
		item_list as order.item
	}
       item_count(func: uid(item_list)){
         final_count : count(uid)
       }
}

I want to know how to do the same when I’m starting with has(customer.name) .i.e when i want to do it for everyone at once.

now, drifting away from layman terms. the challenge I’m facing is " when there are multiple nodes at the root level, I would like to be able to store query variables (uid_lists) whose scope is local. i.e separate uid_list for each node at root level "

AugustHell · October 2, 2019, 7:11am

Ah I see, yes thats beyond what the tutorial is covering. Sorry I misunderstood your question, it looked trivial. To do what you want, you need some kind of mapping of uids to aggregation.
While it’s easy to gather the informations for such a mapping with eg:

{
  v(func: has(customer_name)) {
    cname: customer_name
    did_order {
      order.item {
        itemuid: uid
      }
    }
  }
}

… I don’t know how to do the mapping with a query and would do that with the result in a program.

Maybe that can be done anyhow with @groupby?

Sai · October 3, 2019, 6:03am

Hi August hell,
I tried with group by too. but no luck till now.

 @MichelDiz.  Please help me with this.

MichelDiz · October 4, 2019, 7:15pm

Hi, I’m not sure if I get it. A sample (JSON or RDF) and also an idea of response would be nice to corroborate with what is being said.

however, I think you eventually will use aggregations tho, so:

{
	allCustomers(func: type(Customer)) {
    customer_name
    did_order {
      order.date
      item_list as count(order.item)
    }
    total_items_by_Customer : sum(val(item_list))
    }

    total(){
      final_sum : sum(val(item_list))
      }
}

Response

{
  "data": {
    "allCustomers": [...],
    "total": [
      {
        "final_sum": 10++
      }
    ]
  },
  "extensions": {
    "server_latency": {
      "parsing_ns": 23847,
      "processing_ns": 2416875,
      "encoding_ns": 4990,
      "assign_timestamp_ns": 453883
    },
    "txn": {
      "start_ts": 30304
    }
  }
}

https://docs.dgraph.io/query-language/#aggregation

AugustHell · October 8, 2019, 11:25pm

So customer ordered:
Order 1: Item X and Item Y
Order 2: Item Y and Item Z

I’m not sure Michel, will your aggregation

count the 3 unique items X, Y, Z which it should
or 2+2=4 as the number of items per order?

Thats why I think, there is a group needed to have disinct items of all orders of one customer to get the result Sai is looking for.

Sai · October 9, 2019, 11:59am

Hi @MichelDiz,

While computing total_items_by_Customer, Your query will count the total number of items ordered by each customer right. As @AugustHell said, i want to “count the number of distinct items bought by each customer”. In the example specified by me, the items ordered by the customer A are X, Y, Y, Z . So, the distinct items ordered by him are 3 (X,Y,Z).

Sai · October 9, 2019, 1:01pm

This is the schema and sample rdf

type Customer {
  customer_name: string
  did_order: [Order]
}

type Order {
  order.customer: Customer
  order.item: [Item]
  order.date: string
}

type Item {
    item.name: string
}


item.name: string @index(exact) .
customer_name: string @index(term) .
did_order: [uid] .
order.customer: uid .
order.item [uid] .
order.date: string .

Sample data

{
  set {
    _:alice <name> "Alice" .
    _:alice <dgraph.type> "Customer" .
    _:alice <did_order> _:order1 .
    _:alice <did_order> _:order2 .


    _:bob <name> "bob" .
    _:bob <dgraph.type> "Customer" .
    _:bob <did_order> _:order3 .
    _:bob <did_order> _:order4 .


    _:order1 <dgraph.type> "Order" .
    _:order1 <order.customer> _:alice .
    _:order1 <order.date> "2019-10-09" .
    _:order1 <order.item> _:x .
    _:order1 <order.item> _:y .

    _:order2 <dgraph.type> "Order" .
    _:order2 <order.customer> _:alice .
    _:order2 <order.date> "2019-10-09" .
    _:order2 <order.item> _:y .
    _:order2 <order.item> _:z .

    _:order3 <dgraph.type> "Order" .
    _:order3 <order.customer> _:bob .
    _:order3 <order.date> "2019-10-09" .
    _:order3 <order.item> _:z .

    _:order4 <dgraph.type> "Order" .
    _:order4 <order.customer> _:bob .
    _:order4 <order.date> "2019-10-09" .
    _:order4 <order.item> _:z .

    _:x <dgraph.type> "Item" .
    _:y <dgraph.type> "Item" .
    _:z <dgraph.type> "Item" .




    _:x <item.name> "pen" .
    _:y <item.name> "pencil" .
    _:z <item.name> "quill" .

  }
}

The expected result is
{
Alice : 3
Bob : 1
}

explaination :-

Alice ordered items :- pencil,pen,quill
bob ordered item :- quill

Take away for me from this example :-
some way to create and use variable whose scope is limited i.e query variable(not value variable) whose scope is limited to each customer.

MichelDiz · October 9, 2019, 8:24pm

As far I can tell this wouldn’t be possible. Cuz they are multiple orders. We can count, but not “count unique”. However we can do this if we define a new block for each Customer. And that would have to be done manually. As you did in your first example.

Using the new @normalize behavior (coming soon), we can do something close to what you want. But it doesn’t take out the repeated ones.

{
  q(func:  type(Customer)) {
   customer_name : name
    did_order @normalize {
      order.item  {
       count : count(uid)
       item.name : item.name
      }
    }
  }
}

It returns

{
  "data": {
    "q": [
      {
        "customer_name": "Alice",
        "did_order": [
          {
            "item.name": "pencil"
          },
          {
            "item.name": "quill"
          },
          {
            "count": 2
          },
          {
            "item.name": "pen"
          },
          {
            "item.name": "pencil"
          },
          {
            "count": 2
          }
        ]
      },
      {
        "customer_name": "bob",
        "did_order": [
          {
            "item.name": "quill"
          },
          {
            "count": 1
          },
          {
            "item.name": "quill"
          },
          {
            "count": 1
          }
        ]
  }
}

You can just “remove” the repeated ones manually.

@animesh2049 what you think about this? We could add a new feature like “@mergenormalized”

{
  q(func:  type(Customer)) {
   customer_name : name
    did_order @mergenormalized(item.name) {      # "Merge by name"
      order.item  {
       count : count(uid)
       item.name : item.name
      }
    }
  }
}

Tha would return

{
  "data": {
    "q": [
      {
        "customer_name": "Alice",
        "did_order": [
          {
            "item.name": "pencil"
          },
          {
            "item.name": "quill"
          }
          {
            "item.name": "pen"
          },
          {
            "count": 3
          }
        ]
      },
      {
        "customer_name": "bob",
        "did_order": [
          {
            "item.name": "quill"
          }
          {
            "count": 1
          }
        ]
      }
    ]
  }
}

Topic		Replies	Views
Need help with dgraph query ( Need to know if foreach or any loops will be introduced in dgraph) Dgraph kind:question	6	495	November 19, 2020
To count products and groupby Dgraph kind:question , dgraph	10	414	May 21, 2021
How to write a query simulating a very primitive recommendation system? Dgraph kind:question	2	278	June 21, 2021
DQL and Graph QL for the below Query Dgraph kind:question	14	377	September 13, 2023
Unique count for each node of the same type App Development kind:question	0	690	December 10, 2021

Need help with dgraph query

Response

It returns

Related topics