V1.0.12 slower for some queries

(Kaveh) #1


I have a setup that lets me switch between Dgraph versions easily. After the v1.0.12 release, I compared my queries to check whether I can switch to v1.0.12 in production, and I found one complex query that runs consistently slower in v1.0.12 than in v1.0.11.

It is my longest query and may need to be adapted for the new version:

    var(func: eq(serial, "MY-PRODUCT")) {
        PRODUCTSELF as uid
        FEATURES as has_feature @filter(eq(code,["COLOR"])  )
        garment_of {
            GENDER as gender
            type_of {
                TYPE as uid
        DESC as described_as

    prod(func: eq(locale,"CA::en")) @cascade {
        products @filter(
                eq(is_published, true)
                and eq(has_thumbnail,true)
                and eq(has_productURL,true)
                and not uid(PRODUCTSELF)
                and (eq(described_as, "") or not eq(described_as, val(DESC)))
                @facets(eq(is_in_stock,true) and gt(updated_at,'$(expr $(date +%s) - 172800)'))
            garment_of @filter( eq(gender,"u") or eq(gender,val(GENDER)) ) {
                type_of @filter(uid(TYPE)) {

            has_feature @filter(uid(FEATURES)) {

The schema I use is:

product_of : uid @reverse @count .
type_of : uid @reverse @count .
garment_of : uid @reverse @count .
serial : string @count @index(exact) .
described_as : string @count @index(term) .
is_published : bool @count @index(bool) .
has_thumbnail : bool @count @index(bool) .
has_productURL : bool @count @index(bool) .
updated_at : dateTime @index(day) .

shop_id : string @count @index(exact) .
shop_name : string @count @index(exact) .
prefix : string @count @index(exact) .

type_id : string @count @index(exact) .
is_generic : bool @count @index(bool) .

garment_id : string @count @index(exact) .
gender : string @count @index(exact) .

article_size_id : string @count @index(exact) .
products : uid @reverse @count .
shop_country : string @count @index(exact) .
language : string @count @index(exact) .
locale: string @count @index(exact) .
is_in_stock: bool @count @index(bool) .

code: string @count @index(exact) .
value: string @count @index(exact) .

All predicates are indexed, and I assume the indexes are correct, since the older version of Dgraph runs this query in roughly 35% less time (averages below). I am testing over both HTTP and gRPC, by the way.

Has anyone else noticed this, or is it an edge case that only affects me?

Average query time (100 runs):
v1.0.11: 189156741.4 ns (about 189 ms)
v1.0.12: 289063134.8 ns (about 289 ms)
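
As a quick sanity check on the averages above, the relative difference can be computed directly. This is a minimal sketch: the function name `relative_difference` is only an illustrative choice, and the two inputs are the averages reported above.

```python
def relative_difference(old_ns, new_ns):
    """Compare two average latencies given in nanoseconds.

    Returns a pair of percentages:
      - how much longer the new version takes, relative to the old one
      - how much less time the old version takes, relative to the new one
    """
    percent_longer = (new_ns / old_ns - 1.0) * 100.0
    percent_less = (1.0 - old_ns / new_ns) * 100.0
    return percent_longer, percent_less


# Averages reported above for v1.0.11 and v1.0.12 (100 runs each).
longer, less = relative_difference(189156741.4, 289063134.8)
print(f"v1.0.12 takes {longer:.1f}% longer")
print(f"v1.0.11 takes {less:.1f}% less time")
```

With these inputs, v1.0.12 takes roughly 53% longer than v1.0.11, or equivalently v1.0.11 needs roughly 35% less time. The two percentages differ because they use different baselines.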

(Kaveh) #2

I tried to reproduce the same situation on public data. Unless I am mistaken (I only tried a couple of times, with 1000 executions each), I see the same behavior with the 1million.rdf.gz data set.

while true; do curl -s localhost:8080/query -XPOST -d '
  var(func: eq(name@en,"Minority Report")) {
    d as initial_release_date

  me(func: eq(name@en, "Steven Spielberg")) {
    director.film @filter(ge(initial_release_date, val(d))) {
' | python -m json.tool|grep processing_ns|cut -d':' -f 2|cut -d' ' -f 2;done
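
The shell loop above prints one `processing_ns` value per request but never stops and does not average anything. A minimal Python sketch of the same measurement follows. It assumes the server answers on `http://localhost:8080/query` as in the curl command; the function names (`find_processing_ns`, `summarize`, `run_benchmark`) are hypothetical helpers, and the search for `processing_ns` is done recursively so that no particular response layout is assumed.

```python
import json
import statistics
import urllib.request


def find_processing_ns(node):
    """Return the first 'processing_ns' value found anywhere in decoded JSON."""
    if isinstance(node, dict):
        if "processing_ns" in node:
            return node["processing_ns"]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_processing_ns(child)
        if found is not None:
            return found
    return None


def summarize(samples):
    """Summarize a non-empty list of latency samples (nanoseconds)."""
    if not samples:
        raise ValueError("no processing_ns samples collected")
    return {
        "runs": len(samples),
        "mean_ns": statistics.mean(samples),
        "median_ns": statistics.median(samples),
    }


def run_benchmark(query, runs=1000, url="http://localhost:8080/query"):
    """POST the same query `runs` times and summarize processing_ns."""
    samples = []
    for _ in range(runs):
        request = urllib.request.Request(
            url, data=query.encode("utf-8"), method="POST"
        )
        with urllib.request.urlopen(request) as response:
            body = json.load(response)
        value = find_processing_ns(body)
        if value is not None:
            samples.append(value)
    return summarize(samples)
```

Calling `run_benchmark(query_text, runs=1000)` against each version, with the same query text passed to curl above, gives comparable averages. Reporting the median next to the mean helps show whether a few slow outliers dominate the result.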

If anyone else can reproduce this, it should be easier to trace, since this is the same data set we all use in the tutorial.

Note that I only tested this data set a couple of times on my local machine, not on my servers, since I was just looking for a public data set and query that show the same issue.

Average processing_ns (1000 runs):

v1.0.11: 5462598.224 ns
v1.0.12: 7390753.525 ns

(Nikita Zaletov) #3

Unlike 1.0.11, 1.0.12 doesn’t have the LRU cache; maybe that’s the cause.

(Manish R Jain) #4

We’ll have to verify these numbers, but at a general level, simpler queries might slow down a bit while more complex queries speed up with these changes. The issue was a lot of contention. For simpler queries contention is not a problem, so they would be limited by how fast data can be accessed off disk.

@dmai: Can you post what numbers you get on your desktop?

P.S. We have plans to write a much faster LFU cache based on research – going to post a blog about it today.
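
To illustrate the kind of contention mentioned above, here is a minimal sketch of an LRU cache guarded by a single lock. It is not Dgraph's implementation (Dgraph is written in Go, and its cache internals are not shown in this thread); the class name `LockedLRU` is hypothetical. The point is only that in a simple LRU even a read has to update the recency order, so every `get` takes the same exclusive lock as every `put`, and many concurrent readers end up waiting on each other.

```python
import threading
from collections import OrderedDict


class LockedLRU:
    """Illustrative LRU cache protected by one global lock."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._items = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key, default=None):
        # A read mutates the recency order, so it needs the exclusive lock.
        with self._lock:
            if key not in self._items:
                return default
            self._items.move_to_end(key)
            return self._items[key]

    def put(self, key, value):
        with self._lock:
            self._items[key] = value
            self._items.move_to_end(key)
            # Evict the least recently used entry when over capacity.
            if len(self._items) > self._capacity:
                self._items.popitem(last=False)
```

With such a design, a single simple query sees little cost from the lock, while a complex query that fans out into many concurrent lookups serializes on it. Removing the cache trades that contention for more reads from the underlying store, which is consistent with the trade-off described above.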