V1.0.12 slower for some queries

kavehmz · March 7, 2019, 9:01am

Hi,

I have a setup that lets me switch between Dgraph versions easily. After the v1.0.12 release, I compared my queries to check whether I could switch to v1.0.12 in production, and I found one complex query that runs consistently slower in v1.0.12 than in v1.0.11.

The query is my longest one and might need adjusting for the new version. Here it is:

{
    var(func: eq(serial, "MY-PRODUCT")) {
        PRODUCTSELF as uid
        FEATURES as has_feature @filter(eq(code,["COLOR"])  )
        garment_of {
            GENDER as gender
            type_of {
                TYPE as uid
            }
        }
        DESC as described_as
    }

    prod(func: eq(locale,"CA::en")) @cascade {
        locale
        products @filter(
                eq(is_published, true)
                and eq(has_thumbnail,true)
                and eq(has_productURL,true)
                and not uid(PRODUCTSELF)
                and (eq(described_as, "") or not eq(described_as, val(DESC)))
            )
                @facets(is_in_stock,updated_at)
                @facets(eq(is_in_stock,true) and gt(updated_at,'$(expr $(date +%s) - 172800)'))
        {
            is_published
            has_thumbnail
            has_productURL
            serial
            garment_of @filter( eq(gender,"u") or eq(gender,val(GENDER)) ) {
                gender
                type_of @filter(uid(TYPE)) {
                    type_name
                }
            }

            has_feature @filter(uid(FEATURES)) {
                locale
                code
                value
            }
        }
    }
}

The schema I use is:

product_of : uid @reverse @count .
type_of : uid @reverse @count .
garment_of : uid @reverse @count .
serial : string @count @index(exact) .
described_as : string @count @index(term) .
is_published : bool @count @index(bool) .
has_thumbnail : bool @count @index(bool) .
has_productURL : bool @count @index(bool) .
updated_at : dateTime @index(day) .

shop_id : string @count @index(exact) .
shop_name : string @count @index(exact) .
prefix : string @count @index(exact) .

type_id : string @count @index(exact) .
is_generic : bool @count @index(bool) .

garment_id : string @count @index(exact) .
gender : string @count @index(exact) .

article_size_id : string @count @index(exact) .
products : uid @reverse @count .
shop_country : string @count @index(exact) .
language : string @count @index(exact) .
locale: string @count @index(exact) .
is_in_stock: bool @count @index(bool) .

code: string @count @index(exact) .
value: string @count @index(exact) .

All predicates are indexed, and I assume the indexes are correct, since the older version of Dgraph runs the query nearly 40% faster. I am testing over both HTTP and gRPC, by the way.

Has anyone else noticed this, or is it an edge case that only affects me?

average query time (after 100 runs):
v1.0.11: 189156741.4 ns
v1.0.12: 289063134.8 ns
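
For scale, the two averages above can be converted to milliseconds and compared directly. This is only a sketch of the arithmetic; the variable names are illustrative and not part of any Dgraph tooling. It works out to v1.0.12 taking about 53% longer, or equivalently v1.0.11 taking about 35% less time.

```python
# Averages reported above, in nanoseconds (100 runs each).
avg_ns = {"v1.0.11": 189156741.4, "v1.0.12": 289063134.8}


def to_ms(ns):
    """Convert nanoseconds to milliseconds."""
    return ns / 1_000_000


old, new = avg_ns["v1.0.11"], avg_ns["v1.0.12"]

# Extra time v1.0.12 needs, relative to v1.0.11.
slowdown_pct = (new - old) / old * 100
# Time v1.0.11 saves, relative to v1.0.12.
saving_pct = (new - old) / new * 100

for version, ns in avg_ns.items():
    print(f"{version}: {to_ms(ns):.1f} ms")
print(f"v1.0.12 takes {slowdown_pct:.1f}% longer than v1.0.11")
print(f"v1.0.11 takes {saving_pct:.1f}% less time than v1.0.12")
```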

kavehmz · March 7, 2019, 2:50pm

I tried to replicate the same situation on a more public data set and, if I am not wrong (I only tried a couple of times, with 1000 executions each time), I see the same behavior with the 1million.rdf.gz data set.

while true; do curl -s localhost:8080/query -XPOST -d '
{
  var(func: eq(name@en,"Minority Report")) {
    d as initial_release_date
  }

  me(func: eq(name@en, "Steven Spielberg")) {
    name@en
    director.film @filter(ge(initial_release_date, val(d))) {
      initial_release_date
      name@en
    }
  }
}
' | python -m json.tool|grep processing_ns|cut -d':' -f 2|cut -d' ' -f 2;done

If anyone else can replicate it, this case should make the issue easier to trace, since it is the same data set we all use in the tutorial.

Note that for this data set I only tested on my local machine a couple of times, not on my servers, as I was just trying to find a public data set and query with the same issue.

Average processing_ns (1000 runs):

v1.0.11: 5462598.224 ns
v1.0.12: 7390753.525 ns
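
The shell loop above prints one processing_ns value per request, and the averaging still has to be done separately. A minimal sketch of that step in Python follows. It assumes the JSON response nests the value under extensions.server_latency.processing_ns; the function names and sample bodies are hypothetical stand-ins for real curl output.

```python
import json
from statistics import mean


def processing_ns(body):
    """Return the server-side processing time from one JSON response body.

    Assumption: the value sits under extensions.server_latency.processing_ns,
    the same field the grep in the shell loop picks out.
    """
    return json.loads(body)["extensions"]["server_latency"]["processing_ns"]


def average_processing_ns(bodies):
    """Average processing_ns over an iterable of response bodies."""
    return mean(processing_ns(b) for b in bodies)


# Hypothetical response bodies, standing in for real query responses.
samples = [
    '{"data": {}, "extensions": {"server_latency": {"processing_ns": 5000000}}}',
    '{"data": {}, "extensions": {"server_latency": {"processing_ns": 7000000}}}',
]
print(average_processing_ns(samples))
```

Collecting a fixed number of responses (1000 in the runs above) and passing them to average_processing_ns gives one figure per version that can be compared directly.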

makitka · March 7, 2019, 3:07pm

Unlike 1.0.11, 1.0.12 doesn’t have an LRU cache; maybe that’s the reason.

mrjn · March 7, 2019, 3:21pm

We’ll have to verify these numbers, but at a general level, simpler queries might slow down a bit while more complex queries speed up with these changes. The issue was a lot of contention. For simpler queries that’s not an issue, so they’d be limited by how fast data can be accessed off disk.

@dmai: Can you post what numbers you get on your desktop?

P.S. We have plans to write a much faster LFU cache based on research; we’re going to post a blog about it today.

system · April 6, 2019, 3:21pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
