Executing a query, Alpha uses very high memory and ends up OOM

What I want to do

I use Dgraph for query operations, and at the same time real-time data is written through the Java client.

What I did

I built the Dgraph cluster as shown in the figure below:

Each machine is configured with 8 CPUs and 16GB of RAM.

I defined a type and wrote 17 million records. When I execute a query with multiple conditions, it takes 2-3 seconds, so query efficiency is very low. I want to know why the query is so slow. My query statement is as follows:

query q($timeMin: string!, $timeMax: string!, $first: int, $offset: int, $pid: int, $text: string) {
    var(func: eq(pid, $pid)) @filter(between(create_time, $timeMin, $timeMax) and eq(stage, 1) and eq(op_type, 2) and anyoftext(op_content_text, $text)) {
        a as uid
    }
    var(func: eq(pid, $pid)) @filter(anyoftext(field_value, $text) and eq(is_delete, 0)) {
        log_id @filter(between(create_time, $timeMin, $timeMax) and eq(stage, 1) and eq(op_type, 2)) {
            b as uid
        }
    }
    var(func: eq(op_type, 67)) @filter(eq(pid, $pid)) {
        c as business_id
    }
    q(func: uid(a, b), first: $first, offset: $offset, orderdesc: create_time) @filter(not eq(id, val(c))) @cascade(id) {
        expand(_all_)
    }
}

When I used JMeter for stress testing, I found that Alpha's memory would immediately climb to 10-14GB, and it would crash soon after with an OOM. I don't know why the query uses so much memory that it eventually crashes Alpha.

Dgraph metadata

v21.03.2

About OOM https://dgraph.io/docs/deploy/troubleshooting/#running-out-of-memory-oom

Each Alpha needs to be on its own machine, with 16GB each, and I would personally suggest more RAM and CPU for your case, since you have 17 million nodes and run a complex query.

Also, you can hit a bottleneck when not using PCI-Express NVMe disks. Every part of your cluster needs to be carefully configured so there are no bottlenecks. If you have 200GB of RAM and little CPU, the balance is lopsided.

In your query, try as much as possible to break the query into several blocks. Avoid complex filters in a single block. Make a pipeline where one block feeds its results into the next. That way your query will behave better, because multiple blocks are executed concurrently.
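For illustration, here is a rough sketch of that kind of pipelining, reusing the predicate names from the query above (pid, create_time, stage, op_type, op_content_text). It is not a drop-in replacement, just the shape of splitting one heavy filter into simpler blocks that feed each other:

    # Hypothetical sketch: narrow by pid and time range first,
    # then apply the remaining filters in follow-up blocks.
    var(func: eq(pid, $pid)) @filter(between(create_time, $timeMin, $timeMax)) {
        t as uid
    }
    var(func: uid(t)) @filter(eq(stage, 1) and eq(op_type, 2)) {
        s as uid
    }
    var(func: uid(s)) @filter(anyoftext(op_content_text, $text)) {
        a as uid
    }
    q(func: uid(a), first: $first, offset: $offset, orderdesc: create_time) {
        expand(_all_)
    }

Each block applies one cheap filter, and only the already-narrowed uid set flows into the next block.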

Are you mutating at the same time as querying?

I think your query is too complex. A query this complex is something I would run only from time to time. If your application insists on running such complex queries every day, you're going to need more resources than that.

In day-to-day operation, the vast majority of applications do not make complex queries. Querying for a user, a post, or a message is usually all it takes. If this query is only for some internal dashboard in your enterprise, then the response time simply reflects the complexity of the query and the amount of resources available.

Hi @MichelDiz
Thank you for your reply. I understand that the current way of building the cluster is unreasonable, but my resources are limited at the moment, so this is all I can do. I will try the split you mentioned, breaking the query into multiple blocks.

But I still want to know why Dgraph takes up so much memory.

From my experience, expanding graphs in the cluster (not writing them) takes a lot of RAM. The same happens if you expand a JSON document in an application: it escalates RAM usage. I once built a JavaScript application to deal with gigantic JSONs, and it consumed a lot of memory, though it freed the RAM shortly afterwards.

And RAM management in Go is tricky. We have already implemented pretty good RAM management with jemalloc, but graphs are RAM consumers, especially when you run very complex queries that add multiple filters and so on. The need for every filter and parameter has to be evaluated, because all of this is computed on some Dgraph instance (on the server or in the cloud), not on your machine.
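One concrete thing to evaluate along those lines (a suggestion on my part, not something measured here): the final block uses expand(_all_), which pulls every predicate of every matched node. Requesting only the predicates the application actually needs keeps the response, and the work to build it, smaller. A sketch reusing the names from the query above, with an illustrative subset of predicates:

    q(func: uid(a, b), first: $first, offset: $offset, orderdesc: create_time)
        @filter(not eq(id, val(c))) @cascade(id) {
        # Only the predicates the application actually displays,
        # instead of expand(_all_).
        uid
        id
        pid
        create_time
        op_type
        op_content_text
    }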

Complex queries made every now and then are normal. But if you make them everyday use, you need to calculate the increase in resources that must be made available.

Thank you, I will optimize my query or increase the RAM.