What I want to do
I use Dgraph for queries, while real-time data is written concurrently through the Java client.
What I did
I built the Dgraph cluster shown in the figure below:
Each machine has 8 CPUs and 16 GB of RAM.
I defined a type and wrote 17 million records. A query with multiple conditions then takes 2-3 seconds, which is very slow, and I would like to know why. My query is as follows:
query q($timeMin: string!, $timeMax: string!, $first: int, $offset: int, $pid: int, $text: string) {
  var(func: eq(pid, $pid)) @filter(between(create_time, $timeMin, $timeMax) and eq(stage, 1) and eq(op_type, 2) and anyoftext(op_content_text, $text)) {
    a as uid
  }
  var(func: eq(pid, $pid)) @filter(anyoftext(field_value, $text) and eq(is_delete, 0)) {
    log_id @filter(between(create_time, $timeMin, $timeMax) and eq(stage, 1) and eq(op_type, 2)) {
      b as uid
    }
  }
  var(func: eq(op_type, 67)) @filter(eq(pid, $pid)) {
    c as business_id
  }
  q(func: uid(a, b), first: $first, offset: $offset, orderdesc: create_time) @filter(not eq(id, val(c))) @cascade(id) {
    expand(_all_)
  }
}
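For context, the variables declared in the query above could be supplied from Java roughly as follows. This is only a minimal sketch: the class name `QueryVars`, the method `build`, and all sample values are hypothetical, and the commented-out call assumes the dgraph4j client, whose `queryWithVars` takes the variables as a string-to-string map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryVars {
    // Builds the variable map for the query above. Every value is passed as a
    // string, including the int-typed variables ($pid, $first, $offset).
    static Map<String, String> build(int pid, String timeMin, String timeMax,
                                     String text, int first, int offset) {
        Map<String, String> vars = new LinkedHashMap<>();
        vars.put("$pid", Integer.toString(pid));
        vars.put("$timeMin", timeMin);
        vars.put("$timeMax", timeMax);
        vars.put("$text", text);
        vars.put("$first", Integer.toString(first));
        vars.put("$offset", Integer.toString(offset));
        return vars;
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration only.
        Map<String, String> vars = build(1001, "2023-01-01T00:00:00Z",
                "2023-01-31T23:59:59Z", "login", 20, 0);
        System.out.println(vars);
        // Assumption: with dgraph4j on the classpath and a configured client,
        // the query string would then be run as:
        // Response res = dgraphClient.newReadOnlyTransaction().queryWithVars(query, vars);
    }
}
```

The variable names in the map include the leading `$`, matching the declarations in the query header.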
When I stress-tested with JMeter, the Alpha's memory usage immediately rose to 10-14 GB, and the Alpha crashed soon afterwards with an OOM error. I don't understand why the query uses so much memory that it eventually crashes the Alpha.
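To make the stress test easier to reproduce without JMeter, a rough stand-in might look like the sketch below. It is not my actual JMeter plan: the class `LoadSketch`, the thread and request counts, and the empty placeholder task are all hypothetical, and a real run would replace the placeholder with a call that sends the query through the Java client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class LoadSketch {
    // Runs `task` `requests` times on a pool of `threads` workers and
    // returns the latency of each call in milliseconds.
    static List<Long> run(int threads, int requests, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                Callable<Long> timed = () -> {
                    long start = System.nanoTime();
                    task.run();
                    return (System.nanoTime() - start) / 1_000_000;
                };
                pending.add(pool.submit(timed));
            }
            List<Long> latencies = new ArrayList<>();
            for (Future<Long> f : pending) {
                latencies.add(f.get()); // waits for each call to finish
            }
            return latencies;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Placeholder task: a real test would issue the query here.
        List<Long> ms = run(4, 20, () -> { });
        System.out.println("completed " + ms.size() + " calls");
    }
}
```

Raising `threads` step by step while watching the Alpha's memory would show at what level of concurrency the usage starts to climb.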