alianp
July 22, 2019, 4:57pm
1
I have setup a cluster with 3 zeros with replicas 3 as a option and 3 alpha server as 1 group.
I just imported all data around 150GB using bulk loader. Now, i am trying to query based on some reverse edges to show recommendation. But, out of three alphas randomly one or two servers taking long time to give response (around 10s - 12s) and one server is giving response in less than 2s.
Please give me suggestion to fix this issue in my setup.
client: pydgraph (latest)
protocol: gRPC
dgraph version: 1.0.16
systems info (all nodes):-
32 cores
128GB RAM
5TB hard disk
dmai
(Daniel Mai)
July 23, 2019, 12:04am
2
Do all the machines in the cluster have the same machine specs? The one thing that stands out to me is the hard disk. Queries can be slow due langer disk seek latencies from HDDs instead of SSDs.
Can you share the query where you’re seeing this discrepancy in response time? It’s possible that the query could be slow due to not using an index or some other reason.
alianp
July 23, 2019, 4:59am
3
Do all the machines in the cluster have the same machine specs?
Yes.
Queries can be slow due langer disk seek latencies from HDDs instead of SSDs.
Yes, I know HDD is slower than SSD but in that case it should give response within the same time.
Query:-
{
var(func: eq(jd_id, xxxxxxxxxxx)) {
vlang as jd_lang
duid as uid
gflr as genre
dflr as director
cflr as cast_member
sflr as screenwriter
pflr as production_company
aflr as award_received
}
details(func: uid(duid)) @normalize {
mid: jd_id
lbl: label
hash: jd_hash
desc: jd_desc
rel_yr: jd_rel_yr
main_img: jd_main_image
thumb_img: jd_thumb_image
rate: IMDb_average_rating
gcnt as count(genre)
dcnt as count(director)
ccnt as count(cast_member)
scnt as count(screenwriter)
pcnt as count(production_company)
acnt as count(award_received)
trval as IMDb_average_rating
relyr as jd_rel_yr
norm as math(1)
inorm as math(0.001)
rval as math(trval+inorm)
normIn as math(sqrt(gcnt+dcnt+ccnt+scnt+acnt+pcnt+rval))
cast_member{
~cast_member @filter(eq(jd_online_flag, 1) AND eq(jd_lang, val(vlang))) {
label
jd_id
cgcnt as count(genre @filter(uid(gflr)))
cdcnt as count(director @filter(uid(dflr)))
cccnt as count(cast_member @filter(uid(cflr)))
cscnt as count(screenwriter @filter(uid(sflr)))
cacnt as count(award_received @filter(uid(aflr)))
cpcnt as count(production_company @filter(uid(pflr)))
ctrval as IMDb_average_rating
cnorm as math(0.001)
crval as math(ctrval+cnorm)
tcrel as jd_rel_yr
czero as math(0)
crel as math(czero+tcrel)
tcrelyr as math(relyr/norm)
crelyr as math(cond(tcrelyr >= crel, tcrelyr-crel, crel-tcrelyr))
cnormIn as math(sqrt(cgcnt+cdcnt+cccnt+cscnt+cacnt+cpcnt+crval+crelyr))
cscore as math( ((gcnt/normIn)*(cgcnt/cnormIn)) + ((dcnt/normIn)*(cdcnt/cnormIn)) + ((ccnt/normIn)*(cccnt/cnormIn)) + ((scnt/normIn)*(cscnt/cnormIn)) + ((acnt/normIn)*(cacnt/cnormIn)) + ((pcnt/normIn)*(cpcnt/cnormIn)) + ((rval/normIn)*(crval/cnormIn)) )
}
}
director{
~director @filter(eq(jd_online_flag, 1) AND eq(jd_lang, val(vlang))){
label
jd_id
dgcnt as count(genre @filter(uid(gflr)))
ddcnt as count(director @filter(uid(dflr)))
dccnt as count(cast_member @filter(uid(cflr)))
dscnt as count(screenwriter @filter(uid(sflr)))
dacnt as count(award_received @filter(uid(aflr)))
dpcnt as count(production_company @filter(uid(pflr)))
dtrval as IMDb_average_rating
dnorm as math(0.001)
drval as math(dtrval+dnorm)
tdrel as jd_rel_yr
dzero as math(0)
drel as math(dzero+tdrel)
tdrelyr as math(relyr/norm)
drelyr as math(cond(tdrelyr >= drel, tdrelyr-drel, drel-tdrelyr))
dnormIn as math(sqrt(dgcnt+ddcnt+dccnt+dscnt+dacnt+dpcnt+drval+drelyr))
dscore as math( ((gcnt/normIn)*(dgcnt/dnormIn)) + ((dcnt/normIn)*(ddcnt/dnormIn)) + ((ccnt/normIn)*(dccnt/dnormIn)) + ((scnt/normIn)*(dscnt/dnormIn)) + ((acnt/normIn)*(dacnt/dnormIn)) + ((pcnt/normIn)*(dpcnt/dnormIn)) + ((rval/normIn)*(drval/dnormIn)) )
}
}
genre{
label
~genre @filter(eq(jd_online_flag, 1) AND eq(jd_lang, val(vlang))) (first: 50){
jd_id
label
ggcnt as count(genre @filter(uid(gflr)))
gdcnt as count(director @filter(uid(dflr)))
gccnt as count(cast_member @filter(uid(cflr)))
gscnt as count(screenwriter @filter(uid(sflr)))
gacnt as count(award_received @filter(uid(aflr)))
gpcnt as count(production_company @filter(uid(pflr)))
gtrval as IMDb_average_rating
gnorm as math(0.001)
grval as math(gtrval+gnorm)
tgrel as jd_rel_yr
gzero as math(0)
grel as math(gzero+tgrel)
tgrelyr as math(relyr/norm)
grelyr as math(cond(tgrelyr >= grel, tgrelyr-grel, grel-tgrelyr))
gnormIn as math(sqrt(ggcnt+gdcnt+gccnt+gscnt+gacnt+gpcnt+grval+grelyr))
gscore as math( ((gcnt/normIn)*(ggcnt/gnormIn)) + ((dcnt/normIn)*(gdcnt/gnormIn)) + ((ccnt/normIn)*(gccnt/gnormIn)) + ((scnt/normIn)*(gscnt/gnormIn)) + ((acnt/normIn)*(gacnt/gnormIn)) + ((pcnt/normIn)*(gpcnt/gnormIn)) + ((rval/normIn)*(grval/gnormIn)) )
}
}
score as math(max(max(cscore, dscore), gscore))
}
similar(func: uid(score), orderdesc: val(score), first: 20) @filter(NOT uid(duid)) {
uid
mid: jd_id
lbl: label
hash: jd_hash
# desc: jd_desc
rel_yr: jd_rel_yr
# limg: logo_image
# img: image
main_img: jd_main_image
thumb_img: jd_thumb_image
rate: IMDb_average_rating
score: val(score)
dscore:val(dscore)
gscore:val(gscore)
# sscore:val(sscore)
cscore:val(cscore)
}
}
Indexes:-
genre: uid @reverse .
cast_member: uid @reverse .
director: uid @reverse .
award_received: uid @reverse .
screenwriter: uid @reverse .
production_company: uid @reverse .
jd_online_flag: int @index(int) .
jd_id: string @index(exact) .
jd_lang: string @index(exact) .
alianp
August 7, 2019, 11:58am
4
Any suggestion to improve performance?
I tried with docker image and initially it was working fine and returning response in less than 1s. But, after 2 week, Alpha servers are taking long time randomly.
@dmai
In last message updated the query please check it and give me some suggestion to fix this issue.