High CPU and DEADLINE_EXCEEDED when running queries

Moved from GitHub dgraph/4461

Posted by roniCohen123:

What version of Dgraph are you using?

1.1.0

Have you tried reproducing the issue with the latest release?

yes

What is the hardware spec (RAM, OS)?

aws: i3.large
ram: 15gb
cpu: 2 vcpu
os: Canonical, Ubuntu, 16.04 LTS, amd64 xenial image build on 2019-06-17, 4.4.0-1085-aws

Steps to reproduce the issue (command/config used to run Dgraph).

  1. Run queries after not running any set mutations for a few days

query:

{
  get(func: eq(merchant_id, 1))
  @filter(eq(team_id, 1) AND
          eq(type, "type") AND
          eq(menu_name, "name") )
   @recurse(depth:10, loop:true)
   {
       uid merchant_id team_id external_id name display_name sub_type type images description price origin times menu_name extra_data
       contains @facets
   }
}

schema:

<contains>: [uid] .
<description>: string .
<display_name>: string .
<external_id>: string @index(exact) .
<extra_data>: string .
<id>: int @index(int) .
<images>: [string] .
<is_in_tree>: bool .
<menu_name>: string @index(exact) .
<merchant_id>: int @index(int) .
<name>: string .
<origin>: string .
<price>: float .
<sub_type>: string .
<team_id>: int @index(int) .
<times>: string .
<type>: string @index(exact) .

  2. CPU usage becomes very high and the queries fail with DEADLINE_EXCEEDED (we set the deadline to 60000 milliseconds; the queries usually take 5-6 seconds)
  • When we run a set mutation on fields that are indexed, the queries start working again and CPU usage returns to normal

Commands and config we use to run Dgraph:
zero:
/opt/dgraph/dgraph zero --config=/etc/dgraph/zero.yml --wal=/var/lib/dgraph-zero --log_dir=/var/log/dgraph/zero
config:

cat /etc/dgraph/zero.yml
idx: 174338185313
peer: dgraph-zero.service.production.consul:5080
my: x.x.x.x:5080
replicas: 3

alpha:
/opt/dgraph/dgraph alpha --config=/etc/dgraph/alpha.yml --wal=/data/dgraph/w --postings=/data/dgraph/p --export=/mnt/backup/export --log_dir=/var/log/dgraph/alpha
config:

cat /etc/dgraph/alpha.yml
badger.vlog: disk
lru_mb: 2048
my: x.x.x.x:7080

Expected behaviour and actual result.

Expected: the query returns a result.
Actual: received DEADLINE_EXCEEDED.

danielmai commented :

Can you share the goroutine stack trace dump and CPU profile of Dgraph Alpha?

  • Goroutine dump on Dgraph Alpha can be retrieved by running the following (change the address/port as needed):

    curl --output goroutine.txt "http://localhost:8080/debug/pprof/goroutine?debug=2"
    
  • A CPU profile can be retrieved by running the following:

    go tool pprof localhost:8080/debug/pprof/profile
    

More info here: https://docs.dgraph.io/howto/#profiling-information
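
Since the problem only shows up intermittently, it may help to have both captures scripted ahead of time. The following is a minimal sketch, not an official Dgraph tool: the function names `pprof_url` and `collect_profiles`, the output file names, and the output directory are all made up for illustration. It assumes the Alpha HTTP port is 8080, as in the commands above, and it relies on the standard Go pprof `seconds` query parameter to control the CPU profile duration.

```shell
#!/bin/sh
# Sketch: capture a goroutine dump and a CPU profile from a Dgraph Alpha.
# pprof_url and collect_profiles are hypothetical helper names.
set -eu

# Build a pprof URL from host:port, a profile name, and an optional query string.
pprof_url() {
  addr="$1"
  profile="$2"
  query="${3:-}"
  url="http://${addr}/debug/pprof/${profile}"
  if [ -n "$query" ]; then
    url="${url}?${query}"
  fi
  printf '%s\n' "$url"
}

# Download both artifacts into an output directory.
# Arguments: host:port, output directory, CPU profile duration in seconds (default 30).
collect_profiles() {
  target="$1"
  outdir="$2"
  seconds="${3:-30}"
  mkdir -p "$outdir"
  # Full goroutine stack dump in text form.
  curl --silent --show-error --fail \
    --output "${outdir}/goroutine.txt" \
    "$(pprof_url "$target" goroutine 'debug=2')"
  # CPU profile; this request blocks for the whole sampling window.
  curl --silent --show-error --fail \
    --output "${outdir}/cpu.pb.gz" \
    "$(pprof_url "$target" profile "seconds=${seconds}")"
}

# Example call (address and directory are placeholders):
#   collect_profiles localhost:8080 ./dgraph-debug 30
```

Run `collect_profiles` while the CPU is high and the queries are timing out, so the profile reflects the problem state rather than normal operation. The resulting `cpu.pb.gz` can be opened later with `go tool pprof`.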

roniCohen123 commented :

@danielmai Thanks!
I will need to wait until the next time it happens to provide the logs,
unless it would help if I also provide them now?

roniCohen123 commented :

@danielmai here are the logs:
goroutine.txt
pprof.dgraph.samples.cpu.001.pb.gz