Massive kswapd0 CPU spikes?


Hey everyone! I’m running a large, multi-threaded import job into one Dgraph instance: millions of triples from about eight different threads.

I tried running echo 1 > /proc/sys/vm/drop_caches, as some Stack Overflow answers suggested. It helped for a bit, but then kswapd0's CPU usage went back up again.

kswapd0 oscillates between really high CPU usage (>90%) and 0%, which is odd: I expected steady swap usage at the very least.
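
One way to check whether the CPU spikes line up with actual page reclaim is to sample the kswapd counters in /proc/vmstat. This is only a minimal sketch, assuming a Linux host. The function names are my own, and because counter names differ across kernels (3.10 splits them per zone, e.g. pgscan_kswapd_normal, while newer kernels expose a single pgscan_kswapd), it sums every counter that starts with the given prefix.

```python
import time


def sum_counters(vmstat_text, prefix):
    """Sum all /proc/vmstat counters whose name starts with prefix."""
    total = 0
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith(prefix):
            total += int(parts[1])
    return total


def read_vmstat(path="/proc/vmstat"):
    """Return the raw contents of /proc/vmstat (Linux only)."""
    with open(path) as f:
        return f.read()


def watch(interval=1.0, samples=10):
    """Print pages scanned and reclaimed by kswapd during each interval."""
    prev = read_vmstat()
    for _ in range(samples):
        time.sleep(interval)
        cur = read_vmstat()
        scanned = sum_counters(cur, "pgscan_kswapd") - sum_counters(prev, "pgscan_kswapd")
        reclaimed = sum_counters(cur, "pgsteal_kswapd") - sum_counters(prev, "pgsteal_kswapd")
        print(f"kswapd scanned={scanned} reclaimed={reclaimed} pages")
        prev = cur


# On the affected host, call watch() while the import job is running.
```

If the scanned count is large while the reclaimed count stays small during a spike, kswapd is probably working hard without freeing much, which would be worth mentioning in a follow-up.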

(NOTE: I looked at the Dgraph Deploy HOWTO, but I didn’t see any swap recommendations for dgraph itself, only for the bulk loader.)

top - 14:10:56 up 145 days, 22:43,  1 user,  load average: 17.65, 13.81, 9.35
Tasks: 213 total,   2 running, 211 sleeping,   0 stopped,   0 zombie
%Cpu(s): 16.7 us, 58.3 sy,  0.0 ni, 23.1 id,  1.7 wa,  0.0 hi,  0.1 si,  0.1 st
KiB Mem : 65807368 total,   330764 free, 20462152 used, 45014452 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 44600528 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
26716 root      20   0  158.7g  51.7g  35.5g S  1006 82.4   8560:58 dgraph
  104 root      20   0       0      0      0 R  95.7  0.0 267:05.69 kswapd0
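
To track the same memory figures over time without watching top, /proc/meminfo can be parsed directly. This is a rough sketch, assuming a Linux host; parse_meminfo and summarize are hypothetical helper names, not part of any tool mentioned here.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integers.

    Sized fields are reported by the kernel in kB; a few fields
    (such as HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def summarize(info):
    """Report free and cached memory as percentages, plus whether swap exists."""
    total = info.get("MemTotal", 0)
    free = info.get("MemFree", 0)
    cache = info.get("Buffers", 0) + info.get("Cached", 0)
    return {
        "free_pct": 100.0 * free / total if total else 0.0,
        "cache_pct": 100.0 * cache / total if total else 0.0,
        "swap_configured": info.get("SwapTotal", 0) > 0,
    }


# On a Linux host:
#   with open("/proc/meminfo") as f:
#       print(summarize(parse_meminfo(f.read())))
```

With the numbers from the top output above (about 330 MB free out of 64 GB and no swap), this would report a very low free percentage and swap_configured as False.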

Anyone have any ideas?

SYSTEM SPECS

  • CentOS 7
  • 4 core, 64GB of RAM
  • Linux some-services.test.net 3.10.0-957.12.2.el7.x86_64 #1 SMP Tue May 14 21:24:32 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • docker info
Containers: 7
 Running: 4
 Paused: 0
 Stopped: 3
Images: 6
Server Version: 18.09.0
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: c4446665cb9c30056f4998ed953e6d4ff22c7c39
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-862.14.4.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 62.76GiB
Name: twosixlabs-clickhouse1.datareservoir.net
ID: HXPF:5VJ2:XUDD:OAF6:PBVU:35QU:OLE2:RYAI:YMEY:2ELH:MAXP:LFZQ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine
  • my docker-compose.yml
version: "3.2"
services:
  zero:
    image: dgraph/dgraph:latest
    volumes:
      - /mnt/dgraph-volume/data:/dgraph
    ports:
      - 5080:5080
      - 6080:6080
    restart: on-failure
    command: dgraph zero --my=zero:5080
  alpha:
    image: dgraph/dgraph:latest
    volumes:
      - /mnt/dgraph-volume/data:/dgraph
    ports:
      - 8080:8080
      - 9080:9080
    restart: on-failure
    command: dgraph alpha --my=alpha:7080 --lru_mb=10240 --zero=zero:5080
  ratel:
    image: dgraph/dgraph:latest
    ports:
      - 8000:8000
    command: dgraph-ratel

Maybe it’s worth mentioning that this isn’t bare-metal. It’s an OpenStack instance.

Thanks for any tips, ideas, or suggestions!


VIRT has always been high