Data unavailable after possible crash

Earlier today, all three of our production Alpha nodes (containers on Kubernetes) appear to have been terminated at around the same time. I suspect a crash of some sort, but unfortunately I have no logs from before the shutdown.

Each container has a persistent volume associated with it. After three new containers spun up, the data seems to be inaccessible, although the schema is still available. The data should still be there, but I need help recovering it to its state before the crash.

Logging in to the container through ssh and listing the files in the “w” directory gives me the following:

root@authorization-service-dg-alpha-0:/dgraph/w# ls -lh
total 153M
-rw-r--r-- 1 root root  70M Nov 21 14:50 000844.sst
-rw------- 1 root root  18M Apr  1 11:14 004145.sst
-rw------- 1 root root  18M Apr  1 11:14 004146.sst
-rw------- 1 root root  18M Apr  1 11:14 004147.sst
-rw------- 1 root root  18M Apr  1 11:14 004148.sst
-rw------- 1 root root 9.2M Apr  1 11:47 004151.sst
-rw------- 1 root root 554K Apr  1 12:25 007160.vlog
-rw------- 1 root root 849K Apr  1 12:45 007161.vlog
-rw------- 1 root root 508K Apr  1 13:01 007162.vlog
-rw------- 1 root root  86K Apr  1 13:04 007163.vlog
-rw------- 1 root root   28 Dec 17 10:26 KEYREGISTRY
-rw-r--r-- 1 root root  41K Apr  1 11:47 MANIFEST

I just deleted the LOCK files that were in each instance and rebooted all instances. It didn’t make a difference. Given the size of the .sst files, I’m assuming the data is still stored there.

If anyone is able to assist that would be really helpful as I don’t have any options left.

I figured out the issue. All containers were accidentally configured to use the latest Docker tag. After the crash, Kubernetes pulled the latest image from Docker Hub. The latest tag now points to version 2.*. We were running v1.2.* before, which, logically, is not compatible with the new version. This was not really obvious from the logs though, as the schema was still intact and queries still executed. To me this really looked like a data-corruption issue.
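For anyone who wants a guard against this kind of surprise, here is a minimal Python sketch that flags when two version tags differ in their major number. The helper names and the example tags are made up for illustration, and it says nothing about Dgraph's actual compatibility rules; it only compares the leading number.

```python
import re


def major_version(tag: str) -> int:
    """Return the leading major number of a tag such as 'v1.2.0' or '2.0.0'."""
    match = re.match(r"v?(\d+)(?:\.|$)", tag)
    if match is None:
        # Floating tags such as 'latest' carry no version information.
        raise ValueError(f"not a numeric version tag: {tag!r}")
    return int(match.group(1))


def crosses_major(old_tag: str, new_tag: str) -> bool:
    """True when the two tags have different major versions."""
    return major_version(old_tag) != major_version(new_tag)


if __name__ == "__main__":
    # Hypothetical tags standing in for "v1.2.*" and "2.*".
    print(crosses_major("v1.2.0", "v2.0.0"))
```

A check like this could run before a rollout, comparing the tag that last wrote to the volume with the tag about to be deployed. Note that it cannot say anything about latest, which is part of why that tag is risky.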

Stupid mistake, which caused a lot of head-scratching, but all good now.

Hey @boedy, can you share where you got the example with the latest tag (just so we know)? Or was it changed on your side?

Hi @MichelDiz, most configurations provided by Dgraph use the tag latest, as can be seen here:

Having a version pinned in all documentation examples might prevent users' deployments from breaking when a new major release comes out.
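As a rough illustration of what "pinned" means, here is a small Python sketch that checks whether an image reference carries an explicit tag other than latest (or a digest). The function names are hypothetical, and dgraph/dgraph:v1.2.0 is only an example reference; the parsing is simplified and may not cover every valid image reference.

```python
from typing import Optional


def image_tag(image: str) -> Optional[str]:
    """Return the tag of an image reference, or None when no tag is given."""
    name = image.split("@", 1)[0]      # drop a trailing digest, if any
    last = name.rsplit("/", 1)[-1]     # a ':' before the last '/' is a registry port
    if ":" in last:
        return last.split(":", 1)[1]
    return None


def is_pinned(image: str) -> bool:
    """True when the reference names a digest or a tag other than 'latest'."""
    if "@" in image:
        return True
    tag = image_tag(image)
    # A missing tag is treated the same as 'latest' by Docker.
    return tag is not None and tag != "latest"


if __name__ == "__main__":
    for ref in ("dgraph/dgraph:latest", "dgraph/dgraph", "dgraph/dgraph:v1.2.0"):
        print(ref, is_pinned(ref))
```

Something along these lines could be run over the image fields of a deployment's manifests in CI, so that an unpinned reference is caught before it reaches production.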

