Data loss when running export


#1

For obvious reasons we want to be able to run regular backups of our Dgraph database, to prevent data loss.

To make sure this would work as expected, I took an export of our production database and tried to import it into a local Docker instance, but I saw a lot of data loss.

Below are the steps I performed - unfortunately I can’t provide the actual data as it is in production and therefore sensitive.

1. While SSH’d onto the production box - run:

curl localhost:8280/admin/export

// output - {"code": "Success", "message": "Export completed."}

2. SCP the created directory (in this case dgraph.r40015.u0415.0938) to my local dgraph directory

Before moving on - this is my alpha config in my docker-compose file:

  alpha:
    image: dgraph/dgraph:latest
    container_name: dgraph_alpha
    volumes:
      - type: bind
        source: /Users/{user}/dgraph
        target: /dgraph
        volume:
          nocopy: true
    ports:
      - 8280:8280
      - 9280:9280
    restart: on-failure
    command: dgraph alpha --port_offset 200 --my=alpha:7280 --lru_mb=2048 --zero=zero:5280

So I’ve mounted my local dgraph directory so I’m able to see the exported directory in my container.

3. I then run the following command to import the export into my local Docker container.

docker exec -it dgraph_alpha dgraph live -r /dgraph/dgraph.r40015.u0415.0938/g01.rdf.gz --zero=zero:5280 --dgraph=localhost:9280 -c 1

This produces the following output:

I0415 09:45:45.054696      25 init.go:88]

Dgraph version   : v1.0.14
Commit SHA-1     : 26cb2f94
Commit timestamp : 2019-04-12 13:21:56 -0700
Branch           : HEAD
Go version       : go1.11.5

For Dgraph official documentation, visit https://docs.dgraph.io.
For discussions about Dgraph     , visit https://discuss.dgraph.io.
To say hi to the community       , visit https://dgraph.slack.com.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2018 Dgraph Labs, Inc.


Creating temp client directory at /tmp/x789817426
badger 2019/04/15 09:45:45 INFO: All 0 tables opened in 0s

Processing /dgraph/dgraph.r40015.u0415.0938/g01.rdf.gz
Number of TXs run         : 2
Number of RDFs processed  : 1705
Time spent                : 856.866103ms
RDFs processed per second : 1705
badger 2019/04/15 09:45:45 INFO: Storing value log head: {Fid:0 Len:43 Offset:9408}
badger 2019/04/15 09:45:45 INFO: Force compaction on level 0 done

If I then go to the UI and run the following query:

{
  q(func: has(_predicate_)) {
   count(uid)
  }
}

- Locally this produces a count of 142
- On the prod server, it also produces a count of 142

However, if I do the following query:

{
  q(func: has(username)) {
   expand(_all_)
  }
}

Locally I get 73 results with the following structure:

{
    "username": "xxxx"
}

However, on production I still get 73 results but with the following structure:

{
    "active": true/false,
    "type": "xxx",
    "username": "xxxx",
    "email": "xxx@xxx.xxx",
    "mobile": "+xxxxx"
}

So it looks like there is some data loss for the attribute values.
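
The comparison above can be made mechanical. This is a minimal sketch, assuming the JSON results of the same query have been saved from both environments; the function names and the sample records are placeholders shaped like the results in this post, not real data.

```python
# Sketch: report which predicates production returns but the local copy does not.
# The sample records are placeholders mirroring the structures shown in this post.

def predicates_seen(results):
    """Return the set of predicate names present on any result node."""
    seen = set()
    for node in results:
        seen.update(node.keys())
    return seen


def missing_predicates(prod_results, local_results):
    """Predicates returned on production that no local node returns."""
    return sorted(predicates_seen(prod_results) - predicates_seen(local_results))


prod = [{"active": True, "type": "xxx", "username": "xxxx",
         "email": "xxx@xxx.xxx", "mobile": "+xxxxx"}]
local = [{"username": "xxxx"}]

missing = missing_predicates(prod, local)
print(missing)
```

Run against the full result sets, this gives a quick list of predicates to look for in the export and in the loader output.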

I can also run the following query locally but I still only get the username returned:

{
  q(func: has(email)) {
   expand(_all_)
  }
}

#3

The command

docker exec -it dgraph_alpha dgraph bulk -r /dgraph/dgraph.r40015.u0415.0938/g01.rdf.gz -s /dgraph/dgraph.r40015.u0415.0938/g01.schema.gz --zero=zero:5280 --http localhost:8000

produces the following:

I0415 10:23:23.716952     110 init.go:88]

Dgraph version   : v1.0.14
Commit SHA-1     : 26cb2f94
Commit timestamp : 2019-04-12 13:21:56 -0700
Branch           : HEAD
Go version       : go1.11.5

For Dgraph official documentation, visit https://docs.dgraph.io.
For discussions about Dgraph     , visit https://discuss.dgraph.io.
To say hi to the community       , visit https://dgraph.slack.com.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2018 Dgraph Labs, Inc.


{
	"RDFDir": "/dgraph/dgraph.r40015.u0415.0938/g01.rdf.gz",
	"JSONDir": "",
	"SchemaFile": "/dgraph/dgraph.r40015.u0415.0938/g01.schema.gz",
	"DgraphsDir": "out",
	"TmpDir": "tmp",
	"NumGoroutines": 4,
	"MapBufSize": 67108864,
	"ExpandEdges": true,
	"SkipMapPhase": false,
	"CleanupTmp": true,
	"NumShufflers": 1,
	"Version": false,
	"StoreXids": false,
	"ZeroAddr": "zero:5280",
	"HttpAddr": "localhost:8000",
	"IgnoreErrors": false,
	"CustomTokenizers": "",
	"MapShards": 1,
	"ReduceShards": 1
}
Connecting to zero at zero:5280
badger 2019/04/15 10:23:23 INFO: All 0 tables opened in 0s
Processing file (1 out of 1): /dgraph/dgraph.r40015.u0415.0938/g01.rdf.gz
badger 2019/04/15 10:23:23 INFO: Storing value log head: {Fid:0 Len:44 Offset:11938}
badger 2019/04/15 10:23:23 INFO: Force compaction on level 0 done
Shard tmp/shards/000 -> Reduce tmp/shards/shard_0/000
badger 2019/04/15 10:23:23 INFO: All 0 tables opened in 0s
badger 2019/04/15 10:23:23 INFO: Storing value log head: {Fid:0 Len:42 Offset:121988}
badger 2019/04/15 10:23:23 INFO: Force compaction on level 0 done
REDUCE 00s [100.00%] edge_count:3.821k edge_speed:3.821k/sec plist_count:1.174k plist_speed:1.174k/sec
Total: 00s

But when I run a query in Ratel, there are no results…


(Michel Conrado) #4

Hmm, curious. Which version did you export from? Have you set a schema?

Try to use my scripts https://github.com/MichelDiz/Dgraph-Bulk-Script


#5

How exactly do I get the version? On production, I’m running Dgraph through systemd.

There is a schema set on production yes.

Use your scripts for the export or the import? As it’s a prod server I have to be very careful with what I put on it - so I’d rather get the proper export functionality working, if possible.


(Michel Conrado) #6

Run: dgraph -h
It will show a stamp like this

Dgraph version   : v1.0.13
Commit SHA-1     : 691b3b35
Commit timestamp : 2019-03-09 19:33:59 -0800
Branch           : HEAD
Go version       : go1.11.5

For the import.


#7

dgraph -h does not seem to work, instead it outputs the following, which I assume means h is not a valid option:

Usage of dgraph:
  -alsologtostderr
    	log to standard error as well as files
  -log_backtrace_at value
    	when logging hits line file:N, emit a stack trace
  -log_dir string
    	If non-empty, write log files in this directory
  -logtostderr
    	log to standard error instead of files
  -stderrthreshold value
    	logs at or above this threshold go to stderr
  -v value
    	log level for V logs
  -vmodule value
    	comma-separated list of pattern=N settings for file-filtered logging

I will try your scripts for the import, but I’d still like to understand how to do this without them, since in a “real world” production environment I wouldn’t be using the scripts.


(Michel Conrado) #8

Hmm, you are probably on a very old version.

Run just “dgraph” and see what it shows.

The docs show how to do it, or you can read the scripts. The process is very simple.

https://docs.dgraph.io/deploy/#bulk-loader
https://docs.dgraph.io/deploy/#live-loader


#9

I followed the docs, and I’m getting data loss, which is why I’m here…


(Michel Conrado) #10

Well, I can’t see or reproduce the issue on my side following the docs, so it could be something else.
If I understand your whole context, it might help to solve it.


#11

Is there any more information I can give you, to help you help me?

I’ll definitely try the scripts, but if there’s any more information I can give just let me know. In the meantime I’ll try to figure out which version of dgraph I’m running as well.


(Michel Conrado) #12

I mean the versions you are exporting from and importing to. You also need to check your N-Quads to see whether the export was successful: open the RDF file in any text editor (VS Code, for example) and look for anything strange. Does your compose file have only one Alpha? Please post the complete file.

You said you’re running through systemd, so why did you use a compose file? I don’t get it.

BTW, my scripts will take your RDF, inject it into the Docker Compose setup, bulk load it and then just run it.
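
Checking the N-Quads by eye gets hard with larger exports. The following is a minimal sketch that counts lines per predicate in a gzipped RDF file, assuming one subject/predicate/object statement per line. The function name and the sample lines are made up for illustration; they are not taken from the real export.

```python
import gzip
import os
import tempfile
from collections import Counter


def count_predicates(rdf_gz_path):
    """Count N-Quad lines per predicate in a gzipped RDF export."""
    counts = Counter()
    with gzip.open(rdf_gz_path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            parts = line.split(None, 2)
            if len(parts) < 3:
                continue  # not a subject/predicate/object line
            counts[parts[1].strip("<>")] += 1
    return counts


# Demo on a tiny made-up file; point count_predicates at the real
# g01.rdf.gz to inspect an actual export.
sample = (
    '<0x1> <username> "xxxx"^^<xs:string> .\n'
    '<0x1> <email> "xxx@xxx.xxx"^^<xs:string> .\n'
    '<0x2> <username> "xxxx"^^<xs:string> .\n'
)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.rdf.gz")
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        handle.write(sample)
    demo_counts = count_predicates(path)

print(dict(demo_counts))
```

If a predicate such as email shows a non-zero count here but is absent after loading, the export itself is probably fine and the problem lies in the load step.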


(Javier Alvarado) #13

The bulk loader behaves a little differently from the live loader. Instead of loading to a running (i.e. live) cluster, it writes the data to an offline database ./out/0/p. After the bulk load completes, the p directory needs to be moved to where alpha expects it.

See https://docs.dgraph.io/deploy/#bulk-loader for more details.
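
The move step could be scripted roughly as below. This is only a sketch under assumptions: alpha is stopped before the copy and restarted afterwards, there is a single group so the output sits under out/0/p, and the function name, the p.bak backup name and the demo directories are hypothetical.

```python
import os
import shutil
import tempfile


def install_bulk_output(bulk_out_dir, alpha_data_dir):
    """Copy <bulk_out_dir>/0/p to <alpha_data_dir>/p, keeping the old p as p.bak."""
    source = os.path.join(bulk_out_dir, "0", "p")
    target = os.path.join(alpha_data_dir, "p")
    if not os.path.isdir(source):
        raise FileNotFoundError(f"no bulk loader output at {source}")
    if os.path.exists(target):
        backup = target + ".bak"
        if os.path.exists(backup):
            shutil.rmtree(backup)  # drop an older backup first
        shutil.move(target, backup)
    shutil.copytree(source, target)
    return target


# Demo with throwaway directories standing in for ./out and alpha's data dir.
with tempfile.TemporaryDirectory() as tmp:
    bulk_out = os.path.join(tmp, "out")
    alpha_dir = os.path.join(tmp, "alpha")
    os.makedirs(os.path.join(bulk_out, "0", "p"))
    os.makedirs(os.path.join(alpha_dir, "p"))
    with open(os.path.join(bulk_out, "0", "p", "MANIFEST"), "w") as handle:
        handle.write("placeholder")
    installed = install_bulk_output(bulk_out, alpha_dir)
    installed_files = sorted(os.listdir(installed))
    backup_exists = os.path.isdir(os.path.join(alpha_dir, "p.bak"))

print(installed_files, backup_exists)
```

Replacing the whole directory rather than merging into it avoids mixing files from the old and new databases.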


(Javier Alvarado) #14

Older versions of dgraph have a bug in the option handling that prevented the help flag by itself from working (it works if you ask for help for a subcommand), but you can always run the version subcommand: dgraph version
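
To compare the exporting and importing versions without reading the banners by eye, the banner text can be parsed. A minimal sketch, assuming the "key : value" layout shown in the banners in this thread; the function name is hypothetical and the two banner strings are abbreviated copies of the ones posted here.

```python
def parse_version_banner(text):
    """Turn 'Key : value' banner lines into a dict, ignoring other lines."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() and value.strip():
            info[key.strip()] = value.strip()
    return info


prod_banner = """\
Dgraph version   : v1.0.11
Commit SHA-1     : b2a09c5b
Commit timestamp : 2018-12-17 09:50:56 -0800
"""
local_banner = """\
Dgraph version   : v1.0.14
Commit SHA-1     : 26cb2f94
Commit timestamp : 2019-04-12 13:21:56 -0700
"""

prod_info = parse_version_banner(prod_banner)
local_info = parse_version_banner(local_banner)
versions_match = prod_info["Dgraph version"] == local_info["Dgraph version"]
print(prod_info["Dgraph version"], local_info["Dgraph version"], versions_match)
```

The banner text would come from the output of dgraph version on each machine.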


(Javier Alvarado) #15

A couple of things that may help:

  • What version of dgraph are you using to do the export? Is it v1.0.14 as well? You can get the version from the dgraph version command or the log files, either with the journalctl command or by looking at log files directly.

  • Can you share what the exported schema looks like? Are the missing attributes (active, type, …) listed there? Do they have the expected types?

  • Do the missing attributes all have some data in the production database? Or are any of them defined in the schema but unused?
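
The last two questions can be checked with a small script. This is a sketch, assuming the exported schema uses one "name:type ..." declaration per line; the function name is hypothetical, and the set of predicates found in the data is made up here (it would normally be collected from the RDF export).

```python
def schema_predicates(schema_text):
    """Map predicate name -> declared type from exported schema lines."""
    declared = {}
    for line in schema_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a 'name:type' declaration
        tokens = rest.split()
        declared[name.strip().strip("<>")] = tokens[0] if tokens else ""
    return declared


schema_sample = """\
type:string @index(exact) .
email:string @index(exact) .
active:bool .
can_read:default .
username:string @index(exact) .
"""
# Made-up stand-in for the predicates actually present in the RDF export.
data_predicates = {"type", "email", "active", "username"}

declared = schema_predicates(schema_sample)
unused = sorted(set(declared) - data_predicates)      # in schema, no data
undeclared = sorted(data_predicates - set(declared))  # data, not in schema

print("declared but unused:", unused)
print("used but undeclared:", undeclared)
```

Predicates that are declared but unused are expected to be absent after a reload; predicates that have data but go missing point to the load step.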


#16

Hi @javier - thanks for responding.

dgraph version did the trick - it looks like I’m running version 1.0.11.

Dgraph version   : v1.0.11
Commit SHA-1     : b2a09c5b
Commit timestamp : 2018-12-17 09:50:56 -0800
Branch           : HEAD
Go version       : go1.11.1

I’ve double checked my container version and it is indeed 1.0.14

Dgraph version   : v1.0.14
Commit SHA-1     : 26cb2f94
Commit timestamp : 2019-04-12 13:21:56 -0700
Branch           : HEAD
Go version       : go1.11.5

They aren’t worlds apart, so should that be causing a massive issue?

I can share the exported schema, although I have had to remove a couple of the attributes for confidentiality reasons. As you can see the missing types are there:

name:string @index(exact) . 
type:string @index(exact) . 
email:string @index(exact) . 
active:bool . 
mobile:string @index(exact) . 
can_read:default . 
password:default . 
username:string @index(exact) . 
member_of:uid . 
can_access:uid . 
dgraph.xid:string @index(exact) . 
can_view_shares:uid . 
dgraph.password:password . 
dgraph.group.acl:string . 
dgraph.user.group:uid @reverse . 

All of the attributes I’m trying to query are present on all of the nodes, in the production database. I do, however, have some unused attributes that I was using and now no longer need.

I also looked into the exported RDF file - and I can see the types and values in that file. There are no obvious errors in the RDF export.

Taking this a step further - I took the whole RDF export and ran a set command from the Ratel UI and that seems to work absolutely fine.

So it does seem that it’s the loader that isn’t working, rather than the export.

Regarding moving the p directory after the bulk load: I did actually try this. I moved the p directory from the out directory into the one my Docker container uses, overwriting the existing files, but it didn’t make a difference.

Hopefully this gives you enough information to help you help me. If not - just let me know what else I can provide/try and I will.

While I wait for your feedback I will try the scripts @MichelDiz suggested and see if that works.


(Javier Alvarado) #17

Your load command only specifies the RDF file. Does it make a difference if you specify the schema file with -s /dgraph/dgraph.r40015.u0415.0938/g01.schema.gz as well?
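
Put together, the live load with the schema file would look roughly like the command built below. This is a sketch that only assembles and prints the command line; it reuses the flags, paths, addresses and container name already posted in this thread, and the function name is hypothetical.

```python
import shlex


def live_load_command(rdf_path, schema_path, zero_addr, alpha_addr, container=None):
    """Build a dgraph live command that passes both the RDF and schema files."""
    cmd = [
        "dgraph", "live",
        "-r", rdf_path,
        "-s", schema_path,
        f"--zero={zero_addr}",
        f"--dgraph={alpha_addr}",
        "-c", "1",
    ]
    if container:
        # Run inside the named container, as in the original post.
        cmd = ["docker", "exec", "-it", container] + cmd
    return cmd


export_dir = "/dgraph/dgraph.r40015.u0415.0938"
command = live_load_command(
    f"{export_dir}/g01.rdf.gz",
    f"{export_dir}/g01.schema.gz",
    "zero:5280",
    "localhost:9280",
    container="dgraph_alpha",
)
print(" ".join(shlex.quote(part) for part in command))
```

The printed line can be pasted into a terminal, which keeps the interactive docker exec flags working as in the original command.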

Hm. Did you restart the dgraph alpha process after replacing p?