Export data and import into a remote server

Hi there, I’m a newbie to Dgraph.
I want to export Dgraph data from my laptop and then import the data into a remote server.

Here are my steps:

  1. On my laptop, run this command: curl localhost:8080/admin/export
  2. Then copy the 3 exported files (g01.schema.gz, g01.rdf.gz, g01.gql_schema.gz) to the remote server.
  3. On the remote server, delete the out, p, w, and zw directories.
  4. On the remote server, run: ./dgraph bulk -f /pathto/g01.rdf.gz -s /pathto/g01.schema.gz --map_shards=4 --reduce_shards=2 --http localhost:8008 --zero=localhost:5080
  5. Then I got the output below:

{
	"DataFiles": "/pathto/g01.rdf.gz",
	"DataFormat": "",
	"SchemaFile": "/pathto/g01.schema.gz",
	"GqlSchemaFile": "",
	"OutDir": "./out",
	"ReplaceOutDir": false,
	"TmpDir": "tmp",
	"NumGoroutines": 3,
	"MapBufSize": 67108864,
	"SkipMapPhase": false,
	"CleanupTmp": true,
	"NumReducers": 1,
	"Version": false,
	"StoreXids": false,
	"ZeroAddr": "localhost:5080",
	"HttpAddr": "localhost:8008",
	"IgnoreErrors": false,
	"CustomTokenizers": "",
	"NewUids": false,
	"Encrypted": false,
	"MapShards": 4,
	"ReduceShards": 2,
	"BadgerKeyFile": "",
	"BadgerCompressionLevel": 1
}

The bulk loader needs to open many files at once. This number depends on the size of the data set loaded, the map file output size, and the level of indexing. 100,000 is adequate for most data set sizes. See `man ulimit` for details of how to change the limit.
Current max open files limit: 1024

Connecting to zero at localhost:5080
Predicate "dgraph.type" already exists in schema
Predicate "dgraph.graphql.xid" already exists in schema
Predicate "dgraph.graphql.schema" already exists in schema
Processing file (1 out of 1): /home/zhouj/dgraph_data/dgraph_data/g01.rdf.gz
Shard tmp/map_output/000 -> Reduce tmp/shards/shard_0/000
Shard tmp/map_output/003 -> Reduce tmp/shards/shard_1/003
Shard tmp/map_output/002 -> Reduce tmp/shards/shard_1/002
Shard tmp/map_output/001 -> Reduce tmp/shards/shard_1/001
Num CPUs: 12
[16:04:20+0800] REDUCE 01s 79.26% edge_count:1.789k edge_speed:1.789k/sec plist_count:660.0 plist_speed:660.0/sec. Num Encoding: 0
Num CPUs: 12
[16:04:21+0800] REDUCE 02s 100.00% edge_count:2.257k edge_speed:2.257k/sec plist_count:1.014k plist_speed:1.014k/sec. Num Encoding: 0
[16:04:22+0800] REDUCE 03s 100.00% edge_count:2.257k edge_speed:1.128k/sec plist_count:1.014k plist_speed:506.9/sec. Num Encoding: 0
[16:04:23+0800] REDUCE 04s 100.00% edge_count:2.257k edge_speed:752.1/sec plist_count:1.014k plist_speed:337.9/sec. Num Encoding: 0
[16:04:24+0800] REDUCE 04s 100.00% edge_count:2.257k edge_speed:620.6/sec plist_count:1.014k plist_speed:278.8/sec. Num Encoding: 0
Total: 04s
  6. Finally, I can’t get any results from a query that works well in the Dgraph on my laptop.

So, I want to know whether I imported the data into the remote server’s Dgraph successfully. If not, how do I deal with this problem? If so, why can’t I get any results?

I’d appreciate it if anyone could help me figure out this problem. :smile:

Hi @DaToo-J, welcome to the community, and thanks for reaching out to us.

From the logs, it looks like the bulk loader completed successfully.

Are you getting an error, or is it returning an empty response?
The bulk loader would have created an out directory containing two subdirectories, one for each of the two reduce shards (groups). Can you show the commands you are running to spin up the Alphas?
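
For reference, with --reduce_shards=2 the bulk output should look roughly like this (illustrative layout):

out/0/p
out/1/p

Each p directory is the posting-list store for one group, and each Alpha group has to be started on top of its own shard’s p directory; an Alpha started in a directory without one of these will come up empty.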

Yes, after querying the remote Dgraph, I got an empty list and no error.
The command that runs the Alpha: ./dgraph alpha --lru_mb 1024

After a while, I tried the live command, and it works as I expected.
So could you tell me the differences between bulk and live?
And which is better for development? :blush:

The live loader and the bulk loader are both used for data loading. The major difference between them is speed: the bulk loader is fast, while the live loader is slower. The advantage of the live loader is that it can load into a running cluster, whereas the bulk loader must run before the cluster is started, with only the Zero node running.
The bulk loader is suggested for the initial import of large data sets.
You can have a look at the docs on data loading for some more details.
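
For example, a live loader run against a running cluster looks something like this (a sketch, assuming default ports; adjust the paths and addresses to your setup):

./dgraph live -f /pathto/g01.rdf.gz -s /pathto/g01.schema.gz -a localhost:9080 -z localhost:5080

Here -a points at the Alpha’s gRPC endpoint and -z at the Zero; the data goes in through normal mutations, which is why it works on an already-running cluster.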

I think that should clarify how the two are used differently. Kindly note the structure of the out directory after running the bulk loader and how to use it. In case of any doubts, feel free to ask follow-up questions.
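
For instance, with a single group the usual pattern is roughly this (a sketch, assuming --reduce_shards=1 and default paths, not verified against your exact version):

cp -r out/0/p .               # place the bulk output's p directory next to the alpha binary
./dgraph alpha --lru_mb 1024  # the alpha now serves the bulk-loaded data

With --reduce_shards=2 as in your run, each of the two groups would get its own shard’s p directory instead.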

:ok: Thanks very much for your guidance.
