Report a Dgraph Bug
What version of Dgraph are you using?
Dgraph Version
$ dgraph version
Dgraph version : v21.12.0
Dgraph codename : zion
Dgraph SHA-256 : 078c75df9fa1057447c8c8afc10ea57cb0a29dfb22f9e61d8c334882b4b4eb37
Commit SHA-1 : d62ed5f15
Commit timestamp : 2021-12-02 21:20:09 +0530
Branch : HEAD
Go version : go1.17.3
jemalloc enabled : true
For Dgraph official documentation, visit https://dgraph.io/docs.
For discussions about Dgraph , visit http://discuss.dgraph.io.
For fully-managed Dgraph Cloud , visit https://dgraph.io/cloud.
Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2021 Dgraph Labs, Inc.
Have you tried reproducing the issue with the latest release?
What is the hardware spec (RAM, OS)?
Steps to reproduce the issue (command/config used to run Dgraph).
1. Make JSON files with pre-assigned UIDs (we use LevelDB to manage UIDs and avoid duplicates).
2. Bulk-load 10B records of two types (Account, Transaction). Schema: an Account has edges to multiple Transactions ([uid]), and a Transaction has an edge to an Account (uid).
3. Live-load 50M new records per day (daily batch).
4. Query every day.
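The steps above can be sketched with the standard Dgraph CLI. The file names, ports, and schema path below are illustrative placeholders, not our exact invocation:

```sh
# Start Zero first (cluster coordination and UID leasing).
dgraph zero --my=localhost:5080

# Step 2: bulk-load the historical data while only Zero is running.
# data.json and schema.txt are placeholder names for our generated files.
dgraph bulk -f data.json -s schema.txt --zero=localhost:5080

# Copy the generated out/0/p directory into Alpha's p directory, then start Alpha.
dgraph alpha --my=localhost:7080 --zero=localhost:5080

# Step 3: daily batch, live-loading one day's file into the running Alpha.
dgraph live -f 2022-05-18.json --alpha=localhost:9080 --zero=localhost:5080
```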
Expected behaviour and actual result.
Expected Behavior
{
  "group": "corp",
  "address": "testAddress01",
  "rcv": [
    { "amtout": 10, "time": "2021-12-01" },
    { "amtout": 50, "time": "2022-05-18" }
  ]
}
Actual Behaviour (after live-loading only 2022-05-18.json)
{
  "group": "corp",
  "address": "testAddress01",
  "rcv": [
    { "amtout": 50, "time": "2022-05-18" }
  ]
}
Experience Report for Feature Request
Note: Feature requests are judged based on user experience and modeled on Go Experience Reports. These reports should focus on the problems: they should not focus on and need not propose solutions.
What you wanted to do
We bulk-load the past year's worth of data, live-load new data in a daily batch operation, and then run queries the next day.
What you actually did
We generate all the data as JSON files. To prevent duplicate UIDs per address, we write the previously assigned UID for each address into the JSON file (the UIDs were allocated with the assign API after starting Zero).
With only Zero running, we load one year's worth of data via bulk load, then start Alpha and load the previous day's data via live load.
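For reference, a record in our generated files carries its previously assigned UID explicitly. The predicate names mirror the query output above; the specific UID values here are made-up placeholders:

```json
{
  "uid": "0x2a",
  "dgraph.type": "Account",
  "group": "corp",
  "address": "testAddress01",
  "rcv": [
    { "uid": "0x3b", "dgraph.type": "Transaction", "amtout": 10, "time": "2021-12-01" }
  ]
}
```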
Why that wasn’t great, with examples
If we send a query after this process, querying the Transaction UIDs (the edge) connected to an Account type returns only the UIDs ingested by live load. The problem is that the one year's worth of data ingested by bulk load cannot be seen at all.
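A query of the shape we run looks roughly like this (predicate names are taken from the output above; filtering by `eq(address, ...)` is an assumption about our actual query, which may differ):

```
{
  accounts(func: eq(address, "testAddress01")) {
    group
    address
    rcv {
      amtout
      time
    }
  }
}
```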
Did we do something wrong?
The strange thing is that, looking at the order of the SST files created in the p directory, the bulk-loaded data occupies SST files 1-8000 while the live-loaded data occupies SST files 16000-16100, so the two do not appear to be connected. Is there any solution to this?