I'm currently testing out Dgraph for a small project around customer data. Everything is fine when loading small amounts of data:
e.g. one table has 19k rows (~38k nodes) and loads fine
another table has 52k rows (~52k nodes) and loads fine
I noticed, however, that when loading slightly larger tables, alpha crashes at around 180k rows and doesn't recover. I have to start alpha again and restart my data load. I tried chunking the load into 10k-row batches, but it still crashes at roughly the same point, ~180k rows (~180k nodes). The whole table has about 1.8M rows, so this isn't even scratching the surface, and I have far bigger tables to load (~300M rows in total). I'm not sure what's causing the crash. Scanning stdout, I see lines mentioning a mem flush around the time it crashes.
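For context, this is roughly what my loader does, heavily simplified. I'm showing it with pydgraph as a stand-in for my actual loader code, and the predicate names below are made up, but the shape is the same: batches of rows committed as JSON mutations.

```python
import pydgraph

def load_in_chunks(rows, chunk_size=10_000):
    # single alpha, default gRPC port
    stub = pydgraph.DgraphClientStub("localhost:9080")
    client = pydgraph.DgraphClient(stub)
    try:
        for start in range(0, len(rows), chunk_size):
            chunk = rows[start:start + chunk_size]
            txn = client.txn()
            try:
                # one JSON mutation per chunk, committed immediately
                txn.mutate(set_obj=chunk, commit_now=True)
            finally:
                txn.discard()
            print(f"committed rows {start}..{start + len(chunk)}")
    finally:
        stub.close()

# rows look roughly like:
# [{"uid": "_:r1", "customer.id": "123", "customer.name": "Alice"}, ...]
```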
Is this due to something small in my configuration? I'm running a single Zero / single Alpha on Windows (yes, I know, why am I on Windows: existing servers, and company devops is stingy and won't give me a few Linux instances). I forgot to capture the error message; I'll be sure to capture it next time.
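In case it matters, this is roughly how I start them, pretty much with defaults (paths trimmed, and the lru_mb value is just something I picked rather than tuned):

```
dgraph zero --my=localhost:5080
dgraph alpha --my=localhost:7080 --zero=localhost:5080 --lru_mb=2048
```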
After I restart alpha, I get random errors like "applying proposal. error cannot retrieve posting for uid <> from list with key" scattered throughout the stdout. Is this something to be worried about?
I kind of feel you will ask me to load it via the offline (bulk) loader, but we regularly get at least 200k new rows of data per day. It would not be feasible for me to stop and restart Dgraph every time just to use the offline loader.
Edit: I forgot to mention I'm on v20.03.0.