Upgrades are so unergonomic

I know that a lot is gained from being able to say that two major versions of Dgraph are not going to be storage compatible, but the upgrade story for massive systems is so unergonomic.

[23:10:15Z] REDUCE 09h12m18s 100.00% edge_count:26.36G edge_speed:1.918M/sec plist_count:14.62G plist_speed:1.064M/sec. Num Encoding MBs: 0. jemalloc: 0 B
Total: 09h12m18s

Exporting everything, bulk loading everything in a single process, and then manually copying TBs of Badger SST files across to the other peers is a drag. If bulk loading worked like the backup/restore process and just took longer, that would be far more palatable, and it would probably be a lot quicker to compute and load from many peers (group leaders, probably) instead of one process only.

Just did it, fresh in my mind - just wanted to vent.

I think maybe the worst part of it is that it is not automated. Like, the bulk loader will read from the same location your exports were sent to (as long as it's not GCS, but I can look past that until it's fixed). But it won't even do that for the schema files, which are dropped one per group during export. So you have to download them all and concatenate them together manually before it will work… ok, fine.
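
Until that step is automated, the concatenation itself is trivial to script. Here is a minimal sketch in Go, assuming the export wrote per-group schema files named like g01.schema.gz, g02.schema.gz into one directory (the file names and paths are assumptions about your export layout). Since gzip allows multiple members in one file, byte-concatenating them should yield a single schema file; gunzip and re-compress if your tooling is picky about multi-member streams:

```go
// concat-schema.go: merge per-group schema exports into one file for the bulk loader.
package main

import (
	"io"
	"log"
	"os"
	"path/filepath"
)

func main() {
	// Assumed layout: all g*.schema.gz files downloaded into ./export.
	files, err := filepath.Glob("export/*.schema.gz")
	if err != nil || len(files) == 0 {
		log.Fatalf("no schema files found: %v", err)
	}

	out, err := os.Create("schema.gz")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	// gzip files remain valid when concatenated as multiple members,
	// so appending the raw bytes is enough.
	for _, f := range files {
		in, err := os.Open(f)
		if err != nil {
			log.Fatal(err)
		}
		if _, err := io.Copy(out, in); err != nil {
			log.Fatal(err)
		}
		in.Close()
		log.Printf("appended %s", f)
	}
}
```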

Then you have a 10h wait while it processes the 26 billion things in your export. Also fine, it's a lot of things.

But here is the second crazy part - let's say, like me, you have 5 groups. The bulk loader leaves you with 5 directories of ~4k files each, and you have to copy them across the network to the right places. This is especially stupid in Kubernetes, where I have been installing netcat on the init containers of each alpha so I can yolo copy files between running init containers in pods… bad.
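
A slightly less yolo version of the same shuffle, as a sketch: stream each bulk-loader output directory into the matching alpha pod with tar piped through kubectl exec, instead of netcat between init containers. The pod names, the ordinal-to-group mapping, the paths, and the assumption that tar exists in the alpha image are all guesses about a particular setup:

```go
// copy-p-dirs.go: push each bulk-loader output group into its alpha pod
// by piping tar through `kubectl exec` (no netcat needed).
package main

import (
	"fmt"
	"log"
	"os/exec"
)

func main() {
	const groups = 5 // out/0 .. out/4, one directory per group

	for g := 0; g < groups; g++ {
		src := fmt.Sprintf("out/%d", g)          // contains the p directory for group g+1
		pod := fmt.Sprintf("dgraph-alpha-%d", g) // guess: pod ordinal == group number - 1

		// Pack the local p directory and unpack it in the pod's data dir.
		pack := exec.Command("tar", "czf", "-", "-C", src, "p")
		unpack := exec.Command("kubectl", "exec", "-i", pod, "--",
			"tar", "xzf", "-", "-C", "/dgraph") // assumes /dgraph is the alpha data dir and tar is present

		pipe, err := pack.StdoutPipe()
		if err != nil {
			log.Fatal(err)
		}
		unpack.Stdin = pipe

		if err := unpack.Start(); err != nil {
			log.Fatal(err)
		}
		if err := pack.Run(); err != nil {
			log.Fatal(err)
		}
		if err := unpack.Wait(); err != nil {
			log.Fatal(err)
		}
		log.Printf("copied %s into %s", src, pod)
	}
}
```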

I suggest a different story for import:

  • export from the old cluster, just as it works today
  • the new cluster comes up fresh. You hit an admin GraphQL mutation called import, with the same signature as restore (see the sketch after this list)
  • the cluster sets all peers to draining and drops all of its storage. Then the leaders of the groups in the new system reach out to the export location and check whether their group number has an export file.
    • what if the number of groups grows between export and import? Maybe it does a 1:1 mapping and then rebalances to the new groups. Or maybe it does nothing special during load and auto-balances later.
    • what if the number of groups contracts between export and import? Maybe it just assigns two export files to one group, or something like that.
  • after the leaders have loaded their export files and schema just as they were dumped, they handle sending snapshots to the non-leaders in their group.
  • after all peers have applied up to the same Raft index, the system is ready to go - no other interaction needed, except maybe manually removing the draining state (to match the restore workflow)
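
From the client side, the whole upgrade could then be a single call, just like restore. A minimal sketch of what hitting that mutation might look like - the import mutation name, its input fields, the host, and the location are all hypothetical, made up to illustrate the shape, not an existing Dgraph API (only the /admin GraphQL endpoint itself exists today):

```go
// import.go: what kicking off the proposed import could look like.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Hypothetical mutation: point the fresh cluster at the export location
	// and let the group leaders pull their own files.
	query := `mutation {
	  import(input: {location: "s3://my-bucket/dgraph-export/2021-11-02"}) {
	    code
	    message
	  }
	}`

	body, _ := json.Marshal(map[string]string{"query": query})
	resp, err := http.Post("http://dgraph-alpha:8080/admin", "application/json", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var out map[string]interface{}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		log.Fatal(err)
	}
	fmt.Println(out)
}
```

Everything after that one call - draining, dropping storage, leaders pulling their group's files, snapshotting to followers - would happen inside the cluster, with no manual file shuffling at all.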

Thanks for the awesome information.