Allow snapborg to backup individual snapshots to multiple borg repos [6/6] #12
Let's make sure to merge this PR after #10 so that merge conflicts are avoided.
There's an issue with this PR that I'm unsure how to solve without your input. This new approach looks in the borg repo and identifies the snapshots that are currently backed up, to know whether any given snapshot was backed up. However, this won't work in fault-tolerant mode: if there's an error of some sort, there's a good chance that the borg repo is inaccessible, so snapborg can't check whether the snapshot was backed up or not. I have a couple of ideas to solve this but they don't feel very clean. I'm wondering if
Also, separate issue: I ran into an issue with
I'm no longer sure whether the benefit that fault-tolerant mode brings (being able to work with backup targets that are not permanently reachable while still more or less enforcing regular backups) should be implemented within snapborg. If you have a hard limit on the maximum time between backups, you could also use an external monitoring system. And if you don't, then all that fault-tolerant mode gives you is a warning that your backup is getting outdated.
Without fault-tolerant mode, it could be interesting to define how snapborg should behave if one borg repository is reachable whereas another one is not. Should it be a hard error if any repository fails? Or should it be possible to define "mandatory" and "optional" repositories?
Another option would be to store, alongside each snapper snapshot in its userdata, a list of the repositories it has already been transferred to. This could lead to synchronization issues (snapborg backup comments have to be kept in sync with snapper snapshot userdata) but might work quite well. I also think fault-tolerant mode could be kept this way.
In general I like the idea of creating a UUID for each snapper snapshot because this provides an exact one-to-many mapping from snapshots to borg backups (using the snapshot ID as I suggested in #6 would lead to issues if multiple snapper configs, possibly on different machines, back up to the same borg repository).
Cool :-) But let's see what the options are before ditching it. I don't like the idea of forcing an external monitoring system on people if this is a common use case (which, for the time being, it is 😉).
According to the principle of least surprise, there should never be any surprises - with a backup system, the biggest surprise someone can have is attempting to restore from a backup that they discover doesn't exist (even if the backup exists elsewhere). Following that train of thought, I think the best option in the case of failure would be to keep backing up to the other repos as much as is feasible, but to treat a failure to back up to any repo as a hard failure. Whoever set the system up should investigate why the failure occurred and take steps to mitigate it - that means alerting them properly.
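The policy described above (try every repo, then report a hard failure if any of them failed) could be sketched roughly as follows. This is only an illustration, not snapborg's actual code: the names `backup_to_all`, `BackupFailed` and the `backup_one` callback are hypothetical.

```python
from __future__ import annotations

from typing import Callable, Iterable


class BackupFailed(Exception):
    """Raised once every repo has been attempted and at least one failed."""

    def __init__(self, failures: dict[str, Exception]):
        self.failures = failures
        names = ", ".join(sorted(failures))
        super().__init__(f"backup failed for repo(s): {names}")


def backup_to_all(repos: Iterable[str],
                  backup_one: Callable[[str], None]) -> list[str]:
    """Back up to every repo; fail hard afterwards if any repo failed.

    `backup_one` is a hypothetical callback that performs the backup to a
    single repo and raises an exception on error.
    """
    succeeded: list[str] = []
    failures: dict[str, Exception] = {}
    for repo in repos:
        try:
            backup_one(repo)
        except Exception as exc:
            # Remember the failure, but keep going with the remaining repos.
            failures[repo] = exc
        else:
            succeeded.append(repo)
    if failures:
        # Surface the problem so that whoever set the system up is alerted.
        raise BackupFailed(failures)
    return succeeded
```

With this shape, one unreachable repo does not stop the others from being updated, yet the run as a whole still ends in an error that a timer or cron wrapper can report.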
This could work without the sync issues you're afraid of, if we agree on the convention that the source of truth is the borg repo: effectively, the snapper data for fault tolerance would only be looked at if the borg repo were inaccessible. However, I'm worried about referring to individual borg repos in the snapper userdata: if the repo was moved, or the remote repo changed IP address, that could cause failures.
Alternatively, ditching fault-tolerant mode: there's a need to set up an alerting system. I've never used this but it looks like it could be a good replacement. Assuming it works of course 😄
Coming back to the config file format for a second: I'm afraid I'm going to have to change it in this PR, simply to list multiple repos in the same entry. I'll do my best to make this backwards-compatible, but I can't promise anything until I've done it.
I will start making changes to the config file when I get the chance, and I'll leave fault-tolerant mode alone for the time being. If you decide that there's a good replacement (such as systemd-alert linked above, or something else), then fault-tolerant mode can be removed. Otherwise, we'll have to fix it so that it works correctly even if the borg repo is unavailable.
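To make the "borg repo is the source of truth" convention concrete, here is a minimal sketch of the lookup order. Everything in it is assumed for illustration: `RepoUnavailable`, the `list_repo_ids` callback, and the idea that the snapper userdata yields a set of repo identifiers are not taken from snapborg.

```python
from __future__ import annotations

from typing import Callable


class RepoUnavailable(Exception):
    """Hypothetical error raised when a borg repo cannot be reached."""


def is_backed_up(snapborg_id: str,
                 repo: str,
                 list_repo_ids: Callable[[str], set[str]],
                 userdata_repos: set[str]) -> bool:
    """Decide whether a snapshot counts as backed up to `repo`.

    The borg repo is consulted first and always wins. The repos recorded in
    the snapper userdata (`userdata_repos`) are only a fallback for the case
    where the repo is inaccessible.
    """
    try:
        return snapborg_id in list_repo_ids(repo)
    except RepoUnavailable:
        return repo in userdata_repos
```

Note that the fallback still compares repo identifiers, so it inherits the concern raised above: if a repo is moved or its address changes, the recorded identifier no longer matches.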
This PR solves #6
The mechanism used is the "snapborg id": a UUID is generated for each snapshot, and each borg backup of that snapshot gets the same UUID as its comment.
To check that a snapshot is backed up, check the uuids in the borg repo and in the snapper userdata. A backed-up snapshot means its uuid appears in the borg repo being looked at.
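The check described above could look roughly like the sketch below. It makes two assumptions that are not confirmed by this PR: that the output of `borg list --json --format "{comment}"` has a top-level "archives" list whose entries carry a "comment" key, and that the UUID is stored in the snapper userdata under a key named `snapborg_id`. The function names are hypothetical.

```python
from __future__ import annotations

import json
import uuid


def new_snapborg_id() -> str:
    """Generate a fresh UUID to attach to a snapshot and its borg backups."""
    return str(uuid.uuid4())


def backed_up_ids(borg_list_json: str) -> set[str]:
    """Collect the non-empty archive comments from `borg list --json` output.

    Assumes a top-level "archives" list with a "comment" key per entry.
    """
    data = json.loads(borg_list_json)
    return {
        archive["comment"]
        for archive in data.get("archives", [])
        if archive.get("comment")
    }


def snapshot_is_backed_up(userdata: dict[str, str], repo_ids: set[str]) -> bool:
    """True if the snapshot's UUID appears among the repo's comments.

    The userdata key "snapborg_id" is an assumed name.
    """
    snapborg_id = userdata.get("snapborg_id")
    return snapborg_id is not None and snapborg_id in repo_ids
```

Because the comparison is done per repo, the same snapshot can be reported as backed up in one repo and not yet backed up in another.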
Notes:
"Write the snapper snapshot ID in the borg backup metadata (name) and on each execution of snapborg run borg list beforehand to determine which snapshots have already been backed up"
I considered doing this, but the advantage of using comments is that the code is a lot cleaner and less prone to parsing errors. However, it also means that one must use a command such as
sudo borg list --json --format "{comment}" $REPO
to see the comment. Because of this, it's also harder for users to know which snapshots are backed up and which ones aren't.
snapborg_backup=true metadata. Borg will deduplicate the data and save space, but there will be a one-time performance hit: the first post-upgrade snapborg run will probably take a long time as it will be backing up all snapshots that fit within the retention policy.