Lint

2026-05-28 14:27:52 +01:00 · 2026-05-28 14:27:52 +01:00 · f5f017b01f
commit f5f017b01f
parent d83691323f
14 changed files with 103 additions and 46 deletions
--- a/src/content/posts/backup-container-btrfs-borg.md
+++ b/src/content/posts/backup-container-btrfs-borg.md
@@ -23,12 +23,12 @@ links:
 **The short version:**

 - One Alpine container, ~75 lines of Bash, that snapshots a BTRFS volume and pushes the snapshot to one or more [Borg](https://borgbackup.readthedocs.io/) repositories on a fixed interval. The snapshot is the only thing standing between "consistent backup" and "corrupt database in the archive."
- Multi-target via numeric env vars (`BORG_REPO_0`, `BORG_REPO_1`, ...). The wrapper iterates until the next index isn't set. No config format, no DSL — the env file is the configuration.
+- Multi-target via numeric env vars (`BORG_REPO_0`, `BORG_REPO_1`, ...). The wrapper iterates until the next index isn't set. No config format, no DSL; the env file is the configuration.
 - Two years of self-hosting, multiple restored incidents, zero data loss I noticed.

 ## The problem the snapshot solves

-I self-host several databases that are mid-write at every moment of the day. `tar | borg create` against the live volume is a race: a Postgres or SQLite file that's half-written when borg reads it goes into the archive in a state nothing on Earth can replay. The "right" answer is to coordinate a quiesce with every database — a fan-out of `pg_dump`, SQLite `.backup`, Redis `BGSAVE`, and so on, all with retry, timeouts, and per-app credentials.
+I self-host several databases that are mid-write at every moment of the day. `tar | borg create` against the live volume is a race: a Postgres or SQLite file that's half-written when borg reads it goes into the archive in a state nothing on Earth can replay. The "right" answer is to coordinate a quiesce with every database: a fan-out of `pg_dump`, SQLite `.backup`, Redis `BGSAVE`, and so on, all with retry, timeouts, and per-app credentials.

 The cheaper answer, if you've put everything on one BTRFS volume, is `btrfs subvolume snapshot`. It returns instantly with a copy-on-write fork of the entire filesystem. Every file is now atomically consistent at exactly the same instant. Run borg against the snapshot, not against the live volume.

@@ -59,7 +59,7 @@ BORG_REPO_1=/local-backup

 There's also a no-index fallback (`BORG_REPO=...` with no number) for the single-target case. Same script, no extra config plane.

-I keep coming back to this pattern for small-system orchestration. The env file *is* the data structure. There's no YAML parsing, no JSON schema, no config-validation layer between you and the variable that actually matters.
+I keep coming back to this pattern for small-system orchestration. The env file _is_ the data structure. There's no YAML parsing, no JSON schema, no config-validation layer between you and the variable that actually matters.

 ## The scheduler is a sleep, not cron

@@ -79,7 +79,7 @@ A comment in the file says it out loud: "Using a simple sleep loop to schedule b
 Two subtleties worth naming:

 - **First-boot grace period.** If `backup_completion_time.log` doesn't exist yet (fresh container, first backup still running), fall back to `container_start_time.log` so the container isn't reported unhealthy during the first scheduled run.
- **Partial success is not success.** In multi-target mode, the completion log is only written if *every* target succeeded. One repo failing means the healthcheck stays red even if the other two are fine. Stale-but-quiet was the failure mode I wanted to make impossible.
+- **Partial success is not success.** In multi-target mode, the completion log is only written if _every_ target succeeded. One repo failing means the healthcheck stays red even if the other two are fine. Stale-but-quiet was the failure mode I wanted to make impossible.

 ## Smaller calls

@@ -90,7 +90,7 @@ Two subtleties worth naming:
 - **`--files-cache=ctime,size,inode`.** The default `mtime,size,inode` re-hashes files when their mtime changes; on BTRFS, ctime is the more honest signal of "this content actually changed."
 - **`compression=zstd,12`.** The sweet spot for backup data on my hardware: substantially better than zlib, not so slow it dominates the run.
 - **`borg compact --threshold=5 --cleanup-commits`.** Reclaims space from pruned archives whenever the segment-file fragmentation crosses 5%.
- **`IGNORE_GIT_UNTRACKED=true`.** Optional. Walks every `.git` dir under the snapshot, runs `git ls-files --others --exclude-standard`, and feeds the result into `--exclude-from`. Skips `target/`, `node_modules/`, build caches — anything the repo already knows isn't worth keeping.
+- **`IGNORE_GIT_UNTRACKED=true`.** Optional. Walks every `.git` dir under the snapshot, runs `git ls-files --others --exclude-standard`, and feeds the result into `--exclude-from`. Skips `target/`, `node_modules/`, build caches; anything the repo already knows isn't worth keeping.
 - **`SYS_ADMIN` capability on the container.** Needed for `btrfs subvolume snapshot` and `delete` from inside the namespace. The narrower capability set didn't have a way through.

 ## What I'd change