Backups and Storage
The ideal rule of backups is “3, 2, 1”: 3 copies, on 2 different mediums, with 1 off site.
It requires quite a bit of work to make this happen for everything you care about.
This page describes partially what I do and partially what I recommend. As usual for garden pages, it is incomplete and I’ll do my best to add to it over time.
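The 3-2-1 rule above is simple enough to check mechanically. Here is a minimal sketch, assuming you describe each copy by hand; the `Copy` class, the medium labels, and the example inventory are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str      # hypothetical label, e.g. "laptop"
    medium: str    # kind of storage, e.g. "internal-ssd", "nas", "cloud"
    offsite: bool  # True if stored away from the primary location


def satisfies_321(copies: list[Copy]) -> bool:
    """Return True if there are >= 3 copies, on >= 2 mediums, with >= 1 off site."""
    enough_copies = len(copies) >= 3
    enough_mediums = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_mediums and has_offsite


# Example inventory (assumed, not a recommendation of specific hardware).
copies = [
    Copy("laptop", "internal-ssd", offsite=False),
    Copy("NAS", "nas", offsite=False),
    Copy("cloud bucket", "cloud", offsite=True),
]
print(satisfies_321(copies))  # True
```

Dropping any one of the three entries makes the check fail, which is roughly the point of the rule: every copy is doing a job.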
Computer Backup
In an ideal world your laptop or desktop would not contain anything of importance that could not be easily restored. You could delete it, and your Full Disk Encryption keys[1], at will and nothing would be worse except maybe needing to re-download some things from remote servers.
This is, for the most part, how iPhones work when combined with iCloud, and it’s pretty awesome (minus the privacy implications of iCloud).
Still, it makes sense to have things around locally on disk to work on them, and while you’re working on them you may have the only copy of that version of the data for a little while. Plus, while it might not “matter” if your laptop is wiped or destroyed, restoring from a recent backup is much easier than setting everything back up again as it was, even if no important information was lost. Even if you use Dropbox or something similar to back up all your files, what about all the configuration of your computer? Wifi passwords, browser sessions, browser tabs, and more… It’s not critical information and it could easily be restored or recreated, but backing it up is easier to deal with.
File Backups
The first backup to set up is File Backups. I currently use Arq on my macOS computers to create hourly backups into a cloud storage system (Backblaze’s B2) and my NAS (via MinIO).
You have other options too, including borg and restic, which provide basically the same thing but are a little more advanced (they also work on things other than a Mac). Arq has a cloud service, but I prefer to handle things myself.
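For a feel of what the restic route looks like, here is a rough sketch that backs up the same folders to two repositories, one on B2 and one on an S3-compatible server on a NAS. This is not my Arq setup. The bucket name, the host `nas.local:9000`, and the backup path are placeholders, and it assumes both repositories were already created with `restic init`.

```python
import os
import subprocess

# Hypothetical repository locations; replace with your own bucket and host.
REPOSITORIES = [
    "b2:example-bucket:laptop",                 # Backblaze B2
    "s3:http://nas.local:9000/example-bucket",  # S3-compatible server on a NAS
]


def backup_command(repo: str, paths: list[str]) -> list[str]:
    """Build the restic invocation that backs up `paths` into `repo`."""
    return ["restic", "-r", repo, "backup", *paths]


def run_backups(paths: list[str], dry_run: bool = True) -> list[list[str]]:
    """Back up `paths` to every repository; with dry_run, only print the commands."""
    commands = [backup_command(repo, paths) for repo in REPOSITORIES]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # restic reads its secrets from the environment, e.g. RESTIC_PASSWORD,
            # B2_ACCOUNT_ID / B2_ACCOUNT_KEY, AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY.
            subprocess.run(cmd, check=True, env=os.environ.copy())
    return commands


# Placeholder path; dry_run=True means nothing is executed here.
run_backups(["/Users/me/Documents"])
```

restic does not schedule itself, so the hourly part would come from cron, launchd, or whatever scheduler you already use.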
B2 is insanely cheap. I pay roughly $12/year to store and access hourly backups dating back to 2017 for several computers. If you already pay for Google Drive, Dropbox, or something similar, it is probably smarter to just use that to store your backups because 1) you’re probably not using all the space you have and 2) you’re already paying for it, so it doesn’t matter how cheap B2 is unless you’re running out of space in those services. On the other hand, you may consider cancelling those subscriptions and simply using B2, since the files will be backed up by Arq or similar.
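To sanity-check a number like $12/year against your own data, the arithmetic is just size times rate times twelve. A small sketch follows, where the per-GB rate is an assumed placeholder and not a quoted price (check the provider’s current pricing); it also ignores download and transaction fees.

```python
# Assumed storage rate in dollars per GB per month; NOT a quoted price.
ASSUMED_PRICE_PER_GB_MONTH = 0.006


def yearly_cost(stored_gb: float, price_per_gb_month: float) -> float:
    """Yearly storage cost in dollars for a constant amount of stored data."""
    return stored_gb * price_per_gb_month * 12


def gb_for_budget(yearly_budget: float, price_per_gb_month: float) -> float:
    """How many GB a yearly budget covers at the given rate."""
    return yearly_budget / (price_per_gb_month * 12)


# At the assumed rate, a $12/year budget covers roughly this much storage.
print(f"{gb_for_budget(12, ASSUMED_PRICE_PER_GB_MONTH):.0f} GB")  # 167 GB
```

Backup tools that deduplicate and compress help a lot here, since years of hourly snapshots mostly share the same data.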
Using apps like Arq, borg, or restic is really nice because they’re very simple designs. borg and restic are both open source and free (as in beer). While Arq is not free or open source, it is much easier to use and its design is very simple. Because they’re simple, there are fewer moving parts to fail or for me to misconfigure. All these programs can also encrypt your data locally, so even if someone hacks your Dropbox/etc. no one can read through your backups. Other services like Backblaze’s or CrashPlan’s unlimited backup services only make sense if you’re backing up insane amounts of data on a single computer. If it’s something less than actually 10TB of data then it may make sense to just use B2 directly instead. I recommend comparing pricing by visiting their websites.
A final note in this section: You should use B2 as your remote backup, but you should also use an external hard drive/SSD or a NAS with redundant storage (hopefully ZFS, but at least a NOT FULL Btrfs array) to store your backups locally. Every month or so I do a “full” backup of my systems including Every Single File, so in the worst case that I need the exact contents of this node_modules folder I can get this exact set of files back[2].
System Backups
The first backup for a macOS system should be done using Time Machine, for no other reason than it’s the thing macOS supports best.
It is getting increasingly difficult to do disk image backups of Macs with T1/T2 chips. I used to simply go into recovery mode and create a DMG of the whole disk.
git-annex
git-annex is an amazing tool for managing files, specifically large files, in a way that lets you use all of your many devices, cloud services, and external hard drives.
I’ve always wanted something where I could store a file in a kind-of-database that kept track of the files I had, made sure they were okay or valid, and made it easy to sync with things like S3 or whatever. git-annex does all that.
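To make the “made sure they were okay or valid” part concrete, here is a toy sketch of the underlying idea: record a checksum per file, then re-check later. This is not how git-annex is implemented, and the function names are invented for illustration. git-annex has its own commands for this kind of check, such as `git annex fsck`.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose content changed."""
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "photo.raw").write_bytes(b"original bytes")
    manifest = build_manifest(root)
    print(verify(root, manifest))  # [] because nothing changed yet
    (root / "photo.raw").write_bytes(b"bit rot")
    print(verify(root, manifest))  # ['photo.raw']
```

What git-annex adds on top of this idea is tracking which remotes hold each file and moving content between them, which is where most of the value is.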
[1] Meaning even if you only “deleted” things on the disk, the data would be unrecoverable, because FDE uses symmetric encryption often backed by a TPM (which you just, in theory, wiped).
[2] Less of an issue these days, but we had a service running on Node 0.11/0.12 in like 2019… some of the packages didn’t exist anymore. Some things do not compile anymore with up-to-date compilers. Brew doesn’t have some of the old versions of the libraries needed… it was literally the only way to run that old project on a new computer. Saved us a lot of time.