Some software you just use. Other software you need to trust. Trust is required when there is no tight control loop.

With normal software, you are constantly giving commands and observing effects. It might destroy your work, but it’s very likely you would notice that happening quickly. There are so many opportunities for you to notice, so many different commands you might give. You don’t have to see it behaving catastrophically to notice that something is wrong. For example, I am typing this in Emacs. I type C-x C-s to save, and the “unsaved” indicator disappears. Perhaps one day the indicator doesn’t disappear. I would copy my buffer into some other application for safety, and then figure out what’s wrong. There are worse cases, but it’s pretty uncommon to lose work, and the damage is usually limited.

Backup software is different. Ideally it runs automatically, but even if backups are manual, the manual work is making the backup, not testing that it actually worked. And even with automated testing, the true purpose of a backup is to restore something that’s gone, and that’s not something you will see until years later. You run it today, and trust that it will work correctly in the future.

I’m pretty sure I don’t trust any backup software available for Linux. Partly this is for reasons I can point to and prove. But partly also, 2006 was the last time I suffered a true catastrophic disk failure. Having never had any of this backup software actually prove useful, I don’t have much reason to trust it.

Time Machine for Mac is different. I have never suffered a disk failure on any Mac I owned. But I have used Time Machine to migrate from one machine to another, or to replace one disk with a larger one. I’m pretty confident about Time Machine. And Time Machine also ticks all the boxes left blank by every Linux application out there. Every piece of backup software should be copying how Time Machine works for its basic operation. But even a perfect replica would still not be something I can trust, until I see it working in a real emergency.

The state of backup software on Linux is dire. I can quickly look at practically all Linux backup software and immediately spot where it falls far short of what Time Machine provides. And that’s what I’ll do here. But don’t get me wrong – a new project that tackles all of these problems would still not be an acceptable substitute. It would still not give me any reason to trust it.

Use Cases

One difficulty is that there are quite a few use cases where backup software might come in handy. It would be great if all of those cases were supported. But some are more important than others.

My Hard Disk Crashed

This is what happened to me in 2006. I had no backup. Sure, that’s on me. I ended up shipping my broken hard disk plus a fresh device to a company in Singapore for them to physically swap the housing and mechanism. I was lucky: I paid €500 and got my data back.

Importantly, what I actually got back was my Subversion repository. Many of the system files were irrevocably lost, but that didn’t bother me much. The Subversion repository was by definition all the stuff I wanted to keep. And I got back the rest of my home directory as well, so that’s nice.

A lot of backup software is built with exactly this use case in mind. And it’s true that hard disks are fragile things. Since this disaster, I’ve had quite a few spinning disks fail in various ways. But oddly, none of those failures have been anything I needed a backup to recover from. That’s partly because hard disks and modern filesystems have built-in robustness features that can recover from a few broken blocks. And these days computers use SSDs, which are even less likely to fail. Two decades later, I’m not sure if this is as important a use case as it once was.

Oops I Deleted Something

This, however, is exactly as relevant today as it’s ever been. And it’s the use case that Time Machine leans into the hardest.

Time Machine’s UI strikes me as rather over-engineered. It kinda takes over your whole machine. And the graphical elements, with the Finder stacked deeply into the distance and its little side meter, are a pretty intrusive way to indicate what’s happening.

But when I think about it, this use case is probably the one that is most likely to involve a user in blind panic mode. It’s Sunday night, I’m tired, my report is due on Monday, it looks like I’m going to make it, and then oops, I deleted the latest version instead of the outdated version. And you don’t just delete files, you also accidentally delete large swathes of text from within a file. Or you catastrophically mess up something and can’t remember how to get it back how it was.

These are exactly the situations where the adrenaline and tiredness are likely to make any attempt to recover even worse.

I guess this is partly why Time Machine looks the way it does. It’s designed to be reassuring, and quickly reassuring. It’s designed to clear away the distractions so you can concentrate fully on the recovery task.

My Laptop Got Stolen

This has never happened to me. I assume that I’m playing with fire and have just been lucky. I don’t actually take my laptop out of the house very often, so I’m not really worried about that case. It’s far more likely that my phone would get stolen. I once had my phone stolen by a street gang in Addis Ababa. I yelled at them and they gave it back, presumably to avoid a beating if a cop noticed the disturbance. Lucky.

Losing your phone feels awful. Probably you have copies of some important stuff. But also, you probably never really thought about it much, and now you get an endless sequence of “oh shit, my photos of my holiday were on there” and then “oh shit, all the details of my application were on there”, and eventually “I can’t believe all my photos of mum are gone”. It’s not a feeling of panic, but it’s a feeling of loss.

This is the closest thing to a use case that Linux backup software actually addresses. And it’s pretty important.

It is still fairly likely for a laptop to just break. In this case Murphy’s law applies: it will likely break at exactly the moment you need it most. This means that when things are restored, you want the laptop in exactly the state you left it. If the report is due in the morning, and you’ve just spent 90 minutes waiting for the restore to complete, you don’t want to spend another hour reinstalling and setting up your office software or development environment.

My House Burned Down

When I was 11 an electric blanket in my bedroom started a fire while everyone was out. A neighbour called the fire brigade and it was put out before it burned down the house. That’s the closest I’ve ever come to this scenario. These days the prospect of my apartment being bombed is not as outlandish as it once might have been.

Of course, this scenario assumes that you are more likely to escape the disaster yourself than your laptop is. It’s been a cliché for at least a century that you do not stop to collect “important papers” before leaving a burning building.

I actually do have a go-bag now. My brother had to evacuate his house in a hurry due to a chlorine leak, so that made me think it’s a practically useful idea. My go-bag has a full backup in it, just because it would make me feel more comfortable to have that on my person. It’s a valid case.

The Upgrade Went Wrong

This is advice you see over and over again: “make a full backup before attempting this”. Of course, no-one does.

This case goes beyond just upgrades that accidentally go wrong. Here in the enshittocene, “upgrades” often mean the removal of features that were working perfectly well before. Or the new version requires a pointless round trip to the Internet, just to force you to maintain a subscription. Upgrade regret is a real phenomenon.

This case is unlike the others in that you specifically want a backup of the system software, rather than your personal data. This will become a significant point later.

I Have A New Laptop

This is such a different case that it makes sense to use an entirely different tool. There’s a different tool for that on a Mac. But on Linux, a backup utility is your closest approximation. You make a backup, pretend that your laptop got stolen, and restore onto the new machine.

A complication of this case is that the new machine is probably different to the old one. The good news is that the hard disk is probably bigger, not smaller. But it might be a completely different architecture.

Criteria

With those use cases in mind, I can talk about the features I need backup software to support in order to consider it a reliable tool.

Easy Setup

Look, I’ve been a computer programmer for decades. I have degrees in this stuff. I’m able to wrestle with complex operating system problems. That also means I know how to pick my battles. Backups are not the place for technical sophistication.

The problem is that if I mess up my backup system – and I often mess things up – I won’t find out for years. And then it’ll be too late. It needs to be impossible for me to set up the backup system wrong. If a backup system provides ways for a user to shoot themselves in the foot, that should be a high priority bug.

An easy giveaway here is a backup system that invites me to “choose the files you want to back up”. No way. That puts all the responsibility on my shoulders. This is a setup for victim-blaming: you lost your files because you chose the wrong files to back up. Nope. If you write backup software, it’s your responsibility to ensure that the right files are backed up.

Multiple Backups

“If you backed up your machine and didn’t test it, you didn’t back it up”. That’s what we’re taught.

You can verify a backup automatically, but that’s just a machine marking its own homework. Nothing beats actually restoring a lost file that you really need. That’s the only way to come to trust your backup system.

How to make that happen when disks almost never fail? By invoking the “oops I deleted something” use case. We all accidentally delete files from time to time. In all but the most extreme cases, more frequently than we burn down our house.

This means that the “oops I deleted something” case is, in fact, the “my house burned down” case. They must use the same software and the same restore procedure. If they are implemented different ways, whatever you’re using for disaster recovery can’t be trusted. And then you might as well not bother.

But “oops I deleted something” has extra requirements. Often, you deleted it and didn’t notice it for a week. In the meantime your automated backup system probably kicked off. There’s no point restoring one broken file with the same broken file.

So the logical conclusion is that even in the likely case that you never need to wind back your whole system to a specific date two days or two months ago, the backup system needs to keep that history anyway. This is a complication.
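To make the "restore from before it broke" requirement concrete, here is a minimal sketch in Python. It assumes a made-up in-memory layout where each dated snapshot is a dictionary of path to content; `SNAPSHOTS` and `newest_snapshot_with` are hypothetical names for illustration, not how any particular backup tool stores its data.

```python
from datetime import date

# Hypothetical history: one snapshot per day, each mapping path -> content.
SNAPSHOTS = {
    date(2024, 5, 1): {"report.txt": "draft 1"},
    date(2024, 5, 2): {"report.txt": "draft 2"},
    date(2024, 5, 3): {},  # the file was deleted before this backup ran
}


def newest_snapshot_with(snapshots, path, is_good=lambda content: True):
    """Return the date of the newest snapshot holding a usable copy of path.

    Walks the history from newest to oldest, skipping snapshots where the
    file is missing or where is_good() rejects its content.
    """
    for day in sorted(snapshots, reverse=True):
        files = snapshots[day]
        if path in files and is_good(files[path]):
            return day
    return None
```

With only the newest snapshot kept, the lookup for `report.txt` would find nothing, which is exactly the "same broken file" problem.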

Fast Backups

This is a corollary of the above. The typical “oops I deleted something” is in fact that you created the file on Monday, deleted it on Tuesday, and need it Wednesday. To recover from that situation, backups need to be made at least daily. And if they’re going to be made daily, they need to be fast. Otherwise you’ll end up getting pissed off with your system running at a crawl and you’ll switch off the backups.

Typically this is described as “incremental backups”. If you have a 1TB disk, you can’t sync all of that data every day. You need to sync only the parts that changed.

The usual approach is to mix frequent incremental backups with very occasional full backups. The occasional full backup guards against filesystem corruption that changes a file without the system noticing. For the usual incremental backups, it’s fine to look at the file timestamp or some similar mechanism.
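The timestamp check can be sketched roughly like this in Python. This is an illustration under assumptions, not how any specific tool works: it treats a file as changed when its size or modification time differs from what the previous run recorded, and the names `scan` and `changed_paths` are invented for the example.

```python
import os


def scan(root):
    """Record (size, mtime) for every file under root, keyed by relative path."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.lstat(full)  # lstat: do not follow symlinks
            manifest[os.path.relpath(full, root)] = (info.st_size, info.st_mtime_ns)
    return manifest


def changed_paths(previous, current):
    """Return the paths that are new or whose (size, mtime) differs."""
    return sorted(
        path for path, signature in current.items()
        if previous.get(path) != signature
    )
```

Only the paths returned by `changed_paths` need copying on a daily run. Note that this sketch does not report deletions, and by design it cannot see corruption that leaves size and timestamp untouched, which is the job of the occasional full backup.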

Automated Testing

This is complicated.

Backup testing actually describes a whole new set of use cases, or rather failure modes. You need to be clear what exactly you’re guarding against here.

There is an amazing feature of computers that I think is too mundane to get enough attention: checksums. Intuitively, we all understand what it means to copy a file from one place to another, because we’ve done it by hand with a biro. That teaches us that mistakes happen. What most people do not intuitively understand is that computers almost never make mistakes. Not because they are perfect machines, but because they routinely check their work with a checksum. If the checksum doesn’t match, they try again until it does. And instantly, mistakes are practically eliminated. Checksums are implemented in multiple layers, through the network, filesystem, application software, and built-in to the firmware running on your hard disk. So the same data is being checked against multiple different checksums, all the time.
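The check-and-retry idea can be sketched at the application level in a few lines of Python. This is only an illustration using SHA-256 from the standard library; the function names are made up, and real network and disk layers use their own, usually much cheaper, checksums.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(src, dst, attempts=3):
    """Copy src to dst, retrying until the checksums match.

    Returns the number of attempts that were needed.
    """
    expected = sha256_of(src)
    for attempt in range(1, attempts + 1):
        shutil.copyfile(src, dst)
        if sha256_of(dst) == expected:
            return attempt
    raise OSError(f"checksum mismatch after {attempts} attempts: {dst}")


def demo_copy_verified():
    """Copy a small temporary file and report how many attempts it took."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp, "source.bin")
        dst = Path(tmp, "copy.bin")
        src.write_bytes(b"backup me" * 1000)
        return copy_verified(src, dst)
```

One caveat: reading the destination straight after writing it will often be served from the operating system's cache rather than the disk, so a check like this says little about the storage itself.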

It goes beyond that: you can not merely detect mistakes, but correct them too. That feature is rarely implemented in systems that can retry, like copying. But it’s a common feature of filesystems.

So if you’re checking byte-by-byte that a source file matches a destination file, you’ve gotta ask yourself: why? What exactly are you guarding against? Has that thing happened to you, or one of your friends? Did you read about it on Reddit one time?

The most obvious real issue that testing can detect is if the backup location itself has failed. Read every byte. If the disk or the filesystem detects an error, the backup disk itself is bad. Throw away the backup and make a new one. That’s the important case.

The interesting thing about that is, there’s no need to actually check any of those bytes. What you want to do there is check the disk, not the backup. The disk has its own checksums. That’s enough to detect any reasonably plausible problems. And since every computer has a disk, and that disk is critical for the functioning of the computer, it doesn’t make much sense to have a computer that doesn’t check its disk regularly.

Right?

So… what’s the point of having backup software tediously check the integrity of its backups? You should be able to trust the destination not to go wrong without flagging something.

The answer is that a backup integrity check should be checking for misconfiguration. It should check that the files on the machine right now match the files in the backup. The problems it’s guarding against are these: the remote machine got misconfigured and now that location has backups for a completely different system. Or the client machine got misconfigured and now it’s backing up the wrong directory. Or the backup was set up to only back up a fraction of the system, so the client machine is 1TB and the backup is 1MB. These are the things an automated backup test can usefully check.

So, an essential piece of backup verification is a popup on the user’s machine saying something like, “I just checked the backup, and it had 734GB of data from the directory /home/mat”. Mostly you will ignore that. Occasionally you will stop and wonder how it is that your disk is 2TB, it’s 95% full, but the backup is only 734GB. That’s probably the only point of the whole exercise. And that’s what “no you didn’t” really means.
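A sanity check along those lines might look something like the following Python sketch. The function names and the 50% threshold are assumptions picked for illustration; a real tool would need to account for compression, deduplication and deliberate exclusions before deciding what counts as suspicious.

```python
def format_size(num_bytes):
    """Render a byte count in whole decimal gigabytes, e.g. 734GB."""
    return f"{num_bytes / 1e9:.0f}GB"


def backup_report(source_bytes, backup_bytes, path, min_ratio=0.5):
    """Build the message for the user and flag a suspiciously small backup.

    Returns (message, suspicious). min_ratio is an arbitrary threshold:
    a backup smaller than that fraction of the source gets flagged.
    """
    message = (
        f"I just checked the backup, and it had {format_size(backup_bytes)} "
        f"of data from the directory {path}"
    )
    ratio = backup_bytes / source_bytes if source_bytes else 1.0
    suspicious = ratio < min_ratio
    if suspicious:
        message += f", but the source holds {format_size(source_bytes)}"
    return message, suspicious
```

Fed the numbers from the example (a 2TB disk that is 95% full, so about 1900GB, against a 734GB backup), the report comes back flagged, and the message says why.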

Full System

I do get it. Lots of system files are just caches of easily-replaceable junk. Lots of system files are critical secrets that should be handled with care. And all of this stuff is easily downloadable from the Internet. There’s not much practical use to filling up disks with copies of that stuff.

But the Internet has accumulated thirty years of questionable advice on how to fix problems with your Linux machine. The advice frequently involves modifying system files in some peculiar way, and almost as frequently that advice actually does solve real problems.

When my computer destroys itself, I need to get back to the same working state it was in before. This likely is not the working state delivered or desired by the distribution, and no doubt the new way is a better way. But that’s for another time. When backups are being restored, you need to cut down the variables.

Back up the whole system. If you must, you may very carefully and explicitly specify cache directories that need not be backed up. Fine. But your computer shouldn’t be filling up tens of percent of its disk with daily-changing caches. If the cache gets backed up, it gets backed up.
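The "everything, minus an explicit exclude list" policy is simple enough to sketch in Python. The paths and function names below are invented for illustration, and the matching is deliberately strict: a path is skipped only if it is an excluded directory or sits underneath one.

```python
from pathlib import PurePosixPath


def is_excluded(path, excludes):
    """True if path is one of the excluded directories or lies inside one."""
    candidate = PurePosixPath(path)
    for entry in excludes:
        excluded = PurePosixPath(entry)
        if candidate == excluded or excluded in candidate.parents:
            return True
    return False


def select_for_backup(paths, excludes):
    """Everything is backed up unless it is explicitly excluded."""
    return [path for path in paths if not is_excluded(path, excludes)]
```

Comparing whole path components rather than string prefixes matters here: excluding `/home/mat/.cache` must not silently drop a sibling directory such as `/home/mat/.cachet`.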

Partial Restore

I covered this under “Multiple Backups”, but it’s its own point. You should be able to restore a single file, or a particular directory. This should work like a copy, with the option of putting the restored version alongside the existing one rather than only replacing it.
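One way to read "function like copy" is that a restore should never clobber what is currently there. A rough Python sketch of that behaviour, with an invented naming scheme for the restored file and hypothetical function names:

```python
import shutil
import tempfile
from pathlib import Path


def restore_as_copy(snapshot_file, target_dir):
    """Copy a file out of a snapshot without overwriting the live version.

    If the name is already taken in target_dir, the restored file gets a
    "(restored N)" suffix instead. Returns the path that was written.
    """
    snapshot_file = Path(snapshot_file)
    target_dir = Path(target_dir)
    candidate = target_dir / snapshot_file.name
    counter = 1
    while candidate.exists():
        candidate = target_dir / (
            f"{snapshot_file.stem} (restored {counter}){snapshot_file.suffix}"
        )
        counter += 1
    shutil.copy2(snapshot_file, candidate)  # copy2 also keeps timestamps
    return candidate


def demo_restore():
    """Restore Sunday's report next to Monday's without touching Monday's."""
    with tempfile.TemporaryDirectory() as tmp:
        snapshot = Path(tmp, "snapshot")
        live = Path(tmp, "live")
        snapshot.mkdir()
        live.mkdir()
        (snapshot / "report.txt").write_text("Sunday's version")
        (live / "report.txt").write_text("Monday's mess")
        restored = restore_as_copy(snapshot / "report.txt", live)
        return restored.name, (live / "report.txt").read_text(), restored.read_text()
```

The existence check and the copy are separate steps, so this is not safe against another process creating the same name in between. For a panicked user at a single laptop, that is probably an acceptable simplification.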

With disasters being so rare, this is by far the most practically useful feature of a backup system.

Partial Delete

Some data should not exist. I personally am a data hoarder, and it makes me shudder to think of data being deliberately deleted. But even I need to securely delete keys to encrypted hard disks, to avoid the need to actively scrub every block.

Then there are the less common use-cases, where data should be gone forever for legal or ethical reasons.

If you have partial backups going back an indeterminate amount of time, it can be a complicated matter to delete only the data that must be deleted. The backup software needs to support that.

Flexible Restore

Probably the second most useful case is migrating to a new machine. This poses quite some challenges.

It’s certainly a lot harder if the new machine is a different CPU architecture to the old one. A decade ago that was barely a question. Today it is not surprising at all to want to migrate between x86 and ARM, and RISC-V is getting ready to make its move. Everything I said under “Full System” applies here though. That one tweak you made to dbus configuration will work just as well on ARM as x86. So if you can pull off this trick, it will be well worth the effort.

Nevertheless, even sticking to one architecture, it’s important that you can restore onto a differently-sized disk, at least. This should include a smaller disk, when the data will fit. Probably there are very few backup solutions beyond dd that can’t even manage that flexibility.

But the point remains: don’t be finicky.

Encryption

The usual rule is 3-2-1: your data should be in three places, on two different media, with one off-site. Notably, you can’t be 3-2-1 on your own. So you can’t watch over your data personally. You encrypt your backups so that this doesn’t matter.

Backup software doesn’t have to support encryption directly. It’s possible to store backups on encrypted disks. But one of the important criteria is that backup software is hard to mess up. For that reason, it brings extra confidence if the backup software simply doesn’t have the option of storing data unencrypted.

A particular extra feature is using asymmetric keys. This allows the software to encrypt the backup without the computer itself ever touching the key necessary for decrypting it. This is especially useful if the computer should be quietly backing itself up when it’s left alone. Ideally you don’t want to leave sensitive keys in the memory of a device that might get stolen.

Varied Destinations

It turns out this is the fly in the ointment with Apple Time Machine. Backing up to a USB disk plugged into the laptop is certainly a useful case. But this requires manual work of plugging and unplugging and securely storing the device. That’s a big ask for every day. So, it makes sense to have some low-effort backup option that stores the data somewhere over the network. A cloud service is a good way to tick off the “off-site” requirement. A local NAS is also a way to limit the damage of any local hardware failure, such as a broken water main.

So a backup system should allow backing up to various services over the network, as well as local storage options. This comes with extra requirements for robustness, since network connections tend to drop out. All of this complicates the goal of being impossible to mess up, which is probably why Apple never supported it in Time Machine.

Results

Software	A	B	C	D	E	F	G	H	I	J
rsync	❌	❌	✅	❌	✅	✅	✅	✅	❌	✅
Timeshift	❌	✅	✅	✅	❌	❌	❌	❌	❌	✅
Back In Time	❌	✅	✅	❌	❌	✅	✅	✅	✅	✅
Déjà Dup	✅	✅	✅	✅	❌	✅	❌	✅	✅	✅
Clonezilla	❌	❌	❌	❌	✅	❌	❌	❌	✅	✅
Cronopete	✅	✅	✅	❌	❌	✅	❌	✅	❌	❌
Borg	❌	✅	✅	✅	✅	✅	❌	✅	✅	✅
Bvckup	❌	❌	✅	❌	❌	✅	✅	✅	❌	❌
Duplicati	?	?	?	?	?	?	?	?	?	?

Label	Description
A	Easy Setup
B	Multiple Backups
C	Fast Backups
D	Automated Testing
E	Full System
F	Partial Restore
G	Partial Delete
H	Flexible Restore
I	Encryption
J	Varied Destinations

Rsync

Rsync is an archival tool, which is very different to a backup utility. It’s designed to create perfect byte-for-byte copies of a directory tree. Which is close enough to making a backup that many people advocate using this for backups. I myself followed exactly the advice here for many years.

The biggest problem with this approach is that it’s not suitable for the “oops I deleted something” problem. It only helps if you are lucky enough to have a backup that is neither too old nor too fresh. This means it’s not a tool that you will use regularly.

And that ties into the second biggest problem: it is hellishly complex to run correctly. There are a great many command line options. If you set up a cron job and it goes wrong, there won’t be any error message to tell you so.
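The silent-failure problem is not specific to rsync: any command run unattended can fail without anyone looking at its exit status. As a hedged sketch, a small Python wrapper can at least turn a failure into a message. The name `run_and_report` is hypothetical, and how the message reaches a human (mail, desktop notification) is left open.

```python
import subprocess
import sys


def run_and_report(command):
    """Run a backup command and return (ok, message) instead of failing silently.

    command is a list such as ["some-backup-tool", "--some-flag"]; the
    contents are up to the caller. Failures are also echoed to stderr.
    """
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except OSError as exc:
        # Raised, for example, when the program does not exist.
        message = f"backup command could not be started: {exc}"
        print(message, file=sys.stderr)
        return False, message

    if result.returncode == 0:
        return True, "backup command succeeded"

    detail = result.stderr.strip() or "(no error output)"
    message = f"backup command exited with status {result.returncode}: {detail}"
    print(message, file=sys.stderr)
    return False, message
```

This only catches failures the command itself admits to. A job that exits cleanly after copying the wrong directory still looks like a success, which is why the size report described under “Automated Testing” matters more.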

Rsync is well-suited to a tight feedback loop, running it on the command line yourself. It is not something you can set and forget. That’s not good enough for a backup solution.

Unfortunately, there are a lot of people out there ready to insist that if you can’t get this much running smoothly, you don’t deserve to use a computer at all. Even more unfortunately, those people are probably (like me) unlikely to ever actually suffer a catastrophic data loss. So we don’t get the schadenfreude of watching these self-appointed experts facepalm. Pity.

Timeshift

What a strange piece of software:

Timeshift is designed to protect system files and settings. It is NOT a backup tool and is not meant to protect user data. Entire contents of users’ home directories are excluded by default. This has two advantages:

You don’t need to worry about your documents getting overwritten when you restore a previous snapshot to recover the system.

Your music and video collection in your home directory will not waste space on the backup device.

Dear me, heaven forfend that I should waste space storing my music and video!

I honestly have no idea what problem Timeshift is trying to solve. If anyone knows, I would be fascinated to hear. Certainly the developers didn’t include any explanation in their README. Perhaps they felt that would be a waste of space.

Back In Time

This is the type of software that requires you to add one-by-one the directories you’d like to back up. Their example even suggests you wouldn’t want your entire home folder, only specific subdirectories within that:

In the Include tab, add files/directories to backup (e.g. Documents, Music…).

I really don’t understand the mindset here. Why do people not want their data backed up? I do understand that some folders should be explicitly excluded. But I would expect to tell my backup software what to exclude, not what to include. That’s how Time Machine works.

Déjà Dup

This is the software I actually use on my Linux machine. By default it only backs up your home directory. So it really only covers the “oops I deleted something” case. I guess I’m just hoping nothing happens to my laptop. Not a comfortable place to be.

Clonezilla

Clonezilla is a partition and disk imaging/cloning program similar to True Image® or Norton Ghost®.

Standing ovation for stating clearly what it’s for. What it’s for, is in no way being a backup system.

I’ve seen this mentioned in a few places as being a way to make backups, so I’m including it here. That’s the last I’ll mention it.

Cronopete

Cronopete – An Apple’s Time Machine Clone For Linux

Sounds nice! Unfortunately:

It is important to note that Cronopete is NOT designed to back up the whole operating system; only the personal files. Never try to backup the root folder, or system folders like “/etc”.

It’s nice that the home page includes a list of alternatives, and the features Cronopete lacks. I approve of Cronopete. But I will not be using it.

Borg

This is an entirely command line tool, so it automatically fails on the “easy setup” score. It has many features, but some of them are hard enough to use that they might as well not be present at all. For example, removing an individual problematic file is not something I would be willing to just trust the software to do.

I actually do use Borg backup for backing up my Yunohost systems. I trust Yunohost developers to do the work I’m not interested in doing myself here, and that’s fine. But it’s no solution for my laptop.

Bvckup

This appears to be rsync for Windows users. But sometimes when you search for “open source backup software” it turns up. That’s enough about that.

Duplicati

Designed for organizations with large proprietary datasets

…is what it says on its homepage, which also features two sections headed “But the most valuable training data a company owns is its own history”. Just above “AI workflow benefits”. Sure. Seems legit.

What To Do?

So basically all the options suck.

My actual approach right now is pretty convoluted:

My Ubuntu laptop gets backed up to the main NAS using Déjà Dup, with as close to default settings as possible. That should mean that my actual work is protected.
If my laptop does die, I’m in for a world of pain. I have a script that regularly dumps a report of installed packages and snaps and flatpaks and things. That will be a big part of what I do to get back on my feet.
My Yunohost servers use Borg, and the archives are rsynced to the NAS with cron jobs and scripts.
My aging Digital Ocean VPS has certain critical directories rsynced to the NAS with more cron jobs and scripts.
The NAS is rsynced to an encrypted USB disk permanently plugged into it.
I irregularly also back up the NAS to another pair of encrypted USB disks that normally live in the basement.

None of this is very satisfactory.

Matthew Exon

@mat
