Delta Patching: Why Doesn’t This Happen More?

Bloated puffer fish (Source: https://www.flickr.com/photos/mahalie/)

I’ve taken this rant to my blog instead of posting a Tweetstorm (that’ll get deleted anyway, because I delete my Tweets on a delay using this tool I wrote). You’re welcome.

Anyway, I was upgrading some servers in my lab and got to wondering why I have to download the entire package for a new version of, say, the linux-headers*.deb package. I can see the old version sitting right there in /var/cache/apt/archives, so why not just grab the delta between the package I have now and whatever the latest version is?

Sure, there are some challenges in how you do that, but in a world with rsync, xdelta, VM differencing disks, copy-on-write snapshots in btrfs, etc., etc., why do we have to download so much stuff when we’re basically just patching things?
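To make the idea concrete, here is a minimal sketch of byte-level delta patching in Python. It uses the standard library's difflib.SequenceMatcher as a stand-in for a real delta algorithm; that is an illustrative choice, not how rsync or xdelta work internally, and the names make_delta, apply_delta, and literal_bytes are hypothetical. The "packages" are just seeded random bytes.

```python
import random
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes) -> list:
    """Describe `new` as copies out of `old` plus literal bytes."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # The receiver already has these bytes: send offsets only.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Genuinely new bytes have to travel over the wire.
            ops.append(("data", new[j1:j2]))
        # "delete" needs no op: those old bytes are simply never copied.
    return ops


def apply_delta(old: bytes, ops: list) -> bytes:
    """Rebuild the new file from the old file and the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops: list) -> int:
    """Count the payload bytes that actually need to be transferred."""
    return sum(len(op[1]) for op in ops if op[0] == "data")


# Stand-in for an old package: 2000 pseudo-random bytes (seeded, so repeatable).
rng = random.Random(0)
old = bytes(rng.getrandbits(8) for _ in range(2000))
# Stand-in for the new version: a small region replaced in the middle.
new = old[:1000] + b"PATCHED!" + old[1020:]

delta = make_delta(old, new)
assert apply_delta(old, delta) == new
print(f"full download: {len(new)} bytes, delta payload: {literal_bytes(delta)} bytes")
```

SequenceMatcher is far too slow for real package sizes, and compressed archives like .deb files diff badly unless you compare their uncompressed contents, so treat this as a picture of the principle rather than a design.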

And this extends into other areas. Check out the response to my request to add “resume partial downloads” to a CoreOS baremetal utility. There’s an assumption around the place that people have essentially infinite bandwidth. Which just isn’t true.
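For reference, resuming a partial download over HTTP is not exotic: the client asks for a byte range starting where the local file ends. A minimal sketch in Python follows, assuming the server supports range requests; the function name build_resume_request and the example URL are hypothetical, and this is not the code of any CoreOS tool.

```python
import os
import urllib.request


def build_resume_request(url: str, partial_path: str) -> urllib.request.Request:
    """Build a request that asks only for the bytes we don't have yet."""
    # How much of the file has already been downloaded, if anything.
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    request = urllib.request.Request(url)
    if offset > 0:
        # Standard HTTP Range header: everything from `offset` to the end.
        request.add_header("Range", f"bytes={offset}-")
    return request


# Hypothetical URL and local path, for illustration only.
req = build_resume_request("https://example.com/image.bin", "image.bin.part")
print(req.get_header("Range"))
```

A server that honours the range answers 206 Partial Content and the client appends the body to the partial file; a plain 200 means the server ignored the range and the download starts over.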

Now maybe, as in the CoreOS comment, bandwidth inside a data centre is very large indeed. Okay, but do we really need to use it all for downloading full packages from our staging server? When you’re constantly keeping systems patched, just for security reasons, surely minimising the amount of data, and therefore time, it takes is a worthwhile endeavour?

It’s worse again for things like container images, which are supposed to be immutable, so changing a byte in a config file means building a new image and re-downloading, in full, at least every layer that changed. That’s a lot of time spent waiting around for a bunch of largely the same bytes to transfer.

And then we get to possibly the worst culprit of all: console games. I did a quick search and found people talking about 5 gigabyte patches for games. Holy crap!

I have ADSL2+ at home that screams along at an awesome 7.5 Mbit/s if there’s a good tailwind. Are you telling me you can spend a bunch of time figuring out how to perfectly render waves and smoke effects, but you can’t figure out how to transfer a minimum delta? Or you have, and yet somehow the smallest possible patch you’ve managed to build is 5 GB? Really?
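To put a number on that, here is the back-of-the-envelope arithmetic as a few lines of Python. It assumes 5 GB means 5 × 10^9 bytes, that the line sustains its full 7.5 Mbit/s, and that there is no protocol overhead, so the real wait would be longer. The 50 MB delta is a made-up figure for comparison, and download_seconds is a hypothetical helper.

```python
def download_seconds(size_bytes: float, mbit_per_s: float) -> float:
    """Ideal transfer time: bits to move divided by bits per second."""
    return (size_bytes * 8) / (mbit_per_s * 1_000_000)


full_patch = download_seconds(5e9, 7.5)    # the 5 GB patch
small_delta = download_seconds(50e6, 7.5)  # hypothetical 50 MB delta

print(f"5 GB patch:  {full_patch / 60:.0f} minutes")  # about 89 minutes
print(f"50 MB delta: {small_delta:.0f} seconds")      # about 53 seconds
```

That is roughly an hour and a half of waiting for the full patch, against under a minute if the same fix could ship as a 50 MB delta.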

Consider how much this is worth just in terms of people’s time. Time spent waiting for a patch to download before you can play a game you really want to play. Time spent waiting for the packages to download so you can patch the latest security flaw and then get back to business. There are tens of millions of people using these systems, and they’re spending a bunch of time waiting around because programmers are too lazy to figure out how to write delta patching into their update systems.

Can’t we fix this?
