Cloud Backups - A hacker’s approach

Since the dawn of time - or at least since we have had computers with broadband connections - we have wanted to back up all our files to computers all around the world, so we know they are safe from natural disasters, the agencies or just the cup of coffee that you will spill over your laptop.

Now with the rise of the cloud providers (Amazon AWS, Azure and Google), online backup space has become cheap and online backup services have come to life. They are totally fine if you only have a couple of gigabytes to save or only a single computer. That is enough for most of the population, and it is mostly free of charge or below 10 bucks a month.

BUT as a hacker and computer nerd you most certainly have at least a dozen machines under your control and a metric shitload of data you wanna back up. This is where I was a couple of days ago.

What I want

So what is it that I want from my backup solution:

  • Store unlimited amounts of data (or at least ~10TB to begin with)
  • Have it strongly encrypted with a key only I have, so only I can decrypt it
  • It has to be cheap - free would be great of course :-)

What the internet provides

After a bit of digging, it turns out the cheapest storage you can get is Amazon Glacier at 0.004 USD/GB/Month.
The API is a bit cumbersome, and the AWS console is not one of the best interfaces to work with, but eh.

Let’s do the math and see if it is viable:

10000 GB * 0.004 USD/GB/Month = 40 USD/Month

Meh! 40 bucks is too much for me. There has to be a better and cheaper way!
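If you want to play with the numbers for your own data, here is a tiny sketch of the same arithmetic. The function name monthly_cost is made up for this example, and the price is simply the Glacier figure quoted above, so plug in whatever your provider charges.

```shell
# monthly_cost GB PRICE_PER_GB: print the monthly storage cost.
# (Hypothetical helper, not part of any tool mentioned in this post.)
monthly_cost() {
    # LC_ALL=C keeps the decimal point a dot regardless of the locale.
    LC_ALL=C awk -v gb="$1" -v price="$2" 'BEGIN { printf "%.2f\n", gb * price }'
}

# 10 TB at the per-GB price quoted above.
monthly_cost 10000 0.004
```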

Other online backup solutions that provide unlimited storage often only back up a single Windows PC or Mac, so that’s not a nice solution either.

Back to the early hacker days

I remembered that I have a Usenet account lying around. Yes, that Usenet! Where the old graybeards on their Unix terminals went to discuss whether vi or emacs is the best editor.

Usenet is still with us, and there are quite a few providers out there that store not only the editor flamewars but also tons of binary articles. These articles are just files stored on those servers, from cat pictures to tentacle porn. And don’t forget that the providers all sync the data between them, spread over dozens of servers around the world.

And all that for ~15 EUR/Month at my provider.

So why not just use that incredible source of disk space and bandwidth for our purposes? Let’s try it!

The Backup Process

To store our backups securely on Usenet we need to do a bit more than just tar them up and throw them onto the servers:

  1. We’ll pack them with tar and compress using 7-Zip

    $ tar --xattrs -cpvf - <your-directory-to-backup> | 7z a -si backup-file.tar.7z

    The -f - tells tar to write the archive to stdout, where 7z picks it up. That tar command will also save your file permissions and extended attributes.

  2. Encrypt the backup using GnuPG
    You have to generate a GnuPG key first using gpg --gen-key if you don’t have one already, and then encrypt the compressed archive.

    $ gpg -r <your-key-id> -e backup-file.tar.7z

    You’ll get a new file named backup-file.tar.7z.gpg.

  3. Checksum that encrypted file and write down the hash

    $ sha256sum backup-file.tar.7z.gpg | head -c 64

    We’ll use the hash as the basename for our final files, so you can easily find them on Usenet but nobody else has a clue what they contain.
    I have prefixed mine with a little three-letter code like abc_, but the hash alone should be enough.

  4. Split the encrypted file into chunks of 50MB each

    $ split -b 50M backup-file.tar.7z.gpg abc_<CHECKSUM>.part

  5. Generate checksums for all files
    Now we want to store the checksums for all the files and keep them in case we need to restore. That way we can verify that we got the files back from Usenet intact, and that we have stitched the backup back together correctly. SHA512 might be overkill here but eh. We have it, so we’re gonna use it.

    $ sha512sum * >> backup_summary.txt

  6. Create .par2 files for repairing potentially bad blocks when we download the backup from Usenet
    PAR files are common among Usenet users, and most binary Usenet clients have built-in support, so they will verify and fix your files automatically after download.

    $ par2create -r10 -n7 abc_<CHECKSUM> abc_*

    This gives us 10% redundancy, so up to about 10% of the downloaded data can be damaged or missing and we’re still able to recover.

  7. Push them out the door and onto Usenet
    Now it gets juicy. We finally have everything together to push the backup to Usenet. For that we use Nyuu, a fast Usenet upload client written in Node.js.

    $ nyuu -h <your-usenet-server> -P <server-port> -S -u <username> -p <password> -s "abc_<CHECKSUM>" -o "abc_<CHECKSUM>.nzb" abc_*.part* *.par2

    Now watch the magic happen!

    After the process you’ll find an .nzb file in the working directory. Keep this along with the backup_summary.txt in a secure place.
    You can later feed this file to a Usenet client and it will download your backup automatically.

Of course you will put that stuff into a script ;-)
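
As a starting point, here is a minimal sketch of how steps 3 to 5 could look as shell functions, assuming GNU coreutils. The names hash_name and chunk_and_sum are made up for this example, and unlike the sha512sum * call above, the sketch only checksums the parts and the encrypted file and overwrites the summary instead of appending to it. The tar/7z, gpg, par2create and nyuu calls stay as shown in the steps.

```shell
# hash_name FILE [PREFIX]: print PREFIX followed by the SHA-256 of FILE (step 3).
# (Hypothetical helper name, chosen for this sketch.)
hash_name() {
    [ -r "$1" ] || return 1
    printf '%s%s\n' "${2:-}" "$(sha256sum "$1" | head -c 64)"
}

# chunk_and_sum FILE [PREFIX] [CHUNK_SIZE]: split FILE into parts named after
# its hash (step 4), write SHA-512 sums to backup_summary.txt (step 5) and
# print the basename that was used. Runs in a subshell, so exit stays local.
chunk_and_sum() (
    file=$1
    prefix=${2:-}
    size=${3:-50M}
    base=$(hash_name "$file" "$prefix") || exit 1
    split -b "$size" "$file" "$base.part" || exit 1
    # '>' instead of '>>' so a second run does not pile up duplicate entries.
    sha512sum "$base".part* "$file" > backup_summary.txt || exit 1
    printf '%s\n' "$base"
)

# Possible usage, run in the directory that holds the encrypted archive:
#   base=$(chunk_and_sum backup-file.tar.7z.gpg abc_)
#   par2create -r10 -n7 "$base" "$base".part*
#   ...then upload "$base".part* and *.par2 with nyuu as in step 7.
```

Returning the basename from chunk_and_sum means the hash only has to be computed once and can be reused for the par2 and upload steps.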

Restore Process

Restore is pretty easy.

  1. Use the .nzb to download your backup
  2. Use sha512sum and the backup_summary.txt to verify your files are good
  3. Do cat *.part* > backup-file.tar.7z.gpg to stitch your encrypted file back together
  4. Now decrypt with GnuPG, decompress with 7-Zip and extract the tar archive.
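
The restore steps can be sketched in the same way. This is only an illustration under a few assumptions: the function name restore_archive is made up, the parts are named <prefix><sha256>.partXX as in the backup steps, and sha512sum is the GNU coreutils version, because --ignore-missing is needed to skip summary entries for files that were never uploaded.

```shell
# restore_archive BASE OUTPUT [SUMMARY]: stitch BASE.part* into OUTPUT and
# verify the result. (Hypothetical helper name, chosen for this sketch.)
restore_archive() (
    base=$1
    output=$2
    summary=${3:-backup_summary.txt}
    # split names its chunks in sorted order (aa, ab, ...), so the glob
    # concatenates them in the right sequence.
    cat "$base".part* > "$output" || exit 1
    # The last 64 characters of BASE are the SHA-256 of the whole file.
    expected=$(printf '%s' "$base" | tail -c 64)
    actual=$(sha256sum "$output" | head -c 64)
    if [ "$expected" != "$actual" ]; then
        echo "hash mismatch for $output" >&2
        exit 1
    fi
    # OUTPUT must carry the original file name to be matched in the summary.
    sha512sum -c --ignore-missing "$summary" >/dev/null
)

# Possible usage after the download has finished:
#   restore_archive abc_<CHECKSUM> backup-file.tar.7z.gpg
#   gpg -o backup-file.tar.7z -d backup-file.tar.7z.gpg
#   7z x -so backup-file.tar.7z | tar --xattrs -xpvf -
```

Because the file name already contains the SHA-256 of the encrypted archive, the stitched file can be checked even if backup_summary.txt got lost; the summary check is the second line of defense.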

Final Notes

I have only been using this approach for a few days, so I’m not totally certain that your files will stay on Usenet forever, but most providers have a retention time of a couple of years, so you should be good. But like any good hacker you should have other backups on separate offline media anyway (also encrypted of course).