Recently I decided I should make a permanent backup of my files on blank DVDs in case my backup hard drive fails. I had the following requirements:
- The data should be compressed, but performance should be favoured over compressed size.
- The data should be encrypted because some of it is private information that I don’t want others to be able to access (e.g. my gpg private key). However, I can’t use my gpg key pair for encryption: the private key is itself part of the backup, so after a hard drive failure I would have no key left to decrypt the backup with.
- The data is ~14GB in size so the backup needs to be split across multiple discs.
I thought I’d share my solution to this problem. I satisfied the above requirements by
- Using tar to collect my files and directories into one file and gzip for compression. Although bzip2 provides a smaller compressed file size it is significantly slower than gzip.
- Using gpg’s symmetric encryption, which uses a passphrase for both encryption and decryption instead of a public and private key pair.
- Using the split tool to break the encrypted compressed archive into 4699MB chunks so that I could burn these chunks on to single layer DVD+R discs.
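If you want to check the gzip versus bzip2 trade-off on your own data before committing to one of them, a rough sketch like the following may help. The function name compressed_size and the file name sample.txt are illustrative choices only, not part of the steps in this post; prefixing a call with time gives a speed comparison as well.

```shell
# Print the number of bytes a file occupies after being compressed with
# the given tool (gzip, bzip2, ...). The tool must accept -c to write the
# compressed data to standard output.
# compressed_size is a hypothetical helper name used for this sketch.
compressed_size() {
    tool=$1
    file=$2
    "$tool" -c "$file" | wc -c | tr -d ' '
}

# Example (sample.txt stands in for one of your own files):
#   time compressed_size gzip  sample.txt
#   time compressed_size bzip2 sample.txt
```

Running both on a representative file shows whether the extra time bzip2 takes buys enough space to matter for your data.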
I will run through the individual steps required to encrypt and decrypt. The majority of the steps are on the command line. I will assume that you are using BASH as your terminal shell.
Creating the backup
- We’ll create the compressed archive containing one or more directories that we wish to back up. Run the following command, where backup.tar.gz is the name of the archive you wish to create and /path/to/folder is the path to a folder you wish to add to the compressed archive. The -z option instructs tar to compress the archive using gzip. The -v option shows which files and directories are being added to the compressed archive as it is being created. The -p option preserves file permissions.
$ tar -cvpzf backup.tar.gz /path/to/folder/
You can add multiple files and folders to the archive in one go when creating it. An example command is shown below.
$ tar -cvpzf backup.tar.gz /path/to/folder1/ /path/to/folder2/ path/to/file
If you want to see what files/directories are in your compressed archive run the following command where backup.tar.gz is the name of the archive you created.
$ tar -tvf backup.tar.gz
Be careful: absolute file paths go “into” the archive, e.g. using /path/to/folder will recreate the folder structure path/to/folder inside the archive. To avoid this, use relative file paths when creating the archive, or use the --strip-components option when extracting (see the man page for tar).
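One way to keep the stored paths relative is tar’s -C option, which makes tar change directory before it starts adding files. The following is a minimal sketch of that idea; the function name archive_relative is hypothetical and only there for illustration.

```shell
# Archive a folder so that only its own name, not its full absolute path,
# is stored in the archive. archive_relative is a hypothetical helper.
#   $1 = archive to create
#   $2 = absolute path of the folder to add
archive_relative() {
    archive=$1
    folder=${2%/}    # drop a trailing slash, if any
    # -C changes into the parent directory first, so the archive stores
    # "folder/..." rather than "path/to/folder/...".
    tar -cpzf "$archive" -C "$(dirname "$folder")" "$(basename "$folder")"
}

# Example:
#   archive_relative backup.tar.gz /path/to/folder
#   tar -tzf backup.tar.gz
```

Listing the archive afterwards should show entries beginning with the folder’s own name.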
- Now we’ll encrypt our compressed archive using gpg’s symmetric encryption which uses a passphrase instead of a public and private key set. To do this run the following command where backup.tar.gz is the name of the compressed archive made in the previous step and backup.tar.gz.gpg is the name for the encrypted compressed archive that we wish to create.
$ gpg --enable-progress-filter --status-fd=0 --compress-algo uncompressed --output backup.tar.gz.gpg --symmetric backup.tar.gz
You will be asked to enter a passphrase and then confirm it. Do not forget this passphrase, because without it you will NOT be able to decrypt your backup.
By default the CAST5 algorithm is used for encryption (at least in the gpg version I used; the default may differ in other versions), but a different algorithm can be specified using the --cipher-algo option. Run gpg --version for a list of supported algorithms.
By default gpg will compress whatever it is encrypting. We don’t want this to happen because our archive has already been compressed, so the option --compress-algo uncompressed is specified to stop gpg compressing the data again.
The --enable-progress-filter --status-fd=0 options allow progress information to be shown. It will appear similar to what is shown below
PROGRESS backup.tar.gz ? 12181504 104857600
What this information means is documented in doc/DETAILS, available with the GnuPG source code. Essentially, the first number is the number of bytes processed so far and the second number is the total number of bytes to process.
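As a sketch of the --cipher-algo option mentioned above, the encryption step could be wrapped as follows. AES256 is only an example cipher; check that it appears in the output of gpg --version on your system before relying on it. The function name encrypt_archive is hypothetical.

```shell
# Encrypt a file with a passphrase using an explicitly chosen cipher.
#   $1 = file to encrypt
#   $2 = cipher name as listed by `gpg --version` (optional)
# The result is written to "$1.gpg". gpg prompts for the passphrase.
# encrypt_archive is a hypothetical helper used for this sketch.
encrypt_archive() {
    input=$1
    cipher=${2:-AES256}    # assumed default for this sketch only
    gpg --cipher-algo "$cipher" \
        --compress-algo uncompressed \
        --output "$input.gpg" \
        --symmetric "$input"
}

# Example:
#   encrypt_archive backup.tar.gz AES256
```

Decryption needs no matching option, since gpg reads the cipher that was used from the encrypted file itself.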
- Now we’ll split the encrypted compressed archive (backup.tar.gz.gpg) by running the following command where backup.tar.gz.gpg. is the prefix that is used for the filename of each chunk.
$ split --numeric-suffixes -b 4699MB backup.tar.gz.gpg backup.tar.gz.gpg.
This will split backup.tar.gz.gpg into chunks (the original file will be kept) of size 4699MB (see info split for the available multipliers).
Once this command has completed the result can be seen by running the following command
$ du --si *
4.7G backup.tar.gz.gpg.00
4.7G backup.tar.gz.gpg.01
2.7G backup.tar.gz.gpg.02
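To estimate in advance how many discs a backup will need, the chunk arithmetic can be sketched as below. The function name discs_needed is hypothetical. The default chunk size is 4,699,000,000 bytes because GNU split treats the MB suffix as a multiple of 1000 x 1000.

```shell
# Number of chunks needed to hold size_bytes when each chunk holds at
# most chunk_bytes (ceiling division).
# discs_needed is a hypothetical helper used for this sketch.
#   $1 = total size in bytes
#   $2 = chunk size in bytes (optional, defaults to 4699MB)
discs_needed() {
    size_bytes=$1
    chunk_bytes=${2:-4699000000}
    echo $(( (size_bytes + chunk_bytes - 1) / chunk_bytes ))
}

# Example: pass the size of the encrypted archive in bytes.
#   discs_needed "$(wc -c < backup.tar.gz.gpg)"
```

Adding chunk_bytes - 1 before dividing rounds the result up, so a final partial chunk still counts as a disc.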
- Now each of the chunks can be burned on to a single layer DVD+R disc. This is the one step I prefer not to use command line tools for and I chose to use k3b for the job. I advise that you instruct whatever disc burning software you use to verify the discs it burns. I also advise you make multiple copies of the discs so that if one is damaged you can still get your data!
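Alongside the burning software’s own verification, it may be worth recording a checksum for each chunk before burning, so that chunks copied back from a disc can be checked later. This is not part of the steps above; it is a sketch that assumes sha256sum from GNU coreutils is available, and the file name chunks.sha256 and both function names are arbitrary.

```shell
# Record a SHA-256 checksum for every chunk in the current directory.
# chunks.sha256 is an arbitrary file name chosen for this sketch.
record_checksums() {
    sha256sum backup.tar.gz.gpg.[0-9]* > chunks.sha256
}

# Verify the chunks in the current directory (e.g. after copying them
# back from disc) against the recorded checksums. Returns non-zero if
# any chunk differs or is missing.
verify_checksums() {
    sha256sum -c chunks.sha256
}

# Example:
#   record_checksums     # before burning
#   verify_checksums     # after copying the chunks back
```

Keeping a copy of chunks.sha256 on each disc means the check can still be made when the original hard drive is gone.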
The previous steps can be joined together using pipes, as illustrated below. The progress information isn’t particularly useful in this case, though, because gpg does not know the total number of bytes to be encrypted.
$ tar -czpvf - /path/to/folder | gpg --enable-progress-filter --status-fd=2 --compress-algo uncompressed --symmetric | split --numeric-suffixes -b 4699MB - backup.tar.gz.gpg.
Accessing the backup
- Copy the split chunks from the DVDs they were burnt on to your computer’s hard drive.
- Now we’ll rejoin the chunks. First switch to the directory you copied the chunks to, then run the following command
$ cat backup.tar.gz.gpg.* > backup.tar.gz.gpg
- Next we’ll decrypt backup.tar.gz.gpg by running the following command where backup.tar.gz is the decrypted compressed archive.
$ gpg --enable-progress-filter --status-fd=1 --output backup.tar.gz -d backup.tar.gz.gpg
Note you will be asked for the passphrase you used to encrypt with originally.
- You can now check the contents of the decrypted compressed archive by running the following command
$ tar -tvzf backup.tar.gz
If you wish to extract the files from the archive you can run the following command where /path/to/extract/to is the directory in which the extracted files and directories will be put.
$ tar -xvzf backup.tar.gz -C /path/to/extract/to
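Before extracting, it can be reassuring to confirm that the decrypted archive is intact. A minimal sketch, using gzip’s -t test option together with a tar listing; the function name check_archive is hypothetical.

```shell
# Check a .tar.gz without extracting it: gzip -t verifies the compressed
# stream, and listing the archive with tar confirms that the tar
# structure inside it is readable.
# check_archive is a hypothetical helper used for this sketch.
check_archive() {
    gzip -t "$1" && tar -tzf "$1" > /dev/null
}

# Example:
#   check_archive backup.tar.gz && tar -xvzf backup.tar.gz -C /path/to/extract/to
```

Chaining the extraction with && means nothing is unpacked unless the check succeeds.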
The previous steps can be joined together using pipes, as illustrated below.
$ cat backup.tar.gz.gpg.* | gpg --enable-progress-filter --status-fd=2 --decrypt | tar -xvzf - -C /path/to/extract/to
Note that gpg’s progress display doesn’t seem to work at all here. I’m not sure why.
I hope someone finds this useful.
