Backups: Deja-Dup and Obnam

At this point, everyone in my social circle knows that I’m obsessed with backups. I value my data, and I want to know that I have another copy available to me in the event of theft, fire, flood, disk failure, etc. I also value my privacy as well as the integrity and authenticity of my data. In other words, I want a comprehensive encrypted backup of my digital assets.

Over the past 2 years, I’ve been using Deja-Dup (a GUI front-end to duplicity) to create automatic encrypted backups of my user data, and it’s been great. It’s easy to configure what files to include and exclude. It uses GPG to provide symmetric encryption, and optionally stores the passphrase in the GNOME keyring, which is stored in an encrypted format and only decrypted when you log in to your account. It also runs as a daemon in the background, so it can initiate backups automatically without manual intervention or user cronjobs.

My only complaint about Deja-Dup is that it’s tied to a single location. Over the past few months, I’ve wanted to rotate external hard drives, but Deja-Dup doesn’t have a mechanism to seamlessly allow this. While I’ve been communicating with the maintainer of the program, I’m not entirely sure that Deja-Dup will meet my current use case. While I plan to keep using Deja-Dup as one of my tools for local backups, I think I’ll have to use something else for my rotating backups.
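To make the rotating-drive idea a bit more concrete, here is a minimal sketch of what I mean by "seamless": given a list of drives that take turns being plugged in, use whichever one is currently mounted. The mount points and the `pick_repository` helper are hypothetical names of my own, not part of Deja-Dup or any other tool.

```python
import os


def pick_repository(candidates, is_available=os.path.ismount):
    """Return the first candidate path that is currently available.

    `candidates` is a list of hypothetical mount points for the drives
    in the rotation. `is_available` defaults to os.path.ismount, but it
    can be swapped out (e.g. in tests) for any function that takes a
    path and returns True or False.
    """
    for path in candidates:
        if is_available(path):
            return path
    return None  # no drive from the rotation is plugged in


# Hypothetical mount points for two drives that get swapped weekly.
DRIVES = ["/media/backup-a", "/media/backup-b"]

if __name__ == "__main__":
    target = pick_repository(DRIVES)
    print(target if target else "No backup drive is mounted")
```

The backup tool would then be pointed at whatever path comes back, so it doesn't matter which drive happens to be connected that week.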

I started interviewing friends and colleagues about their backup strategies, and that’s how I learned of Lars Wirzenius’s tool “Obnam”. It doesn’t have a GUI but I adore the command-line, so it’s really all the same to me. It’s easy to configure using INI-style files. It uses GPG to provide asymmetric encryption (while this requires a GPG keypair and not just a passphrase, it’s less susceptible to brute-force attacks than symmetric encryption). In other words, it’s comprehensive and encrypted.
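For anyone who hasn't seen an INI-style file before, here is a rough sketch of what such a configuration looks like, read with Python's standard `configparser` module. The section and key names (`[config]`, `repository`, `encrypt-with`, `log`) are what I recall from the Obnam manual, so check them against the manual before relying on them. The paths and the key ID are placeholders I made up.

```python
import configparser

# Hypothetical Obnam-style settings. The repository path, log path and
# key ID are placeholders, and the key names should be verified against
# the Obnam manual.
SAMPLE_CONFIG = """\
[config]
repository = /media/backup-drive/obnam-repo
encrypt-with = CAFEBEEF
log = /home/user/obnam.log
"""


def load_settings(text):
    """Parse INI text and return the [config] section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["config"])


settings = load_settings(SAMPLE_CONFIG)

if __name__ == "__main__":
    for key, value in settings.items():
        print(f"{key} = {value}")
```

The point is just that the whole setup is a handful of `key = value` lines under one section header, which is easy to read and easy to keep under version control.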

My only complaint after looking into it was that you couldn’t automate it as easily as Deja-Dup. With Obnam, you could set up a cronjob to routinely back up your data. However, that would require you to use a GPG key without a passphrase, which is suboptimal. If someone got their hands on your GPG key and your encrypted backup, they’d have unfettered access to your data. Of course, you could argue that if they have your GPG key, they might already have access to your unencrypted data and not *need* your backups. Alternatively, you could probably use a GPG key with a passphrase, and just store that passphrase in a file and feed that into your cronjob. However, it’s the same problem. You have the keys to the kingdom written down, which makes it that much less secure. Of course, if you’re willing to sacrifice some security for convenience, then why not?

It’s a tough one. As one interviewee mentioned, you have to consider your threat model. Lars also talks about this in the Obnam manual: http://code.liw.fi/obnam/manual/obnam-manual.en.html#backup-strategies

In my mind, there are a few scenarios:

1) Access to your unencrypted data on your computer = insecure
2) Access to an encrypted backup and a GPG key without a passphrase = insecure
3) Access to an encrypted backup and a GPG key with a recorded passphrase = insecure
4) Access to an encrypted backup and no GPG key = secure
5) Access to an encrypted backup and GPG key with passphrase = (reasonably) secure

In the first case, the attacker already has access to your system, so any speculation about backups is moot.

In the latter cases, the attacker has gained access to your encrypted backup either by obtaining a physical copy (stealing a physical disk) or illegitimately accessing a backup server containing the encrypted data. As for how they have obtained a copy of your key, there are a few ways. Perhaps there was a copy of your key on the backup server. Perhaps they’ve found a backup of your key (although you should protect this with encryption as well). Suffice it to say, it’s possible that they have your GPG key and your encrypted backup. However, if you have a key with a passphrase and you haven’t written that down, you’re as secure as can be.
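The backup scenarios (2 through 5) boil down to a small decision procedure, which I've sketched here as a function. This is only my own rough model: the function name and parameters are invented for illustration, and it assumes the worst case that a recorded passphrase will be found by whoever has the key.

```python
def assess(has_backup, has_key, key_has_passphrase=False,
           passphrase_recorded=False):
    """Return a rough verdict for what an attacker is holding.

    Mirrors scenarios 2-5 in the list above. Scenario 1 (direct access
    to the unencrypted data) is out of scope, since backups are moot.
    """
    if not has_backup:
        return "not applicable"      # nothing to decrypt
    if not has_key:
        return "secure"              # scenario 4
    if not key_has_passphrase:
        return "insecure"            # scenario 2
    if passphrase_recorded:
        # Scenario 3: assumes the attacker also finds the written-down
        # passphrase, which is the pessimistic reading.
        return "insecure"
    return "reasonably secure"       # scenario 5


if __name__ == "__main__":
    print(assess(has_backup=True, has_key=True, key_has_passphrase=True))
```

Written out like this, it's clear that the passphrase in your head is the only thing standing between scenario 5 and scenarios 2 and 3.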

When I asked my interviewees (including Lars Wirzenius himself) about how they use Obnam, they stated that they often didn’t use automated backups with Obnam. They used automated alarms and other methods to remind themselves to run their Obnam backup scripts using their passphrase-protected GPG keys.

I think there’s some merit to this idea. First, as I’ve mentioned, it’s the most secure method. If your passphrase is just in your head, then it doesn’t matter if someone gets your private key and your encrypted backup. Second, it makes you much more active in backing up your data. While humans are more prone to forget manual backups than a computer is to forget an automatic backup, it’s useful to develop a habit of consciously backing up your data. This mindset translates well to the workplace and to other devices which you might not be backing up as you should.

In any case, I now have Obnam working using a GPG keypair with a strong passphrase. While I haven’t decided how I want to prompt myself to remember to initiate manual backups, I’m sure I’ll think of something.

Current ideas include:
1) Automated alerts via a calendar
2) A highly visible launcher on the desktop (or perhaps on the sidebar in Unity once I transition fully to Ubuntu)
3) A GUI pop-up reminder as part of a login or logout script
4a) A daemon or cronjob that runs automatically but requires manual intervention via gpg-agent to obtain the passphrase
4b) A daemon or cronjob that runs automatically which uses gpg-agent and another mechanism to provide the passphrase which has been saved but stored in an encrypted format
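Most of these ideas share the same core logic: work out how long it has been since the last backup and nag me if it's been too long. Here is a minimal sketch of that check. The one-week threshold, the function names, and the message wording are all my own assumptions. Where the "last backup" timestamp comes from (a file the backup script touches, for example) is left out.

```python
from datetime import datetime, timedelta

# Assumed threshold: nag if the last backup is more than a week old.
DEFAULT_MAX_AGE = timedelta(days=7)


def backup_is_overdue(last_backup, now, max_age=DEFAULT_MAX_AGE):
    """Return True if more than `max_age` has passed since `last_backup`."""
    return now - last_backup > max_age


def reminder_text(last_backup, now, max_age=DEFAULT_MAX_AGE):
    """Return a reminder message, or None if the backup is recent enough."""
    if not backup_is_overdue(last_backup, now, max_age):
        return None
    days = (now - last_backup).days
    return f"Last backup was {days} days ago. Time to run Obnam!"


if __name__ == "__main__":
    # Example with made-up dates; a real script would read the timestamp
    # from wherever the backup script records it.
    message = reminder_text(datetime(2013, 1, 1), datetime(2013, 1, 10))
    if message:
        print(message)
```

A login script, a calendar alert, or a cronjob could all call something like this and only bother me when the message isn't empty.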

In the end, I think it all comes down to convenience vs security. In any case, I’m quite liking Obnam so far. If you’re thinking about how to do your backups better on Debian or Ubuntu, you should give it a try!

Open Source Software for Libraries: Experiments

Since I first arrived at Prosentient Systems in January 2012, I’ve been working on and experimenting with open source software for libraries. Initially, I was somewhat hesitant, since my background is in English, French, and Library and Information Studies. However, the more I’ve learned and experimented, the more I’ve wanted to continue working with open source software!

To that end, I recently purchased a new desktop computer (my first since 2003). With a 2.6GHz processor and 4GB of RAM, it’s fast enough to do whatever I might want to do.

Step 1: Install Debian (i.e. Linux)

DONE!
(This involved installing the base system, adding myself as a sudo user, getting the sound working [I like working to music] by manually editing the alsa-base.conf to detect my integrated Intel sound card, installing Chromium as a browser, setting up some firewall software, and installing vim and vim-runtime [as Debian only comes with vim-common and vim-tiny files which aren’t quite enough for my text editing needs].)

(I’ve also thought about setting up an SSH server so I can also remote into my Linux box from my Windows netbook using PuTTY, but…I’m willing to put that off for the time being.)

Step 2: Install Koha using the community-generated Debian packages

IN PROGRESS

(Well, not quite. It’s 9:11pm, so it’s Doctor Who time. However, later in the week, I’m going to give it a go using the instructions available at koha-community.org.

Since I’m already a fairly active Koha developer, I’m pretty familiar with the code, and I’ve already done quite a bit of troubleshooting, so I’m not really worried about this. I can handle MySQL (the database). I can handle Zebra (the indexing engine). Admittedly, to date, I’ve only done an assisted standard install and an assisted dev install (since I didn’t have root access to the server). However, I’m pretty confident that I can get Koha up and running. Actually, after I do a standard install via the packages, I might set up a regular dev install and a standard install (from the Git clone).

I might also try installing from a downloaded Tarball, as well as setting up a dev install using “Koha Gitify”.

There are lots of different ways of obtaining Koha code and setting up an instance, and I want to try them all!)

So thinking again about the knowledge that I might need to set up Koha…I like to think again of the LAMP acronym.

L->Linux (I’ve installed Debian, and I’m reasonably proficient using the command line)
A->Apache (I see how Koha uses apache, I’ve gone through the files, and I’ve set up my own web server in the past using apache, so…it might take a few tries, but it’ll be all right)
M->MySQL (it’s a relational database. While I typically interact with it using a GUI, I’m sure I can handle it from the command-line as well. The GUI might be a bit faster and provide easier scrollback, but I can also install one if I really want to.)
P->Perl (While I still want to improve my design skills, I have yet to meet a Perl script/module in Koha that I haven’t been able to understand [thanks to my own persistence and the generous help of some very skilled and amicable Koha community developers]. Given time, all the code in Koha is understandable.)

I doubt that everyone setting up Koha is going to need to have an in-depth knowledge in all these areas, but I’m sure it helps. Hopefully, it will mean I don’t have to hassle people in #koha too much ;).

Step 3: Install DSpace

EVENTUALLY

(While I have quite a bit of experience modifying the DSpace JSPUI and some of its Java classes, I don’t have extensive experience writing Java, compiling Java, working with PostgreSQL, or troubleshooting DSpace. So…this might be a project I leave for a little while. I’m keen, but I’m more active in the Koha community and I find that Koha is much more relevant to the majority of libraries than DSpace.)

Step 4: Who knows?

I’m thinking of trying out lots of different systems. Here is a list that I’m pondering:

1) Archivematica (originating from Vancouver, BC – it is used for digital preservation)
2) VuFind (a PHP-based discovery layer for library applications)
3) WordPress (as a Content Management System)
4) Evergreen (the open source library management system/integrated library system)
5) Drupal (the CMS)
6) Islandora (digital library/archive)
7) Fedora Repository (digital library/archive)
8) Greenstone (digital library/archive)
9) Kete (digital library/wiki?)

Does anyone have any ideas about other open source software for libraries that I haven’t mentioned and that might be worth trying out?

While I only have Linux on this machine, I could create another partition for Windows XP (I still have my old desktop install disk lying around) or I could set up a Windows XP VM in VirtualBox. So…send me a message, post a comment, or give me a shout and let me know anything else I should try.

For now…Doctor Who Series 7 Finale!

Ghostery, Adblock+, etc.

Had a colleague getting some JavaScript security errors when trying that HTML5 game…

Obviously, the first suspects were browser security plugins, which reminded me that I haven’t looked extensively at all the different ones out there like Ghostery, Adblock, NoScript, RequestPolicy, etc.

I’ll have to do that sometime down the line (hmm, so many ideas, so little time). Until then, here is a short discussion about Ghostery, Adblock, and some others…

Ghostery

Looks like it might be an interesting blog for security in general…

The Guide on the Side

For all of you instructional librarian folk or anyone else doing online tutorials:

You might want to take a look at “The Guide on the Side”. It is software that allows you to add interactive (and easy to create) tutorials to your library resource web pages!

[Note: It seems that if you don’t have access to your own server, you’ll need someone to host the actual software for you.]

Check out the University of Arizona’s JSTOR tutorial: http://bit.ly/zA9DCf

http://americanlibrariesmagazine.org/columns/practice/guide-side