Highlander

No, this is not about the movie, but about Indy's new friend. Luca is a black-golden Highlander, a breed of cat you'll find on Wikipedia only under British Longhair, a name that doesn't do them any justice. Luca is gaining weight even faster than Indy, and the moment when Indy can no longer simply wrestle him to the ground with one paw and sit on his face is not far off. 😄

[Photos: Luca]

Attic

Attic is the candidate in this list of backup programs that managed to convince me entirely. It has a lot in common with the one I've written about below, i.e., Obnam: it is written in Python and offers very similar features, namely snapshot backups, data de-duplication across files and backup generations, and optional AES (instead of GPG) encryption. It's available for Debian (Jessie and Sid) and Archlinux; for all others (including Debian Wheezy and Fedora), you can use 'pip install attic' instead.

Attic doesn't have a separate configuration file. The following bash script handles everything:

#!/bin/bash
#https://attic-backup.org/quickstart.html#quickstart
#http://peterjolson.com/full-system-backup-using-attic-backup-to-nfs/
# run the backup with idle I/O priority so it doesn't hog the disk
ionice -c3 -p$$
repository="cobra@blackvelvet:/bam/backup/attic/deepgreen"
excludelist="/home/cobra/bin/exclude_from_attic.txt"
hostname=$HOSTNAME

notify-send "Starting backup"
attic create --stats                                    \
  "$repository::$hostname-$(date +%Y-%m-%d--%H:%M:%S)"  \
  /home/cobra                                           \
  --exclude-from "$excludelist"                         \
  --exclude-caches
notify-send "Backup complete"

# thin out old snapshots: keep everything from the last day, plus
# 7 daily, 4 weekly, and 6 monthly archives
attic prune -v "$repository" --keep-within=1d --keep-daily=7 --keep-weekly=4 --keep-monthly=6
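
Note that the repository has to be initialized once before the first run; add '--encryption=passphrase' here if you want the optional AES encryption mentioned above:

attic init cobra@blackvelvet:/bam/backup/attic/deepgreen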

Attic is a little more verbose than Obnam in its reports:

Date: Thu,  4 Sep 2014 19:00:31 +0200 (CEST)
From: "(Cron Daemon)" <cobra@blackvelvet.localdomain>
To: cobra@blackvelvet.localdomain
Subject: Cron <cobra@blackvelvet> /home/cobra/bin/backup.attic

Archive name: blackvelvet-2014-09-04--19:00:04
Archive fingerprint: e20c50dbdf43f9d31ee138338aab929ae78d82e4564bf6f05f55edd439a73e35
Start time: Thu Sep  4 19:00:04 2014
End time: Thu Sep  4 19:00:31 2014
Duration: 27.40 seconds
Number of files: 166064

                       Original size      Compressed size    Deduplicated size
This archive:               41.71 GB             30.56 GB             21.05 MB
All archives:              995.98 GB            732.81 GB             26.54 GB
------------------------------------------------------------------------------
Keeping archive: blackvelvet-2014-09-04--09:00:01     Thu Sep  4 09:00:29 2014
Keeping archive: blackvelvet-2014-09-04--08:00:01     Thu Sep  4 08:00:31 2014
Keeping archive: blackvelvet-2014-09-04--07:00:01     Thu Sep  4 07:00:33 2014
Keeping archive: blackvelvet-2014-09-03--23:00:01     Wed Sep  3 23:00:23 2014
Keeping archive: blackvelvet-2014-09-03--22:00:01     Wed Sep  3 22:00:29 2014
Keeping archive: blackvelvet-2014-09-03--21:00:01     Wed Sep  3 21:00:29 2014
Keeping archive: blackvelvet-2014-09-03--20:00:01     Wed Sep  3 20:00:27 2014
Keeping archive: blackvelvet-2014-08-31--23:00:01     Sun Aug 31 23:00:24 2014
Keeping archive: blackvelvet-2014-08-30--23:00:01     Sat Aug 30 23:00:27 2014
Keeping archive: blackvelvet-2014-08-29--07:53:53     Fri Aug 29 07:54:13 2014
Keeping archive: blackvelvet-2014-08-28--22:31:53     Thu Aug 28 22:32:18 2014
Pruning archive: blackvelvet-2014-09-03--19:00:01     Wed Sep  3 19:00:25 2014
Pruning archive: blackvelvet-2014-09-03--18:00:01     Wed Sep  3 18:00:22 2014
Pruning archive: blackvelvet-2014-09-03--17:00:01     Wed Sep  3 17:00:26 2014
Pruning archive: blackvelvet-2014-09-03--16:00:01     Wed Sep  3 16:00:24 2014
Pruning archive: blackvelvet-2014-09-03--15:00:01     Wed Sep  3 15:00:26 2014
Pruning archive: blackvelvet-2014-09-03--14:00:01     Wed Sep  3 14:00:23 2014
Pruning archive: blackvelvet-2014-09-03--11:00:01     Wed Sep  3 11:00:23 2014
Pruning archive: blackvelvet-2014-09-03--10:00:01     Wed Sep  3 10:00:26 2014

I've decided to use Attic instead of Obnam for one main reason: speed. On the mini, the slowest member of my computing battalion, 15000 files occupying 4.7 GB take Attic 17 s to back up, but Obnam 95 s. On blackvelvet, the fastest by far, 165000 files occupying 42 GB take 25 s instead of 70 s.

What are the most important factors determining Attic's performance? Well, it depends. If you back up to a remote location, your upload speed may ultimately limit the overall speed. If that's not the case, because you're in a GBit LAN or back up to an internal hard disk, it still depends. If a large number of files have to be accounted for, the critical factor is the I/O performance of the mass-storage device. Otherwise, Attic puts a 100% load on one CPU core regardless of its performance class. Attic thus utilizes the available resources quite nicely. As a consequence, Attic's runtime scales roughly with the number of files for machines of comparable performance. For example, my lifebook here at home has 16000 files spread over 3.3 GB, and a snapshot takes Attic a mere 6 s. My office computer has essentially the same disk and CPU performance (Core 2 Duo class CPU, 320 GB HD), but more data: 255000 files (×15) spread over 67 GB (×20), taking 90 s (×15).

Three more remarks:

Attic's deduplication and compression schemes are apparently more efficient than Obnam's, since the total size of the repository is significantly smaller for a similar number of snapshots. The repositories produced by rsnapshot or rdiff-backup are much larger anyway, since these tools do not compress the data.

A backup to a remote location is accelerated if Attic is installed on both client and server. However, you can still back up to an atticless server using sshfs (with a small performance hit).
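
A minimal sketch of the sshfs route (the mount point and the remote path are made up here):

# mount the remote backup space locally ...
sshfs cobra@nas:/backup /home/cobra/nas_mnt
# ... and treat it like a local repository
attic create --stats "/home/cobra/nas_mnt/deepgreen.attic::$HOSTNAME-$(date +%Y-%m-%d)" /home/cobra
fusermount -u /home/cobra/nas_mnt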

Finally, you can restore your data in the same way as with Obnam, namely, by mounting the repository:

attic mount cobra@blackvelvet:/bam/backup/attic/deepgreen /home/cobra/attic_mnt/
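
Restoring then boils down to an ordinary file copy; when you're done, release the FUSE mount again (the archive name is the one from the report above, the target directory is made up):

cp -a /home/cobra/attic_mnt/blackvelvet-2014-09-04--19:00:04/home/cobra/Documents /home/cobra/restored/
fusermount -u /home/cobra/attic_mnt/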

Nice.

Obnam

Obnam is one of the two backup programs in my list which I highly recommend. It offers snapshot backups (in the spirit of btrfs snapshot subvolumes), data de-duplication across files and backup generations, and optional GnuPG encryption. Archers can get it on the AUR, and Debilians may obtain the current version from the developer's repository.

Obnam is ridiculously easy to configure and use:

[config]

repository = sftp://blackvelvet/bam/backup/nb_snapshot/deepgreen/

keep = 7d,4w

lru-size = 1024
upload-queue-size = 512

log = /home/cobra/.obnam/logs/default.log
log-level = warning
log-max = 10mb
log-keep = 10
log-mode = 0600

exclude = \.cache$, \.thumbnails$, \.tmp$, /cache/, /Downloads/, /temp/, /Trash/, /VirtualBox VMs/, /wuala/

Save this file (as ~/.config/obnam/default.conf, for example), modify it to your needs, and execute obnam either directly

obnam --verbose backup $HOME

or via a small shell script:

#!/bin/bash
notify-send "Backup started, please be patient..."
obnam backup "$HOME"
if [ "$?" -ne 0 ]; then
  notify-send "Unable to complete backup."
  exit 1
else
  notify-send "Backup successfully completed."
fi

On my notebooks, I run this script manually, but on my desktops, I've added an entry to my crontab that runs it hourly between 7:00 and 23:00:

0    7-23  * * *   /home/cobra/bin/backup.obnam

The cron daemon sends a mail to report what has happened:

Date: Fri, 29 Aug 2014 09:01:11 +0200 (CEST)
From: "(Cron Daemon)" <cobra@blackvelvet.localdomain>
To: cobra@blackvelvet.localdomain
Subject: Cron <cobra@blackvelvet> /home/cobra/bin/backup.obnam

Backed up 77 files (of 187140 found), uploaded 63.0 MiB in 1m9s at 934.9 KiB/s average speed

That's a snapshot of my desktop with a total backup volume of 42 GB. Without the two lines in the config customizing the lru- and upload-queue sizes, a snapshot takes about 10 min, i.e., nine times longer. This mediocre performance with the default settings is certainly one of the reasons for the numerous reports of Obnam being nice but slow. The speedup obtained by changing these settings, however, depends on your hardware: on the Mini, Obnam is CPU limited, and one snapshot (of 5 GB) takes 90 s no matter what the lru- and upload-queue sizes are.

Restoring data is as easy as creating the backup. You can simply mount the backup, like this:

obnam mount --to /bam/obnam_mnt/

and access the resulting read-only filesystem with anything you like. Nice.
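
When you're done, release the mount again (this is plain FUSE, nothing Obnam-specific):

fusermount -u /bam/obnam_mnt/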

Better backup

There's no such thing as too many backups. An example.

A guy I know (let's call him user C) amused himself by playing with an incremental backup scheme based entirely on tar:

#!/bin/bash
# Based on ideas of Alessandro "AkiRoss" Re
#http://blog.ale-re.net/2011/06/incremental-backups-with-gnu-tar-cron.html
MONTH=$(date +%Y%m)
DAY=$(date +%Y%m%d)
BCKDIR=/bam/backup
ARCDIR=/bam/backup/archives
SRCDIR=/bam/backup/snapshots/deepgreen
TARGETFILE="$ARCDIR/nb_snapshot_$DAY.tar.zz"
LOGFILE="$ARCDIR/nb_snapshot_$MONTH.snar"
EXCLUDEFILE="$BCKDIR/nb_socketlist"
# -g keeps a monthly snapshot file, so each run archives only what has
# changed since the previous one; pigz compresses the stream in parallel
tar -c -X "$EXCLUDEFILE" -g "$LOGFILE" -f - "$SRCDIR" | pigz --fast --rsyncable --zlib > "$TARGETFILE"
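
For the record, restoring from such a chain means replaying the level-0 archive and all subsequent increments in order (a sketch: /restore is made up, and I rely on pigz decompressing its own zlib output):

for f in /bam/backup/archives/nb_snapshot_*.tar.zz; do
  pigz -dc "$f" | tar -x -g /dev/null -C /restore -f -
done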

The script worked well, and the backups kept piling up. User C realized that he needed a mechanism to restrict the time span over which backups are kept, and thus added the following line to the end of the script:

find $ARCDIR -type f ! -newermt "1 month ago" -print0 | xargs -0 rm -f

Well, actually ... the line user C really used was ... hmmm ... slightly different. Instead of referring to '$ARCDIR', he addressed '.'.
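
That is, the line that actually ran was effectively this (with the working directory being, in this case, the home directory):

find . -type f ! -newermt "1 month ago" -print0 | xargs -0 rm -f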

Ooops.


User C is of course my own dumb self. But dumb or not, my backup concept has worked: I had a backup taken minutes before the unintentional purge of my entire home directory, and was able to resume normal operation within the hour. This hourly incremental backup resides on an internal disk different from the one holding my system and home partitions, and is synchronized every night with my NAS.

Now, let's get technical. How do I ensure that I have a backup when I need it?

That's really very simple: I use software that makes creating and restoring backups as foolproof as possible. And that means:

(i) The backup software is command-line oriented and thus fits seamlessly into a cron- and script-based infrastructure, making automation of the whole process a breeze.
(ii) The backup software's configuration is script-based and thus transparent and straightforward to revise, keep, and document.

For several years now, I've used rsnapshot at home and rdiff-backup at the office. To save the home directories of my notebooks, I transferred them to my desktop using plain rsync; the tar script above then created incremental backups of these directories as well. That's a very heterogeneous and altogether outdated solution, and although it has served me well for years, I've recently decided to give my backup scheme a complete overhaul. In particular, I wanted one solution covering all my use cases. To my delight, I've discovered that a number of new backup programs for Linux are under active development at present, some of which boast features such as global deduplication, efficient compression, and optional GPG encryption.

Here's a (certainly incomplete) list of contenders I will examine in the near future:

attic
backshift
burp
hashbackup
obnam
zbackup

In any case, when the day ends, and all hourly backups have been done, I sync them to the NAS:

#!/bin/bash
rsync -az /bam/backup/ cobra@thecus::snapshot_blackvelvet

I know, I know: if the house burns down, I'm going to lose this replica of my backup as well. Gimme a 100 Mbit/s upload. Then we'll talk.

Get to know your machine

I was recently asked by a *buntu user why the games he'd like to play "stutter". I asked him which graphics card he has, but he had no idea, nor did he know where to look.

There are several ways to get this information, including the canonical 'lspci' and 'lshw' as well as 'hwinfo'. By far the easiest, however, is inxi. 'inxi -G' returns the graphics card and driver, and even the full information provided by 'inxi -F' is presented in a way that is pleasing to eye and mind:

[Screenshot: output of 'inxi -F']
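
For reference, the corresponding one-liners (the grep pattern is simply my habit, adapt as you see fit):

# the quick way: graphics card and driver only
inxi -G
# the canonical way, including the kernel driver in use
lspci -nnk | grep -iA3 vga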

On Arch, inxi is in the community repository. On Debian Wheezy, you can get inxi as described here.

Bigiron 2

Bigiron was phased out on May 1st after a long and productive life. Its successor is on the way, and as I've predicted previously, it's no longer powered by AMD:

[Photo: Osmium]

I'd prefer to run Debian Wheezy on this newcomer just as on Bigiron and Tungsten, our other workstation, but some software we intend to use may not install on a Debian system. Anyway, I'll post benchmarks as soon as Bigiron 2 aka Osmium is up and running.

Browsers on steroids

In May 1999, Intel advertised their 550 MHz Pentium III as being designed for the internet. We all couldn't stop laughing, since even our existing hardware could saturate the modems or (at best) ISDN cards many times over. My laughter got stuck in my throat after I got my first DSL line (768 kbit/s, i.e., 12 times ISDN!) in July of that year. It was like going from Doom to Quake on a Voodoo card. A revolution, a breakthrough, something to be witnessed!

At present, I have an average and completely unremarkable ADSL2+ connection with a bandwidth not much above 12 Mbit/s down (and 1 Mbit/s up 😞 ) and a latency never below 40 ms. Compared to my office, these numbers are mediocre: the connection there offers a bandwidth of 100 Mbit/s (up and down) and a latency of 4 ms. That should guarantee a very pleasant internet experience, wouldn't you think?

Alas, all of our http traffic is passed through the content scanners of our Cisco gateway, and the resulting browsing experience is comparable to the one I had on my trusted Elsa Microlink 56k modem. Often even worse. But that's another story.

At home, I'm limited by the high latency of my connection rather than by its comparatively low bandwidth. Current web sites link to a myriad of other sites, resulting in an avalanche of DNS requests each of which takes time (yes, my Fritz!Box has a tiny DNS cache, but it is just that: tiny).

Most of these secondary sites are related to ads, which we don't want to see (or hear). Every halfway IT-literate person thus uses adblockers. Alas, these indispensable extensions may surpass their host in memory consumption. I don't care about that at all on my desktop with its 16 GB of RAM, but what am I going to do when I want to couch-surf with my teeny-tiny mini?

Well, obviously: I use the desktop with its excessive resources as server and proxy. To start with, the desktop runs a caching DNS server for my LAN. Tests with namebench show that about half of my requests are cache hits and are thus virtually instantaneous, resulting in a very noticeable speedup on all clients. In addition, the desktop serves as a filtering and caching proxy by passing the traffic of all clients through a privoxy-polipo chain. The latter is simply a global replacement of the browser disk cache (which thus should be disabled). The former is a filtering proxy removing the majority of ads, banners, and other annoyances from the pages we visit. After this step, a local adblocker taking care of the rest has little to do and takes few resources.
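
A rough sketch of the moving parts (I haven't named the DNS server above; dnsmasq stands in for it here, and all addresses and ports are examples):

# /etc/dnsmasq.conf -- caching DNS for the LAN
listen-address=192.168.1.2
cache-size=10000

# /etc/privoxy/config -- filter, then hand everything on to polipo
listen-address 192.168.1.2:8118
forward / 127.0.0.1:8123

The clients then use 192.168.1.2 as their DNS server and 192.168.1.2:8118 as their HTTP proxy.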

Sitting on the couch, next to my wife with her Nexus 7. She: "Hey look, go to this address!" Me: /type/.../tschadong/. She: "Why didn't you say that you had it already open?"

Browsing the web has never been snappier.

Squandering

[Figure: three rectangles in a hexagon]

A collaborator tries to send me a sketch of three rectangles in a hexagon. The file, however, is too large for our mail server, which is configured to accept only attachments smaller than 100 MB. I inform him that such a drawing saved as a vector graphic should be just one or two kB in size, and send him this example:

He replies:

"The new Figure was drawn in AutoCAD, which does not surport (sic) EPS well. So we saved it as TIFF."

Uncompressed (and with a ridiculously high resolution), of course. m(
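
A back-of-the-envelope calculation (the resolution is my guess, but a representative one):

10000 × 10000 px × 3 B/px = 300 MB for an uncompressed 24-bit TIFF,
compared to ~2 kB for the equivalent EPS: a factor of roughly 10^5.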

At first glance, this incident may appear to be a classic example of Maslow's hammer: if all you have is a hammer, everything looks like a nail. But it is rather the combination of good intentions and essentially complete ignorance that is responsible for the sad result.

(i) "Dr. B always told me to use vector graphics when making a sketch. Instead of this open source crap he always advocates, let's use professional software!"
(ii) "Hm, the EPS export doesn't work! [1] What I'm going to do?"
(iii) "Let's save it in one of these other formats. Ah, JPEG...oh no, Dr. B doesn't like that for some reason. Something about compression."
(iv) "TIFF, I know that too! And here, it states 'uncompressed', that's just perfect!"
(v) "Better to chose a really high resolution, so Dr. B will never see the difference. Gnihihihihi!"

[1] PEBKAC: the EPS export of AutoCAD works flawlessly, of course.

The guy has a PhD in physics, and yet the five orders of magnitude in file size somehow didn't register with him.

Why do I care? Well, for one, I like things to be done efficiently and professionally, particularly at work, where I do not intend to waste my valuable time. Second, the people trying to send 100 MB+ files by e-mail are the same ones who complain about disk space all the time. Their 1 TB hard drive is "WAY TOO SMALL !!" Their 1 GB mail quota is "RIDICULOUS !!!!" Their 5 GB owncloud space is ... well, you get the drift.

These people, who mostly belong to the 'Generation Smartphone', also seem to have an overwhelming need to access all of their data anytime and everywhere, no matter how old and irrelevant these data may be. As a result, the amount of data we have to back up has quadrupled over the past four years, and threatens to double again within two. We are currently evaluating a possible solution for this problem, but whatever we do, it will be significantly more cost-intensive than imagined by the "why-don't-you-just-buy-a-3TB-drive-for-me" faction.

Timeline

[Image: example]

Once CLI, always CLI. 😉

Does this choice imply that I'm a hopeless relic from the past, a dinosaur who can't deal with the changes that took place over these three decades? Not at all. It is my conviction that the CLI is the only complete and consistent interface between computers and humans developed to date. Nobody has expressed that better than Luke in his insightful entry on the Command Line and User Friendliness:

The CLI is much closer to the way we operate in real life – it is a conversational user interface. You “speak” to the computer and it responds back to you. It is the most intuitive, most natural and easiest to grasp type of UI we have invented so far.

Focusing on the command-line and command-line based applications has a beneficial side effect: it's good for my health. Many users get all excited about changes in the GUI of the OS or the applications they are using. Think about Windows 8, or MS Office and the ribbons. Or KDE4, Gnome 3 and Unity in the open-source world.

I usually don't care about these changes at all. I don't have to, since I don't depend on GUIs: I do all essential things using the shell and an editor. My last publications, in particular, were produced by LaTeX and pylab, with code I wrote: in an editor.

Whenever I have to open figures created by coworkers in Origin, I get a lesson in anti-computing. I won't dwell on the fact that the virtual Windows 7 needs to update at this very moment and exhibits the responsiveness of Siberian honey from the stone age. No, Origin alone is fascinating enough. If we had a contest called "Maximize the number of buttons per dialog", the developers of Origin would win single-handedly. In any case, these excursions drive me crazy within minutes, and I'm eternally grateful that I usually don't need to click a hundred buttons in a dozen different sub-menus, but can simply type what I want.

Lightweight browsers

I stopped using Opera as my main browser when the development of Presto was terminated more than a year ago. It seems that there are not too many alternatives left, in particular for users of Linux: among the big five, only Firefox and Chromium run on the open-source operating system. The latter lacks functions important to me, and I thus grudgingly decided to get used to Firefox. Not without several extensions, of course. Besides those used by 99% of the 1% of users using extensions at all, I use Pentadactyl and find it indispensable in everyday use (hey, try to squeeze more usages of 'use' into one sentence).

In any case, the fox has grown fat and gets even fatter with all those extensions. Wouldn't it be nice to find a lightweight alternative with an analogous, vi-based UI but more modest requirements concerning computing resources? It sure would.

Here's a list of the alternative browsers I have encountered so far. The list is certainly not complete — far from it. I've marked projects which I believe to be abandoned with a dagger. For each browser, I've also listed the engine it uses, the mode it primarily operates in, and the Archlinux repository that contains it.

browser        engine       operation           repository
arora†         webkit       mouse               Extra
conkeror       gecko        keyboard (emacs)    AUR
dillo          independent  mouse               Community
dwb            webkit       keyboard (vi)       Community
elinks         independent  keyboard            Community
jumanji†       webkit       keyboard (vi)       AUR
links          independent  keyboard            Core
luakit†        webkit       keyboard (vi)       Community
lynx           independent  keyboard            Extra
midori         webkit       mouse               Community
netsurf        independent  mouse               Community
rekonq         webkit       mouse               Community
surf           webkit       keyboard            Community
vimb           webkit       keyboard (vi)       AUR
vimprobable2   webkit       keyboard (vi)       AUR
uzbl†          webkit       keyboard (vi)       Community
w3m            independent  keyboard            Extra
xombrero       webkit       keyboard (vi)       AUR

Conkeror really stands out in this list since it is kind of the antithesis to the myriad of webkit browsers with a vi-like interface based on either Qt or GTK: it uses XULRunner and offers emacs keybindings. A must for all emacs aficionados!

All of the webkit-based browsers are able to connect to https://fancyssl.hboeck.de, as are lynx, links, and w3m. Why is that remarkable? Well, neither Firefox nor Chrome can cope with the high security settings of this page, and I thus thought that dwb would make an excellent replacement for Opera, which I still use in a virtual machine for online banking. Alas, all webkit-based browsers in the Archlinux (and Debian Jessie) repositories are plagued by a bug which makes them segfault on certain transport-encrypted sites such as, for example, https://ing-diba.de. I'm currently trying to zero in on the actual origin of this bug.

Update: Posativ pointed out that the support of advanced encryption standards in webkit-based browsers doesn't mean that they won't connect to a server insisting on the RC4 cipher. That's absolutely true, unfortunately, as demonstrated by a visit to this site. If you'd like to examine the server instead of the client, go here.

And since webkit-based browsers are currently plagued by a myriad of bugs in the libraries they depend on, I've decided to stick to the big five for the moment, even in my spartan virtual machines running wmii as window manager. I've settled (rather arbitrarily) on Chromium for online banking and on Iceweasel for the VPN. The few ads I may encounter in the latter case are blocked by privoxy at its medium setting. Chromium doesn't offer any obvious configuration for proxies and cipher suites, but it respects the following command-line parameters, which route it through the local privoxy and disable the use of RC4:

chromium --proxy-server=localhost:8118 --cipher-suite-blacklist=0x0001,0x0002,0x0004,0x0005,0x0017,0x0018,0xc002,0xc007,0xc00c,0xc011,0xc016,0xff80,0xff81,0xff82,0xff83