Opposite extremes

I have a CentOS virtual machine because I had to install it on a compute server in my office, and I keep it since it's such an interesting antithesis to the rolling release distribution I prefer for my daily computing environment. CentOS is by a large margin the most conservative of all Linux distributions, and it's sometime useful for me to have access to older software in its natural habitat. Just look at this table comparing the versions of some of the major packages on fully updated Arch and CentOS 7 installations:

                Current                         Arch                            CentOS
linux           5.1.15                          5.1.15                          3.10
libc            2.29                            2.29                            2.17
gcc             9.1.0                           9.1.0                           4.8.5
systemd         242                             242                             219
bash            5.0                             5.0                             4.2
openssh         8.0p1                           8.0p1                           7.4p1
python          3.7.3                           3.7.3                           2.7.5
perl            5.30.2                          5.30.2                          5.16.3
texlive         2019                            2019                            2012
vim             8.1                             8.1                             7.4
xorg            1.20.5                          1.20.5                          1.20.1
firefox         67.0.1                          67.0.1                          60.2.2
chromium        75.0.3770.100                   75.0.3770.100                   73.0.3683.86

You can easily see why I prefer Arch over CentOS as a desktop system.

But CentOS has it's merits, particularly for servers. There's no other distribution (except, of course, its commercial sibling RHEL) with a longer support span: CentOS 7 was released in July 2014 and is supported till end of June 2024. And that's not just a partial support as for the so-called LTS versions of Ubuntu.

Now, I've noticed that CentOS keeps old kernels after updates, befitting its highly conservative attitude. However, in view of the very limited hard disk space I typically give my virtual machines (8 GB), I got a bit nervous when I saw that kernels really seemed to pile up after a few updates. There were five of them! Turned out that's the default, giving the word “careful” an entirely new meaning.

But am I supposed to remove some of these kernels manually? No. I was glad to find that the RHEL developers had already recognized the need for a more robust solution:

yum install yum-utils
package-cleanup --oldkernels --count=3

And to make this limit permanent, I just had to edit /etc/yum.conf and set


Well thought out. 😉

What you don't want to use, revisited

A decade ago, I advised my readers to stay away from OpenOffice for the preparation of professional presentations, primarily because of the poor support of vector graphics formats at that time. In view of the difficulties we have recently encountered when working with collaborators on the same document with different Office versions, I was now setting great hopes in LibreOffice for the preparation of our next project proposal. First of all, I thought that using platform-independent open source software, it should be straightforward to guarantee that all collaborators are using the same version of the software. Second, the support for SVG has been much improved in recent versions (>6) of LibreOffice, and I believed that we finally should be able to import vector graphics directly from Inkscape into an Office document. Third, the TexMaths extension allows one to use LaTeX for typesetting equations and to insert them as SVG, promising a much improved math rendering at a fraction of the time needed to enter it compared to the native equation editor. Fourth, Mendeley offers a citation plugin for LibreOffice, which I hoped would make the management of the bibliography and inserting citations as simple as with BibTeX in a LaTeX document.

Well, all of these hopes were in vain. What we (I) had chosen for preparing the proposal (the latest LibreOffice, TexMaths extension, and Mendeley plugin) proved to be one of the buggiest software combos of all times.

ad (i): Not the fault of the software, but still kind of sobering: our external collaborator declared that he had never heard about LibreOffice, and that he wouldn't know how to install it. Well, we thought, now only two people have to stay compatible to each other. We installed the same version of LibreOffice (first Still, than Fresh), I on Linux, he on Windows. But the different operating systems probably had little to do with what followed.

ad (ii): I was responsible for all display items in the proposal, and I've used a combination of Mathematica, Python, Gimp, and Inkscape to create the seven figures contained in it. The final SVG, however, was always generated by Inkscape. I've experienced two serious problems with these figures. First, certain line art elements such as arrows were simply not shown in LibreOffice or in PDFs created by it. Second, the figures tended to “disappear”: when trying to move one of them, another would suddenly be invisible. The caption numbering showed that they were still part of the document, and simply inserting them again messed up the numbering. We've managed to find one of these hidden figures in the nowhere between two pages (like being trapped between dimensions 😱), but others stayed mysteriously hidden. We had to go back to the previous version to resolve these issues, and in the end I converted all figures to bitmaps. D'Oh!

ad (iii): I wrote a large part of my text in one session and inserted all symbols and equations using TeXMaths. Worked perfectly, and after saving the document, I went home, quite satisfied with my achievements this day. When I tried to continue the next day, LibreOffice told me the document is corrupted, and was subsequently unable to open it. I finally managed to open it with TextMaker, which didn't complain, but also didn't show any of the equations I had inserted the day before. Well, I saved the document anyway to at least restore the text. Opening the file saved by TextMaker with Writer worked, and even all symbols and equations showed up as SVG graphics, but without the possibility to edit them by TeXMaths.

ad (iv): Since my colleague had previously used the Mendeley plugin for Word, it was him who had the task to insert our various references (initially about 40). That seemed to work very well, although he found the plugin irritatingly slow (40 references take something like a minute to process). However, when he tried to enter additional references a few days later, Mendeley claimed that the previous one were edited manually, displayed a dialogue asking whether we would like to keep this manual edit or disregard it. Regardless the choice, the previous citations were now generated twice. And with any further citation, twice more, so that after adding three more citations, [1] became [1][1][1][1][1][1][1][1]. The plugin also took proportionally longer for processing the file, so in the last example, it took about 10 min. Well, we went one version back. But what worked so nicely the day before was now inexplicably broken. It turned out that a simple sync of Mendeley (which is carried out automatically when you start this software) can be sufficient for triggering this behavior. We finally inserted the last references manually, overriding and actually irreversibly damaging the links between the citations and the bibliography.

In the final stages, working on the proposal felt like skating on atomically thin ice (Icen 😎). We always expected the worst, and instead of concentrating on the content, we treated the document like a piece of prehistoric art which could be damaged by anything, including just viewing the document on the screen. That feeling was very distracting. I would have loved to correct my position, really, but LibreOffice in its present state is clearly no alternative to LaTeX for preparing the documents and presentations required in my professional environment. I will check again in another ten years. 😉

In principle, I would have no problem with being solely responsible for the document if I could use LaTeX and would get the contribution from the collaborators simply as plain text. It is them having a problem with that, since they don't know what plain text is. In this context, I increasingly understand the trend to collaborative software: it's not that people really work at the same time, simultaneously, on a document, but it's the fact that people work on it with the guaranteed same software which counts.

Functions with default values

Suppose you would like to have a command generating a secure password for an online service at the command line. You would google for that and find 10 ways to generate a random password. At the end of his article, the author presents the ideal way to generate a secure password:

date | md5sum

The author of the article (Lowell Heddings, the founder and CEO of How-To Geek) states:

I’m sure that some people will complain that it’s not as random as some of the other options, but honestly, it’s random enough if you’re going to be using the whole thing.

Random enough? Sure, the 'whole thing' looks random enough:


but the look is deceptive: this is in fact an extremely weak password. To understand why, let's look at the output of the 'date' command:

↪ date
Sun 12 May 2019 04:33:26 PM CEST

We see that without additional parameter (like +"%N"), 'date' gives us one password for each second of the year. How many passwords do we get in this way? Well,

 ↪ date +"%s"

i.e., 1,557,666,649 seconds has passed since 00:00:00, Jan 1, 1970 (Unix epoch time), and that's how many passwords we get.

Now, the possibility to order Pizza online came much later, namely, at August 22nd, 1994.

↪ date -d 19940822 +"%s"

That leaves us with 780,160,249 passwords since this memorable day in 1994 or a complexity of 30 bits, corresponding to a 5-digit password with a character space of 62. Let's get one of these and see how difficult it is to crack:

↪ pwgen -s 5 -1

Now, even my ancient GTX650Ti with its modest MD5 hashing performance of 1.5 GH/s cracks this password in 5 s (note that an RTX2080 delivers 36 GH/s...):

○ → hashcat -O -a 3 -m 0 myhashes.hash ?a?a?a?a?a
hashcat (v5.1.0) starting...

OpenCL Platform #1: NVIDIA Corporation
- Device #1: GeForce GTX 650 Ti, 243/972 MB allocatable, 4MCU


Session..........: hashcat
Status...........: Cracked
Hash.Type........: MD5
Hash.Target......: 0b91091d40a8623891367459d5b2a406
Time.Started.....: Mon May 13 12:48:58 2019 (5 secs)
Time.Estimated...: Mon May 13 12:49:03 2019 (0 secs)
Guess.Mask.......: ?a?a?a?a?a [5]
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........: 514.0 MH/s (6.21ms) @ Accel:64 Loops:47 Thr:1024 Vec:2
Recovered........: 1/1 (100.00%) Digests, 1/1 (100.00%) Salts
Progress.........: 2328363008/7737809375 (30.09%)
Rejected.........: 0/2328363008 (0.00%)
Restore.Point....: 24379392/81450625 (29.93%)
Restore.Sub.#1...: Salt:0 Amplifier:0-47 Iteration:0-47
Candidates.#1....: s3v\, -> RPuJG
Hardware.Mon.#1..: Temp: 44c Fan: 33%

But actually, it's even worse: instead of cracking the hash one can easily precompute all possible values of the 'date | md5sum' command, and thus create a dictionary containing these “passwords”. I could start right away:

for (( time=777506400; time<=1557666649; time++ )); do date -d@$time | md5sum | tr -d "-"; done > lowell_heddings_passwords.txt

On my desktop with its Xeon E3 v2, this command computes one million passwords in about half an hour, i.e, I'd need about 17 days for computing all passwords back to 1994. Writing a corresponding program running on the GPU would cut this down to seconds. Note that the resulting list of “random enough” passwords is static, i.e., it is indeed a dictionary, and not even a particularly large one.

Lowell Heddings himself mentions several alternative ways to generate a password in his article before turning to the worst possible solution. But if we desire cryptographically secure solutions, even apparently innocuous commands are beset with difficulties, as pointed out by, for example, the carpetsmoker (better carpets than mattresses). In the end, it all boils down to the following three choices that are available on virtually any Linux installation. If we limit ourselves to a character space of 62:

cat /dev/urandom | base64 | tr -d /=+ | head -c 25; echo
openssl rand -base64 25 | tr -d /=+ | head -c 25; echo
gpg2 --armor --gen-random 1 25 | tr -d /=+ | head -c 25; echo

If we insist of having almost all printable characters (which often calls for trouble):

cat /dev/urandom | base91 | head -c 25
openssl rand 25 | base91 | head -c 25; echo
gpg2 --gen-random 1 25 | base91 | head -c 25; echo

One could, in principle, also utilize dedicated password generators, such as Theodore Tso's 'pwgen', Adel I. Mirzazhanov's 'apg', or haui's 'hpg':

pwgen -s 25
apg -a 1 -M ncl -m 25 -x 25
hpg --alphanum 25

All of these ways are cryptographically equivalent in the sense that the entropy of the passwords generated by either of them asymptotically approaches the theoretical value ($\log_2(62) \approx 5.954$ bits per character) when you average over many (10,000,000 or more). In the present context (functions with default values) the generators do not offer any advantage, but only add unnecessary complexity.

Now, whatever you chose as your favorite, you don't want to memorize the command or rely on the history of your favorite shell. One could define an alias with the password length as parameter, but I prefer to use a function for this case to have a default length of 25 characters with the option to change this value:

↪ pw62

↪ pw62 8

↪ pw62 32

Here's how to implement this functionality for the three major shells. Note the very elegant way with which a default value can be implemented within the bash. Update: haui reminded me that the zsh is a drop-in replacement for the bash and thus of course implements all bash variable substitution, particularly ${var:-default}. Hence, we can use the same syntax for the bash and the zsh, and only the fish needs the comparatively clumsy construct shown below. 😎


function pw62
cat /dev/urandom | base64 | tr -d /=+ | head -c ${1:-25}; echo


function pw62
if set -q $argv
set length 25
set length $argv

cat /dev/urandom | base64 | tr -d /=+ | head -c $length; echo


zsh (alternative to the bash syntax)

function pw62()
if [ "$1" != "" ]
integer length=$1
integer length=25

cat /dev/urandom | base64 | tr -d /=+ | head -c $length; echo

Oid's Graffel

I generally like Debian, as documented by the fact that it's my Linux distribution of choice for pdes-net.org, for the two compute server at the office, for the Mini (which is currently out of order due to a defunct SSD), and for the virtual machine that I've reserved for online banking (biig mistake...see below). Since the stable version of Debian delivers only outdated software, I'm using 'testing' as the base, and if needed, I also install packages from 'sid'.

On my main systems, however, I don't use Debian, but Archlinux. I have several good reasons for this decision. One of them is that packages that belong in a museum are not reserved to Debian Stable, but are also regularly found in Testing or Sid.

One example is 'look', which I've recently reported to be a fast way for finding an entry in a huge file. The version of look in Debian, however, contains a bug that has been fixed ten years ago. Except, of course, in Debian (and all derivatives).

But what are 10 years if you can have 20? In 2010, c't presented a Perl script for downloading and processing the transactions from an account at Deutsche Bank. The script served me well for several years, but it broke a number of times due to changes of the web interface and Perl itself. I was able to fix the script the first four times, but the last time, about five years ago, I had to ask haui for help. And a few weeks ago, it simply broke completely, and I decided to let it go and extend my old bash script to process the csv files downloaded from Deutsche Bank.

Part of one of the new scripts is the following oneliner:

tail -n +4 $current_rates | iconv -f ISO8859-1 -t utf8 | awk '{split($0,a,";"); print a[14]}' | sed 's/,/./g' | bc -l | xargs printf %.2f"\n" | tr '\n' ' ' | awk '{print strftime("%Y-%m-%d")"\t"$7"\t"$6"\t"$1"\t"$5"\t"$4" \t"$2" \t"$3}'> $cleaned_rates

Worked perfectly on my notebook running Archlinux, but in the virtual machine reserved for online banking, I got the following error message:

mawk: line 2: function strftime never defined


$ awk -W version
mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan

Are you kidding me? That's rather extreme even for Debian standards. Particularly when considering that version 1.3.4 was published in 2009, and strftime was added to it in 2012. But surely, sid has a more recent version...NOT 😒

Even CentOS 6 came with mawk 1.3.4. Shame on you, Debilian!

Well, the only choice was to install gawk, and in this particular case, the performance hit doesn't matter at all. But why isn't that the default, if the Debilians have chosen to neglect mawk? And why do they do that anyway?

Well, whatever. The scripts are working now. 😉