Free Software, Free Society!
Thoughts of the FSFE Community (English)

Wednesday, 03 July 2019

Down the troubleshooting rabbit-hole

Hardware Details

HP ProLiant MicroServer
AMD Turion(tm) II Neo N54L Dual-Core Processor
Memory Size: 2 GB - DIMM Speed: 1333 MT/s
Maximum Capacity: 8 GB

Running 24×7 from 23/08/2010, so nine years!




The above server started it’s life on CentOS 5 and ext3. Re-formatting to run CentOS 6.x with ext4 on 4 x 1TB OEM Hard Disks with mdadm raid-5. That provided 3 TB storage with Fault tolerance 1-drive failure. And believe me, I used that setup to zeroing broken disks or replacing faulty disks.


As we are reaching the end of CentOS 6.x and there is no official dist-upgrade path for CentOS, and still waiting for CentOS 8.x, I made decision to switch to Ubuntu 18.04 LTS. At that point this would be the 3rd official OS re-installation of this server. I chose ubuntu so that I can dist-upgrade from LTS to LTS.


This is a backup server, no need for huge RAM, but for a reliable system. On that storage I have 2m files that in retrospect are not very big. So with the re-installation I chose to use xfs instead of ext4 filesystem.


I am also running an internal snapshot mechanism to have delta for every day and that pushed the storage usage to 87% of the 3Tb. If you do the math, 2m is about 1.2Tb usage, we need a full initial backup, so 2.4Tb (80%) and then the daily (rotate) incremental backups are ~210Mb per day. That gave me space for five (5) daily snapshots aka a work-week.

To remove this impediment, I also replaced the disks with WD Red Pro 6TB 7200rpm disks, and use raid-1 instead of raid-5. Usage is now ~45%



Frozen System

From time to time, this very new, very clean, very reliable system froze to death!

When attached monitor & keyboard no output. Strange enough I can ping the network interfaces but I can not ssh to the server or even telnet (nc) to ssh port. Awkward! Okay - hardware cold reboot then.

As this system is remote … in random times, I need to ask from someone to cold-reboot this machine. Awkward again.

Kernel Panic

If that was not enough, this machine also has random kernel panics.




Let’s start troubleshooting this system

# journalctl -p 3 -x


Important Errors

ERST: Failed to get Error Log Address Range.
APEI: Can not request [mem 0x7dfab650-0x7dfab6a3] for APEI BERT registers
ipmi_si dmi-ipmi-si.0: Could not set up I/O space

and more important Errors:

INFO: task kswapd0:40 blocked for more than 120 seconds.
      Not tainted 4.15.0-54-generic #58-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task xfsaild/dm-0:761 blocked for more than 120 seconds.
      Not tainted 4.15.0-54-generic #58-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task kworker/u9:2:3612 blocked for more than 120 seconds.
      Not tainted 4.15.0-54-generic #58-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task kworker/1:0:5327 blocked for more than 120 seconds.
      Not tainted 4.15.0-54-generic #58-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task rm:5901 blocked for more than 120 seconds.
      Not tainted 4.15.0-54-generic #58-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task kworker/u9:1:5902 blocked for more than 120 seconds.
      Not tainted 4.15.0-54-generic #58-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task kworker/0:0:5906 blocked for more than 120 seconds.
      Not tainted 4.15.0-54-generic #58-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task kswapd0:40 blocked for more than 120 seconds.
      Not tainted 4.15.0-54-generic #58-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task xfsaild/dm-0:761 blocked for more than 120 seconds.
      Not tainted 4.15.0-54-generic #58-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task kworker/u9:2:3612 blocked for more than 120 seconds.
      Not tainted 4.15.0-54-generic #58-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.


First impressions ?




After a few (hours) of internet research the suggestion is to disable

  • ACPI stands for Advanced Configuration and Power Interface.
  • APIC stands for Advanced Programmable Interrupt Controller.

This site is very helpful for ubuntu, although Red Hat still has a huge advanced on describing kernel options better than canonical.


# vim /etc/default/grub
GRUB_CMDLINE_LINUX="noapic acpi=off"


# update-grub
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/50-curtin-settings.cfg'
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.15.0-54-generic
Found initrd image: /boot/initrd.img-4.15.0-54-generic
Found linux image: /boot/vmlinuz-4.15.0-52-generic
Found initrd image: /boot/initrd.img-4.15.0-52-generic


# grep noapic /boot/grub/grub.cfg | head -1

        linux   /boot/vmlinuz-4.15.0-54-generic root=UUID=0c686739-e859-4da5-87a2-dfd5fcccde3d ro noapic acpi=off maybe-ubiquity

reboot and check again:

#  journalctl -p 3 -xb
-- Logs begin at Thu 2019-03-14 19:26:12 EET, end at Wed 2019-07-03 21:31:08 EEST. --
Jul 03 21:30:49 servertwo kernel: ipmi_si dmi-ipmi-si.0: Could not set up I/O space

okay !!!



Unfortunately I could not find anything useful regarding

# dmesg | grep -i ipm
[   10.977914] ipmi message handler version 39.2
[   11.188484] ipmi device interface
[   11.203630] IPMI System Interface driver.
[   11.203662] ipmi_si dmi-ipmi-si.0: ipmi_platform: probing via SMBIOS
[   11.203665] ipmi_si: SMBIOS: mem 0x0 regsize 1 spacing 1 irq 0
[   11.203667] ipmi_si: Adding SMBIOS-specified kcs state machine
[   11.203729] ipmi_si: Trying SMBIOS-specified kcs state machine at mem address 0x0, slave address 0x20, irq 0
[   11.203732] ipmi_si dmi-ipmi-si.0: Could not set up I/O space

# ipmitool list
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory

# lsmod | grep -i ipmi
ipmi_si                61440  0
ipmi_devintf           20480  0
ipmi_msghandler        53248  2 ipmi_devintf,ipmi_si


blocked for more than 120 seconds.

But let’s try to fix the timeout warnings:

INFO: task kswapd0:40 blocked for more than 120 seconds.
      Not tainted 4.15.0-54-generic #58-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message

if you search online the above message, most of the sites will suggest to tweak dirty pages for your system.

This is the most common response across different sites:

This is a know bug. By default Linux uses up to 40% of the available memory for file system caching. After this mark has been reached the file system flushes all outstanding data to disk causing all following IOs going synchronous. For flushing out this data to disk this there is a time limit of 120 seconds by default. In the case here the IO subsystem is not fast enough to flush the data withing 120 seconds. This especially happens on systems with a lot of memory.

Okay this may be the problem but we do not have a lot of memory, only 2GB RAM and 2GB Swap. But even then, our vm.dirty_ratio = 20 setting is 20% instead of 40%.


But I have the ability to cross-check ubuntu 18.04 with CentOS 6.10 to compare notes:


ubuntu 18.04

# uname -r

# sysctl -a | egrep -i  'swap|dirty|raid'|sort = 200000 = 1000
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirtytime_expire_seconds = 43200
vm.dirty_writeback_centisecs = 500
vm.swappiness = 60


CentOS 6.11

#  uname -r

# sysctl -a | egrep -i  'swap|dirty|raid'|sort = 200000 = 1000
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.swappiness = 60


Scheduler for Raid

This is the best online documentation on the
optimize raid

Comparing notes we see that both systems have the same settings, even when the kernel version is a lot different, 2.6.32 Vs 4.15.0 !!!

Researching on raid optimization there is a note of kernel scheduler.


Ubuntu 18.04

# for drive in {a..c}; do cat /sys/block/sd${drive}/queue/scheduler; done

noop deadline [cfq]
noop deadline [cfq]
noop deadline [cfq] 


CentOS 6.11

# for drive in {a..d}; do cat /sys/block/sd${drive}/queue/scheduler; done

noop anticipatory deadline [cfq]
noop anticipatory deadline [cfq]
noop anticipatory deadline [cfq]
noop anticipatory deadline [cfq] 


Anticipatory scheduling

CentOS supports Anticipatory scheduling on the hard disks but nowadays anticipatory scheduler is not supported in modern kernel versions.

That said, from the above output we can verify that both systems are running the default scheduler cfq.


Ubuntu 18.04

  • Western Digital Red Pro WDC WD6003FFBX-6
# for i in sd{b..c} ; do hdparm -Tt  /dev/$i; done

 Timing cached reads:   2344 MB in  2.00 seconds = 1171.76 MB/sec
 Timing buffered disk reads: 738 MB in  3.00 seconds = 245.81 MB/sec

 Timing cached reads:   2264 MB in  2.00 seconds = 1131.40 MB/sec
 Timing buffered disk reads: 774 MB in  3.00 seconds = 257.70 MB/sec

CentOS 6.11

  • Seagate ST1000DX001
 Timing cached reads:   2490 MB in  2.00 seconds = 1244.86 MB/sec
 Timing buffered disk reads: 536 MB in  3.01 seconds = 178.31 MB/sec

 Timing cached reads:   2524 MB in  2.00 seconds = 1262.21 MB/sec
 Timing buffered disk reads: 538 MB in  3.00 seconds = 179.15 MB/sec

 Timing cached reads:   2452 MB in  2.00 seconds = 1226.15 MB/sec
 Timing buffered disk reads: 546 MB in  3.01 seconds = 181.64 MB/sec


So what I am missing ?

My initial personal feeling was the low memory. But after running a manual rsync I’ve realized that:


was load average: 0.87, 0.46, 0.19


was (on high load), when hit 40% of RAM, started to use swap.

KiB Mem :  2008464 total,    77528 free,   635900 used,  1295036 buff/cache
KiB Swap:  2097148 total,  2096624 free,      524 used.  1184220 avail Mem 

So I tweaked a bit the swapiness and reduce it from 60% to 40%

and run a local snapshot (that is a bit heavy on the disks) and doing an upgrade and trying to increase CPU load. Still everything is fine !

I will keep an eye on this story.



Monday, 01 July 2019

Usability & Productivity Sprint 2019

I [partially, only 2 days out of the 7] attended the Usability & Productivity Sprint 2019 in Valencia two weekends ago.

I was very happy to meet quite some new developer blood, which is something we had been struggling a bit to get lately, so we're starting to get on the right track again :) And I can only imagine it'll get better and better due to the "Onboarding" goal :)

During the sprint we had an interesting discussion about how to get more people to know about usability, and the outcome is that probably we'll try to get some training to members of KDE to increase the knowledge of usability amongst us. Sounds like a good idea to me :)

On the more "what did *you* actually do" side:
* worked on fixing a crash i had on the touchpad kded, (already available on the latest released Plasma!)
* finished part of the implementation for Optional Content Group Links support in Okular (i started that 3 years ago and i was almost sure i had done all the work, but i clearly had not)
* Did some code reviews on existing okular phabricator merge requests (so sad i'm still behind though, we need more people reviewing there other than me)
* Together with Nicolas Fella worked on allowing extra fields from json files to be translated, we even documented it!
* Changed lots of applications released on KDE Applications to follow the KDE Applications versioning scheme, the "winner" was kmag, that had been stuck in version 1.0 for 15 years (and had more than 440 commits since then)
* Fixed a small issue with i18n in kanagram

I would like to thank SLIMBOOK for hosting us in their offices (and providing a shuttle from them to the hotel) and the KDE e.V. for sponsoring my attendance to the sprint, please donate to KDE if you think the work done at sprints is important.

Sunday, 30 June 2019

Dealing with Colors in lower Android versions

I’m currently working on a project which requires me to dynamically set the color of certain UI elements to random RGB values.

Unfortunately and surprisingly, handy methods for dealing with colors in Android are only available since API level 26 (Android O). Luckily though, the developer reference specifies, how colors are encoded internally in the Android operating system, so I was able to create a class with the most important color-related (for my use-case) methods which is able to function in lower Android versions as well.

Link to the gist

I hope I can save some of you from headaches by sharing the code. Feel free to reuse as you please đŸ™‚

Happy Hacking!

PS: Is there a way to create Github-like gists with Gitea?

Thursday, 27 June 2019

KDE Applications 19.08 Schedule finalized

It is available at the usual place

Dependency freeze is two weeks (July 11) and Feature Freeze a week after that, make sure you start finishing your stuff!

P.S: Remember last day to apply for Akademy Travel Support is this Sunday 30 of June!

Sunday, 23 June 2019

A good firewall for a small network

In this article I will outline the setup of my (not so) new firewall at home. I explain how I decided which hardware to get and which software to choose, and I cover the entire process of assembling the machine and installing the operating system. Hopefully this will be helpful to poeple in similar situations.


While the ability of firewalls to protect against all the evils of the internets is certainly exaggerated, there are some important use cases for them: you want to prevent certain inbound traffic and manipulate certain outbound traffic e.g. route it through a VPN.

For a long time I used my home server (whose main purpose is network attached storage) to also do some basic routing and VPN, but this had a couple of important drawbacks:

  • Just one NIC on the server meant traffic to/from the internet wasn’t physically required to go through the server.
  • Less reliable due to more complex setup → longer downtimes during upgrades, higher chance of failure due to hard drives.
  • I wouldn’t give someone else the root password to my data storage, but I did want my flat-mates to be able to reset and configure basic network components that they depend on (Router/Port-forwarding and WiFi).
  • I wanted to isolate the ISP-provided router more strongly from the LAN as they have a history of security vulnerabilities.

The different off-the-shelf routers I had used over the years had also worked only so-so (even those that were customisable) so I decided I needed a proper router. Since WiFi access was already out-sourced to dedicated devices I really only needed a filtering and routing device.


Board & CPU

The central requirements for the device were:

  • low energy consumption
  • enough CPU power to route traffic at Gbit-speed, run Tor and OpenVPN (we don’t have Gbit/s internet in Berlin, yet, but I still have hopes for the future)
  • hardware crypto support to unburden the CPU for crypto tasks
  • two NICs, one for the LAN and one for the WAN

I briefly thought about getting an ARM-based embedded board, but most reviews suggested that the performance wouldn’t be enough to satisfy my requirements and also the *BSD support was mixed at best and I didn’t want to rule out running OpenBSD or FreeBSD.

Back to x86-land: I had previously used PC Engines ALIX boards as routers and was really happy with them at the time. Their new APU boards promised better performance, but thanks to the valuable feedback and some benchmarking done by the community over at, I came to the conclusion that they wouldn’t be able to push more than 200Mbit/s through an OpenVPN tunnel.

In the end I decided on the Gigabyte J3455N-D3H displayed at the top. It sports a rather atypical Intel CPU (Celeron J3455) with

  • four physical cores @ 1.5Ghz
  • AESNI support
  • 10W TDP

Having four actual cores (instead of 2 cores + hyper threading) is pretty cool now that many security-minded operating systems have started deactivating hyper threading to mitigate CPU bugs [OpenBSD] [HardenedBSD]. And the power consumption is also quite low.

I would have liked for the two NICs on the mainboard to be from Intel, but I couldn’t find a mainboard at the time that offered this (other than super-expensive SuperMicro boards). At least the driver support on modern Realteks is quite good.

Storage & Memory

The board has two memory slots and supports a maximum of 4GiB each. I decided 4GiB are enough for now and gave it one module to allow for future extensions (I know that’s suboptimal for speed).

Storage-wise I originally planned on putting a left-over SATA-SSD into the case, but in the end, I decided a tiny USB3-Stick would provide sufficient performance and be much easier to replace/debug/…

Case & Power

Since I installed a real 19” wrack in my new flat, of course the case for the firewall would have to fit nicely into that. I had a surprisingly difficult time finding a good case, because I wanted one were the board’s ports would be front-facing. That seems to be quite a rare requirement, although I really don’t understand why. Obviously having the network ports, serial ports and USB-Ports to the front makes changing the setup and debugging so much easier ¯\_(ツ)_/¯

I also couldn’t find a good power supply for such a low-power device, but I still had a 60W PicoPSU supply lying around.

Even though it came with an overpowered PSU and a proprietary IO-Shield (more on that below), I decided on the SuperMicro SC505-203B. It really does look quite good, I have to say!


Mounting the mainboard in the case is pretty straight-forward. The biggest issue was the aforementioned proprietary I/O-Shield that came with the SuperMicro case (and was designed only for SuperMicro-boards). It was possible to remove it, however, the resulting open space did not conform to ATX spec so it wasn’t possible to just fit the Gigabyte board’s shield into it.

I quickly took the measurements and starting cutting away on the shield to make it fit. This worked ok-ish in the end, but is more dangerous than it looks (be smarter than me, wear gloves ☝ ). In retrospect I also recommend that you do not remove the bottom fold on the shield, only left, right and top; that will make it hold a lot better in the case opening.

The board can be fit into the case using standard screws in the designated places. As mentioned above, I removed the original (actively cooled) power supply unit and used the 60W PicoPSU that I had lying around from before. Since it doesn’t have the 4-pin CPU cable I had improvise. There are adaptors for this, but if you have a left-over power supply, you can also tape together something. I also put the transformer into the case (duck-tape, yeah!) so that one can plug in the power cord from the back of the case as usual.


OPNSense logo


There are many operating systems I could have chosen since I decided to use an x86 platform. My criteria were:

  • free software (obviously)
  • intuitive web user interface to do at least the basic things
  • possibility to login via SSH if things don’t go as planned
  • OpenVPN client

I feel better with operating systems based on FreeBSD or OpenBSD, mainly because I have more experience with them than with GNU/Linux distributions nowadays. In previous flats I had also used OpenWRT and dd-wrt based routers, but whenever I needed to tweak something beyond what the web interface offered, it got really painful. In general the whole IPtables based stack on Linux seems overly complicated, but maybe that’s just me.

In any case, there are no OpenBSD-based router operating systems with web interfaces (that I am aware of) so I had the choice between

  1. pfsense (FreeBSD-based)
  2. OPNSense, fork of pfsense, based on HardenedBSD / FreeBSD

There seem to be historic tensions between the people involved in both operating systems and I couldn’t find out if there were actual distinctions in the goals of the projects. In the end, I asked other people for recommendations and found the interface and feature list of OPNSense more convincing. Also, being based on HardenedBSD sounds good (although I am not sure if HardenedBSD-specifica will really ever play out on the router).

Initially I had some issues with the install and OPNSense people were super-friendly and responded immediately. Also the interface was a lot better than I expected so I am quite sure I made the right decision.


Setup is very easy:

  1. Go to, select amd64 and nano and download the image.
  2. Unzip the image (easy to forget this).
  3. Write the image to the USB-stick with dd (as always with dd: be careful about the target device!)
  4. Optionally plug a serial cable into the top serial port (the mainboard has two) and connect your Laptop/Desktop with baud rate 115200
  5. Plug the USB-stick into the firewall and boot it up.

There will be some beeping when you start the firewall. Some of this is due to the mainboard complaining that no keyboard is attached (can be ignored) and also OPNSense will play a melody when it is booted. If you are attached to the serial console you can select which interface will be WAN and which will be LAN (and their IP addresses). Otherwise you might need to plug around the LAN cables a bit to find out which is configured as which.

When I built this last year there were some more issues, but all of them have been resolved by the OPNSense people so it really is “plug’n’play”; I verified by doing a re-install!


Go to the configured IP-address ( by default) and login (root: opnsense by default). If the web-interface comes up everything has worked fine and you can disconnect serial console and do the rest via the web-interface.

After login, I would to the following:

  • change the password
  • activate SSH on the LAN interface
  • configure internet access and DHCP
  • setup any of the other services you want

For me setting up the internet meant doing a “double-NAT” with the ISP-provided router, because I need its modem and nowadays it seems impossible to get a stand-alone VDSL modem. If you do something similar just configure internet as being over DHCP.

If you want hardware accelerated SSL (also OpenVPN), go to System → Firmware → Setting and change the firmware flavour to OpenSSL (instead of LibreSSL). After that check for updates and upgrade. In the OpenVPN profile, under Hardware Crypto, you can now select Intel RDRAND engine - RAND.

Take your time to look through the interface! I found some pretty cool things like automatic backup of the configuration to a nextcloud server! The entire config of the firewall rests in one file so it’s really easy to setup a clean system from scratch.

All-in-all I am very happy with the system. Even though my setup is non-trivial, with only selected outgoing traffic going through the VPN (based on rules), I never had to get my hands dirty on the command line – everything can be done through the Web-UI.

Sunday, 16 June 2019

Information stalls at Linux Week and Veganmania in Vienna

Linuxwochen Vienna 2018
Linux Weeks in Vienna 2018

Veganmania Vienna 2018
Veganmania at MQ in Vienna 2018

Linuxwochen Vienna 2019
Linux Weeks in Vienna 2019

Information stall at Veganmania 2019
Veganmania at MQ in Vienna 2019

Information stall at Veganmania 2019
Veganmania at MQ in Vienna 2019

As has been tradition for many years now, this year too saw the Viennese FSFE volunteers’ group hold information stalls at the Linuxwochen event and Veganmania in Vienna. Even though the active team has shrunk due to former activists moving away, having children or simply having very demanding jobs, we have still managed to keep up these information stalls in 2019.

Linux Weeks Vienna 2019

The information stall at the Linux weeks event in May was somewhat limited due to the fact that we didn’t get our usual posters and the roll-up in time. Unfortunately we discovered too late that they had obviously been lent out for an other event and hadn’t been returned afterwards. So we could only use our information material. But since at this event the FSFE is very well known, it wasn’t hard at all to carry out our usual information stall. It’s less about outreach work and more of a who-is-who of the free software community in Vienna anyway. For three days we met old friends and networked. Of course some newbies found their way to the event also. And therefore we could spread our messages a little further too.

In addition, we once again provided well visited workshops for Inkscape and Gimp. The little talk on the free rally game Trigger Rally even motivated an attending dedicated Fedora maintainer to create an up-to-date .rpm package in order to enable distribution of the most recent release to rpm distros.

Veganmania MQ Vienna 2019

The Veganmania at the Museums Quartier in Vienna is growing bigger every year. In 2019 it took place from 7th to 10th of June. Despite us having a less frequented spot with our information stall at the event due to construction work, it again was a full-blown success. Over the four days in perfect weather, the stall was visited by loads of people. There were times when we were stretched to give some visitors the individual attention they might have wanted. But I think in general we were able to provide almost all people with valuable insights and new ideas for their everyday computing. Once again Veganmania proved to be a very good setting for our FSFE information stall. It is always very rewarding to experience people getting a glimpse for the first time of how they could emancipate themselves from proprietary domination. Our down-to-earth approach seems to be the right way to go.

We do not only explain ethical considerations but also appeal to the self-interest of people concerning independence, reliability and free speech. Edward Snowden’s and Wikileaks discoveries clearly show how vulnerable we make ourselves in blindly trusting governments and companies. We describe with practical examples how free software can help us in working together or recovering old files by building on open standards. Of course pointing to the environmental (and economic) advantages of using old hardware with less resource hungry free software is a winning argument also.


Alongside the introductory Austrian version of the leaflet about the freedoms free software enables, which was put together as a condensation of RMS’ book Free Software, Free Society, one of our all-time favourite leaflet features 10 popular GNU/Linux Distros with just a few words about their defining differences (advantages and disadvantages). I updated the leaflet just a day before the festival. I replaced Linux Mint, Open Suse and gNewSense with the recently even more popular Manjaro, MX Linux and PureOS. I also updated the information on the importance of open standards on the back. We have run out of our end-user business cards for our local association which makes knowledgable people available to others searching for help. Therefore, we decided to use the version we originally designed for inviting experts to the platform. It obviously was wrong to order the same amount of cards for both groups. Our selection of information material seems to work well as an invitation for people to give free software a try. Of course it feels probably also like a safeguard that people can contact me if they want to get my support – or that of someone else listed on


The first day was rather windy and we had to carefully manage our material if we didn’t want to have our leaflets flying all over the place. In the very early morning of the second day the wind was so strong that some tents where blown away and destroyed. There was even a storm warning which could have forced the organisers to cancel the event. Fortunately our material was well stored and the wind died down over the day. We also had to firmly hold-on to our sunshade because it was very hot, but beside that everything went fine.

It was just coincidence that Richard Matthew Stallman had a talk in Vienna on the evening of the first day of the Veganmania street festival. So at least one of us could take this rare opportunity to see RMS at a live talk while the other carried on with manning the information stall.

As we didn’t have our posters on Linuxwochen we investigated where they were and got them sent to us via snail mail just in time. We didn’t only get our posters but merchandise too. This was a premiere for our stall. It was clear from the beginning that we wouldn’t sell many shirts since most designs assumed prior knowledge of IT related concepts like binary counting. The general public doesn’t seem very aware of such details and people don’t even get the joke. (If we had had the same merchandise at the Linuxwochen we probably would have sold at least as many items despite having reached a much smaller crowd there.)

Outlook and thanks

There will be another information stall at the second Veganmania in Vienna this year, which takes place in end of August. The whole setting there is a little different as there isn’t a shopping street nearby but instead, the location is on the heavily frequented recreational area of Vienna’s Danube Island. Just like last year, it should be a good place for chatting about free software, as long as the weather is on our side.

I want to thank Martin for his incredible patience and ongoing dedication manning our stall. He is extremely reliable, always friendly and it is just a real pleasure working with him.

Thanks to, a local place to rent and buy carrier bicycles, we could transport all our information material in a very environmentally friedly way.

Monday, 10 June 2019

MariaDB Galera Cluster on Ubuntu 18.04.2 LTS

MariaDB Galera Cluster on Ubuntu 18.04.2 LTS

Last Edit: 2019 06 11
Thanks to Manolis Kartsonakis for the extra info.


Official Notes here:
MariaDB Galera Cluster

a Galera Cluster is a synchronous multi-master cluster setup. Each node can act as master. The XtraDB/InnoDB storage engine can sync its data using rsync. Each data transaction gets a Global unique Id and then using Write Set REPLication the nodes can sync data across each other. When a new node joins the cluster the State Snapshot Transfers (SSTs) synchronize full data but in Incremental State Transfers (ISTs) only the missing data are synced.

With this setup we can have:

  • Data Redundancy
  • Scalability
  • Availability




In Ubuntu 18.04.2 LTS three packages should exist in every node.
So run the below commands in all of the nodes - change your internal IPs accordingly

as root

# apt -y install mariadb-server
# apt -y install galera-3
# apt -y install rsync

host file

as root

# echo gal1 >> /etc/hosts
# echo gal2 >> /etc/hosts
# echo gal3 >> /etc/hosts


Storage Engine

Start the MariaDB/MySQL in one node and check the default storage engine. It should be

MariaDB [(none)]> show variables like 'default_storage_engine';


echo "SHOW Variables like 'default_storage_engine';" | mysql
| Variable_name          | Value  |
| default_storage_engine | InnoDB |



A Galera Cluster should be behind a Load Balancer (proxy) and you should never talk with a node directly.


Galera Configuration

Now copy the below configuration file in all 3 nodes:


# Galera Provider Configuration

# Galera Cluster Configuration

# Galera Synchronization Configuration

# Galera Node Configuration

Per Node

Be careful the last 2 lines should change to each node:

Node 01

# Galera Node Configuration

Node 02

# Galera Node Configuration

Node 03

# Galera Node Configuration


Galera New Cluster

We are ready to create our galera cluster:



mysqld --wsrep-new-cluster


Jun 10 15:01:20 gal1 systemd[1]: Starting MariaDB 10.1.40 database server...
Jun 10 15:01:24 gal1 sh[2724]: WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
Jun 10 15:01:24 gal1 mysqld[2865]: 2019-06-10 15:01:24 139897056971904 [Note] /usr/sbin/mysqld (mysqld 10.1.40-MariaDB-0ubuntu0.18.04.1) starting as process 2865 ...
Jun 10 15:01:24 gal1 /etc/mysql/debian-start[2906]: Upgrading MySQL tables if necessary.
Jun 10 15:01:24 gal1 systemd[1]: Started MariaDB 10.1.40 database server.
Jun 10 15:01:24 gal1 /etc/mysql/debian-start[2909]: /usr/bin/mysql_upgrade: the '--basedir' option is always ignored
Jun 10 15:01:24 gal1 /etc/mysql/debian-start[2909]: Looking for 'mysql' as: /usr/bin/mysql
Jun 10 15:01:24 gal1 /etc/mysql/debian-start[2909]: Looking for 'mysqlcheck' as: /usr/bin/mysqlcheck
Jun 10 15:01:24 gal1 /etc/mysql/debian-start[2909]: This installation of MySQL is already upgraded to 10.1.40-MariaDB, use --force if you still need to run mysql_upgrade
Jun 10 15:01:24 gal1 /etc/mysql/debian-start[2918]: Checking for insecure root accounts.
Jun 10 15:01:24 gal1 /etc/mysql/debian-start[2922]: WARNING: mysql.user contains 4 root accounts without password or plugin!
Jun 10 15:01:24 gal1 /etc/mysql/debian-start[2923]: Triggering myisam-recover for all MyISAM tables and aria-recover for all Aria tables
# ss -at '( sport = :mysql )'

State                Recv-Q                Send-Q                                Local Address:Port                                  Peer Address:Port
LISTEN               0                     80                                                          *         
# echo "SHOW STATUS LIKE 'wsrep_%';" | mysql  | egrep -i 'cluster|uuid|ready' | column -t
wsrep_cluster_conf_id     1
wsrep_cluster_size        1
wsrep_cluster_state_uuid  8abc6a1b-8adc-11e9-a42b-c6022ea4412c
wsrep_cluster_status      Primary
wsrep_gcomm_uuid          d67e5b7c-8b90-11e9-ba3d-23ea221848fd
wsrep_local_state_uuid    8abc6a1b-8adc-11e9-a42b-c6022ea4412c
wsrep_ready               ON


Second Node

systemctl restart mariadb.service
root@gal2:~# echo "SHOW STATUS LIKE 'wsrep_%';" | mysql  | egrep -i 'cluster|uuid|ready' | column -t

wsrep_cluster_conf_id     2
wsrep_cluster_size        2
wsrep_cluster_state_uuid  8abc6a1b-8adc-11e9-a42b-c6022ea4412c
wsrep_cluster_status      Primary
wsrep_gcomm_uuid          a5eaae3e-8b91-11e9-9662-0bbe68c7d690
wsrep_local_state_uuid    8abc6a1b-8adc-11e9-a42b-c6022ea4412c
wsrep_ready               ON


Third Node

systemctl restart mariadb.service
root@gal3:~# echo "SHOW STATUS LIKE 'wsrep_%';" | mysql  | egrep -i 'cluster|uuid|ready' | column -t

wsrep_cluster_conf_id     3
wsrep_cluster_size        3
wsrep_cluster_state_uuid  8abc6a1b-8adc-11e9-a42b-c6022ea4412c
wsrep_cluster_status      Primary
wsrep_gcomm_uuid          013e1847-8b92-11e9-9055-7ac5e2e6b947
wsrep_local_state_uuid    8abc6a1b-8adc-11e9-a42b-c6022ea4412c
wsrep_ready               ON


Primary Component (PC)

The last node in the cluster -in theory- has all the transactions. That means it should be the first to start next time from a power-off.


cat /var/lib/mysql/grastate.dat


# GALERA saved state
version: 2.1
uuid:    8abc6a1b-8adc-11e9-a42b-c6022ea4412c
seqno:   -1
safe_to_bootstrap: 0

if safe_to_bootstrap: 1 then you can bootstrap this node as Primary.


Common Mistakes

Sometimes DBAs want to setup a new cluster (lets say upgrade into a new scheme - non compatible with the previous) so they want a clean state/directory. The most common way is to move the current mysql directory

mv /var/lib/mysql /var/lib/mysql_BAK

If you try to start your galera node, it will fail:

# systemctl restart mariadb
WSREP: Failed to start mysqld for wsrep recovery:
[Warning] Can't create test file /var/lib/mysql/gal1.lower-test
Failed to start MariaDB 10.1.40 database server

You need to create and initialize the mysql directory first:

mkdir -pv /var/lib/mysql
chown -R mysql:mysql /var/lib/mysql
chmod 0755 /var/lib/mysql
mysql_install_db -u mysql

On another node, cluster_size = 2

# echo "SHOW STATUS LIKE 'wsrep_%';" | mysql  | egrep -i 'cluster|uuid|ready' | column -t

wsrep_cluster_conf_id     4
wsrep_cluster_size        2
wsrep_cluster_state_uuid  8abc6a1b-8adc-11e9-a42b-c6022ea4412c
wsrep_cluster_status      Primary
wsrep_gcomm_uuid          a5eaae3e-8b91-11e9-9662-0bbe68c7d690
wsrep_local_state_uuid    8abc6a1b-8adc-11e9-a42b-c6022ea4412c
wsrep_ready               ON


# systemctl restart mariadb

rsync from the Primary:

Jun 10 15:19:00 gal1 rsyncd[3857]: rsyncd version 3.1.2 starting, listening on port 4444
Jun 10 15:19:01 gal1 rsyncd[3884]: connect from gal3 (
Jun 10 15:19:01 gal1 rsyncd[3884]: rsync to rsync_sst/ from gal3 (
Jun 10 15:19:01 gal1 rsyncd[3884]: receiving file list
#  echo "SHOW STATUS LIKE 'wsrep_%';" | mysql  | egrep -i 'cluster|uuid|ready' | column -t

wsrep_cluster_conf_id     5
wsrep_cluster_size        3
wsrep_cluster_state_uuid  8abc6a1b-8adc-11e9-a42b-c6022ea4412c
wsrep_cluster_status      Primary
wsrep_gcomm_uuid          12afa7bc-8b93-11e9-88fc-6f41be61a512
wsrep_local_state_uuid    8abc6a1b-8adc-11e9-a42b-c6022ea4412c
wsrep_ready               ON

Be Aware: Try to keep your DATA directory to a seperated storage disk


Adding new Nodes

A healthy Quorum has an odd number of nodes. So when you scale your galera gluster consider adding two (2) at every step!

# echo gal4 >> /etc/hosts
# echo gal5 >> /etc/hosts

Data Replication will lock your donor-node so it is best to put-off your donor-node from your Load Balancer:


Then explicit point your donor-node to your new nodes by adding the below line in your configuration file:

wsrep_sst_donor= gal3

After the synchronization:

  • comment-out the above line
  • restart mysql service and
  • put all three nodes behind the Local Balancer


Split Brain

Find the node with the max

SHOW STATUS LIKE 'wsrep_last_committed';

and set it as master by

SET GLOBAL wsrep_provider_options='pc.bootstrap=YES';


Weighted Quorum for Three Nodes

When configuring quorum weights for three nodes, use the following pattern:

node1: pc.weight = 4
node2: pc.weight = 3
node3: pc.weight = 2
node4: pc.weight = 1
node5: pc.weight = 0


SET GLOBAL wsrep_provider_options="pc.weight=3";

In the same VPC setting up pc.weight will avoid a split brain situation. In different regions, you can setup something like this:

node1: pc.weight = 2
node2: pc.weight = 2
node3: pc.weight = 2
node4: pc.weight = 1
node5: pc.weight = 1
node6: pc.weight = 1


Friday, 07 June 2019

I'm going to Akademy!

And you should too!

Akademy is free to attend however you need to register to reserve your space, so head to and press the buttons.

Akademy is very important to meet people, discuss future plans, learn about new stuff and make friends for life!

Note this year the recommended accomodations are a bit on the expensive side, so you may want to hurry and apply for Travel support. The last round is open until July 1st.

I'm going to Akademy 2019

Thursday, 06 June 2019

Akademy-es 2019 talks announced!

Akademy-es 2019 will be happening this June 28-30 in Vigo.

The talks were just announced recently.

Check them out at it has lots of interesting talks so if you understand Spanish and are interested in KDE or Free Software in general I'd really recommend to attend!

Friday, 17 May 2019

[Some] KDE Applications 19.04.1 also available in flathub

Thanks to Nick Richards we've been able to convince flathub to momentarily accept our old appdata files as still valid, it's a stopgap workaround, but at least gives us some breathing time. So the updates are coming in as we speak.

Partition like a pro with fdisk, sfdisk and cfdisk

  • Seravo
  • 10:44, Friday, 17 May 2019

Most Linux distributions ship the hard drive partition tool fdisk by default. Knowing how to use it is a good skill for every Linux system administrator since having to rescue a system that has disk issues is a very common task. If the admin is faced with a prompt in a rescue mode boot, often fdisk is the only partitioning tool available and must be used, since if the main root filesystem is broken, one cannot install and use any other partitioning tools.

When installing Debian based systems (e.g. Ubuntu) in the text mode server installer, keep in mind that you can at any time during the installation process press Ctrl+Alt+F2 to jump to a text console running a limited shell prompt (Busybox) and manipulate the systems as you wish, among others run fdisk. When done press Ctrl+Alt+F1 to jump back to the installer screen.

In fact, fdisk is not a single utility but actually a tool that ships with three commands together: fdisk, sfdisk and cfdisk.


Most Linux sysadmins have at some point used fdisk, the classic partitioning tool. There is also a tool with the same name in Windows, but its not the same tool. Across the Unix ecosystem the fdisk tool is however nowadays the same one, even on MacOS X.

To list the current partition layout one can simply run fdisk -l /dev/sda. Below is an example of the output. One can also run fdisk -l /dev/sd* to print the partition info of all sd devices in one go. The fdisk man page lists all the command line parameters available.

Example output of fdisk -l

If one runs just fdisk it will launch in interactive mode. Pressing m will show the help. To create a new GPT (for modern disk) partition table (resetting any existing partition table) and add a new Linux partition that uses all available disk space one can simply enter the commands g, n and w in sequence and pressing enter to all questions to accept them at their default values.

In-command help in fdisk, printed when pressing m


The command cfdisk servers the same purpose as fdisk with the difference that it provides a slightly fancier user interface based on ncurses so there are menus one can browse with arrows and the tab button without having to remember the single letter commands fdisk uses.

Screenshot of cfdisk


The third tool in the suite is sfdisk. This tool is designed to be scripted, enabling administrators to script and automate partitioning operations.

The key to sfdisk operations is to first dump the current layout using the -d argument, for example sfdisk -d /dev/sda > partition-table. An example output would be:

$ cat partition-table
label: gpt
label-id: AF7B83C8-CE8D-463D-99BF-E654A68746DD
device: /dev/sda
unit: sectors
first-lba: 34
last-lba: 937703054

/dev/sda1 : start=        2048, size=      997376, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=F35A875F-1A53-493E-85D4-870A7A749872
/dev/sda2 : start=      999424, size=   936701952, type=A19D880F-05FC-4D3B-A006-743F0F84911E, uuid=725EAB2A-F2E2-475E-81DC-9A5E18E29678

This text file describes the partition type, the layout and also includes the device UUIDs. The file above can be considered a backup of the /dev/sda partition table. If something has gone wrong with the partition table and this file was saved at some earlier time, one can recover the partition table by running: sfdisk /dev/sda < partition-table.

Copying the partition table to multiple disks

One neat application of sfdisk is that it can be used to copy the partition layout to many devices. Say you have a big server computer with 16 hard disks. Once you have partitioned the first disk, you can dump the partition table of the first disk with sfdisk -d and then edit the dump file (remember, it is just a plain-text file) to remove references to the device name and UUID’s, which are unique to a specific device and not something you want to clone to other disks. If the initial dump was the example above, the version with unique identifiers removed would look like this:

label: gpt
unit: sectors
first-lba: 34
last-lba: 937703054

start=        2048, size=      997376, type=0FC63DAF-8483-4772-8E79-3D69D8477DE4
start=      999424, size=   936701952, type=A19D880F-05FC-4D3B-A006-743F0F84911E

This can applied to the another disk, for example /dev/sdb simply by running sfdisk /dev/sdb < partition-table. Now all the admin needs to do is run this same command a couple of times with only the one character changed on each invocation.

Listing device UUID’s with blkid

Keep in mind that the Linux kernel uses the device UUIDs for indentifying partitions and file systems. Be vary not to accidentally make two disks have the same UUID with sfdisk. Technically it is possible, and maybe useful in some situation where one wants to replace a hard drive and make the new hard drive 100% identical, but in a running system different disks should all have unique UUIDs.

To list all UUIDs use blkid. Below is an example of the output:

$ blkid
/dev/sda1: UUID="F379-8147" TYPE="vfat" PARTUUID="f35a875f-1a53-493e-85d4-870a7a749872"
/dev/sda2: UUID="5f12f800-1d8d-6192-0881-966a70daa16f" UUID_SUB="2d667c5b-b9f3-6510-cf76-9231122533ce" LABEL="fi-e3:0" TYPE="linux_raid_member" PARTUUID="725eab2a-f2e2-475e-81dc-9a5e18e29678"
/dev/sdb2: UUID="5f12f800-1d8d-6192-0881-966a70daa16f" UUID_SUB="0221fce5-2762-4b06-2d72-4f4f43310ba0" LABEL="fi-e3:0" TYPE="linux_raid_member" PARTUUID="cd2a477f-0b99-4dfc-baa6-f8ebb302cbbb"
/dev/md0: UUID="dcSgSA-m8WA-IcEG-l29Q-W6ti-6tRO-v7MGr1" TYPE="LVM2_member"
/dev/mapper/ssd-ssd--swap: UUID="2f2a93bd-f532-4a6d-bfa4-fcb96fb71449" TYPE="swap"
/dev/mapper/ssd-ssd--root: UUID="660ce473-5ad7-4be9-a834-4f3d3dfc33c3" TYPE="ext4"

Extra tip: listing all disk with lsblk

While fdisk -l is nice for listing partition tables, often admins also want to know the partition sizes in human readable formats and what the partitions are used for. For this purpose the command lsblk is handy. While the default output is often enough, supplying the extra arguments -o NAME,SIZE,FSTYPE,TYPE,MOUNTPOINTmakes it even better. See below an example of the output:

NAME                  SIZE FSTYPE            TYPE  MOUNTPOINT
sda                 447,1G                   disk  
├─sda1                487M vfat              part  /boot/efi
└─sda2              446,7G linux_raid_member part  
  └─md0             446,5G LVM2_member       raid1 
    ├─ssd-ssd--swap   8,8G swap              lvm   [SWAP]
    └─ssd-ssd--root 437,7G ext4              lvm   /
sdb                 447,1G                   disk  
├─sdb1                487M                   part  
└─sdb2              446,7G linux_raid_member part  
  └─md0             446,5G LVM2_member       raid1 
    ├─ssd-ssd--swap   8,8G swap              lvm   [SWAP]
    └─ssd-ssd--root 437,7G ext4              lvm   /
sdc                 447,1G                   disk  
├─sdc1                487M                   part  
└─sdc2              446,7G                   part  

A word of warning…

Remember that modifying the partition table is a destructive process. It is something the admin does while installing new systems or recovering broken ones. If done wrongly, all data might be lost!

Wednesday, 15 May 2019

No KDE Applications 19.04.1 available in flathub

The flatpak and flathub developers have changed what they consider a valid appdata file, so our appdata files that were valid last month are not valid anymore and thus we can't build the KDE Applications 19.04.1 packages.

Wednesday, 08 May 2019

Of elitists and laypeople

Spoilers Game of Thrones.

I have been watching Game of Thrones with great interest the past few weeks. It has very strongly highlighted a struggle that has been gripping my mind for a while now: That between elitists and laypeople. And I find myself in a strange in-between.

For those not in the know, the latest season of Game of Thrones is a bit controversial to say the least. If you skip past the internet vitriol, you’ll find a lot of people disliking the season for legitimate reasons: The battle tactics don’t make any sense, characters miraculously survive after the camera cuts away, time and distance stopped being an issue in a setting that used to take it slow, and there’s a weird, forced conflict that would go away entirely if these two characters that are already in love would simply marry. And the list goes on, I’m sure.

But on the other hand, there appears to be a large body of laypeople who watch and enjoy the series. Millions of people tune in every week to watch fictional people fight over a fictional throne, and they appear to be enjoying themselves. And me? Sure, I’m enjoying myself as well. I had muscle aches from the tension of watching The Long Night, and nothing gripped me more than the half-botched assassination attempt at the end of the episode.

So what gives? On the one hand enthusiasts are rightly criticising the writers for some very strange decisions, but on the other hand millions of people are enjoying the series all the same.

Of power users and newbies

I frankly don’t know the answer to the posed question, but I do know an analogy that prompted me to write this blog post. I am a humble contributor to the GNOME Project, chiefly as translator for Esperanto, but also miniscule bits and bobs here and there. GNOME faces a similar problem with detractors: They have their complaints about systemd, customisability, missing power user features, themes breaking, and so forth. And I’m sure they have some valid points, but GNOME remains the default desktop environment on many distributions, and many people use and love GNOME as I do.

These detractors often run some heavily customised Arch Linux system with some unintuitive-but-efficient window manager, and don’t have any editor other than Vim installed. Or in other words: They run a system that the vast majority of people could not and do not want to use.

And I understand these people, because in one aspect of my digital life, I have been one of them. For at least two years, I ran Spacemacs as my primary editor. For a while I even did my e-mail through that program, and I loved it. Kind of. Sure, everything was customisable, and the keyboard shortcuts were magically fast, but the mental overhead of using that program was slowly grinding me down. Some menial task that I do infrequently would turn out to involve a non-intuitive sequence of keys that you just simply need to know, and I would spend far too long on figuring that out. Or I would accidentally open Vim inside of the Emacs terminal emulator, and :q would be sent to Emacs instead of the emulator. Sure, if you know enough Emacs wizardry, you could easily escape this situation, but that’s the point, isn’t it? The wizardry involved takes effort that I don’t always want to put in, even if I know that it pays off. Kind of.

These days I use VSCodium, a Free Software version of Visual Studio Code. I like it well enough for a multitude of reasons, but mainly because the mental overhead of using this editor is a lot lower. Even so, is Emacs a better editor? Probably. If I could be bothered to maintain my Emacs wizarding skills, I am fairly certain that it would be the perfect editor. But that’s a big if. So that’s why I settle for VSCodium. And the same line of reasoning can be extended to why I use and love GNOME.

Back to Westeros

Having made that analogy, can it be mapped onto the kerfuffle surrounding Game of Thrones? Is it a matter of a small group wanting an intricate, advanced plot and a larger group wanting a simple, rudimentary story, because they can’t or don’t want to deal with a complicated story?

It seems that way, but the damnedest thing is that I don’t know. I like the latest season of Game of Thrones for what it is: An archetypical fight of good versus evil. The living gathered together to fight an undead army, and the living won. Such a story is a lot easier to get into as a layperson, and there is nothing wrong with enjoying simple, archetypical stories.

But that’s not what Game of Thrones is. Game of Thrones is the derivative of an incredibly intricate series of books with so many details and plots, and the TV series stayed faithful to that for a long time. The latest season is a huge diversion from its roots. It is, as far as I can tell, replacing vi with nano. There is nothing wrong with either, but there is a good reason why the two are separate.

Who are these laypeople, anyway?

This question is difficult to answer, because the layperson isn’t me. It can’t possibly be, because here I am writing about the subject. The layperson must be someone who isn’t particularly interested or informed. I imagine that they just turn on the telly and enjoy it for what it is. No deep thoughts, no deep investment.

But why don’t these laypeople care? Why should we care about laypeople? Must we really dumb everything down for the lowest common denominator? Why can’t they just get on my level? This is really frustrating!

Enter cars.

I have a driving license, but I don’t really care about cars. I know how to work the pedals and the steering wheel, and that’s pretty much it. I don’t know why I don’t care about cars. I just want to get from my home to my destination and back. If I can put in as little effort as possible to do that, I’m happy. I just don’t have the time or desire to learn all the intricacies of cars.

And knowing that, I suppose that I’m the layperson I was so frustrated over a moment ago. When I walk into the garage with a minor problem, I like to imagine that I’m the sort of person who shows up at tech support because I can’t log in: I accidentally pressed Caps Lock.

So the layperson is me. Sometimes.

Then who are the elitists?

Having said all of that, something throws a wrench in the works: Game of Thrones was also immensely popular when it had all the intricacies and inter-weaving plots of the first few seasons. That appears to indicate to me that laypeople aren’t allergic to the kind of story that the enthusiasts want. But they aren’t allergic to the story that is being told in season 8, either, unlike the elitists.

So why do the elitists care? Why can’t they just appreciate the same things that laypeople do? Why must it always be so complex? Why should the complaints of a few outweigh the enjoyment of many?

And this is where I get stuck. Because frankly, I don’t know. Shouldn’t everything be as accessible as possible? The more the merrier? Why should vi exist when nano suffices?

But you can take my vi key bindings from my cold, dead hands. And I love what Game of Thrones used to be, and am sad that it morphed into an archetypical story that used to be its antithesis. I want complex things, even though I switched from Spacemacs to VSCodium and use GNOME instead of i3. Not for the sake of difficulty, but because complexity gives me something that simplicity cannot.

So the elitist is me. Sometimes.


I’m still in a limbo about this clash between elitists and laypeople. Maybe the clash is superficial and the two can exist side-by-side or separately. Maybe the writers of Game of Thrones just aren’t very good and accidentally made the story for laypeople instead of their target audience of elitists. Maybe it’s a sliding scale instead of a binary.

I don’t really know. I just wanted to get these thoughts out of my head and into a text box.

Sunday, 05 May 2019

External encrypted disk on LibreELEC

Last year I replaced, on the Raspberry Pi, the ArchLinux ARM with just Kodi installed with LibreELEC.

Today I plugged an external disk encrypted with dm-crypt, but to my full surprise this isn’t supported.

Luckily the project is open source and sky42 already provides a LibreELEC version with dm-crypt built-in support.

Once I flashed sky42’s version, I setup automated mount at startup via the script and the corresponding umount via this way:

// copy your keyfile into /storage via SSH
$ cat /storage/.config/
cryptsetup luksOpen /dev/sda1 disk1 --key-file /storage/keyfile
mount /dev/mapper/disk1 /media

$ cat /storage/.config/
umount /media
cryptsetup luksClose disk1

Reboot it and voilà!


If you want to automatically mount the disk whenever you plug it, then create the following udev rule:

// Find out ID_VENDOR_ID and ID_MODEL_ID for your drive by using `udevadm info`
$ cat /storage/.config/udev.rules.d/99-automount.rules
ACTION=="add", SUBSYSTEM=="usb", SUBSYSTEM=="block", ENV{ID_VENDOR_ID}=="0000", ENV{ID_MODEL_ID}=="9999", RUN+="cryptsetup luksOpen $env{DEVNAME} disk1 --key-file /storage/keyfile", RUN+="mount /dev/mapper/disk1 /media"

Saturday, 04 May 2019

Hardening OpenSSH Server

start by reading:

man 5 sshd_config


CentOS 6.x

Ciphers aes128-ctr,aes192-ctr,aes256-ctr
KexAlgorithms diffie-hellman-group-exchange-sha256
MACs hmac-sha2-256,hmac-sha2-512

and change the below setting in /etc/sysconfig/sshd:



CentOS 7.x

KexAlgorithms curve25519-sha256,,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256


Ubuntu 18.04.2 LTS

KexAlgorithms curve25519-sha256,,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256



KexAlgorithms curve25519-sha256,,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256


Renew SSH Host Keys

rm -f /etc/ssh/ssh_host_* && service sshd restart


Generating ssh moduli file

for advance users only

ssh-keygen -G /tmp/moduli -b 4096

ssh-keygen -T /etc/ssh/moduli -f /tmp/moduli 
Tag(s): openssh

Automated phone backup with Syncthing

How do you backup your phones? Do you?

I use to perform a copy of all the photos and videos from my and my wife’s phone to my PC monthly and then I copy them to an external HDD attached to a Raspberry Pi.

However, it’s a tedious job mainly because: - I cannot really use the phones during this process; - MTP works one in 3 times - often I have to fallback to ADB; - I have to unmount the SD cards to speed up the copy; - after I copy the files, I have to rsync everything to the external HDD.

The Syncthing way

Syncthing describes itself as:

Syncthing replaces proprietary sync and cloud services with something open, trustworthy and decentralized.

I installed it to our Android phones and on the Raspberry Pi. On the Raspberry Pi I also enabled remote access.

I started the Syncthing application on the Android phones and I’ve chosen the folders (you can also select the whole Internal memory) to backup. Then, I shared them with the Raspberry Pi only and I set the folder type to “Send Only” because I don’t want the Android phone to retrieve any file from the Raspberry Pi.

On the Raspberry Pi, I accepted the sharing request from the Android phones, but I also changed the folder type to “Receive Only” because I don’t want the Raspberry Pi to send any file to the Android phones.

All done? Not yet.

Syncthing main purpose is to sync, not to backup. This means that, by default, if I delete a photo from my phone, that photo is gone from the Raspberry Pi too and this isn’t what I do need nor what I do want.

However, Syncthing supports File Versioning and best yet it does support a “trash can”-like file versioning which moves your deleted files into a .stversions subfolder, but if this isn’t enough yet you can also write your own file versioning script.

All done? Yes! Whenever I do connect to my own WiFi my photos are backed up!

Friday, 03 May 2019

How I put order in my bookmarks and found a better way to organise them

I have gone through several stages of this and so far nothing has stuck as ideal, but I think I am inching towards it.

To start off, I have to confess that while I love the internet and the web, I loathe having everything in the browser. The browser becoming the OS is what seems to be happening, and I hate that thought. I like to keep things locally, having backups, and control over my documents and data. Although I changed my e-mail provider(s) several times, I still have all my e-mail locally stored from 2003 until today.

I also do not like reading longer texts on an LCD, so I usually put longer texts into either Wallabag or Mozilla’s Pocket to read them later on my eInk reader (Kobo Aura). BTW, Wallabag and Pocket both have their pros and cons themselves. Pocket is more popular and better integrated into a lot of things (e.g. Firefox, Kobo, etc.), while Wallabag is fully FOSS (even the server) and offers some extra features that are in Pocket either subject to subscription or completely missing.

Still, an enormous amount of information is (and should be!) on the web, so each of us needs to somehow keep track and make sense of it :)

So, with that intro out of the way, here is how I tackle(d) this mess.

Historic overview of methods I used so far

Hierarchy of folders

As many of us, I guess, I started with first putting bookmarks in the bookmark bar, but soon had to start organising them into folders … and subfolders … and subsubfolders …and subsubsubfolders … until the screen did not fit the whole tree of them any more when expanded.


  • can be neat and tidy
  • easy to sync between devices through e.g. Firefox Sync


  • can become a huge mess, once it grows to a behemoth
  • takes several clicks to put a bookmark into the appropriate (sub)folder

Then I decided to keep it flat and use the Firefox search bar to find what I am looking for.

To achieve that, when I bookmarked something, I renamed it to something useful and added tags (e.g.: shop, tea; or python, sql, howto).

This worked kinda OK, but a big downside is that there is a huge amount of clutter which is not easy to navigate and edit once you want to organise all the already existing bookmarks. The bookmark panel is somewhat helpful, but not a lot.


  • easy to search
  • easy to find a relevant bookmark when you are about to search for something through the combined URL/search bar
  • easy to sync between devices through e.g. Firefox Sync


  • your search query must match the name, tag(s), or URL of bookmark
  • hard to find or navigate other than searching (for name tag, URL)


Several years after that, I learnt about OneTab from an onboarding website of a company I applied to (but did not get the job). The main promise of it is to loads of open tabs into (simple) organised lists on a single page. And all that with a single click (well, two really).

This worked wonders for (still does) for decluttering my tab list. Especially when grouped with Tree Style Tabs, which I very warmly recommend trying out. Even if it looks odd and unrully at first, it is very easy to get used to and helps organise tabs immensely. But back to OneTab…

The good side of OneTab is that it really helps keep your tab bar clean and therefore reduces your computer’s resource usage. It is also super for keeping track of tabs that you may (or maybe not) need to open again later, as you can (re)open a whole group of “tabs” with a single click.

As a practical example, let us say I am travelling to Barcelona in two months. So I book flights and the hotel, and in the process also check out some touristy and other helpful info. Because I will not be needing the touristy and travel stuff for quite some time before the trip, I do not need all the tabs open. But as it is a one-off trip, it is also silly to bookmark it all. So I send them all to OneTab and name the group e.g. “Barcelona trip 2019”. If I stumble upon any new stuff that is relevant, I simply send it to the same Named Group in OneTab. Once I need that info, I either open individual “tabs” or restore the whole group with one click and have it ready. An additional cool thing is that by default if you open a group or a single link “tab” from OneTab, it will remove it from the list. You can decide to keep the links in the list as well.

In practices, I still used tagged bookmarks for links that I wanted to store long-term, while depending on OneTab for short- to mid-term storage.


  • great for decluttering your tabs
  • helps keep your browser’s resource usage low
  • great for creating (temporary) lists of tabs that you do not need now, but will in the future
  • can easily send a group of “tabs” with others via e-mail


  • no tags, categories or other means of adding meta data – you can only name groups, and cannot even rename links
  • no searching other than through the “webpage” list of “tabs”
  • as the list of “tabs”/bookmarks grows, the harder it is to keep an overview
  • cannot sync between devices
  • (proprietary plug-in)

Worldbrain’s Memex

About two months ago, I stumbled upon Worldbrain’s Memex through a FOSDEM talk. It promises to fix bookmarking, searching, note-taking and web history for you … which is quite an impressive lot.

So far, I have to say, I am quite impressed. It is super easy to find stuff you visited, even if you forgot to bookmark it, as it indexes all the websites you visit (unless you put tell Memex to ignore that page or domain).

For more order, you can assign tags to websites and/or store them into collections (i.e. groups or folders). What is more, you can do that even later, if you forgot about it the first time. If you want to especially emphasise a specific website, you can also star it.

An excellent feature missing in other bookmarking methods I have seen so far is that it lets you annotate websites – through highlights and comments and tags attached to those highlights. So, not only can you store comments and tags on the websites, but also on annotations within those websites.

One concern I have is that they might have taken more than what they can chew, but since I started using it, I have seen so much progress that I am (cautiously) optimistic about it.


  • supports both tags and collections (i.e. groups)
  • enables annotations/highlights and comments (as well as tags to both) to websites
  • indexes websites, so when you search for something it goes through both the website’s text, as well as your notes to that website and, of course, tags
  • starring websites you would like to find more easily
  • you can also set specific websites or domain names to be ignored
  • it offers quite an advanced search, including limiting by data ranges, stars, or domains
  • when you search for something (e.g. using DuckDuckGo or Google) it shows suggested websites that you already visited before
  • sharing of annotations and comments with others (as long as they also have Memex installed)
  • for annotations it uses the W3C Open Annotation spec
  • stores everything locally (with the exception of sharing annotations via a link, of course)


  • it consumes more disk space due to running its own index
  • needs an external app for backing up data
  • so far no syncing of bookmarks between devices (but it is in the making)
  • so far it does not sync annotations between different devices (but both mobile apps for iOS/Android, and Pocket integration are in the making)

Status quo and looking at the future

I currently have still a few dozen bookmarks that I need to tag in Memex and delete from my Firefox bookmarks. And a further several dozen in OneTab.

The most viewed websites, I have in the “Top Sites” in Firefox.

Most of the “tabs” in OneTab, I have already migrated to Memex and I am looking very much forward to trying to use it instead of OneTab. So far it seems a bit more work, as I need to 1) open all tabs into a tab tree (same as in OneTab), 2) open that tab tree in a separate window (extra step), and then 3) use the “Tag all tabs in window” or “Add all tabs in window” option from the extension button (similar as in OneTab), and finally 4) close the tabs by closing the window (extra step). What I usually do is to change a Tab Group from OneTab to a Collection in Memex and then take some extra time to add tags or notes, if appropriate.

So, I am quite confident Memex will be able to replace OneTab for me and most likely also (most) normal bookmarks. I may keep some bookmarks of things that I want to always keep track of, like my online bank’s URL, but I am not sure yet.

The annotations are a god-send as well, which will be very hard to get rid of, as I already got used to them.

Now, if I could only send stuff to my eInk reader (or phone), annotate it there and have those annotations auto-magically show up in the browser and therefore stored locally on my laptop … :D

Oh, oh, and if I could search through Memex from my KDE Plasma desktop and add/view annotations from other documents (e.g. ePub, ODF, PDF) and other applicatios (e.g. Okular, Calibre, LibreOffice). One may dream …

hook out → sipping Vin Santo and planning more order in bookmarks

P.S. This blog post was initially a comment to the topic “How do you organize your bookmarks?” in the ~tech group on Tildes where further discussion is happening as well.

Monday, 22 April 2019

[Some] KDE Applications 19.04 also available in flathub

The KDE Applications 19.04 release announcement (read it if you haven't, it's very complete) mentions some of the applications are available at the snap store, but forgets to mention flathub.

Just wanted to bring up that there's also some of the applications available in there

All the ones that are released along KDE Applications 19.04 were updated on release day (except kubrick that has a compilation issue and will be updated for 19.04.1 and kontact which is a best and to be honest i didn't particularly feel like updating it)

If you feel like helping there's more applications that need adding and more automation that needs to happen, so get in touch :)

Monday, 15 April 2019

Closer Look at the Double Ratchet

In the last blog post, I took a closer look at how the Extended Triple Diffie-Hellman Key Exchange (X3DH) is used in OMEMO and which role PreKeys are playing. This post is about the other big algorithm that makes up OMEMO. The Double Ratchet.

The Double Ratchet algorithm can be seen as the gearbox of the OMEMO machine. In order to understand the Double Ratchet, we will first have to understand what a ratchet is.

Before we start: This post makes no guarantees to be 100% correct. It is only meant to explain the inner workings of the Double Ratchet algorithm in a (hopefully) more or less understandable way. Many details are simplified or omitted for sake of simplicity. If you want to implement this algorithm, please read the Double Ratchet specification.

A ratchet tool can only turn in one direction, hence it is eponymous for the algorithm.
Image by Benedikt.Seidl [Public domain]

A ratchet is a tool used to drive nuts and bolts. The distinctive feature of a ratchet tool over an ordinary wrench is, that the part that grips the head of the bolt can only turn in one direction. It is not possible to turn it in the opposite direction as it is supposed to.

In OMEMO, ratchet functions are one-way functions that basically take input keys and derives a new keys from that. Doing it in this direction is easy (like turning the ratchet tool in the right direction), but it is impossible to reverse the process and calculate the original key from the derived key (analogue to turning the ratchet in the opposite direction).

Symmetric Key Ratchet

One type of ratchet is the symmetric key ratchet (abbrev. sk ratchet). It takes a key and some input data and produces a new key, as well as some output data. The new key is derived from the old key by using a so called Key Derivation Function. Repeating the process multiple times creates a Key Derivation Function Chain (KDF-Chain). The fact that it is impossible to reverse a key derivation is what gives the OMEMO protocol the property of Forward Secrecy.

A Key Derivation Function Chain or Symmetric Ratchet

The above image illustrates the process of using a KDF-Chain to generate output keys from input data. In every step, the KDF-Chain takes the input and the current KDF-Key to generate the output key. Then it derives a new KDF-Key from the old one, replacing it in the process.

To summarize once again: Every time the KDF-Chain is used to generate an output key from some input, its KDF-Key is replaced, so if the input is the same in two steps, the output will still be different due to the changed KDF-Key.

One issue of this ratchet is, that it does not provide future secrecy. That means once an attacker gets access to one of the KDF-Keys of the chain, they can use that key to derive all following keys in the chain from that point on. They basically just have to turn the ratchet forwards.

Diffie-Hellman Ratchet

The second type of ratchet that we have to take a look at is the Diffie-Hellman Ratchet. This ratchet is basically a repeated Diffie-Hellman Key Exchange with changing key pairs. Every user has a separate DH ratcheting key pair, which is being replaced with new keys under certain conditions. Whenever one of the parties sends a message, they include the public part of their current DH ratcheting key pair in the message. Once the recipient receives the message, they extract that public key and do a handshake with it using their private ratcheting key. The resulting shared secret is used to reset their receiving chain (more on that later).

Once the recipient creates a response message, they create a new random ratchet key and do another handshake with their new private key and the senders public key. The result is used to reset the sending chain (again, more on that later).

Principle of the Diffie-Hellman Ratchet.
Image by OpenWhisperSystems (modified by author)

As a result, the DH ratchet is forwarded every time the direction of the message flow changes. The resulting keys are used to reset the sending-/receiving chains. This introduces future secrecy in the protocol.

The Diffie-Hellman Ratchet


A session between two devices has three chains – a root chain, a sending chain and a receiving chain.

The root chain is a KDF chain which is initialized with the shared secret which was established using the X3DH handshake. Both devices involved in the session have the same root chain. Contrary to the sending and receiving chains, the root chain is only initialized/reset once at the beginning of the session.

The sending chain of the session on device A equals the receiving chain on device B. On the other hand, the receiving chain on device A equals the sending chain on device B. The sending chain is used to generate message keys which are used to encrypt messages. The receiving chain on the other hand generates keys which can decrypt incoming messages.

Whenever the direction of the message flow changes, the sending and receiving chains are reset, meaning their keys are replaced with new keys generated by the root chain.

The full Double Ratchet Algorithms Ratchet Architecture

An Example

I think this rather complex protocol is best explained by an example message flow which demonstrates what actually happens during message sending / receiving etc.

In our example, Obi-Wan and Grievous have a conversation. Obi-Wan starts by establishing a session with Grievous and sends his initial message. Grievous responds by sending two messages back. Unfortunately the first of his replies goes missing.

Session Creation

In order to establish a session with Grievous, Obi-Wan has to first fetch one of Grievous key bundles. He uses this to establish a shared secret S between him and Grievous by executing a X3DH key exchange. More details on this can be found in my previous post. He also extracts Grievous signed PreKey ratcheting public key. S is used to initialize the root chain.

Obi-Wan now uses Grievous public ratchet key and does a handshake with his own ratchet private key to generate another shared secret which is pumped into the root chain. The output is used to initialize the sending chain and the KDF-Key of the root chain is replaced.

Now Obi-Wan established a session with Grievous without even sending a message. Nice!

The session initiator prepares the sending chain.
The initial root key comes from the result of the X3DH handshake.
Original image by OpenWhisperSystems (modified by author)

Initial Message

Now the session is established on Obi-Wans side and he can start composing a message. He decides to send a classy “Hello there!” as a greeting. He uses his sending chain to generate a message key which is used to encrypt the message.

Principle of generating message keys from the a KDF-Chain.
In our example only one message key is derived though.
Image by OpenWhisperSystems

Note: In the above image a constant is used as input for the KDF-Chain. This constant is defined by the protocol and isn’t important to understand whats going on.

Now Obi-Wan sends over the encrypted message along with his ratcheting public key and some information on what PreKey he used, the current sending key chain index (1), etc.

When Grievous receives Obi-Wan’s message, he completes his X3DH handshake with Obi-Wan in order to calculate the same exact shared secret S as Obi-Wan did earlier. He also uses S to initialize his root chain.

Now Grevious does a full ratchet step of the Diffie-Hellman Ratchet: He uses his private and Obi-Wans public ratchet key to do a handshake and initialize his receiving chain with the result. Note: The result of the handshake is the same exact value that Obi-Wan earlier calculated when he initialized his sending chain. Fantastic, isn’t it? Next he deletes his old ratchet key pair and generates a fresh one. Using the fresh private key, he does another handshake with Obi-Wans public key and uses the result to initialize his sending chain. This completes the full DH ratchet step.

Full Diffie-Hellman Ratchet Step
Image by OpenWhisperSystems

Decrypting the Message

Now that Grievous has finalized his side of the session, he can go ahead and decrypt Obi-Wans message. Since the message contains the sending chain index 1, Grievous knows, that he has to use the first message key generated from his receiving chain to decrypt the message. Because his receiving chain equals Obi-Wans sending chain, it will generate the exact same keys, so Grievous can use the first key to successfully decrypt Obi-Wans message.

Sending a Reply

Grievous is surprised by bold actions of Obi-Wan and promptly goes ahead to send two replies.

He advances his freshly initialized sending chain to generate a fresh message key (with index 1). He uses the key to encrypt his first message “General Kenobi!” and sends it over to Obi-Wan. He includes his public ratchet key in the message.

Unfortunately though the message goes missing and is never received.

He then forwards his sending chain a second time to generate another message key (index 2). Using that key he encrypt the message “You are a bold one.” and sends it to Obi-Wan. This message contains the same public ratchet key as the first one, but has the sending chain index 2. This time the message is received.

Receiving the Reply

Once Obi-Wan receives the second message and does a full ratchet step in order to complete his session with Grevious. First he does a DH handshake between his private and the Grevouos’ public ratcheting key he got from the message. The result is used to setup his receiving chain. He then generates a new ratchet key pair and does a second handshake. The result is used to reset his sending chain.

Obi-Wan notices that the sending chain index of the received message is 2 instead of 1, so he knows that one message must have been missing or delayed. To deal with this problem, he advances his receiving chain twice (meaning he generates two message keys from the receiving chain) and caches the first key. If later the missing message arrives, the cached key can be used to successfully decrypt the message. For now only one message arrived though. Obi-Wan uses the generated message key to successfully decrypt the message.


What have we learned from this example?

Firstly, we can see that the protocol guarantees forward secrecy. The KDF-Chains used in the three chains can only be advanced forwards, and it is impossible to turn them backwards to generate earlier keys. This means that if an attacker manages to get access to the state of the receiving chain, they can not decrypt messages sent prior to the moment of attack.

But what about future messages? Since the Diffie-Hellman ratchet introduces new randomness in every step (new random keys are generated), an attacker is locked out after one step of the DH ratchet. Since the DH ratchet is used to reset the symmetric ratchets of the sending and receiving chain, the window of the compromise is limited by the next DH ratchet step (meaning once the other party replies, the attacker is locked out again).

On top of this, the double ratchet algorithm can deal with missing or out-of-order messages, as keys generated from the receiving chain can be cached for later use. If at some point Obi-Wan receives the missing message, he can simply use the cached key to decrypt its contents.

This self-healing property was eponymous to the Axolotl protocol (an earlier name of the Signal protocol, the basis of OMEMO).


Thanks to syndace and paul for their feedback and clarification on some points.

Saturday, 13 April 2019

Tenth Anniversary of AltOS

In the early days of the collaboration between Bdale Garbee and Keith Packard that later became Altus Metrum, the software for TeleMetrum was crafted as an application running on top of an existing open source RTOS. It didn't take long to discover that the RTOS was ill-suited to our needs, and Keith had to re-write various parts of it to make things fit in the memory available and work at all.

Eventually, Bdale idly asked Keith how much of the RTOS he'd have to rewrite before it would make sense to just start over from scratch. Keith took that question seriously, and after disappearing for a day or so, the first code for AltOS was committed to revision control on 12 April 2009.

Ten years later, AltOS runs on multiple processor architectures, and is at the heart of all Altus Metrum products.

Friday, 05 April 2019

Hard drive failure in my zpool 😞

I have a storage box in my house that stores important documents, backups, VM disk images, photos, a copy of the Tor Metrics archive and other odd things. I’ve put a lot of effort into making sure that it is both reliable and performant. When I was working on a modern CollecTor for Tor Metrics recently, I used this to be able to run the entire history of the Tor network through the prototype replacement to see if I could catch any bugs.

I have had my share of data loss events in my life, but since I’ve found ZFS I have hope that it is possible to avoid, or at least seriously minimise the risk of, any catastrophic data loss events ever happening to me again. ZFS has:

  • cryptographic checksums to validate data integrity
  • mirroring of disks
  • “scrub” function that ensures that the data on disk is actually still good even if you’ve not looked at it yourself in a while

ZFS on its own is not the entire solution though. I also mix-and-match hard drive models to ensure that a systematic fault in a particular model won’t wipe out all my mirrors at once, and I also have scheduled SMART self-tests to detect faults before any data loss has occured.

Unfortunately, one of my drives in my zpool has failed a SMART self-test.

Unfortunately, one of my drives in my zpool has failed a SMART self-test.

This means I now have to treat that drive as “going to fail soon” which means that I don’t have redundancy in my zpool anymore, so I have to act. Fortunately, in September 2017 when my workstation died, I received some donations towards the hardware I use for my open source work and I did buy a spare HDD for this very situation!

At present my zpool setup looks like:

% zpool status flat
  pool: flat
 state: ONLINE
  scan: scrub repaired 0 in 0 days 07:05:28 with 0 errors on Fri Apr  5 07:05:36 2019

	NAME                                            STATE     READ WRITE CKSUM
	flat                                            ONLINE       0     0     0
	  mirror-0                                      ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	  mirror-1                                      ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	  gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0

errors: No known data errors

The drives in the two mirrors are 3TB drives, in each mirror is one WD Red and one Toshiba NAS drive. In this case, it is one of the WD Red drives that has failed and I’ll be replacing it with another WD Red. One important thing to note is that you have to replace the drive with one of equal or greater capacity. In this case it is the same model so the capacity should be the same, but not all X TB drives are going to be the same size.

You’ll notice here that it is saying No known data errors. This is because there hasn’t been any issues with the data yet, it is just a SMART failure, and hopefully by replacing the disk any data error can be avoided entirely.

My plan was to move to a new system soon, with 8 bays. In that system I’ll keep the stripe over 2 mirrors but one mirror will run over 3x 6TB drives with the other remaining on 2x 3TB drives. This incident leaves me with only 1 leftover 3TB drive though so maybe I’ll have to rethink this.

Free space remaining in my zpool

Free space remaining in my zpool

My current machine, an HP MicroServer, does not support hot-swapping the drives so I have to start by powering off the machine and replacing the drive.

% zpool status flat
  pool: flat
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
	the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
  scan: scrub repaired 0 in 0 days 07:05:28 with 0 errors on Fri Apr  5 07:05:36 2019

	NAME                                            STATE     READ WRITE CKSUM
	flat                                            DEGRADED     0     0     0
	  mirror-0                                      ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	  mirror-1                                      DEGRADED     0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	    xxxxxxxxxxxxxxxxxxxx                        UNAVAIL      0     0     0  was /dev/gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
	  gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx    ONLINE       0     0     0

errors: No known data errors

The disk that was part of the mirror is now unavailable, but the pool is still functioning as the other disk is still present. This means that there are still no data errors and everything is still running. The only downtime was due to the non-hot-swappableness of my SATA controller.

Through the web interface in FreeNAS, it is possible to now use the new disk to replace the old disk in the mirror: Storage -> View Volumes -> Volume Status (under the table, with the zpool highlighted) -> Replace (with the unavailable disk highlighted).

Running zpool status again:

% zpool status flat
  pool: flat
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Apr  5 16:55:47 2019
	1.30T scanned at 576M/s, 967G issued at 1.12G/s, 4.33T total
	4.73G resilvered, 21.82% done, 0 days 00:51:29 to go

	NAME                                            STATE     READ WRITE CKSUM
	flat                                            ONLINE       0     0     0
	  mirror-0                                      ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	  mirror-1                                      ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0
	    gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  ONLINE       0     0     0  (resilvering)
	  gptid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx    ONLINE       0     0     0

errors: No known data errors

And everything should be OK again soon, now with the dangerous disk removed and a hopefully more reliable disk installed.

A more optimistic message from FreeNAS

A more optimistic message from FreeNAS

This has put a dent in my plans to upgrade my storage, so for now I’ve added the hard drives I’m looking for to my Amazon wishlist.

As for the drive that failed, I’ll be doing an ATA Secure Erase and then disposing of it. NIST SP 800-88 thinks that ATA Secure Erase is in the same category as degaussing a hard drive and that it is more effective than overwriting the disk with software. ATA Secure Erase is faster too because it’s the hard drive controller doing the work. I just have to hope that my firmware wasn’t replaced with firmware that only fakes the process (or I’ll just do an overwrite anyway to be sure). According to the same NIST document, “for ATA disk drives manufactured after 2001 (over 15 GB) clearing by overwriting the media once is adequate to protect the media from both keyboard and laboratory attack”.

This blog post is also a little experiment. I’ve used a Unicode emoji in the title, and I want to see how various feed aggregators and bots handle that. Sorry if I broke your aggregator or bot.

IETF 104 in Prague

Thanks to support from Article 19, I was able to attend IETF 104 in Prague, Czech Republic this week. Primarily this was to present my Internet Draft which takes safe measurement principles from Tor Metrics work and the Research Safety Board and applies them to Internet Measurement in general.

My IETF badge, complete with additional tag for my nick

My IETF badge, complete with additional tag for my nick

I attended with a free one-day pass for the IETF and free hackathon registration, so more than just the draft presentation happened. During the hackathon I sat at the MAPRG table and worked on PATHspider with Mirja Kühlewind from ETH Zurich. We have the code running again with the latest libraries available in Debian testing and this may become the basis of a future Tor exit scanner (for generating exit lists, and possibly also some bad exit detection). We ran a quick measurement campaign that was reported in the hackathon presentations.

During the hackathon I also spoke to Watson Ladd from Cloudflare about his Roughtime draft which could be interesting for Tor for a number of reasons. One would be for verifying if a consensus is fresh, another would be for Tor Browser to detect if a TLS cert is valid, and another would be providing archive signatures for Tor Metrics. (We’ve started looking at archive signatures since our recent work on modernising CollecTor).

On the Monday, this was the first “real” day of the IETF. The day started off for me at the PEARG meeting. I presented my draft as the first presentation in that session. The feedback was all positive, it seems like having the document is both desirable and timely.

The next presentation was from Ryan Guest at Salesforce. He was talking about privacy considerations for application level logging. I think this would also be a useful draft that compliments my draft on safe measurement, or maybe even becomes part of my draft. I need to follow up with him to see what he wants to do. A future IETF hackathon project might be comparing Tor’s safe logging with whatever guidelines we come up with, and also comparing our web server logs setup.

Nick Sullivan was up next with his presentation on Privacy Pass. It seems like a nice scheme, assuming someone can audit the anti-tagging properties of it. The most interesting thing I took away from it is that federation is being explored which would turn this into a system that isn’t just for Cloudflare.

Amelia Andersdotter and Christoffer Långström then presented on differential privacy. They have been exploring how it can be applied to binary values as opposed to continuous, and how it could be applied to Internet protocols like the QUIC spin bit.

The last research presentation was Martin Schanzenbach presenting on an identity provider based on the GNU Name System. This one was not so interesting for me, but maybe others are interested.

I attended the first part of the Stopping Malware and Researching Threats (SMART) session. There was an update from Symantec based on their ISTR report and I briefly saw the start of a presentation about “Malicious Uses of Evasive Communications and Threats to Privacy“ but had to leave early to attend another meeting. I plan to go back and look through all of the slides from this session later.

The next IETF meeting is directly after the next Tor meeting (I had thought for some reason it directly clashed, but I guess I was wrong). I will plan to remotely participate in PEARG again there and move my draft forwards.

When the Duke walks, you don't notice it

This is my late submission for transgender day of visibility. It comes almost a week late, but I suppose I’ll use this proverb that is popular among the trans community to justify myself:

The best time to plant a tree was 20 years ago. The second best time is now.

Or: The best time to post about transgender day of visibility was 31st of March. The second best time is 5th of April.

I should preface this post by emphasising that all of its contents are exclusively my own experiences, and may not speak for anybody other than myself. It is written in the spirit of visibility, so that the public knows that transgender people exist, and that ultimately we are normal people.

A second justification for my lateness has been my hesitance to broadcast this to the internet. I like privacy, and I take a lot of steps to safeguard it (e.g, by using Free Software). But there is one step that I have made only halfway, and that is anonymity. I use a VPN to hide my IP address, and I take special care not to give internet giants all of my personal data—I am a nobody when I surf the web.

But when I interact with human beings on the internet, I try to be me. This is terrible privacy advice, because the internet never forgets when you make a mistake. But I find it important, because the internet is a very real place. More and more, the internet affects our collective lives. It allows us to do tangible things such as purchasing items, and it does intangible things such as morphing our perception and opinion of the world. Anonymity allows you to enter this space—this real space—as an ethereal ghost, existing perpetually out of sight, but able to interact just the same. It does not take a creative mind to imagine how this can be abused, and people do.

I could abuse such ghostly powers for good, but I am not comfortable with holding that power. So I wish to be myself in spite of knowing better. To exist online under this name, I must self-censor. I must not say things that I imagine will come back to harm my future self, and I must hide aspects about myself that I do not want everybody to know—say it once, and the whole world knows.

And for years, I haven’t said it: I am transgender. By itself this is unimportant (so what?), but the act of saying it is not. The act of saying it means that anybody, absolutely anybody until the end of the digital age, can discover this about me and hold it against me, and there is no shortage of people who would. And that is frankly quite scary.

But the act of saying it is also activism. By saying it, you assert your existence in the face of an ideology that wishes you didn’t. By saying it, you own the narrative of what it means to be trans, rather than ideologues who would paint you in a dehumanised light. By saying it, you make tangible and visible a human experience that many people do not understand. At the risk of sounding self-aggrandising, there is power in that.

The last point I find especially empowering. Until the exact moment I decided to transition, I simply did not know of the mere existence of trans people. I knew about drag queens flamboyantly dancing on boats in the canals of Amsterdam, but those people were otherworldly to me. They weren’t tangibly real, and they weren’t me. Had I known that transgender people were everyday women and men who care about the same things that I do, I would have spared myself a lot of mental anguish and made the leap a lot sooner.

Instead, something else prompted that realisation. I was reading a Christmas novel in Summer, as you do. The book was called “Let It Snow: Three Holiday Romances” authored by John Green, Lauren Myracle and Maureen Johnson. The book has three POVs in a town struck by a heavy snow storm, and there’s a lot of interplay between the POVs.

I was reading one of John Green’s chapters. His main character had long been great friends with a tomboyish girl (nicknamed “the Duke”) who struggled with her gender expression, and the two embark on a great journey through the snow storm to reach the waffle house. During this trek, there is a scene where he is walking a few paces behind her, and he is looking at her. And while looking, and through the shared experiences, a sudden thought strikes him: “Anyway, the Duke was walking, and there was a certain something to it, and I was kind of disgusted with myself for thinking about that certain something. [‌] When Brittany the cheerleader walks, you notice it. When the Duke walks, you don’t. Usually.”

These two had been friends for the longest time, and for the first time, he entertained the thought that it might be something more than that. And you read on, and on, and this wayward thought starts to become quite real and serious. And suddenly he becomes self-aware of the thought absorbing him:

“Once you think a thought, it is extremely difficult to unthink it. And I had thought the thought.”

It hit me like a brick. In that very same instance, I, too, thought the thought. What had been a feeling for so long, I finally thought out loud in my head, and it was impossible for me to unthink it. It had nothing to do with the novel, and I have no idea how I made that leap, but in that moment I realised for the first time, truly realised, that I did not want to be a boy—that I wanted to be a girl. And I was miserable for it, but eventually better off. A quick search later and I discovered that trans people exist, and that transition is an actual thing that normal people do.

So I did it. And it has been good. Whatever ailed me prior to transition is mostly gone, and I have become a functioning adult who does many non-transgender-related things such as translating GNOME into Esperanto and creating cat monsters for Dungeons & Dragons. But I never really included being transgender in any of my online activities, and I want to change that. I want to be more like the Duke. I want to walk like her, and while people may not always see that walk, I want to call attention to it every now and then. And maybe it will help someone be struck by the thought, whatever spark of madness it is that they need.

Happy transgender day of visibility.

Thursday, 04 April 2019

The Power of Workflow Scripts

Nextcloud has the ability to define some conditions under which external scripts are executed. The app which makes this possible is called “Workflow Script”. I always knew that this powerful tool exists, yet I never really had a use case for it. This changed last week. Task I heavily rely on text files for note taking. I organize them in folders, for example I have a “Projects” folders with sub-folders for each project I work on currently.

Wednesday, 03 April 2019

Shaking Hands With OMEMO: X3DH Key Exchange

This is the first part of a small series about the cryptographic building blocks of OMEMO. This post is about the Extended Triple Diffie Hellman Key Exchange Algorithm (X3DH) which is used to establish a session between OMEMO devices.
Part 2: Closer Look at the Double Ratchet

In the past I have written some posts about OMEMO and its future and how it does compare to the Olm encryption protocol used by However, some readers requested a closer, but still straightforward look at how OMEMO and the underlying algorithms work. To get started, we first have to take a look at its past.

OMEMO was implemented in the Android Jabber Client Conversations as part of a Google Summer of Code project by Andreas Straub in 2015. The basic idea was to utilize the encryption library used by Signal (formerly TextSecure) for message encryption. So basically OMEMO borrows almost all the cryptographic mechanisms including the Double Ratchet and X3DH from Signals encryption protocol, which is appropriately named Signal Protocol. So to begin with, lets look at it first.

The Signal Protocol

The famous and ingenious protocol that drives the encryption behind Signal, OMEMO,, WhatsApp and a lot more was created by Trevor Perrin and Moxie Marlinspike in 2013. Basically it consists of two parts that we need to further investigate:

  • The Extended Triple-Diffie-Hellman Key Exchange (X3DH)
  • The Double Ratchet Algorithm

One core principle of the protocol is to get rid of encryption keys as soon as possible. Almost every message is encrypted with another fresh key. This is a huge difference to other protocols like OpenPGP, where the user only has one key which can decrypt all messages ever sent to them. The later can of course also be seen as an advantage OpenPGP has over OMEMO, but it all depends on the situation the user is in and what they have to protect against.

A major improvement that the Signal Protocol introduced compared to encryption protocols like OTRv3 (Off-The-Record Messaging) was the ability to start a conversation with a chat partner in an asynchronous fashion, meaning that the other end didn’t have to be online in order to agree on a shared key. This was not possible with OTRv3, since both parties had to actively send messages in order to establish a session. This was okay back in the days where people would start their computer with the intention to chat with other users that were online at the same time, but it’s no longer suitable today.

Note: The recently worked on OTRv4 will not come with this handicap anymore.

The X3DH Key Exchange

Let’s get to it already!

X3DH is a key agreement protocol, meaning it is used when two parties establish a session in order to agree on a shared secret. For a conversation to be confidential we require, that only sender and (intended) recipient of a message are able to decrypt it. This is possible when they share a common secret (eg. a password or shared key). Exchanging this key with one another has long been kind of a hen and egg problem: How do you get the key from one end to the other without an adversary being able to get a copy of the key? Well, obviously by encrypting it, but how? How do you get that key to the other side? This problem has only been solved after the second world war.

The solution is a so called Diffie-Hellman-Merkle Key Exchange. I don’t want to go into too much detail about this, as there are really great resources about how it works available online, but the basic idea is that each party possesses an asymmetric key pair consisting of a public and a private key. The public key can be shared over insecure networks while the
private key must be kept secret. A Diffie-Hellman key exchange (DH) is the process of combining a public key A with a private key b in order to generate a shared secret. The essential trick is, that you get the same exact secret if you combine the secret key a with the public key B. Wikipedia does a great job at explaining this using an analogy of mixing colors.

Deniability and OTR

In normal day to day messaging you don’t always want to commit to what you said. Especially under oppressive regimes it may be a good idea to be able to deny that you said or wrote something specific. This principle is called deniability.

Note: It is debatable, whether cryptographic deniability ever saved someone from going to jail, but that’s not scope of this blog post.

At the same time you want to be absolutely sure that you are really talking to your chat partner and not to a so called man in the middle. These desires seem to be conflicting at first, but the OTR protocol featured both. The user has an IdentityKey, which is used to identify the user by means of a fingerprint. The (massively and horribly simplified) procedure of creating a OTR session is as follows: Alice generates a random session key and signs the public key with her IdentityKey. She then sends that public key over to Bob, who generates another random session key with which he executes his half of the DH handshake. He then sends the public part of that key (again, signed) back to Alice, who does another DH to acquire the same shared secret as Bob. As you can see, in order to establish a session, both parties had to be online. Note: The signing part has been oversimplified for sake of readability.

Normal Diffie-Hellman Key Exchange

From DH to X3DH

Perrin and Marlinspike improved upon this model by introducing the concept of PreKeys. Those basically are the first halves of a DH-handshake, which can – along with some other keys of the user – be uploaded to a server prior to the beginning of a conversation. This way another user can initiate a session by fetching one half-completed handshake and completing it.

Basically the Signal protocol comprises of the following set of keys per user:

IdentityKey (IK)Acts as the users identity by providing a stable fingerprint
Signed PreKey (SPK)Acts as a PreKey, but carries an additional signature of IK
Set of PreKeys ({OPK})Unsigned PreKeys

If Alice wants to start chatting, she can fetch Bobs IdentityKey, Signed PreKey and one of his PreKeys and use those to create a session. In order to preserve cryptographic properties, the handshake is modified like follows:

DH2 = DH(EK_A, IK_B)

S = KDF(DH1 || DH2 || DH3 || DH4)

EK_A denotes an ephemeral, random key which is generated by Alice on the fly. Alice can now derive an encryption key to encrypt her first message for Bob. She then sends that message (a so called PreKeyMessage) over to Bob, along with some additional information like her IdentityKey IK, the public part of the ephemeral key EK_A and the ID of the used PreKey OPK.

Visual representation of the X3DH handshake

Once Bob logs in, he can use this information to do the same calculations (just with swapped public and private keys) to calculate S from which he derives the encryption key. Now he can decrypt the message.

In order to prevent the session initiation from failing due to lost messages, all messages that Alice sends over to Bob without receiving a first message back are PreKeyMessages, so that Bob can complete the session, even if only one of the messages sent by Alice makes its way to Bob. The exact details on how OMEMO works after the X3DH key exchange will be discussed in part 2 of this series đŸ™‚

X3DH Key Exchange TL;DR

X3DH utilizes PreKeys to allow session creation with offline users by doing 4 DH handshakes between different keys.

A subtle but important implementation difference between OMEMO and Signal is, that the Signal server is able to manage the PreKeys for the user. That way it can make sure, that every PreKey is only used once. OMEMO on the other hand solely relies on the XMPP servers PubSub component, which does not support such behavior. Instead, it hands out a bundle of around 100 PreKeys. This seems like a lot, but in reality the chances of a PreKey collision are pretty high (see the birthday problem).

OMEMO does come with some counter measures for problems and attacks that arise from this situation, but it makes the protocol a little less appealing than the original Signal protocol.

Clients should for example keep used PreKeys around until the end of catch -up of missed message to allow decryption of messages that got sent in sessions that have been established using the same PreKey.

Monday, 25 March 2019

Another Step to a Google-free Life

I watch a lot of YouTube videos. So much, that it starts to annoy me, how much of my free time I’m wasting by watching (admittedly very interesting) clips of a broad range of content creators.

Logging out of my Google account helped a little bit to keep my addiction at bay, as it appears to prevent the YouTube algorithm, which normally greets me with a broad set of perfectly selected videos from recognizing me. But then again I use Google to log in to one service or another, so it became annoying to log in and back out again all the time. At one point I decided to delete my YouTube history, which resulted in a very bad prediction of what videos I might like. This helped for a short amount of time, but the algorithm quickly returned to its merciless precision after a few days.

Today I decided, that its time to leave Google behind completely. My Google Mail account was used only for online shopping anyways, so I figured why not use a more privacy respecting service instead. Self-hosting was not an option for me, as I only have a residential IP address on my Raspberry Pi and also I heard that hosting a mail server is a huge pain.

A New Mail Account

So I created an account at the Berlin based service They offer emails plus some cloud stuff like an office suite, storage etc., although I don’t think I’ll use any of the additional services (oh, they offer an XMPP account as well :P). The service is not free as in free beer as it costs 1â‚Ź per month, but that’s a fair price in my opinion. All in all it appears to be a good replacement for all the Google stuff.

As a next step, I went through the long list of all the websites and shops that I have accounts on, scouting for those services that are registered on my Google Mail address. All those mail settings had to be changed to the new account.

Mail Extensions

Bonus Tipp: has support for so called Mail Extensions (or Plus Extensions, I’m not really sure how they are called). This means that you can create a folder in your inbox, lets say “fsfe”. Now you can change your mail address of your FSFE account to “”. Mails from the FSFE will still go to your “” mail account, but they are automatically sorted into the fsfe inbox. This is useful not only to sort mails by sender, but also to find out, which of the many services you use messed up and leaked your mail address to those nasty spammers, so you can avoid that service in the future.

This trick also works for Google Mail by the way.

Deleting (most) the Google Services

The last step logically would be to finally delete my Google account. However, I’m not entirely sure if I really changed all the important services over to the new account, so I’ll keep it for a short period of time (a month or so) to see if any more important mails arrive.

However, I discovered that under the section “Delete Services or Account” you can see a list of all the services which are connected with your Google account. It is possible to partially delete those services, so I went ahead and deleted most of it, except Google Mail.

Additional Bonus Tipp: I use NewPipe on my phone, which is a free libre replacement for the YouTube app. It has a neat feature which lets you import your subscriptions from your YouTube account. That way I can still follow some of the creators, but in a more manual way (as I have to open the app on my phone, which I don’t often do). In my eyes, this is a good compromise đŸ™‚

I’m looking forward to go fully Google-free soon. I de-googled my phone ages ago, but for some reason I still held on to my Google account. This will be sorted out soon though!

De-Googling your Phone?

By the way, if you are looking to de-google your phone, Mike Kuketz has a great series of blog posts about that topic (in German though):

Happy Hacking!

Update (27.05.2019)

Friday, 22 March 2019

How to create good SSH keys

  • Seravo
  • 08:11, Friday, 22 March 2019

A couple years back we wrote a guide on how to create good OpenPGP/GnuPG keys and now it is time to write a guide on SSH keys for much of the same reasons: SSH key algorithms have evolved in past years and the keys generated by the default OpenSSH settings a few years ago are no longer considered state-of-the-art. This guide is intended both for those completely new to SSH and to those who have already been using it for years and who want to make sure they are following the latest best practices.

Use OpenSSH 7 or later

Related to SSH keys there have been some relevant changes in versions 5.7, 6.5 and 7.0. Latest version is 7.9. You should be running at least 7.0. Current Debian stable (”Stretch”) shipped version 7.4 and for example Ubuntu 16.04 (”Xenial”) shipped 7.2, so nobody should be running on their laptop any later versions than these.

Generate Ed25519 keys

With a recent version of OpenSSH, simply run ssh-keygen -t ed25519. This will create a private and public key pair files at .ssh/id_ed25519 (and .pub) using the Ed25519 algorithm, which is considered state of the art. Elliptic curve algorithms in general are sleek and efficient and unlike the other well known elliptic curve algorithm ECDSA, this Ed25519 does not depend on any suspicious NIST defined constants. If you encounter a server that is very old and does not support Ed25519 keys, you might need to have a more traditinal RSA keypair. A strong 4k RSA key pair can be generated with ssh-keygen -b 4096. Hopefully you won’t need to ever do that.

Running ssh-keygen will prompt for a password. If your laptop is encrypted and well protected you can omit the password and gain some speed and convenience in your SSH commands.

Store the private key securely

Hopefully your laptop is well protected with full disk encryption etc and you can trust that nobody else than yourself has access to your /home/<username>/.ssh directory. Hopefully you also have securely stored encrypted backups of your laptop so that you can recover the .ssh directory if your laptop for any reason is lost or broken.

The SSH keyfiles are stored as armored ASCII, which means that you could even print them on paper and store the printed key in a real vault just to be extra sure you never loose your private key (.ssh/id_ed25519).

The public key is indeed designed to be public

The public key .ssh/id_ed25519.pubon the other hand is meant to be public. Here is mine for example:

~/.ssh$ cat 
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMdBlrVoDupARk3pd1Q9sDImaGCxEalcFt7QTqBa36kH otto@XPS-13-9370

You can find a lot of public SSH keys for example on Github using the URL<username>.keys. You may also want to check out the excellent Github SSH usage docs.

This public key is the one you will distribute to remote servers. Place them on the remote server in the .ssh/authorized_keys file. The servers you try to access will use the public key to create a challenge, and only your laptop that has the private key pair can solve that challenge, and thus authenticate that your connection to the server is authorized.

On Linux machines a shorthand to copy your SSH key to a remote server is to run ssh-copy-id This will ask for a password on the first time, but once the key is in place, a SSH keys will be used instead and no password is asked anymore.

Best practices

Once you are familiar with SSH key usage basics, please adopt these policies:

  • Disable password authentication on the SSH servers you control altogether. The less there are passwords, the less there as ones that can leak or be forgotten etc. For passwords less is indeed more. Only use keys for authentication on SSH connections.
  • Use a separate key per client you SSH from. So don’t copy the private key from your laptop to another laptop for use in parallel. Each client system should have only one key, so in case a key leaks, you know which client system was compromised. If you stop using your old laptop and start using a new one it is naturally another case and then you can copy the key.
  • Avoid chaining SSH connections. Any system that is used for chaining SSH connections incurs a potential man-in-the-middle situation.
  • For the same reasons try to avoid X11 forwarding and agent forwarding. Consider putting these in your .ssh/config:
Host *
  ForwardX11   no
  ForwardAgent no

Using ProxyCommand

You shouldn’t forward SSH agent (ssh -A), but it’s ok to use ProxyCommand or ProxyJump. You can permanently configure it in your .ssh/config like this:

Port 23
ProxyCommand ssh -W %h:%p

Stay vigilant

If running ssh remote.example.comyields some error messages, don’t ignore them! SSH has an opportunistic key model, which is convenient, but it also means that if you are confronted with warnings that the connection might be eavesdropped you should really take note and not proceed.

Monday, 18 March 2019

Dataspaces and Paging in L4Re

The experiments covered by my recent articles about filesystems and L4Re managed to lead me along another path in the past few weeks. I had defined a mechanism for providing access to files in a filesystem via a programming interface employing interprocess communication within the L4Re system. In doing so, I had defined calls or operations that would read from and write to a file, observing that some kind of “memory-mapped” file support might also be possible. At the time, I had no clear idea of how this would actually be made to work, however.

As can often be the case, once some kind of intellectual challenge emerges, it can become almost impossible to resist the urge to consider it and to formulate some kind of solution. Consequently, I started digging deeper into a number of things: dataspaces, pagers, page faults, and the communication that happens within L4Re via the kernel to support all of these things.

Dataspaces and Memory

Because the L4Re developers have put a lot of effort into making a system where one can compile a fairly portable program and probably expect it to work, matters like the allocation of memory within programs, the use of functions like malloc, and other things we take for granted need no special consideration in the context of describing general development for an L4Re-based system. In principle, if our program wants more memory for its own use, then the use of things like malloc will probably suffice. It is where we have other requirements that some of the L4Re abstractions become interesting.

In my previous efforts to support MIPS-based systems, these other requirements have included the need to access memory with a fixed and known location so that the hardware can be told about it, thus supporting things like framebuffers that retain stored data for presentation on a display device. But perhaps most commonly in a system like L4Re, it is the need to share memory between processes or tasks that causes us to look beyond traditional memory allocation techniques at what L4Re has to offer.

Indeed, the filesystem work so far employs what are known as dataspaces to allow filesystem servers and client applications to exchange larger quantities of information conveniently via shared buffers. First, the client requests a dataspace representing a region of memory. It then associates it with an address so that it may access the memory. Then, the client shares this with the server by sending it a reference to the dataspace (known as a capability) in a message.

Opening a file using shared memory containing file information

Opening a file using shared memory containing file information

The kernel, in propagating the message and the capability, makes the dataspace available to the server so that both the client and the server may access the memory associated with the dataspace and that these accesses will just work without any further effort. At this level of sophistication we can get away with thinking of dataspaces as being blocks of memory that can be plugged into tasks. Upon obtaining access to such a block, reads and writes (or loads and stores) to addresses in the block will ultimately touch real memory locations.

Even in this simple scheme, there will be some address translation going on because each task has its own way of arranging its view of memory: its virtual address space. The virtual memory addresses used by a task may very well be different from the physical memory addresses indicating the actual memory locations involved in accesses.

An illustration of virtual memory corresponding to physical memory

An illustration of virtual memory corresponding to physical memory

Such address translation is at the heart of operating systems like those supported by the L4 family of microkernels. But the system will make sure that when a task tries to access a virtual address available to it, the access will be translated to a physical address and supported by some memory location.

Mapping and Paging

With some knowledge of the underlying hardware architecture, we can say that each task will need support from the kernel and the hardware to be able to treat its virtual address space as a way of accessing real memory locations. In my experiments with simple payloads to run on MIPS-based hardware, it was sufficient to define very simple tables that recorded correspondences between virtual and physical addresses. Processes or tasks would access memory addresses, and where the need arose to look up such a virtual address, the table would be consulted and the hardware configured to map the virtual address to a physical address.

Naturally, proper operating systems go much further than this, and systems built on L4 technologies go as far as to expose the mechanisms for normal programs to interact with. Instead of all decisions about how memory is mapped for each task being taken in the kernel, with the kernel being equipped with all the necessary policy and information, such decisions are delegated to entities known as pagers.

When a task needs an address translated, the kernel pushes the translation activity over to the designated pager for a decision to be made. And the event that demands an address translation is known as a page fault since it occurs when a task accesses a memory page that is not yet supported by a mapping to physical memory. Pagers are therefore present to receive page fault notifications and to respond in a way that causes the kernel to perform the necessary privileged actions to configure the hardware, this being one of the few responsibilities of the kernel.

The role of a pager in managing access to the contents of a dataspace

The role of a pager in managing access to the contents of a dataspace

Treating a dataspace as an abstraction for memory accessed by a task or application, the designated pager for the dataspace acts as a dataspace manager, ensuring that memory accesses within the dataspace can be satisfied. If an access causes a page fault, the pager must act to provide a mapping for the accessed page, leaving the application mostly oblivious to the work going on to present the dataspace and its memory as a continuously present resource.

An Aside

It is rather interesting to consider the act of delegation in the context of processor architecture. It would seem to be fairly common that the memory management units provided by various architectures feature built-in support for consulting various forms of data structures describing the virtual memory layout of a process or task. So, when a memory access fails, the information about the actual memory address involved can be retrieved from such a predefined structure.

However, the MIPS architecture largely delegates such matters to software: a processor exception is raised when a “bad” virtual address is used, and the job of doing something about it falls immediately to a software routine. So, there seems to be some kind of parallel between processor architecture and operating system architecture, L4 taking a MIPS-like approach of eager delegation to a software component for increased flexibility and functionality.

Messages and Flexpages

So, the high-level view so far is as follows:

  • Dataspaces represent regions of virtual memory
  • Virtual memory is mapped to physical memory where the data actually resides
  • When a virtual memory access cannot be satisfied, a page fault occurs
  • Page faults are delivered to pagers (acting as dataspace managers) for resolution
  • Pagers make data available and indicate the necessary mapping to satisfy the failing access

To get to the level of actual implementation, some familiarisation with other concepts is needed. Previously, my efforts have exposed me to the interprocess communications (IPC) central in L4Re as a microkernel-based system. I had even managed to gain some level of understanding around sending references or capabilities between processes or tasks. And it was apparent that this mechanism would be used to support paging.

Unfortunately, the main L4Re documentation does not seem to emphasise the actual message details or protocols involved in these fundamental activities. Instead, the library code is described in reference documentation with some additional explanation. However, some investigation of the code yielded some insights as to the kind of interfaces the existing dataspace implementations must support, and I also tracked down some message sending activities in various components.

When a page fault occurs, the first thing to know about it is the kernel because the fault occurs at the fundamental level of instruction execution, and it is the kernel’s job to deal with such low-level events in the first instance. Notification of the fault is then sent out of the kernel to the page fault handler for the affected task. The page fault handler then contacts the task’s pager to request a resolution to the problem.

Page fault handling in detail

Page fault handling in detail

In L4Re, this page fault handler is likely to be something called a region mapper (or perhaps a region manager), and so it is not completely surprising that the details of invoking the pager was located in some region mapper code. Putting together both halves of the interaction yielded the following details of the message:

  • map: offset, hot spot, flags → flexpage

Here, the offset is the position of the failing memory access relative to the start of the dataspace; the flags describe the nature of the memory to be accessed. The “hot spot” and “flexpage” need slightly more explanation, the latter being an established term in L4 circles, the former being almost arbitrarily chosen and not particularly descriptive.

The term “flexpage” may have its public origins in the “Flexible-Sized Page-Objects” paper whose title describes the term. For our purposes, the significance of the term is that it allows for the consideration of memory pages in a range of sizes instead of merely considering a single system-wide page size. These sizes start at the smallest page size supported by the system (but not necessarily the absolute smallest supported by the hardware, but anyway…) and each successively larger size is double the size preceding it. For example:

  • 4096 (212) bytes
  • 8192 (213) bytes
  • 16384 (214) bytes
  • 32768 (215) bytes
  • 65536 (216) bytes

When a page fault occurs, the handler identifies a region of memory where the failing access is occurring. Although it could merely request that memory be made available for a single page (of the smallest size) in which the access is situated, there is the possibility that a larger amount of memory be made available that encompasses this access page. The flexpage involved in a map request represents such a region of memory, having a size not necessarily decided in advance, being made available to the affected task.

This brings us to the significance of the “hot spot” and some investigation into how the page fault handler and pager interact. I must admit that I find various educational materials to be a bit vague on this matter, at least with regard to explicitly describing the appropriate behaviour. Here, the flexpage paper was helpful in providing slightly different explanations, albeit employing the term “fraction” instead of “hot spot”.

Since the map request needs to indicate the constraints applying to the region in which the failing access occurs, without demanding a particular size of region and yet still providing enough useful information to the pager for the resulting flexpage to be useful, an efficient way is needed of describing the memory landscape in the affected task. This is apparently where the “hot spot” comes in. Consider a failing access in page #3 of a memory region in a task, with the memory available in the pager to satisfy the request being limited to two pages:

Mapping available memory to pages in a task experiencing a page fault

Mapping available memory to pages in a task experiencing a page fault

Here, the “hot spot” would reference page #3, and this information would be received by the pager. The significance of the “hot spot” appears to be the location of the failing access within a flexpage, and if the pager could provide it then a flexpage of four system pages would map precisely to the largest flexpage expected by the handler for the task.

However, with only two system pages to spare, the pager can only send a flexpage consisting of those two pages, the “hot spot” being localised in page #1 of the flexpage to be sent, and the base of this flexpage being the base of page #0. Fortunately, the handler is smart enough to fit this smaller flexpage onto the “receive window” by using the original “hot spot” information, mapping page #2 in the receive window to page #0 of the received flexpage and thus mapping the access page #3 to page #1 of that flexpage.

So, the following seems to be considered and thus defined by the page fault handler:

  • The largest flexpage that could be used to satisfy the failing access.
  • The base of this flexpage.
  • The page within this flexpage where the access occurs: the “hot spot”.
  • The offset within the broader dataspace of the failing access, it indicating the data that would be expected in this page.

(Given this phrasing of the criteria, it becomes apparent that “flexpage offset” might be a better term than “hot spot”.)

With these things transferred to the pager using a map request, the pager’s considerations are as follows:

  • How flexpages of different sizes may fit within the memory available to satisfy the request.
  • The base of the most appropriate flexpage, where this might be the largest that fits within the available memory.
  • The population of the available memory with data from the dataspace.

To respond to the request, the pager sends a special flexpage item in its response message. Consequently, this flexpage is mapped into the task’s address space, and the execution of the task may resume with the missing data now available.

Practicalities and Pitfalls

If the dataspace being provided by a pager were merely a contiguous region of memory containing the data, there would probably be little else to say on the matter, but in the above I hint at some other applications. In my example, the pager only uses a certain amount of memory with which it responds to map requests. Evidently, in providing a dataspace representing a larger region, the data would have to be brought in from elsewhere, which raises some other issues.

Firstly, if data is to be copied into the limited region of memory available for satisfying map requests, then the appropriate portion of the data needs to be selected. This is mostly a matter of identifying how the available memory pages correspond to the data, then copying the data into the pages so that the accessed location ultimately provides the expected data. It may also be the case that the amount of data available does not fill the available memory pages; this should cause the rest of those pages to be filled with zeros so that data cannot leak between map requests.

Secondly, if the available memory pages are to be used to satisfy the current map request, then what happens when we re-use them in each new map request? It turns out that the mappings made for previous requests remain active! So if a task traverses a sequence of pages, and if each successive page encountered in that traversal causes a page fault, then it will seem that new data is being made available in each of those pages. But if that task inspects the earlier pages, it will find that the newest data is exposed through those pages, too, banishing the data that we might have expected.

Of course, what is happening is that all of the mapped pages in the task’s dataspace now refer to the same collection of pages in the pager, these being dedicated to satisfying the latest map request. And so, they will all reflect the contents of those available memory pages as they currently are after this latest map request.

The effect of mapping the same page repeatedly

The effect of mapping the same page repeatedly

One solution to this problem is to try and make the task forget the mappings for pages it has visited previously. I wondered if this could be done automatically, by sending a flexpage from the pager with a flag set to tell the kernel to invalidate prior mappings to the pager’s memory. After a time looking at the code, I ended up asking on the l4-hackers mailing list and getting a very helpful response that was exactly what I had been looking for!

There is, in fact, a special way of telling the kernel to “unmap” memory used by other tasks (l4_task_unmap), and it is this operation that I ended up using to invalidate the mappings previously sent to the task. Thus the task, upon backtracking to earlier pages, finds that the mappings from virtual addresses to the physical memory holding the latest data are absent once again, and page faults are needed to restore the data in those pages. The result is a form of multiplexing access to a resource via a limited region of memory.

Applications of Flexible Paging

Given the context of my investigations, it goes almost without saying that the origin of data in such a dataspace could be a file in a filesystem, but it could equally be anything that exposes data in some kind of backing store. And with this backing store not necessarily being an area of random access memory (RAM), we enter the realm of a more restrictive definition of paging where processes running in a system can themselves be partially resident in RAM and partially resident in some other kind of storage, with the latter portions being converted to the former by being fetched from wherever they reside, depending on the demands made on the system at any given point in time.

One observation worth making is that a dataspace does not need to be a dedicated component in the system in that it is not a separate and special kind of entity. Anything that is able to respond to the messages understood by dataspaces – the paging “protocol” – can provide dataspaces. A filesystem object can therefore act as a dataspace, exposing itself in a region of memory and responding to map requests that involve populating that region from the filesystem storage.

It is also worth mentioning that dataspaces and flexpages exist at different levels of abstraction. Dataspaces can be considered as control mechanisms for accessing regions of virtual memory, and the Fiasco.OC kernel does not appear to employ the term at all. Meanwhile, flexpages are abstractions for memory pages existing within or even independently of dataspaces. (If you wish, think of the frame of Banksy’s work “Love is in the Bin” as a dataspace, with the shredded pieces being flexpages that are mapped in and out.)

One can envisage more exotic forms of dataspace. Consider an image whose pixels need to be computed, like a ray-traced image, for instance. If it exposed those pixels as a dataspace, then a task reading from pages associated with that dataspace might cause computations to be initiated for an area of the image, with the task being suspended until those computations are performed and then being resumed with the pixel data ready to read, with all of this happening largely transparently.

I started this exercise out of somewhat idle curiosity, but it now makes me wonder whether I might introduce memory-mapped access to filesystem objects and then re-implement operations like reading and writing using this particular mechanism. Not being familiar with how systems like GNU/Linux provide these operations, I can only speculate as to whether similar decisions have been taken elsewhere.

But certainly, this exercise has been informative, even if certain aspects of it were frustrating. I hope that this account of my investigations proves useful to anyone else wondering about microkernel-based systems and L4Re in particular, especially if they too wish there were more discussion, reflection and collaboration on the design and implementation of software for these kinds of systems.

Sunday, 10 March 2019

Generate a random root password aka Ansible Password Plugin

I was suspicious with a cron entry on a new ubuntu server cloud vm, so I ended up to be looking on the logs.

Authentication token is no longer valid; new one required

After a quick internet search,

# chage -l root

Last password change                                    : password must be changed
Password expires                                        : password must be changed
Password inactive                                       : password must be changed
Account expires                                         : never
Minimum number of days between password change          : 0
Maximum number of days between password change          : 99999
Number of days of warning before password expires       : 7

due to the password must be changed on the root account, the cron entry does not run as it should.

This ephemeral image does not need to have a persistent known password, as the notes suggest, and it doesn’t! Even so, we should change to root password when creating the VM.


Ansible have a password plugin that we can use with lookup.

TLDR; here is the task:

- name: Generate Random Password
      name: root
      password: "{{ lookup('password','/dev/null encrypt=sha256_crypt length=32') }}"

after ansible-playbook runs

# chage -l root

Last password change                                    : Mar 10, 2019
Password expires                                        : never
Password inactive                                       : never
Account expires                                         : never
Minimum number of days between password change          : 0
Maximum number of days between password change          : 99999
Number of days of warning before password expires       : 7

and cron entry now runs as it should.

Password Plugin

Let explain how password plugin works.

Lookup needs at-least two (2) variables, the plugin name and a file to store the output. Instead, we will use /dev/null not to persist the password to a file.

To begin with it, a test ansible playbook:

- hosts: localhost
  gather_facts: False
  connection: local

    - debug:
        msg: "{{ lookup('password', '/dev/null') }}"
      with_sequence: count=5


ok: [localhost] => (item=1) => {
    "msg": "dQaVE0XwWti,7HMUgq::"
ok: [localhost] => (item=2) => {
    "msg": "aT3zqg.KjLwW89MrAApx"
ok: [localhost] => (item=3) => {
    "msg": "4LBNn:fVw5GhXDWh6TnJ"
ok: [localhost] => (item=4) => {
    "msg": "v273Hbox1rkQ3gx3Xi2G"
ok: [localhost] => (item=5) => {
    "msg": "NlwzHoLj8S.Y8oUhcMv,"


In password plugin we can also use length variable:

msg: "{{ lookup('password', '/dev/null length=32') }}"


ok: [localhost] => (item=1) => {
    "msg": "4.PEb6ycosnyL.SN7jinPM:AC9w2iN_q"
ok: [localhost] => (item=2) => {
    "msg": "s8L6ZU_Yzuu5yOk,ISM28npot4.KwQrE"
ok: [localhost] => (item=3) => {
    "msg": "L9QvLyNTvpB6oQmcF8WVFy.7jE4Q1K-W"
ok: [localhost] => (item=4) => {
    "msg": "6DMH8KqIL:kx0ngFe8:ri0lTK4hf,SWS"
ok: [localhost] => (item=5) => {
    "msg": "ByW11i_66K_0mFJVB37Mq2,.fBflepP9"


We can define a specific type of python string constants

  • ascii_letters (ascii_lowercase and ascii_uppercase
  • ascii_lowercase
  • ascii_uppercase
  • digits
  • hexdigits
  • letters (lowercase and uppercase)
  • lowercase
  • octdigits
  • punctuation
  • printable (digits, letters, punctuation and whitespace
  • uppercase
  • whitespace


msg: "{{ lookup('password', '/dev/null length=32 chars=ascii_lowercase') }}"

ok: [localhost] => (item=1) => {
    "msg": "vwogvnpemtdobjetgbintcizjjgdyinm"
ok: [localhost] => (item=2) => {
    "msg": "pjniysksnqlriqekqbstjihzgetyshmp"
ok: [localhost] => (item=3) => {
    "msg": "gmeoeqncdhllsguorownqbynbvdusvtw"
ok: [localhost] => (item=4) => {
    "msg": "xjluqbewjempjykoswypqlnvtywckrfx"
ok: [localhost] => (item=5) => {
    "msg": "pijnjfcpjoldfuxhmyopbmgdmgdulkai"


We can also define the encryption hash. Ansible uses passlib so the unix active encrypt hash algorithms are:

  • passlib.hash.bcrypt - BCrypt
  • passlib.hash.sha256_crypt - SHA-256 Crypt
  • passlib.hash.sha512_crypt - SHA-512 Crypt


msg: "{{ lookup('password', '/dev/null length=32 chars=ascii_lowercase encrypt=sha512_crypt') }}"

ok: [localhost] => (item=1) => {
    "msg": "$6$BR96lZqN$jy.CRVTJaubOo6QISUJ9tQdYa6P6tdmgRi1/NQKPxwX9/Plp.7qETuHEhIBTZDTxuFqcNfZKtotW5q4H0BPeN."
ok: [localhost] => (item=2) => {
    "msg": "$6$ESf5xnWJ$cRyOuenCDovIp02W0eaBmmFpqLGGfz/K2jd1FOSVkY.Lsuo8Gz8oVGcEcDlUGWm5W/CIKzhS43xdm5pfWyCA4."
ok: [localhost] => (item=3) => {
    "msg": "$6$pS08v7j3$M4mMPkTjSwElhpY1bkVL727BuMdhyl4IdkGM7Mq10jRxtCSrNlT4cAU3wHRVxmS7ZwZI14UwhEB6LzfOL6pM4/"
ok: [localhost] => (item=4) => {
    "msg": "$6$m17q/zmR$JdogpVxY6MEV7nMKRot069YyYZN6g8GLqIbAE1cRLLkdDT3Qf.PImkgaZXDqm.2odmLN8R2ZMYEf0vzgt9PMP1"
ok: [localhost] => (item=5) => {
    "msg": "$6$tafW6KuH$XOBJ6b8ORGRmRXHB.pfMkue56S/9NWvrY26s..fTaMzcxTbR6uQW1yuv2Uy1DhlOzrEyfFwvCQEhaK6MrFFye."
Tag(s): ansible, password

A look at’s OLM | MEGOLM encryption protocol

Everyone who knows and uses XMPP is probably aware of a new player in the game. is often recommended as a young, arising alternative to the aging protocol behind the Jabber ecosystem. However the founders do not see their product as a direct competitor to XMPP as their approach to the problem of message exchanging is quite different.

An open network for secure, decentralized communication.

During his talk at the FOSDEM in Brussels, founder Matthew Hodgson roughly compared the concept of matrix to how git works. Instead of passing single messages between devices and servers, matrix is all about synchronization of a shared state. A chat room can be seen as a repository, which is shared between all servers of the participants. As a consequence communication in a chat room can go on, even when the server on which the room was created goes down, as the room simultaneously exists on all the other servers. Once the failed server comes back online, it synchronizes its state with the others and retrieves missed messages.

Matrix in the French State

Olm, Megolm – What’s the deal?

Matrix introduced two different crypto protocols for end-to-end encryption. One is named Olm, which is used in one-to-one chats between two chat partners (this is not quite correct, see Updates for clarifying remarks). It can very well be compared to OMEMO, as it too is an adoption of the Signal Protocol by OpenWhisperSystems. However, due to some differences in the implementation Olm is not compatible with OMEMO although it shares the same cryptographic properties.

The other protocol goes by the name of Megolm and is used in group chats. Conceptually it deviates quite a bit from Olm and OMEMO, as it contains some modifications that make it more suitable for the multi-device use-case. However, those modifications alter its cryptographic properties.

Comparing Cryptographic Building Blocks

ProtocolOlmOMEMO (Signal)
Key Exchange
Triple Diffie-Hellman
Extended Triple
Diffie-Hellman (X3DH)
Ratcheting AlgoritmDouble RatchetDouble Ratchet
  1. Signal uses a Curve X25519 IdentityKey, which is capable of both encrypting, as well as creating signatures using the XEdDSA signature scheme. Therefore no separate FingerprintKey is needed. Instead the fingerprint is derived from the IdentityKey. This is mostly a cosmetic difference, as one less key pair is required.
  2. Olm does not distinguish between the concepts of signed and unsigned PreKeys like the Signal protocol does. Instead it only uses one type of PreKey. However, those may be signed with the FingerprintKey upon upload to the server.
  3. OMEMO includes the SignedPreKey, as well as an unsigned PreKey in the handshake, while Olm only uses one PreKey. As a consequence, if the senders Olm IdentityKey gets compromised at some point, the very first few messages that are sent could possibly be decrypted.

In the end Olm and OMEMO are pretty comparable, apart from some simplifications made in the Olm protocol. Those do only marginally affect its security though (as far as I can tell as a layman).


The similarities between OMEMO and Matrix’ encryption solution end when it comes to group chat encryption.

OMEMO does not treat chats with more than two parties any other than one-to-one chats. The sender simply has to manage a lot more keys and the amount of required trust decisions grows by a factor roughly equal to the number of chat participants.

Yep, this is a mess but luckily XMPP isn’t a very popular chat protocol so there are no large encrypted group chats ;P

So how does Matrix solve the issue?

When a user joins a group chat, they generate a session for that chat. This session consists of an Ed25519 SigningKey and a single ratchet which gets initialized randomly.

The public part of the signing key and the state of the ratchet are then shared with each participant of the group chat. This is done via an encrypted channel (using Olm encryption). Note, that this session is also shared between the devices of the user. Contrary to Olm, where every device has its own Olm session, there is only one Megolm session per user per group chat.

Whenever the user sends a message, the encryption key is generated by forwarding the ratchet and deriving a symmetric encryption key for the message from the ratchets output. Signing is done using the SigningKey.

Recipients of the message can decrypt it by forwarding their copy of the senders ratchet the same way the sender did, in order to retrieve the same encryption key. The signature is verified using the public SigningKey of the sender.

There are some pros and cons to this approach, which I briefly want to address.

First of all, you may find that this protocol is way less elegant compared to Olm/Omemo/Signal. It poses some obvious limitations and security issues. Most importantly, if an attacker gets access to the ratchet state of a user, they could decrypt any message that is sent from that point in time on. As there is no new randomness introduced, as is the case in the other protocols, the attacker can gain access by simply forwarding the ratchet thereby generating any decryption keys they need. The protocol defends against this by requiring the user to generate a new random session whenever a new user joins/leaves the room and/or a certain number of messages has been sent, whereby the window of possibly compromised messages gets limited to a smaller number. Still, this is equivalent to having a single key that decrypts multiple messages at once.

The Megolm specification lists a number of other caveats.

On the pro side of things, trust management has been simplified as the user basically just has to decide whether or not to trust each group member instead of each participating device – reducing the complexity from a multiple of n down to just n. Also, since there is no new randomness being introduced during ratchet forwarding, messages can be decrypted multiple times. As an effect devices do not need to store the decrypted messages. Knowledge of the session state(s) is sufficient to retrieve the message contents over and over again.

By sharing older session states with own devices it is also possible to read older messages on new devices. This is a feature that many users are missing badly from OMEMO.

On the other hand, if you really need true future secrecy on a message-by-message base and you cannot risk that an attacker may get access to more than one message at a time, you are probably better off taking the bitter pill going through the fingerprint mess and stick to normal Olm/OMEMO (see Updates for remarks on this statement).

Note: End-to-end encryption does not really make sense in big, especially public chat rooms, since an attacker could just simply join the room in order to get access to ongoing communication. Thanks to Florian Schmaus for pointing that out.

I hope I could give a good overview of the different encryption mechanisms in XMPP and Matrix. Hopefully I did not make any errors, but if you find mistakes, please let me know, so I can correct them asap đŸ™‚

Happy Hacking!



Thanks for Matthew Hodgson for pointing out, that Olm/OMEMO is also effectively using a symmetric ratchet when multiple consecutive messages are sent without the receiving device sending an answer. This can lead to loss of future secrecy as discussed in the OMEMO protocol audit.

Also thanks to Hubert Chathi for noting, that Megolm is also used in one-to-one chats, as matrix doesn’t have the same distinction between group and single chats. He also pointed out, that the security level of Megolm (the criteria for regenerating the session) can be configured on a per-chat basis.

Planet FSFE (en): RSS 2.0 | Atom | FOAF |

        /var/log/fsfe/flx » planet-en  Albrechts Blog  Alessandro at FSFE » English  Alessandro's blog  Alina Mierlus - Building the Freedom » English  Andrea Scarpino's blog  André Ockers on Free Software  Being Fellow #952 of FSFE » English  Bela's Internship Blog  Bernhard's Blog  Bits from the Basement  Blog of Martin Husovec  Blog » English  Blog – Think. Innovation.  Bobulate  Brian Gough's Notes  Chris Woolfrey -- FSFE UK Team Member  Ciarán's free software notes  Colors of Noise - Entries tagged planetfsfe  Communicating freely  Daniel Martí's blog  David Boddie - Updates (Full Articles)  ENOWITTYNAME  English Planet – Dreierlei  English on Björn Schießle - I came for the code but stayed for the freedom  English – Max's weblog  Escape to freedom  Evaggelos Balaskas - System Engineer  FSFE Fellowship Vienna » English  FSFE interviews its Fellows  Fellowship News  Florian Snows Blog » en  Frederik Gladhorn (fregl) » FSFE  Free Software & Digital Rights Noosphere  Free Software with a Female touch  Free Software –  Free Software – Frank Karlitschek_  Free Software – hesa's Weblog  Free as LIBRE  Free speech is better than free beer » English  Free, Easy and Others  From Out There  Giacomo Poderi  Green Eggs and Ham  Handhelds, Linux and Heroes  HennR's FSFE blog  Henri Bergius  Hook’s Humble Homepage  Hugo - FSFE planet  Inductive Bias  Jelle Hermsen » English  Jens Lechtenbörger » English  Karsten on Free Software  Losca  MHO  Mario Fux  Martin's notes - English  Matej's blog » FSFE  Matthias Kirschner's Web log - fsfe  Michael Clemens  Myriam's blog  Mäh?  Nice blog  Nico Rikken » fsfe  Nicolas Jean's FSFE blog » English  Nikos Roussos - opensource  PB's blog » en  Paul Boddie's Free Software-related blog » English  Planet FSFE on Iain R. Learmonth  Po angielsku —  Posts - Carmen Bianca Bakker  Posts on Hannes Hauswedell's homepage  Pressreview  Ramblings of a sysadmin (Posts about planet-fsfe)  Rekado  Repentinus » English  Riccardo (ruphy) Iaconelli - blog  Saint's Log  Seravo  TSDgeos' blog  Tarin Gamberini  Technology – Intuitionistically Uncertain  The Girl Who Wasn't There » English  The trunk  Thib's Fellowship Blog » fsfe  Thinking out loud » English  Thomas Løcke Being Incoherent  Told to blog - Entries tagged fsfe  Tonnerre Lombard  Torsten's FSFE blog » english  Viktor's notes » English  Vitaly Repin. Software engineer's blog  Weblog  Weblog  Weblog  Weblog  Weblog  Weblog  With/in the FSFE » English  a fellowship ahead  agger's Free Software blog  anna.morris's blog  ayers's blog  bb's blog  blog  drdanzs blog » freesoftware  egnun's blog » FreeSoftware  english – Davide Giunchi  foss – vanitasvitae's blog  free software blog  freedom bits  gollo's blog » English  julia.e.klein's blog  marc0s on Free Software  mkesper's blog » English  pichel's blog  polina's blog  rieper|blog » en  softmetz' anglophone Free Software blog  stargrave's blog  tobias_platen's blog  tolld's blog  wkossen's blog  yahuxo's blog