SensorsIot/IOTstack

Urgent Raspberry Pi OS patch: "Fatal error - unreachable code" or "Error obtaining system time" or "libseccomp2"

Paraphraser opened this issue · 7 comments

This is an advisory for IOTstack users running Raspberry Pi OS Buster

Alpine Linux 3.13 introduced a dependency on libseccomp2 which isn't satisfied via the normal Raspberry Pi OS update mechanism. The dependency has to do with 64-bit date support, which is needed to avoid the Y2038 date problem (akin to the Y2K problem).

Many containers supported by IOTstack are based on Alpine Linux. This is a small sample:

  • containers not yet affected:

    • Node-RED 2.0.6 (Alpine Linux 3.11.12)
    • Zigbee2MQTT 1.21.1 (Alpine Linux 3.12.8)
  • containers known to be affected:

    • Grafana 8.1.2 (Alpine Linux 3.13.5)
    • Mosquitto 2.0.12 (Alpine Linux 3.14.2)

When the Docker Hub image for any container is updated to Alpine Linux 3.13 or later and the updated image is pulled down to your Raspberry Pi, the container will go into a restart loop. The actual error messages vary but may include:

  • "Fatal error - unreachable code" or
  • "Error obtaining system time"

If you have just updated a container that was previously working quite happily but which has now started to crash, this dependency on libseccomp2 may be the explanation.
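One way to test that suspicion is to scan a container's recent log for the messages listed above. The following is a minimal sketch, not part of IOTstack: the function name `has_time_symptom` is hypothetical, and the third pattern is taken from the Home Assistant log message mentioned later in this thread.

```shell
#!/bin/sh
# has_time_symptom (hypothetical helper): reads log text on stdin and
# succeeds if any of the known libseccomp2-related messages appears.
has_time_symptom() {
  grep -F -q -e 'Fatal error - unreachable code' \
             -e 'Error obtaining system time' \
             -e "can't initialize time"
}

# Example usage (not run here; assumes a container named "mosquitto"):
#   docker ps --filter status=restarting --format '{{.Names}}'
#   docker logs --tail 50 mosquitto 2>&1 | has_time_symptom \
#     && echo "mosquitto shows a libseccomp2-style symptom"
```

A match only suggests the cause. Other containers may log different messages, so a miss does not rule the problem out.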

It is easy to check if a container is built on top of Alpine Linux. For example, to perform the check for Mosquitto:

$ docker exec mosquitto cat /etc/os-release | head -1
NAME="Alpine Linux"

Unfortunately, that check only works if the container stays up long enough to run the command. If you've just updated a container and it has started crashing, you're in a "catch-22" situation.
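For a container that does stay up, the same idea can be extended to read the Alpine version and compare it with the 3.13 threshold described above. This is a sketch under assumptions: `alpine_is_affected` is a hypothetical helper name, and it relies on Alpine images providing /etc/alpine-release with a version string such as 3.14.2.

```shell
#!/bin/sh
# alpine_is_affected VERSION (hypothetical helper): succeeds when VERSION
# is 3.13 or later, the point at which the libseccomp2 dependency applies.
alpine_is_affected() {
  major=${1%%.*}      # text before the first dot
  rest=${1#*.}        # text after the first dot
  minor=${rest%%.*}   # second field
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 13 ]; }
}

# Example usage (not run here; assumes a running container named "mosquitto"):
#   ver=$(docker exec mosquitto cat /etc/alpine-release)
#   alpine_is_affected "$ver" && echo "Alpine $ver needs the patched libseccomp2"
```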

Regardless of whether any of your containers is affected by this problem right now, it is highly likely that you will run into this problem soon, so it is prudent to apply the patch as soon as possible.

The following repeats the material in Getting Started - recommended patch #2. Do not apply this patch on Bullseye. Only run these commands on Buster.

$ sudo apt-key adv --keyserver hkps://keyserver.ubuntu.com:443 --recv-keys 04EE7237B7D453EC 648ACFD622F3D138
$ echo "deb http://httpredir.debian.org/debian buster-backports main contrib non-free" | sudo tee -a "/etc/apt/sources.list.d/debian-backports.list"
$ sudo apt update
$ sudo apt install libseccomp2 -t buster-backports
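After the install, it can be worth confirming which libseccomp2 version the host now has. The sketch below is illustrative only. The 2.4.2 minimum is an assumption taken from the Alpine 3.13 release notes rather than from this advisory, and `version_ge` is a hypothetical helper that handles plain numeric dotted versions (it drops the Debian revision after the first hyphen and does not handle epochs or tildes).

```shell
#!/bin/sh
# version_ge A B (hypothetical helper): succeeds when dotted version A >= B.
version_ge() {
  a=${1%%-*}; b=${2%%-*}          # drop any Debian revision such as "-4"
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*}; y=${b%%.*}        # leading field of each version
    [ "${x:-0}" -gt "${y:-0}" ] && return 0
    [ "${x:-0}" -lt "${y:-0}" ] && return 1
    case $a in *.*) a=${a#*.} ;; *) a= ;; esac
    case $b in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 0                        # all fields equal
}

# Example usage (not run here):
#   installed=$(dpkg-query -W -f='${Version}' libseccomp2)
#   if version_ge "$installed" 2.4.2; then
#     echo "libseccomp2 $installed looks new enough"
#   else
#     echo "libseccomp2 $installed is older than 2.4.2"
#   fi
```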

Acknowledgement:

MariaDB and/or Ubuntu containers may also be affected

@ARodenboog reported the same problem with MariaDB - see Issue 400.

I run MariaDB as my nextcloud_db container. When I queried the underlying OS:

$ docker exec nextcloud_db cat /etc/os-release
cat: /etc/os-release: No such file or directory
	
$ docker exec nextcloud_db cat /proc/version
Linux version 5.10.60-v7l+ (dom@buildbot) (arm-linux-gnueabihf-gcc-8 (Ubuntu/Linaro 8.4.0-3ubuntu1) 8.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #1449 SMP Wed Aug 25 15:00:44 BST 2021

my container reports being based on Ubuntu. It isn't clear whether Alfred is running a MariaDB container based on a different image, or if Ubuntu also has a dependency on libseccomp2. Either way, the best advice is to apply the patch now.
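A caveat on the check above: /proc/version describes the kernel, which containers share with the host, so the "Ubuntu" in that output may reflect the toolchain used to build the kernel rather than the image's base. When /etc/os-release is missing, a sketch like the following tries a few other common release files instead. `probe_release` is a hypothetical helper; its arguments are the command prefix used to reach the container.

```shell
#!/bin/sh
# probe_release PREFIX... (hypothetical helper): runs "PREFIX cat FILE" for
# several common release files and prints the first one that has content.
probe_release() {
  for f in /etc/os-release /etc/lsb-release /etc/alpine-release \
           /etc/debian_version /etc/issue; do
    if out=$("$@" cat "$f" 2>/dev/null) && [ -n "$out" ]; then
      printf '%s:\n%s\n' "$f" "$out"
      return 0
    fi
  done
  return 1   # none of the files could be read
}

# Example usage (not run here; assumes a container named "nextcloud_db"):
#   probe_release docker exec nextcloud_db
```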

I rebuilt a test system without the libseccomp2 patch and analysed a selection of containers:

Container     | Status | Remark
adguardhome   | ⛔️     | Alpine 3.12.7
chronograf    |        | Debian
gitea         |        | Restart loop
grafana       | ⛔️     | Alpine 3.13.5 (see note 1)
homeassistant |        | log "can't initialize time" (see note 2)
homebridge    | ⛔️     | Alpine 3.12.7
httpd         |        | Debian
influxdb      |        | Debian
kapacitor     |        | Debian
mariadb       |        | Ubuntu (log complains about outdated libseccomp)
mosquitto     |        | Fails during Dockerfile run
nextcloud     |        | Dependency on nextcloud_db
nextcloud_db  |        | Ubuntu (log complains about outdated libseccomp)
nodered       | ⛔️     | Alpine 3.11.12
octoprint     |        | Debian
pihole        |        | Debian
portainer-ce  |        | (see note 3)
postgres      |        | Debian
telegraf      |        | Debian (but has dependency on Mosquitto)
wireguard     |        | Ubuntu
zigbee2mqtt   | ⛔️     | Alpine 3.12.8

Key | Meaning
    | Probably OK
⛔️  | At risk if underlying Alpine is updated to 3.13 or later
    | Failing now if libseccomp2 is not installed

Notes:

  1. The situation with Grafana 8.1.2 is unclear. It is based on Alpine 3.13.5 so it should be failing because of the missing libseccomp2, yet it seems to be working.
  2. This is the HomeAssistant container not hass.io.
  3. Portainer's ultra-secret approach to building the container makes internal investigation tricky so it is difficult to be sure whether it is ✅ or ⛔️.
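A survey like the one above can be partly automated for containers that stay up long enough to be queried. This is a rough sketch rather than the method used to build the table: `os_summary` is a hypothetical helper that reduces os-release text to its ID and VERSION_ID fields.

```shell
#!/bin/sh
# os_summary (hypothetical helper): reads os-release text on stdin and
# prints "<ID> <VERSION_ID>", substituting placeholders for missing fields.
os_summary() {
  awk -F= '
    $1 == "ID"         { gsub(/"/, "", $2); id = $2 }
    $1 == "VERSION_ID" { gsub(/"/, "", $2); ver = $2 }
    END {
      if (id == "")  id = "unknown"
      if (ver == "") ver = "?"
      print id, ver
    }
  '
}

# Example usage (not run here): list every running container with its base OS.
#   for c in $(docker ps --format '{{.Names}}'); do
#     printf '%s\t%s\n' "$c" \
#       "$(docker exec "$c" cat /etc/os-release 2>/dev/null | os_summary)"
#   done
```

Containers that lack /etc/os-release, or that are stuck in a restart loop, will show up as "unknown ?" and still need manual inspection.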

Sometimes the symptoms of this problem are obvious, sometimes subtle. Here are some examples:

  • Mosquitto

    [image: mosquitto]
  • Nextcloud_DB and MariaDB

    [images: nextcloud_db-mariadb-log, nextcloud_db-mariadb-exec]
  • Gitea

    [image: gitea-log]
  • Home Assistant

    [image: homeassistant-log]

Thank you for posting this! 🙇 I just got hit by this. mosquitto definitely failed; zigbee2mqtt possibly did too, but I'm not sure because I messed up rebuilding the image a couple of times and it hadn't actually updated before I ran the commands.

What a mess...

I only had an issue with homeassistant, grafana was fine. Odd

Hi, I'm also affected by this issue with the HomeAssistant image.
The container loops with this error.

So if it is not clear (it was not for me...), you have to upgrade your Pi OS and install a backport of the lib then reboot:
home-assistant/core#52855 (comment)

The advice has changed, slightly, because it now depends on whether you are running Raspbian Buster or Bullseye. The updated doco is currently here. When PR440 is applied, that will move into the official Getting Started document.

It's really kinda important not to make the mistake of applying the libseccomp2 patch to Bullseye because you just wind up with a broken mess and have to start over.
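One way to avoid that mistake is to check the OS codename before running the patch commands. The following is a minimal sketch, assuming /etc/os-release carries a VERSION_CODENAME line; the helper names `codename_of` and `is_buster` are hypothetical.

```shell
#!/bin/sh
# codename_of FILE (hypothetical helper): print VERSION_CODENAME from an
# os-release style file, with any surrounding quotes removed.
codename_of() {
  sed -n 's/^VERSION_CODENAME=//p' "$1" | tr -d '"'
}

# is_buster FILE (hypothetical helper): succeeds only for codename "buster".
is_buster() {
  [ "$(codename_of "$1")" = "buster" ]
}

# Example usage (not run here):
#   if is_buster /etc/os-release; then
#     sudo apt install libseccomp2 -t buster-backports
#   else
#     echo "Not Buster: skipping the libseccomp2 patch" >&2
#   fi
```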

@Paraphraser thanks. I got lucky applying patch 2, since I have that distro.

For the first patch, it seems I already had somehow "fixed it" by forcing the restart of dhcpcd every time it went down for the eth0 interface... (#bruteforce):

# crontab entry: run the check every 5 minutes
# */5 * * * * /home/pi/checknetworkup.sh

# checknetworkup.sh
#!/bin/bash

# Restart dhcpcd whenever eth0 is not reported as up.
if ip link show eth0 | grep -Eq 'UP,LOWER_UP.* state UP' ; then
  # echo "checknetworkup test... all fine !"
  exit 0
else
  echo "checknetworkup test... eth0 not up, here is the current state:"
  # Show eth0 itself rather than whichever other interfaces happen to be up.
  echo "checknetworkup : $(ip link show eth0)"
  echo "checknetworkup test... eth0 not up, restarting dhcpcd..."
  sudo systemctl restart dhcpcd
  echo "checknetworkup test... restarted."
fi

I did it slightly differently - see gist.

The title is slightly misleading - implies WiFi - but also works for eth. I have this on all my systems. If a system doesn't have eth0 active, I just comment out that line.