ironsheep/RPi-Reporter-MQTT2HA-Daemon

No updates when running "in background"

aebgit opened this issue · 9 comments

Checklist:

  • I updated to the latest version available
  • I checked that my MQTT broker is otherwise working

Release with the issue:
[2023-04-28 14:00:16] - ISP-RPi-mqtt-daemon.py v1.8.4

Last working release (if known):
n/a

Hardware, Operating System, Python version:

  • Raspberry Pi 4 Modell B; 4 GB, ARM-Cortex-A72 4 x, 1,50 GHz, 4 GB RAM
# uname -a
Raspberry Pi 4 Modell B; 4 GB, ARM-Cortex-A72 4 x, 1,50 GHz, 4 GB RAM

# cat /etc/debian_version 
11.6

# cat /etc/os-release 
PRETTY_NAME="Raspbian GNU/Linux 11 (bullseye)"
NAME="Raspbian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"

Description of problem:

I run 5 Raspis with RPi-Reporter-MQTT2HA-Daemon in a plain network connected to Mosquitto broker and then to Homeassistant.
I followed the installation guidelines straight forward and experienced no problems at all.

Four of them work fine, one has an issue. When I start the script manually via shell as unprivileged user or via sudo , it runs without problems:
# python3 /opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py.

However, the moment I start the script via systemctl, the following happens:

  • the first set of telemetry is sent to the MQTT broker
  • after that, there is NO other refresh of the information.
  • The process is still alive (#ps -fax)

As a workaround I disabled it in systemctl and have added it to roots crontab, but the issue remained.
@reboot /usr/bin/python3 -u /opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py

I have also started it on a shell with debugging and verbose logging enabled, but could not spot any issues.
sudo python3 /opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py -d -v
...the only curious line I could find is this one, which appears occasionally in a different color:
[2023-04-28 14:00:19] - << INTR(0) >> Time to report! (14:00:19 - 2023/04/28)

Run our report script 'genBugInfo' on your failing device and include the output here:

# SCRIPT genBugInfo v1.1 run 23/04/28-14:16:54
# ----------------------------------------------------------------------

# /bin/cat /etc/apt/sources.list | /bin/egrep -v '#'
  
deb http://raspbian.raspberrypi.org/raspbian/ bullseye main contrib non-free rpi

 ----

# /bin/cat /etc/apt/sources.list | /bin/egrep -v '#' | /usr/bin/awk '{ print $3 }' | /bin/grep . | /usr/bin/sort -u | head -1
  
bullseye

 ----

# /bin/uname -r
  
6.1.21-v8+

 ----

# /bin/hostname -f
  
display1

 ----

# /usr/bin/uptime
  
 14:16:54 up 11:16,  3 users,  load average: 0.19, 0.25, 0.21

 ----

# /sbin/ifconfig
  
br-e454e7933844: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet <redacted>  netmask 255.255.0.0  broadcast <redacted>
        inet6 <redacted>  prefixlen 64  scopeid 0x20<link>
        ether <redacted>  txqueuelen 0  (Ethernet)
        RX packets 124251  bytes 11953095 (11.3 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 260768  bytes 347705178 (331.5 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet <redacted>  netmask 255.255.0.0  broadcast <redacted>
        ether <redacted>  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether <redacted>  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 6508  bytes 631412 (616.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 6508  bytes 631412 (616.6 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth4c9b962: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet <redacted>  netmask 255.255.0.0  broadcast <redacted>
        inet6 <redacted>  prefixlen 64  scopeid 0x20<link>
        ether <redacted>  txqueuelen 0  (Ethernet)
        RX packets 124213  bytes 13686696 (13.0 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 261413  bytes 347910389 (331.7 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet <redacted>  netmask 255.255.255.0  broadcast <redacted>
        inet6 <redacted>  prefixlen 64  scopeid 0x0<global>
        inet6 <redacted>  prefixlen 64  scopeid 0x20<link>
        ether <redacted>  txqueuelen 1000  (Ethernet)
        RX packets 573845  bytes 441732653 (421.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 142838  bytes 17020364 (16.2 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


 ----

# /sbin/ifconfig | /bin/egrep 'Link|flags|inet|ether' | /bin/egrep -v -i 'lo:|loopback|inet6|\:\:1|127\.0\.0\.1'
  
br-e454e7933844: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet <redacted>  netmask 255.255.0.0  broadcast <redacted>
        ether <redacted>  txqueuelen 0  (Ethernet)
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet <redacted>  netmask 255.255.0.0  broadcast <redacted>
        ether <redacted>  txqueuelen 0  (Ethernet)
eth0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether <redacted>  txqueuelen 1000  (Ethernet)
veth4c9b962: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet <redacted>  netmask 255.255.0.0  broadcast <redacted>
        ether ea:af:32:f6:f8:e4  txqueuelen 0  (Ethernet)
wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet <redacted>  netmask 255.255.255.0  broadcast <redacted>
        ether e4:5f:01:f3:71:30  txqueuelen 1000  (Ethernet)

 ----

# /sbin/route
  
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         <redacted> 0.0.0.0         UG    303    0        0 wlan0
link-local      0.0.0.0         255.255.0.0     U     209    0        0 veth4c9b962
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
172.18.0.0      0.0.0.0         255.255.0.0     U     0      0        0 br-e454e7933844
192.168.178.0   0.0.0.0         255.255.255.0   U     303    0        0 wlan0

 ----

# /bin/ls -l /var/log/dpkg.log /var/log/dpkg.log.1 2>/dev/null
  
-rw-r--r-- 1 root root 45433 Apr 16 13:01 /var/log/dpkg.log
-rw-r--r-- 1 root root  6847 Mar  2 19:36 /var/log/dpkg.log.1

 ----

# /bin/grep 'status installed' /var/log/dpkg.log /var/log/dpkg.log.1 2>/dev/null | sort | tail -1
  
/var/log/dpkg.log:2023-04-16 13:01:02 status installed mailcap:all 3.69

 ----

# /bin/df -m
  
Filesystem     1M-blocks  Used Available Use% Mounted on
/dev/root          29644  6969     21418  25% /
devtmpfs            1640     0      1640   0% /dev
tmpfs               1898     0      1898   0% /dev/shm
tmpfs                759     2       758   1% /run
tmpfs                  5     1         5   1% /run/lock
/dev/mmcblk0p1       255    52       204  21% /boot
tmpfs                380     1       380   1% /run/user/1000

 ----

# /bin/df -m | /usr/bin/tail -n +2 | /bin/egrep -v 'tmpfs|boot'
  
/dev/root          29644  6969     21418  25% /

 ----

# ls -l /opt/vc/bin/vcgencmd /usr/bin/vcgencmd
  
ls: cannot access '/opt/vc/bin/vcgencmd': No such file or directory
-rwxr-xr-x 1 root root 13948 Mar 22 17:33 /usr/bin/vcgencmd

 ----

Python errors shown in the logs (if applicable):


Additional information:

...should have done this earlier. Here is some additional detail from journalctl -b --no-pager -u isp-rpi-reporter.service , but I have difficulties interpreting it:

Apr 28 13:38:16 display1 systemd[1]: isp-rpi-reporter.service: Succeeded.
Apr 28 13:38:16 display1 systemd[1]: isp-rpi-reporter.service: Consumed 15.902s CPU time.
Apr 28 13:38:19 display1 systemd[1]: isp-rpi-reporter.service: Scheduled restart job, restart counter is at 4.
Apr 28 13:38:19 display1 systemd[1]: Stopped RPi Reporter MQTT Client/Daemon.
Apr 28 13:38:19 display1 systemd[1]: isp-rpi-reporter.service: Consumed 15.902s CPU time.
Apr 28 13:38:19 display1 systemd[1]: Starting RPi Reporter MQTT Client/Daemon...
Apr 28 13:38:23 display1 systemd[1]: Started RPi Reporter MQTT Client/Daemon.
Apr 28 13:38:24 display1 python3[16402]: Traceback (most recent call last):
Apr 28 13:38:24 display1 python3[16402]:   File "/opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py", line 1809, in <module>
Apr 28 13:38:24 display1 python3[16402]:     afterMQTTConnect()  # now instead of after?
Apr 28 13:38:24 display1 python3[16402]:   File "/opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py", line 1799, in afterMQTTConnect
Apr 28 13:38:24 display1 python3[16402]:     handle_interrupt(0)
Apr 28 13:38:24 display1 python3[16402]:   File "/opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py", line 1782, in handle_interrupt
Apr 28 13:38:24 display1 python3[16402]:     update_values()
Apr 28 13:38:24 display1 python3[16402]:   File "/opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py", line 1763, in update_values
Apr 28 13:38:24 display1 python3[16402]:     getSystemTemperature()
Apr 28 13:38:24 display1 python3[16402]:   File "/opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py", line 968, in getSystemTemperature
Apr 28 13:38:24 display1 python3[16402]:     interpretedTemp = float(rpi_gpu_temp_raw)
Apr 28 13:38:24 display1 python3[16402]: ValueError: could not convert string to float: "Can't open device file: /dev/vcio\nTry creating a device file with: sudo mknod /dev/vcio c 100 0"
Apr 28 13:43:24 display1 python3[16402]: Exception in thread Thread-5:
Apr 28 13:43:24 display1 python3[16402]: Traceback (most recent call last):
Apr 28 13:43:24 display1 python3[16402]:   File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner
Apr 28 13:43:24 display1 python3[16402]:     self.run()
Apr 28 13:43:24 display1 python3[16402]:   File "/usr/lib/python3.9/threading.py", line 1266, in run
Apr 28 13:43:24 display1 python3[16402]:     self.function(*self.args, **self.kwargs)
Apr 28 13:43:24 display1 python3[16402]:   File "/opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py", line 1490, in periodTimeoutHandler
Apr 28 13:43:24 display1 python3[16402]:     handle_interrupt(TIMER_INTERRUPT)  # '0' means we have a timer interrupt!!!
Apr 28 13:43:24 display1 python3[16402]:   File "/opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py", line 1782, in handle_interrupt
Apr 28 13:43:24 display1 python3[16402]:     update_values()
Apr 28 13:43:24 display1 python3[16402]:   File "/opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py", line 1763, in update_values
Apr 28 13:43:24 display1 python3[16402]:     getSystemTemperature()
Apr 28 13:43:24 display1 python3[16402]:   File "/opt/RPi-Reporter-MQTT2HA-Daemon/ISP-RPi-mqtt-daemon.py", line 968, in getSystemTemperature
Apr 28 13:43:24 display1 python3[16402]:     interpretedTemp = float(rpi_gpu_temp_raw)
Apr 28 13:43:24 display1 python3[16402]: ValueError: could not convert string to float: "Can't open device file: /dev/vcio\nTry creating a device file with: sudo mknod /dev/vcio c 100 0"

I should have checked the list before upgrading. I am now having the same problem after updating to latest version.

I was 3 versions back and worked ok.

I am having the same issue on the same platform on both .83 and .84. Runs fine when started interactively, but has the same problem shown in the journalctl shown above.

@jandamf and @ornative something we don't understand but seems to be happening is that the daemon user appears to be having permissions reset when the OS update happened. As a result, the fix is to apply once again the permissions needed to run the script. (please let me know if this solves your problem.)

Here's my response to duplicate issue #94:

Since the errors you show appear to be access to temps can you please recheck that your daemon account has the correct permissions?
https://github.com/ironsheep/RPi-Reporter-MQTT2HA-Daemon#set-up-daemon-account-to-allow-access-to-temperature-values

(Since you can run interactively, it seems to be an issue with the Daemon account.)

@ironsheep

Thanks for the suggestion. I re-applied the permissions and that solved the problem.

That did correct the problem. Interesting, would be nice to know what element in the update is messing with the permissions.

Yep, if we ever find out I'll post more in these closed issues...

Solved. Thank you!!

I know this issue is closed, but could someone please tell me what their permissions are on /dev/vcio I'm running this in dietpi, and /dev/vcio exists, but has perms of crw------- 1 root root 10, 125 Jan 16 14:20 /dev/vcio. I suspect others are seeing root:video. Before I chase this down with dietpi, what's the output of ls -hl /dev/vcio?