rbonghi/jetson_stats

Crash ZeroDivisionError

Closed this issue · 4 comments

Describe the bug

Under high load, where processes are frequently started and run only for short periods, it is possible to crash jtop with a ZeroDivisionError. This can happen when jtop enumerates a process almost exactly as it starts, so that the following expression is so close to zero that it is treated as zero:

proc_uptime = uptime - starttime

To Reproduce

Steps to reproduce the behavior:

  1. Create a script that repeatedly starts a short-lived process in an infinite loop, and run it in the background
  2. Open jtop
  3. Wait for the jtop service to crash
  4. See error
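Step 1 could be sketched roughly as follows. This is only an illustration, not the script used when the crash was first observed: the function name `spawn_short_lived` and its `duration_s` parameter are hypothetical, and the child process is simply a Python interpreter that exits immediately.

```python
import subprocess
import sys
import time


def spawn_short_lived(duration_s):
    """Start short-lived child processes back to back for about duration_s seconds.

    Hypothetical helper for reproducing the crash; returns the number of
    processes that were started.
    """
    deadline = time.monotonic() + duration_s
    count = 0
    while time.monotonic() < deadline:
        # Each child exits almost immediately, so any tool that enumerates
        # processes has a chance of catching one right after it started.
        subprocess.run([sys.executable, "-c", "pass"], check=False)
        count += 1
    return count
```

Calling `spawn_short_lived(float("inf"))` from a script launched in the background approximates the infinite loop described in step 1. How long it takes for jtop to hit the race is not known and will likely vary between systems.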

Expected behavior

jtop should not crash under such system loads.

Using proc_uptime = max(1, uptime - starttime) would solve the issue. It may cause the load of newly spawned processes (<1s old) to be underrepresented, but it would also prevent jtop from attempting to extrapolate process load when basically no statistics exist yet.
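A minimal sketch of that guard is shown below. The names `uptime`, `starttime`, `total_time` and `proc_uptime` come from this issue and the traceback; the function `cpu_percent_guarded` and the `min_uptime` parameter are assumptions made for illustration and are not part of jtop.

```python
def cpu_percent_guarded(total_time, uptime, starttime, min_uptime=1.0):
    """Compute a process CPU percentage without dividing by zero.

    Hypothetical helper: clamps the process uptime to at least min_uptime
    seconds before dividing, as proposed in this issue.
    """
    # A process observed right as it starts can have uptime == starttime,
    # which would otherwise make the divisor zero.
    proc_uptime = max(min_uptime, uptime - starttime)
    return 100 * (total_time / proc_uptime)


# A process seen at the instant it started no longer raises ZeroDivisionError.
print(cpu_percent_guarded(total_time=0.5, uptime=100.0, starttime=100.0))
# An older process is unaffected by the clamp.
print(cpu_percent_guarded(total_time=5.0, uptime=110.0, starttime=100.0))
```

The trade-off is the one described above: for processes younger than `min_uptime`, the reported load is computed against a full second and may therefore be lower than the true short-term load.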

Board

  • jetson-stats version: 2.4.7
  • Jetpack: 5.1.3
  • L4T: 35.5.0

It might not be super helpful, but I can confirm that this error has occurred on my system as well (or at least it does look the same). I had no idea how to reproduce it though, just experienced crashes from time to time.

Traceback
Process JtopServer-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.8/dist-packages/jtop/service.py", line 466, in run
    if self._timer_reader.close(timeout=TIMEOUT_SWITCHOFF):
  File "/usr/local/lib/python3.8/dist-packages/jtop/core/timer_reader.py", line 75, in close
    self._error_status()
  File "/usr/local/lib/python3.8/dist-packages/jtop/core/timer_reader.py", line 90, in _error_status
    raise ex_value
  File "/usr/local/lib/python3.8/dist-packages/jtop/core/timer_reader.py", line 46, in _timer_callback
    self._callback()
  File "/usr/local/lib/python3.8/dist-packages/jtop/service.py", line 605, in jtop_stats
    data = self.jtop_decode()
  File "/usr/local/lib/python3.8/dist-packages/jtop/service.py", line 569, in jtop_decode
    total, table = self.processes.get_status()
  File "/usr/local/lib/python3.8/dist-packages/jtop/core/processes.py", line 136, in get_status
    table = [self.get_process_info(prc[0], prc[3], prc[2], uptime) for prc in table]
  File "/usr/local/lib/python3.8/dist-packages/jtop/core/processes.py", line 136, in <listcomp>
    table = [self.get_process_info(prc[0], prc[3], prc[2], uptime) for prc in table]
  File "/usr/local/lib/python3.8/dist-packages/jtop/core/processes.py", line 110, in get_process_info
    cpu_percent = 100 * (total_time / proc_uptime)
ZeroDivisionError: float division by zero

Board

  • jtop: 4.2.6 (Python 3.8.10)
  • Jetpack: 5.1.2
  • L4T: 35.4.1

@soyszala I can confirm that this traceback is indeed the one I got for this issue.

Hi @evildeeds

Apologies for this very late reply, but I understand your issue. I'm implementing this check in the jtop code.

Fixed with the new release. Please update to 4.2.9

sudo pip3 install -U jetson-stats