toitlang/jaguar

WATCHDOG INTERRUPT error

muhlpachr opened this issue · 9 comments

jag version
Version:         v1.0.5
SDK version:     v1.6.7
Build date:      2022-02-16T17:03:36Z

Stupid program:

main:
  i := 0
  while true:
    print "$i"
    i++

Reproducible on every run, with slight variation in the final count:

18059
18060
18061
----
Decode system message with:
----
jag decode WyNVBVVYU1UGdjEuNi43U1UAWyRVI1UQAAAAAAAAAAAAAAAAAAAAAFsjVQRVRVNVEldBVENIRE9HIElOVEVSUlVQVFNVAFsjVQJVU1sjVQtbI1UDVUZVAEkFnlsjVQNVRlUBVUVbI1UDVUZVAkkDSlsjVQNVRlUDSQhqWyNVA1VGVQRJEXNbI1UDVUZVBUkD/1sjVQNVRlUGSQQUWyNVA1VGVQdJA+9bI1UDVUZVCEkRhlsjVQNVRlUJSRGbWyNVA1VGVQpJA28=
[jaguar] INFO: program 3 terminated with exit code 1
WATCHDOG INTERRUPT error.
  0: watchdog_                 <sdk>\core\exceptions.toit:202:1
  1: main                      test.toit:1:1
  2: __entry__.<lambda>        <sdk>\core\entry.toit:48:20

Why did it fail with WATCHDOG INTERRUPT?
I would have expected something like an output stream overrun instead.

I think there are two potential issues here (which are mostly related):

  1. print doesn't yield
  2. print is blocking.

In Toit a program should yield from time to time to give other tasks the option to run. If it doesn't, then a watchdog wakes up and interrupts the program. This is for two reasons:

  • make sure that multiple tasks can make progress
  • avoid a device being "bricked" because it is in an infinite loop.

Yielding can be done in several ways:

  • yield is the obvious one, but
  • primitives that block (like sending data over WiFi, ...) also yield so that other tasks can run.

In the toit.io version, print yields (also because the data is sent to the cloud).
With Jaguar, however, we have replaced print with a call to the UART, and that doesn't seem to yield anymore.

In theory, the program can generate data much faster than the UART can consume it, so it should automatically yield, as the program is blocked by the speed of the UART. This is a bug that is in the process of being fixed: toitlang/toit#469

However, that fix doesn't yet make print yield when it has to wait for the UART buffer to flush. This is the second issue which we need to address.

/cc @erikcorry

Thank you for the detailed explanation.

Once you are satisfied with the UART and print implementations, the tutorials at https://docs.toit.io/tutorials should probably highlight, with examples, which functions are blocking or non-blocking and which ones yield or don't yield.

The description of yielding in https://docs.toit.io/language/tasks is good, but it may come too late.

Maybe making this behavior visible in the source code (at the places where functions are called) would also help. I am not sure how to do that well.

Why does this program show the same behavior?

main:
  i := 0
  while true:
    print "$i"
    i++
    yield

And this one as well:

main:
  i := 0
  while true:
    with_timeout --ms=1: print "Test: $i"
    i++
    yield

Florian's explanation is a little bit off. It is all about yielding from the process, not from the individual tasks. With the current setup, a process needs to yield at least every 10s in order not to trigger the watchdog interrupt. All processes are automatically preempted at regular intervals, so we can run N processes on M operating system threads (N > M) without starving any of them. We don't need them to yield for that to work.

If you have a single task running in your process it is enough to wait for a resource or simply sleep --ms=1, but it is somewhat annoying. I'm considering adding programmatic support for turning the watchdog on/off for processes.

The idea with the watchdog was to avoid having processes that get stuck in a busy loop, but maybe it is less helpful than we thought.

Thank you for the explanation. I consider the watchdog a good idea; disabling it could be dangerous, in my opinion.
It could let a faulty implementation appear to work even though it has issues.
How can I wait for a CPU resource, or make the process yield (not a task yield), without using sleep?

I think you have to sleep --ms=1 at the moment.
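
Applied to the original program, that workaround might look like the sketch below. It assumes the SDK version reported above (v1.6.7) and relies only on the sleep --ms=1 call mentioned in this thread. Sleeping on every iteration is an illustrative choice, and this thread does not confirm whether it avoids the watchdog in every case.

```toit
main:
  i := 0
  while true:
    print "$i"
    i++
    // Workaround suggested in this thread: a short sleep lets the whole
    // process yield. A plain `yield` only switches between tasks and,
    // as reported above, does not prevent the watchdog interrupt.
    sleep --ms=1
```

The cost is at least one millisecond of delay per loop iteration, so the counter advances far more slowly than in the original busy loop.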

OK, thank you.

Fixed in SDK v2.0.0-alpha.10.