OPEnSLab-OSU/FeatherFault

WDT reset synchronization causes MARK to run slowly

kamocat opened this issue · 6 comments

Currently the the MARK macro takes nearly 5 milliseconds, which can add up quickly if used 20 times in a loop. I think most of this is in the string processing - extracting the filename from the FILE constant. If we move this step into the interrupt handler, we could reduce the time to copying and int and a pointer, plus the time to actually feed the watchdog. I expect this to take around 2 microseconds if the peripheral clock runs at 1MHz.

Here's my test code:

#include "FeatherFault.h"

void setup() {
  digitalWrite(13, LOW);
  delay(100);
  digitalWrite(13, HIGH);
  Serial.begin(115200);
  while(!Serial);
  FeatherFault::StartWDT(FeatherFault::WDTTimeout::WDT_8S); MARK;
  FeatherFault::PrintFault(Serial); MARK;
  Serial.flush(); MARK;
}

void loop() {
  long tic = micros();
  MARK;
  long toc = micros();
  Serial.println(toc-tic);
}

And a graph of the result
featherfault_MARK_time

I though the filename was extracted at compile time using the constexpr function _ShortFilePrivate::past_last_slash, I'm surprised this is not the case. My goal was for the string to be determined at compile time so that the overhead of MARK is simply copying two variables, as you mentioned.

Before we attempt to fix string parsing algorithm, I should check if it is actually possible to determine the filename entirely at compile time. If not, placing the processing in the interrupt is a good idea; In that case I would also rewrite _ShortFilePrivate::past_last_slash to use a for loop instead of recursion, which would speed it up immensely.

After examining the disassembly, it seems that the __SHORT_FILE__ macro is working as intended: the filename pointer is in fact generated at compile time, and the line number and filename are copied and stored using a grand total of eight instructions (movs * 2, ldr, str).

Examining FeatherFault::_Mark, there are still several possible causes. Every time the WDT is reset, there is a synchronization period to allow the WDT clock to catch up with the system clock:

static void WDTReset() {
while(WDT->STATUS.bit.SYNCBUSY);
WDT->CLEAR.reg = WDT_CLEAR_CLEAR_KEY;
}
Since the WDT is probably clocked very slowly, this could be the issue.

Another possible hangup is the use of atomic variables in FeatherFault::_Mark:

is_being_written.store(true);
last_line = line;
last_file = file;
is_being_written.store(false);
My goal for is_being_written was to allow the fault handler to detect if the fault data has been corrupted, and viewing the disassembly it seems fairly lightweight (compiling to dmb * 2 and some load/store instructions). It does create a barrier, however, and as a result may be the cause of the problem.

I found removing the atomic variable to make almost no difference in execution time, but removing the WDT reset to cut that time by a factor of ten. I will investigate if we can remove the synchronization for the WDT reset without breaking the WDT.

Slow synchronization seems to be an issue with the SAMD21 processor as a whole. Since there is no way to read the WDT counter, the only way I was able to lower the WDT synchronization time was to increase the WDT clock speed, which decreased the sync delay proportionally. I was not able to remove synchronization from the WDT reset without stalling the CPU, which caused other interesting problems with the serial monitor.

Hey, I have a crazy idea.
What if we set a global flag in MARK, and check the flag in the watchdog interrupt to decide whether to reset the MCU or to feed the watchdog? (Of course if we feed the watchdog we should clear the flag)
This effectively doubles the maximum watchdog period, and has the lowest possible overhead since the watchdog will only be fed at most once per period.

Good idea! I've implemented this solution in v1.1.4, and I'm going to consider this solved.