heliosproj/HeliOS

What happens if totalRuntime has an overflow?

kwrtz opened this issue · 5 comments

kwrtz commented

Sorry, I dind't went through all the code lines, but I have a question regarding coop scheduling.

In:

 if (task->state == TaskStateRunning && task->totalRuntime < leastRuntime) {
        leastRuntime = task->totalRuntime;
        runningTask = task;

you are searching for the task with the least runtime.

In:

  if (runningTask) {
    taskStartTime = NOW();
    (*runningTask->callback)(runningTask->id);
    runningTask->lastRuntime = NOW() - taskStartTime;
    runningTask->totalRuntime += runningTask->lastRuntime;
  }

you write the totalRuntime runningTask->totalRuntime += runningTask->lastRuntime;.

What happens, if the totalRuntime has an overflow?
And all other coop tasks are just before an overflow.
Then this task has the least runtime and will always selected in

      if (task->state == TaskStateRunning && task->totalRuntime < leastRuntime) {
        leastRuntime = task->totalRuntime;
        runningTask = task;
      }

Is that correct? I haven't found a correction in the code for that issue.

@kwrtz that is a good observation. In that case, the task that overflows first could dominate the schedule until it's totalRuntime catches up to the other tasks which is a problem. Thank you for the observation, I will keep this issue open until it is corrected. In the meantime if there are other contributors who have suggestions I am open to ideas! Thanks.

I should have a fix for this. I pushed the changes to the develop branch. I haven't tested yet and there is some clean-up but hopefully I have fixed this by detecting the overflow and resetting all totalRuntime counters to the current lastRuntime. The good thing about this solution is HeliOSResetRuntime() only runs when totalRuntime overflows.

I copied the important bits of code below so you don't have to go hunt down the changes in the source tree.

  if (heliOSTimerOverfow)
    HeliOSResetRuntime();
inline void HeliOSRunTask(Task *task_) {
  HeliOSTime taskStartTime = 0;

  task_->prevTotalRuntime = task_->totalRuntime;
  taskStartTime = NOW();
  (*task_->callback)(task_->id);
  task_->lastRuntime = NOW() - taskStartTime;
  task_->totalRuntime += task_->lastRuntime;
  if (task_->totalRuntime < task_->prevTotalRuntime)
    heliOSTimerOverfow = TRUE;
}
void HeliOSResetRuntime() {
  Task *task = NULL;

  TaskListRewind();
  do {
    task = TaskListGet();
    if (task) {
      task->totalRuntime = task->lastRuntime;
      task->prevTotalRuntime = 0;
    }
  } while (TaskListMoveNext());
  heliOSTimerOverfow = FALSE;
}

If there are opportunities to optimize please let me know.

I may not need to store prevTotalRuntime in the Task struct. I only need it long enough to detect the overflow condition inside of HeliOSRunTask().

After some clean-up/refactoring. Compiles but will need to do some testing to ensure overflow is handled correctly.

/*
 * A new struct to keep various flags.
 */
typedef struct {
  bool	setupCalled;
  bool	critBlocking;
  bool	runtimeOverflow;
} Flags_t;
/*
 * Appears in of xHeliOSLoop() on line #49
 */
  if (flags.runtimeOverflow)
    RuntimeReset();
 inline void TaskRun(Task_t *task_) {
  Time_t taskStartTime = 0;
  Time_t prevTotalRuntime = 0;

  prevTotalRuntime = task_->totalRuntime;
  taskStartTime = NOW();
  (*task_->callback)(task_->id);
  task_->lastRuntime = NOW() - taskStartTime;
  task_->totalRuntime += task_->lastRuntime;
  if (task_->totalRuntime < prevTotalRuntime)
    flags.runtimeOverflow = true;
}
inline void RuntimeReset() {
  Task_t *task = NULL;

  TaskListRewind();
  do {
    task = TaskListGet();
    if (task)
      task->totalRuntime = task->lastRuntime;
  } while (TaskListMoveNext());
  flags.runtimeOverflow = false;
}

This is now fixed and will be available in the 0.2.5 release. Closing.