dhiltonp/hexbright

set_light is not guaranteed to reach end_level in given time

Opened this issue · 12 comments

When calling hb.set_light(startLevel,endLevel,duration), the underlying code does not ensure that endLevel is actually reached in the duration specified.

tl;dr version: light level changes should be based on actual elapsed time, rather than an assumed interval and a counter.

This can easily be checked with code that runs a triangle wave, using a 1000ms timer to toggle between the two calls:
hb.set_light(0,500,1000);
hb.set_light(500,0,1000);
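
For reference, here is roughly what such a test program could look like. This is a sketch under assumptions, not code from the library's examples: it presumes the usual hexbright scaffolding (hexbright hb, hb.init_hardware(), hb.update()) and uses a plain millis() comparison as the 1000ms timer.

#include <hexbright.h>

hexbright hb;

unsigned long last_toggle = 0;
boolean going_up = true;

void setup() {
  hb.init_hardware();
}

void loop() {
  hb.update();
  // Every 1000ms, start the opposite ramp - regardless of whether the
  // previous ramp actually reached its end level in time.
  if (millis() - last_toggle >= 1000) {
    last_toggle = millis();
    if (going_up)
      hb.set_light(0, 500, 1000);
    else
      hb.set_light(500, 0, 1000);
    going_up = !going_up;
  }
}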

If the light does not pulse reasonably smoothly (specifically, if it ramps up, then flashes bright before ramping back down), that indicates the first command did not reach the desired endLevel before the timer expired; since the second command sets its startLevel explicitly, the light appears to 'flash' bright / dark at the transition.
This can become readily apparent just by setting DEBUG = 2 (DEBUG_ON).

The main cause of this problem is in how change_done is treated, but to explain that, I'll backtrack a bit.
set_light() checks the startLevel and endLevel values and computes change_duration - the number of update steps over which the light should change between the two levels - by taking the number of milliseconds specified and dividing it by update_delay (8.3333333 ms).
The main loop calls adjust_light(), which calls get_max_light_level(), which calls get_light_level() which in turn performs the following pseudomath...
intensity = startIntensity + (intensityDifference) * (percentChangeComplete).
...where percentChangeComplete is calculated as:
change_done / change_duration.
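
As a standalone C++ restatement of that pseudomath (names follow this issue's description, not the library source verbatim):

// percentChangeComplete = change_done / change_duration, as above.
int counter_based_level(int start_level, int end_level,
                        int change_done, int change_duration) {
  float percent_done = (float)change_done / change_duration;   // percentChangeComplete
  return start_level + (int)((end_level - start_level) * percent_done);
}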

So where and how is change_done - essentially how far along the intensity change is - actually calculated? It turns out that it is initialized to 0 in set_light, and is then modified in adjust_light(), specifically as follows:
change_done++;

This, too, may seem reasonable: every time adjust_light is called, change_done is incremented, and eventually it reaches the number set in change_duration and the process is done. The problem, however, is that this assumes adjust_light is called exactly at update_delay intervals. I.e. given the above example code:
change_duration = 1000ms / 8.3333333ms = 120
If adjust_light is called every 8.3333333ms, then change_done has to be incremented 120 times, and the elapsed time will match the 1000ms:
120 * 8.3333333ms = 1000ms
But if, instead, any code slows that down even just a little, like the debug code, it instead ends up as:
120 * 10ms = 1200ms
Now generally, that's not a big deal; 1 second versus 1.2 seconds is not something you would notice. It does matter in the example above, though, where a timer triggers the ramp-down after 1000ms, putting an actual time constraint on the process. Given that each step is supposed to increase the light level as follows:
(500 - 0) / 120 ~= 4.167
And given that adjust_light called every 10ms performs only 100 steps in those 1000ms, the actual intensity at the end is:
100 * 4.167 ~= 417.
This is quite a bit less bright than was intended.

When writing programs, this can be partially mitigated (beyond adjusting the timing values to counteract the above effects).
The author can pass CURRENT_LEVEL as the start level of any ends-matched sequence instead. The desired level may not be reached, but since the new sequence starts from the actual level the previous sequence ended at, there is no visual discontinuity.
Alternatively, light_change_remaining() can be used to detect when set_light() has actually finished before starting a new sequence. This ensures that the desired level is reached, at the expense of highly variable timing. Sketches of both workarounds follow.
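
To illustrate both workarounds, a rough sketch (assuming the usual hb.update() loop; next_level here is a hypothetical state variable the program would maintain itself):

// Workaround 1: ends-matched sequences. Start each ramp from wherever the
// light actually is, so a shortfall never shows up as a visible jump.
hb.set_light(CURRENT_LEVEL, 500, 1000);   // ramp up from the current level
// ... and on the next 1000ms timer tick ...
hb.set_light(CURRENT_LEVEL, 0, 1000);     // ramp back down from wherever we got to

// Workaround 2: only start the next ramp once the previous one has finished,
// accepting that the overall period will stretch when the loop runs slow.
if (hb.light_change_remaining() == 0) {
  hb.set_light(CURRENT_LEVEL, next_level, 1000);
}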

In terms of the library, this could be fixed by mapping the change across actual time - e.g. using millis(). The upside is that the actual behavior then matches the expected behavior: set_light(0,500,1000) will ramp up the light in 1000ms with an error margin no greater than a single pass through the library and the user's code, rather than the accumulated drift of a great number of passes. The downside is that this is likely to take up more space; a rather naive implementation (lots of intermediate variables) added 60 bytes. Ouch.

This is true, and is important for users of the library to realize.

update_delay used to be a constructor argument, but was set to 8.333333
when accelerometer support was added; the accelerometer updates 120 times
per second.

I'm afk atm, and will not have service the next 2 days - keep thinking
about things, and I should be able to compose an in-depth response on
Monday.

What about this?

// millis() timestamps marking the start and end of the current light change
unsigned long start_time = 0;
unsigned long end_time = 0;

void hexbright::set_light(int start_level, int end_level, long time) {
  int current_level = get_light_level();
  start_light_level = start_level == CURRENT_LEVEL ? current_level : start_level;
  end_light_level   = end_level   == CURRENT_LEVEL ? current_level : end_level;

  start_time = millis();
  end_time = start_time + time;

  change_duration = time;  // nonzero also marks 'change in progress' for adjust_light()
}

int hexbright::get_light_level() {
  unsigned long now = millis();
  if (now > end_time) {
    // the change is complete; latch the final level and stop adjusting
    change_duration = 0;
    return end_light_level;
  } else {
    unsigned long elapsed = now - start_time;
    return (end_light_level - start_light_level) * ((float)elapsed / change_duration) + start_light_level;
  }
}

int hexbright::get_max_light_level() {
  int light_level = get_light_level();

  if(light_level>max_light_level)
    return max_light_level;
  return light_level;
}

int hexbright::light_change_remaining() {
  if (change_duration)
    return end_time - millis();
  else
    return 0;
}

void hexbright::adjust_light() {
  if(change_duration) {
    int light_level = hexbright::get_max_light_level();
    set_light_level(light_level);
  }
}

void hexbright::apply_max_light_level() {
  // if max_light_level has changed, guarantee a light adjustment:
  // the second test guarantees that we won't turn on if we are
  //  overheating and just shut down
  if(max_light_level < MAX_LEVEL && get_light_level()>MIN_OVERHEAT_LEVEL) {
#if (DEBUG!=DEBUG_OFF && DEBUG!=DEBUG_PRINT)
    Serial.print("Max light level: ");
    Serial.println(max_light_level);
#endif
    if (change_duration==0)
      change_duration = 1;
  }
}

Josh, there are definitely some positives to this sort of code.

  • Probably easier to understand for newcomers.
  • Would work better for a dynamic event loop, a change that has been discussed in IRC.
  • Should be slightly faster (4 floating point operations instead of 5).

The main advantage with the current code is that it should be fairly obvious when the amount of user code is high enough to cause accelerometer sample loss.

Given those advantages, I'm not opposed to merging a change like this. If you're interested in developing this, I recommend using tests/light_test. DEBUG_LOOP will probably be useful as well.

The main advantage with the current code is that it should be fairly obvious when the amount of user code is high enough to cause accelerometer sample loss.

How so? You mean because the light starts acting funny?

The hope is that when they notice that the light is updating more slowly, they ask why.

I need to do some testing. I just got my first Hexbright. Do you know the typical cost of the update() routine itself off the top of your head? I.e., how many of the 8333 micros does the "OS" itself consume before we start counting user code? I'm trying to imagine how users could burn through all those cycles (other than laggy Serial code).

about 3000

If you run code with DEBUG_LOOP it will tell you.

Leaving in lots of Serial code is probably the main contributor.
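
If you want to measure it yourself rather than relying on DEBUG_LOOP, a rough sketch using the standard Arduino micros() call works too (my_user_code() is a hypothetical stand-in for whatever the program does; Serial output is throttled so it doesn't dominate the measurement):

#include <hexbright.h>

hexbright hb;

void my_user_code() {
  // hypothetical placeholder for the program's own logic
}

void setup() {
  Serial.begin(9600);
  hb.init_hardware();
}

void loop() {
  hb.update();
  unsigned long t0 = micros();
  my_user_code();                    // user code under test
  unsigned long elapsed = micros() - t0;

  static unsigned long worst = 0;
  if (elapsed > worst) worst = elapsed;

  static int count = 0;
  if (++count >= 120) {              // report roughly once per second
    count = 0;
    Serial.print("worst user-code time this second: ");
    Serial.print(worst);
    Serial.println(" us of the ~8333 us budget");
    worst = 0;
  }
}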

Have you thought about moving the accelerometer stuff to a timer interrupt, since it seems to be the most time-critical code? Then you'd never miss a 120Hz tick, and if the rest of the loop code were slower I bet in most cases it wouldn't matter.

Or isn't there an accelerometer interrupt also? I haven't gotten around to looking into interrupts yet.
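
For what it's worth, a rough sketch of the timer-interrupt idea, using the common TimerOne library purely as an illustration (poll_accelerometer() is a hypothetical stand-in for the library's actual register reads, and on the real board a timer would have to be picked that doesn't collide with whatever PWM drives the light):

#include <TimerOne.h>

volatile bool sample_ready = false;

void poll_accelerometer() {
  // hypothetical placeholder for the accelerometer read
}

void on_tick() {
  // Doing the read here would guarantee the 120Hz rate; setting a flag and
  // reading in loop() is simpler, but still slips if loop() runs long.
  sample_ready = true;
}

void setup() {
  Timer1.initialize(8333);           // ~120Hz period, in microseconds
  Timer1.attachInterrupt(on_tick);
}

void loop() {
  if (sample_ready) {
    sample_ready = false;
    poll_accelerometer();
  }
  // ... rest of the (possibly slow) user code ...
}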

Ah: the other thing that could cause long-running code is heavy math for the accelerometer. For example, get_spin consumes 2000 microseconds to do its calculation, the core of which is 2 atan operations.

I recall looking at the interrupt stuff, but I don't recall why I didn't use them. It was probably either due to some side-effect I read about, or simply that it looked like a lot of work!