rust-embedded/rust-raspberrypi-OS-tutorials

Unequal delays between CPU and BCM timers

tylerchr opened this issue · 1 comments

I think there are two timing-related logic bugs in the implementation of delays.rs in 09_delays. Each is a source of divergent behavior between the CPU-based timers and the BCM ones, such that delays::SysTmr::wait_msec_st(n) and delays::wait_msec(n) wait for different durations.

Bug 1: Bad Comparison

The first is in SysTmr::wait_msec_st, which is documented to "Wait N microsec" using the Pi's BCM system timer (empirically this seems to increment monotonically at a rate of 1Mhz).

loop {
    if self.get_system_timer() < (t + n) {
        break;
    }
}

I think this needs to be > (t + n). Since the timer increases monotonically, don't we want to break only when the system timer reaches t + n? As written, this code will break in the very first iteration for any meaningful value of n because the system timer will be < (t + n) for the entire duration of the intended wait.

UPDATE: Looking at the original C source, I can see how this came to be. A more direct port of the [correct] original could be this:

while self.get_system_timer() < (t + n) {
    asm::nop();
}

Bug 2: Bad Unit Conversion

The second bug is an incorrect unit conversion in the delays::wait_msec function. It's also documented to "Wait N microsec" but uses the ARM CPU timer. The conversion occurs in this line:

// Calculate number of ticks
let tval = (frq as u32 / 1000) * n;

Dividing by 1000 gives the number of ticks per millisecond but as n is in microseconds, tval winds up being "milliticks" rather than just ticks.

A more correct implementation would divide by 1,000,000:

// Calculate number of ticks
let tval = ((frq as u64) * (n as u64) / 1_000_000) as u32;

Funnily, this doesn't always result in 1000X longer waits. On my Pi hardware, where frq is reported as 0x0124F800 = 19,200,000 Hz, calling delays::wait_msec(1_000_000) causes tval to overflow the 32-bit register and waits 2_020_130_816 ticks, or only 1m45s.

(Also, this version does the math in u64 for accuracy; the simpler (frq as u32 / 1000000) * n implementation results delays::wait_msec(1_000_000) ending nearly 11 milliseconds early due to integer truncation of 19.2 to 19.)

Hi,

Thanks for the detailed report. Really looks like some blunders happened there...
Will fix it in a few days, currently no access to a suitable machine.

I'm glad to take a PR as well if you want to take it :)

For bug 1 I would prefer the loop version, it is more readable imo.