djphazer/O_C-Phazerville

Display sometimes gets messed up with Teensy 4

Closed this issue · 10 comments

Pretty sure I made a mistake somewhere in the display driver for Teensy 4. Happens only rarely, but when it does the screen gets reversed or swapped or garbled.

HELP WANTED with finding ways to reproduce the problem. Game/Life applet (removed) was said to sometimes cause it, in Discord conversation on Feb 6, 2024.


FWIW there was a similar issue in the OG firmware. At least the symptoms match :)

The cause there was app processing in the ISR handler taking too long occasionally. This shifted the timing for the next two ISRs (one is short and late, then the next back on time, if that makes sense). It would then toggle the CS pin and update the DAC before the screen transfer was complete -- it wasn't waiting, just assumed it would be fine, so waiting for completion fixed it.

I might have a 'scope shot somewhere. I eventually tracked it down by triggering on deviation from the normal ISR timing.

@PaulStoffregen
So there is a scope capture here. In ASCII-art what was happening is

|60us  |      |      |      |      |      |
|dxxxx |dxxxx |dxxxx    |dxx|dxxxx |...
|.aa   |.aa   |.aaaaaaaa|.aa|.aa   |      |

where 'd' is the DAC update, 'xxxx' is the 128-byte screen DMA, and 'a' is app processing, which happens in parallel with the DMA.
So the app takes longer than usual for "reasons", the next ISR is late, the next next one intrudes on the running DMA (instead of waiting). Perhaps it's not applicable with the new drivers and performance but might be a clue...
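To make the timeline above concrete, here is a rough host-side model of it, not firmware code. The 60us period comes from the diagram; the function name, the 40us transfer time and the per-call app times are made-up numbers for illustration. It assumes a timer tick that arrives while an ISR is still running fires as soon as that ISR returns.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: each ISR call starts a screen transfer that runs for
// transfer_us, and the ISR itself stays busy for app_us[k].
// Returns the indices of ISR calls that start while the previous call's
// transfer is still running.
std::vector<std::size_t> FindCollisions(const std::vector<std::uint32_t>& app_us,
                                        std::uint32_t period_us,
                                        std::uint32_t transfer_us) {
  std::vector<std::size_t> collisions;
  std::uint64_t prev_start = 0;  // when the previous ISR (and its transfer) began
  std::uint64_t prev_end = 0;    // when the previous ISR returned
  for (std::size_t k = 0; k < app_us.size(); ++k) {
    const std::uint64_t tick = static_cast<std::uint64_t>(k) * period_us;
    // A late ISR cannot start before the previous one has returned.
    const std::uint64_t start = std::max(tick, prev_end);
    if (k > 0 && start < prev_start + transfer_us) {
      collisions.push_back(k);  // previous transfer has not finished yet
    }
    prev_start = start;
    prev_end = start + app_us[k];
  }
  return collisions;
}
```

With app times of {20, 20, 90, 20, 20, 20} us, the third call overruns, the fourth starts late, and the fifth (back on schedule) is the one that lands on a running transfer, which is the pattern in the diagram.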

The DMA transfer corruption theory seems quite plausible.
I know we have more processing power now, but the offending GameOfLife applet is doing some crazy stuff on every ISR call - a bunch of nested iteration and a local array of 640 bytes that gets computed and then memcpy'd to the main one. I can see how it occasionally overruns...

Another theory: maybe something is somehow drawing graphics out of bounds, occasionally corrupting memory beyond the framebuffer?

There were some paths in the graphics code that don't clip so that's not impossible :)
I'd think that would be more reproducible though? A simple test might be to put a fence buffer before & after the framebuffers and check it periodically for modification.
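A minimal sketch of the fence idea, assuming a single 128x64 1bpp framebuffer. The struct, the fence size and the 0xA5 pattern are all hypothetical, and the real firmware has more than one framebuffer, so this only shows the shape of the check:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical layout: guard bytes sit directly before and after the pixel
// data, so an out-of-bounds draw is likely to land in a guard first.
constexpr std::size_t kFrameSize = 128 * 64 / 8;  // 1024 bytes, 1 bit per pixel
constexpr std::size_t kFenceSize = 16;            // arbitrary guard size
constexpr std::uint8_t kFencePattern = 0xA5;      // arbitrary marker value

struct FencedFrameBuffer {
  std::uint8_t fence_before[kFenceSize];
  std::uint8_t pixels[kFrameSize];
  std::uint8_t fence_after[kFenceSize];

  // Fill both fences with the marker pattern and clear the pixels.
  void Init() {
    std::memset(fence_before, kFencePattern, kFenceSize);
    std::memset(pixels, 0, kFrameSize);
    std::memset(fence_after, kFencePattern, kFenceSize);
  }

  // Returns true while every fence byte still holds the marker pattern.
  bool FencesIntact() const {
    for (std::size_t i = 0; i < kFenceSize; ++i) {
      if (fence_before[i] != kFencePattern) return false;
      if (fence_after[i] != kFencePattern) return false;
    }
    return true;
  }
};
```

Calling FencesIntact() periodically from the main loop (not the ISR) and logging the first failure would show whether anything draws outside the buffer. A stray write that happens to store 0xA5 would go unnoticed, so treat a clean result as a hint rather than proof.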

The original fix (and a gfx bugfix) are in this branch

On the OG o_C the corruption was pretty random even if the app was fairly consistently using too much time (e.g. Piqued was pretty high load). IIRC it became more obvious (i.e. no longer ignorable) after a compiler change which probably shifted some timing around. Since it doesn't recover I assume it's mostly the DC pin being inconveniently toggled and reconfiguring something...

So from a quick skim (I don't have hardware here to run anything) it looks to me like the new implementation is susceptible to the same scenario. At least if I'm parsing the #ifdef-ery right :)

  • SH1106_128x64_Driver::SendPage starts an interrupt-driven SPI transfer (instead of DMA).
  • SH1106_128x64_Driver::Flush is a NOP so it isn't guaranteed that sendpage_state has reached 3 and display CS was deasserted.
  • set8565_CHA just writes into LPSPI4_TCR and LPSPI4_TDR so if the transfer wasn't complete, it's now borked.

I think Flush needs to wait for a flag and/or completion of transfer in the 4.0 case, for 4.1 perhaps sync in SendPage.
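Roughly what I mean by "wait for a flag", as a host-side model rather than the actual driver. Only sendpage_state and the value 3 are taken from the code discussed above; the namespace, the helper names and the bounded spin are assumptions for illustration. On hardware the SPI interrupt would advance the state, which presumes it can still run while Flush spins.

```cpp
#include <atomic>
#include <cstdint>

namespace model {

constexpr std::uint8_t kSendPageDone = 3;  // "transfer finished, CS deasserted"
std::atomic<std::uint8_t> sendpage_state{kSendPageDone};

// Stand-in for starting the interrupt-driven page transfer.
void SendPage() { sendpage_state.store(0); }

// Stand-in for one step of the SPI interrupt handler (hypothetical).
void SpiInterruptStep() {
  const std::uint8_t s = sendpage_state.load();
  if (s < kSendPageDone) sendpage_state.store(static_cast<std::uint8_t>(s + 1));
}

// Flush that waits for completion instead of being a NOP. The wait is
// bounded so a stuck transfer cannot hang the caller forever.
// while_waiting only exists so the model can make progress on a host;
// pass nullptr where a real interrupt advances the state.
// Returns true if the transfer completed within max_spins checks.
bool Flush(std::uint32_t max_spins, void (*while_waiting)()) {
  for (std::uint32_t i = 0; i < max_spins; ++i) {
    if (sendpage_state.load() == kSendPageDone) return true;
    if (while_waiting) while_waiting();
  }
  return sendpage_state.load() == kSendPageDone;
}

}  // namespace model
```

The idea is that the DAC write (set8565_CHA in the list above) would only go ahead once Flush returns true. What to do on a timeout is an open question; skipping that page's display update seems safer than writing to the SPI registers mid-transfer.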

A test of the theory (and the fix) might be to simulate a "complex app" by adding a 100us delay to every 1000th ISR call.
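Something along these lines could do the delay injection. LoadInjector is a hypothetical helper, not something that exists in the firmware; it only decides when to stall and leaves the actual stall to the caller so it can be checked off-target.

```cpp
#include <cstdint>

// Hypothetical test helper: reports how long to stall on each ISR call.
class LoadInjector {
 public:
  LoadInjector(std::uint32_t period, std::uint32_t delay_us)
      : period_(period), delay_us_(delay_us), calls_(0) {}

  // Call once per ISR. Returns delay_us on every period-th call, else 0.
  std::uint32_t Tick() {
    if (++calls_ >= period_) {
      calls_ = 0;
      return delay_us_;
    }
    return 0;
  }

 private:
  std::uint32_t period_;    // inject on every period-th call
  std::uint32_t delay_us_;  // how long the injected stall should last
  std::uint32_t calls_;     // calls since the last injection
};
```

In the ISR, a static LoadInjector(1000, 100) with `if (uint32_t us = injector.Tick()) delayMicroseconds(us);` would give the 100us stall on every 1000th call. If the theory holds, that should make the glitch show up much more often without the Flush wait, and not at all with it.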

grsr commented

I am a brand new o_C user with a new module that uses a teensy 4.0. I've also observed these display glitches, and was also mucking around with the game of life applet around the time it glitched out. Anything I can do to help debug? I am new to the teensy but not to dev. I will try to get a build with the patch above to see if I can reproduce it. It's still a lot of fun so far though - thanks!

@grsr This build has the patch: https://github.com/djphazer/O_C-Phazerville/actions/runs/8614265432

Or if you get your toolchain set up, build from the dev/1.7 branch and try it out!

grsr commented

Thanks @djphazer - I have installed your build and haven't seen any glitches yet. In particular the scope applet seems to behave more like I would expect (yesterday I saw some glitches there as well). I don't see the potentially problematic game of life applet in hemispheres any more, so haven't tested that. Did you remove it from the build?

Passencore also crashes the display as soon as you plug a trigger or CV in (v1.7, Teensy 4.0).

This is fixed in v1.7.1 with patch #66

The display doesn't crash on Teensy 4.x anymore, but excessive processing can cause timing inaccuracies. Eventually, it should be tested/profiled to see how bad it is and maybe correct it.