LPSPI Rework Open Discussion Points

Question

LPSPI Rework Open Discussion Points

Opened this issue 9 months ago · 5 comments

Answer 1 · 2023-12-19T20:10:26.000Z

@mciantyre In case you have some time, I'd appreciate your though on this, or if you have additional open questions I missed. I'm sure I missed plenty that will pop up later.

Answer 2 · 2023-12-26T15:14:59.000Z

Split into low-level and high-level driver? If so, naming?

I have some ideas for naming the async components here. These names depends on the design, so if that design doesn't work, we'll think of something different.

Other ways to re-use the Lpspi name for different things include

putting them in different modules within imxrt-hal.
putting them in different crates.

Regarding 2: it would be OK with me to consider an imxrt-hal-async package. That package could depend on imxrt-hal for its drivers and configuration modules. This would be symmetric to the interface packages: we have embedded-hal and embedded-hal-async, and embedded-hal-async depends on embedded-hal.

Reuse existing driver as low-level driver? If so, how to deal with the fact that e-h 1.0 doesn't use the CS pin?

After removing the CS pin from the interface, I believe today's LPSPI driver can be a foundation for an async driver. It needs tweaks and extensions, but its lower-level API tries to support this use case. If the existing driver doesn't work, then reach for the RAL.

We'll need to remove the chip select pin from the driver, turning it into something supporting SpiBus. Users who want software-managed chip selects can reach for embedded-hal-bus. Users who want hardware-managed chip selects will need to wait for an LPSPI-capable SpiDevice solution.

Build high-level driver from a read and a write backend, which can be Polling/DMA/Interrupt based?

DMA + interrupts should work. Only supporting u32 is also fine; that's all we support today.

How should interrupts be handled?

Some status flags, like RDF and TDR, are read-only and cannot be cleared. Disabling the interrupt enable bits should work for these bits, and should also work for all other events. This allows a custom future to check and maintain the status register in its poll() implementation.

How should embedded-hal and embedded-hal-async be implemented? Should embedded-hal simply be a cassette poll wrapper around the async version or be its own thing?

I'm not a fan of "use cassette to go from async to blocking" for two reasons:

We have a blocking implementation that isn't based on async, and it can be tweaked to meet 1.0 embedded-hal requirements. Taking this approach excludes maintainers / contributors who only care about blocking I/O and aren't prepared for async Rust. I'm willing to maintain a blocking I/O implementation that's distinct from its async implementation.
It's a general design pattern baked into a specific HAL. If folks think this is a useful design pattern, how about doing something like embedded-spinny, which might solve the same problem for other HALs?

Should the implementation be reference or raw pointer based?

Raw pointers are OK. We can build our own / seek out existing safe abstractions.

Any considerations for introducing an intermediary (ring) buffer within an async LPSPI driver? Let the user decide the buffer size when they create their async LPSPI driver. We use it as another FIFO for the interrupt case, and as the source / destination of DMA transfers. We incur extra copying and buffering in the driver, and the user pays in extra program memory.

The extra buffer might change where / how we interact with the hardware FIFO, interrupt flags, status registers; not sure if that change is helpful. And if we require that the user supply us with a u32 buffer, we can then implement u8-based DMA without worrying about alignment.

Answer 3 · 2023-12-26T17:05:19.000Z

Any considerations for introducing an intermediary (ring) buffer within an async LPSPI driver?

What would be the advantage of this, opposed to streaming into the hardware fifos directly?

Answer 4 · 2023-12-26T19:51:56.000Z

For DMA use-case: If we require that the user supply us with a u32 buffer, we can then implement u8-based DMA without worrying about buffer alignment.

For interrupts, we can reduce the number of executor wake-ups (different than the number of activating interrupts). We achieve this by moving hardware FIFO management into the interrupt handler. The user yields at await points when they need space in their flexibly-sized ring buffer; they're not awaiting space in a fixed-sized hardware FIFO. Thread-safe, lock-free queues / buffers from heapless / bbqueue might help us out.

Reducing the number of executor wake-ups might avoid cooperative multitasking issues. If we're managing the hardware FIFOs through code paths in the executor, we could over- or under-run a FIFO while another task isn't yielding. This might be less of a concern; I think executors like RTIC and embassy have ways to express task priorities. It's also not a perfect solution, since avoiding the problem still depends on the ring buffer's size (but at least the user controls that size).

And for both cases, we have a consistent way to realize interrupt and DMA I/O. In the transmit path, we're filling a ring buffer, then triggering some state machine to empty that buffer. In the receive path, we're triggering some state machine to fill a ring buffer, then emptying that buffer. The distinction of interrupt and DMA is the answer to "what's that state machine on the other side of this buffer?" Once we establish the design, it naturally translates into an async LPUART driver implementation, supporting a peripheral that only has a four byte hardware FIFO.

Answer 5 · 2023-12-26T20:37:58.000Z

Uff, that's a lot to think about :D
I might have to take a step back for a couple of weeks, if someone wants to try himself on this, please go ahead :)