egnor/pivid

Support for dynamic still images and a faster interface to provide data from an outside process.

pinballpower opened this issue · 3 comments

Great project! I'm thinking about using it in a Pinball-related project as the code base seems to be a lot more stable than my own code.

However, there are a few things I would need:

  1. Support for raw image data (RGBA). This is needed to render an image on top of the videos, updated at least 50 times per second.
  2. A faster interface to send this data. I'm thinking about something like a named pipe or shared memory.

I'm happy to look into these and contribute code. However, I first want to understand whether this is interesting for your project and make sure it fits correctly into the overall architecture.

This is the project I will try to use it with: https://github.com/pinballpower/code_dmdreader

BTW: I'm happy to discuss the details off-line if you wish.

egnor commented

Intriguing! (I spent a while trying to figure out what software package "pinball" was until I realized that it refers to literal pinball, with the little steel balls bouncing around a table, ha.) I expect many use cases will want to generate dynamic images in other processes and feed them in. The trick will be to avoid turning into another implementation of X windows (or Wayland), or generally complicating things a lot.

Do you have thoughts on what the best protocol would be? Ideally it's something that

  • could easily be done from e.g. Python without auxiliary libraries
  • but also passes memory directly, so zero-copy is possible
  • can work in "immediate" mode (display whatever images are sent ASAP) and also "timed" mode (a sequence of frames with display timestamps for well-synchronized output)

The first two might seem impossible to combine, but Python supports both mmap and unix-domain fd-passing natively. (Nothing is magical about Python here; it's just my proxy for "popular, approachable language", i.e. "not C++".) So it's theoretically possible!

So, maybe something like this?

  • in the pivid play script, allow defining "live" or "external" media files
  • instead of opening a static file, pivid creates & listens on a unix domain datagram socket in the media tree
  • clients connect to that unix domain socket and send dma-buf file descriptors along with short control messages
  • each control message (JSON??) gives image pixel format (fourcc) & size & display timestamp (or none for "show asap")
  • (clients can get dma-buf file descriptors from /dev/dma_heap/linux,cma)
  • we include a little demo Python program to create & send a test animation (live clock?), as an example

Not having actually done it I'm not 100% sure this would work, but it seems like it might? And yeah the images would be raw (RGBA or any other format supported by DRM) so should be straightforward to generate.

Yes, we're really talking about physical pinball machines here ;-) Modern machines already ship with video screens offering this kind of functionality; the project linked above is about bringing this to OLD machines.

While I did a lot of Python work in the past, I personally don't need a Python interface here (the existing code is C++). However, things like shared memory and named pipes are essentially language-agnostic and should therefore work with many programming languages. And I understand that sample code might be easier to follow in Python. I just suspect that Python code doing low-level DMA work might not be very easy to understand for most people either ;-)

I'm not sure how to exchange DMA descriptors, but I should be able to manage it if there is some sample code around. I've never done this kind of low-level work before. Then again, I also hadn't written any C++ for almost 30 years ;-)

Right now I'm still trying to figure out the limits of what a Pi 4 can handle. When adding more layers, at some point things stop running properly: the screen goes black for a while, the picture comes back, and so on. I had the same issue in my own code and couldn't figure out what to do about it, or whether there is a clear indication of the limits in terms of layers/videos/resolution/bandwidth. I saw some bandwidth calculations in your code, but I'm not sure whether they can be used to determine the limits of the DRM hardware. But this is something we should probably discuss offline.

egnor commented

(Split off discussion of hardware limits: #3)

The "deep magic" involved in exchanging DMA descriptors is actually fairly straightforward once you get to it; the hard part is wading through all the various documents (especially the interactions with the various kernel subsystems). So a simple example (in whatever language; I'm just picking Python as a "lingua franca", it could also be C) would go a long way, I think.

Helpfully, on the Raspberry Pi there is no dedicated GPU memory: the hardware can freely work with any memory allocated by the kernel. This is not true on platforms that do have dedicated GPU memory, where you have to worry a lot about exactly which memory bank any given buffer lives in. (Consider, for example, systems that switch dynamically between an integrated and a discrete GPU on a frame-by-frame basis depending on load, each with its own memory...!) So a lot of the complexity in these interfaces is moot here.