High Level Design Discussion
wsc1 opened this issue · 6 comments
The sio module is organised in terms of data types into devices, inputs, outputs, duplex, and packets.
The Device struct contains fields related to common functionality in OS audio devices: a way of
defining (probably) supported sample rates, PCM data types, buffer sizes.
Devices can create inputs, duplex, and outputs.
Inputs manage a simplified interface to a ring buffer, providing the consumer with
access to the part of the ring buffer that is available for consumption, synchronised by channels.
Similarly, Outputs manage a simplified interface to a ring buffer, providing the consumer with
access to that part of the ring buffer which is available for writing and a way to send it back
once written.
Similarly, Duplex acts just like Output but assumes two synchronised ring buffers behind it, one for capture and one for playback, or one ring buffer with space for both input and output data. These objects all have adaptors to sound.{Source,Sink,Duplex}.
A Packet allows access to the ring buffer exported as a []float64, and implementations do whatever translation might be necessary to handle this format.
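To make the Input and Packet description above concrete, here is a minimal sketch of a ring buffer split into packets and handed back and forth over channels. It is not the sio API: the names Packet.D, Input, NewInput, capture, Next and Release are all hypothetical, and the "device" is simulated by a goroutine that fills each packet with a constant.

```go
package main

import "fmt"

// Packet is a hypothetical view onto one region of the ring buffer,
// exported as float64 samples.
type Packet struct {
	D []float64 // aliases the ring buffer; no copy is made
}

// Input is a hypothetical consumer-side handle. Filled packets arrive on
// fill; the consumer returns them on free once it has read them.
type Input struct {
	fill chan *Packet
	free chan *Packet
}

// NewInput splits a ring of nPkts*frames samples into nPkts packets,
// all of which start out free.
func NewInput(nPkts, frames int) *Input {
	ring := make([]float64, nPkts*frames)
	in := &Input{
		fill: make(chan *Packet, nPkts),
		free: make(chan *Packet, nPkts),
	}
	for i := 0; i < nPkts; i++ {
		in.free <- &Packet{D: ring[i*frames : (i+1)*frames]}
	}
	return in
}

// capture simulates the device side: it waits for a free packet, writes
// samples into it, and publishes it to the consumer.
func (in *Input) capture(v float64) {
	p := <-in.free
	for i := range p.D {
		p.D[i] = v
	}
	in.fill <- p
}

// Next blocks until a filled packet is available for consumption.
func (in *Input) Next() *Packet { return <-in.fill }

// Release hands a consumed packet back so the device side can reuse it.
func (in *Input) Release(p *Packet) { in.free <- p }

func main() {
	in := NewInput(2, 4)
	go func() {
		for i := 1; i <= 3; i++ {
			in.capture(float64(i))
		}
	}()
	for i := 0; i < 3; i++ {
		p := in.Next()
		fmt.Println(p.D)
		in.Release(p)
	}
}
```

An Output would be the mirror image under the same assumptions: the consumer takes a free packet, writes into it, and sends it back to be played.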
This design lacks a structured way of interacting with the host, such as periph.io, or even the calling context, such as the generic codec interface at zikichombo.org/codec. It also lacks a mechanism to enforce or directly support the scheduling relationship between the presumably real-time device and goroutines.
I wanted to invite other criticisms and concerns.
I also wanted to state that I think, at least at this time, the lack of a mechanism to enforce or directly support the scheduling relationship between the (presumably) real-time implementation and goroutines may be good, as it would open up and facilitate the possibility of assessing this for a given application or application context, which in turn is outside of the scope of a support library. I would guesstimate that in many contexts this design could achieve latency corresponding to buffer sizes of 128 frames or less at 44.1 kHz, which is considered low in many contexts.
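For reference, the arithmetic behind that figure: one buffer of N frames at sample rate R holds N/R seconds of audio, so 128 frames at 44.1 kHz is roughly 2.9 ms per buffer. The helper below is a hypothetical name used only for illustration, and it counts a single buffer only, not any additional buffering in the host or hardware.

```go
package main

import (
	"fmt"
	"time"
)

// BufferLatency returns the duration of audio held by one buffer of the
// given number of frames at the given sample rate (in Hz).
func BufferLatency(frames int, sampleRate float64) time.Duration {
	return time.Duration(float64(frames) / sampleRate * float64(time.Second))
}

func main() {
	for _, n := range []int{64, 128, 256} {
		fmt.Printf("%d frames at 44.1 kHz: %v\n", n, BufferLatency(n, 44100))
	}
}
```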
Of course, not having support for direct enforcement of latency also presents risks in terms of reliability.
After examining the work in oto and elsewhere, and the corresponding reports of successes, I do not see any design-level reasons why this should not work at least as reliably as they work. I would like to invite debate or assertions to the contrary.
Thoughts?
Here's one proposed solution to the structuring with respect to the host.
The main issue is that many hosts have many entry points to their sound stack. For example, PulseAudio usually uses ALSA as a backend, and AudioFlinger in Android uses the Android HAL backend (which can also use ALSA). The higher level entry points are intended for sharing of audio across applications. The lower level entries are hardware abstraction layers. AudioUnits has the nice property that it unifies the HAL and higher order processing.
To address this problem, we could create a supported list of (OS, ARCH, entry point) triples and place them in sio in a structured fashion. Build tags could potentially be used for the entry points as well.
Each of these triples, as determined by build tags, would then be registered in package initialisation,
fulfilling an interface which would allow access to the Dev structure and scanning/device change notifications.
Third-party packages could also register such device access and, as in http://godoc.org/zikichombo.org/codec/#CodecFor,
we could add a package selection mechanism for consumers to ultimately decide the implementation.
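A rough sketch of what such a registry could look like, assuming registration keyed by (OS, ARCH, entry point name) and a lookup function for consumers. None of this is existing sio code: EntryPoint, Register, EntryPointFor, HostEntryPoint, fakeEntry and the entry names "hal" and "shared" are hypothetical, and in a real tree each Register call would sit in its own build-tagged file.

```go
package main

import (
	"fmt"
	"runtime"
)

// EntryPoint is a hypothetical interface that each host entry point
// (for example a HAL-level or a sharing-level API) would fulfil.
type EntryPoint interface {
	Name() string
	Devices() []string // stand-in for device scanning
}

// key identifies one (OS, ARCH, entry point) triple.
type key struct{ goos, goarch, name string }

var registry = map[key]EntryPoint{}

// Register records an entry point for a triple. It is meant to be called
// from package init functions, so no locking is done here.
func Register(goos, goarch string, ep EntryPoint) {
	registry[key{goos, goarch, ep.Name()}] = ep
}

// EntryPointFor lets the consumer pick an implementation by triple.
func EntryPointFor(goos, goarch, name string) (EntryPoint, error) {
	ep, ok := registry[key{goos, goarch, name}]
	if !ok {
		return nil, fmt.Errorf("no entry point %q for %s/%s", name, goos, goarch)
	}
	return ep, nil
}

// HostEntryPoint looks up an entry point for the OS and ARCH of this build.
func HostEntryPoint(name string) (EntryPoint, error) {
	return EntryPointFor(runtime.GOOS, runtime.GOARCH, name)
}

// fakeEntry is a placeholder implementation used only for this sketch.
type fakeEntry struct{ name string }

func (f fakeEntry) Name() string      { return f.name }
func (f fakeEntry) Devices() []string { return []string{f.name + ":default"} }

func init() {
	// Two entry points into the same host, registered in the same build.
	Register(runtime.GOOS, runtime.GOARCH, fakeEntry{"hal"})
	Register(runtime.GOOS, runtime.GOARCH, fakeEntry{"shared"})
}

func main() {
	ep, err := HostEntryPoint("hal")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(ep.Name(), ep.Devices())
}
```

Keying on the full triple is what allows several entry points into one host to coexist in one build, with the final choice left to the consumer.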
Thoughts?
I know you mentioned it somewhere, but can't find the reference now. What about the design of oto makes it incompatible with the design goals of zc? Is it that oto does not handle capture of sound? It seems like it may be a better approach to try to contribute to oto so we have one solid implementation in the community for audio playback on a large set of supported platforms.
We are working with oto.
If you want only playback now, oto is an option.
oto does not engage in community driven design and has stopped progress on some issues in the past that were the right way to go in terms of input and output (e.g. AUHAL) without explanation. oto does not choose the lowest level entry points for the systems; oto does not engage in dialogue at the level of design and will unilaterally decide how to go about it, if it continues as it has in the past.
The decision to address audio based on output is probably market driven and in our estimation a disastrous long-term way of going about it from a software design perspective, because addressing output without input or duplex just doesn't make sense in terms of the interface provided to the user and in terms of organisation/implementation of the code. For those who can't see that, feel free to use oto.
The io.Writer interface is not intended for timing related interaction. It will have the same problems as ALSA in terms of duplex if one day oto grows an io.Reader.
The io.Writer interface does not allow one to set the buffer size or interact with a ring buffer,
which is necessary to get sufficient control of I/O for duplex and latency sensitive applications.
The compile time interface to making an io.Writer doesn't allow multiple entry points into one host in one build, which doesn't make sense.
We're approaching this more slowly but more solidly from a design perspective and in terms of feedback.
Not so interested in having that discussion shut off by oto.
Very much interested in your feedback at the level of the design points independent of oto or other projects, which would be more directed toward the goals of zc and zc/sio.
Have started a new GitHub project for setting up structured host interfaces with multiple entry points
https://github.com/zikichombo/sio/projects
Looking for feedback for overall sio project organisation for hosts and entry points at http://github.com/wsc1/sio
So the structuring around hosts has something in place that seems ok to build on so far. It also confines the questions about latency and reliability to the constructs in libsio, which has been relegated to optional use. Since there have been no comments after some time regarding limitations of the libsio constructs, and they work in our tests so far, and they are now optional, closing.