mudler/LocalAI

feat(multimodal): Audio understanding

mudler opened this issue · 0 comments

Is your feature request related to a problem? Please describe.
Projects like https://github.com/kyutai-labs/moshi and https://github.com/ictnlp/LLaMA-Omni allow to audio understanding, this is a capability that could be integrated in LocalAI as well

Describe the solution you'd like
A backend and a way for the API to understand audio

Describe alternatives you've considered

Additional context
This Issue is left open on purpose for discussing potential implementations and backends that should be integrated in LocalAI