Common types for text-to-speech
Similar to what I suggested in #1478, it would be great if text-to-speech had a set of common types. The SpeechModel interface, as well as SpeechPrompt, SpeechResponse, and StreamingSpeechModel, feel like what I'd expect with such types, but they are currently delivered in the OpenAI module. Even though OpenAI is the only implementation, it feels like those types should be in core, with the implementations and OpenAI-specific extensions in the OpenAI module.
Moreover, while SpeechPrompt feels like it should be in core, it carries OpenAiAudioSpeechOptions. Perhaps there should be a more generic SpeechOptions that is carried by SpeechPrompt, with OpenAiAudioSpeechOptions being an extension of SpeechOptions.
Altogether, this would not only make the types more consistent with how the types for chat and other models are structured, but would also set the stage for additional text-to-speech implementations, should more APIs that offer text-to-speech be added to Spring AI.
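To make the proposed split concrete, here is a minimal sketch of how the types could relate. It is not the actual Spring AI code: the getters on SpeechOptions (getVoice, getSpeed), the responseFormat field, and the constructor shapes are all assumptions chosen for illustration, and the classes are plain stand-ins with no Spring dependencies.

```java
import java.util.Objects;

// Hypothetical provider-neutral options type that would live in core.
// Which options are truly common is an assumption made for this sketch.
interface SpeechOptions {
    String getVoice();
    Float getSpeed();
}

// Hypothetical core prompt: it depends only on the generic SpeechOptions,
// not on any provider-specific options class.
final class SpeechPrompt {
    private final String text;
    private final SpeechOptions options;

    SpeechPrompt(String text, SpeechOptions options) {
        this.text = Objects.requireNonNull(text, "text");
        this.options = options; // may be null, meaning "use the model's defaults"
    }

    String getText() { return text; }

    SpeechOptions getOptions() { return options; }
}

// Stand-in for the OpenAI-specific options, which would stay in the OpenAI
// module and extend the generic type with provider-only settings.
final class OpenAiAudioSpeechOptions implements SpeechOptions {
    private final String voice;
    private final Float speed;
    private final String responseFormat; // assumed OpenAI-only setting

    OpenAiAudioSpeechOptions(String voice, Float speed, String responseFormat) {
        this.voice = voice;
        this.speed = speed;
        this.responseFormat = responseFormat;
    }

    @Override
    public String getVoice() { return voice; }

    @Override
    public Float getSpeed() { return speed; }

    String getResponseFormat() { return responseFormat; }
}
```

With this shape, core code can build and pass around a SpeechPrompt without knowing about OpenAI, while the OpenAI module can check whether the carried options are an OpenAiAudioSpeechOptions and read the extra settings when they are.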
I like the suggestion and I'm available to work on this. I'll have a PR ready soon.
What is the reason for having SpeechResponse in the core, since it is already an implementation of ModelResponse, which belongs to the core family? Most of the classes or interfaces mentioned above are already implementations or extensions of existing core family classes or interfaces.