Byron/google-apis-rs

Text-to-speech synthesize response uses incorrect base64 encoding

connorskees opened this issue · 9 comments

SynthesizeSpeechResponse uses urlsafe base64 during deserialization, where the API returns standard base64 (including / and +). Such a response can be found in the docs here. This prevents the API from being used at all.

#[serde_as(as = "Option<::client::serde::urlsafe_base64::Wrapper>")]
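To make the mismatch concrete: the standard base64 alphabet uses `+` and `/` for the values 62 and 63, while the URL-safe alphabet uses `-` and `_` instead. The helper below is a minimal, std-only sketch that lists the characters a URL-safe decoder would not recognize. The function name is hypothetical and is not part of google-apis-rs.

```rust
/// Returns the characters in `payload` that fall outside the URL-safe
/// base64 alphabet (A-Z, a-z, 0-9, '-', '_', plus '=' padding).
/// Hypothetical helper for illustration only.
fn chars_rejected_by_urlsafe(payload: &str) -> Vec<char> {
    payload
        .chars()
        .filter(|c| !(c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '=')))
        .collect()
}
```

Any response whose audio bytes happen to encode to `+` or `/` would therefore fail under a strict URL-safe decoder.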

A minimal example demonstrating this:

let req = SynthesizeSpeechRequest {
    input: Some(SynthesisInput {
        text: Some("hello world".to_owned()),
        ssml: None,
    }),
    voice: Some(VoiceSelectionParams {
        language_code: Some("en-US".to_owned()),
        ..Default::default()
    }),
    audio_config: Some(AudioConfig {
        audio_encoding: Some("MP3".to_owned()),
        ..Default::default()
    }),
};

// panics: unwrap() hits the JSON decode error caused by the base64 mismatch
let result = hub.text().synthesize(req).doit().await.unwrap();
Byron commented

Thanks for reporting. Changing that for all APIs is easy, but I wonder if it would break others or most of them.

For those who want to tackle this, I recommend checking whether this encoding is specified in some way in the OpenAPI description, so that it could be used to make this decision.

If it is implicit, it might be safest to implement a per-API override (as is present for the YouTube API, for example) to turn on a different encoding only for the texttospeech API.

tkrs commented

Hi, I'm facing the same issue with google_datastore1 v5. It crashes when decoding blob_value.

#[serde_as(as = "Option<::client::serde::urlsafe_base64::Wrapper>")]
pub blob_value: Option<Vec<u8>>,

Byron commented

As a workaround, it's probably easiest to vendor the Google API crate and patch it yourself. I presume without the urlsafe version of the wrapper it should work.

My workaround/hack has been to match on the error and decode the payload a second time as standard base64:

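let result = match self.hub.text().synthesize(req).doit().await {
    // Success: keep the second tuple element, the parsed response.
    Ok(v) => v.1,
    Err(e) => match e {
        // The error carries the raw JSON body, so it can be parsed again.
        Error::JsonDecodeError(s, _) => {
            // MySynthesizeSpeechResponse is a local struct (not shown) that
            // presumably keeps audio_content as an undecoded string.
            let result = serde_json::from_str::<MySynthesizeSpeechResponse>(&s).unwrap();
            SynthesizeSpeechResponse {
                // base64::decode uses the standard alphabet.
                #[allow(deprecated)]
                audio_content: Some(base64::decode(result.audio_content.unwrap()).unwrap()),
            }
        }
        // Other error variants are left unhandled in this workaround.
        _ => todo!(),
    },
};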
Byron commented

Thanks for sharing!

I'm thinking of using a Cargo feature to select whether the encoding is URL-safe or not. What do you think?

I don't think a feature is a good choice here. A feature is a crate-wide configuration, while this bug only impacts a particular Google API. If this issue is more widespread, then something else in the crate should change. Ideally a user shouldn't even have to know about the base64 encoding; it's an implementation detail of the API.

I get it.

I think Cargo features could work if the whole crate wants either URL-safe or standard base64, and I don't know if that's the case consistently. Maybe it would also be possible to make it configurable at runtime on a per-field basis with a custom encoder/decoder, or better, with a switch for each callable method.
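One variation on the custom-decoder idea would be a decoder that accepts both alphabets, so that no per-API switch is needed at all. The following is a minimal, std-only sketch under that assumption. The names `sextet` and `decode_lenient` are hypothetical, nothing here is part of google-apis-rs, and wiring it into a `serde_as` wrapper is not shown.

```rust
/// Maps one base64 character to its 6-bit value, accepting both the
/// standard alphabet ('+', '/') and the URL-safe alphabet ('-', '_').
fn sextet(c: u8) -> Option<u8> {
    match c {
        b'A'..=b'Z' => Some(c - b'A'),
        b'a'..=b'z' => Some(c - b'a' + 26),
        b'0'..=b'9' => Some(c - b'0' + 52),
        b'+' | b'-' => Some(62),
        b'/' | b'_' => Some(63),
        _ => None,
    }
}

/// Decodes base64 text in either alphabet, with or without '=' padding.
/// Returns `None` on an invalid character or an impossible length.
fn decode_lenient(input: &str) -> Option<Vec<u8>> {
    // Trailing padding is simply stripped; this sketch does not validate it.
    let data = input.trim_end_matches('=').as_bytes();
    // A single leftover character cannot encode a whole byte.
    if data.len() % 4 == 1 {
        return None;
    }
    let mut out = Vec::with_capacity(data.len() * 3 / 4);
    for chunk in data.chunks(4) {
        let mut acc: u32 = 0;
        for &c in chunk {
            acc = (acc << 6) | u32::from(sextet(c)?);
        }
        // Left-align partial chunks so the byte extraction below is uniform.
        acc <<= 6 * (4 - chunk.len()) as u32;
        let bytes = [(acc >> 16) as u8, (acc >> 8) as u8, acc as u8];
        // 4 chars -> 3 bytes, 3 chars -> 2 bytes, 2 chars -> 1 byte.
        out.extend_from_slice(&bytes[..chunk.len() - 1]);
    }
    Some(out)
}
```

The trade-off is that a lenient decoder also accepts inputs that mix both alphabets, which a strict implementation would reject. Whether that is acceptable for every generated API is an open question.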

Definitely there are options, but it doesn't look like anybody will have the time to actually look into it.