Cannot get AWS Polly synthesize_speech to work

Question

Closed this issue 6 months ago · 3 comments

RudolfVonKrugstein commented 6 months ago

Hi,

I am trying to get the synthesize speech API to work, and I am doing this:

AWS.Polly.synthesize_speech(c, %{"Text": "hello","OutputFormat": "mp3", "VoiceId": "Amy"})

And I get:

** (Jason.DecodeError) unexpected byte at position 0: 0x49 ("I")
    (jason 1.4.1) lib/jason.ex:92: Jason.decode!/2
    (aws 0.13.3) lib/aws/request.ex:139: AWS.Request.request_rest/9
    iex:6: (file)

Could it be that the API is returning audio data rather than JSON, and that this is what causes the problem?
What can I do?

Thank you!

Answer 1 · 2024-03-04T16:08:44.000Z

@RudolfVonKrugstein Yup, that is guaranteed not to work... The cr*ppy thing here is that the API in general is listed as a JSON API, with the exception of this method, which may return JSON or an audio stream depending on the OutputFormat chosen... 😢
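
The byte in the error message is consistent with that: 0x49 is ASCII "I", and MP3 data often begins with an ID3v2 tag whose first three bytes are "ID3". A minimal sketch of that check in plain Elixir; `BodySniff` and `id3?/1` are hypothetical names used only for illustration, not part of aws-elixir:

```elixir
defmodule BodySniff do
  # Hypothetical helper: true when a binary starts with an ID3v2 tag header.
  def id3?(<<"ID3", _rest::binary>>), do: true
  def id3?(_other), do: false
end

# 0x49 is the byte Jason rejected at position 0; it is the letter "I".
IO.inspect(<<0x49>> == "I")
# prints true

# A body starting with the bytes 0x49 0x44 0x33 ("ID3") is audio, not JSON.
IO.inspect(BodySniff.id3?(<<0x49, 0x44, 0x33, 4, 0, 0>>))
# prints true

# A JSON document starts with "{" or "[", so it does not match.
IO.inspect(BodySniff.id3?(~s({"Text": "hello"})))
# prints false
```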

What happens when you set receive_body_as_binary? like this:

options = Keyword.new([{:receive_body_as_binary?, true}])

and pass the options as the last optional argument? I believe this should allow you to get back the raw binary.

Could you try and loop back with your findings? :-) If correct, I'll see how we can improve the docs on this.

Answer 2 · 2024-03-04T21:33:58.000Z

I can report back that it works!

iex(2)> options = Keyword.new([{:receive_body_as_binary?, true}])
[receive_body_as_binary?: true]
iex(3)> AWS.Polly.synthesize_speech(c, %{"Text": "hello","OutputFormat": "mp3", "VoiceId": "Amy"}, options)
{:ok,
 %{
   "Body" => <<73, 68, 51, 4, 0, 0, 0, 0, 0, 35, 84, 83, 83, 69, 0, 0, 0, 15, 0,
     0, 3, 76, 97, 118, 102, 53, 56, 46, 55, 54, 46, 49, 48, 48, 0, 0, 0, 0, 0,
     0, 0, 0, 0, 0, 0, 255, 243, ...>>,
   "ContentType" => "audio/mpeg",
   "RequestCharacters" => "5"
 },
 %{
   body: <<73, 68, 51, 4, 0, 0, 0, 0, 0, 35, 84, 83, 83, 69, 0, 0, 0, 15, 0, 0,
     3, 76, 97, 118, 102, 53, 56, 46, 55, 54, 46, 49, 48, 48, 0, 0, 0, 0, 0, 0,
     0, 0, 0, 0, 0, 255, ...>>,
   headers: [
     {"x-amzn-RequestId", "d6663177-5cd4-4dc5-a1ed-0a0d54c1f494"},
     {"x-amzn-RequestCharacters", "5"},
     {"Content-Type", "audio/mpeg"},
     {"Transfer-Encoding", "chunked"},
     {"Date", "Mon, 04 Mar 2024 21:32:26 GMT"}
   ],
   status_code: 200
 }}

Thank you!
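
For anyone landing here later: with this option the audio ends up under the "Body" key of the first map, so saving it comes down to pattern matching on the result. A minimal sketch, assuming the response shape shown in the output above; `PollyAudio` and `save/2` are hypothetical names, not part of aws-elixir:

```elixir
defmodule PollyAudio do
  @moduledoc "Hypothetical helper, not part of aws-elixir."

  # Writes the audio from a synthesize_speech result to `path`.
  # Assumes the call was made with `receive_body_as_binary?: true`,
  # so that "Body" holds the raw audio bytes.
  # Returns :ok or {:error, reason}, as File.write/2 does.
  def save({:ok, %{"Body" => audio}, _http_response}, path) when is_binary(audio) do
    File.write(path, audio)
  end

  # Hand an error tuple back to the caller unchanged.
  def save({:error, _reason} = error, _path), do: error
end

# Intended usage (needs a configured client `c`, so it is left as a comment):
#
#   options = [receive_body_as_binary?: true]
#   input = %{"Text" => "hello", "OutputFormat" => "mp3", "VoiceId" => "Amy"}
#
#   c
#   |> AWS.Polly.synthesize_speech(input, options)
#   |> PollyAudio.save("hello.mp3")
```

Matching on the three-element `{:ok, body, http_response}` tuple keeps the helper independent of the HTTP details; only the "Body" value is used.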

Answer 3 · 2024-03-05T04:18:12.000Z

Glad to be of help! I'll create an issue to add this to the README 👍 Also, I believe this is currently impossible in aws-erlang, so I will do some refactoring to get it working there 👍