AWS Transcribe BadRequestException: Your stream is too big. Reduce the frame size and try your request again.
rmtuckerphx opened this issue · 5 comments
When the MediaSampleRateHertz is changed from 8000 to 16000, I get the following error:
2023-02-04T05:10:00.173Z d0ee90a1-a61f-44f9-826a-fd2c4977425b ERROR Error processing transcribe stream. SessionId: e484b9b6-1657-45d0-bf3d-12c3737e6f65 {
"name": "BadRequestException",
"$fault": "client",
"$metadata": {},
"message": "Your stream is too big. Reduce the frame size and try your request again."
}
What other settings do I need to change to allow for this?
This is similar to my AWS re:Post entry:
https://repost.aws/questions/QUg4QHSnUuSYKdSVOEAjQ0OQ/aws-transcribe-medical-bad-request-exception-your-stream-is-too-big-reduce-the-frame-size-and-try-your-request-again
Can someone please help me with this?
@rstrahan @babu-srinivasan
Hi, Thanks for reaching out.
From the Lambda code you shared in your re:Post entry, it looks like you are letting the Node.js stream decide the chunk size. I would recommend calculating the chunk size from the sample rate, the media encoding (bytes per sample), and the chunk duration in milliseconds. E.g. const CHUNK_SIZE = (SAMPLE_RATE * BYTES_PER_SAMPLE) * CHUNK_SIZE_IN_MS / 1000;
Could you please share a few more details on your use case, and on how you are modifying the Live Call Analytics solution from this repo to support your Transcribe Medical use case? I can try to recreate the problem on my side and provide additional recommendations.
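To make the arithmetic concrete, here is a minimal sketch of that calculation together with a helper that slices a PCM buffer into chunks of that size. The function names (chunkSizeBytes, splitIntoChunks) and the 100 ms chunk duration are assumptions for illustration only; they do not come from the Lambda code in the re:Post entry, and the input is assumed to be mono 16-bit PCM.

```javascript
// Hypothetical helper: bytes of audio contained in one chunk of `chunkMs` milliseconds.
// Same formula as above: (SAMPLE_RATE * BYTES_PER_SAMPLE) * CHUNK_SIZE_IN_MS / 1000
function chunkSizeBytes(sampleRate, bytesPerSample, chunkMs) {
  return (sampleRate * bytesPerSample * chunkMs) / 1000;
}

// Hypothetical helper: yield fixed-size slices of a Buffer holding raw PCM audio.
// The last slice may be shorter than chunkSize.
function* splitIntoChunks(pcm, chunkSize) {
  for (let offset = 0; offset < pcm.length; offset += chunkSize) {
    yield pcm.subarray(offset, offset + chunkSize);
  }
}

// Assumed values: 16-bit mono PCM (2 bytes per sample), 100 ms per chunk.
const BYTES_PER_SAMPLE = 2;
const CHUNK_SIZE_IN_MS = 100;

const sizeAt8k = chunkSizeBytes(8000, BYTES_PER_SAMPLE, CHUNK_SIZE_IN_MS);
const sizeAt16k = chunkSizeBytes(16000, BYTES_PER_SAMPLE, CHUNK_SIZE_IN_MS);
console.log(sizeAt8k, sizeAt16k);

// One second of silence at 16 kHz, split into 100 ms chunks.
const oneSecond = Buffer.alloc(16000 * BYTES_PER_SAMPLE);
const chunks = [...splitIntoChunks(oneSecond, sizeAt16k)];
console.log(chunks.length);
```

Doubling the sample rate doubles the number of bytes per unit of time, so a chunk size that was derived from the sample rate grows with it, while a size left to the stream's defaults does not track the change. Each chunk produced this way would then be sent as one audio event on the Transcribe stream.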
Thanks
Babu Srinivasan
I'm looking to put together a demo where audio input comes from the web browser using MediaRecorder, started with mediaRecorder.start(1000), i.e. a timeslice of 1000 ms. Am I correct in understanding that the data chunks coming from ondataavailable can be sent directly to Kinesis Video Streams (KVS)?
mediaRecorder.ondataavailable = (evt) => {
chunks.push(evt.data);
};
Does KVS even care what the format of time series data is?
For example, would I be able to send MediaStream chunks from the browser to KVS and have an AWS Lambda Function GET fragments from the KVS stream and on the server transform the data to the audio format needed for Transcribe Medical (16kHz, PCM signed 16-bit little-endian in Matroska format)?
Or must the data being PUT into KVS be Matroska format?
What is the minimum that I need to do in the browser before sending time series data to KVS?
No activity. @babu-srinivasan @rmtuckerphx, is this still active, or shall we close?
Closing due to no activity.