httpcore.ReadTimeout: The read operation timed out
osamabinsaleem opened this issue · 3 comments
Hi. I'm getting frequent read timeout errors. This is my code:
transcription_model = "nova-2-general"
# sometimes we also use "whisper-medium"
#download the video from s3
# convert the video to mp3 audio
with open(audio_efs_path, "rb") as file:
    buffer_data = file.read()
payload: FileSource = {
    "buffer": buffer_data,
}
options = PrerecordedOptions(
    model=transcription_model,
    smart_format=True,
    detect_language=True,
)
response = deepgram.listen.prerecorded.v("1").transcribe_file(
    payload, options, timeout=httpx.Timeout(300.0, connect=10.0)
)
My video files are stored in S3 buckets. I first download the file, convert it to MP3, and then use the code above to transcribe it. I still occasionally get timeouts. They also happen with nova-2-general, so changing the model doesn't help.
What do you recommend here:
1- Retrying with exponential backoff
2- Uploading the audio to an S3 bucket and using a presigned URL to get the transcript
Thanks!
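For reference, option 1 could look roughly like the sketch below. It is a generic helper, not part of the Deepgram SDK: `retry_with_backoff` and its parameters (`max_attempts`, `base_delay`, `max_delay`, `sleep`) are hypothetical names chosen for illustration. The delay doubles after each failed attempt, is capped at `max_delay`, and is randomized ("full jitter") so that parallel workers do not retry in lockstep.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 4,
    base_delay: float = 2.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call() until it succeeds, backing off exponentially between attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error unchanged
            # Upper bound doubles each time (base, 2*base, 4*base, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: wait a random duration between 0 and that bound.
            sleep(random.uniform(0, delay))
    raise RuntimeError("unreachable")  # the loop always returns or raises


# Hypothetical usage with the snippet above (not run here). It assumes the
# timeout surfaces as httpx.ReadTimeout; the traceback below is cut off before
# the final exception type, so check which exception your SDK version raises.
#
# response = retry_with_backoff(
#     lambda: deepgram.listen.prerecorded.v("1").transcribe_file(
#         payload, options, timeout=httpx.Timeout(300.0, connect=10.0)
#     ),
#     retry_on=(httpx.ReadTimeout,),
# )
```

Keep in mind that each retry re-uploads the whole buffer, so the worst-case total time is roughly `max_attempts` times the read timeout plus the waits in between.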
These are the error logs:
[INFO] 2024-06-24T13:12:34.175Z 619e77ea-ea1b-4947-9b55-248653553619 Traceback (most recent call last):
  File "/var/lang/lib/python3.10/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
    yield
  File "/var/lang/lib/python3.10/site-packages/httpx/_transports/default.py", line 233, in handle_request
    resp = self._pool.handle_request(req)
  File "/var/lang/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 216, in handle_request
    raise exc from None
  File "/var/lang/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 196, in handle_request
    response = connection.handle_request(
  File "/var/lang/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 101, in handle_request
    return self._connection.handle_request(request)
  File "/var/lang/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 143, in handle_request
    raise exc
  File "/var/lang/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 113, in handle_request
    ) = self._receive_response_headers(**kwargs)
  File "/var/lang/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 186, in _receive_response_headers
    event = self._receive_event(timeout=timeout)
  File "/var/lang/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 224, in _receive_event
    data = self._network_stream.read(
  File "/var/lang/lib/python3.10/site-packages/httpcore/_backends/sync.py", line 124, in read
    with map_exceptions(exc_map):
  File "/var/lang/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/var/lang/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ReadTimeout: The read operation timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
If you need to increase the timeout because the file is larger, you can follow the docs here:
https://developers.deepgram.com/docs/python-sdk-pre-recorded-transcription#increasing-the-timeout-for-processing-larger-files
I'm already doing that. Can I increase it to more than 300 seconds as well? @dvonthenen
For example, like this for 10 minutes:
response = deepgram.listen.prerecorded.v("1").transcribe_file(
    payload, options, timeout=httpx.Timeout(600.0, connect=10.0)
)
Of course! That's the point of the timeout field. We have a default (you don't need to specify the timeout parameter at all), but if you need something other than the default, you provide the timeout parameter plus whatever values you want.