cannot decode files longer than 1 minute

Question

jamalw opened this issue 6 years ago · 19 comments

When I attempt to run the command quail.decode_speech() I receive an error:

"Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter."

Answer 1 · 2018-10-03T02:43:17.000Z

Here is the full error message:

_Rendezvous Traceback (most recent call last)
/Users/jamalw/anaconda3/lib/python3.6/site-packages/google/gax/retry.py in inner(*args)
120 to_call = add_timeout_arg(a_func, timeout, **kwargs)
--> 121 return to_call(*args)
122 except Exception as exception: # pylint: disable=broad-except

/Users/jamalw/anaconda3/lib/python3.6/site-packages/google/gax/retry.py in inner(*args)
67 updated_args = args + (timeout,)
---> 68 return a_func(*updated_args, **kwargs)
69

/Users/jamalw/anaconda3/lib/python3.6/site-packages/grpc/_channel.py in call(self, request, timeout, metadata, credentials)
531 state, call, = self._blocking(request, timeout, metadata, credentials)
--> 532 return _end_unary_response_blocking(state, call, False, None)
533

/Users/jamalw/anaconda3/lib/python3.6/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline)
465 else:
--> 466 raise _Rendezvous(state, None, None, deadline)
467

_Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter."
debug_error_string = "{"created":"@1538534542.268359000","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1099,"grpc_message":"Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter.","grpc_status":3}"

During handling of the above exception, another exception occurred:

RetryError Traceback (most recent call last)
in ()
1 import quail
2
----> 3 recall_data = quail.decode_speech('/Users/jamalw/Dropbox/music_context_reinstatement/data/MCR_092518_0/data/mono.wav', keypath='/Users/jamalw/Desktop/PNI/music_context_reinstatement/music-context-reinstatement-9d76d8344a75.json')

/Users/jamalw/anaconda3/lib/python3.6/site-packages/quail/decode_speech.py in decode_speech(path, keypath, save, speech_context, sample_rate, max_alternatives, language_code, enable_word_time_offsets, return_raw)
223 # decode file
224 results = decode_file(f, client, speech_context, sample_rate,
--> 225 max_alternatives, enable_word_time_offsets)
226
227 # parsing response

/Users/jamalw/anaconda3/lib/python3.6/site-packages/quail/decode_speech.py in decode_file(file_path, client, speech_context, sample_rate, max_alternatives, enable_word_time_offsets)
141 results = []
142 for idx, chunk in enumerate(audio_chunks):
--> 143 results.append(recognize(chunk, file_path+str(idx)))
144
145 # return list of results

/Users/jamalw/anaconda3/lib/python3.6/site-packages/quail/decode_speech.py in recognize(chunk, file_path)
107 # run speech decoding
108 try:
--> 109 result = client.recognize(opts, sample)
110 except ValueError as e:
111 print(e)

/Users/jamalw/anaconda3/lib/python3.6/site-packages/google/cloud/gapic/speech/v1/speech_client.py in recognize(self, config, audio, options)
199 """
200 request = cloud_speech_pb2.RecognizeRequest(config=config, audio=audio)
--> 201 return self._recognize(request, options)
202
203 def long_running_recognize(self, config, audio, options=None):

/Users/jamalw/anaconda3/lib/python3.6/site-packages/google/gax/api_callable.py in inner(request, options)
450 func, this_settings.timeout, **this_settings.kwargs)
451 api_call = _catch_errors(api_call, gax.config.API_ERRORS)
--> 452 return api_caller(api_call, this_settings, request)
453
454 if settings.page_descriptor:

/Users/jamalw/anaconda3/lib/python3.6/site-packages/google/gax/api_callable.py in base_caller(api_call, _, *args)
436 def base_caller(api_call, _, *args):
437 """Simply call api_call and ignore settings."""
--> 438 return api_call(*args)
439
440 def inner(request, options=None):

/Users/jamalw/anaconda3/lib/python3.6/site-packages/google/gax/api_callable.py in inner(*args, **kwargs)
374 """Wraps specified exceptions"""
375 try:
--> 376 return a_func(*args, **kwargs)
377 # pylint: disable=catching-non-exception
378 except tuple(to_catch) as exception:

/Users/jamalw/anaconda3/lib/python3.6/site-packages/google/gax/retry.py in inner(*args)
125 raise errors.RetryError(
126 'Exception occurred in retry method that was not'
--> 127 ' classified as transient', exception)
128
129 exc = errors.RetryError(

RetryError: RetryError(Exception occurred in retry method that was not classified as transient, caused by <_Rendezvous of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter."
debug_error_string = "{"created":"@1538534542.268359000","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1099,"grpc_message":"Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter.","grpc_status":3}"

)

Answer 2 · 2018-10-03T11:36:06.000Z

thanks @jamalw - could you also post the code that generated the error, and the output of pip freeze?

Answer 3 · 2018-10-03T11:41:21.000Z

nm, i see the code you used in the error trace, so just the output of pip freeze please!

Answer 4 · 2018-10-03T14:23:01.000Z

here is my pip freeze output:

alabaster==0.7.9
anaconda-client==1.6.0
anaconda-navigator==1.5
anaconda-project==0.4.1
appnope==0.1.0
appscript==1.0.1
apptools==4.4.0
astroid==1.4.9
astropy==1.3
audioread==2.1.5
Babel==2.3.4
backports.shutil-get-terminal-size==1.0.0
backports.weakref==1.0rc1
beautifulsoup4==4.5.3
bitarray==0.8.1
blaze==0.10.1
bleach==1.5.0
bokeh==0.12.4
boto==2.45.0
Bottleneck==1.2.0
brainiak==0.5
cachetools==2.1.0
certifi==2018.8.24
cffi==1.9.1
chardet==3.0.4
chest==0.2.3
click==6.7
cloudpickle==0.2.2
clyent==1.2.2
colorama==0.3.7
conda==4.3.14
configobj==5.0.6
contextlib2==0.5.4
cryptography==1.7.1
cycler==0.10.0
Cython==0.26.1
cytoolz==0.8.2
dask==0.13.0
datashape==0.5.4
decorator==4.0.11
deepdish==0.3.6
dill==0.2.5
docutils==0.13.1
et-xmlfile==1.0.1
eyeD3==0.8.4
fastcache==1.0.2
Flask==0.12
Flask-Cors==3.0.2
future==0.16.0
gapic-google-cloud-datastore-v1==0.15.3
gapic-google-cloud-error-reporting-v1beta1==0.15.3
gapic-google-cloud-logging-v2==0.91.3
gevent==1.2.1
google-api-core==0.1.4
google-auth==1.5.1
google-cloud==0.33.1
google-cloud-bigquery==0.28.0
google-cloud-bigquery-datatransfer==0.1.1
google-cloud-bigtable==0.28.1
google-cloud-container==0.1.1
google-cloud-core==0.28.1
google-cloud-datastore==1.4.0
google-cloud-dns==0.28.0
google-cloud-error-reporting==0.28.0
google-cloud-firestore==0.28.0
google-cloud-language==1.0.2
google-cloud-logging==1.4.0
google-cloud-monitoring==0.28.1
google-cloud-pubsub==0.30.1
google-cloud-resource-manager==0.28.1
google-cloud-runtimeconfig==0.28.1
google-cloud-spanner==0.29.0
google-cloud-speech==0.30.0
google-cloud-storage==1.6.0
google-cloud-trace==0.17.0
google-cloud-translate==1.3.1
google-cloud-videointelligence==1.0.1
google-cloud-vision==0.29.0
google-gax==0.15.16
google-resumable-media==0.3.1
googleapis-common-protos==1.5.3
greenlet==0.4.11
grpc-google-iam-v1==0.11.4
grpcio==1.15.0
h5py==2.6.0
HeapDict==1.0.0
hmmlearn==0.2.0
html5lib==0.9999999
httplib2==0.11.3
hypertools==0.1.7
idna==2.7
imagesize==0.7.1
ipykernel==4.5.2
ipython==5.1.0
ipython-genutils==0.1.0
ipywidgets==5.2.2
isort==4.2.5
itsdangerous==0.24
jdcal==1.3
jedi==0.9.0
Jinja2==2.9.4
joblib==0.12.5
jsonschema==2.5.1
jupyter==1.0.0
jupyter-client==4.4.0
jupyter-console==5.0.0
jupyter-core==4.2.1
Keras==2.0.6
kiwisolver==1.0.1
lazy-object-proxy==1.2.2
librosa==0.5.1
llvmlite==0.15.0
locket==0.2.0
lxml==3.7.2
Markdown==2.6.8
MarkupSafe==0.23
matplotlib==3.0.0
mayavi==4.5.0
mistune==0.7.3
mpi4py==2.0.0
mpmath==0.19
multipledispatch==0.4.9
multiprocess==0.70.5
nbconvert==4.2.0
nbformat==4.2.0
networkx==1.11
nibabel==2.1.0
nilearn==0.2.6
nitime==0.7
nltk==3.2.2
nose==1.3.7
notebook==4.3.1
numba==0.30.1
numexpr==2.6.8
numpy==1.15.2
numpydoc==0.6.0
oauth2client==3.0.0
odo==0.5.0
openpyxl==2.4.1
pandas==0.23.4
partd==0.3.7
pathlib==1.0.1
pathlib2==2.2.0
pathos==0.2.0
patsy==0.4.1
PeakUtils==1.1.1
pep8==1.7.0
pexpect==4.2.1
pickleshare==0.7.4
Pillow==4.0.0
ply==3.8
pox==0.2.3
ppca==0.0.2
ppft==1.6.4.7.1
prompt-toolkit==1.0.9
proto-google-cloud-datastore-v1==0.90.4
proto-google-cloud-error-reporting-v1beta1==0.15.3
proto-google-cloud-logging-v2==0.91.3
protobuf==3.3.0
psutil==5.4.7
ptyprocess==0.5.1
py==1.4.32
pyasn1==0.4.4
pyasn1-modules==0.2.2
pyAudioAnalysis==0.1.3
pybind11==2.2.0
pycosat==0.6.1
pycparser==2.17
pycrypto==2.6.1
pycurl==7.43.0
pydub==0.20.0
pyface==5.1.0
pyflakes==1.5.0
pygame==1.9.3
Pygments==2.1.3
pylint==1.6.4
pymanopt==0.2.2
pyOpenSSL==16.2.0
pyparsing==2.2.2
pysurfer==0.8.0
pytest==3.0.5
python-dateutil==2.7.3
python-magic==0.4.13
pytz==2018.5
PyYAML==3.12
pyzmq==16.0.2
QtAwesome==0.4.3
qtconsole==4.2.1
QtPy==1.2.1
quail==0.2.0
redis==2.10.5
requests==2.19.1
resampy==0.1.5
rope-py3k==0.9.4.post1
rsa==4.0
scikit-image==0.12.3
scikit-learn==0.19.0
scipy==1.1.0
seaborn==0.9.0
simplegeneric==0.8.1
simplejson==3.12.0
singledispatch==3.4.0.3
six==1.11.0
sklearn==0.0
snowballstemmer==1.2.1
sockjs-tornado==1.0.3
SoundFile==0.9.0.post1
Sphinx==1.5.1
spyder==3.1.2
SQLAlchemy==1.1.5
statsmodels==0.6.1
sympy==1.0
tables==3.4.4
tensorflow==1.2.1
terminado==0.6
Theano==0.9.0
toolz==0.8.2
tornado==4.4.2
traitlets==4.3.1
traits==4.6.0
traitsui==5.1.0
typing==3.6.2
unicodecsv==0.14.1
urllib3==1.23
utils==0.9.0
vtk==8.1.0
wcwidth==0.1.7
Werkzeug==0.11.15
widgetsnbextension==1.2.6
wrapt==1.10.8
wxPython==4.0.1
xlrd==1.0.0
XlsxWriter==0.9.6
xlwings==0.10.2
xlwt==1.2.0

Answer 5 · 2018-10-03T18:17:04.000Z

One thing that's strange is that the traceback shows no sign of running lines 129-138 (segment into 1 minute chunks) of decode_speech.py. This is almost as though it fails at line 126 (read in wav) except this isn't in the traceback either. I'm assuming this may have something to do with my wav format or wav header info but I compared these wav files to one that actually doesn't produce the error (because it's shorter than 60 seconds) and the format and header info look the same except for the duration.
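A header comparison like the one described above can be sketched with Python's standard `wave` module. The helper names `describe_wav` and `header_diff` are hypothetical and are not part of quail. Note that `wave` only handles uncompressed PCM files, and the frame count it reports is derived from the size recorded in the file's header, so this is a rough diagnostic rather than a full validity check.

```python
import wave


def describe_wav(path):
    """Return the header fields the wave module reads from a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate": rate,
            "frames": frames,
            "duration_s": frames / rate,
        }


def header_diff(path_a, path_b):
    """Return {field: (value_a, value_b)} for every header field that differs."""
    a, b = describe_wav(path_a), describe_wav(path_b)
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}
```

Calling `header_diff("works.wav", "fails.wav")` (placeholder file names) on two recordings that share a format should return only the `frames` and `duration_s` entries.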

Answer 6 · 2018-10-04T14:19:49.000Z

Problem solved! My wav header/format was indeed the issue. I don't know the specifics, but when I convert my wav file to an mp3 and then back to a wav file, the decode_speech function finally accepts my file.

Answer 7 · 2018-10-04T15:21:09.000Z

@jamalw would you be willing to share (either by uploading a file here or privately through email) the wav file that was giving you trouble, and also the mp3 file that worked, so that we can investigate this and decide how to proceed? E.g. if wav files are buggy or unsupported, we should (at minimum) document that somewhere. Or if this is a quail-specific bug (or one that we could potentially fix through wrapper functions) we could explore that too.

Answer 8 · 2018-10-04T15:27:49.000Z

No problem. I've attached a zip file with four files. test.wav is the file that doesn't work. test.mp3 is the converted file. mp3_to_wav.wav is the new wav file converted from the mp3. And mono.wav is the mp3_to_wav.wav file converted from stereo to mono using ffmpeg.

Archive.zip

Answer 9 · 2018-10-04T15:28:17.000Z

thanks! 🎉

Answer 10 · 2018-11-28T23:42:30.000Z

Hello, I am Jin, a research assistant at the Polyn Lab at Vanderbilt. Thank you for the amazing package. While using quail's decode_speech, I ran into the same problem when decoding wav files longer than 1 minute (90 seconds in my case). I don't get any error messages. decode_speech appears to run fine: it first prints that it is splitting the file into 2 one-minute segments, then starts printing the words with their onset and offset times. However, anything recalled after the 1-minute mark is simply not shown. I only recently found out about this, and was wondering whether the issue has been addressed!

Answer 11 · 2018-11-29T15:02:26.000Z

Hi Jin, I did resolve my problem by padding my audio files with zeros. I don't know if we are having the same problem, but maybe try this code:

import numpy as np
from scipy.io import wavfile

fs, samples = wavfile.read("corrupted_file.wav")
# append 5 seconds of zeros (silence) to the end of the recording
samples = np.pad(samples, (0, int(fs * 5)), mode='constant', constant_values=0)
wavfile.write("fixed_file.wav", fs, samples)

- Jamal
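One caveat on the padding snippet above: `np.pad` with a single `(0, n)` pair applies to every axis, so it suits a 1-D mono array but would also widen the channel axis of a 2-D stereo array. As a hedged alternative, here is a minimal sketch that appends silence using only the standard `wave` module, which works on raw frames and so does not depend on the channel count. The function name `pad_wav_with_silence` is hypothetical, and the sketch assumes an uncompressed PCM file.

```python
import wave


def pad_wav_with_silence(src_path, dst_path, seconds=5.0):
    """Copy a PCM WAV file and append `seconds` of silence to the end.

    Returns the number of silent frames that were appended.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        audio = src.readframes(src.getnframes())

    frame_size = params.sampwidth * params.nchannels
    extra_frames = int(params.framerate * seconds)
    # 8-bit PCM is unsigned (silence is 0x80); wider samples are signed (silence is 0x00).
    silence_byte = b"\x80" if params.sampwidth == 1 else b"\x00"
    silence = silence_byte * (extra_frames * frame_size)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        # writeframes updates the frame count in the header when the file is closed.
        dst.writeframes(audio + silence)
    return extra_frames
```

Because the file is rewritten through `wave`, the output header is regenerated from the data actually written.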

Answer 12 · 2018-11-29T15:28:36.000Z

Thanks for posting your code @jamalw.

@jeon11 - Have you verified that there is speech after 60 seconds? If so, can you post the file?

Answer 13 · 2018-11-29T19:23:52.000Z

Hello @jamalw and @andrewheusser ,

Thanks so much for the quick replies. I actually figured out that speech after 60 seconds is being annotated properly in the pickle file. I had only been looking at the resulting txt file, which shows the onset and offset times.

Since I'm not familiar with using pickle, I'm still working my way through it, but I guess the issue now is that the onset/offset text file does not show items recalled after 60 seconds. In the attached file, the txt file does not show the last two items (whistle and Snoop Dog), while loading the pickle file does correctly show the whistle and Snoop Dog.

Below, I attached the zip file with the wav, resulting pickle, and txt files.
example.zip

Thanks again!

Answer 14 · 2018-12-03T13:39:45.000Z

I think I've identified the bug. The parse_response function only looks at the first 60 second chunk of audio, whereas the "raw" pickle files save out all the data:
https://github.com/ContextLab/quail/blob/master/quail/decode_speech.py#L148-L168

@paxtonfitzpatrick - could you extend the parse_response function to work over a list of response objects, and then release the patch?
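To illustrate the kind of change being requested, here is a minimal sketch of parsing a list of per-chunk results instead of only the first one. Everything in it is an assumption for illustration: the input structure (one list of plain dicts per chunk), the function name `merge_chunk_words`, and the premise that word times are reported relative to the start of each chunk. It is not quail's actual `parse_response` nor the Google Speech response type.

```python
def merge_chunk_words(chunk_responses, chunk_seconds=60.0):
    """Flatten per-chunk word lists into one list with file-relative times.

    `chunk_responses` is a hypothetical structure: one list per audio chunk,
    each holding dicts with "word", "start" and "end" keys, where the times
    are seconds from the start of that chunk.
    """
    merged = []
    for idx, words in enumerate(chunk_responses):
        # Shift every word by the start time of the chunk it came from.
        offset = idx * chunk_seconds
        for w in words:
            merged.append({
                "word": w["word"],
                "start": w["start"] + offset,
                "end": w["end"] + offset,
            })
    return merged
```

The key point is the loop over every chunk: a parser that reads only `chunk_responses[0]` would silently drop everything after the first 60 seconds, which matches the symptom reported above.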

Answer 15 · 2018-12-06T21:35:20.000Z

hey @jeremymanning - could you ping @paxtonfitzpatrick about this (or another lab member)? I won't have time to get to this in the next few weeks

Answer 16 · 2018-12-06T21:35:57.000Z

@andrewheusser @jeremymanning I'm just about to start looking into this now!

Answer 17 · 2018-12-06T21:38:52.000Z

Ping! ping! ping ping! Piiiiiiiinnnnnnnggggg! Oh, you're already working on it? Happy to be of use... (ping! PING! ping ping ping!)

Answer 18 · 2018-12-06T21:44:24.000Z

Note: we really need to work on our response times. @paxtonfitzpatrick, I see that it took nearly an entire minute to respond to @andrewheusser. I believe we can do better. Please run your diagnostics and present yourself to the reconditioning computer in TESTING ROOM 1 for tuning and upgrades as needed. 🤖 📈

Answer 19 · 2018-12-06T21:45:00.000Z

This is your vacation time, not some sort of "break from research."