akras14/speech-to-text

ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC

yianchen opened this issue · 2 comments

Has anyone encountered this ValueError even though the audio file is a PCM WAV? Any idea how to solve it?
ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC.

I ran fast.py with some sample WAV files and it worked perfectly! But when I tested it with audio files I collected from a website, I got a ValueError, even though the output of the soxi command says the files are PCM WAV.

I then re-ran the sample WAV files that had previously worked, but received the same error message.

Audio files I collected from a website
I downloaded Amazon's audio (https://www.youtube.com/watch?v=CxK1VhtJlNQ), converted it to a WAV file at a 16 kHz sample rate with 1 channel, and split it into small pieces with py-webrtcvad.

soxi chunk-02.wav
Input File : 'chunk-02.wav'
Channels : 1
Sample Rate : 16000
Precision : 16-bit
Duration : 00:00:03.03 = 48480 samples ~ 227.25 CDDA sectors
File Size : 97.0k
Bit Rate : 256k
Sample Encoding: 16-bit Signed Integer PCM
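
To cross-check the soxi report from Python, something like the sketch below might help. It assumes the file is ultimately opened with the standard-library `wave` module, which this thread does not confirm. `describe_wav` is a hypothetical helper name, not part of fast.py.

```python
import wave


def describe_wav(path):
    """Return basic header fields, or the error raised by the wave module.

    Hypothetical helper for diagnosis only; not part of fast.py.
    """
    try:
        with wave.open(path, "rb") as wav:
            return {
                "channels": wav.getnchannels(),
                "sample_rate": wav.getframerate(),
                "sample_width_bytes": wav.getsampwidth(),
                "frames": wav.getnframes(),
            }
    except (wave.Error, EOFError) as exc:
        # wave.Error covers malformed or unsupported headers,
        # EOFError covers truncated files.
        return {"error": str(exc)}


if __name__ == "__main__":
    print(describe_wav("chunk-02.wav"))
```

If this returns an `error` entry for a file that soxi reports as 16-bit PCM, the message should at least narrow down which part of the header Python objects to.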

Long overdue, but responding in case somebody comes across this.

I think not all WAV files are created equal, but I don't have any more details. I've run into this issue as well. Exporting as WAV from Audacity, as outlined in the original article, seemed to resolve it for me. I don't have any more feedback than that :(
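
One concrete way WAV files can differ is the format tag stored in the `fmt ` chunk. A file can carry 16-bit PCM samples but declare a tag other than plain PCM (for example WAVE_FORMAT_EXTENSIBLE, 0xFFFE), and some readers reject that. Whether that is what happened in this issue is not known. The sketch below only reads the tag so it can be checked. `wav_format_tag` and `FORMAT_NAMES` are hypothetical names made up for illustration.

```python
import struct

# A few well-known WAVE format tags; the table is illustrative, not exhaustive.
FORMAT_NAMES = {
    0x0001: "PCM",
    0x0003: "IEEE float",
    0x0006: "A-law",
    0x0007: "mu-law",
    0xFFFE: "WAVE_FORMAT_EXTENSIBLE",
}


def wav_format_tag(path):
    """Return (tag, name) read from the 'fmt ' chunk of a RIFF/WAVE file.

    Raises ValueError if the file is not RIFF/WAVE or has no 'fmt ' chunk.
    """
    with open(path, "rb") as f:
        header = f.read(12)
        if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            chunk_header = f.read(8)
            if len(chunk_header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, size = struct.unpack("<4sI", chunk_header)
            if chunk_id == b"fmt ":
                raw = f.read(2)
                if len(raw) < 2:
                    raise ValueError("truncated 'fmt ' chunk")
                (tag,) = struct.unpack("<H", raw)
                return tag, FORMAT_NAMES.get(tag, "unknown")
            # Chunks are word-aligned: skip the payload plus a pad byte if size is odd.
            f.seek(size + (size & 1), 1)


if __name__ == "__main__":
    tag, name = wav_format_tag("chunk-02.wav")
    print(f"format tag: 0x{tag:04X} ({name})")
```

If the tag turns out to be something other than 0x0001, re-exporting the file (as with Audacity above) is one way to get a plain PCM header.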

thanks @akras14, will try Audacity.