Failed to decode audio in `.mkv` files with `flac` codec [analysis and solution attached]

yuantuo666 opened this issue 6 months ago · 0 comments

TL;DR

You need to pass the correct codec explicitly, because pydub has a bug when handling audio that is not PCM-encoded. For example:

from pydub import AudioSegment
audio = AudioSegment.from_file('test.mkv', codec='flac') # specify codec

If this does not work, the audio in your mkv file may not be encoded as flac. Run ffmpeg -i test.mkv and check the detected codec:

Stream #0:1(jpn): Audio: flac, 48000 Hz, stereo, s32 (24 bit) (default)

For my mkv it is flac. Check your own mkv file for the correct codec. Reference: https://stackoverflow.com/questions/2869281/how-to-determine-video-codec-of-a-file-with-ffmpeg
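
The same check can be scripted. The following is a minimal sketch, assuming ffprobe is installed and on the PATH; the helper names first_audio_codec and probe_audio_codec are made up for this illustration and are not part of pydub.

```python
import json
import subprocess


def first_audio_codec(probe_result):
    """Return codec_name of the first audio stream in parsed ffprobe JSON, or None."""
    for stream in probe_result.get("streams", []):
        if stream.get("codec_type") == "audio":
            return stream.get("codec_name")
    return None


def probe_audio_codec(path):
    """Run ffprobe on `path` and return the first audio stream's codec name."""
    cmd = ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path]
    output = subprocess.run(
        cmd, capture_output=True, text=True, check=True
    ).stdout
    return first_audio_codec(json.loads(output))
```

The returned name could then be passed on, e.g. AudioSegment.from_file('test.mkv', codec=probe_audio_codec('test.mkv')). This was not tested beyond flac, and the codec name reported by ffprobe may not always match a decoder name that ffmpeg accepts.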

Steps to reproduce

Run:

from pydub import AudioSegment
audio = AudioSegment.from_file('test.mkv')

Expected behavior

Should load .mkv file successfully.

Actual behavior

pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

Your System configuration

Python version: 3.9
Pydub version: pydub==0.25.1
ffmpeg or avlib?: ffmpeg
ffmpeg/avlib version: 4.2.2

Is there an audio file you can include to help us reproduce?

trimed.mkv.zip

Related Issues

#175
#191
#308

Fix this bug

Testing shows that if the -acodec argument is removed, ffmpeg auto-detects the correct format. (Only flac was tested; I am not sure whether this works for other codecs.)
In the pydub/audio_segment.py file:

-        if codec:
+        if codec and codec != "auto":
             # force audio decoder
             conversion_command += ["-acodec", codec]

And use:

from pydub import AudioSegment
audio = AudioSegment.from_file('test.mkv', codec='auto')  # let ffmpeg auto-detect the codec (needs the patch above)

Details

I have an mkv file with flac as the audio codec.

Running the following test code:

from pydub import AudioSegment
file_obj = open('test.mkv', 'rb')
audio = AudioSegment.from_file(file_obj)

I got the following error message:

Traceback (most recent call last):
  File "/test/pydub_mkv.py", line 5, in <module>
    audio = AudioSegment.from_file(file_obj)
  File "/miniconda3/envs/xxx/lib/python3.9/site-packages/pydub/audio_segment.py", line 775, in from_file
    raise CouldntDecodeError(
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

Output from ffmpeg/avlib:

ffmpeg version 4.2.2 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 7.3.0 (crosstool-NG 1.23.0.449-a04d0)
  configuration: --prefix=/tmp/build/80754af9/ffmpeg_1587154242452/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeho --cc=/tmp/build/80754af9/ffmpeg_1587154242452/_build_env/bin/x86_64-conda_cos6-linux-gnu-cc --disable-doc --enable-avresample --enable-gmp --enable-hardcoded-tables --enable-libfreetype --enable-libvpx --enable-pthreads --enable-libopus --enable-postproc --enable-pic --enable-pthreads --enable-shared --enable-static --enable-version3 --enable-zlib --enable-libmp3lame --disable-nonfree --enable-gpl --enable-gnutls --disable-openssl --enable-libopenh264 --enable-libx264
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
[cache @ 0x557210c9c6c0] Inner protocol failed to seekback end : -38
[cache @ 0x557210c9c6c0] write in cache failed
    Last message repeated 80957 times
[cache @ 0x557210c9c6c0] Failed to perform internal seek
[matroska,webm @ 0x557210c9be40] Read error at pos. 974984906 (0x3a1d16ca)
[cache @ 0x557210c9c6c0] Inner protocol failed to seekback end : -38
    Last message repeated 2 times
Input #0, matroska,webm, from 'cache:pipe:0':
  Metadata:
    encoder         : libebml v1.3.4 + libmatroska v1.4.5
    creation_time   : 2019-08-27T12:20:40.000000Z
  Duration: 00:22:56.38, start: 0.000000, bitrate: 5666 kb/s
    Chapter #0:0: start 0.000000, end 112.028000
    Metadata:
      title           : 
    Chapter #0:1: start 112.028000, end 555.054000
    Metadata:
      title           : 
    Chapter #0:2: start 555.054000, end 1257.047000
    Metadata:
      title           : 
    Chapter #0:3: start 1257.047000, end 1347.054000
    Metadata:
      title           : 
    Chapter #0:4: start 1347.054000, end 1365.072000
    Metadata:
      title           : 
    Chapter #0:5: start 1365.072000, end 1376.375000
    Metadata:
      title           : 
    Stream #0:0(jpn): Video: hevc (Main 10), yuv420p10le(tv, bt709/unknown/unknown), 1920x1080, SAR 1:1 DAR 16:9, 23.98 fps, 23.98 tbr, 1k tbn, 23.98 tbc (default)
    Stream #0:1(jpn): Audio: flac, 48000 Hz, stereo, s32 (24 bit) (default)
Unknown encoder 'pcm_s0le'
[cache @ 0x557210c9c6c0] Statistics, cache hits:7 cache misses:136423

Note that the error is Unknown encoder 'pcm_s0le'. This encoder does not appear in the output of ffmpeg -encoders, and running ffmpeg -i test.mkv test.wav directly works fine. This indicates that the name pcm_s0le is the cause of the error and that it is generated by the pydub library.

Checking the pydub code, I found the following in pydub/audio_segment.py:

        if codec:
            info = None
        else:
            info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
        if info:
            audio_streams = [x for x in info['streams']
                             if x['codec_type'] == 'audio']
            # This is a workaround for some ffprobe versions that always say
            # that mp3/mp4/aac/webm/ogg files contain fltp samples
            audio_codec = audio_streams[0].get('codec_name')
            if (audio_streams[0].get('sample_fmt') == 'fltp' and
                    audio_codec in ['mp3', 'mp4', 'aac', 'webm', 'ogg']):
                bits_per_sample = 16
            else:
                bits_per_sample = audio_streams[0]['bits_per_sample'] # some bug here
            if bits_per_sample == 8:
                acodec = 'pcm_u8'
            else:
                acodec = 'pcm_s%dle' % bits_per_sample

            conversion_command += ["-acodec", acodec]

Here, the line acodec = 'pcm_s%dle' % bits_per_sample is the problem: when bits_per_sample is 0, it produces the invalid encoder name pcm_s0le, which causes the error.
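
To make the failure concrete, the snippet below reproduces only the naming step from the excerpt above as a standalone function. The name pcm_codec_name is made up for this illustration; it is not a pydub function.

```python
def pcm_codec_name(bits_per_sample):
    """Mirror the naming step from the pydub excerpt above."""
    if bits_per_sample == 8:
        return "pcm_u8"
    return "pcm_s%dle" % bits_per_sample


# 8, 16 and 24 give real encoder names; 0 gives the bogus 'pcm_s0le'.
for bits in (8, 16, 24, 0):
    print(bits, "->", pcm_codec_name(bits))
```

The last line printed is 0 -> pcm_s0le, which is exactly the encoder name ffmpeg rejects in the log above.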

The mediainfo_json function turns out to call ffprobe. Running the constructed command manually showed that bits_per_sample really is 0 for my test.mkv file.

Searching for my mkv's audio codec together with the error message, I found the following ticket:

https://trac.ffmpeg.org/ticket/3047

At the bottom of this page:

Afaict, bits_per_sample is only valid for pcm and closely related codecs. Perhaps sample_fmt is what you are searching for? A flac file (as supported by FFmpeg) can either output s16 or s32 (or s16p or s32p respectively).

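This clearly states that bits_per_sample is only meaningful for audio that uses PCM codecs. So in my case the codec should be specified as flac, as displayed by ffmpeg -encoders. Passing codec also makes pydub skip the ffprobe-based branch in the excerpt above (info is set to None), so the invalid pcm_s0le name is never built.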
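
Following the ticket's hint about sample_fmt, one possible direction for a more robust selection is sketched below. This is only an assumption about how a fix could look, not what pydub does: the function choose_pcm_encoder and the mapping table are hypothetical, and the sketch deliberately leaves out pydub's fltp workaround. It falls back to sample_fmt when bits_per_sample is 0 and returns None when nothing matches, meaning "do not pass -acodec and let ffmpeg choose".

```python
# Hypothetical mapping from integer sample formats (packed and planar)
# to a matching little-endian PCM encoder.
_SAMPLE_FMT_TO_PCM = {
    "u8": "pcm_u8", "u8p": "pcm_u8",
    "s16": "pcm_s16le", "s16p": "pcm_s16le",
    "s32": "pcm_s32le", "s32p": "pcm_s32le",
}


def choose_pcm_encoder(stream):
    """Pick a PCM encoder for one ffprobe audio-stream dict, or None.

    None means: do not pass -acodec and let ffmpeg use its default.
    """
    bits = stream.get("bits_per_sample") or 0
    if bits == 8:
        return "pcm_u8"
    if bits > 0:
        return "pcm_s%dle" % bits
    # bits_per_sample is 0 for non-PCM codecs such as flac,
    # so fall back to the decoded sample format.
    return _SAMPLE_FMT_TO_PCM.get(stream.get("sample_fmt"))
```

For the stream in this issue (flac, sample_fmt s32, bits_per_sample 0) the sketch would select pcm_s32le instead of building pcm_s0le.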

A..... flac                 FLAC (Free Lossless Audio Codec)

Change the test code to:

from pydub import AudioSegment
file_obj = open('test.mkv', 'rb')
audio = AudioSegment.from_file(file_obj, codec='flac')

This makes it work!