`Unable to process >4GB files` error for M4B files much smaller than 4GB
domkm opened this issue · 7 comments
Steps to reproduce
Use AudioSegment.from_file(file_path) to load a moderately large M4B file. I am using one which is 638 MB.
In case it's relevant, ffprobe -i [MY_M4B_FILE]
shows chapter information plus
Stream #0:0[0x1](eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
Metadata:
creation_time : 2023-06-13T16:57:01.000000Z
handler_name : ?Apple Sound Media Handler
vendor_id : [0][0][0][0]
Stream #0:1[0x2](eng): Data: bin_data (text / 0x74786574) (default)
Metadata:
creation_time : 2023-06-13T16:57:01.000000Z
handler_name : ?Apple Text Media Handler
Stream #0:2[0x0]: Video: mjpeg (Progressive), yuvj420p(pc, bt470bg/unknown/unknown), 2400x2400 [SAR 1:1 DAR 1:1], 90k tbr, 90k tbn (attached pic)
Expected behavior
AudioSegment.from_file(file_path) should load files <4GB without error.
Actual behavior
AudioSegment.from_file(file_path) throws an Unable to process >4GB files error.
Your System configuration
- Python version: 3.10.13
- Pydub version: 0.25.1
- ffmpeg or avlib?: ffmpeg
- ffmpeg/avlib version:
ffmpeg version 6.0 Copyright (c) 2000-2023 the FFmpeg developers built with Apple clang version 14.0.3 (clang-1403.0.22.14.1) configuration: --prefix=/opt/homebrew/Cellar/ffmpeg/6.0-with-options_4 --enable-shared --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libaom --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-libsnappy --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-demuxer=dash --enable-opencl --enable-audiotoolbox --enable-videotoolbox --enable-neon --disable-htmlpages --enable-libfdk-aac --enable-nonfree
Is there an audio file you can include to help us reproduce?
The audio file in question is not public domain. If this is not expected behavior and will be fixed, I can work on providing a public domain file to reproduce the issue.
Hi,
I got the same problem with an MP3 file, unfortunately also not public domain.
My file is 276 MB.
Python 3.12.0, Win11, 64-bit
pydub==0.25.1 (installed via pip)
Called via:
from pydub import AudioSegment
path = "my full path without spaces.mp3"
AudioSegment.from_file(path, format="mp3")
Stacktrace
File "C:\projects\python\audio_speed\Lib\site-packages\pydub\audio_segment.py", line 142, in fix_wav_headers
raise CouldntDecodeError("Unable to process >4GB files")
pydub.exceptions.CouldntDecodeError: Unable to process >4GB files
ffprobe -i
...
Metadata
...
encoder : LAME 64bits version 3.100.1 (https://lame.sourceforge.io)
...
Duration: 06:45:47.92, start: 0.025056, bitrate: 92 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 92 kb/s
Metadata:
encoder : LAME3.100
Side data:
replaygain: track gain - -4.500000, track peak - unknown, album gain - unknown, album peak - unknown,
Stream #0:1: Video: mjpeg (Baseline), yuvj420p(pc, bt470bg/unknown/unknown), 500x500 [SAR 1:1 DAR 1:1], 90k tbr, 90k tbn (attached pic)
Metadata:
comment : Other
The error message is a little misleading: what it should really say is that it can't process uncompressed (i.e., wav) files larger than 4GB.
What happens behind the scenes is that pydub converts the mp3 file into a wav file. In some simple tests I ran, a straightforward conversion (i.e., ffmpeg -i audio.m4a audio.wav) produced a wav file about 10 times larger.
Maybe pydub does something a bit smarter to reduce the size of the resulting wav file, but I suspect ultimately the same thing happens.
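A quick sanity check with the numbers already in this thread backs that up. Assuming pydub decodes to 16-bit PCM (its default sample width), the 6:45:47.92, 44100 Hz stereo MP3 reported above expands to fractionally more than 2**32 bytes, which is exactly where the error fires:

```python
# Back-of-the-envelope size of the decoded WAV for the MP3 above
# (44100 Hz stereo, 06:45:47.92; 16-bit PCM assumed as pydub's default).
sample_rate = 44_100                      # Hz, from the ffprobe output
channels = 2                              # stereo
bytes_per_sample = 2                      # 16-bit PCM
duration_s = 6 * 3600 + 45 * 60 + 47.92   # Duration: 06:45:47.92

wav_bytes = sample_rate * channels * bytes_per_sample * duration_s
print("over 2**32:", wav_bytes > 2**32)   # → over 2**32: True
```

So a 276 MB MP3 really can sit just past the 4 GiB uncompressed boundary; the compressed size tells you almost nothing about whether you'll hit the limit.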
Is there a reason this 4GB limit exists? I have a lot more RAM than that nowadays.
E: Oh that's why.
https://stackoverflow.com/a/27880366/1149933
.wav files cannot be bigger than 4GB as the wave file format specification prevents that
That's annoying.
I'm interested in brainstorming solutions to this problem because I work with very long (10 hours or more) field recordings that I make.
Has anyone got any ideas? I don't know if it's a good idea, but we could implement a wav chunk manager that stays within the spec limit. I'm not sure how I would control any artifacts that would probably be introduced when the chunks are mixed back down, though.
Open to other suggestions.
My impression is that you don't want to work on 10 hours of audio at once - it should probably be streamed into and out of memory in a circular buffer.
An "idea guy" is the lowest paying role in the world though, so me saying this isn't really that helpful if I'm not going to implement it.
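One way to sketch that streaming idea, as a hypothetical helper rather than anything pydub provides: have ffmpeg decode to raw PCM on stdout and consume it in fixed-size windows, so no >4 GiB wav is ever materialized. The function name and chunk size here are illustrative.

```python
import subprocess

def iter_pcm_chunks(path, chunk_seconds=60, sample_rate=44100, channels=2):
    """Yield raw 16-bit PCM chunks decoded by ffmpeg, never holding
    the whole recording in memory. Hypothetical helper, not a pydub API."""
    bytes_per_second = sample_rate * channels * 2  # 16-bit samples
    proc = subprocess.Popen(
        ["ffmpeg", "-v", "quiet", "-i", path,
         "-f", "s16le", "-ac", str(channels), "-ar", str(sample_rate), "-"],
        stdout=subprocess.PIPE,
    )
    try:
        while True:
            chunk = proc.stdout.read(bytes_per_second * chunk_seconds)
            if not chunk:
                break
            yield chunk
    finally:
        proc.stdout.close()
        proc.wait()
```

Each yielded chunk could then be wrapped with AudioSegment(data=chunk, sample_width=2, frame_rate=sample_rate, channels=channels) for per-window processing, sidestepping the wav header entirely.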
You can use a command like ffmpeg -i input.mp3 -ss 0 -t 30 output.mp3 first to split files into smaller chunks.
Had the same problem, but now it works.
From Python:
import subprocess
command = ["ffmpeg", "-i", input_path, "-ss", str(start_time), "-t", str(duration), output_path]
subprocess.run(command, check=True)
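For reference, here is a self-contained version of that workaround. The function names, chunk length, and output naming are illustrative, not from the thread; -c copy is used so ffmpeg splits without re-encoding.

```python
import math
import subprocess

def ffmpeg_split_cmd(input_path, output_path, start_s, chunk_s):
    # -ss/-t select the chunk window; -c copy avoids re-encoding;
    # -y overwrites existing output files.
    return ["ffmpeg", "-v", "quiet", "-y",
            "-ss", str(start_s), "-t", str(chunk_s),
            "-i", input_path, "-c", "copy", output_path]

def split_file(input_path, total_s, chunk_s=1800):
    """Split a long recording into 30-minute pieces, each small enough
    for AudioSegment.from_file(). Returns the chunk paths."""
    paths = []
    for i in range(math.ceil(total_s / chunk_s)):
        out = f"chunk_{i:03d}.mp3"
        subprocess.run(ffmpeg_split_cmd(input_path, out, i * chunk_s, chunk_s),
                       check=True)
        paths.append(out)
    return paths
```

The 6:45:47 file from this thread (about 24,348 seconds) would yield 14 half-hour chunks, each loadable individually with AudioSegment.from_file(path, format="mp3").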
I had a similar use case and I made an app. It operates on large audio files with pydub and multiprocessing: https://github.com/Tikhvinskiy/Smart-audio-splitter
It works fine.