pro100andrey/lame

Facing issue converting an m4a to mp3

tryWabbit opened this issue · 13 comments

Thanks for creating this library!

I'm trying to convert an M4A file to MP3, but I'm getting corrupt audio. I'm using the example project and only replaced the input file with my M4A file.

In the debugger I'm getting

void * _Nullable NSMapGet(NSMapTable * _Nonnull, const void * _Nullable): map table argument is NULL

The file is exported, but it is corrupt. Any suggestions or feedback on what I'm doing wrong?

Hi @tryWabbit, you are not decoding the M4A audio into raw audio data before sending it to LAME.

LAME expects PCM audio as input. "PCM audio" means uncompressed audio.

Thanks for the quick reply, @pro100andrey! I have limited experience with audio conversion. Can you explain in detail what changes I need to make to get an MP3 file out of an M4A file? I really appreciate your help!

Certainly, @tryWabbit! Converting an M4A file to MP3 involves a few steps. I'll explain in detail:

  • Decode the M4A file: Decode the M4A audio into raw PCM data. M4A is a compressed format, so it must be decompressed before it can be encoded to MP3. You can use a library such as FFmpeg, or AVFoundation in Swift. The result is uncompressed PCM audio. See: Convert m4a to WAV.

  • Encode to MP3: Once you have the raw PCM data, encode it to MP3. LAME is a popular library for this. Make sure you pass raw (uncompressed) audio data to LAME. See: AudioConverter.swift.

  • Set appropriate MP3 parameters: When encoding to MP3, you can set parameters such as bitrate, sample rate, and quality. Adjust them to your needs. Higher bitrates typically give better audio quality but larger files. See: AudioConverter.swift.

  • Save the MP3 file: Finally, write the encoded MP3 data to a file. Make sure you provide a valid output path.
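To make the bitrate and file size trade-off concrete, here is a small back-of-the-envelope sketch. It assumes constant-bitrate encoding and ignores tags and frame padding; the function name `estimatedMp3Size` is made up for illustration and is not part of LAME.

```swift
/// Approximate size in bytes of a constant-bitrate MP3 stream.
/// Hypothetical helper for illustration; ignores ID3 tags and padding.
func estimatedMp3Size(bitrateKbps: Int, durationSeconds: Int) -> Int {
    // kilobits per second -> bytes per second, then multiply by the duration.
    return bitrateKbps * 1000 / 8 * durationSeconds
}

// A three-minute recording at two common bitrates.
let at128 = estimatedMp3Size(bitrateKbps: 128, durationSeconds: 180)
let at320 = estimatedMp3Size(bitrateKbps: 320, durationSeconds: 180)

print(at128) // 2880000 bytes, roughly 2.9 MB
print(at320) // 7200000 bytes, roughly 7.2 MB
```

The size grows linearly with the bitrate, so picking a bitrate is mostly a question of how much quality the material needs.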

Thanks!

@tryWabbit

If you're recording audio directly from a microphone and want to save it in MP3 format, you can use the LAME library to convert PCM audio to MP3 on the fly. LAME provides the capability to encode audio into MP3 format while recording or streaming.

For example, see AURenderCallback.

Thank you so much for the detailed explanation, @pro100andrey! I really appreciate your help. I will try it and update you.

Hey @pro100andrey, I tried your solution and it worked! The only problem I have right now is that the converted audio is half the expected duration and plays as if it's in fast forward. I suspect there is a mismatch between the format I'm converting to and the settings your converter uses. Can you verify?

This is the method I'm using to convert M4A to WAV. The output is fine and matches the original audio.

    func convertAudio(_ url: URL, outputURL: URL) {
        var error : OSStatus = noErr
        var destinationFile : ExtAudioFileRef? = nil
        var sourceFile : ExtAudioFileRef? = nil

        var srcFormat : AudioStreamBasicDescription = AudioStreamBasicDescription()
        var dstFormat : AudioStreamBasicDescription = AudioStreamBasicDescription()

        ExtAudioFileOpenURL(url as CFURL, &sourceFile)

        var thePropertySize: UInt32 = UInt32(MemoryLayout.stride(ofValue: srcFormat))

        ExtAudioFileGetProperty(sourceFile!,
            kExtAudioFileProperty_FileDataFormat,
            &thePropertySize, &srcFormat)
        
        dstFormat.mSampleRate = 44100  //Set sample rate
        dstFormat.mFormatID = kAudioFormatLinearPCM
        dstFormat.mChannelsPerFrame = 1
        dstFormat.mBitsPerChannel = 16
        dstFormat.mBytesPerPacket = 2 * dstFormat.mChannelsPerFrame
        dstFormat.mBytesPerFrame = 2 * dstFormat.mChannelsPerFrame
        dstFormat.mFramesPerPacket = 1
        dstFormat.mFormatFlags = kLinearPCMFormatFlagIsPacked |
            kAudioFormatFlagIsSignedInteger


        // Create destination file
        error = ExtAudioFileCreateWithURL(
            outputURL as CFURL,
            kAudioFileWAVEType,
            &dstFormat,
            nil,
            AudioFileFlags.eraseFile.rawValue,
            &destinationFile)
        reportError(error: error)

        // Have the source file decode into dstFormat when it is read.
        error = ExtAudioFileSetProperty(sourceFile!,
                                        kExtAudioFileProperty_ClientDataFormat,
                                        thePropertySize,
                                        &dstFormat)
        reportError(error: error)

        // Tell the destination file that the data written to it is in dstFormat.
        error = ExtAudioFileSetProperty(destinationFile!,
                                        kExtAudioFileProperty_ClientDataFormat,
                                        thePropertySize,
                                        &dstFormat)
        reportError(error: error)

        let bufferByteSize : UInt32 = 32768
        var srcBuffer = [UInt8](repeating: 0, count: 32768)
        var sourceFrameOffset : ULONG = 0

        while(true){
            var fillBufList = AudioBufferList(
                mNumberBuffers: 1,
                mBuffers: AudioBuffer(
                    mNumberChannels: dstFormat.mChannelsPerFrame, // match the mono client format
                    mDataByteSize: UInt32(srcBuffer.count),
                    mData: &srcBuffer
                )
            )
            var numFrames : UInt32 = 0

            if(dstFormat.mBytesPerFrame > 0){
                numFrames = bufferByteSize / dstFormat.mBytesPerFrame
            }

            error = ExtAudioFileRead(sourceFile!, &numFrames, &fillBufList)
            reportError(error: error)

            if(numFrames == 0){
                error = noErr;
                break;
            }
            
            sourceFrameOffset += numFrames
            error = ExtAudioFileWrite(destinationFile!, numFrames, &fillBufList)
            reportError(error: error)
        }
        
        error = ExtAudioFileDispose(destinationFile!)
        reportError(error: error)
        error = ExtAudioFileDispose(sourceFile!)
        reportError(error: error)
    }

Your method for converting WAV to MP3:

    class func encodeToMp3(
        inPcmPath: String,
        outMp3Path: String,
        onProgress: @escaping (Float) -> (Void),
        onComplete: @escaping () -> (Void)
    ) {

        encoderQueue.async {

            let lame = lame_init()
            lame_set_in_samplerate(lame, 44100)
            lame_set_out_samplerate(lame, 0)
            lame_set_brate(lame, 0)
            lame_set_quality(lame, 4)
            lame_set_VBR(lame, vbr_off)
            lame_init_params(lame)
            

            let pcmFile: UnsafeMutablePointer<FILE> = fopen(inPcmPath, "rb")
            fseek(pcmFile, 0 , SEEK_END)
            
            let fileSize = ftell(pcmFile)
            // Skip file header.
            let pcmHeaderSize = 48 * 8
            fseek(pcmFile, pcmHeaderSize, SEEK_SET)

            let mp3File: UnsafeMutablePointer<FILE> = fopen(outMp3Path, "wb")

            let pcmSize = 1024 * 8
            let pcmbuffer = UnsafeMutablePointer<Int16>.allocate(capacity: Int(pcmSize * 2))

            let mp3Size: Int32 = 1024 * 8
            let mp3buffer = UnsafeMutablePointer<UInt8>.allocate(capacity: Int(mp3Size))

            var write: Int32 = 0
            var read = 0

            repeat {

                let size = MemoryLayout<Int16>.size * 2
                read = fread(pcmbuffer, size, pcmSize, pcmFile)
                // Progress
                if read != 0 {
                    let progress = Float(ftell(pcmFile)) / Float(fileSize)
                    DispatchQueue.main.sync { onProgress(progress) }
                }

                if read == 0 {
                    write = lame_encode_flush_nogap(lame, mp3buffer, mp3Size)
                } else {
                    write = lame_encode_buffer_interleaved(lame, pcmbuffer, Int32(read), mp3buffer, mp3Size)
                }

                fwrite(mp3buffer, Int(write), 1, mp3File)

            } while read != 0

            // Clean up
            lame_close(lame)
            fclose(mp3File)
            fclose(pcmFile)

            pcmbuffer.deallocate()
            mp3buffer.deallocate()

            DispatchQueue.main.sync { onComplete() }
        }
    }

Can you identify the difference between the two conversions that causes the audio to play too fast, at half of the actual duration?

Thanks

Hi @tryWabbit, if your file contains one channel, try calling lame_set_num_channels with 1:

/* number of channels in input stream. default=2 */
int CDECL lame_set_num_channels(lame_global_flags *, int);
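A quick way to see why a channel mismatch would produce exactly this symptom: if mono samples are treated as interleaved stereo, every two consecutive samples become one left/right frame, so the encoder sees half as many frames. The sketch below only does the arithmetic. It assumes a mono, 16-bit, 44.1 kHz WAV like the one produced by the conversion code above, and `assumedDuration` is an illustrative name, not a LAME function.

```swift
/// Duration in seconds that an encoder would assume for a run of samples,
/// given how many channels it believes the samples are interleaved across.
/// Hypothetical helper for illustration only.
func assumedDuration(sampleCount: Int, assumedChannels: Int, sampleRate: Double) -> Double {
    let frames = sampleCount / assumedChannels // one frame = one sample per channel
    return Double(frames) / sampleRate
}

let sampleRate = 44_100.0
let monoSamples = 441_000 // ten seconds of mono audio at 44.1 kHz

let readAsMono = assumedDuration(sampleCount: monoSamples, assumedChannels: 1, sampleRate: sampleRate)
let readAsStereo = assumedDuration(sampleCount: monoSamples, assumedChannels: 2, sampleRate: sampleRate)

print(readAsMono)   // 10.0
print(readAsStereo) // 5.0, so the same data plays back twice as fast
```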

@tryWabbit
I noticed that you can improve your code. You can combine the two functions func convertAudio(_ url:, outputURL:) and func encodeToMp3(inPcmPath:, outMp3Path:, onProgress:, onComplete:) into one. After ExtAudioFileRead(sourceFile!, &numFrames, &fillBufList), you can directly write fillBufList to MP3 without intermediate saving to a WAV file.

Think about it :).
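For readers who want a starting point, here is a rough sketch of what such a combined function could look like. It is not the library's example code: the name `convertToMp3`, the buffer sizes, and the fixed mono 44.1 kHz format are assumptions made for illustration, and error handling is reduced to returning `false`.

```swift
import Foundation
import AudioToolbox
import lame

/// Hypothetical helper: decodes `inputURL` to mono 16-bit PCM and feeds it straight to LAME.
func convertToMp3(inputURL: URL, outputURL: URL) -> Bool {
    var sourceFile: ExtAudioFileRef?
    guard ExtAudioFileOpenURL(inputURL as CFURL, &sourceFile) == noErr,
          let source = sourceFile else { return false }
    defer { ExtAudioFileDispose(source) }

    // Ask ExtAudioFile to hand back mono, packed, signed 16-bit PCM at 44.1 kHz.
    var clientFormat = AudioStreamBasicDescription(
        mSampleRate: 44_100, mFormatID: kAudioFormatLinearPCM,
        mFormatFlags: kLinearPCMFormatFlagIsPacked | kAudioFormatFlagIsSignedInteger,
        mBytesPerPacket: 2, mFramesPerPacket: 1, mBytesPerFrame: 2,
        mChannelsPerFrame: 1, mBitsPerChannel: 16, mReserved: 0)
    guard ExtAudioFileSetProperty(source, kExtAudioFileProperty_ClientDataFormat,
                                  UInt32(MemoryLayout<AudioStreamBasicDescription>.size),
                                  &clientFormat) == noErr else { return false }

    // Configure the encoder; all lame_set_* calls happen before lame_init_params.
    guard let lame = lame_init() else { return false }
    defer { lame_close(lame) }
    lame_set_in_samplerate(lame, 44_100)
    lame_set_num_channels(lame, 1)
    lame_set_quality(lame, 4)
    lame_set_VBR(lame, vbr_off)
    guard lame_init_params(lame) >= 0 else { return false }

    guard let mp3File = fopen(outputURL.path, "wb") else { return false }
    defer { fclose(mp3File) }

    let framesPerRead = 8192
    var pcm = [Int16](repeating: 0, count: framesPerRead)
    // lame.h gives 1.25 * nsamples + 7200 bytes as a worst-case output size.
    var mp3 = [UInt8](repeating: 0, count: framesPerRead * 5 / 4 + 7200)

    while true {
        var frameCount = UInt32(framesPerRead)
        let status = pcm.withUnsafeMutableBytes { raw -> OSStatus in
            var bufferList = AudioBufferList(
                mNumberBuffers: 1,
                mBuffers: AudioBuffer(mNumberChannels: 1,
                                      mDataByteSize: UInt32(raw.count),
                                      mData: raw.baseAddress))
            return ExtAudioFileRead(source, &frameCount, &bufferList)
        }
        guard status == noErr else { return false }
        if frameCount == 0 { break } // end of file

        // Mono input: the same buffer is passed for both channels as a conservative choice.
        let written = lame_encode_buffer(lame, pcm, pcm, Int32(frameCount), &mp3, Int32(mp3.count))
        guard written >= 0 else { return false }
        fwrite(mp3, 1, Int(written), mp3File)
    }

    // Flush whatever LAME still holds in its internal buffers.
    let flushed = lame_encode_flush(lame, &mp3, Int32(mp3.count))
    if flushed > 0 { fwrite(mp3, 1, Int(flushed), mp3File) }
    return true
}
```

Because the PCM never touches the disk, there is no WAV header to skip and no intermediate file to clean up. Progress reporting and running on a background queue are left out to keep the sketch short.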

I set the number of channels to 1 with lame_set_num_channels(lame, 1), but it still generates audio that is half the duration and plays too fast.

I'm sorry if I'm not making sense. I don't have any experience with LAME and C APIs, and I have limited experience with the audio APIs for iOS.

Thank you so much for suggesting that. I will definitely focus on it once I get the conversion working.

I uploaded the test project I'm experimenting with, in case you want to see the issue: https://github.com/tryWabbit/Audio-Conversion

@tryWabbit Here is the result without the issue. I replaced lame_encode_buffer_interleaved with lame_encode_buffer, passing an empty right channel.

//
//  AudioConverter.swift
//  Example
//
//  Created by Andrey on 20.11.2020.
//

import Foundation
import lame

class AudioConverter {

    private static let encoderQueue = DispatchQueue(label: "com.audio.encoder.queue")

    class func encodeToMp3(
        inPcmPath: String,
        outMp3Path: String,
        
        onProgress: @escaping (Float) -> (Void),
        onComplete: @escaping () -> (Void)
    ) {

        encoderQueue.async {
            
            let numOfChannels: Int32 = 1

            let lame = lame_init()
            lame_set_in_samplerate(lame, 44100)
            lame_set_out_samplerate(lame, 0)
            lame_set_brate(lame, 0)
            lame_set_quality(lame, 4)
            lame_set_VBR(lame, vbr_off)
            lame_set_num_channels(lame, numOfChannels)
            lame_init_params(lame)
            
            let pcmFile: UnsafeMutablePointer<FILE> = fopen(inPcmPath, "rb")
            fseek(pcmFile, 0 , SEEK_END)
            
            let fileSize = ftell(pcmFile)
            // Skip file header.
            let pcmHeaderSize = 48 * 8
            fseek(pcmFile, pcmHeaderSize, SEEK_SET)

            let mp3File: UnsafeMutablePointer<FILE> = fopen(outMp3Path, "wb")

            let pcmSize = 1024 * 8
            let pcmbuffer = UnsafeMutablePointer<Int16>.allocate(capacity: Int(pcmSize * 2))

            let mp3Size: Int32 = 1024 * 8
            let mp3buffer = UnsafeMutablePointer<UInt8>.allocate(capacity: Int(mp3Size))

            var write: Int32 = 0
            var read = 0

            repeat {

                let size = MemoryLayout<Int16>.size * Int(numOfChannels)
                read = fread(pcmbuffer, size, pcmSize, pcmFile)
                // Progress
                if read != 0 {
                    let progress = Float(ftell(pcmFile)) / Float(fileSize)
                    DispatchQueue.main.sync { onProgress(progress) }
                }

                if read == 0 {
                    write = lame_encode_flush_nogap(lame, mp3buffer, mp3Size)
                } else {
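                    // Mono input: the samples go in as the left channel; the right-channel argument is an empty placeholder.
                    write = lame_encode_buffer(lame, pcmbuffer, [], Int32(read), mp3buffer, mp3Size)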
                }

                fwrite(mp3buffer, Int(write), 1, mp3File)

            } while read != 0

            // Clean up
            lame_close(lame)
            fclose(mp3File)
            fclose(pcmFile)

            pcmbuffer.deallocate()
            mp3buffer.deallocate()

            DispatchQueue.main.sync { onComplete() }
        }
    }
}

Thank you so much, @pro100andrey! My mistake was calling lame_set_num_channels(lame, 1) after lame_init_params(lame) instead of before it.

Really appreciate your help : )
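For anyone who lands here with the same symptom, the ordering can be captured in a small setup helper. This is only a sketch: `makeMonoEncoder` is a made-up name, and the quality and VBR settings are simply copied from the example above.

```swift
import lame

/// Hypothetical helper: returns a configured mono encoder, or nil if LAME rejects the settings.
func makeMonoEncoder(sampleRate: Int32) -> OpaquePointer? {
    guard let lame = lame_init() else { return nil }

    // All settings are applied first...
    lame_set_in_samplerate(lame, sampleRate)
    lame_set_num_channels(lame, 1)
    lame_set_quality(lame, 4)
    lame_set_VBR(lame, vbr_off)

    // ...and only then are they committed. A negative return value signals an error.
    guard lame_init_params(lame) >= 0 else {
        lame_close(lame)
        return nil
    }
    return lame
}

let encoder = makeMonoEncoder(sampleRate: 44_100)
```

The caller owns the returned handle and is expected to pass it to lame_close when encoding is finished.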