mewkiz/flac

Coping with damaged flac files

mehrvarz opened this issue · 3 comments

Hi. With some flac files (rather rare, but still) I get "crc mismatch" messages, either on startup or at some later point during playback. In some cases I see an additional message such as this: "frame.Subframe.decodeRicePart: The flac library test cases do not yet include any audio files with Rice parameter escape codes. If possible please consider contributing this audio sample to improve the reliability of the test cases." Is it fair to assume that an internally damaged flac file could cause such a message to show up? All crc mismatch messages lead to a program abort, which is not ideal.

In some cases I didn't get a crc mismatch message, but a direct array overrun panic. I added three "if" statements to make the code skip over these situations. I have not seen a single array overrun since. See my code changes ("tmtmtm") below. Do you agree that making the decoder more robust against damaged flac files is a reasonable goal?

frame.go:

func (frame *Frame) correlate() {
	switch frame.Channels {
	case ChannelsLeftSide:
		// 2 channels: left, side; using inter-channel decorrelation.
		left := frame.Subframes[0].Samples
		side := frame.Subframes[1].Samples
		for i := range side {
			// right = left - side
			if i<len(left) {	                   // tmtmtm
				side[i] = left[i] - side[i]
			}
		}

subframe.go:

func (subframe *Subframe) decodeLPC(coeffs []int32, shift int32) error {
	if len(coeffs) != subframe.Order {
		return fmt.Errorf("frame.Subframe.decodeLPC: prediction order (%d) differs from number of coefficients (%d)", subframe.Order, len(coeffs))
	}
	if shift < 0 {
		return fmt.Errorf("frame.Subframe.decodeLPC: invalid negative shift")
	}
	for i := subframe.Order; i < subframe.NSamples; i++ {
		var sample int64
		for j, c := range coeffs {
			if i-j-1 < len(subframe.Samples) {	     // tmtmtm
				sample += int64(c) * int64(subframe.Samples[i-j-1])
			}
		}
		if i<len(subframe.Samples) {	                    // tmtmtm
			subframe.Samples[i] += int32(sample >> uint(shift))
		}
	}
	return nil
}

All crc mismatch messages lead to a program abort, which is not ideal.

Turning CRC mismatches into warnings, rather than fatal errors seems reasonable. We'll push this change momentarily.

Do you agree that making the decoder more robust against damaged flac files is a reasonable goal?

As for CRC, sure. However, for the later two cases, it seems better to try to get a working FLAC file when it is broken is incorrectly encoded. Otherwise we may end up with something similar to quirks mode in HTML, which is a mess. Better to stay true to the FLAC standard.

As far as I'm aware, it should never be the case that a valid FLAC file has different number of samples in the side and left channels (i.e. in correlate). So, the FLAC files exhibiting this property must be incorrectly encoded or damaged.

For the LPC case, it seems your file does not have subframe.NSamples == len(subframe.Samples) which is strange. Would you please verify if this is the case?

I.e. add the following to decodeLPC:

if subframe.NSamples != len(subframe.Samples) {
	log.Fatalf("subframe sample count mismatch; expected %d, got %d", subframe.NSamples, len(subframe.Samples))
}

Turning CRC mismatches into warnings, rather than fatal errors seems reasonable. We'll push this change momentarily.

Done as of rev 8c55685.

for the later two cases, it seems better to try to get a working FLAC file when it is broken is incorrectly encoded. Otherwise we may end up with something similar to quirks mode in HTML, which is a mess. Better to stay true to the FLAC standard.

Let us not call it quirks mode. Quirks mode is about dealing with human ignorance. Here we have flipping bits. Why not call it error correction? Or maybe better "error resilience". Apparently mp3 decoders never panic on such issues. They attempt to be robust about damages and try to (sometimes audibly) skip over. I find this preferable.

Turning CRC mismatches into warnings

Thank you. However, this change alone may not be sufficient. What this does is that...

frame, err := flacstream.ParseNext()

may now return incomplete number of subframes or samples. You either get "len(frame.Subframes) < channels" or "len(frame.Subframes[n].Samples) < int(frame.BlockSize). Without any precaution this will likely crash the consumer of the API. Some guidance is expedient.

Btw, you can use this to find damaged flac files (I just learned):

find "path2musicfolder" -name \*.flac -type f -exec flac -t "$1" {} \+