andygrundman/Audio-Scan

wrong duration for wav files > 16 bit


Hi,

I've found that the Audio::Scan module in LMS 7.9.1 (Audio::Scan 0.95) reports a wrong duration for WAV files with a precision greater than 16 bits, while other tools such as sox report the correct value.

The problem is that tracks are stored with this wrong duration, so when they are played the progress bar is messed up and some controls, like 'jump to time', do not work properly in plugins.

This is an example with a 32-bit, 384000 Hz WAV file.

Compare song_length_ms (from Audio::Scan) with Duration (from sox).

AudioScan: {
  info => {
            audio_offset    => 80,
            audio_size      => 397516800,
            bitrate         => 24576000,
            bits_per_sample => 32,
            block_align     => 8,
            channels        => 2,
            file_size       => 397516880,
            format          => 65534,
            jenkins_hash    => 643161139,
            samplerate      => 384000,
            song_length_ms  => 6367,
          },
  tags => {},
} at F:\SVILUPPO\AudioScan\AudioScan.pl line 78.
"F:/Sviluppo/slimserver/Plugins/C3PO/Bin/MSWin32-x86-multi-thread/sox.exe --i \"F:\\SVILUPPO\\01 - SqueezeboxServer Plugins\\musica campione\\wav_32_384000.wav\""
0
(
  "\n",
  "Input File     : 'F:\\SVILUPPO\\01 - SqueezeboxServer Plugins\\musica campione\\wav_32_384000.wav'\n",
  "Channels       : 2\n",
  "Sample Rate    : 384000\n",
  "Precision      : 32-bit\n",
  "Duration       : 00:02:09.40 = 49689600 samples ~ 9705 CDDA sectors\n",
  "File Size      : 398M\n",
  "Bit Rate       : 24.6M\n",
  "Sample Encoding: 32-bit Signed Integer PCM\n",
  "\n",
)

Here is a table of results for different versions of the same file (always 129.4 seconds long).

file             offset   size         ch   s/r      bit   secs
-----------------------------------------------------------------
wav 16 192000    44       99379200     2    192000   16    129.4
wav 16 44100     44       22826160     2    44100    16    129.4
wav 16 96000     44       49689600     2    96000    16    129.4
wav 24 192000    80       149068800    2    192000   24    17.551
wav 24 384000    80       298137600    2    384000   24    6.367
wav 32 384000    80       397516800    2    384000   32    6.367

Using Audacity, I took a 24-96 FLAC and resampled it to several different formats. All of them retained the exact same duration value. I also used ffprobe to double-check things. Can you send me some sample files? Maybe there's something else going on. I wonder if it's sox-related, because I didn't try with that. BTW I'm using version 0.96, although there shouldn't be any WAV-related changes in that vs 0.95.

File                            Bits  Rate (kHz)    s_l_ms    ffprobe info
----------------------------------------------------------------------------------------------
24-96.flac (source)             24      96          224607    00:03:44.61, bitrate: 2614 kb/s
24-96.wav  (flac -d)            24      96          224607    00:03:44.61, bitrate: 4608 kb/s
32-96.wav  (Audacity export)    32      96          224607    00:03:44.61, bitrate: 6144 kb/s
32-192.wav (Audacity resample)  32      192         224607    00:03:44.61, bitrate: 12288 kb/s
32-384.wav (Audacity resample)  32      384         224607    00:03:44.61, bitrate: 24576 kb/s

Here are some shorter examples (20 seconds each): http://www.marcoc1712.it/downloads/20_Sec.rar

They were upsampled with sox from the 'original' wav_16_044100.wav, which was produced from a longer FLAC file using FLAC, not sox.

Note that only the 32_384000 file is wrong here (8.85 secs), and I've found that shorter files are always correct (I started by producing a 2-second file), so... size matters! (At least for Audio::Scan.)

One difference from your tests is that you started from a hi-res file.

Audio::Scan looks for a strange 'fact' chunk in the header. I don't know what it is, but without that test it always reports the correct value.

Thanks, I can reproduce it now.

Hi Marco, the problem was due to the math being done on the number of samples stored in the "fact" chunk. If num_samples*1000 exceeded 32 bits, it would overflow and give you a shorter song_length_ms value. The fact chunk is only used in files where the format is set to non-PCM. The files I made in Audacity were format=PCM even though according to the spec, anything > 16-bit is supposed to use WAVE_FORMAT_EXTENSIBLE (format=0xFFFE) with subtype PCM.

sox is working properly, and Audacity is too. I know I said non-PCM which is confusing, but what I really meant was that the header is just structured in a different way. You can have a "basic" header that includes the number of channels, samplerate, and bit depth, or an extended header that also includes the ordering of channels and a place to enter the exact number of samples. For files with that extra data (sox creates these), A::S uses it for the duration calculation. In the basic form (Audacity's), it's calculated from the size of the data divided by the bitrate.
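
As a rough illustration of the two calculations described above, here is a minimal Python sketch. It is not Audio::Scan's actual code: the function name `duration_ms` and its `fact_samples` parameter are hypothetical, and the numbers are taken from the 32-bit/384000 Hz example earlier in this thread.

```python
def duration_ms(info, fact_samples=None):
    """Return the track length in milliseconds.

    Hypothetical helper: `info` mirrors keys of the Audio::Scan info hash
    shown earlier; `fact_samples` is the sample count from a "fact" chunk,
    if the file has one.
    """
    if fact_samples is not None:
        # Extended header: use the exact number of samples.
        return fact_samples * 1000 // info["samplerate"]
    # Basic header: size of the audio data divided by the bitrate.
    return info["audio_size"] * 8 * 1000 // info["bitrate"]


info = {"audio_size": 397516800, "bitrate": 24576000, "samplerate": 384000}

print(duration_ms(info))                         # data size / bitrate
print(duration_ms(info, fact_samples=49689600))  # sample count / samplerate
```

Python integers do not overflow, so both calls give 129400 ms here, which agrees with the 00:02:09.40 that sox reports.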

The extra num_samples value for your test file was 7680000 (20 seconds at 384k), and the math for the duration is (7680000 * 1000) / 384000. The bug was that the numerator here (7,680,000,000) requires 33 bits to represent, but the calculation was being done in a 32-bit variable. The result gets truncated to a smaller number, leading to the wrong duration.
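
The truncation can be reproduced with a small Python sketch that masks the intermediate product to 32 bits. This assumes the original variable behaved like an unsigned 32-bit integer; the function names are hypothetical, and the sample counts are derived from the 129.4-second files in the table earlier in this thread (data size divided by bytes per sample frame).

```python
MASK32 = 0xFFFFFFFF  # keep only the low 32 bits, like an unsigned 32-bit variable


def length_ms_32bit(num_samples, samplerate):
    """Duration with the intermediate product truncated to 32 bits (buggy)."""
    return ((num_samples * 1000) & MASK32) // samplerate


def length_ms_wide(num_samples, samplerate):
    """Duration with a wide enough intermediate product (correct)."""
    return num_samples * 1000 // samplerate


# (num_samples, samplerate) for the 24/192000 and 24- or 32-bit/384000 files:
# 149068800 bytes / 6 bytes per frame and 397516800 bytes / 8 bytes per frame.
for num_samples, rate in [(24844800, 192000), (49689600, 384000)]:
    print(rate, length_ms_32bit(num_samples, rate), length_ms_wide(num_samples, rate))
```

The truncated version yields 17551 ms and 6367 ms, the same wrong values as in the table, while the wide version yields 129400 ms for both files.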

I just pushed out version 0.97 to CPAN, so let me know if this works for you.