bastibe/SoundCard

Size of recorded data

josephernest opened this issue · 8 comments

Hi @bastibe. As discussed by email, I sometimes have 0-size blocks. Here is how to reproduce (I have a Thinkpad laptop, built-in soundcard, Python 3.7, Windows 7):

import soundcard as sc
lb = sc.all_microphones(include_loopback=True)[0]
with lb.recorder(samplerate=44100) as mic:
    for i in range(1000):
        data = mic.record(numframes=None)
        print(data.shape)

Output:

(448, 2)
(448, 2)
(448, 2)
(448, 2)
(448, 2)
(0, 2)
(448, 2)
(448, 2)
(448, 2)
(448, 2)
(448, 2)
(0, 2)
(448, 2)
(0, 2)
(448, 2)
(0, 2)
(448, 2)
(0, 2)

In the end this is not really a problem, because I can simply ignore the 0-size blocks:

if data.shape[0] == 0:
    continue

No audio data is lost anyway, so it's not a real problem.


Funnily, I first made a mistake and used a samplerate of 44000, whereas my soundcard uses the usual 44100; as a result, I was getting strange sizes:

(447, 2)
(447, 2)
(130, 2)
(317, 2)
(447, 2)
(260, 2)
(187, 2)
(0, 2)
(447, 2)

but this, too, is solved by using 44100.

If you are recording with record(numframes=None), SoundCard returns whatever is available at the moment. If no audio data is available, you get a zero-length frame. This is in fact not a bug, but exactly as it should be.

If you want consistent block sizes, use record(numframes=blocksize). This will ensure that you always get blocksize samples, at the cost of up to one (hardware-)blocksize of delay.
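To illustrate the idea, here is a minimal sketch of how variable-size chunks can be collected into fixed-size blocks. It is not SoundCard's actual implementation; `fixed_blocks` is a hypothetical helper, and the chunks are plain lists standing in for recorded frames.

```python
def fixed_blocks(chunks, blocksize):
    """Yield lists of exactly `blocksize` frames from variable-size chunks."""
    pending = []
    for chunk in chunks:
        pending.extend(chunk)  # a zero-length chunk simply adds nothing
        while len(pending) >= blocksize:
            yield pending[:blocksize]
            pending = pending[blocksize:]


# Simulated chunk sizes similar to the ones reported above.
sizes = [448, 448, 0, 448, 448, 0, 448]
chunks = [[0.0] * n for n in sizes]
blocks = list(fixed_blocks(chunks, blocksize=1024))
print([len(b) for b in blocks])  # [1024, 1024]
```

The remaining 192 frames stay in the buffer until more data arrives, which is where the extra delay comes from.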

The strange sizes you are getting are a result of Microsoft's internal resampling. It seems that your sound card is configured to record at 44100 Hz in blocks of roughly 10.16 ms, which corresponds to 448 frames per block. However, when requesting data at 44000 Hz, an internal resampling routine is engaged that takes the 44.1 kHz data and resamples it to 44 kHz, at a smaller block size than the audio device itself.
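The numbers can be checked with a bit of arithmetic. This sketch only uses the values reported in this thread; it does not query the device. Note that the split blocks in the output above also add up to one resampled block (130 + 317 and 260 + 187).

```python
device_rate = 44100     # Hz, the sound card's native rate
requested_rate = 44000  # Hz, the rate requested by mistake
device_block = 448      # frames per block observed at 44100 Hz

# Duration of one device block in milliseconds.
block_ms = 1000 * device_block / device_rate

# The same duration expressed in frames at the requested rate.
resampled_block = round(device_block * requested_rate / device_rate)

print(f"{block_ms:.2f} ms per device block")  # 10.16 ms per device block
print(resampled_block, "frames per block after resampling")  # 447
```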

Via email, you commented that using record(numframes=1024) would sometimes result in all-zero blocks that shouldn't be there. This is indeed a bug, and I would need your help in debugging it, as it doesn't happen on my machine.

The zero frames are caused either by line 734 or by line 723. I would need to know which one of them is the cause of this error. Could you insert a print statement (or breakpoint) in both places, and report back which one of them produces the zeros?

Indeed, this is the code producing this bug:

import soundcard as sc
import numpy as np

lb = sc.all_microphones(include_loopback=True)[0]
print(lb)
with lb.recorder(samplerate=44100) as mic, open('test.raw', 'wb') as f:
    for i in range(2000):
        data = mic.record(numframes=1024)
        # scale the float samples to 16-bit integers
        data = (data * 2**15).astype(np.int16)
        # tobytes() replaces the deprecated tostring()
        f.write(data.tobytes())

The sound is stuttering, and here is the recorded waveform:

[image: recorded waveform]

I just tried, and found that it comes from line https://github.com/bastibe/SoundCard/blob/master/soundcard/mediafoundation.py#L723:

    while not self._capture_available_frames():
        empty_frames += 1
        if empty_frames > 10:
            # no data for 10 ms: give up.
            return numpy.zeros([0], dtype='float32')
        time.sleep(0.001)

Thank you for your analysis. I think I see the error, though: a blocksize of, say, 1024 samples should take about 20 ms to produce, but the code is only waiting 10 ms. Silly me.

So instead of waiting for 10 ms, we should wait for one device blocksize.
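For reference, a quick calculation of the durations involved, assuming 44100 Hz and the 1 ms sleep from the loop above. Note that even the 448-frame block observed earlier in this thread takes slightly longer than the 10 ms timeout.

```python
samplerate = 44100  # Hz
timeout_ms = 10     # give up after ten empty polls, 1 ms apart


def duration_ms(numframes, samplerate):
    """Time needed to produce numframes at the given sample rate."""
    return 1000 * numframes / samplerate


for numframes in (448, 1024):
    ms = duration_ms(numframes, samplerate)
    print(f"{numframes} frames: {ms:.1f} ms, "
          f"longer than timeout: {ms > timeout_ms}")
```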

Could you try running the recorder with blocksize=256? You can still record with numframes=1024, but the recorder blocksize should make _capture_available_frames return more often.

I think the proper fix should be to replace line 721 with

if empty_frames * 0.001 > self.deviceperiod[0]:

That is, _record_chunk should wait at least one device period before giving up.

Hi @bastibe

I tried various methods:

  1. recorder with blocksize=256

  2. Line 721: if empty_frames * 0.001 > self.deviceperiod[0]: instead of if empty_frames > 10:

  3. Combination of 1) and 2)

but the bug is still present in all of them.

Maybe it's just my local configuration... (I use the ThinkPad's built-in soundcard, which is not particularly excellent).

Hi @josephernest,

this is deeply puzzling. The documentation specifically states that audio data should be returned every self.deviceperiod[0]. Then again, it wouldn't be the first time that audio hardware had a mind of its own, and it's possible we're only off by a few milliseconds.

Blocksizes are regrettably often handled as more of a suggestion than a fixed configuration, so the blocksize part not working is not particularly surprising.

Anyway, if you'd like to experiment some more, try increasing the threshold in if empty_frames > XXX until the issue goes away. Perhaps some multiple of self.deviceperiod[0] might work? It would be somewhat reasonable if something like self.deviceperiod[0] * 1.1 did the trick.
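To make the experiment concrete, here is a small simulation of such a wait loop with an adjustable multiple of the device period. This is only a sketch, not the code in mediafoundation.py: `wait_for_frames`, `make_device` and the `factor` parameter are hypothetical names, and the fake device simply reports data on a chosen poll.

```python
import time


def wait_for_frames(frames_available, deviceperiod, factor=1.0,
                    sleep=time.sleep):
    """Poll every 1 ms until frames are available.

    Gives up (returns False) once more than factor * deviceperiod
    seconds of empty polls have accumulated.
    """
    step = 0.001
    empty_polls = 0
    while not frames_available():
        empty_polls += 1
        if empty_polls * step > factor * deviceperiod:
            return False
        sleep(step)
    return True


def make_device(ready_on_poll):
    """Fake device that reports data from the given poll number onwards."""
    calls = 0

    def frames_available():
        nonlocal calls
        calls += 1
        return calls >= ready_on_poll

    return frames_available


period = 448 / 44100      # about 10.16 ms, as observed in this thread
no_sleep = lambda _: None  # skip real sleeping in the simulation

# Data arrives on the 12th poll, slightly later than one device period.
strict = wait_for_frames(make_device(12), period, factor=1.0, sleep=no_sleep)
relaxed = wait_for_frames(make_device(12), period, factor=1.1, sleep=no_sleep)
print(strict, relaxed)  # False True
```

In this simulated case a 10% margin is enough, but whether that holds on real hardware can only be found out by trying.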

It looks like two-channel data is being interleaved into a single-channel recording.
Check whether the zeros are evenly spaced by blocksize samples.
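One way to run that check, as a rough sketch: find where each run of zeros starts and look at the distance between consecutive runs. `zero_run_starts` is a hypothetical helper, and the input is assumed to be the samples of a single channel; the example data below is synthetic, not taken from the recording.

```python
def zero_run_starts(samples, min_length=1):
    """Return start indices of runs of at least min_length zeros."""
    starts = []
    run_start = None
    for i, value in enumerate(samples):
        if value == 0:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_length:
                starts.append(run_start)
            run_start = None
    # Handle a run that extends to the end of the data.
    if run_start is not None and len(samples) - run_start >= min_length:
        starts.append(run_start)
    return starts


# Synthetic signal: 8 non-zero samples followed by 8 zeros, three times.
samples = ([1] * 8 + [0] * 8) * 3
starts = zero_run_starts(samples, min_length=4)
spacings = [b - a for a, b in zip(starts, starts[1:])]
print(starts)    # [8, 24, 40]
print(spacings)  # [16, 16]
```

If the spacings in the real recording are all equal to the blocksize (or a fixed multiple of it), that would support the interleaving explanation.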