Speed sampling in speed augmentation is counterintuitive

Question

Closed this issue 3 years ago · 0 comments

Currently, speed augmentation (nlpaug/augmenter/audio/speed.py) behaves differently from what a user may expect. Namely, the get_random_factor method is defined there as:

    def get_random_factor(self):
        speeds = [round(i, 1) for i in np.arange(self.factor[0], self.factor[1], 0.1)]
        speeds = [s for s in speeds if s != 1.0]
        return speeds[np.random.randint(len(speeds))]

This results in a very small number of possible speeds. For example, with factor=(0.9, 1.1) the grid built by np.arange contains at most 3 values (0.9, 1.0, 1.1), and the filter on the second line then discards 1.0, so only the endpoints of the interval are ever used. In particular, the augmented audio is far less varied than a user would expect.
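
To make the discreteness concrete, here is a minimal standalone sketch that mirrors the quoted logic outside the class and lists every speed it can return. The helper name candidate_speeds is hypothetical and not part of nlpaug, and the exact grid length depends on floating-point rounding inside np.arange, so the output is not hard-coded here.

```python
import numpy as np


def candidate_speeds(factor):
    """List every speed the quoted get_random_factor logic can return.

    Hypothetical helper for illustration only; it is not part of nlpaug.
    """
    # Same grid as the quoted method: steps of 0.1, rounded to one decimal.
    speeds = [round(float(s), 1) for s in np.arange(factor[0], factor[1], 0.1)]
    # Same filter as the quoted method: a speed of exactly 1.0 is dropped.
    return [s for s in speeds if s != 1.0]


candidates = candidate_speeds((0.9, 1.1))
print(candidates)
```

Whatever the rounding does, the result is a handful of fixed grid points rather than a continuous range of speeds.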

A user expects uniform sampling from the interval given by factor, so get_random_factor should look like:

    def get_random_factor(self):
        # Scale a uniform draw from [0, 1) to [factor[0], factor[1])
        return (self.factor[1] - self.factor[0]) * np.random.random_sample() + self.factor[0]
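
For reference, a minimal standalone sketch of the uniform sampling proposed here, assuming NumPy's Generator API is acceptable. The function name sample_speed and the optional rng argument are hypothetical and not part of nlpaug.

```python
import numpy as np


def sample_speed(factor, rng=None):
    """Draw one speed uniformly from the interval given by factor.

    Hypothetical helper for illustration only; it is not part of nlpaug.
    """
    # Fall back to a fresh generator when the caller does not supply one.
    rng = np.random.default_rng() if rng is None else rng
    low, high = factor
    return float(rng.uniform(low, high))


# A seeded generator makes the draws reproducible.
rng = np.random.default_rng(0)
samples = [sample_speed((0.9, 1.1), rng) for _ in range(1000)]
print(min(samples), max(samples), len(set(samples)))
```

With this approach almost every call yields a different speed inside the interval. Whether 1.0 should still be excluded is a separate design decision; with continuous sampling the chance of drawing exactly 1.0 is negligible.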