aoifemcdonagh/audioset-processing

Audio download number limiters


Hey, this isn't really an issue, but I think a limiter on the number of downloads would be really helpful.

For example, I would be able to set the script to download only 10 or 20 audio clips rather than the whole list.

Here is how I did it.

  1. Go to core/utils.py
  2. Set a constant: DEFAULT_NUM_FILES = 200 (for example, 200 files)
  3. In the create_csv function, where it builds the new CSV, truncate the list of rows before writing it:
with open(csv_dataset) as dataset, open(new_csv_path, 'w', newline='') as new_csv:
        reader = csv.reader(dataset, skipinitialspace=True)
        writer = csv.writer(new_csv)

        #  Include the row if it contains label for desired class and no labels of blacklisted classes
        to_write = [
            row for row in reader for label in label_id
            if label in row[3] and not set(row[3].split(",")).intersection(blacklisted_ids)
        ]  # added check for blacklisted classes
        to_write = to_write[:DEFAULT_NUM_FILES]  # keep at most DEFAULT_NUM_FILES rows
        
        writer.writerows(to_write)
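
The same idea can be sketched as a standalone helper, in case that is easier to test outside the repo. This is only a rough sketch, not the repo's actual code: `matching_rows` and `write_limited_csv` are hypothetical names, and I am assuming the labels sit in the fourth column as in the snippet above. It differs from the snippet in two ways: it matches labels exactly instead of by substring, and it yields each row only once even when several wanted labels match. It uses `itertools.islice` so reading stops as soon as the limit is reached.

```python
import csv
import io
from itertools import islice

DEFAULT_NUM_FILES = 200  # same constant as in step 2


def matching_rows(reader, label_ids, blacklisted_ids):
    """Yield each row once if it has a wanted label and no blacklisted label."""
    wanted = set(label_ids)
    blocked = set(blacklisted_ids)
    for row in reader:
        if len(row) < 4:
            continue  # assumption: rows without a label column are comments/headers
        labels = set(row[3].split(","))
        if labels & wanted and not labels & blocked:
            yield row


def write_limited_csv(src, dst, label_ids, blacklisted_ids, limit=DEFAULT_NUM_FILES):
    """Copy at most `limit` matching rows from src to dst; return the count written."""
    reader = csv.reader(src, skipinitialspace=True)
    writer = csv.writer(dst)
    rows = list(islice(matching_rows(reader, label_ids, blacklisted_ids), limit))
    writer.writerows(rows)
    return len(rows)


if __name__ == "__main__":
    # Tiny in-memory example with made-up IDs and labels
    src = io.StringIO('a, 0.0, 10.0, "/m/1,/m/2"\nb, 0.0, 10.0, "/m/1"\nc, 0.0, 10.0, "/m/1"\n')
    dst = io.StringIO()
    print(write_limited_csv(src, dst, ["/m/1"], [], limit=2))
```

Note that either version caps the rows written to the new CSV, not the number of clips that end up downloading successfully, so a few failed downloads would leave you with fewer files than the limit.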