l3das/L3DAS21

Training baseline model (Task1)

Closed this issue · 5 comments

Hi,

I am getting the following error when I run 'python train_baseline_task1.py'

Traceback (most recent call last):
  File "train_baseline_task1.py", line 274, in <module>
    main(args)
  File "train_baseline_task1.py", line 58, in main
    training_predictors = pickle.load(f)
EOFError: Ran out of input

l3das commented

Hello anton-jeran.
This error may indicate that you are loading an empty pickle file. You may have accidentally overwritten the pickle files that the script preprocessing.py generates.
Could you please verify that the files you are loading are not empty? The file paths are specified as arguments in train_baseline_task1.py (look for #dataset parameters).
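If it is useful, here is a minimal sketch for such a check: it verifies that the preprocessed files exist, are non-empty and can be unpickled. The paths below are only placeholders; use the ones you actually pass to train_baseline_task1.py.

import os
import pickle

# Placeholder paths: substitute the pickle files passed to train_baseline_task1.py
paths = [
    'processed/task1_predictors_train.pkl',
    'processed/task1_target_train.pkl',
]

for p in paths:
    size = os.path.getsize(p)  # raises FileNotFoundError if the file is missing
    print(p, size, 'bytes')
    if size == 0:
        print('  -> empty file, re-run preprocessing.py')
        continue
    with open(p, 'rb') as f:
        data = pickle.load(f)  # raises EOFError if the file is truncated
    print('  -> loaded OK:', type(data))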

Hi,

[screenshot: terminal output]

I see 2 issues

  1. When I download the datasets, some data goes into the Task1 folder while some goes into a folder named 'Task1'$'\r'. I copied all the data from 'Task1'$'\r' into Task1.

  2. As you can see in the screenshot above, I removed the processed folder and ran your commands from the beginning, but I am still getting the error. Also, the pkl file is not empty.

l3das commented

Issue 1: The folder name in quotes looks like the result of an incomplete download. The download_dataset.py script downloads the desired set as a single zip archive, extracts it to a new folder with the same name, and finally deletes the zip. Alternatively, you can download the datasets manually from this link: https://doi.org/10.5281/zenodo.4642005.
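Just to illustrate that download-extract-delete flow, here is a rough sketch; the URL and folder names are placeholders, not the actual ones used by download_dataset.py.

import os
import zipfile
import urllib.request

# Placeholder URL and paths, for illustration only
url = 'https://zenodo.org/record/4642005/files/<archive_name>.zip'
zip_path = 'DATASETS/Task1.zip'
out_dir = 'DATASETS/Task1'

os.makedirs(out_dir, exist_ok=True)
urllib.request.urlretrieve(url, zip_path)  # download the zip archive
with zipfile.ZipFile(zip_path) as z:
    z.extractall(out_dir)                  # extract into the target folder
os.remove(zip_path)                        # delete the zip once extracted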

Issue 2: As I can see from your terminal, the preprocessing was killed, which usually means you ran out of RAM. That is also why you get the 'Ran out of input' error: the pickle matrices saved by preprocessing.py are incomplete and therefore corrupted. Unfortunately, the dataset is quite big, but you can reduce the memory requirements by preprocessing only a subset of it. To do this, simply add the argument --num_data X, where X is the maximum number of (uncut) datapoints to preprocess for each set.
For example:
python preprocessing.py --task 1 --input_path DATASETS/Task1 --training_set train100 --num_mics 1 --segmentation_len 2 --num_data 200
We will look for a workaround to reduce the memory requirements for the preprocessing.
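In the meantime, one possible direction for such a workaround (only a sketch, not how preprocessing.py currently works; the function names are hypothetical) is to pickle the processed datapoints in chunks, so the full matrix never needs to sit in RAM before a single pickle.dump:

import pickle

def save_in_chunks(datapoint_iter, out_path, chunk_size=100):
    # Write the processed datapoints as a sequence of small pickle frames
    # instead of building one huge list and dumping it in a single call.
    with open(out_path, 'wb') as f:
        chunk = []
        for dp in datapoint_iter:
            chunk.append(dp)
            if len(chunk) == chunk_size:
                pickle.dump(chunk, f)
                chunk = []
        if chunk:
            pickle.dump(chunk, f)

def load_chunks(in_path):
    # Read back every chunk until end-of-file and concatenate them.
    data = []
    with open(in_path, 'rb') as f:
        while True:
            try:
                data.extend(pickle.load(f))
            except EOFError:
                break
    return data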

We hope this helps!

Thanks, now I understand the issue :)

l3das commented

Great! So I'm closing the issue.