LUMII-Syslab/RSE

Running parse_file.py leads to an out-of-memory problem, and how to run inference


I have already trained with the MusicNet dataset I used before (with similar resampling and parsing), but when I wanted to test the result, the dataset parsing seemed to follow a different process and produce a different result, so I tried using parse_file.py to get the correct training and testing data.

When I run the command python3 -u parse_file.py, the process is killed while preparing train song 81.
My system already has 64 GB of RAM, so I wonder whether the script really needs that much memory or whether I set something up incorrectly.
If it is meant to use a large amount of memory, I would like to know how much is required.

I also want to ask how to run inference on my own music file.

Thank you for taking the time to read my question.

My system information:
OS: Ubuntu 20.04
RAM: 64 GB
GPU: RTX 3060 Ti

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

I have also modified the code to run on TF v2 so that it works on my system.
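For anyone hitting the same issue: one common way to run TF1-style code under a TensorFlow 2 installation is the official compatibility shim shown below. This is a general TensorFlow mechanism, not necessarily the exact change made here.

```python
# Standard TensorFlow 1.x compatibility shim for running
# graph-mode code on a TensorFlow 2 installation.
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # restore TF1 graph/session semantics

# TF1-style code (placeholders, sessions, etc.) can then run unchanged, e.g.:
x = tf.placeholder(tf.float32, shape=[None, 4])
with tf.Session() as sess:
    pass  # build and run the graph as in TF1
```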

The python3 -u parse_file.py command did require a lot of RAM (more than 128 GB). I have updated parse_file.py and a couple of other files to use numpy.memmap, which decreases the RAM requirements. 64 GB should now be enough to run the experiments.
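For anyone curious, the general pattern behind such a change looks roughly like the sketch below: instead of accumulating all parsed samples in RAM, they are written song by song into a disk-backed array. This is an illustrative sketch only, not the actual parse_file.py code; the shapes, dtype, file name, and the parse_songs() generator are assumptions.

```python
import numpy as np

# Sketch: stream parsed songs into a disk-backed array instead of RAM.
n_samples, sample_len = 1_000_000, 8192  # assumed dataset dimensions
data = np.memmap("train_data.dat", dtype=np.float32,
                 mode="w+", shape=(n_samples, sample_len))

offset = 0
for song in parse_songs():  # hypothetical generator yielding (n_i, sample_len) arrays
    data[offset:offset + len(song)] = song  # written through to disk
    offset += len(song)

data.flush()  # ensure everything is persisted to disk
```

Since the memmap is backed by a file, the operating system can evict its pages under memory pressure, so peak RAM usage stays close to one song at a time rather than the whole dataset.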

For transcribing a custom wav file, I have added transcribe.py in the musicnet_data directory. It can be used by placing your music file yourfile.wav in the musicnet_data directory and running python3 transcribe.py yourfile.wav. This requires a saved model for running the inference. Running the command will produce an array with the model's predictions (yourfile_results.npy) and a MIDI file (yourfile.mid).

If you wish to include labels with your music file, a process similar to the one in the process_dataset() function in get_musicnet.py can be used to convert the wav and csv files into a valid npz file.
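A minimal sketch of that wav + csv to npz conversion is below. The csv column names and the npz layout are assumptions based on the MusicNet label format; check process_dataset() in get_musicnet.py for the exact format the code expects.

```python
import csv
import numpy as np
from scipy.io import wavfile

# Sketch: pack a wav file and its csv note labels into an npz file.
# Column names and npz layout are assumptions; see process_dataset()
# in get_musicnet.py for the authoritative format.
rate, samples = wavfile.read("yourfile.wav")
samples = samples.astype(np.float32) / np.iinfo(np.int16).max  # normalize 16-bit PCM

labels = []
with open("yourfile.csv") as f:
    for row in csv.DictReader(f):
        # assumed MusicNet-style columns: start_time, end_time (in samples), note (MIDI number)
        labels.append((int(row["start_time"]), int(row["end_time"]), int(row["note"])))

np.savez("yourfile.npz", data=samples, labels=np.array(labels, dtype=np.int64))
```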