madebyollin/acapellabot

how to train this model by myself?

Opened this issue · 4 comments

Hello, how can I train this model myself? May I have your dataset, or how can I make one myself?

I talk a bit about my data collection process here.

I think you can probably get away with much less; the only really important part about the data is that the model can use it to learn to isolate acapellas from background noise. The minimum viable data collection process would be (assuming you don't want to modify the data loader at all):

  • Find a few acapellas online on /r/songstems, acapellas4u.co.uk, and other sites. Add "acapella" to all the filenames with a bash script.

  • Download a few instrumentals from SoundCloud.

  • Add "1" to the start of each filename.

  • Put them all in a folder named input and run python acapellabot.py --data input (a rough sketch of these steps is shown after this list).
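For illustration, here's a minimal Python sketch of that prep. The downloads/acapellas and downloads/instrumentals folder names are assumptions; the only convention that matters, per the steps above, is that acapella filenames contain "acapella" and every file starts with the same key number:

```python
# Minimal data-prep sketch (not part of the repo): copies downloaded files into
# the "input" folder using the naming convention described above.
import shutil
from pathlib import Path

ACAPELLA_DIR = Path("downloads/acapellas")        # hypothetical source folders
INSTRUMENTAL_DIR = Path("downloads/instrumentals")
OUTPUT_DIR = Path("input")                        # folder passed via --data

OUTPUT_DIR.mkdir(exist_ok=True)

# Tag acapellas so the loader can recognize them, and give every file the same
# dummy key prefix ("1") so all tracks are treated as key-compatible.
for f in ACAPELLA_DIR.glob("*.mp3"):
    shutil.copy(f, OUTPUT_DIR / f"1 {f.stem} acapella{f.suffix}")

for f in INSTRUMENTAL_DIR.glob("*.mp3"):
    shutil.copy(f, OUTPUT_DIR / f"1 {f.stem}{f.suffix}")

# Then: python acapellabot.py --data input
```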

If you want to be a bit more careful, you can key-tag the songs with KeyFinder and use the Camelot key as the number, so that the data processor only makes mashups of songs in the same key. You can also use Audacity to adjust the tempo of the acapellas so that they're standardized. For reference, some of my data looks like this:

[screenshot of example training-data filenames, 2017-06-02]
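To illustrate why the key prefix helps, here's a small sketch of how same-key pairing could work. This is only an illustration of the idea, not the repo's actual data loader, and it assumes the Camelot key is the first token of each filename:

```python
# Group files in "input" by their leading Camelot key (e.g. "8A My Track acapella.mp3"),
# then only pair acapellas with instrumentals that share a key.
from collections import defaultdict
from itertools import product
from pathlib import Path

tracks_by_key = defaultdict(lambda: {"acapellas": [], "instrumentals": []})

for f in Path("input").glob("*.mp3"):
    key = f.stem.split()[0]  # leading Camelot key, e.g. "8A"
    kind = "acapellas" if "acapella" in f.stem.lower() else "instrumentals"
    tracks_by_key[key][kind].append(f)

# Mashups stay harmonically plausible because both halves share a key.
for key, group in tracks_by_key.items():
    for acapella, instrumental in product(group["acapellas"], group["instrumentals"]):
        print(f"mashup candidate ({key}): {acapella.name} + {instrumental.name}")
```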

I hope that helps!

Thanks a lot~
I have another question: your model cannot split very well when a song has some unusual drumbeats, like this one:
test001.mp3.zip

My question is: can I solve this problem by training your model with more data containing these kinds of drumbeats? I'm not sure whether the problem is caused by a lack of data.

Ah, that's an interesting example! I'm working on an updated version of the model that fixes several architectural problems, but it still can't remove all of the drums in your example (sample output).

I think this is a problem that more diverse training data will probably solve; I'm training exclusively on 128 BPM EDM, which doesn't have much variation in drum samples.

That said, I'm not sure that more data will help the current stable version of the model that I've posted to GitHub; you can probably get it to filter out the drums in your example, but likely at the expense of removing legitimate vocals in other songs (the new model architecture is a bit smarter about this).

For now, I'd try collecting more data in this style, training a model, and seeing if it does better; even if it doesn't work, you'll have the data ready for training on the new architecture once it's working properly 👍

OK, thanks. Please let me know if you make any progress; I'll try training your model with more data in this style.
Thanks again~