Spijkervet/CLMR

Weights for million song dataset?

NotNANtoN opened this issue · 3 comments

Hi,

Thanks for releasing this nice repository!

Do you plan on releasing the weights for the linear classifier trained on the Million Song Dataset as well? I would be very happy to use it in my work, since I care about the more abstract classes like "happy" or "sad".

On another note: I might be missing something obvious, but I could not find an easy way to map the predictions of the linear classifier trained on the MagnaTagATune dataset to their corresponding labels. In the paper, you say that you use the top 50 most common tags. Is there a list somewhere in this repository, or can I look it up on the dataset site? I do not want to mess up the order of the labels, for obvious reasons...

Thanks again and kind regards,
Anton

I found the tags for MagnaTagATune myself; here they are:
['guitar', 'classical', 'slow', 'techno', 'strings', 'drums', 'electronic', 'rock', 'fast', 'piano', 'ambient', 'beat', 'violin', 'vocal', 'synth', 'female', 'indian', 'opera', 'male', 'singing', 'vocals', 'no vocals', 'harpsichord', 'loud', 'quiet', 'flute', 'woman', 'male vocal', 'no vocal', 'pop', 'soft', 'sitar', 'solo', 'man', 'classic', 'choir', 'voice', 'new age', 'dance', 'male voice', 'female vocal', 'beats', 'harp', 'cello', 'no voice', 'weird', 'country', 'metal', 'female voice', 'choral']

I got them from https://github.com/minzwon/sota-music-tagging-models/raw/master/split/mtat/tags.npy, which is referenced in the MagnaTagATune dataset class.
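In case it saves someone else the lookup: mapping the classifier output back to these names is just an argsort over the 50 scores. A minimal sketch (here `probs` stands in for the sigmoid outputs of the linear head; the tag order is the one from the tags.npy file above):

```python
import numpy as np

# The 50 MagnaTagATune tags, in the order used by the tags.npy file linked above.
TAGS = np.array([
    'guitar', 'classical', 'slow', 'techno', 'strings', 'drums', 'electronic',
    'rock', 'fast', 'piano', 'ambient', 'beat', 'violin', 'vocal', 'synth',
    'female', 'indian', 'opera', 'male', 'singing', 'vocals', 'no vocals',
    'harpsichord', 'loud', 'quiet', 'flute', 'woman', 'male vocal', 'no vocal',
    'pop', 'soft', 'sitar', 'solo', 'man', 'classic', 'choir', 'voice',
    'new age', 'dance', 'male voice', 'female vocal', 'beats', 'harp', 'cello',
    'no voice', 'weird', 'country', 'metal', 'female voice', 'choral'
])

def top_k_tags(probs, k=5):
    """Return the k highest-scoring (tag, probability) pairs.

    `probs` is the 50-dimensional classifier output (after a sigmoid),
    in the same order as TAGS.
    """
    idx = np.argsort(probs)[::-1][:k]
    return [(str(TAGS[i]), float(probs[i])) for i in idx]
```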

I'd still be happy to hear your thoughts on releasing the MSD trained classifier!

Hi there!

I have to re-train the MSD classifier as I lost the weights on the lab computer. I will let you know once I have them!

That would be so great, thank you! :)

By the way, I have experimented with the MagnaTagATune tagger and got somewhat unstable results. Below is the prediction for the tag "piano" over the course of the song "Gnossienne No. 1". I simply split the song into overlapping chunks of 2.7 s and fed each chunk into the model:
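For reference, my chunking looks roughly like this (the 2.7 s chunk length matches the paper; the 16 kHz sample rate and 1 s hop here are just my own choices, not CLMR defaults):

```python
import numpy as np

def overlapping_chunks(waveform, sr, chunk_sec=2.7, hop_sec=1.0):
    """Split a mono waveform into overlapping fixed-length chunks.

    Returns an array of shape (n_chunks, chunk_samples); each row is one
    model input. Trailing audio shorter than a full chunk is dropped.
    """
    chunk = int(chunk_sec * sr)
    hop = int(hop_sec * sr)
    starts = range(0, max(len(waveform) - chunk, 0) + 1, hop)
    return np.stack([waveform[s:s + chunk] for s in starts])
```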

[Figure: predicted probability of the "piano" tag over time for "Gnossienne No. 1"]

This instability holds for every label. Did you by any chance observe similar behavior? I was hoping to use the predicted tags to generate images matching the current "mood" of the song, but unfortunately the output is not stable enough for that. Maybe I messed up some part of the preprocessing?
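For what it's worth, the workaround I would fall back on is smoothing the per-chunk probabilities over time with a simple moving average; this is just generic post-processing on my side, nothing from the CLMR paper:

```python
import numpy as np

def smooth(probs, window=5):
    """Moving average along the time axis of per-chunk tag probabilities.

    `probs` has shape (n_chunks, n_tags); `window` is the number of
    consecutive chunks averaged. Output has the same shape (edges are
    damped by the 'same'-mode convolution).
    """
    kernel = np.ones(window) / window
    return np.apply_along_axis(
        lambda p: np.convolve(p, kernel, mode="same"), 0, probs
    )
```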