Audio-AGI/AudioSep

Error when using music_speech..._89.98.pt: pytorch-lightning_version

tomthecollins opened this issue · 5 comments

From your paper, I wasn't sure of the role/purpose of music_speech_audioset_epoch_15_esc_89.98.pt.

Are these the saved model weights one should use if one wants to focus on separation of musical instruments from one another, say? Or is audiosep_base_4M_steps.ckpt still applicable in such use cases?

When I edited your example inference code from the readme to use music_speech_audioset_epoch_15_esc_89.98.pt on a Linux machine running Ubuntu, I got the following error.

Please clarify the purpose/use of this checkpoint, and if it is meant to be used, whether I need to modify the example inference code further.

Thanks!
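For reference, my edit was roughly the following (the readme example with only checkpoint_path changed; the audio path and query text are placeholders):

```python
import torch
from pipeline import build_audiosep, inference

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Readme example, but with checkpoint_path pointing at the CLAP checkpoint
# instead of audiosep_base_4M_steps.ckpt
model = build_audiosep(
    config_yaml='config/audiosep_base.yaml',
    checkpoint_path='checkpoint/music_speech_audioset_epoch_15_esc_89.98.pt',
    device=device)

inference(model, 'path_to_audio_file', 'textual_description', 'separated_audio.wav', device)
```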

Traceback (most recent call last):
  File "/home/blah/repos/AudioSep/sayd_infer_example.py", line 6, in <module>
    model = build_audiosep(
  File "/home/blah/repos/AudioSep/pipeline.py", line 17, in build_audiosep
    model = load_ss_model(
  File "/home/blah/repos/AudioSep/utils.py", line 387, in load_ss_model
    pl_model = AudioSep.load_from_checkpoint(
  File "/home/blah/anaconda3/envs/AudioSep/lib/python3.10/site-packages/lightning/pytorch/core/module.py", line 1532, in load_from_checkpoint
    loaded = _load_from_checkpoint(
  File "/home/blah/anaconda3/envs/AudioSep/lib/python3.10/site-packages/lightning/pytorch/core/saving.py", line 65, in _load_from_checkpoint
    checkpoint = _pl_migrate_checkpoint(
  File "/home/blah/anaconda3/envs/AudioSep/lib/python3.10/site-packages/lightning/pytorch/utilities/migration/utils.py", line 113, in _pl_migrate_checkpoint
    old_version = _get_version(checkpoint)
  File "/home/blah/anaconda3/envs/AudioSep/lib/python3.10/site-packages/lightning/pytorch/utilities/migration/utils.py", line 136, in _get_version
    return checkpoint["pytorch-lightning_version"]
KeyError: 'pytorch-lightning_version'

I asked the same here; it seems to be a model focused on music separation, but I wasn't able to load it.

I was able to fix this error by copying the missing keys from the first checkpoint into the second.
But then the model parameters do not match. I guess the model definition for music separation is not provided.
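Roughly what I did for the first part (a sketch; the copied keys are my guess at the Lightning metadata that load_from_checkpoint expects, and the patched filename is just an example):

```python
import torch

# Workaround sketch: graft the Lightning bookkeeping keys that the CLAP
# checkpoint lacks from the AudioSep checkpoint, then save a patched copy.
base = torch.load('checkpoint/audiosep_base_4M_steps.ckpt', map_location='cpu')
clap = torch.load('checkpoint/music_speech_audioset_epoch_15_esc_89.98.pt', map_location='cpu')

for key in ('pytorch-lightning_version', 'epoch', 'global_step'):
    if key in base and key not in clap:
        clap[key] = base[key]

torch.save(clap, 'checkpoint/music_speech_patched.ckpt')
```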

music_speech_audioset_epoch_15_esc_89.98.pt is not used for music source separation. It is actually used to initialise the text encoder (https://github.com/Audio-AGI/AudioSep/blob/main/models/clap_encoder.py#L13) of the AudioSep model.
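So for inference you still load audiosep_base_4M_steps.ckpt; the CLAP checkpoint only needs to sit in the checkpoint/ directory so the text encoder can find it. Roughly, following the readme example:

```python
import torch
from pipeline import build_audiosep

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# The AudioSep (Lightning) checkpoint is what gets passed here; the CLAP
# checkpoint music_speech_audioset_epoch_15_esc_89.98.pt is read internally
# when the text encoder is constructed, so it just has to exist under checkpoint/.
model = build_audiosep(
    config_yaml='config/audiosep_base.yaml',
    checkpoint_path='checkpoint/audiosep_base_4M_steps.ckpt',
    device=device)
```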

It's caused by newer versions of the transformers library.

Run this script on the music_speech_audioset_epoch_15_esc_89.98.pt checkpoint: LAION-AI/CLAP#127 (comment)
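As far as I understand, that script just strips the position_ids buffer that newer transformers releases no longer expect. A sketch of the idea (not the exact script from the linked comment; the output path is just an example):

```python
import torch

src = 'checkpoint/music_speech_audioset_epoch_15_esc_89.98.pt'
dst = 'checkpoint/music_speech_audioset_epoch_15_esc_89.98_fixed.pt'

ckpt = torch.load(src, map_location='cpu')

# The weights may be nested under 'state_dict' and may carry a 'module.'
# prefix, so match on the key suffix rather than the full name.
sd = ckpt.get('state_dict', ckpt)
for key in [k for k in sd if k.endswith('text_branch.embeddings.position_ids')]:
    del sd[key]

torch.save(ckpt, dst)
```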