snejus/beetcamp

KeyError: 'title' when using split_artist_title

jonschmidt opened this issue · 18 comments

File "/Users/jmschmidt/.local/lib/python3.7/site-packages/beets_bandcamp-0.1.5-py3.7.egg/beetsplug/bandcamp.py", line 340, in <listcomp>
    return [self.get_track_info(url) for url in track_urls]
  File "/Users/jmschmidt/.local/lib/python3.7/site-packages/beets_bandcamp-0.1.5-py3.7.egg/beetsplug/bandcamp.py", line 350, in get_track_info
    artist_from_title, meta["title"] = _split_artist_title(meta["title"])
KeyError: 'title'

This happens when I import in singleton mode.
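For anyone hitting the same traceback: the crash comes from indexing meta["title"] when the scraped metadata has no such key. Below is a minimal sketch of a defensive variant. It is not the plugin's actual fix; `split_artist_title` and `safe_track_fields` are hypothetical names, and the " - " separator is an assumption about how track names are written on Bandcamp.

```python
from typing import Dict, Optional, Tuple


def split_artist_title(name: str) -> Tuple[Optional[str], str]:
    """Split "Artist - Title" into its parts (hypothetical helper).

    Returns (None, name) when the name contains no " - " separator.
    """
    artist, sep, title = name.partition(" - ")
    if not sep:
        return None, name.strip()
    return artist.strip(), title.strip()


def safe_track_fields(meta: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Build artist/title fields without assuming "title" exists."""
    raw_title = meta.get("title")  # .get avoids KeyError: 'title'
    if raw_title is None:
        return {"artist": meta.get("artist"), "title": None}
    artist, title = split_artist_title(raw_title)
    # Fall back to the release-level artist when the name has no prefix.
    return {"artist": artist or meta.get("artist"), "title": title}
```

With this shape, a page that lacks a title yields a record with title set to None instead of aborting the whole import, and the caller can decide whether to skip it.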

It's fixed now. I wondered why this did not come up during my test runs - apparently I don't have this option set (the splitting is already kind of done by default).

Also, keep in mind that this option will go away - the plugin will return the data in the correct fields instead.

Love the tune by the way 👍

🥇

So I just tested this and while it does not crash, it also does not get the artist right. I am expecting the artist in this case to be Yamaoka, but it is Neotantra (the album artist)

My config looks like this

directory: ~/beets
library: ~/Data/beets-library.db
plugins: bandcamp
paths:
    default: $artist/$album%aunique{}/$track $title
    singleton: $artist/$title
    comp: $artist/$title
import:
    move: no
    group_albums: yes
bandcamp:
    split_artist_title: yes

and I am running

beet import ~/Desktop/Neotantra\ -\ tʌntrə\ VIIII\ -\ 03\ Yamaoka-Powder.aiff

If you find the time to look into it, that would be amazing.

I see - it is indeed misidentified for single track releases. Until this is solved, try using the edit plugin to make any necessary changes - I won't be able to get this done in the next few days as I'm currently away.

The good news is that I'm halfway through refactoring it all on a separate branch (make-it-simple), which makes the parsing more reliable (it should be more resistant to HTML changes by Bandcamp, which caused the initial issues here). I'm also adding explicit tests for each of the different cases (single track / album / comp / single album track etc.) with documentation on the expected data, so it will be easier to make suggestions going forward.

For example:

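import codecs
from datetime import date
from typing import Tuple

import pytest
from beets.autotag.hooks import TrackInfo
from bs4 import BeautifulSoup

# ReleaseInfo and the Pot fixture are defined elsewhere in the project.


@pytest.fixture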
def single_track_soup(Pot) -> Tuple[BeautifulSoup, str, ReleaseInfo]:
    test_html_file = "tests/single.html"
    url = "https://mega-tech.bandcamp.com/track/matriark-arangel"
    info = ReleaseInfo(  # expected
        **{
            "title": "Matriark - Arangel, by Megatech",
            "type": "song",
            "image": "https://f4.bcbits.com/img/a2036476476_5.jpg",
            "album": "Matriark - Arangel",
            "artist": "Megatech",
            "label": "Megatech",
            "description": " track by Megatech ",
            "release_date": date(2020, 11, 9),
            "track_count": 1,
        }
    )
    info.standalone_trackinfo = TrackInfo(
        info.album,
        url,
        length=421,
        artist=info.artist,
        artist_id=url,
        data_url=url,
        data_source="bandcamp",
        media="Digital Media",
    )
    info.albuminfo = None
    return (Pot(codecs.open(test_html_file, "r", "utf-8").read()), url, info)

This will most likely add additional fields - for example, here you can see label, which would get added if it's an album release.

I'm hoping to have this done around Christmas time.

Looks great.

Before this is solved, try using the edit plugin to make any necessary changes - I won't be able to get this done in the next days as I'm away currently.

Didn't know about the edit plugin. Thanks

This option was originally my suggestion. Unfortunately, it's still kind of necessary because bandcamp makes zero distinction of track artists for compilation albums.

@gryphonmyers could you give a couple of example links? I think I've managed to distinguish them through some post processing but would like to test it on some specific cases that you have :)

Hey @gryphonmyers, the compilation tracks are now split by default:

beet info -l merry x-mas | grep ': [^ ]'
               added: 2020-12-30 02:58:55
               album: MEILLEURS VŒUX IV
            album_id: 104
         albumartist: Casual Gabberz
         albumstatus: Official
          albumtotal: 11
           albumtype: compilation
          art_source: bandcamp
              artist: Von Bikräv
             artpath: /media/music/Compilations/MEILLEURS VŒUX IV/cover.jpg
            bitdepth: 16
             bitrate: 1101kbps
                 bpm: 0
            channels: 2
            comments: Visit https://casualgabberzrecords.bandcamp.com
                comp: True
             country: FR
         data_source: bandcamp
                 day: 25
                disc: 00
           disctotal: 00
            filesize: 23005048
              format: FLAC
               genre: Gabber
                  id: 635
               label: Casual Gabberz
              length: 2:45
    mb_albumartistid: https://casualgabberzrecords.bandcamp.com
          mb_albumid: https://casualgabberzrecords.bandcamp.com/album/meilleurs-v-ux-iv
         mb_artistid: https://casualgabberzrecords.bandcamp.com
          mb_trackid: https://casualgabberzrecords.bandcamp.com/track/merry-x-mas
               media: Digital Media
               month: 12
               mtime: 2020-12-30 17:31:19
        original_day: 25
      original_month: 12
       original_year: 2020
     r128_album_gain: 000000
     r128_track_gain: 000000
       rg_album_gain: 0.0
       rg_album_peak: 0.0
       rg_track_gain: 0.0
       rg_track_peak: 0.0
          samplerate: 44kHz
           singleton: False
               title: Merry X-mas
               track: 03
          tracktotal: 11
                year: 2020

where the album info on bandcamp looks like this:
[image]

As you can see, this now also gives some additional metadata - feel free to check out the dev branch and test it. This won't get merged into master before I fetch the rest of the metadata (correct media / disc info, genre / tags and comments / description) and add appropriate documentation.
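To make the relationship between the album-level fields in the output above (albumartist, comp, albumtype) and the per-track artist concrete, here is a rough sketch. It is an assumption about one possible rule, not the logic on the dev branch: `album_fields` is a hypothetical helper, and "more than one distinct track artist means compilation" is only a heuristic.

```python
from typing import Dict, List, Union


def album_fields(
    albumartist: str, track_artists: List[str]
) -> Dict[str, Union[str, bool]]:
    """Derive album-level fields from per-track artists (hypothetical rule)."""
    # Compare case-insensitively so "DJ X" and "dj x" count as one artist.
    distinct = {artist.casefold() for artist in track_artists}
    comp = len(distinct) > 1
    return {
        "albumartist": albumartist,
        "comp": comp,
        "albumtype": "compilation" if comp else "album",
    }


if __name__ == "__main__":
    # "Someone Else" is a made-up second artist used only for illustration.
    print(album_fields("Casual Gabberz", ["Von Bikräv", "Someone Else"]))
```

The label name stays in albumartist while each track keeps its own artist, which matches the shape of the beet info output above.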

I am guessing I should be testing the make-it-simple branch. I ran the same command as in my earlier comment and got the same result. Maybe this should not be changed yet though?

Hmm, are you on commit #5cc99a5? I ran it and got:

               added: 2020-12-31 23:44:37
         albumartist: Yamaoka
              artist: Yamaoka
            bitdepth: 0
             bitrate: 128kbps
                 bpm: 0
            channels: 2
                comp: False
         data_source: bandcamp
                 day: 00
                disc: 00
           disctotal: 00
            filesize: 8032529
              format: MP3
               genre: Ambient
                  id: 666
              length: 8:21
         mb_artistid: https://neotantra.bandcamp.com
          mb_trackid: https://neotantra.bandcamp.com/track/yamaoka-powder
               month: 00
               mtime: 2020-12-31 23:44:37
        original_day: 00
      original_month: 00
       original_year: 0000
     r128_album_gain: 000000
     r128_track_gain: 000000
       rg_album_gain: 0.0
       rg_album_peak: 0.0
       rg_track_gain: 0.0
       rg_track_peak: 0.0
          samplerate: 44kHz
           singleton: True
               title: Powder
               track: 00
          tracktotal: 00
                year: 0000

As far as I understand, the artist should be Yamaoka. Does this differ from what you expect?
The albumartist can't differ from the artist, though, because of singleton mode. Currently a singleton only has the very basic fields (same as before, actually), since the rich metadata tends to be assigned to the album instead. I'll try to find a way around it once I have the albums sorted.

Not sure about the zeroes in the year etc - this may be something within my configuration. Let me know if you see them too.

For comparison, this is the track info when it's added as part of the album (the country should say GB, but I found this to be an exceptional case - I'll handle it in the next commit):

               added: 2020-12-31 23:44:37
               album: tʌntrə VIIII
            album_id: 109
         albumartist: Neotantra
         albumstatus: Official
          albumtotal: 18
           albumtype: compilation
          art_source: bandcamp
              artist: Yamaoka
             artpath: /media/music/Compilations/tʌntrə VIIII/cover.jpg
            bitdepth: 0
             bitrate: 128kbps
                 bpm: 0
            channels: 2
                comp: True
             country: XW
         data_source: bandcamp
                 day: 29
                disc: 00
           disctotal: 00
            filesize: 8093197
              format: MP3
                  id: 666
               label: Neotantra
              length: 8:21
    mb_albumartistid: https://neotantra.bandcamp.com
          mb_albumid: https://neotantra.bandcamp.com/album/t-ntr-viiii
         mb_artistid: https://neotantra.bandcamp.com
          mb_trackid: https://neotantra.bandcamp.com/track/yamaoka-powder
               media: Digital Media
               month: 08
               mtime: 2020-12-31 23:54:21
        original_day: 29
      original_month: 08
       original_year: 2020
     r128_album_gain: 000000
     r128_track_gain: 000000
       rg_album_gain: 0.0
       rg_album_peak: 0.0
       rg_track_gain: 0.0
       rg_track_peak: 0.0
          samplerate: 44kHz
           singleton: False
               title: Powder
               track: 03
          tracktotal: 18
                year: 2020

Also, just letting you know that the work is now happening in the dev branch (though it only has one more commit than the one you ran).

Actually, I was fetching the right code but forgetting to install it! I'm trying to figure out how to manually install something without a setup.py file. Any tips?

Ok I just used the setup.py from master and it worked great 👍

@gryphonmyers could you give a couple of example links? I think I've managed to distinguish them through some post processing but would like to test it on some specific cases that you have :)

Sorry - missed the notification. Looks like you found a good example though and also an improved workaround! Cool! I'll check this out soon

So sorry - just realised that I forgot to push it out! It only had one more little commit, so you didn't miss out on anything important.

Ok I just used the setup.py from master and it worked great 👍

Just try installing the directory with pip (pip install .). I use this sort of configuration, for example:

# pybeet is the python binary used by the beets virtual environment
alias pybeet=~/.local/pipx/venvs/beets/bin/python
pybeet -m pip install ~/repo/misc/beets-bandcamp

I tend to reinstall while I'm in my music dir testing something, so it's configured to work from there too.

Be aware, though, that setup.py from master has outdated dependencies - if it worked, that's probably because some other package in your environment also required pycountry, which isn't included in the setup.py.

On the other hand, if you ran pip install <dir>, pip should have picked up the pyproject.toml regardless of whether setup.py is present. If you ran python setup.py install, though, it probably needs updating.
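Since the confusion earlier came from fetching the code without installing it, a quick way to check what the beets environment actually has installed may help. This is a small sketch using the standard library's importlib.metadata (Python 3.8+, so not the 3.7 interpreter from the original traceback). `installed_version` is a hypothetical helper, and the two distribution names are taken from this thread, so adjust them if your install is named differently.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Names as they appear in this thread; either may be missing locally.
    for name in ("beetcamp", "beets-bandcamp"):
        print(name, "->", installed_version(name))
```

Run it with the same interpreter that beets uses (the pybeet alias above, for instance); otherwise it reports on a different environment.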

Closing this issue since a fix has been merged into master together with the refactor. pip install beetcamp should work now!