RileyXX/IMDB-Trakt-Syncer

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 3982: character maps to <undefined>

Closed this issue · 9 comments

Is there already an issue for your problem?

  • I have checked older issues, open and closed

Bug Description

Traceback (most recent call last):
  File "C:\Users\denni\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\IMDBTraktSyncer\IMDBTraktSyncer.py", line 110, in main
    imdb_watchlist, imdb_ratings, imdb_reviews, errors_found_getting_imdb_reviews = imdbData.getImdbData(imdb_username, imdb_password, driver, directory, wait)
                                                                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\denni\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\IMDBTraktSyncer\imdbData.py", line 42, in getImdbData
    for row in reader:
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.1264.0_x64__qbz5n2kfra8p0\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 3982: character maps to <undefined>

Environment

Installed in Windows 11
Installed Python 3.11.4 (tags/v3.11.4:d2340ef, Jun 7 2023, 05:45:37) [MSC v.1934 64 bit (AMD64)]
Installed with python -m pip install IMDBTraktSyncer
and then I ran:
IMDBTraktSyncer in the folder C:\Users\xxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\Scripts>

Screenshots

No response

Hi, thanks for the detailed bug report! It looks like there was a decoding error when handling the watchlist file from IMDB, likely caused by a byte that the cp1252 codec cannot decode (0x81 at position 3982 in the traceback). A potential fix for this bug should now be implemented in v1.4.0. Let me know if this fixes it for you, otherwise see below.
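For anyone hitting the same traceback, here is a minimal sketch of the failure and one common remedy. It assumes the exported CSV is UTF-8 encoded and that the file was being opened with the Windows default codec (cp1252), as the traceback suggests. The helper name `read_csv_rows` and the sample row are made up for illustration, and this is not necessarily how v1.4.0 fixes it.

```python
import csv
import os
import tempfile


def read_csv_rows(path, encoding="utf-8"):
    """Read every row of a CSV file using an explicit encoding."""
    # newline="" is what the csv module documentation recommends for file objects.
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f))


# Build a tiny UTF-8 file. "Á" is encoded as the bytes C3 81, and 0x81
# has no mapping in cp1252, which is the byte named in the traceback.
tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".csv")
tmp.write("Const,Title\ntt0000001,Ánimo\n".encode("utf-8"))  # hypothetical row
tmp.close()

try:
    read_csv_rows(tmp.name, encoding="cp1252")
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

rows = read_csv_rows(tmp.name)  # explicit UTF-8 decodes the same file cleanly
os.remove(tmp.name)
```

Passing `encoding` explicitly keeps the behavior the same across platforms instead of depending on the system locale.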

For pip install you can update to the latest version using the following:

python -m pip install IMDBTraktSyncer --upgrade

Alternatively, you can download the latest .zip from the releases page. Please note: if you plan to run IMDBTraktSyncer without installing it through pip, it is recommended to uninstall the pip version first to avoid conflicts. You can do this by running python -m pip uninstall IMDBTraktSyncer from the command line.

If you are still getting this error after upgrading to v1.4.0, I will need more information from your IMDB watchlist file to locate the problematic character(s) and implement a fix.

To do this:

  1. Log in to imdb.com and navigate to https://www.imdb.com/list/watchlist.
  2. At the bottom of the page click Export this list to download the WATCHLIST.csv file.
  3. Attach the CSV file here OR open it in a text editor such as Notepad++, copy as many lines as you can around the problematic character (the traceback reports byte position 3982, which is an offset rather than a line number), and paste them here.
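If scrolling through the file by hand is impractical, a short script can report where the undecodable bytes are. This is only a sketch: `find_undecodable_bytes` is a hypothetical helper, not part of IMDBTraktSyncer, and it assumes cp1252 is the codec that failed, as in the traceback above.

```python
def find_undecodable_bytes(data, encoding="cp1252"):
    """Return (offset, byte value) pairs for bytes that `encoding` cannot decode."""
    hits = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode(encoding)
            break  # the rest of the data decodes cleanly
        except UnicodeDecodeError as exc:
            bad = pos + exc.start  # exc.start is relative to the slice
            hits.append((bad, data[bad]))
            pos += exc.end  # resume right after the offending byte
    return hits


# Small in-memory demo; for a real file, read it in binary mode instead:
#   with open("WATCHLIST.csv", "rb") as f:
#       data = f.read()
sample = b"abc\x81def\x8dg"
hits = find_undecodable_bytes(sample)
for offset, value in hits:
    print(f"offset {offset}: byte 0x{value:02x}")
```

The reported offsets can then be used to copy the surrounding text from the file.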

Thanks for a fast fix!
However, I turned off watchlist syncing and am now getting another error:

Traceback (most recent call last):
  File "C:\Users\denni\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\IMDBTraktSyncer\IMDBTraktSyncer.py", line 318, in main
    button = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '[data-testid="hero-rating-bar__user-rating"] button.ipc-btn')))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\denni\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\selenium\webdriver\support\wait.py", line 95, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
Stacktrace:
Backtrace:
        GetHandleVerifier [0x00716E73+48323]
        (No symbol) [0x006A9661]
        (No symbol) [0x005B5308]
        (No symbol) [0x005E0B45]
        (No symbol) [0x005E0CDB]
        (No symbol) [0x0060E3D2]
        (No symbol) [0x005FA924]
        (No symbol) [0x0060CAC2]
        (No symbol) [0x005FA6D6]
        (No symbol) [0x005D847C]
        (No symbol) [0x005D957D]
        GetHandleVerifier [0x0097FD5D+2575277]
        GetHandleVerifier [0x009BF86E+2836158]
        GetHandleVerifier [0x009B96DC+2811180]
        GetHandleVerifier [0x007A41B0+626688]
        (No symbol) [0x006B314C]
        (No symbol) [0x006AF4B8]
        (No symbol) [0x006AF59B]
        (No symbol) [0x006A21B7]
        BaseThreadInitThunk [0x76E27D59+25]
        RtlInitializeExceptionChain [0x7774B74B+107]
        RtlClearBits [0x7774B6CF+191]

Hi, no problem, and thanks again for taking the time to provide these detailed bug reports! They are very helpful for finding bugs and making the script more robust.

So just to clarify with the first error, are you still getting this error when watchlist sync is enabled?

As for the second error, it looks like something went wrong while rating items on IMDB. My guess is that it was rating items successfully until one of them raised this error and broke the script. Does this seem accurate? One possible cause is that an item it was trying to rate didn't exist on IMDB because of an invalid IMDB ID from Trakt, which returned a 404 error on IMDB. As of v1.4.1, the script should handle these exceptions properly without breaking.
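The general idea of not letting one bad item stop the whole run can be sketched as follows. This is an illustration only, not the code in v1.4.1: `rate_all` and `fake_rate` are hypothetical names, and the broad `except Exception` stands in for narrower exceptions such as Selenium's `TimeoutException`.

```python
def rate_all(items, rate_one):
    """Try to rate every item; collect failures instead of aborting the run."""
    failed = []
    for index, item in enumerate(items, start=1):
        try:
            rate_one(item)
        except Exception as exc:  # real code would catch narrower exception types
            print(f"Failed to rate item ({index} of {len(items)}): {item}: {exc}")
            failed.append(item)
    return failed


def fake_rate(imdb_id):
    """Stand-in for the browser step; rejects one made-up ID."""
    if imdb_id == "tt0000000":
        raise LookupError("404: title page not found")


failed = rate_all(["tt1788391", "tt0000000", "tt10954984"], fake_rate)
```

Returning the failed items makes it possible to log them or retry them later rather than losing the rest of the batch.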

You can update to the latest version using the following:

python -m pip install IMDBTraktSyncer --upgrade

If you are still getting either of these errors please let me know so I can investigate further. Thanks!

Hi!
Yes, the first error was when watchlist was enabled. The second error was when it was disabled.
I have now updated to the latest version and watchlist is still disabled. But I still get this error:

Traceback (most recent call last):
  File "C:\Users\denni\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\IMDBTraktSyncer\IMDBTraktSyncer.py", line 319, in main
    button = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '[data-testid="hero-rating-bar__user-rating"] button.ipc-btn')))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\denni\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\selenium\webdriver\support\wait.py", line 86, in until
    value = method(self._driver)
            ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\denni\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\selenium\webdriver\support\expected_conditions.py", line 82, in _predicate
    return driver.find_element(*locator)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\denni\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\selenium\webdriver\remote\webdriver.py", line 740, in find_element
    return self.execute(Command.FIND_ELEMENT, {"using": by, "value": value})["value"]

It seems to start as it should:

Successfully signed in to IMDB
Processing Trakt Data
Processing Trakt Data Complete
Processing IMDB Data
Processing IMDB Data Complete
Setting Trakt Ratings
Rating TV show (1 of 30): Aspiranterna (1998): 5/10 on Trakt
Rating TV show (2 of 30): Trazan Apansson (1978): 5/10 on Trakt
Rating episode (3 of 30): Chock: Dödsängeln (1997): 4/10 on Trakt
Rating episode (4 of 30): Chock: Kött (1997): 5/10 on Trakt
Rating TV show (5 of 30): Byhåla 2 - Tillbaka till Fårrden (1992): 7/10 on Trakt
Rating episode (6 of 30): Chock: Helljus (1997): 4/10 on Trakt
Rating episode (7 of 30): Chock: På heder och samvete (1997): 5/10 on Trakt
Rating episode (8 of 30): Wallander: Indrivaren (2010): 5/10 on Trakt
Rating TV show (9 of 30): I.K. - Ivar Kreuger (1998): 5/10 on Trakt
Rating TV show (10 of 30): 24: Live Another Day (2014): 7/10 on Trakt
Rating TV show (12 of 30): Arne Dahl: Upp till toppen av berget (2012): 7/10 on Trakt
Rating TV show (13 of 30): Arne Dahl: Europa blues (2012): 7/10 on Trakt
Rating TV show (14 of 30): Arne Dahl: De största vatten (2012): 7/10 on Trakt
Rating TV show (15 of 30): Rasmus på luffen (1986): 8/10 on Trakt
Rating TV show (16 of 30): Weakest Link (2001): 3/10 on Trakt
Rating TV show (19 of 30): Mattemorden (2015): 5/10 on Trakt
Rating TV show (21 of 30): Arne Dahl: En midsommarnattsdröm (2015): 5/10 on Trakt
Rating TV show (22 of 30): Arne Dahl: Dödsmässa (2015): 6/10 on Trakt
Rating TV show (23 of 30): Arne Dahl: Mörkertal (2015): 5/10 on Trakt
Rating TV show (24 of 30): Arne Dahl: Efterskalv (2015): 5/10 on Trakt
Rating TV show (25 of 30): Arne Dahl: Himmelsöga (2015): 6/10 on Trakt
Rating TV show (26 of 30): Byhåla 3 (1993): 6/10 on Trakt
Rating TV show (28 of 30): Brandvägg (2006): 4/10 on Trakt
Rating TV show (29 of 30): Inside Look: The People v. O.J. Simpson - American Crime Story (2016): 8/10 on Trakt
Setting Trakt Ratings Complete
Setting IMDB Ratings
Rating movie: (1 of 1009) Kill List (2011): 4/10 on IMDB
Failed to rate movie: (1 of 1009) Kill List (2011): 4/10 on IMDB (tt1788391)
Rating movie: (2 of 1009) Knock at the Cabin (2023): 5/10 on IMDB
Failed to rate movie: (2 of 1009) Knock at the Cabin (2023): 5/10 on IMDB (tt15679400)
Rating movie: (3 of 1009) Nope (2022): 4/10 on IMDB
Failed to rate movie: (3 of 1009) Nope (2022): 4/10 on IMDB (tt10954984)
Rating movie: (4 of 1009) Saw: Rebirth (2005): 5/10 on IMDB
Failed to rate movie: (4 of 1009) Saw: Rebirth (2005): 5/10 on IMDB (tt0818519)
Rating movie: (5 of 1009) Gantz:O (2016): 6/10 on IMDB
Failed to rate movie: (5 of 1009) Gantz:O (2016): 6/10 on IMDB (tt5923962)
Rating movie: (6 of 1009) Paradise Lost: The Child Murders at Robin Hood Hills (1996): 8/10 on IMDB
Failed to rate movie: (6 of 1009) Paradise Lost: The Child Murders at Robin Hood Hills (1996): 8/10 on IMDB (tt0117293)
Rating movie: (7 of 1009) American Night (2021): 4/10 on IMDB
Failed to rate movie: (7 of 1009) American Night (2021): 4/10 on IMDB (tt5344054)
Rating movie: (8 of 1009) Blood Moon (2021): 4/10 on IMDB
Failed to rate movie: (8 of 1009) Blood Moon (2021): 4/10 on IMDB (tt11828648)
Rating movie: (9 of 1009) Detachment (2011): 6/10 on IMDB
Failed to rate movie: (9 of 1009) Detachment (2011): 6/10 on IMDB (tt1683526)
Rating movie: (10 of 1009) Stillwater (2021): 5/10 on IMDB
Failed to rate movie: (10 of 1009) Stillwater (2021): 5/10 on IMDB (tt10696896)
Rating movie: (11 of 1009) Halloween Kills (2021): 4/10 on IMDB
Failed to rate movie: (11 of 1009) Halloween Kills (2021): 4/10 on IMDB (tt10665338)
Rating movie: (12 of 1009) Venom: Let There Be Carnage (2021): 5/10 on IMDB
Failed to rate movie: (12 of 1009) Venom: Let There Be Carnage (2021): 5/10 on IMDB (tt7097896)
Rating movie: (13 of 1009) Shang-Chi and the Legend of the Ten Rings (2021): 7/10 on IMDB
Failed to rate movie: (13 of 1009) Shang-Chi and the Legend of the Ten Rings (2021): 7/10 on IMDB (tt9376612)
Rating movie: (14 of 1009) Texas Chainsaw Massacre (2022): 5/10 on IMDB
Failed to rate movie: (14 of 1009) Texas Chainsaw Massacre (2022): 5/10 on IMDB (tt11755740)
Rating movie: (15 of 1009) Turning Red (2022): 7/10 on IMDB
Failed to rate movie: (15 of 1009) Turning Red (2022): 7/10 on IMDB (tt8097030)
Rating movie: (16 of 1009) Fresh (2022): 5/10 on IMDB
Failed to rate movie: (16 of 1009) Fresh (2022): 5/10 on IMDB (tt13403046)
Rating movie: (17 of 1009) Death on the Nile (2022): 6/10 on IMDB
Failed to rate movie: (17 of 1009) Death on the Nile (2022): 6/10 on IMDB (tt7657566)
Rating movie: (18 of 1009) The Batman (2022): 8/10 on IMDB
Failed to rate movie: (18 of 1009) The Batman (2022): 8/10 on IMDB (tt1877830)
Rating movie: (19 of 1009) The Sadness (2021): 5/10 on IMDB
Failed to rate movie: (19 of 1009) The Sadness (2021): 5/10 on IMDB (tt13872248)
Rating movie: (20 of 1009) The Bad Guys (2022): 7/10 on IMDB
Failed to rate movie: (20 of 1009) The Bad Guys (2022): 7/10 on IMDB (tt8115900)
Rating movie: (21 of 1009) The Lost City (2022): 6/10 on IMDB
Failed to rate movie: (21 of 1009) The Lost City (2022): 6/10 on IMDB (tt13320622)
Rating movie: (22 of 1009) The Northman (2022): 5/10 on IMDB
Failed to rate movie: (22 of 1009) The Northman (2022): 5/10 on IMDB (tt11138512)
Rating movie: (23 of 1009) The Jack Bull (1999): 6/10 on IMDB
Failed to rate movie: (23 of 1009) The Jack Bull (1999): 6/10 on IMDB (tt0171410)
Rating movie: (24 of 1009) Wanderlust (2012): 6/10 on IMDB
Failed to rate movie: (24 of 1009) Wanderlust (2012): 6/10 on IMDB (tt1655460)
Rating movie: (25 of 1009) Watcher (2022): 4/10 on IMDB
Failed to rate movie: (25 of 1009) Watcher (2022): 4/10 on IMDB (tt12004038)
Rating movie: (26 of 1009) Small Town Killers (2017): 7/10 on IMDB
Failed to rate movie: (26 of 1009) Small Town Killers (2017): 7/10 on IMDB (tt5458566)
Rating movie: (27 of 1009) Everything Everywhere All at Once (2022): 9/10 on IMDB
Failed to rate movie: (27 of 1009) Everything Everywhere All at Once (2022): 9/10 on IMDB (tt6710474)
Rating movie: (28 of 1009) Uncharted (2022): 5/10 on IMDB

I see now, thanks for clarifying!

So it seems the first error should be fixed. Currently, even when watchlist sync is disabled, the script still fetches your watchlist data but won't make any changes to it. That's why you were getting the error related to reading the watchlist file even with watchlist sync disabled. A future update will skip this step when the corresponding sync option is disabled, but as of right now it has no effect other than a longer initial processing time.

 button = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '[data-testid="hero-rating-bar__user-rating"] button.ipc-btn')))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As for the second error, I wasn't able to reproduce it, but the traceback indicates that the script is timing out while waiting for the rate button to appear on IMDB when rating items. Based on the log output you provided, the script is using the correct IMDB ID, so it should be connecting to the correct URL. I think the cause of this error might be rate limiting on the IMDB website (a 429 or 503 error, for example). This can sometimes happen when processing large amounts of data.
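If rate limiting is the cause, one common mitigation is to retry with increasing delays. The sketch below is an assumption about how that could look, not what v1.5.0 actually does; `call_with_backoff` and its parameters are hypothetical.

```python
import time


def call_with_backoff(action, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call action(); after a failure wait base_delay * 2**n seconds and retry."""
    for attempt in range(attempts):
        try:
            return action()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts, let the caller see the error
            sleep(base_delay * 2 ** attempt)


# Demo with a stand-in action that fails twice before succeeding.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"


delays = []
result = call_with_backoff(flaky, sleep=delays.append)  # record delays instead of sleeping
```

The `sleep` parameter is injectable only so the demo runs instantly; in real use the default `time.sleep` would apply.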

I also noticed some other potential issues while investigating where the script would execute before certain elements were loaded while rating, adding or removing items on IMDB. This could also be another potential cause of the error you described.

Both of these issues have been patched in the latest version, v1.5.0. When you get some time, give it a try and let me know how it goes. The latest version also has proper error logging in log.txt, which is in the same folder as the script. So if you're still getting the same behavior on v1.5.0, you can copy any relevant error logs and paste them here.

python -m pip install IMDBTraktSyncer --upgrade

If v1.5.0 doesn't fix this error it will be a little tricky to troubleshoot. You can do a little troubleshooting yourself if you're decent with a text editor. If you open IMDBTraktSyncer.py in a text editor like Notepad++ and look for the line options.add_argument("--headless=new"), you can put a # before it like this #options.add_argument("--headless=new") then save it and it will unhide the browser so you can see exactly what's happening when the script is rating items. This might give some clues to what's happening. If you installed with pip the script will most likely be in C:\Users\denni\AppData\Local\Programs\Python\Python311\Lib\site-packages\IMDBTraktSyncer\IMDBTraktSyncer.py.


Closing this issue as fixed. If you are still experiencing this error leave another comment here and I will reopen the issue for further investigation. Thanks!

Thanks to a video provided by another user in #56, it appears this issue may originally have been caused by the setting Show reference view with full cast and crew (advanced view) located here https://www.imdb.com/preferences/general.

When enabled, the links to the movie pages https://www.imdb.com/title/tt10521092/ are redirected to https://www.imdb.com/title/tt10521092/reference which has a different layout causing the script to fail.
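A simple way to detect this situation is to inspect the URL the browser ends up on after the redirect. The sketch below is illustrative only: `is_reference_view` is a hypothetical helper, and it assumes the reference layout is always served from a path ending in /reference, as in the example above.

```python
from urllib.parse import urlsplit


def is_reference_view(url):
    """Return True if the URL path ends in /reference (trailing slash ignored)."""
    path = urlsplit(url).path.rstrip("/")
    return path.endswith("/reference")


normal = is_reference_view("https://www.imdb.com/title/tt10521092/")
redirected = is_reference_view("https://www.imdb.com/title/tt10521092/reference")
```

With Selenium, the value to check would typically be `driver.current_url` after the page has loaded.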

I will work on implementing a fix for this in the next patch. In the meantime, as a workaround, unticking the setting Show reference view with full cast and crew (advanced view) at https://www.imdb.com/preferences/general, saving, and then running the script again should solve this.

A fix for the Show reference view with full cast and crew (advanced view) IMDB setting causing the script to fail has been implemented in v1.6.0.