rbreu/beeref

Exception when URL contains non-ASCII characters

Closed · 1 comment

Describe the bug
When an image whose URL contains non-ASCII characters is dropped onto BeeRef, BeeRef hangs and prints the following error to the terminal:

INFO beeref.fileio: Loading image from file PyQt6.QtCore.QUrl('https://upload.wikimedia.org/wikipedia/commons/a/af/Holden-modified_WW_II_jeep_field-ambulance_for_the_Pacific_Theater–left–National_Archives_fig-11.jpg')
WARNING Qt: QFSFileEngine::open: No file name specified
CRITICAL __main__: Unhandled exception
Traceback (most recent call last):
  File "beeref/fileio/__init__.py", line 100, in run
  File "beeref/fileio/__init__.py", line 63, in load_images
  File "beeref/fileio/image.py", line 89, in load_image
  File "urllib/request.py", line 216, in urlopen
  File "urllib/request.py", line 519, in open
  File "urllib/request.py", line 536, in _open
  File "urllib/request.py", line 496, in _call_chain
  File "urllib/request.py", line 1391, in https_open
  File "urllib/request.py", line 1348, in do_open
  File "http/client.py", line 1283, in request
  File "http/client.py", line 1294, in _send_request
  File "http/client.py", line 1132, in putrequest
  File "http/client.py", line 1212, in _encode_request
UnicodeEncodeError: 'ascii' codec can't encode character '\u2013' in position 94: ordinal not in range(128)

To Reproduce
I'm using BeeRef 0.3.1 installed via Flatpak on Debian 12 (GNOME, Wayland).

Steps to reproduce the behavior:

  1. Start BeeRef.
  2. Open the URL https://upload.wikimedia.org/wikipedia/commons/a/af/Holden-modified_WW_II_jeep_field-ambulance_for_the_Pacific_Theater%E2%80%93left%E2%80%93National_Archives_fig-11.jpg in Firefox.
  3. Drag the image from Firefox to BeeRef.
  4. Observe that BeeRef is no longer responding to user input and a traceback is printed to the terminal.

Expected behavior
BeeRef should import the image.

According to the QUrl documentation, the problem is probably caused by using the wrong URL representation. Apparently, there is an "unencoded representation [...] for showing to users" and an "encoded representation [...] you would send to a web server".

The function beeref.fileio.image.load_image() calls path.url(), which produces the unencoded representation. Replacing it with path.toEncoded() should produce the encoded representation and fix this problem. However, I have not tested this.
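
To illustrate the difference between the two representations without needing PyQt6, here is a minimal standard-library sketch. The helper name to_ascii_url is hypothetical and is not part of BeeRef or Qt. It only percent-encodes the path, query and fragment, and it assumes the host name is already ASCII. I have not checked it against BeeRef itself.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url: str) -> str:
    """Percent-encode non-ASCII characters in a URL's path, query and fragment.

    Hypothetical helper for illustration only. '%' is listed as safe so that
    sequences which are already percent-encoded are not encoded a second time.
    The host name is left untouched (assumed to be ASCII).
    """
    parts = urlsplit(url)
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


# The unencoded URL from the log above, containing U+2013 (en dash).
unencoded = (
    "https://upload.wikimedia.org/wikipedia/commons/a/af/"
    "Holden-modified_WW_II_jeep_field-ambulance_for_the_Pacific_Theater"
    "\u2013left\u2013National_Archives_fig-11.jpg"
)

# This is the same failure mode as in the traceback: the request line
# cannot be encoded as ASCII.
try:
    unencoded.encode("ascii")
except UnicodeEncodeError as exc:
    print("unencoded form fails:", exc.reason)

# The encoded form is pure ASCII and matches the URL shown in Firefox.
encoded = to_ascii_url(unencoded)
encoded.encode("ascii")
print(encoded)
```

The encoded string is the kind of value that would be safe to pass to urllib.request.urlopen(); whether the actual fix takes this route or uses QUrl's own encoded form is not something this sketch tries to show.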

Screenshots
N/A

Debug log
The debug log contains the traceback above.

Thanks a lot for the detailed report! Commit 327126b fixes this and will go out with the next release.