pearu/pylibtiff

The 'ascii' encoding cannot open images whose file names contain special characters

yang-521 opened this issue · 13 comments

Test environment:
Windows 10
Python 3.8.10
pylibtiff 0.4.4
File name: 483×55³MDI.tif

site-packages\libtiff\libtiff_ctypes.py", line 489, in open
tiff = libtiff.TIFFOpen(filename, mode.encode('ascii'))
ctypes.ArgumentError: argument 1: <class 'TypeError'>: wrong type


The .encode('ascii') call is applied to the mode argument, not the filename. Something earlier in the call, or perhaps what you're actually passing to .open, seems to be wrong. Could you share the actual code you are running? Do you have a longer/fuller traceback for the exception?

If the file name does not contain ³, the file can be read normally.

from libtiff import TIFF
path = r"C:\Users\Yang\Desktop\483×55³MDI.tif"
image = TIFF.open(path)
Traceback (most recent call last):
  File "D:\python\file_tools\test.py", line 5, in <module>
    image = TIFF.open(path)
  File "D:\python\file_tools\lib\site-packages\libtiff\libtiff_ctypes.py", line 489, in open
    tiff = libtiff.TIFFOpen(filename, mode.encode('ascii'))
ctypes.ArgumentError: argument 1: <class 'TypeError'>: wrong type
Warning: filename argument is of wrong type or encoding: 'gbk' codec can't encode character '\xb3' in position 28: illegal multibyte sequence

OK, so this has nothing to do with the ascii encoding. Because of pylibtiff's historic error handling, the failed filename encoding is only reported as a printed warning (the one that mentions gbk). The later error about an incorrect argument type (ArgumentError/TypeError) is a follow-on failure: it is raised afterwards, when an exception is passed to TIFFOpen.
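To illustrate the second half of that explanation, here is a minimal sketch of how ctypes reacts when a C function declared to take a `char*` receives something other than bytes. It does not use pylibtiff at all: `PyBytes_FromString` from CPython is only a stand-in for `TIFFOpen`, and `c_func` and `call_with` are hypothetical names chosen for this example.

```python
import ctypes

# Stand-in for TIFFOpen: a C function declared to accept a char* argument.
# PyBytes_FromString is used only because it ships with CPython.
c_func = ctypes.pythonapi.PyBytes_FromString
c_func.argtypes = [ctypes.c_char_p]
c_func.restype = ctypes.py_object


def call_with(value):
    """Return the C call's result, or the name of the ctypes error raised."""
    try:
        return c_func(value)
    except ctypes.ArgumentError as exc:
        return type(exc).__name__


print(call_with(b"demo.tif"))         # bytes are accepted as char*
print(call_with("demo.tif"))          # an unencoded str is rejected
print(call_with(ValueError("boom")))  # so is any other object
```

Under these assumptions, anything that reaches the C call without having been encoded to bytes produces the same "wrong type" ArgumentError seen in the traceback, regardless of why the encoding step failed.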

A lot of this code has been updated but apparently not released. Could you try installing pylibtiff from the github master branch and see if you get a different result?

I updated to 0.6.1, but I don't have the libtiff.dll file and I don't know where to download a valid one. I downloaded one earlier, but it doesn't seem to be the right file. 😂

I found a dll file that works, but in my tests it still couldn't read the file name with special characters.

When I rename the file so that it contains only ordinary characters, the following message appears:
[screenshot]

I don't know whether this is new behavior, but I did not run into it in version 0.4.4.

You have to pip install from the source code to generate the dll.

I tried pip install many times, but strangely no dll file was generated. As a workaround I copied the dll file from version 0.4.4 into the system, and with that, ordinary file names can be read normally. The message "JPEG compression support is not configured" no longer appears, but file names with special characters still cannot be read.

I'm not sure I fully understand, but I realize you might be talking about the libtiff dll for the C libtiff library and not anything from pylibtiff, right? I also don't understand what situations worked for you and which ones didn't.

Just like the other filename issue you created (#152), some of these problems seem to come from a disconnect/conflict between your Python installation and your filesystem. As far as I understand and remember, the github master version of pylibtiff should be handling non-ascii filenames in the best way possible. I'm not sure how else to test these types of things, since a lot of them seem to be specific to your system.

I'll try again and see what the problem is

I can't find any other problems, but one test result makes me think this is an ASCII encoding problem: I tried to write this file name into a CSV file opened with the ASCII encoding and it failed, while with UTF-8 it was written normally.

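```python
import csv

file = "483×55³MDI.tif"
# Opening the CSV with encoding='ascii' is what makes the write fail;
# with encoding='utf-8' the same row is written normally.
with open(r"C:\Users\Yang\Desktop\test.csv", 'w', encoding='ascii', newline='') as fp:
    writer = csv.writer(fp)
    writer.writerow([file])
```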

So does the ASCII encoding not support these special characters?

Your errors have nothing to do with "ascii". As your warning/error message said it is trying to use the gbk encoding:

Warning: filename argument is of wrong type or encoding: 'gbk' codec can't encode character '\xb3' in position 28: illegal multibyte sequence

There is some conflict/inconsistency between the encoding your file system is actually using to store filenames and what encoding python/file system is saying it is using.

As mentioned, the "ascii" you see in the code is all about the mode keyword argument which is not the problem here.
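For anyone who wants to check this on their own machine, the following is a small sketch using only the Python standard library. It tests which codecs can represent the file name from this issue and prints the encodings Python reports for the file system and the locale. The helper name `can_encode` is hypothetical and is not part of pylibtiff.

```python
import locale
import sys


def can_encode(text, encoding):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


name = "483×55³MDI.tif"

# Try the codecs mentioned in this thread.
for enc in ("ascii", "gbk", "utf-8"):
    print(enc, can_encode(name, enc))

# What Python believes the file system and the locale are using.
print("filesystem encoding:", sys.getfilesystemencoding())
print("preferred encoding:", locale.getpreferredencoding(False))
```

Of the three codecs, only utf-8 can represent the full name. gbk accepts × but fails on ³, which matches the position reported in the warning, and ascii already fails on ×, which is why the CSV test above also failed for a reason unrelated to pylibtiff.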

Ok, so does pylibtiff support reading byte objects? I want to avoid file name problems by converting images to byte objects.

I used TIFFfile instead and it reads the special characters normally