Weird error with unicode characters in path "Wildcards don't work in the directory specification"
ccoenen opened this issue · 18 comments
This may not be a bug in mini_exiftool, but right now, i don't really know what to make of it. If you could help me pin it down, that would be amazing.
I'm on windows, and i have path names that contain unicode characters. One path looks like this:
C:\tmp\2015-03-23 Test with german umlaut äöü\IMG_1000.JPG
Now if i fire up irb, i can open that file with ruby, but not with mini_exiftool:
f = File.open("C:/tmp/2015-03-23 Test with german umlaut äöü/IMG_1000.JPG")
# => #<File:C:/tmp/2015-03-23 Test with german umlaut äöü/IMG_1000.JPG>
f.size
# => 18713
# so far, so good! Let's try mini_exiftool, now.
require 'mini_exiftool'
# => true
m = MiniExiftool.new("C:/tmp/2015-03-23 Test with german umlaut äöü/IMG_1000.JPG")
# MiniExiftool::Error: Wildcards don't work in the directory specification
# No matching files
# from C:/Tools/Ruby21/lib/ruby/gems/2.1.0/gems/mini_exiftool-2.5.0/lib/mini_exiftool.rb:137:in `load'
# from C:/Tools/Ruby21/lib/ruby/gems/2.1.0/gems/mini_exiftool-2.5.0/lib/mini_exiftool.rb:101:in `initialize'
# from (irb):11:in `new'
# from (irb):11
# from C:/Tools/Ruby21/bin/irb:11:in `<main>'
note that this is not a "file not found", because i can easily provoke that:
m = MiniExiftool.new("C:/tmp/2015-03-23 Test with german umlaut äöü/IMG_1337.JPG")
# MiniExiftool::Error: File 'C:/tmp/2015-03-23 Test with german umlaut äöü/IMG_1337.JPG' does not exist.
# from C:/Tools/Ruby21/lib/ruby/gems/2.1.0/gems/mini_exiftool-2.5.0/lib/mini_exiftool.rb:121:in `load'
# ...
the same example works fine if i change the directory name to omit the äöü part:
m = MiniExiftool.new("C:/tmp/2015-03-23 Test without german umlaut/IMG_1000.JPG")
# => #<MiniExiftool:0x3031f50 @opts={:numerical=>false, :composite=>true, ...
The error message (Wildcards don't work in the directory specification) does not come from anywhere within mini_exiftool, at least not that i can find it with github's code search.
Exiftool itself is also not at fault (at least not alone), because i can do this without a problem:
> exiftool.exe "C:\tmp\2015-03-23 Test with german umlaut äöü\IMG_1000.JPG"
ExifTool Version Number : 9.90
File Name : IMG_1000.JPG
...
I'm really somewhat stuck.
This does not change if i'm using backslashes instead of forward slashes.
I'm doing a lot to handle encoding and escaping particularly for filenames in mini_exiftool. What is the result of
Encoding.find('filesystem')
on your windows system?
Encoding.find('filesystem')
# => #<Encoding:Windows-1252>
This is a Windows 7 (x64) machine with this environment:
C:\Users\user>bundler env
Bundler 1.7.12
Ruby 2.1.5 (2014-11-13 patchlevel 273) [i386-mingw32]
Rubygems 2.4.6
This seems to be correct. I have no idea. Maybe a look at the executed command line will be helpful:
$DEBUG = true
m = MiniExiftool.new("C:/tmp/2015-03-23 Test with german umlaut äöü/IMG_1337.JPG")
exiftool -j "C:/tmp/2009-03-07 test Path ???/IMG_4224.JPG"
- it seems to replace the umlauts with question marks, which are a wildcard on windows (single character).
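One way question marks like these can appear is a lossy transcoding step. The sketch below is only an illustration of that mechanism in plain Ruby; it is not taken from mini_exiftool's source, and the thread does not establish that this is the exact conversion involved. The target encoding (US-ASCII) and the example path are assumptions chosen to make the effect visible.

```ruby
# encoding: UTF-8
# Illustration only: NOT mini_exiftool's actual code path.
umlauts = "\u00E4\u00F6\u00FC" # the characters äöü, written as escapes
path = "C:/tmp/test Path #{umlauts}/IMG_1000.JPG"

# Transcoding to an encoding that lacks these characters, with
# undefined characters replaced instead of raising, yields "?".
lossy = path.encode("US-ASCII", undef: :replace)
puts lossy

# Without the :replace option the same conversion raises, so the
# problem surfaces before a wildcard-looking path reaches the shell.
raised = false
begin
  path.encode("US-ASCII")
rescue Encoding::UndefinedConversionError => e
  raised = true
  puts "conversion failed: #{e.message}"
end
```

Because `?` matches any single character on Windows, a path mangled this way would plausibly be treated as a wildcard pattern rather than as a literal file name.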
In which encoding is your source file written? Do you use the correct magic comment? http://en.wikibooks.org/wiki/Ruby_Programming/Encoding#Using_Encodings
The examples earlier were from irb, with no encoding set, explicitly.
Here's all of the encoding outputs for reference
Encoding.find('external')
# => #<Encoding:CP850>
Encoding.find('internal')
# => nil
Encoding.find('filesystem')
# => #<Encoding:Windows-1252>
Encoding.find('locale')
# => #<Encoding:CP850>
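For anyone wanting to report the same information, the four lookups can be collected in one go. This is a small convenience sketch using only `Encoding.find`; the variable names are made up for the example.

```ruby
# Collect the four encodings Ruby distinguishes into one hash.
# Encoding.find returns nil for "internal" when no default is set.
names = %w[external internal filesystem locale]
encodings = names.map { |name| [name, Encoding.find(name)] }.to_h

# Print one line per encoding, e.g. for pasting into a bug report.
encodings.each { |name, enc| puts "#{name.ljust(10)} #{enc.inspect}" }
```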
From within the irb i ran the following commands:
Encoding.default_external = 'utf-8'
# "utf-8"
Encoding.default_internal = 'utf-8'
# "utf-8"
require 'mini_exiftool'
# true
m = MiniExiftool.new("C:/tmp/2009-03-07 test Path äöü/IMG_1000.JPG")
# MiniExiftool::Error: Wildcards don't work in the directory specification
# No matching files
# ...
I also put this into a ruby file (and i double checked that it was actually saved as UTF-8)
#encoding: UTF-8
Encoding.default_external = 'utf-8'
Encoding.default_internal = 'utf-8'
require 'mini_exiftool'
m = MiniExiftool.new("C:/tmp/2009-03-07 test Path äöü/IMG_4224.JPG")
puts m
It fails with the same Wildcards-Error-Message.
Could you try (UTF-8 encoded)?
#encoding: UTF-8
puts `exiftool.exe "C:/tmp/2009-03-07 test Path äöü/IMG_1000.JPG"`
it can't find the file, but what i find more interesting is, that the encoding is wonky, so maybe it's already broken before it hits exiftool? I ran these lines:
# encoding: UTF-8
# äöü
require 'open3'
Encoding.default_external = 'UTF-8'
paths = [
"\"C:/tmp/test äöü/201412050001hq.jpg\"",
"\"C:\\tmp\\test äöü\\201412050001hq.jpg\""
]
paths.each do |path|
puts "## Run with path: #{path}"
puts "*backticks*\n"
out = `exiftool.exe #{path} 2>&1`
puts ' ' + out
puts ' ' + out.force_encoding(Encoding.find('filesystem')).encode('UTF-8')
puts "*popen3*\n"
stdin, stdout, _ = Open3.popen3("exiftool.exe #{path} 2>&1")
stdin.close
out = stdout.read
puts ' ' + out
puts ' ' + out.force_encoding(Encoding.find('filesystem')).encode('UTF-8')
end
which produces this output:
## Run with path: "C:/tmp/test äöü/201412050001hq.jpg"
*backticks*
File not found: C:/tmp/test 巼/201412050001hq.jpg
File not found: C:/tmp/test äöü/201412050001hq.jpg
*popen3*
File not found: C:/tmp/test 巼/201412050001hq.jpg
File not found: C:/tmp/test äöü/201412050001hq.jpg
## Run with path: "C:\tmp\test äöü\201412050001hq.jpg"
*backticks*
File not found: C:/tmp/test 巼/201412050001hq.jpg
File not found: C:/tmp/test äöü/201412050001hq.jpg
*popen3*
File not found: C:/tmp/test 巼/201412050001hq.jpg
File not found: C:/tmp/test äöü/201412050001hq.jpg
The broken characters may not end up correctly in here, so i also made a screenshot from Notepad++, where broken characters are displayed as hex:
(just to make sure: i ran the same test on Ruby 2.2.1x64 on windows just now. Same output)
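The test script above relabels the output with `force_encoding` before transcoding it. The sketch below shows why that step matters: the same bytes read under two different code pages give different text. Windows-1252 and CP850 are used because both appear in the encoding listing earlier in this thread; which code pages actually produced the garbled output above is not established here.

```ruby
# encoding: UTF-8
umlauts = "\u00E4\u00F6\u00FC" # the characters äöü, written as escapes

# The same characters as raw Windows-1252 bytes (one byte each).
cp1252_bytes = umlauts.encode("Windows-1252").b

# Reinterpret those bytes under another code page (CP850 here), then
# transcode to UTF-8 for display: bytes unchanged, text different.
misread = cp1252_bytes.dup.force_encoding("CP850").encode("UTF-8")

# Reading them under the code page they were written in round-trips.
restored = cp1252_bytes.dup.force_encoding("Windows-1252").encode("UTF-8")

puts cp1252_bytes.bytes.map { |b| format("%02X", b) }.join(" ")
puts misread
puts restored
```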
This might be interesting: http://www.sno.phy.queensu.ca/~phil/exiftool/exiftool_pod.html#windows_unicode_file_names this has been introduced/changed on 2015-01-04. My tests have been with 9.90, so this might explain some of the encoding weirdness.
I tried specifying -charset FileName=cp1252 (and UTF8, while i was at it); it didn't change the file not found. As long as that does not work in any way, i don't think mini_exiftool is to blame. If i can't get to the file from a simple backtick or popen3, i don't think mini_exiftool can.
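For reference, this is roughly how the option can be passed from Ruby as an argument vector instead of a single shell string, which at least takes manual quoting out of the picture. `exiftool_command` is a hypothetical helper named for this example, and the option spelling is copied from the comment above. As noted, this did not fix the lookup in these tests, so treat it as a way to reproduce the attempt, not as a solution.

```ruby
require 'open3'

# Hypothetical helper: builds an argument vector rather than one
# shell string, so the path does not need hand-written quotes.
def exiftool_command(path, filename_charset)
  ["exiftool.exe", "-charset", "FileName=#{filename_charset}", path]
end

# "\u00E4\u00F6\u00FC" is äöü, written as escapes.
cmd = exiftool_command("C:/tmp/test \u00E4\u00F6\u00FC/201412050001hq.jpg", "UTF8")
puts cmd.inspect

# Uncomment to run (assumes exiftool.exe is on the PATH):
# out, status = Open3.capture2e(*cmd)
# puts out
```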
How should i continue? Do we close this ticket unresolved (upstream problem somewhere)? Do we leave it open?
I don't get it?! I can use umlaut files with multi_exiftool?! What the actual f*ck?! Sorry. I'm going to post an example over there in the next few hours.
Here's the change i did, that fixes umlauts and lets all of multi_exiftools tests pass. ccoenen/multi_exiftool@4b836a6 For some reason, though, i can't get it to work in backticks or popen3 (as described above).
I'm a few years late here, but I had the same issue. My fix was to install ruby and exiftool on Linux, and it worked perfectly.
Hope this comment will be helpful.
@ManuelSamudio12 The intention was to get it working under Windows. ;-)
Same here. I'm trying to create a context menu entry with a batch file:
REG ADD "HKCR\*\shell\ExifTool\command" /t REG_SZ /d "\"%systemroot%\system32\cmd.exe\" /K exiftool \"%%L\"" /f
This works just fine if the file is in a normally named folder, but it does not work if the file is in a folder with special characters; in my case the special character is Δ.
But when i cd to the directory that contains the special character (or open a command prompt in that folder) and then type exiftool <file>, it works just fine (exiftool shows information about the file).