taichi-dev/taichi

Loading failure caused by a Chinese user name on Windows

dyangrun opened this issue · 10 comments

Describe the bug
On Windows with a Chinese user name, running the test case fails with an error saying the .bc file could not be loaded.

Log/Screenshots

C:\Users\杨子锐>py I:\源码资源\pythonTest\taichiTest\test.py
[Release mode]
[T 02/03/20 14:54:22.687] [logging.cpp:taichi::Logger::Logger@68] Taichi core started. Thread ID = 9768
[Taichi version 0.4.2, cpu only, commit 832915b0]
[I 02/03/20 14:54:22.725] [memory_pool.cpp:taichi::Tlang::MemoryPool::MemoryPool@14] Memory pool created. Default buffer size per allocator = 1024 MB
[I 02/03/20 14:54:22.726] [taichi_llvm_context.cpp:taichi::Tlang::TaichiLLVMContext::TaichiLLVMContext@57] Creating llvm context for arch: x86_64
[I 02/03/20 14:54:22.778] [C:\Users\鏉ㄥ瓙閿怽AppData\Local\Programs\Python\Python38\lib\site-packages\taichi\lang\impl.py:materialize@124] Materializing layout...
[D 02/03/20 14:54:22.779] [snode.cpp:taichi::Tlang::SNode::create_node@48] Non-power-of-two node size 640 promoted to 1024.
[D 02/03/20 14:54:22.780] [snode.cpp:taichi::Tlang::SNode::create_node@48] Non-power-of-two node size 320 promoted to 512.
[W 02/03/20 14:54:22.781] [taichi_llvm_context.cpp:taichi::Tlang::module_from_bitcode_file@168] Bitcode loading error message:
Invalid bitcode signature
[E 02/03/20 14:54:22.781] [taichi_llvm_context.cpp:taichi::Tlang::module_from_bitcode_file@170] Bitcode C:\Users\鏉ㄥ瓙閿怽AppData\Local\Programs\Python\Python38\Lib\site-packages\taichi\core\../lib/runtime_x86_64.bc load failure.
[E 02/03/20 14:54:22.782] Received signal 22 (SIGABRT)

To Reproduce

import taichi as ti
ti.cfg.debug = True
ti.cfg.arch = ti.x86_64 # Run on the CPU (x86_64) backend

n = 320
pixels = ti.var(dt=ti.f32, shape=(n * 2, n))

@ti.func
def complex_sqr(z):
  return ti.Vector([z[0] * z[0] - z[1] * z[1], z[1] * z[0] * 2])

@ti.kernel
def paint(t: ti.f32):
  for i, j in pixels: # Parallelized over all pixels
    c = ti.Vector([-0.8, ti.sin(t) * 0.2])
    z = ti.Vector([float(i) / n - 1, float(j) / n - 0.5]) * 2
    iterations = 0
    while z.norm() < 20 and iterations < 50:
      z = complex_sqr(z) + c
      iterations += 1
    pixels[i, j] = 1 - iterations * 0.02

gui = ti.GUI("Fractal", (n * 2, n))

for i in range(1000000):
  paint(i * 0.03)
  gui.set_image(pixels)
  gui.show()

If you have local commits (e.g. compile fixes before you reproduce the bug), please make sure you first make a PR to fix the build errors and then report the bug.

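This is not really a bug but an environment problem. I am filing this issue mainly because I happened to run into it and others may run into it later; Windows user names in other non-English languages may cause similar problems.

If loading fails and your Windows user name is not in English, check whether this is the cause.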

Thanks so much for reporting! Yeah, a lot of Chinese users report Invalid bitcode signature on Windows yet I was struggling to reproduce it. I guess it's probably caused by the Chinese characters in paths.

Could anyone help confirm that without the Chinese characters, [Taichi version 0.4.2] works correctly on Windows? Thanks in advance.

I tried it: with an English user name, [Taichi version 0.4.2] works fine. Only a Chinese user name triggers the problem.
There was also a DLL loading error, which I resolved by following the method in #370.

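Great, so that really is the cause. Thank you for running the experiment. I did not take this into account in the early design. According to your output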

[W 02/03/20 14:54:22.781] [taichi_llvm_context.cpp:taichi::Tlang::module_from_bitcode_file@168] Bitcode loading error message:
Invalid bitcode signature
[E 02/03/20 14:54:22.781] [taichi_llvm_context.cpp:taichi::Tlang::module_from_bitcode_file@170] Bitcode C:\Users\鏉ㄥ瓙閿怽AppData\Local\Programs\Python\Python38\Lib\site-packages\taichi\core\../lib/runtime_x86_64.bc load failure.

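it looks like C:\Users\杨子锐\ was converted to C:\Users\鏉ㄥ瓙閿怽A because of an encoding problem, so runtime_x86_64.bc was not found at the right path, which led to Invalid bitcode signature. I am not very familiar with how Chinese paths are encoded on Windows. Do you know of a good solution?

Thanks!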

I wrote a small test program. The result shows:
the UTF-8 encoding of "杨子锐", decoded as GBK, is "鏉ㄥ瓙閿", and the "\" turns into "怽".
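The test described above can be reproduced in a few lines. This is a minimal sketch, not the original test program: it assumes, as the thread suggests, that the path reaches the C++ side as UTF-8 bytes and is then interpreted as GBK. The path fragment is taken from the log, and the variable names are illustrative.

```python
# Path as Python holds it (Unicode), taken from the log in this issue.
original = "C:\\Users\\杨子锐\\AppData"

# Assumption from the thread: the string is handed over as UTF-8 bytes...
utf8_bytes = original.encode("utf-8")

# ...and a GBK (cp936) reader pairs those bytes up differently.
garbled = utf8_bytes.decode("gbk")
print(garbled)

# The mapping is reversible for this path, which supports the diagnosis.
assert garbled.encode("gbk").decode("utf-8") == original
```

"杨子锐" is 9 bytes in UTF-8, and GBK consumes these bytes two at a time, so the ninth byte is paired with the following "\" (0x5C). That is why the separator disappears and "怽" shows up directly in front of AppData in the log.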

From what I found online:
Chinese Windows systems use GBK encoding by default,
whereas Python uses UTF-8 when passing strings to the C++ program.

Consider changing python/taichi/core/util.py:39 to:

lib_dir = os.path.join(package_root(), 'lib')
if get_os_name() == 'win':
    lib_dir = lib_dir.encode('gbk') # It would be even better to get the path encoding from some Windows API
else:
    lib_dir = lib_dir.encode('utf-8')
core.set_lib_dir(lib_dir)

Hopefully this solves the problem.

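I see. Specifying GBK does solve the problem for Chinese characters on Windows. Is there a more systematic approach that would also support languages such as Russian?

You can use the locale module to detect the default encoding of the current system, whether or not it is Windows: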

>>> import locale
>>> locale.getdefaultlocale()
('zh_CN', 'cp936')

Here CP936 is GBK; the two behave exactly the same:

>>> '二三三'.encode('cp936')
b'\xb6\xfe\xc8\xfd\xc8\xfd'
>>> '二三三'.encode('gbk')
b'\xb6\xfe\xc8\xfd\xc8\xfd'

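cp stands for code page, which is how Windows numbers the encodings used for the languages of different countries and regions, comparable to the LC_* environment variables on Linux. For example, 936 = GBK and 65001 = UTF-8. Windows users can view and change the code page of the current terminal with the cmd command chcp.

Consider writing it like this: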
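To make the code page idea concrete, here is a small sketch of how a code page number maps to a Python codec name. The helper encode_for_code_page is hypothetical and exists only for illustration; 1251 is the Windows Cyrillic code page, used here because Russian came up earlier in the thread.

```python
import codecs

# Python names Windows code pages "cp<number>"; cp936 resolves to the GBK codec.
print(codecs.lookup("cp936").name)


def encode_for_code_page(text, code_page):
    """Encode text with the Python codec for a Windows code page number."""
    return text.encode(f"cp{code_page}")


gbk_bytes = encode_for_code_page("二三三", 936)        # Simplified Chinese
cyrillic_bytes = encode_for_code_page("Привет", 1251)  # Cyrillic

# A code page only covers its own repertoire, so a hard-coded one
# cannot work for every user.
try:
    encode_for_code_page("二三三", 1251)
    chinese_fits_cp1251 = True
except UnicodeEncodeError:
    chinese_fits_cp1251 = False
```

The Chinese text cannot be encoded with code page 1251 at all, which is why the encoding probably has to be detected at run time rather than fixed to GBK.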

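import locale
...
def locale_encode(x):
    # Encode x with the system's default locale encoding; fall back to
    # UTF-8 if the encoding cannot be determined.
    try:
        encoding = locale.getdefaultlocale()[1] or 'utf-8'
    except Exception:
        encoding = 'utf-8'
    return x.encode(encoding)
...
core.set_lib_dir(locale_encode(lib_dir))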
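For readers on newer Python versions, a variant of the same idea is sketched below. It is not the code that went into Taichi. It uses locale.getpreferredencoding instead of locale.getdefaultlocale, which has been deprecated since Python 3.11, and the fallback parameter and the decision to fall back to UTF-8 when the path cannot be represented are assumptions made for illustration.

```python
import locale


def locale_encode(path, fallback="utf-8"):
    """Encode path with the locale's preferred encoding, else with fallback."""
    # False: query the current locale setting without modifying it.
    encoding = locale.getpreferredencoding(False) or fallback
    try:
        return path.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        # The path contains characters the locale encoding cannot represent,
        # or Python does not know the encoding name.
        return path.encode(fallback)


lib_dir_bytes = locale_encode("C:\\Users\\杨子锐\\lib")
```

On a Chinese Windows system the preferred encoding would typically be cp936, so the bytes handed to the native side should match what the system APIs expect. Whether the UTF-8 fallback is acceptable depends on how the C++ side opens the file.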

@archibate Cool! That sounds like a more systematic way to solve the problem. Could you open a PR to add this solution?

It looks like this issue has been fixed.

Sorry to dig up an old thread. Were you using Python 3.5 at the time?

Sorry, I only just saw the email. It was not 3.5; I was using 3.8 at the time.