icatproject/python-icat

Tests may fail with UnicodeDecodeError

Closed this issue · 1 comments

If no locale is active (e.g. the environment variable LANG is not set), tests may fail with an error like UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 5979: ordinal not in range(128). Whether this happens depends on the Python version:

  • Python 3.7: all tests pass.
  • Python 3.3 through 3.6: test_05_setup.py::test_init fails with the error message cited above.
  • Python 2.6 and 2.7: all tests pass.

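What happens is the following: test_05_setup.py::test_init reads the file example_data.yaml, which contains test data. The file is opened in text mode without specifying an encoding, so Python falls back to the default encoding, which "is platform dependent (whatever locale.getpreferredencoding() returns)." If no locale is set in the environment, the default encoding used by Python 3.3 through 3.6 is ASCII, but the file contains non-ASCII characters, so decoding fails. Python 2 is not affected because its built-in open() returns undecoded byte strings in text mode. Python 3.7 is presumably not affected because it coerces the C locale to a UTF-8 based one (PEP 538) or enables UTF-8 mode (PEP 540).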
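
The mechanism can be illustrated without touching the environment. The following is a minimal sketch for Python 3, not code from the test suite: it writes a small temporary file (a hypothetical stand-in for example_data.yaml) and passes encoding="ascii" explicitly to simulate the default that Python 3.3 through 3.6 pick when no locale is set. The helper name read_text is an assumption made for this example.

```python
import locale
import os
import tempfile


def read_text(path, encoding=None):
    """Read a file in text mode; encoding=None means the platform default."""
    with open(path, "rt", encoding=encoding) as f:
        return f.read()


# The encoding that open() falls back to when none is given.
default_encoding = locale.getpreferredencoding(False)

# Hypothetical stand-in for example_data.yaml: one non-ASCII character.
sample = "name: J\u00f6rg\n"
fd, path = tempfile.mkstemp(suffix=".yaml")
with os.fdopen(fd, "wb") as f:
    f.write(sample.encode("utf-8"))  # the umlaut is stored as bytes 0xc3 0xb6

try:
    # Simulates the default encoding in an environment without a locale.
    read_text(path, encoding="ascii")
    ascii_failed = False
    ascii_error = ""
except UnicodeDecodeError as exc:
    ascii_failed = True
    ascii_error = str(exc)

# Decoding the same bytes as UTF-8 succeeds.
utf8_text = read_text(path, encoding="utf-8")
os.remove(path)
```

Printing default_encoding in the affected environment shows which codec open() would use there; ascii_error carries the same kind of message as the one cited above.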

The behavior of the test may be simulated by:

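import yaml

# No encoding is given, so open() uses the locale's default encoding.
with open("doc/examples/example_data.yaml", "rt") as f:
    # safe_load is used because recent PyYAML releases require an explicit loader.
    data = yaml.safe_load(f)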

which raises the same error on the affected Python versions. The issue might be fixed by explicitly setting the LC_CTYPE locale in the test suite:

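import locale
import yaml

# Select a UTF-8 locale so that open() defaults to UTF-8 decoding.
# Raises locale.Error if en_US.UTF-8 is not installed on the system.
locale.setlocale(locale.LC_CTYPE, "en_US.UTF-8")
with open("doc/examples/example_data.yaml", "rt") as f:
    data = yaml.safe_load(f)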
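
Setting the locale depends on en_US.UTF-8 being available on the machine that runs the tests. An alternative, which is not proposed in this issue and is only sketched here as an assumption, would be to pass the encoding explicitly when opening the file. io.open accepts an encoding argument on Python 2.6 and later and is the same function as the built-in open() on Python 3. The helper name read_utf8 and the temporary file are hypothetical and stand in for the real test data.

```python
import io
import os
import tempfile


def read_utf8(path):
    """Return the contents of path decoded as UTF-8, regardless of the locale."""
    with io.open(path, "rt", encoding="utf-8") as f:
        return f.read()


# Hypothetical stand-in for example_data.yaml.
fd, path = tempfile.mkstemp(suffix=".yaml")
with os.fdopen(fd, "wb") as f:
    f.write(u"title: Caf\u00e9\n".encode("utf-8"))

text = read_utf8(path)
os.remove(path)
```

The decoded string could then be handed to the YAML loader in place of the file object, so the result no longer depends on LANG or LC_CTYPE.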