Explicitly opening CMakeLists.txt with UTF-8 encoding

Question

vietjtnguyen opened this issue 3 years ago · 2 comments

This is a minor thing, but we had copy-pasted a comment containing non-ASCII quotes into our deployment's CMakeLists.txt:

# FPP needs location files to detect where symbols come from.  This locations file needs to be built.  In order to do so, we run an internal CMake build specifically to generate these files.  It runs the same CMake files as the normal build.  FPRIME_FPP_LOCS_BUILD is set to true when it is running that build and false otherwise.
# You need to run that check in v3.RC2 because that special build does not define a “${PROJECT_NAME}” target, and thus the target_compile_options will all fail with “missing target”.  I have a fix submitted for v3 that will make that check unnecessary in the final release candidate.
# TODO (vnguyen): When we baseline against the officially release fprime v3 this should be removed.

The open() Python built-in (https://docs.python.org/3/library/functions.html#open) defaults to mode rt (read text), which decodes the file with a platform-dependent default encoding when no encoding argument is given. In our cross-compilation environment that default is apparently ASCII, which caused fprime-util generate to fail:

$ fprime-util generate -DCMAKE_TOOLCHAIN_FILE=/opt/cross_toolchain/aarch64-gnu-4.9.toolchain.cmake
Traceback (most recent call last):
  File "/usr/bin/fprime-util", line 33, in <module>
    sys.exit(load_entry_point('fprime-tools==2.0.2', 'console_scripts', 'fprime-util')())
  File "/usr/lib/python3.6/site-packages/fprime_tools-2.0.2-py3.6.egg/fprime/util/__main__.py", line 14, in main
  File "/usr/lib/python3.6/site-packages/fprime_tools-2.0.2-py3.6.egg/fprime/util/build_helper.py", line 330, in utility_entry
  File "/usr/lib/python3.6/site-packages/fprime_tools-2.0.2-py3.6.egg/fprime/fbuild/builder.py", line 519, in find_nearest_deployment
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 2000: ordinal not in range(128)
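
For anyone who wants to reproduce this without a cross-compilation environment, here is a minimal sketch. It assumes the ASCII default can be simulated by passing encoding="ascii" explicitly; the file contents and the temporary path are hypothetical stand-ins, not our real CMakeLists.txt.

```python
import os
import tempfile

# Hypothetical stand-in for a CMakeLists.txt that contains curly quotes
# (U+201C and U+201D), like the comment we pasted in.
content = 'project(Demo)\n# fails with \u201cmissing target\u201d\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "CMakeLists.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)

    # Assumption: passing encoding="ascii" mimics an environment whose
    # default text encoding is ASCII.
    try:
        with open(path, encoding="ascii") as handle:
            handle.read()
        ascii_error = None
    except UnicodeDecodeError as exc:
        ascii_error = exc

    # Decoding the same file explicitly as UTF-8 succeeds.
    with open(path, encoding="utf-8") as handle:
        utf8_text = handle.read()

print(ascii_error)
print(utf8_text == content)
```

The byte that trips the ASCII codec is 0xe2, the first byte of the UTF-8 encoding of the left curly quote, which matches the traceback above.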

The following git patch opens both files with an explicit UTF-8 encoding. I'm not sure what the ramifications would be for other projects and users, but here it is for your consideration.

diff --git a/src/fprime/fbuild/builder.py b/src/fprime/fbuild/builder.py
index 5e7a2f8..76b66df 100644
--- a/src/fprime/fbuild/builder.py
+++ b/src/fprime/fbuild/builder.py
@@ -524,7 +524,7 @@ class Build:
         if not full_path.parents:
             raise UnableToDetectDeploymentException()
         if list_file.exists():
-            with open(list_file) as file_handle:
+            with open(list_file, encoding='utf8') as file_handle:
                 text = file_handle.read()
             if Build.VALID_CMAKE_LIST.search(text):
                 return full_path
diff --git a/src/fprime/fbuild/cmake.py b/src/fprime/fbuild/cmake.py
index 3e215ef..b7e85c8 100644
--- a/src/fprime/fbuild/cmake.py
+++ b/src/fprime/fbuild/cmake.py
@@ -393,7 +393,7 @@ class CMakeHandler:
         if not os.path.isfile(cmake_file):
             raise CMakeProjectException(source_dir, "No CMakeLists.txt is defined")
         # Test the cmake_file for project(
-        with open(cmake_file) as file_handle:
+        with open(cmake_file, encoding='utf8') as file_handle:
             project_lines = list(
                 filter(lambda line: "project(" in line, file_handle.readlines())
             )

EDIT: I should note the patch is based on 3819631ece03008d75bfa9ad978b9298f651c85c

Answer 1 · 2021-12-08T21:17:07.000Z

This should be a pretty effective change. ASCII is a subset of UTF-8, so plain-ASCII files will decode exactly as before, and files containing UTF-8 characters will now decode as well.
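
A quick sketch of that reasoning, using made-up byte strings rather than anything from the fprime sources. It also shows the one case that would start failing, namely a file saved in some other 8-bit encoding such as Latin-1; whether any project actually has such files is not something this thread establishes.

```python
# ASCII-only bytes decode to the same text under both codecs.
ascii_bytes = b"cmake_minimum_required(VERSION 3.13)\nproject(Demo)\n"
same_text = ascii_bytes.decode("ascii") == ascii_bytes.decode("utf-8")

# Curly quotes encoded as UTF-8 decode cleanly as UTF-8.
quoted = "\u201cmissing target\u201d"
round_trip = quoted.encode("utf-8").decode("utf-8")

# Bytes written in Latin-1 are not valid UTF-8 once they leave the
# ASCII range, so an explicit UTF-8 open would reject them.
latin1_bytes = "caf\u00e9".encode("latin-1")
try:
    latin1_bytes.decode("utf-8")
    latin1_fails = False
except UnicodeDecodeError:
    latin1_fails = True

print(same_text, round_trip == quoted, latin1_fails)
```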

Answer 2 · 2022-01-13T00:37:06.000Z

After digging, it seems that Python uses the platform's default encoding, as set by the user's locale. CMake supports UTF-8 in its source files, so forcing UTF-8 when opening CMake files seems reasonable.
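
To check what a given environment would pick, something like the following sketch may help. It relies on locale.getpreferredencoding(False), which the open() documentation names as the default for the Python 3.6 seen in the traceback; newer Python versions and UTF-8 mode can behave differently, so treat the result as indicative only.

```python
import locale
import tempfile

# Encoding the locale reports as preferred for text data.
preferred = locale.getpreferredencoding(False)

# Encoding a text-mode file actually gets when none is specified.
with tempfile.TemporaryFile("w+") as handle:
    default_used = handle.encoding

print("locale preferred encoding:", preferred)
print("open() default encoding:  ", default_used)
```

If either line reports something like ANSI_X3.4-1968 or US-ASCII, a CMakeLists.txt containing non-ASCII characters will fail to decode unless the encoding is passed explicitly.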