stac-utils/pgstac

Windows UTF-8 Encoding

tariqksoliman opened this issue · 0 comments

In "Connecting pgstac to an external Postgresql" (#203), the following error was mentioned:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 8173: character maps to <undefined>

This is an encoding problem. An immediate workaround on Windows is to set the following environment variable before running pypgstac:

set PYTHONUTF8=1
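To check that the variable was picked up, the interpreter can report whether UTF-8 mode is active. This is only a minimal sketch: `describe_text_encoding` is a hypothetical helper name and not part of pypgstac, and it assumes Python 3.7 or later, where `sys.flags.utf8_mode` exists.

```python
import locale
import sys


def describe_text_encoding():
    """Report the settings that decide how open() decodes text by default.

    Hypothetical diagnostic helper, not part of pypgstac.
    """
    return {
        # True when Python runs in UTF-8 mode (e.g. PYTHONUTF8=1 or -X utf8).
        "utf8_mode": bool(sys.flags.utf8_mode),
        # Encoding open() falls back to when no encoding= argument is given.
        "preferred_encoding": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    print(describe_text_encoding())
```

If `utf8_mode` is still false in the shell where `pypgstac migrate` is run, the variable was probably set in a different session.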

The old error in full was:

C:\Users\tsoliman\Documents>pypgstac migrate
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\tsoliman\AppData\Roaming\Python\Python311\Scripts\pypgstac.exe\__main__.py", line 7, in <module>
  File "C:\Users\tsoliman\AppData\Roaming\Python\Python311\site-packages\pypgstac\pypgstac.py", line 125, in cli
    fire.Fire(PgstacCLI)
  File "C:\Users\tsoliman\AppData\Roaming\Python\Python311\site-packages\fire\core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\tsoliman\AppData\Roaming\Python\Python311\site-packages\fire\core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\tsoliman\AppData\Roaming\Python\Python311\site-packages\fire\core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\tsoliman\AppData\Roaming\Python\Python311\site-packages\pypgstac\pypgstac.py", line 61, in migrate
    return migrator.run_migration(toversion=toversion)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\tsoliman\AppData\Roaming\Python\Python311\site-packages\pypgstac\migrate.py", line 148, in run_migration
    migration_sql = get_sql(file)
                    ^^^^^^^^^^^^^
  File "C:\Users\tsoliman\AppData\Roaming\Python\Python311\site-packages\pypgstac\migrate.py", line 104, in get_sql
    sqlstrs.extend(fd.readlines())
                   ^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 1085: character maps to <undefined>

Explicitly passing encoding='utf-8' where the migration SQL files are opened (the traceback ends in get_sql in pypgstac\migrate.py) would likely fix this for Windows users. Again, the workaround is trivial, though.
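A rough sketch of what such a change could look like, assuming the file is opened with the built-in `open()`. `read_sql_lines` is a hypothetical stand-in and not the actual `get_sql` code, and the sample file content is made up for the demonstration. The character "č" is used because its UTF-8 encoding contains the byte 0x8d, which cp1252 cannot decode.

```python
import tempfile
from pathlib import Path


def read_sql_lines(path):
    """Read a SQL file as UTF-8 regardless of the platform's default encoding.

    Hypothetical stand-in for the file reading done in pypgstac's get_sql.
    """
    # Passing encoding= explicitly avoids falling back to cp1252 on Windows.
    with open(path, encoding="utf-8") as fd:
        return fd.readlines()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sql_file = Path(tmp) / "example.sql"
        # "č" (U+010D) is encoded in UTF-8 as the bytes 0xC4 0x8D.
        sql_file.write_bytes("-- comment with č\nSELECT 1;\n".encode("utf-8"))

        # Decodes cleanly on any platform.
        print(read_sql_lines(sql_file))

        # Decoding the same bytes as cp1252 fails, as in the traceback above.
        try:
            sql_file.read_text(encoding="cp1252")
        except UnicodeDecodeError as exc:
            print(exc)
```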

Thanks!