Invisible UTF-8 BOM char (`\ufeff`) at beginning of script causes error when communicating with Snowflake
dp-rp opened this issue · 1 comments
Describe the bug
When running a schemachange script saved with UTF-8 with BOM encoding, the BOM char causes a SQL compilation error: `syntax error line 1 at position 0 unexpected '\ufeff-'`.
To Reproduce
Steps to reproduce the behavior:
- Save a change script with UTF-8 with BOM encoding (e.g. you can set the encoding in Visual Studio Code and save the file to add the invisible char)
- Try running the script
- See error
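The reproduction can also be scripted without an editor. This is a minimal sketch, assuming Python 3; the file name is hypothetical, and reading the file back as plain `utf-8` is an assumption about how the content ends up decoded, chosen because it yields the same leading characters as the error message:

```python
import codecs
import os
import tempfile

# Hypothetical change script name, used only for illustration.
script_path = os.path.join(tempfile.mkdtemp(), "V1.0.0__bom_example.sql")

# Writing with "utf-8-sig" prepends the 3-byte UTF-8 BOM (EF BB BF).
with open(script_path, "w", encoding="utf-8-sig") as f:
    f.write("-- comment on line 1\nSELECT 1;\n")

# Confirm the BOM is physically present at the start of the file.
with open(script_path, "rb") as f:
    raw = f.read()
has_bom = raw.startswith(codecs.BOM_UTF8)

# Decoding as plain "utf-8" keeps the BOM as the character U+FEFF.
with open(script_path, encoding="utf-8") as f:
    content = f.read()
first_two = content[:2]
```

Here `first_two` holds the BOM followed by the first `-` of the SQL comment, which is the token reported in the syntax error.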
Expected behavior
Schemachange should ignore the zero width no-break space char during SQL compilation.
Alternatively, if UTF-8 (without BOM) encoding is a strict requirement, schemachange should throw an error whose message explicitly states that only UTF-8 without BOM is supported.
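Both options can be sketched in a few lines of Python. This is only an illustration and not schemachange's actual code: `read_script_ignoring_bom` and `read_script_strict` are hypothetical helper names, and the sketch assumes change scripts are meant to be UTF-8.

```python
import codecs


def read_script_ignoring_bom(path):
    """Option 1: decode with "utf-8-sig", which drops a leading BOM if present."""
    with open(path, encoding="utf-8-sig") as f:
        return f.read()


def read_script_strict(path):
    """Option 2: reject files that start with a UTF-8 BOM, with a clear message."""
    with open(path, "rb") as f:
        raw = f.read()
    if raw.startswith(codecs.BOM_UTF8):
        raise ValueError(
            f"{path} is encoded as UTF-8 with BOM; "
            "only UTF-8 without BOM is supported"
        )
    return raw.decode("utf-8")
```

The `utf-8-sig` codec also reads files without a BOM unchanged, so option 1 would not affect scripts that already work.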
Schemachange (please complete the following information):
- Version: 3.6.1
Additional context
A provisioning application in our pipeline calls schemachange and passes through its stdout/stderr. Here are the logs from our pipeline (with potentially sensitive information redacted):
2024-04-29T05:34:27.6745790Z SchemaChange command and arguments: schemachange -f D:\a\1\a/#REDACTED# -a #REDACTED# -u #REDACTED# -r #REDACTED# -w #REDACTED# -d #REDACTED# -c #REDACTED#.#REDACTED#.CHANGE_HISTORY --config-folder D:\a\1\a --create-change-history-table
2024-04-29T05:34:27.6765806Z Checking env vars used by SchemaChange...
2024-04-29T05:34:27.6791928Z [ WARNING! ]: Env var '#REDACTED#' hasn't been set - this may cause templating issues!
2024-04-29T05:34:27.6809225Z Starting SchemaChange process...
2024-04-29T05:34:34.6945752Z schemachange version: 3.6.1
2024-04-29T05:34:34.6954833Z Using config file: D:\a\1\a\schemachange-config.yml
2024-04-29T05:34:34.6965562Z Using root folder D:\a\1\a\#REDACTED#
2024-04-29T05:34:34.6974900Z Using variables:
2024-04-29T05:34:34.6984616Z #REDACTED#
2024-04-29T05:34:34.7122057Z
2024-04-29T05:34:34.7131787Z Using Snowflake account #REDACTED#
2024-04-29T05:34:34.7140633Z Using default role #REDACTED#
2024-04-29T05:34:34.7150643Z Using default warehouse #REDACTED#
2024-04-29T05:34:34.7160010Z Using default database #REDACTED#schema None
2024-04-29T05:34:34.7175506Z Using change history table #REDACTED#.#REDACTED#.CHANGE_HISTORY (last altered 2024-04-11 23:03:45.095000-07:00)
2024-04-29T05:34:34.7184072Z Max applied change script version: 2.0.5
2024-04-29T05:34:34.7193314Z Applying change script V2.0.6__fix_item_views_to_latest.sql
2024-04-29T05:34:34.7198819Z
2024-04-29T05:34:34.7207798Z fail: #REDACTED#[0]
2024-04-29T05:34:34.7217115Z Traceback (most recent call last):
2024-04-29T05:34:34.7226246Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\runpy.py", line 197, in _run_module_as_main
2024-04-29T05:34:34.7235446Z return _run_code(code, main_globals, None,
2024-04-29T05:34:34.7246241Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\runpy.py", line 87, in _run_code
2024-04-29T05:34:34.7255130Z exec(code, run_globals)
2024-04-29T05:34:34.7265857Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\Scripts\schemachange.exe\__main__.py", line 7, in <module>
2024-04-29T05:34:34.7273977Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\schemachange\cli.py", line 896, in main
2024-04-29T05:34:34.7282715Z deploy_command(config)
2024-04-29T05:34:34.7292136Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\schemachange\cli.py", line 577, in deploy_command
2024-04-29T05:34:34.7301169Z session.apply_change_script(script, content, change_history_table)
2024-04-29T05:34:34.7310517Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\schemachange\cli.py", line 462, in apply_change_script
2024-04-29T05:34:34.7319558Z self.execute_snowflake_query(script_content)
2024-04-29T05:34:34.7336931Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\schemachange\cli.py", line 367, in execute_snowflake_query
2024-04-29T05:34:34.7346539Z raise e
2024-04-29T05:34:34.7355724Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\schemachange\cli.py", line 360, in execute_snowflake_query
2024-04-29T05:34:34.7364960Z res = self.con.execute_string(query)
2024-04-29T05:34:34.7374077Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\snowflake\connector\connection.py", line 861, in execute_string
2024-04-29T05:34:34.7383196Z ret = list(stream_generator)
2024-04-29T05:34:34.7392363Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\snowflake\connector\connection.py", line 879, in execute_stream
2024-04-29T05:34:34.7401523Z cur.execute(sql, _is_put_get=is_put_or_get, **kwargs)
2024-04-29T05:34:34.7410897Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\snowflake\connector\cursor.py", line 1080, in execute
2024-04-29T05:34:34.7419680Z Error.errorhandler_wrapper(self.connection, self, error_class, errvalue)
2024-04-29T05:34:34.7429705Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\snowflake\connector\errors.py", line 290, in errorhandler_wrapper
2024-04-29T05:34:34.7438247Z handed_over = Error.hand_to_other_handler(
2024-04-29T05:34:34.7448153Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\snowflake\connector\errors.py", line 345, in hand_to_other_handler
2024-04-29T05:34:34.7457166Z cursor.errorhandler(connection, cursor, error_class, error_value)
2024-04-29T05:34:34.7466966Z File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\snowflake\connector\errors.py", line 221, in default_errorhandler
2024-04-29T05:34:34.7475864Z raise error_class(
2024-04-29T05:34:34.7484951Z snowflake.connector.errors.ProgrammingError: 001003 (42000): SQL compilation error:
2024-04-29T05:34:34.7495847Z syntax error line 1 at position 0 unexpected '\ufeff-'.
2024-04-29T05:34:34.7501343Z
2024-04-29T05:34:34.7528544Z fail: #REDACTED#[0]
2024-04-29T05:34:34.7537519Z SchemaChange failed with exit code 1.
2024-04-29T05:34:34.8940991Z fail: #REDACTED#[0]
2024-04-29T05:34:34.8950908Z Failed to provision the tenant database.
Thank you for reporting the issue. We rely on the snowflake-python-connector rather than adding additional checks for various encodings. We will table this for a future release.
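In the meantime, one possible workaround is to strip the BOM from change scripts in a pipeline step before invoking schemachange. A minimal sketch, assuming the scripts are `.sql` files under a single root folder; `strip_bom_in_folder` is a hypothetical helper and not part of schemachange:

```python
import codecs
from pathlib import Path


def strip_bom_in_folder(root):
    """Rewrite each .sql file under root that starts with a UTF-8 BOM.

    Returns the list of paths that were changed.
    """
    changed = []
    for path in sorted(Path(root).rglob("*.sql")):
        raw = path.read_bytes()
        if raw.startswith(codecs.BOM_UTF8):
            # Drop only the 3 BOM bytes; the rest of the file is untouched.
            path.write_bytes(raw[len(codecs.BOM_UTF8):])
            changed.append(path)
    return changed
```

Because the helper works on raw bytes, it leaves line endings and all other content as they were.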