Parsing error on valid SQL using string literals with single quotes
Opened this issue · 2 comments
Describe the bug
sqlfmt fails to parse valid SQL when using string literals that contain single quotes.
To Reproduce
select '''' || 'quoted_text' || ''''
Expected behavior
sqlfmt runs without errors.
Actual behavior
1 file had errors while formatting.
0 files left unchanged.
test.sql
sqlfmt encountered an error: Could not parse SQL at position 35: '''
Additional context
Reproducible from https://sqlfmt.com/
These two are formatted without errors by sqlfmt, so it's only some specific combinations that break the parsing:
select '''quoted_text'''
select '''' || 'quoted_text'
My guess is we're parsing this as a triple-quoted expression followed by a single quote. We get this right (postgres dollar sign quotes):
select $$'$$ || 'quoted_text' || $$'$$
I thought triple quotes for string literals were more common, but a quick search of pg, mysql, sql server, snowflake, and bq shows that only bigquery allows them. maybe we should pull those out into a bq-specific dialect.
Hey! Is there a date for when this will be fixed? Having the same issue here. And, I'd guess it's potentially recurrent, since the replacement of single quotes is a quite common data quality practice.
Quick Examples:
- in snowflake -> "select '''3229''' as test, REPLACE(test, '''', '') as new",
- in sql server syntax select '''string''' as test, replace('''string''','''','') as test2, etc.
Thanks! :)
My guess is we're parsing this as a triple-quoted expression followed by a single quote. We get this right (postgres dollar sign quotes):
select $$'$$ || 'quoted_text' || $$'$$
I thought triple quotes for string literals were more common, but a quick search of pg, mysql, sql server, snowflake, and bq shows that only bigquery allows them. maybe we should pull those out into a bq-specific dialect.