goccy/bigquery-emulator

The BigQuery emulator doesn't support CREATE TABLE AS SELECT with column definition list

pablomarti opened this issue · 2 comments

What happened?

I tried to run a query CREATE TABLE AS SELECT with the BigQuery emulator and it failed with the error:

Failed to create compacted log: 500 failed to analyze: INVALID_ARGUMENT: CREATE TABLE AS SELECT with column definition list is unsupported [at 2:1]; message: failed to analyze: INVALID_ARGUMENT: CREATE TABLE AS SELECT with column definition list is unsupported [at 2:1], reason: jobInternalError

What did you expect to happen?

The BigQuery emulator should be able to support queries CREATE TABLE AS SELECT that include column definition lists.

How can we reproduce it (as minimally and precisely as possible)?

You can reproduce this error by running a query like this one:

CREATE TABLE IF NOT EXISTS {{project}}.{{dataset}}.table2
(
  field_x STRING NOT NULL,
  field_y STRING,
  field_z STRING
)
AS (
SELECT 
  field_a,
  field_b,
  field_c
FROM 
  `{{project}}.{{dataset}}.table1`
)

Anything else we need to know?

I think this is a limitation with SQLite and not an error with the codebase of the BigQuery emulator, after reading the docs https://www.sqlite.org/lang_createtable.html I think it is not possible to pass the column definitions (see points 2 and 3.1).

I still think the BigQuery emulator should take care of this because the emulator's purpose is to convert BigQuery compatible SQL to a SQLite compatible version that produces the same outcome.

One solution can be split the CREATE TABLE AS SELECT into two queries, one that creates the table, and the other one that runs a INSERT INTO [table] SELECT (I tested the last one and it works).

Having this feature CREATE TABLE AS SELECT is important because it allows to define the name of the column and if it allows null values.

Thank you for your work and for setting this BigQuery emulator, it is an excellent project!

Yes, we should update go-zetasqlite to format the query correctly.

The ZetaSQL AST includes ResolvedCreateTableAsSelectStmt.output_column_list() which should give us what we need to format it into SQLite.
https://github.com/google/zetasql/blob/589026c410c42de9aa8ee92ad16f745977140041/docs/resolved_ast.md#resolvedcreatetableasselectstmt

Here's the function that is currently responsible for create table formatting
https://github.com/goccy/go-zetasqlite/blob/main/internal/spec.go#L130