pg_sample doesn't correctly handle generated columns
nicholasd-ff opened this issue · 6 comments
If you create a sample database with a single table as follows:
CREATE TABLE some_numbers(
id int PRIMARY KEY GENERATED ALWAYS AS IDENTITY
, important_value int
, double_value int GENERATED ALWAYS AS (important_value*2) STORED
);
INSERT INTO some_numbers(important_value) (SELECT * FROM generate_series(1,100));
Then dump the table using:
pg_sample --limit="some_numbers=10%" --file=pg_sample_generated_bug.sql
You get a dump file with the generated columns in, which you can't import because the schema prevents it:
\i dump_files/pg_sample_generated_bug.sql
SET
SET
SET
SET
SET
set_config
------------
(1 row)
SET
SET
SET
SET
SET
SET
CREATE TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
SET
SET
SET
SET
SET
ALTER TABLE
psql:dump_files/pg_sample_generated_bug.sql:85: ERROR: extra data after last expected column
CONTEXT: COPY some_numbers, line 1: "14 14 28"
ALTER TABLE
Note that this bug doesn't occur with raw pg_dump, which correctly omits the generated column.
Looks like we can omit the generated columns by querying the information schema:
SELECT column_name, is_generated FROM information_schema.columns WHERE table_name='some_numbers'
column_name | is_generated
-----------------+--------------
id | NEVER
important_value | NEVER
double_value | ALWAYS
I can try and make the change and make a PR if that's useful? I've not written much Perl lately but I think the change would only be a couple of lines around line 696 to extract columns that are not generated:
# Only extract non-generated columns
my ($cols_to_copy) = $dbh->selectrow_array(qq{
SELECT string_agg(column_name, ', ')
FROM information_schema.columns
WHERE table_name=?
AND table_schema=?
AND is_generated='NEVER'
}, undef, ($table->table, $table->schema) );
notice "only copying cols [$cols_to_copy] ";
$dbh->do(qq{
CREATE $unlogged TABLE $sample_table AS
SELECT $cols_to_copy
FROM ONLY $table
$tablesample
WHERE $where
$order
$limit
});
Thank you, @nicholasd-ff ! I will try to review your changes tomorrow.
@mla I wrote a test and found a bug, so I made a PR with the proper fix for you. Hope that's OK!
It currently depends on my earlier PR to update the base image, but if that's a problem I can unstack them for you.
Hm, and I've encountered another bug with the referential integrity stuff I thought I'd covered that with the test but obviously not. I'll investigate and update the PR because I think my generated table needs a FK constraint to properly check the functionality.
And the bug is now fixed, and the PR updated.