Migrations fail in upgrade from 4.x to 5.x with replicated database
william00179 opened this issue · 7 comments
Describe the bug
When the automatic migrations run on boot after upgrading from 4.x to 5.8.2 the following error is observed on startup.
[2023-07-21T03:14:10.526] [ERROR] server-impl.js - Failed to migrate db error: cannot delete from table "role_permission" because it does not have a replica identity and publishes deletes
at Parser.parseErrorMessage (/unleash/node_modules/pg-protocol/dist/parser.js:287:98)
at Parser.handlePacket (/unleash/node_modules/pg-protocol/dist/parser.js:126:29)
at Parser.parse (/unleash/node_modules/pg-protocol/dist/parser.js:39:38)
at TLSSocket.<anonymous> (/unleash/node_modules/pg-protocol/dist/index.js:11:42)
at TLSSocket.emit (node:events:513:28)
at addChunk (node:internal/streams/readable:324:12)
at readableAddChunk (node:internal/streams/readable:297:9)
at Readable.push (node:internal/streams/readable:234:10)
at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23) {
length: 261,
severity: 'ERROR',
code: '55000',
detail: undefined,
hint: 'To enable deleting from the table, set REPLICA IDENTITY using ALTER TABLE.',
position: undefined,
internalPosition: undefined,
internalQuery: undefined,
where: undefined,
schema: undefined,
table: undefined,
column: undefined,
dataType: undefined,
constraint: undefined,
file: 'execReplication.c',
line: '641',
routine: 'CheckCmdReplicaIdentity'
}
[ERROR] error: cannot delete from table "role_permission" because it does not have a replica identity and publishes deletes
at Parser.parseErrorMessage (/unleash/node_modules/pg-protocol/dist/parser.js:287:98)
at Parser.handlePacket (/unleash/node_modules/pg-protocol/dist/parser.js:126:29)
at Parser.parse (/unleash/node_modules/pg-protocol/dist/parser.js:39:38)
at TLSSocket.<anonymous> (/unleash/node_modules/pg-protocol/dist/index.js:11:42)
at TLSSocket.emit (node:events:513:28)
at addChunk (node:internal/streams/readable:324:12)
at readableAddChunk (node:internal/streams/readable:297:9)
at Readable.push (node:internal/streams/readable:234:10)
at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23)
When using replication with postgres all tables must have a PK or replica identity
Steps to reproduce the bug
No response
Expected behavior
The role_permission
table should have a PK or replica identity.
Logs, error output, etc.
No response
Screenshots
No response
Additional context
No response
Unleash version
No response
Subscription type
Open source
Hosting type
Self-hosted
SDK information (language and version)
No response
Hi @william00179 can you help me reproduce this?
I've started with a docker-compose.yml file which I usually use for testing, modified it to use a replicated database, and configured Unleash to use v4 (image: unleashorg/unleash-server:4) to later upgrade it to v5, but failing to reproduce the issue:
version: "3.9"
services:
unleash:
image: unleashorg/unleash-server:4
ports:
- "4242:4242"
environment:
DATABASE_URL: "postgres://postgres:unleash@db/unleash"
DATABASE_SSL: "false"
LOG_LEVEL: "debug"
INIT_ADMIN_API_TOKENS: "*:*.unleash-insecure-admin-api-token"
depends_on:
db:
condition: service_healthy
command: ["node", "index.js"]
healthcheck:
test: wget --no-verbose --tries=1 --spider http://localhost:4242/health || exit 1
interval: 1s
timeout: 1m
retries: 5
start_period: 15s
db:
expose:
- "5432"
image: bitnami/postgresql:15
environment:
# trust incoming connections blindly (DON'T DO THIS IN PRODUCTION!)
POSTGRES_HOST_AUTH_METHOD: "trust"
POSTGRESQL_PGAUDIT_LOG: READ,WRITE
POSTGRESQL_LOG_HOSTNAME: true
POSTGRESQL_REPLICATION_MODE: master
POSTGRESQL_REPLICATION_USER: repl_user
POSTGRESQL_REPLICATION_PASSWORD: repl_password
POSTGRESQL_DATABASE: unleash
ALLOW_EMPTY_PASSWORD: yes
healthcheck:
test:
[
"CMD",
"pg_isready",
"--username=postgres",
"--host=127.0.0.1",
"--port=5432",
]
interval: 2s
timeout: 1m
retries: 5
start_period: 10s
db-replica:
image: bitnami/postgresql:15
ports:
- "5432"
depends_on:
- db
environment:
POSTGRESQL_MASTER_HOST: db
POSTGRESQL_PGAUDIT_LOG: READ,WRITE
POSTGRESQL_LOG_HOSTNAME: true
POSTGRESQL_REPLICATION_MODE: slave
POSTGRESQL_REPLICATION_USER: repl_user
POSTGRESQL_REPLICATION_PASSWORD: repl_password
POSTGRESQL_MASTER_PORT_NUMBER: 5432
ALLOW_EMPTY_PASSWORD: yes
- I started the db in background (
docker compose up db db-replica -d
), then Unleash v4 (docker compose up unleash -d
). - I waited and opened the UI to validate everything was working and connected to the replica to validate some test data was replicated.
- I then stopped unleash container (
docker compose down unleash
), updated the version of Unleash in the docker-compose.yml to 5 - I started again the container (
docker compose up unleash -d
), and everything worked as expected.
I'm thinking maybe there are special conditions such as specific DB configuration, or specific unleash versions (notice I've used latest v4 and latest v5).
If possible, try to provide more details or a step-by-step way for reproducing the issue.
Closed due to inactivity.
This is because there is no primary key on the role_permission
table its unable to be replicated.
We experience this issue with AWS Aurora Postgres with a writer / reader cluster.
@william00179 do you have any advice on how this can be reproduced or solved? I've tried reproducing this in the past but didn't succeed
This can be solved by supplying a primary key for this table.
Postgres is unable to replicate the row without a primary key or replica identity being defined.
I think we should add primary keys to all tables that have it missing.
- role_permission
- api_token_project
- project_stats
I will handle it.
This should be solved in next release