Unleash/unleash

Migrations fail in upgrade from 4.x to 5.x with replicated database

william00179 opened this issue · 7 comments

Describe the bug

When the automatic migrations run on boot after upgrading from 4.x to 5.8.2 the following error is observed on startup.

[2023-07-21T03:14:10.526] [ERROR] server-impl.js - Failed to migrate db error: cannot delete from table "role_permission" because it does not have a replica identity and publishes deletes
    at Parser.parseErrorMessage (/unleash/node_modules/pg-protocol/dist/parser.js:287:98)
    at Parser.handlePacket (/unleash/node_modules/pg-protocol/dist/parser.js:126:29)
    at Parser.parse (/unleash/node_modules/pg-protocol/dist/parser.js:39:38)
    at TLSSocket.<anonymous> (/unleash/node_modules/pg-protocol/dist/index.js:11:42)
    at TLSSocket.emit (node:events:513:28)
    at addChunk (node:internal/streams/readable:324:12)
    at readableAddChunk (node:internal/streams/readable:297:9)
    at Readable.push (node:internal/streams/readable:234:10)
    at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23) {
  length: 261,
  severity: 'ERROR',
  code: '55000',
  detail: undefined,
  hint: 'To enable deleting from the table, set REPLICA IDENTITY using ALTER TABLE.',
  position: undefined,
  internalPosition: undefined,
  internalQuery: undefined,
  where: undefined,
  schema: undefined,
  table: undefined,
  column: undefined,
  dataType: undefined,
  constraint: undefined,
  file: 'execReplication.c',
  line: '641',
  routine: 'CheckCmdReplicaIdentity'
}
[ERROR] error: cannot delete from table "role_permission" because it does not have a replica identity and publishes deletes
    at Parser.parseErrorMessage (/unleash/node_modules/pg-protocol/dist/parser.js:287:98)
    at Parser.handlePacket (/unleash/node_modules/pg-protocol/dist/parser.js:126:29)
    at Parser.parse (/unleash/node_modules/pg-protocol/dist/parser.js:39:38)
    at TLSSocket.<anonymous> (/unleash/node_modules/pg-protocol/dist/index.js:11:42)
    at TLSSocket.emit (node:events:513:28)
    at addChunk (node:internal/streams/readable:324:12)
    at readableAddChunk (node:internal/streams/readable:297:9)
    at Readable.push (node:internal/streams/readable:234:10)
    at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23)

When using replication with postgres all tables must have a PK or replica identity

Steps to reproduce the bug

No response

Expected behavior

The role_permission table should have a PK or replica identity.

Logs, error output, etc.

No response

Screenshots

No response

Additional context

No response

Unleash version

No response

Subscription type

Open source

Hosting type

Self-hosted

SDK information (language and version)

No response

Hi @william00179 can you help me reproduce this?

I've started with a docker-compose.yml file which I usually use for testing, modified it to use a replicated database, and configured Unleash to use v4 (image: unleashorg/unleash-server:4) to later upgrade it to v5, but failing to reproduce the issue:

version: "3.9"
services:
  unleash:
    image: unleashorg/unleash-server:4
    ports:
      - "4242:4242"
    environment:
      DATABASE_URL: "postgres://postgres:unleash@db/unleash"
      DATABASE_SSL: "false"
      LOG_LEVEL: "debug"
      INIT_ADMIN_API_TOKENS: "*:*.unleash-insecure-admin-api-token"
    depends_on:
      db:
        condition: service_healthy
    command: ["node", "index.js"]
    healthcheck:
      test: wget --no-verbose --tries=1 --spider http://localhost:4242/health || exit 1
      interval: 1s
      timeout: 1m
      retries: 5
      start_period: 15s
  db:
    expose:
      - "5432"
    image: bitnami/postgresql:15
    environment:
      # trust incoming connections blindly (DON'T DO THIS IN PRODUCTION!)
      POSTGRES_HOST_AUTH_METHOD: "trust"
      POSTGRESQL_PGAUDIT_LOG: READ,WRITE
      POSTGRESQL_LOG_HOSTNAME: true
      POSTGRESQL_REPLICATION_MODE: master
      POSTGRESQL_REPLICATION_USER: repl_user
      POSTGRESQL_REPLICATION_PASSWORD: repl_password
      POSTGRESQL_DATABASE: unleash
      ALLOW_EMPTY_PASSWORD: yes
    healthcheck:
      test:
        [
          "CMD",
          "pg_isready",
          "--username=postgres",
          "--host=127.0.0.1",
          "--port=5432",
        ]
      interval: 2s
      timeout: 1m
      retries: 5
      start_period: 10s
  db-replica:
    image: bitnami/postgresql:15
    ports:
      - "5432"
    depends_on:
      - db
    environment:
      POSTGRESQL_MASTER_HOST: db
      POSTGRESQL_PGAUDIT_LOG: READ,WRITE
      POSTGRESQL_LOG_HOSTNAME: true
      POSTGRESQL_REPLICATION_MODE: slave
      POSTGRESQL_REPLICATION_USER: repl_user
      POSTGRESQL_REPLICATION_PASSWORD: repl_password
      POSTGRESQL_MASTER_PORT_NUMBER: 5432
      ALLOW_EMPTY_PASSWORD: yes
  • I started the db in background (docker compose up db db-replica -d), then Unleash v4 (docker compose up unleash -d).
  • I waited and opened the UI to validate everything was working and connected to the replica to validate some test data was replicated.
  • I then stopped unleash container (docker compose down unleash), updated the version of Unleash in the docker-compose.yml to 5
  • I started again the container (docker compose up unleash -d), and everything worked as expected.

I'm thinking maybe there are special conditions such as specific DB configuration, or specific unleash versions (notice I've used latest v4 and latest v5).

If possible, try to provide more details or a step-by-step way for reproducing the issue.

Closed due to inactivity.

This is because there is no primary key on the role_permission table its unable to be replicated.

We experience this issue with AWS Aurora Postgres with a writer / reader cluster.

@william00179 do you have any advice on how this can be reproduced or solved? I've tried reproducing this in the past but didn't succeed

This can be solved by supplying a primary key for this table.

Postgres is unable to replicate the row without a primary key or replica identity being defined.

I think we should add primary keys to all tables that have it missing.

  1. role_permission
  2. api_token_project
  3. project_stats

I will handle it.

This should be solved in next release