3.0.1 database upgrade failure
wolfchimneyrock opened this issue · 4 comments
Description
Registry
Version: 3.0.0.M5 -> 3.0.1 upgrade
Persistence type: postgresql
We perform database schema upgrades with a special 'bootstrap' instance of apicurio which has admin privileges to our postgres database (and can thus create tables, etc) whereas the client serving apicurio instances run with either read-only or read-write (but not db/table admin) access.
typically the bootstrap process starts an admin instance of apicurio, and then waits until /health/ready reports that the storage layer is ready, then stops the admin apicurio (assuming the database was successfully upgraded)
this worked for all of the 2.x upgrades we handled.
with 3.0.x upgrade from db version 100 to 101, it looks like the /health/ready endpoint reports the storage is available before the database schema change is finished - resulting in further failures to start apicurio 3.0.1. rolling back to 3.0.0M5 works fine, since the new tables are ignored.
Steps to Reproduce
start apicurio 3.0.1 with a db version 100
poll /health/ready until response code 200 returned
terminate apicurio using SIGTERM
Expected vs Actual Behaviour
expected: database is fully upgraded before /health/ready reports 200 code.
actual: database upgrade is still in progress when /health/ready reports 200.
Alternatively, the database upgrade could be idempotent and could resume a partial interrupted application.
Logs
WARN [io.qua.run.con.DeprecatedRuntimePropertiesRecorder] (main) The 'quarkus.log.console.color' config property is deprecated and should not be used anymore
INFO [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) SqlRegistryStorage constructed successfully.
INFO [io.api.reg.sto.imp.sql.RegistryDatasourceProducer] (main) Using postgresql SQL storage.
INFO [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) Checking to see if the DB is initialized.
INFO [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) Database was already initialized, skipping.
INFO [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) Checking to see if the DB is up-to-date.
INFO [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) Build's DB version is 101
INFO [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) Old database version detected, upgrading.
INFO [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) Upgrading the Apicurio Hub API database.
INFO [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) Database type: postgresql
INFO [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) From Version: 100
INFO [io.api.reg.sto.imp.sql.SqlRegistryStorage] (main) To Version: 101
ERROR [io.qua.run.Application] (main) Failed to start application (with profile [prod]): java.lang.RuntimeException: Failed to start quarkus
at io.quarkus.runner.ApplicationImpl.doStart(Unknown Source)
at io.quarkus.runtime.Application.start(Application.java:101)
at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:111)
at io.quarkus.runtime.Quarkus.run(Quarkus.java:71)
at io.quarkus.runtime.Quarkus.run(Quarkus.java:44)
at io.quarkus.runtime.Quarkus.run(Quarkus.java:124)
at io.quarkus.runner.GeneratedMain.main(Unknown Source)
Caused by: io.apicurio.registry.storage.impl.sql.jdb.RuntimeSqlException: org.postgresql.util.PSQLException: ERROR: relation "outbox" already exists
at io.apicurio.registry.storage.impl.sql.jdb.UpdateImpl.execute(UpdateImpl.java:28)
at io.apicurio.registry.storage.impl.sql.AbstractSqlRegistryStorage.lambda$upgradeDatabaseRaw$3(AbstractSqlRegistryStorage.java:344)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
at io.apicurio.registry.storage.impl.sql.AbstractSqlRegistryStorage.upgradeDatabaseRaw(AbstractSqlRegistryStorage.java:337)
at io.apicurio.registry.storage.impl.sql.AbstractSqlRegistryStorage.lambda$initialize$0(AbstractSqlRegistryStorage.java:234)
at io.apicurio.registry.storage.impl.sql.AbstractHandleFactory.withHandle(AbstractHandleFactory.java:45)
at io.apicurio.registry.storage.impl.sql.AbstractHandleFactory.withHandleNoException(AbstractHandleFactory.java:98)
at io.apicurio.registry.storage.impl.sql.HandleFactoryProducer_ProducerMethod_produceHandleFactory_NU1A3ot6FkfWU5W0o_6TCeQ4KO4_ClientProxy.withHandleNoException(Unknown Source)
at io.apicurio.registry.storage.impl.sql.AbstractSqlRegistryStorage.initialize(AbstractSqlRegistryStorage.java:223)
at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass.initialize$$superforward(Unknown Source)
at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass$$function$$110.apply(Unknown Source)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
at io.apicurio.common.apps.logging.LoggingInterceptor.logMethodEntry(LoggingInterceptor.java:53)
at io.apicurio.common.apps.logging.LoggingInterceptor_Bean.intercept(Unknown Source)
at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
at io.apicurio.registry.metrics.health.readiness.PersistenceTimeoutReadinessInterceptor.intercept(PersistenceTimeoutReadinessInterceptor.java:29)
at io.apicurio.registry.metrics.health.readiness.PersistenceTimeoutReadinessInterceptor_Bean.intercept(Unknown Source)
at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
at io.apicurio.registry.metrics.health.liveness.PersistenceExceptionLivenessInterceptor.intercept(PersistenceExceptionLivenessInterceptor.java:25)
at io.apicurio.registry.metrics.health.liveness.PersistenceExceptionLivenessInterceptor_Bean.intercept(Unknown Source)
at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
at io.apicurio.registry.metrics.StorageMetricsInterceptor.intercept(StorageMetricsInterceptor.java:41)
at io.apicurio.registry.metrics.StorageMetricsInterceptor_Bean.intercept(Unknown Source)
at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass.initialize(Unknown Source)
at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage.initialize(SqlRegistryStorage.java:35)
at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass.initialize$$superforward(Unknown Source)
at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass$$function$$109.apply(Unknown Source)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
at io.apicurio.common.apps.logging.LoggingInterceptor.logMethodEntry(LoggingInterceptor.java:53)
at io.apicurio.common.apps.logging.LoggingInterceptor_Bean.intercept(Unknown Source)
at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
at io.apicurio.registry.metrics.health.readiness.PersistenceTimeoutReadinessInterceptor.intercept(PersistenceTimeoutReadinessInterceptor.java:29)
at io.apicurio.registry.metrics.health.readiness.PersistenceTimeoutReadinessInterceptor_Bean.intercept(Unknown Source)
at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
at io.quarkus.arc.impl.AroundInvokeInvocationContext$NextAroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:97)
at io.apicurio.registry.metrics.health.liveness.PersistenceExceptionLivenessInterceptor.intercept(PersistenceExceptionLivenessInterceptor.java:25)
at io.apicurio.registry.metrics.health.liveness.PersistenceExceptionLivenessInterceptor_Bean.intercept(Unknown Source)
at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:70)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
at io.apicurio.registry.metrics.StorageMetricsInterceptor.intercept(StorageMetricsInterceptor.java:41)
at io.apicurio.registry.metrics.StorageMetricsInterceptor_Bean.intercept(Unknown Source)
at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_Subclass.initialize(Unknown Source)
at io.apicurio.registry.storage.impl.sql.SqlRegistryStorage_ClientProxy.initialize(Unknown Source)
at io.apicurio.registry.storage.RegistryStorageProducer.raw(RegistryStorageProducer.java:94)
at io.apicurio.registry.storage.RegistryStorageProducer.current(RegistryStorageProducer.java:51)
at io.apicurio.registry.storage.RegistryStorageProducer_ProducerMethod_current_KUuvkPhY_4l3q6EULlF1GPXbKes_Bean.doCreate(Unknown Source)
at io.apicurio.registry.storage.RegistryStorageProducer_ProducerMethod_current_KUuvkPhY_4l3q6EULlF1GPXbKes_Bean.create(Unknown Source)
at io.apicurio.registry.storage.RegistryStorageProducer_ProducerMethod_current_KUuvkPhY_4l3q6EULlF1GPXbKes_Bean.create(Unknown Source)
at io.quarkus.arc.impl.AbstractSharedContext.createInstanceHandle(AbstractSharedContext.java:119)
at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:38)
at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:35)
at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.c120(Unknown Source)
at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.computeIfAbsent(Unknown Source)
at io.quarkus.arc.impl.AbstractSharedContext.get(AbstractSharedContext.java:35)
at io.quarkus.arc.impl.ClientProxies.getApplicationScopedDelegate(ClientProxies.java:21)
at io.apicurio.registry.storage.RegistryStorageProducer_ProducerMethod_current_KUuvkPhY_4l3q6EULlF1GPXbKes_ClientProxy.arc$delegate(Unknown Source)
at io.apicurio.registry.storage.RegistryStorageProducer_ProducerMethod_current_KUuvkPhY_4l3q6EULlF1GPXbKes_ClientProxy.isReady(Unknown Source)
at io.apicurio.common.apps.config.impl.DynamicConfigStartup.onStart(DynamicConfigStartup.java:46)
at io.apicurio.common.apps.config.impl.DynamicConfigStartup_Bean.doCreate(Unknown Source)
at io.apicurio.common.apps.config.impl.DynamicConfigStartup_Bean.create(Unknown Source)
at io.apicurio.common.apps.config.impl.DynamicConfigStartup_Bean.create(Unknown Source)
at io.quarkus.arc.impl.AbstractSharedContext.createInstanceHandle(AbstractSharedContext.java:119)
at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:38)
at io.quarkus.arc.impl.AbstractSharedContext$1.get(AbstractSharedContext.java:35)
at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.c55(Unknown Source)
at io.quarkus.arc.generator.Default_jakarta_enterprise_context_ApplicationScoped_ContextInstances.computeIfAbsent(Unknown Source)
at io.quarkus.arc.impl.AbstractSharedContext.get(AbstractSharedContext.java:35)
at io.quarkus.arc.impl.ClientProxies.getApplicationScopedDelegate(ClientProxies.java:21)
at io.apicurio.common.apps.config.impl.DynamicConfigStartup_ClientProxy.arc$delegate(Unknown Source)
at io.apicurio.common.apps.config.impl.DynamicConfigStartup_ClientProxy.arc_contextualInstance(Unknown Source)
at io.apicurio.common.apps.config.impl.DynamicConfigStartup_Observer_Synthetic_NzGHHybqmtc7Ct8_H1RkCbxYXQo.notify(Unknown Source)
at io.quarkus.arc.impl.EventImpl$Notifier.notifyObservers(EventImpl.java:351)
at io.quarkus.arc.impl.EventImpl$Notifier.notify(EventImpl.java:333)
at io.quarkus.arc.impl.EventImpl.fire(EventImpl.java:80)
at io.quarkus.arc.runtime.ArcRecorder.fireLifecycleEvent(ArcRecorder.java:155)
at io.quarkus.arc.runtime.ArcRecorder.handleLifecycleEvents(ArcRecorder.java:106)
at io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294.deploy_0(Unknown Source)
at io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294.deploy(Unknown Source)
... 7 more
Caused by: org.postgresql.util.PSQLException: ERROR: relation "outbox" already exists
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2725)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2412)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:371)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:502)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:419)
at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:194)
at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:155)
at io.agroal.pool.wrapper.PreparedStatementWrapper.executeUpdate(PreparedStatementWrapper.java:90)
at io.apicurio.registry.storage.impl.sql.jdb.UpdateImpl.execute(UpdateImpl.java:26)
... 102 more
Thank you for reporting an issue!
Pinging @jsenko to respond or triage.
additionally, SIGTERM is meant to be a "soft" shutdown - maybe if a database upgrade is in process and it receives SIGTERM, it will wait until the upgrade is complete? or maybe a different signal SIGUSR1 etc... could be used?
We've had a bug in our upgrade scripts that is now fixed on main. We will do a 3.0.2 release today that fixes the issue.
Apologies to @wolfchimneyrock and others. As @carlesarnal says we had a bug in the upgrade scripts that is now resolved. Version 3.0.2 is released and should fix the problem.