SQL Server deadlocks during recovery/restart
JustinPihony opened this issue · 1 comments
Versions used
Akka version: 2.6.13
Akka Persistence JDBC: 5.0.4
mssql-jdbc: 9.5.0.jre8-preview
Remember Entities is also being used
Expected Behavior
The application should recover and restart any entities without exceptions
Actual Behavior
Upon restarting (normally or through recovery) and an old entity is restarted then a deadlock is encountered.
Relevant logs
RecoveryFailed(com.microsoft.sqlserver.jdbc.SQLServerException: Transaction (Process ID 538) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.)
[ERROR] [...Entity] [] [...actor.default-dispatcher-20] - Supervisor RestartSupervisor saw failure: Exception during recovery. Last known sequence number [0]. PersistenceId [...], due to: Exception during recovery. Last known sequence number [0]. PersistenceId [...], due to: Transaction (Process ID 209) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
akka.persistence.typed.internal.JournalFailureException: Exception during recovery. Last known sequence number [0]. PersistenceId [...], due to: Exception during recovery. Last known sequence number [0]. PersistenceId [...], due to: Transaction (Process ID 209) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
An example deadlocked query set:
(@P0 nvarchar(4000))select max(x2.x3) from (select top (1) "persistence_id" as x4, "sequence_number" as x3 from "event_journal" where "persistence_id" = @P0 order by "sequence_number" desc) x2
AND
@P0 bit,@P1 nvarchar(4000),@P2 bigint,@P3 nvarchar(4000),@P4 bigint,@P5 nvarchar(4000),@P6 varbinary(max),@P7 int,@P8 nvarchar(4000),@P9 varbinary(max),@P10 int,@P11 nvarchar(4000))insert into "event_journal" ("deleted","persistence_id","sequence_number","writer","write_timestamp","adapter_manifest","event_payload","event_ser_id","event_ser_manifest","meta_payload","meta_ser_id","meta_ser_manifest") values (@P0,@P1,@P2,@P3,@P4,@P5,@P6,@P7,@P8,@P9,@P10,@P11) select SCOPE_IDENTITY() AS GENERATED_KEYS
These deadlocks appear to be the result of the schema having VARCHAR
columns (8-bit characters) while the JDBC driver by default (because Java strings are Unicode) transmits strings as NVARCHAR
values (Unicode). This results in SQL Server having to do a data conversion which requires bypassing the indexing and doing a full-table scan (see https://www.sqlshack.com/query-performance-issues-on-varchar-data-type-using-an-n-prefix/).
In the process of doing that scan, row locks seem to build up (pure speculation on my part: locks are acquired synchronously and released asynchronously, so a fast scan acquires locks much faster than it releases them?) which eventually puts SQL Server into a mode where row locks escalate into page locks and then table locks (see https://docs.microsoft.com/en-us/troubleshoot/sql/performance/resolve-blocking-problems-caused-lock-escalation).
The JDBC driver can be configured to send strings as VARCHAR
values; the easiest way in the context of akka-persistence-jdbc
is to append ;sendStringParametersAsUnicode=false
to the configured JDBC connection string. This will allow the index to be used and avoid a table scan and deadlocks.
-
The .Net port uses
NVARCHAR
exclusively in their schema: I think there's a strong case for updating our schema along similar lines to match the default JDBC semantics and prevent the data conversion. -
Migration of existing tables from
VARCHAR
toNVARCHAR
is unclear. Some SO answers seem to indicate it would be a manual process which would involve, in some cases, dropping and recreating primary keys. I'd expect most SQL Server users to have a DBA they can reach out to for expertise in doing that migration, and in the meantime the;sendStringParametersAsUnicode=false
mitigation will work as long they don't need unicode entity IDs, tags, etc. This is something that can be called out in release notes and other documentation.