Error related with availability group
Closed this issue · 14 comments
We've got following error when execute sp_WhoIsActive:
Executed as user: *******\sqlservice. Warning: Null value is eliminated by an aggregate or other SET operation.
[SQLSTATE 01003] (Message 8153) The target database, '*****', is participating in an availability group and is
currently not accessible for queries. Either data movement is suspended or the availability replica is not enabled
for read access. To allow read-only access to this and other databases in the availability group, enable read access
to one or more secondary availability replicas in the group. For more information, see the ALTER AVAILABILITY
GROUP statement in SQL Server Books Online. [SQLSTATE 42000] (Error 976, Level 14, State 1, Line 38)
Warning: The join order has been enforced because a local join hint is used. [SQLSTATE 01000] (Error 8625).
The step failed.
Version: Who Is Active? v11.35 (2020-10-04)
Additional context:
This error occurs when we try to additionally use Telegraf plugin for monitoring SQL Server.
We have Always On Availability Groups.
This error occurs on healthy cluster.
My idea that Telegraf
exec some queries on not available db and sp_WhoIsActive
tries to get additional information about this query and fails.
Could you assist us to solve this issue? Any ideas?
Thank you!
This is a similar issue to here. I can add in a similar fix.
@erikdarlingdata, It will be great!
Reviewing things a little bit, it looks like the two places this might happen are in the locks_cursor
and the blocks_cursor
, since those seem to be the only two queries that try to explicitly access a database.
@MaxShoshin can you confirm this is where you hit the error? Or provide the command that runs when you hit the error?
We still get error (on your branch):
Message 976, Level 14, State 1, Line 331
The target database, '********', is participating in an availability group and is currently
not accessible for queries. Either data movement is suspended or the availability replica
is not enabled for read access. To allow read-only access to this and other databases in
the availability group, enable read access to one or more secondary availability replicas
in the group. For more information, see the ALTER AVAILABILITY GROUP statement in
SQL Server Books Online.
@MaxShoshin okay, you're gonna have to give me more information than that. I don't have an AG set up to test things on, so I was going by the most obvious potential spots where the error might be raised.
Can you do some troubleshooting locally to try to figure out where it's happening? I'm not even sure what command you're running to generate the error.
Thanks!
@erikdarlingdata, we call simple sp_WhoIsActive
When Telegraf is working we got errors.
It seems that telegraf call some queries from this file:
https://github.com/influxdata/telegraf/blob/master/plugins/inputs/sqlserver/sqlqueriesV2.go
@MaxShoshin right, but I don't have an AG or use Telegraf for anything, so this is where you're gonna have to help me out. I have no idea where else in the query to try the change to help you out. I can't repro your environment to work on this issue, sorry.
I've tried to reproduce this with docker-compose... unsuccessfully...
May be we can try to run different version of sp_whoisactive with additional diagnostics?
We are ready to provide additional information, but we didn't know what information do you need or how we can collect it (like original line number where error occurs).
Thank you!
Sure, here's what you'll wanna do:
EXEC sp_WhoIsActive
@show_own_spid = 1,
@get_full_inner_text = 1;
Run that, and click on the sql_text
column, that'll give you the whole query that WhoIsActive runs. Paste that into a new SSMS window, and get rid of the XML artifacts like <?query --
and --?>
.
After that, you'll have to declare a few variables to make things run correctly:
DECLARE
@i bigint = 922337203685477580,
@recursion smallint = 1,
@blocker bit = 0;
You'll also wanna delete some of the header information that comes along:
(@recursion SMALLINT, @filter sysname, @not_filter sysname, @first_collection_ms_ticks BIGINT)DECLARE @blocker BIT;
SET @blocker = 0;
DECLARE @i INT;
SET @i = 2147483647;
After that, you can hit F5 and it'll run. Lemme know if that gives you a better idea of where the error is coming from.
Thanks!
@MaxShoshin In addition to the above, for the impacted databases, can you please tell us the value of the secondary_role_allow_connections_desc
column in the sys.availability_replicas
view?
secondary_role_allow_connections_desc = NO
See sql_text for sql text.
We got following error:
Message 976, level 14, state 1, line 38
The target database, '******************************', is participating in an availability group and is currently not accessible for queries. Either data movement is suspended or the availability replica is not enabled for read access. To allow read-only access to this and other databases in the availability group, enable read access to one or more secondary availability replicas in the group. For more information, see the ALTER AVAILABILITY GROUP statement in SQL Server Books Online.
@MaxShoshin can you let me know which version of SQL Server this is happening on? There are behavioral changes post-2016.
Also, what state is the database in? If it's reverting, initializing, still being recovered, etc. it might also need to be handled differently.
We have several versions of SQL servers:
14.0.3238.1
14.0.3381.3
14.0.3411.3
@MaxShoshin haven't heard back from you in a while, so I'm going to close this one out.