averemee-si/oracdc

Missing data when mining on (RAC + DataGuard with Single Instance Standby)

Felix0525 opened this issue · 14 comments

Missing data when mining on (RAC + DataGuard with Single Instance Standby)

Hi
I have encountered a problem, when mining on physical standby database (primary is RAC).
Some data will be lost

Here is v$archived_log view rows:

NAME |THREAD#|SEQUENCE#|FIRST_CHANGE#|NEXT_CHANGE#|BYTES |CREATE_TIME |ARCHIVED_TIME |ACTUAL_LAG_SECONDS |
-----------------------------------------------+-------+---------+-------------+------------+---------+-------------------+-------------------+-----------------------------------------+
/app/oracle/oradata/arch/1_19641_1029337344.dbf| 1| 19641| 73690669482| 73690670319| 199680|2022-12-01 06:36:55|2022-12-01 06:37:09| 10770|
/app/oracle/oradata/arch/2_18701_1029337344.dbf| 2| 18701| 73690669387| 73690670437| 264704|2022-12-01 06:36:53|2022-12-01 06:37:12| 10772|
/app/oracle/oradata/arch/1_19640_1029337344.dbf| 1| 19640| 73688099480| 73690669482|501363200|2022-11-30 22:31:47|2022-12-01 06:36:55| 39878|
/app/oracle/oradata/arch/1_19642_1029337344.dbf| 1| 19642| 73690670319| 73693211872|902452224|2022-12-01 06:37:09|2022-12-01 09:14:33| 10756|
/app/oracle/oradata/arch/2_18702_1029337344.dbf| 2| 18702| 73690670437| 73693758285|902059008|2022-12-01 06:37:12|2022-12-01 09:28:08| 10753|

Here is the SQL

select NAME, THREAD#, SEQUENCE#, FIRST_CHANGE#, NEXT_CHANGE#, BLOCKS*BLOCK_SIZE BYTES, FIRST_TIME, (SYSDATE-FIRST_TIME)*86400 ACTUAL_LAG_SECONDS
from V$ARCHIVED_LOG
where ARCHIVED='YES' and STANDBY_DEST='NO' and DELETED='NO' and (? >= FIRST_CHANGE# or ? <= NEXT_CHANGE#)
and SEQUENCE# >=
(select min(SEQUENCE#)
from V$ARCHIVED_LOG
where ARCHIVED='YES' and STANDBY_DEST='NO' and ? between FIRST_CHANGE# and NEXT_CHANGE# and THREAD#=?)
and THREAD#=?
order by SEQUENCE#;

Current THREAD# value is 1

Is there any way to solve this problem?

Thanks

Hi Felix,

Is standby single instance database?

Best regards,
Aleksei

Hi Aleksei

Is standby single instance database?
Yes

OK, thanks.
This occurs due to fact that that connector executes on start

select VERSION, INSTANCE_NUMBER, INSTANCE_NAME, HOST_NAME, THREAD#,
       (select nvl(CPU_CORE_COUNT_CURRENT, CPU_COUNT_CURRENT) from V$LICENSE) CPU_CORE_COUNT_CURRENT
from   V$INSTANCE;

to detect THREAD# to connect.
I need some time to think about the best ways to solve this problem. It is possible that we need to do the same as for the RAC - but instead of active instance detection detection of redo threads is required.

Working on it.

Best regards,
Aleksei

Yes, I read the code and know how to handle for RAC. The connector will start more than one tasks, and each instance is sent to a different topic.

But in this case, how to get the "complete" data of a table, how to ensure the order of the event(because data from the same table is sent to multiple partitions)

First - I will add logic to include missed THREAD#'s for single instance standby for RAC.
Second - partitions in Kafka topic are exactly same as different THREAD# in Oracle RAC (Parallel Server for oldies like me 😺 ). This means that custom consumer is required, or sorting by SCN/COMMIT_SCN on producer side

@Felix0525 What you think about statements/suggestions/actions above?

Hmm, I am not very familiar with Oracle.
But for the processing steps you describe, I have a question:

On the primary database,
First, user A connects to instance 1 and adds a new row of data D1 (suppose it belongs to archive log1_100.dbf).
Then user B connects to instance 2 and deletes data D1 (suppose it belongs to archive log 2_55.dbf).
If 2_55.dbf arrives at the standby database before 1_100.dbf (10 minutes slower)

For this case, will it cause data D1 not to be deleted eventually?

For this case we need to know SCN for operations, and process operations in order of SCN/COMMIT_SCN
Need some time to recreate RAC environment and perform some tests.

Best regards,
Aleksei

Hi @Felix0525 ,

Please upload results of following queries from your single instance standby for RAC:

select THREAD#,STATUS,ENABLED from V$THREAD;
select THREAD#,PROCESS,STATUS from V$MANAGED_STANDBY where PROCESS like '%RFS%';
select THREAD#,ROLE,ACTION from V$DATAGUARD_PROCESS where NAME like '%RFS%';

Thanks!

Best regards,
Aleksei

1.select THREAD#,STATUS,ENABLED from V$THREAD;

THREAD#|STATUS|ENABLED|
-------+------+-------+
1|OPEN |PUBLIC |
2|OPEN |PUBLIC |

2.select THREAD#,PROCESS,STATUS from V$MANAGED_STANDBY where PROCESS like '%RFS%';
THREAD#|PROCESS|STATUS|
-------+-------+------+
0|RFS |IDLE |
0|RFS |IDLE |
0|RFS |IDLE |
1|RFS |IDLE |
0|RFS |IDLE |
2|RFS |IDLE |
0|RFS |IDLE |
0|RFS |IDLE |

3.select THREAD#,ROLE,ACTION from V$DATAGUARD_PROCESS where NAME like '%RFS%';

ORA-00942: Table or views not exists

Thanks!

Hi @Felix0525

Hope you are doing well :-)

Please check fixes from 38-single-instance-dg4rac branch.

Thanks!

Best regards,
Aleksei

Thanks!
I'll try it on my environment soon 🤝