[Bug] Broken connector reported as "connected"
Closed this issue · 13 comments
Is there an existing issue for this?
- I have searched the existing issues
Describe the issue
In the Fivetran UI we can see that we have a connector (Salesforce sandbox) that is broken:
We also have our own System status dashboard where we rely on the data coming out of the models in this dbt package.
The issue here is that it's showing that this connector is healthy, described as "connected"
I quickly started to dig into what might be going wrong, and perhaps here is something. If I run:
select *
from "DB"."SCHEMA"."stg_fivetran_log__log"
where (event_type = 'SEVERE'
or event_subtype like 'sync%')
and connector_id = 'crusade_plausible'
order by CREATED_AT desc
I'm getting a sync_start log, then a SEVERE log, and then a sync_end log.
When these events are later categorised, this might be where the issue arises?
dbt_fivetran_log/models/fivetran_log__connector_status.sql
Lines 83 to 101 in 691b82a
For us this is a high priority issue since we rely on the data from our System status dashboard where we monitor all our Fivetran connectors.
Relevant error log or model output
No response
Expected behavior
We expect the model fivetran_log__connector_status
to report the connector as broken if it's broken in the Fivetran UI.
dbt Project configurations
models:
  +persist_docs:
    relation: true
    columns: true
  data_eng:
    sources:
      +materialized: table
  fivetran_log:
    +schema: # leave these blank to use the target_schema
    staging:
      +schema: # leave these blank to use the target_schema
vars:
  fivetran_log:
    fivetran_log_database: db
    fivetran_log_schema: schema
Package versions
packages:
  - package: fivetran/fivetran_log
    version: 0.5.0
What database are you using dbt with?
snowflake
dbt Version
dbt version 1.0.0
Additional Context
No response
Are you willing to open a PR to help address this issue?
- Yes.
- Yes, but I will need assistance and will schedule time during our office hours for guidance
- No.
Hi @carlioth thanks so much for opening this issue and providing such detailed notes on your investigation.
Taking a look at what you provided above, I agree that this connector should be showing as broken and not as connected. Based on the staging query you have above, I would have thought the below line would have captured the SEVERE event and logged an error time of 2022-02-25 06:26:08.978.
However, after looking further into this I can see that the logic is in fact working, but not in the way we would like. We made the assumption that a broken connector would not have a sync_end event. This seems not to be the case, as I can see the SEVERE record followed by a subsequent sync_end event. This sync_end event then negates the last_error_at field due to the below line.
For your data, since last_error_at is technically earlier than last_sync_completed_at, the connector is not recorded as broken.
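To make that comparison concrete, here is a minimal sketch of the behaviour described above. It is not the package's actual model code: the `events` rows, the `latest` helper, and the `naive_status` function are hypothetical, and only the SEVERE timestamp is taken from this thread. It just shows how a later sync_end timestamp can hide an earlier SEVERE event.

```python
from datetime import datetime

# Hypothetical log rows mirroring the reported sequence:
# sync_start, then a SEVERE event, then sync_end.
# Only the SEVERE timestamp comes from this thread; the others are made up.
events = [
    {"event_type": "INFO", "event_subtype": "sync_start",
     "created_at": datetime(2022, 2, 25, 6, 26, 0)},
    {"event_type": "SEVERE", "event_subtype": None,
     "created_at": datetime(2022, 2, 25, 6, 26, 8, 978000)},
    {"event_type": "INFO", "event_subtype": "sync_end",
     "created_at": datetime(2022, 2, 25, 6, 26, 10)},
]


def latest(rows, predicate):
    """Most recent created_at among rows matching predicate, or None if none match."""
    return max((r["created_at"] for r in rows if predicate(r)), default=None)


def naive_status(rows):
    """Simplified stand-in for the comparison discussed above (not the real model)."""
    last_error_at = latest(rows, lambda r: r["event_type"] == "SEVERE")
    last_sync_completed_at = latest(rows, lambda r: r["event_subtype"] == "sync_end")
    # The error only counts if no sync_end was logged after it.
    if last_error_at is not None and (
        last_sync_completed_at is None or last_error_at > last_sync_completed_at
    ):
        return "broken"
    return "connected"


print(naive_status(events))  # connected, even though the sync failed
```

Dropping the trailing sync_end row (`events[:2]`) makes the same function return `broken`, which is the assumption the logic was originally built on.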
Before we take any next steps, I would like to better understand why a sync_end event followed a failure. I will follow up with our engineering team on this. Further, would you be able to share the entire contents of the JSON object that includes the SEVERE warning you provided in the screenshot above?
Hi @fivetran-joemarkiewicz
Thanks for the quick response and the good details.
The full log for the SEVERE event says:
{"reason":"java.lang.Exception: Authentication failure. Reconnect the connector with the latest username and password","taskType":"reconnect","status":"FAILURE_WITH_TASK"}
For full transparency: in my query above I removed all the logs with WARNING as the event type, although they are also included in the model fivetran_log__connector_status.
I removed those events because we are drowning in them right now. We are getting approximately 60 of these warnings per second:
{"type":"table_excluded_by_system","message":"salesforce_icrm_prod.<TABLE> has been Excluded by system. Reason : Not queryable"}
Update:
We are now seeing the same behaviour but for another connector. This time the error is:
{"reason":"com.fivetran.core.PrimaryKeyContainsNull: Null primary key found while syncing table *****
Looking at the logs, we are getting the same sequence: first sync_start, then the SEVERE log, and then sync_end.
Thanks for these detailed updates! I am still looking into this, but hopefully will come to a conclusion soon. I will keep you updated on my end!
Any updates on this, @fivetran-joemarkiewicz?
Hi @carlioth, I apologize, but I do not have a firm update at this time.
The last movement on my end was working with the product manager for the Fivetran Log connector, who believes sync_end events should exist for all connectors (even broken ones). If this is the case, then my team and I will want to update the broken status logic in this package accordingly.
However, the PM was not 100% sure and as of Friday was following up with the engineering team to confirm this. I hope to have an update this week!
Hi @fivetran-joemarkiewicz, any updates on this issue?
Hi @andersrundberg thanks for reaching out! I have not yet been able to connect with our engineering team to verify this, but based on the connector's December 2021 release notes I am reasonably certain that we will want to update the logic within our dbt package to account for sync_end events for failed connectors, since they should now register this log regardless of success or failure.
That being said, we will want to update this on our end within the dbt package to reflect the current state of the log connector. I will make an update this week in a working branch and share it here for you to test out before we roll out any updates in the next release.
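As a rough illustration of the kind of change being discussed, one option is to stop relying on the absence of a sync_end event and instead check whether a SEVERE event landed at or after the most recent sync_start. This is only a sketch of one possible approach and not necessarily what the working branch will do; the `status_with_sync_window`, `latest`, and `event` helpers and all sample timestamps are hypothetical.

```python
from datetime import datetime


def latest(rows, predicate):
    """Most recent created_at among rows matching predicate, or None if none match."""
    return max((r["created_at"] for r in rows if predicate(r)), default=None)


def status_with_sync_window(rows):
    """Hypothetical status check that ignores whether a sync_end was logged."""
    last_sync_started_at = latest(rows, lambda r: r["event_subtype"] == "sync_start")
    last_error_at = latest(rows, lambda r: r["event_type"] == "SEVERE")
    if last_error_at is None:
        return "connected"
    # Broken when the error happened during (or after) the most recent sync attempt.
    if last_sync_started_at is None or last_error_at >= last_sync_started_at:
        return "broken"
    return "connected"


def event(event_type, event_subtype, minute):
    """Build a made-up log row; all timestamps here are illustrative."""
    return {
        "event_type": event_type,
        "event_subtype": event_subtype,
        "created_at": datetime(2022, 3, 1, 6, minute),
    }


# A failed sync that still logs sync_end, as reported in this issue.
failed_sync = [
    event("INFO", "sync_start", 0),
    event("SEVERE", None, 1),
    event("INFO", "sync_end", 2),
]
# The same connector after a later sync with no SEVERE event.
recovered = failed_sync + [
    event("INFO", "sync_start", 30),
    event("INFO", "sync_end", 31),
]

print(status_with_sync_window(failed_sync))  # broken
print(status_with_sync_window(recovered))    # connected
```

The trade-off in this sketch is that any SEVERE event inside the latest sync window marks the connector as broken, even if that sync otherwise finished, so it errs on the side of flagging problems.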
Thank you so much for your patience!
@fivetran-joemarkiewicz cool, please tag @carlioth when there is any news.
Hi @andersrundberg and @carlioth
Thank you again for your patience and for helping us identify this issue within the package. I am currently working on a fix and believe I am on the right track. When you have availability, would you be able to test the below version of the package within your dbt project and see if it identifies the broken/paused/working connectors properly?
packages:
  - git: https://github.com/fivetran/dbt_fivetran_log.git
    revision: bugfix/connector-status
    warn-unpinned: false
Let me know if you have any questions, and whether or not the issue is resolved for you on this branch of the package.
Thanks!
Hi @fivetran-joemarkiewicz
I've now tested this, and I'm getting the status broken for the broken connectors.
Working as expected.
Do you have any ETA when you think you will be able to release this fix?
That's great to hear! I still want to make a few minor changes, but I will be opening a PR for review today and hope to have the fix released tomorrow!
Edit: We typically do release freezes on Friday afternoons. Because of that, a more realistic timeline would be a Monday release.
Hi All,
I just wanted to share that the PR with the fix has been merged and the new v0.5.3 release has been cut! You should see the latest version of the package live on the dbt hub at the top of the hour.
Feel free to create a new issue if you encounter any questions while using the Fivetran Log package. Thanks again for your help in raising and resolving this issue.