fivetran/dbt_xero

[Bug] Unique combination test failing - impacts production deployment

cmcau opened this issue · 12 comments

cmcau commented

Is there an existing issue for this?

  • I have searched the existing issues

Describe the issue

I am seeing this error when I run my production deployment job:
FAIL 1 dbt_utils_unique_combination_of_columns_xero__general_ledger_journal_line_id__source_relation

which means that these models are skipped:

xero__balance_sheet_report .......... [SKIP]
xero__profit_and_loss_report ........ [SKIP]

and these tests are skipped:

dbt_utils_unique_combination_of_columns_xero__balance_sheet_report_source_relation__date_month__account_name
not_null_xero__profit_and_loss_report_profit_and_loss_id
unique_xero__profit_and_loss_report_profit_and_loss_id

The error is:
23:09:33 Completed with 1 error and 0 warnings: 23:09:33 23:09:33 23:09:33 Failure in test dbt_utils_unique_combination_of_columns_xero__general_ledger_journal_line_id__source_relation (models/xero.yml)

and when I download the compiled code I get

`with validation_errors as (

select
    journal_line_id, source_relation
from insights.matchandwood_xero.xero__general_ledger
group by journal_line_id, source_relation
having count(*) > 1

)

select *
from validation_errors
`

if I run that directly in Snowflake I get 1 record returned:

JOURNAL_LINE_ID is null, SOURCE_RELATION is an empty string (I think, it doesn't say NULL) and a count of 4 records.

Relevant error log or model output

Failure in test dbt_utils_unique_combination_of_columns_xero__general_ledger_journal_line_id__source_relation (models/xero.yml)

Expected behavior

Job to complete successfully, no models skipped

dbt Project configurations

Values set for xero_schema and xero_database

Package versions

packages:

packages:

  • package: fivetran/xero
    version: [">=0.6.0", "<0.7.0"]

What database are you using dbt with?

snowflake

dbt Version

v.1.6

Additional Context

No response

Are you willing to open a PR to help address this issue?

  • Yes.
  • Yes, but I will need assistance and will schedule time during our office hours for guidance
  • No.

HI @cmcau thanks for opening this issue and I am sorry to see you are running into this test failure. I do have a few questions I before I dive further into what may be causing this issue and a possible solution:

  • Were you running this package previously and seeing success, then this started failing randomly? Or is this the first time you are attempting to run this package?
  • It looks like the null journal_line_id may be caused when a journal_id doesn't have associated lines. See the join here where I could see this being a result. Would you be able to inspect the raw data Fivetran syncs and confirm if there are in fact journals that do not have associated journal_line_ids. If that is the case, we may be able to resolve this issue by adjusting the test to reference a unique combination of journal_id, journal_line_id, and source_relation.

Let me know on the above two questions and from there we should have a path forward on what needs to be updated in the package.

cmcau commented

Hi,

  • no, this is the first time I've tried to use the package with this Xero connection
  • yes, it's only happening when journal_line_id is null but I wasn't sure if I could override that at all.

Thanks @cmcau does it seem appropriate based on the underlying data that journal_line_id is null for the records in the end model? This would mean that the journal table has no associated journal_line_ids. This may be appropriate and something we can easily address in the next update, but I want to make sure this is an expected result.

Hi @fivetran-joemarkiewicz, I am having the same issue as @cmcau when running the Xero package for the first time, on the exact same test for the journal_line_id in the xero__general_ledger model.
I can confirm from my side that the raw data shows a few journals without journal lines.

Thanks for adding to the conversation @jsagasta! Does it seem appropriate for those journals to not have journal lines? I think the best approach forward is to adjust the dbt test, but I want to make sure the raw data is appropriate to not have journal lines associated before doing so.

@fivetran-joemarkiewicz, I have reached out to the actual Xero admin team to answer this, and it seems like those journals don't exist in Xero (at least in the Xero UI)...
This brings a question about the Fivetran Xero connector, if those journals don't exist, is there a soft delete flag that could be added to the source table?
I'll keep you updated if I find anything else.

@jsagasta that is really interesting! Thank you for sharing and this makes me second guess the quick fix of just adjusting the test as it seems there may be something in the connector that needs to get sorted out.

In the meantime I would recommend opening a support ticket with Fivetran to bring up these deleted journals that are not showing as such in the connector.

Thanks @fivetran-joemarkiewicz, I'll go ahead and submit a ticket.
Let us know once you've pushed your changes to the package.

HI @jsagasta I likely will not push changes to the package until an update is applied to the connector. If this test failure is becoming too noisy and you are comfortable with the soft deleted journals still existing in the source then you can always disable this test in your dbt_project.yml.

Currently I am in the camp of wanting to keep this test in the package to raise awareness that there are deleted journals in the data that likely need to be addressed.

Hi @cmcau @jsagasta thank you for opening a support ticket and having our connector team take a look at the issue from the connector side. It turns out these deleted journals are still provided in the Xero API response and will still continue to be synced.

As such, I will accept this ticket to update the test to include journal_id in the unique test as well. If either of you are interested in opening a PR I would be happy to review it! From my understanding all we would look to update is adding journal_id to the following test

dbt_xero/models/xero.yml

Lines 117 to 120 in 37c8d4b

- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- journal_line_id
- source_relation

If you are not interested in a PR then no worries at all! We will plan to roll this out early next week 😄

Hi @fivetran-joemarkiewicz, thanks for the update on this. I went ahead and created a PR for this.
Thanks again!

Amazing, thanks for opening the PR @jsagasta! I will review today or tomorrow and let you know if there are any final questions before merging in!