Payments: littlepay_transaction_id missing from some fct_payments_rides_v2 rows
Closed this issue · 1 comments
Describe the bug
Captured by dbt tests reflected in Sentry issues #72503 and #72504 - a subset of rows in fct_payments_rides_v2 lack a value for littlepay_transaction_id
.
To Reproduce
The failing rows in question are almost entirely composed of payments from MST whose corresponding micropayments have a charge_type
value of pending_charge_fare
. Almost every micropayment with that charge_type
for that agency results in a fct_payments_rides_v2
row that isn't mapped to a littlepay_transaction_id
. These rows do map to valid littlepay_transaction_id
values in int_littlepay__cleaned_micropayment_device_transactions
, so the culprit is probably a disconnect somewhere in an adjacent stg/int model or in one of the joins within the fct_payments_rides_v2
model definition. Other agencies' micropayments rows with the same charge_type
value appear to map correctly, so comparing tables across agencies for presence/absence of different columns could be a good start.
In addition to that identified pattern that accounts for most of the test failures, there are a handful of other rows in fct_payments_rides_v2
that lack a littlepay_transaction_id
value for other reasons. These may have to be treated individually - some may just be due to bad data or testing data.
Expected behavior
The fct_payments_rides_v2
model is intended to reflect a broader, perhaps less tightly controlled dataset than fct_aggregations
, so it is possible that a few rows could end up as true mysteries at the end of an investigation into these test failures. If so, the tests in question should likely be loosened with, say, a 0.99 threshold.
Investigation notes:
- A significant subset of the affected rows (98%) are caused by issue #647; these are rows that are specifically being dropped in
int_littlepay__cleaned_micropayment_device_transactions
but are subsequently being reintroduced intopayments_rides
. - All rows did originally have a transaction ID.
WITH problem_rows AS (SELECT *
FROM `cal-itp-data-infra.mart_payments.fct_payments_rides_v2`
WHERE littlepay_transaction_id IS NULL),
map AS (
SELECT *
FROM `cal-itp-data-infra.staging.stg_littlepay__micropayment_device_transactions`
),
cleaned AS (
SELECT *
FROM `cal-itp-data-infra.staging.int_littlepay__cleaned_micropayment_device_transactions`
),
compare AS (
SELECT
problem_rows.*,
map.littlepay_transaction_id AS original_lp_id,
cleaned.littlepay_transaction_id AS cleaned_lp_id
FROM problem_rows
LEFT JOIN map USING (micropayment_id)
LEFT JOIN cleaned USING (micropayment_id)
)
SELECT
original_lp_id IS NULL,
cleaned_lp_id IS NULL,
participant_id,
COUNT(*),
COUNT(DISTINCT compare.original_lp_id)
FROM compare
GROUP BY 1,2, 3
ORDER BY 4 DESC