Testing for late-arriving facts
jtcohen6 opened this issue · 1 comments
jtcohen6 commented
Include in package integration tests `event_update`:
- new event, same `session_id` as earlier event
- new event, same `page_view_id` as earlier event (?)
- same `event_id` as earlier event (?)
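To make the three cases concrete, here is a minimal Python sketch of the kind of rows an update batch might contain. The helper name `shared_keys` and the sample IDs are hypothetical, chosen only for illustration; this is not the package's seed data or test logic.

```python
from typing import Dict, Iterable, List

Event = Dict[str, str]


def shared_keys(new_event: Event, earlier_events: Iterable[Event]) -> List[str]:
    """Return which identifiers the new event shares with any earlier event."""
    keys = ["event_id", "page_view_id", "session_id"]
    earlier = list(earlier_events)
    return [k for k in keys if any(e.get(k) == new_event.get(k) for e in earlier)]


# Hypothetical rows: one event already loaded, three arriving in a later batch.
earlier = [{"event_id": "e1", "page_view_id": "pv1", "session_id": "s1"}]
update = [
    {"event_id": "e2", "page_view_id": "pv2", "session_id": "s1"},  # same session only
    {"event_id": "e3", "page_view_id": "pv1", "session_id": "s1"},  # same page view and session
    {"event_id": "e1", "page_view_id": "pv1", "session_id": "s1"},  # same event_id as earlier event
]

cases = [shared_keys(e, earlier) for e in update]
```

Each entry in `cases` lists the identifiers that collide with the earlier batch, so a test fixture can be checked to confirm it covers all three situations.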
Our approach to these edge cases has been in flux over the past few releases. I think we have a good grasp of what the desired behavior is as of 0.7.1. Let's include these cases in our integration tests to ensure the modeling operates as expected.
jtcohen6 commented
This relates somewhat to #80.
I think it would be compelling to include a custom data test that returns, with warn/error thresholds (configurable?), the % of events that are "late-arriving" and % of sessions that are thereby "split." This would vary depending on the user's configured value of `snowplow:page_view_lookback_days`.
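A rough Python sketch of what such a check could compute, under assumed working definitions that are not from the package: an event counts as "late-arriving" if it was loaded more than the lookback window after it occurred, and a session counts as "split" if it contains at least one such event. The field names `occurred_at` and `loaded_at`, the function names, and the thresholds are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def late_arrival_rates(events: List[Dict], lookback_days: int) -> Tuple[float, float]:
    """Return (% late-arriving events, % split sessions) under the assumed definitions."""
    window = timedelta(days=lookback_days)
    # Assumption: "late" means loaded after the lookback window has passed.
    late = [e for e in events if e["loaded_at"] - e["occurred_at"] > window]
    sessions = {e["session_id"] for e in events}
    # Assumption: a session is "split" if any of its events is late.
    split = {e["session_id"] for e in late}
    pct_late = 100.0 * len(late) / len(events) if events else 0.0
    pct_split = 100.0 * len(split) / len(sessions) if sessions else 0.0
    return pct_late, pct_split


def check(pct: float, warn_at: float, error_at: float) -> str:
    """Map a percentage onto pass/warn/error using configurable thresholds."""
    if pct >= error_at:
        return "error"
    if pct >= warn_at:
        return "warn"
    return "pass"


t0 = datetime(2020, 1, 1)
events = [
    {"session_id": "s1", "occurred_at": t0, "loaded_at": t0 + timedelta(hours=1)},
    {"session_id": "s1", "occurred_at": t0, "loaded_at": t0 + timedelta(days=5)},
    {"session_id": "s2", "occurred_at": t0, "loaded_at": t0 + timedelta(hours=1)},
    {"session_id": "s2", "occurred_at": t0, "loaded_at": t0 + timedelta(hours=2)},
]

pct_late, pct_split = late_arrival_rates(events, lookback_days=1)
status = check(pct_late, warn_at=10.0, error_at=50.0)
```

With a one-day lookback, one of the four sample events falls outside the window; widening `lookback_days` to 7 brings both percentages to zero, which illustrates why the result depends on the configured value.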