dbt-labs/snowplow

Testing for late-arriving facts

jtcohen6 opened this issue · 1 comment

Include the following cases in the package's integration tests (event_update):

  • new event, same session_id as earlier event
  • new event, same page_view_id as earlier event (?)
  • same event_id as earlier event (?)
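
To make the three cases concrete, here is a rough sketch of what the update rows could look like. It is illustrative only: the column names follow the wording of this issue and the row values are made up, so neither should be read as the package's actual seed schema. The hypothetical helper `overlapping_keys` just reports which key columns an update row shares with the earlier data.

```python
# Hypothetical fixture rows; column names follow the wording of this issue,
# not necessarily the package's actual seed schema.
earlier_events = [
    {"event_id": "e1", "page_view_id": "pv1", "session_id": "s1"},
]

update_events = [
    # Case 1: new event, same session_id as an earlier event
    {"event_id": "e2", "page_view_id": "pv2", "session_id": "s1"},
    # Case 2: new event, same page_view_id as an earlier event
    {"event_id": "e3", "page_view_id": "pv1", "session_id": "s1"},
    # Case 3: same event_id as an earlier event (a duplicate row)
    {"event_id": "e1", "page_view_id": "pv1", "session_id": "s1"},
]


def overlapping_keys(earlier, update_row):
    """Return the key columns on which update_row matches any earlier row."""
    keys = ("event_id", "page_view_id", "session_id")
    return {k for k in keys if any(e[k] == update_row[k] for e in earlier)}


# One set per update row, in the same order as update_events.
coverage = [overlapping_keys(earlier_events, row) for row in update_events]
```

Checking `coverage` against the expected overlap for each row is a cheap way to confirm the update data really exercises all three cases before asserting on model output.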

Our approach to these edge cases has been in flux over the past few releases. I think we have a good grasp of what the desired behavior is as of 0.7.1. Let's include these cases in our integration tests to ensure the modeling operates as expected.

This relates somewhat to #80.

I think it would be compelling to include a custom data test that reports the % of events that are "late-arriving" and the % of sessions that are thereby "split," with warn/error thresholds (configurable?). The results would vary depending on the user's configured value of snowplow:page_view_lookback_days.
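
As a rough sketch of the metric such a test might compute, the Python below works through the arithmetic on in-memory rows. Everything here is an assumption for illustration: the field names (`event_tstamp`, `loaded_tstamp`), the definition of "late-arriving" (loaded more than the lookback window after the event occurred), the definition of "split" (a session with both late and on-time events), and the helper names `late_arrival_stats` and `check`. An actual dbt data test would be written in SQL against the package's models, and the package may define these terms differently.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def late_arrival_stats(events, lookback_days):
    """Compute % late-arriving events and % split sessions.

    Assumed definitions (not taken from the package):
      - late-arriving: loaded_tstamp - event_tstamp > lookback_days
      - split session: has at least one late and one on-time event
    """
    window = timedelta(days=lookback_days)
    flags_by_session = defaultdict(set)
    late_count = 0
    for e in events:
        is_late = (e["loaded_tstamp"] - e["event_tstamp"]) > window
        late_count += int(is_late)
        flags_by_session[e["session_id"]].add(is_late)

    split = sum(1 for flags in flags_by_session.values() if flags == {True, False})
    n_events, n_sessions = len(events), len(flags_by_session)
    return {
        "pct_late_events": 100.0 * late_count / n_events if n_events else 0.0,
        "pct_split_sessions": 100.0 * split / n_sessions if n_sessions else 0.0,
    }


def check(stats, warn_pct, error_pct):
    """Map the worst of the two percentages onto pass / warn / error."""
    worst = max(stats.values())
    if worst >= error_pct:
        return "error"
    if worst >= warn_pct:
        return "warn"
    return "pass"


day = datetime(2020, 1, 1)
sample = [
    {"session_id": "s1", "event_tstamp": day, "loaded_tstamp": day},
    # Same session, but loaded five days after it occurred.
    {"session_id": "s1", "event_tstamp": day, "loaded_tstamp": day + timedelta(days=5)},
    {"session_id": "s2", "event_tstamp": day, "loaded_tstamp": day},
    {"session_id": "s2", "event_tstamp": day, "loaded_tstamp": day},
]

# lookback_days stands in for the user's snowplow:page_view_lookback_days value.
stats = late_arrival_stats(sample, lookback_days=1)
status = check(stats, warn_pct=10, error_pct=60)
```

With a one-day lookback, one of the four sample events is late and one of the two sessions is split. Raising `lookback_days` to 5 or more makes both percentages drop to zero, which is the dependence on the lookback setting described above.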