dbt-labs/snowplow

Map inferred_user_id in snowplow_page_views

jtcohen6 opened this issue · 1 comment

Rationale

The Snowplow package has two first-class outputs: snowplow_sessions and snowplow_page_views. Both should include the stitched user identity, inferred_user_id, which is the product of snowplow_id_map.

We frequently find ourselves building reports on top of both tables, and we should be able to perform counts of distinct visitors that agree between them.
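To make the proposal concrete, here is a minimal sketch of the stitching join, run against an in-memory SQLite database rather than the package's actual models. The table layouts, the domain_userid column, and the coalesce fallback to the cookie ID are assumptions for illustration only; the real column names in snowplow_page_views and snowplow_id_map may differ.

```python
import sqlite3


def build_demo():
    """Create toy page-view and id-map tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table page_views (
            page_view_id text, session_id text,
            domain_userid text, user_id text
        );
        create table id_map (domain_userid text primary key, user_id text);

        insert into page_views values
            ('pv1', 's1', 'cookie_a', null),
            ('pv2', 's1', 'cookie_a', 'alice'),
            ('pv3', 's2', 'cookie_b', null),
            ('pv4', 's3', 'cookie_c', 'alice');
        insert into id_map values
            ('cookie_a', 'alice'),
            ('cookie_c', 'alice');
        """
    )
    return conn


# Assumed stitching rule: use the mapped user_id when the cookie appears
# in the id map, otherwise fall back to the cookie ID itself.
STITCH_SQL = """
    select pv.page_view_id,
           pv.session_id,
           coalesce(m.user_id, pv.domain_userid) as inferred_user_id
    from page_views pv
    left join id_map m on pv.domain_userid = m.domain_userid
    order by pv.page_view_id
"""


def stitched_page_views(conn):
    """Return page views with the stitched identity attached."""
    return conn.execute(STITCH_SQL).fetchall()


def distinct_visitors(conn):
    """Count distinct stitched visitors across all page views."""
    sql = "select count(distinct inferred_user_id) from (" + STITCH_SQL + ")"
    return conn.execute(sql).fetchone()[0]
```

In this toy data, three distinct cookies collapse to two stitched visitors, because cookie_a and cookie_c both map to the same user. If the sessions table is stitched with the same rule, a count of distinct inferred_user_id values there should agree.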

Really good idea! Can we also set the user_id field more appropriately for snowplow_page_views? Right now, it's only set if the user's persistent ID was known at the time of that page view. If we're going to join to the ID map anyway, could we also set the snowplow_user_custom_id field whenever it was known within the scope of the session?
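One way to read the session-scoped suggestion is sketched below, again against an in-memory SQLite database. The schema and the choice of max(user_id) as the session-level value are assumptions, not the package's behavior; a session containing more than one distinct user_id would need a more deliberate rule than max.

```python
import sqlite3


def build_session_demo():
    """Create a toy page-view table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table page_views (
            page_view_id text, session_id text, user_id text
        );
        insert into page_views values
            ('pv1', 's1', null),     -- viewed before the user identified
            ('pv2', 's1', 'alice'),  -- user identified mid-session
            ('pv3', 's2', null);     -- never identified in this session
        """
    )
    return conn


def backfilled_user_ids(conn):
    """Fill each page view's user_id from any value seen in its session."""
    sql = """
        select pv.page_view_id,
               coalesce(pv.user_id, s.session_user_id) as user_id
        from page_views pv
        left join (
            -- max() ignores nulls, so this yields a known user_id
            -- for the session if one exists, else null
            select session_id, max(user_id) as session_user_id
            from page_views
            group by session_id
        ) s on pv.session_id = s.session_id
        order by pv.page_view_id
    """
    return conn.execute(sql).fetchall()
```

Here pv1 picks up 'alice' from the later page view in the same session, while pv3 stays null because no page view in session s2 carried a user_id.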