4-5-4 Calendar Date Tools

This repo contains LookML code that can be used to filter a date partitioned BigQuery table by date intervals based on a custom 4-5-4 retail calendar. An except of this custom calendar is provided in 454_calendar.csv.

Why Use A Custom Calendar?

The retail industry are primary users of 4-5-4 calendars as it allows for a better year-over-year comparison:

The 4-5-4 calendar is a guide for retailers that ensures sales comparability between years by dividing the year into 
months based on a 4 weeks – 5 weeks – 4 weeks format. The layout of the calendar lines up holidays and ensures the same 
number of Saturdays and Sundays in comparable months. Hence, like days are compared to like days for sales reporting 
purposes. 

The calendar used for examples in this repo are baesed off of the NRF's 4-5-4 calendar, but with some modifications.

An except of this custom calendar is provided in 454_calendar.csv.

The Problem

It is easy to use BigQuery's native date functions to filter a date partitioned table if the user is only concerned about calendar date fields. BigQuery's date functions do not support, however, custom calendars (e.g., 4-5-4, 4-4-5) definitions.

For example, to find the beginning of the start date of the current calendar month, you can run:

SELECT
    DATE_TRUNC(CURRENT_DATE(), MONTH)

Unfortunately, the start date of the month in a 4-5-4 calendar does not align with the calendar date.

For example, the start of calendar year 2021 may be 2021-01-01 (YYYY-MM-DD), but in terms of a custom 4-5-4 calendar the start of 2021 is actually 2020-12-27.

There are additional constraints in Looker that must be dealt with when you are trying to use date filters based on a custom calendar.

When working in BigQuery alone, you could have a mapping table (e.g. 454_calendar.csv) and use that to determine dates for filtering:

declare start_date as date;
declare end_date as date;

set start_date = (select min(date(cal_date)) from 454_calendar where 454_year = 2021 and 454_week = 2);
set end_date = (select max(date(cal_date)) from 454_calendar where 454_year = 2021 and 454_week = 2);

--filter sample_table for second week of 2021 in 454 calendar
select 
*
from sample_table
where partition_field_date between start_date and end_date

Problems arise in Looker when you try to do the above in a derived_table SQL block for a view, but then create an NDT off of that view. You'll notice errors because the declare statements being nested in the NDT.

The Approach

To be able to filter in Looker, a combination of BigQuery User Defined Functions, LookML code, and SQL conditional statements are used.

The BigQuery UDFs can be found in the bq_udf folder in this repo and the functions will have to be created in your BigQuery project to use the LookML blck. The LookML blocks can be found in 454_tools.view.

Views

There are 5 views defined in 454_tools.view file:

  • 454_tools_parameters
    • Contains only parameters. Can be extended by other views for binding filters
  • 454_tools
    • Generates the final filters that are applied to partition field date
  • repository_variables
    • Contains dimensions and filters for using custom repository variables. Dynamic in the sense that it changes based on run date.
  • py_repository_variables
    • Combined with repository_variables to conduct year-over-year analysis
  • 454_fixed_period_filters
    • Contains dimensions and filters for finding specific 454 dates (i.e., not dynamic)

All filters are viewable under the view label: _Reporting Date Tools and group label: 4-5-4 Tools ({{ _view._name}})

454 Tools

The 454_tools view extends 454_parameters, 454_fixed_period_filters, repository_variables, and py_repository_variables. This is the view that is extended in a view that is based off a table so that users have access to the date filters tools.

An example of how to extend 454_tools in a view so that users can see date filters in the Explore:

Show Example

include: "/path/to/454_tools.view"

view: orders {
    extends: [454_tools] ## Now this view can use the date filters in 454_tools
    sql_table_name: dataset.orders ;;
    
    ## overrides the partition_field_param parameter in 454_tools.view
    ## with the field specific for this view
    parameter: partition_field_param {
      type: unquoted
      hidden: yes
      allowed_value: {
        label: "order_date"
        value: "order_date"
      }
    }
    
    dimension_group: order_date {
        # order_date is a timestamp
        # dataset.orders table is partitioned on date(order_date)
        type: time
        sql: ${TABLE}.order_date 
    }
    
    dimension: order_id {
        type: number
        sql: ${TABLE}.order_id
    }
    
    dimension: sales {
        type: number
        sql: ${TABLE}.sales
    }
    
    measure: total_sales {
        type: sum
        sql: ${sales}
    }
    
}

explore:  orders {
    from: orders
    
    # partition field must be defined to use 454_tools
    always_filter: {
      filters: [partition_field_param: ""]
    }

    # need these to make sure partition fields are always being filtered
    sql_always_where:
      ${454_FILTER} ;;
}

454_FILTER

In example above, the orders Explore always filters ${454_FILTER} which is a filter defined in 454_tools view. This is the final filter applied where the partition_field_param (i.e., partition date field) is filtered in a WHERE clause.

Using Looker's Liquid syntax, we apply different filters based on certain conditions regarding the parameters used. Furthermore, we assume the partition_field_param is a timestamp and it is converted to a DATE when filtered.

The 454_FILTER is essentially the following:

date( partition field ) between start_date and end_date

and if a year-over-year analysis is desired:

date( partition field ) between start_interval and end_interval
or 
date( partition field) between previous_year_start_interval and previous_year_end_interval

Start Interval

End Interval

+/- N Intervals

Fixed Period Filters

These filters can be used for when users want to view a very specific time interval within 454 calendar. In this instance, the user has to define the year and the week, month, or quarter.

Repository Variables

The repository_variable and py_repository_variable views contains LookML code to custom variables -- similar to Looker filter expressions -- to represent custom filter definitions based on current date (repository_variable) and previous year (py_repository_variable).

Key differences compared to Fixed Period Filters:

  • Users can choose from a menu of human-readable options to apply date filters
  • It is dynamic/rolling because it is always based on the current date
    • Useful for creating automated reports
What Year Is It In 454 calendar?

This question alone is difficult to answer when you're at the beginning/end of a calendar year. As mentioned above, a year can start in December or January in a 454 calendar.

What Week Is It In 454 calendar?
What Month Is It In 454 calendar?
What Quarter Is It In 454 calendar?
What Is The Equivalent of This Week/Month/Quarter Last Year In 454 calendar?

Parameters

The 454_tools_parameters view contains all the parameters required for users to filter dates. Separating them out allows for users to extend just the parameters in an NDT and then bind the filters.

An example is shown below where we have a view called orders and an Explore based off that view also called orders. The orders view/Explore is built on top of a fact table in BigQuery that contains orderline level of granularity and is partitioned on order_date field. A summary of orders aggregated at order_date level is created with an NDT called order_summary. Users can now use order_summary Explore to join other views on a much smaller scale.


include: "/path/to/454_tools.view"

view: orders {
    extends: [454_tools]
    sql_table_name: dataset.orders ;;
    
    ## overrides the partition_field_param parameter in 454_tools.view
    ## with the field specific for this view
    parameter: partition_field_param {
      type: unquoted
      hidden: yes
      allowed_value: {
        label: "order_date"
        value: "order_date"
      }
    }
    
    dimension_group: order_date {
        # order_date is a timestamp
        # dataset.orders table is partitioned on date(order_date)
        type: time
        sql: ${TABLE}.order_date 
    }
    
    dimension: order_id {
        type: number
        sql: ${TABLE}.order_id
    }
    
    dimension: sales {
        type: number
        sql: ${TABLE}.sales
    }
    
    measure: total_sales {
        type: sum
        sql: ${sales}
    }
    
}

explore: orders {
    from: orders
    # A fact table that contains orderline level information
}

view: order_summary {
    # NDT that aggregates orders by day
    extends: [454_parameters]
    derived_table: {
      explore_source: orders {
        timezone: "query_timezone"
        column: order_date_date {} 
        column: order_id {}
        column: total_sales {}
        filters: {
          # 454_tools.view requires a partition_field_param to be defined
          field: orders.partition_field_param
          value: "order_date"
        }
        ### Following bind filters are for parameters in 454_parameters
        bind_filters: {
        to_field: orders.FY_start
        from_field: order_summary.FY_start
        }
        bind_filters: {
          to_field: orders.N_start
          from_field: order_summary.N_start
        }
        bind_filters: {
          to_field: orders.454interval_start
          from_field: order_summary.454interval_start
        }
        bind_filters: {
          to_field: orders.FY_end
          from_field: order_summary.FY_end
        }
        bind_filters: {
          to_field: orders.N_end
          from_field: order_summary.N_end
        }
        bind_filters: {
          to_field: orders.454interval_end
          from_field: order_summary.454interval_end
        }
        bind_filters: {
          to_field: orders.compare_yoy
          from_field: order_summary.compare_yoy
        }
        bind_filters: {
          to_field: nested_sb_customer_transactions_daily__details.po_category
          from_field: order_summary.po_category
        }
      }
    }
    
    dimension: order_date {
        sql: ${TABLE}.order_date_date
    }
    
    dimension: order_id {}
    
    dimension: total_sales_per_order {
        sql: ${TABLE}.total_sales
    }
    
    measure: total_sales {
        type:sum
        sql: ${total_sales_per_order}
    }   
    
}

explore: order_summary {
    from: order_summary
    join: some_other_view {
        sql_on: ${order_sumamry.order_id} = ${some_other_view.order_id}
        relationship: one_to_one
    }
 }


Examples

When you want to filter anything by a date interval, you need a "start date" and an "end date".

There are two categories of filters: Begin and End that correspond to how "start date" and "end date" are defined, respectively.

Each filter category has 4 options:

  • Fiscal Year
  • Interval
  • Interval Type

All of the 4 Begin filter options affect how the "start date" is defined. All of the 4 End options affect how the "end date" is defined.

One important thing to note is that Begin and End are independent of each other. That means that you can use different options for Begin and End

For example, I want to look at data from 2021FW1 (ie first fiscal wk of 2021) until today:

  • Begin
    • Fiscal Year = 2021
    • Interval = 1
    • Interval Type = Fiscal Week
  • End
    • Interval = today
    • Interval Type = Repository Variables

Another example, I want to look at data from beginning of current fiscal quarter until the end of the next fiscal week:

  • Begin
    • Interval = fiscal_day_current_fiscal_quarter
    • Interval Type = Repository Variables
  • End
    • Interval = 2
    • Interval Add/Sub = +
    • Interval Type = +/- Fiscal Weeks