Google Workspace Analytics Audit Logs [BETA]

This is not an oficially supported Google product.

The reports made available in the Workspace Analytics Block enable customers to get insight into the areas of Workspace Adoption, Collaboration and Security. This allows customers to review the effectiveness of Google Workspace within their organization and highlights any areas that may need improvement through targeted training outreach or other specific guidance.

This block contains metrics for the following services, which is filtered by activity.record_type: gmail, drive, login, meet, and rules

Schemas and API reference for each can be found here:

Requirements

  1. Service Log Exports to BigQuery:

    • This block requires setting up service log exports to BigQuery. Once this is set up, there will be 2 tables created in the BigQuery project and dataset specified: "activity" and "usage"
  2. Looker User Attribute:

    • You will need create a user attribute that will be used to apply a filter for each Organizational Unit:

      Name: workspace_analytics_organizational_unit
      Label: Workspace Analytics Organizational Unit
      Type: string
      User Access: view
      Hide values: no
      
    • The value should be in a JSON string format, so if your domain is my_org.com, then this should be inputted as ["my_org.com"]

    • For an administrator with access to all data, set the user attribute the primary domain to view all OU events as well as system generated and external user events. For lower level access, the user attribute can be set to the OU path, e.g. ["my_org.com","team_a","foo"]. See example below for the "OU Lookup" derived table

  3. Custom Visualizations (Optional):

    • Go to Marketplace > Discover and search for "Chord Visualization" and install
    • Go to Marketplace > Discover and search for "Sankey Visualization" and install
    • These visualizations are used on the Adoption and Collaboration dashboard, but are not required for the block to function

Installation

Constants

On block installation you will need to specify the following constants:

constant: WORKSPACE_ANALYTICS_CONNECTION_NAME {
  value: "bigquery"
  export: override_required
}

constant: WORKSPACE_ANALYTICS_PROJECT_ID {
  value: "GCP Project ID"
  export: override_required
}

constant: WORKSPACE_ANALYTICS_DATASET_NAME {
  value: "GCP BigQuery Dataset Name"
  export: override_required
}

constant: WORKSPACE_ANALYTICS_PRIMARY_DOMAIN {
  value: "MY_PRIMARY_DOMAIN.COM"
  export: override_required
}

constant: WORKSPACE_ANALYTICS_SECONDARY_DOMAINS {
  value: "MY_SECONDARY_DOMAIN_1.COM,MY_SECONDARY_DOMAIN_2.COM,..."
  export: override_required
}
  • Select the BigQuery connection that has access to the Workspace dataset (as specified in these docs). Looker will require PDTs to be enabled on this connection to create fact tables
  • Enter the Workspace project name and dataset name for the activity table, e.g. if the activity table is bq-project.my-dataset.activity then enter:
    GCP Project Name: bq-project
    Reporting Dataset Name: my-dataset
    
  • The Primary and Secondary domains can be found on the Workspace Admin Console

Model Access

  • Ensure that all users that want to leverage this block have access to the newly created model (this will need to be done after the block is installed)
  • Users will need to have a Looker role that either includes all model sets (e.g. an admin) or have a new role created which includes the model workspace_audit_logs

Files

Views

Raw Tables

The raw folder contains the autogenerated LookML base definition of the 2 tables: activity, usage

  • All fields are hidden by default and only exposed in the refinement layer

Refined

The refined folder contains the refined versions of the 2 views: activity, usage

  • These tables are queried as a derived table and include 2 additional columns:
    • A hidden _PARTITIONTIME column used for filtering on the partition date
    • GENERATE_UUID() AS pk for a primary key
  • Fields and customizations added here is shared logic to be leveraged in all subsequent views

The activity view is extended to specific views for each service: activity_drive, activity_gmail, activity_login, activity_meet, activity_rules

  • Fields and customizations added here is specific to each service

The usage view is extended to a specific view for: usage_customer and usage_user

  • Fields and customizations added here is specific to each usage view
  • The customer usage report aggregates Google Workspace service usage information, for all users, across an entire domain. The user usage report returns Google Workspace service usage information for a particular user in your domain

PDTs

Fact Tables

  • activity_aa - There are 13 aggregate tables (4 for drive, gmail, and login, plus 1 for drive collaboration) that contain measures with by the timeframe (date, week, month, or year) and OU dimensions
  • activity_active_user_facts - 1, 7, and 30 day roll up facts for active users across all products by Date and OU
  • activity_drive_facts - Fact table for drive documents
  • activity_meet_facts - Metrics for each conference and handles UNNESTING for participant level data
  • ou_user_lookup - This fact table shows the mapping of users to Organizational Unit(s). It is used for active metrics per OU and for the OU lookup for drive target users
  • usage_user_drive_facts - 1, 7, and 30 day roll up facts for active users across docs, sheets, and slides by Date (use activity_active_user_facts for breakdown by OU)

OU Lookup

  • ou_lookup - This fact table shows the Organizational Unit tree and is inner joined on every Activity explore with an access filter in order to only display data based on a user's workspace_analytics_organizational_unit user attribute. The table is in the following format:
ou_name ou_id json_ou_path
my_org.com ["my_org.com"] [my_org.com]
my_org.com ["my_org.com"] [my_org.com,team_a]
my_org.com ["my_org.com"] [my_org.com,team_a,foo]
my_org.com ["my_org.com"] [my_org.com,team_b]
my_org.com ["my_org.com"] [my_org.com,team_b,foo]
my_org.com ["my_org.com"] [my_org.com,team_b,bar]
team_a ["my_org.com","team_a"] [my_org.com,team_a]
team_a ["my_org.com","team_a"] [my_org.com,team_a,foo]
foo ["my_org.com","team_a","foo"] [my_org.com,team_a,foo]
team_b ["my_org.com","team_b"] [my_org.com,team_b]
team_b ["my_org.com","team_b"] [my_org.com,team_b,foo]
team_b ["my_org.com","team_b"] [my_org.com,team_b,bar]
foo ["my_org.com","team_b","foo"] [my_org.com,team_b,foo]
bar ["my_org.com","team_b","bar"] [my_org.com,team_b,bar]

By specifying a user attribute, a Looker admin can ensure that a user with an ou_id set to ["my_org.com","team_a","foo"] will only be be able to see activity data from users in that OU. Admins can set ["my_org.com"] in order to see data from all OUs.

Note that ou_name is not unique (e.g. team_a and team_b both contain a "foo") so the ou_id is used.

Common

The common folder contains sets that are shared between the views.

Explores

The explore folder contains the join logic between the views.

Base Explores

The activity_base explore contains shared logic for all activity explores. The access_filter is applied to ou_lookup.ou_id and a sql_always_where uses liquid conditions to mirror any filters applied to the time_usec to the _PARTITIONTIME column.

The usage_base explore follows the same logic to mirror filters applied to usage.date to the to the _PARTITIONTIME column. There is no ou data in the usage explores.

Activity [Drive, Gmail, Login, Meet, Rules]

The activity explore extends the base explore and is hidden and only used as a filter lookup.

The drive, gmail, login, meet, and rules explores extends the base explore.

The activity_consolidated is used for top level facts across all products.

Usage [Customer, User]

The usage_customer explore contains aggregated company-wide metrics across all products, rolled up by date.

The usage_user explore contains aggregated user metrics across all products, rolled up by user email and date.

Model

The model includes all the explore definitions.

Dashboards

  • Adoption and Collaboration - This dashboard contains internal utilization metrics on product usage activity, collaboration between Organizational Units, andfile sharing / creation
  • Security Audit - This dashboard has an external focus around how workspace is interacting with users outside your org, focusing on events like data exfiltration, suspicious events, and spam emails

FAQS

Which tables are being used in the dashboards?

All Dashboards have been build using the Activity table and do not include the Usage table for the following reasons:

  1. The Activity table contains OU data which lets users filter by OU
  2. The Usage table is preaggregated data and does not allow users to drill down to row level events
  3. Activity log events are available within 10 minutes, however the Usage log events have a lag time of 1-3 days

The Usage explores are hidden by default however can still be accessed as required

Why am I seeing a small variance in counts?

For performance reasons, this block utilizes allow_approximate_optimization to leverage BigQuery's HLL++ functions to approximate distinct counts. This enables distinct count with the aggregate tables, thus boosting query performance, though with 2% potential error.

What is an active user?

Active users are defined per product:

Product SQL Active User Description
docs WHERE activity.record_type = 'drive' AND activity.event_name IN ("edit", "view") AND activity.drive.doc_type = 'document' Viewing or editing any document
sheets WHERE activity.record_type = 'drive' AND activity.event_name IN ("edit", "view") AND activity.drive.doc_type = 'spreadsheet' Viewing or editing any spreadsheet
slides WHERE activity.record_type = 'drive' AND activity.event_name IN ("edit", "view") AND activity.drive.doc_type = 'presentation' Viewing or editing any presentation
drive WHERE activity.record_type = 'drive' AND activity.event_name IN ("storage_usage_update","download","upload","move","trash","untrash") Any of the following drive events: "storage_usage_update","download","upload","move","trash","untrash"
meet WHERE activity.record_type = 'meet' All meet events
calendar WHERE activity.record_type = 'calendar' All calendar events
gmail WHERE activity.record_type = 'gmail' AND gmail.message_info.action_type IN (10, 69, 71) Any of the following gmail actions: sending a message, changing spam classification, and any post delivery actions on messages

Can I filter by Groups instead of OUs?

OUs are the only available user mapping that comes out of the box as it is included in each row of the table. If you want to arrange your users in a different group, this can be achieved by using refinements (see example below).

How do I customize this block to add my own data?

You can leverage refinements to join your own data to this block, e.g. if you have a table which contains a mapping of your user emails and teams, this can be specified in the refinements.lkml file:

# include imported project's explores and views
include: "//workspace-audit-logs/explores/**/*.explore.lkml"
include: "//workspace-audit-logs/views/**/*.view.lkml"

# refine the activity_base explore to modify all explores
explore: +activity_base {
  always_filter: {
    filters: [my_teams.team: ""] # add to always_filter condition to make the filter appear by default
  }

  join: my_teams {
    sql_on: ${activity.email} = ${my_teams.email};;
    type: left_outer
    relationship: many_to_one
  }
}

# create a view for the table you want to join
view: my_teams {
  sql_table_name: `my_project.my_dataset.table_name` ;;

  dimension: email {
    primary_key: yes
    hidden: yes
    sql: ${TABLE}.email ;;
  }

  dimension: team {
    view_label: "Activity"
    sql: ${TABLE}.team ;;
  }
}