
TwitterOSS Metrics

General

This is the README for the TwitterOSS Metrics repo, which generates periodic reports based on the health of Twitter Open Source projects.

For more info, see twitter.github.io/metrics

Dependencies

Service             Details
CHAOSS Augur        Used to retrieve metrics such as Aggregate Summary, Bus Factor, and Repo Commits.
GitHub Actions      Runs a weekly cron job that executes the scripts to fetch data and generate reports.
GraphQL             Used directly to fetch metrics from the GitHub GraphQL API.
Twitter Service     Used indirectly, for the personal access token environment variable.
Metrics Dashboard   Contains all reports for the repositories listed in repos-to-include.md.
Slack Reports Repo  Runs a cron job and posts a message to Slack with daily project activity based on the metrics repo.
Year In Review      Weekly-updated, sliding-window overview of the past 12 months of activity on Twitter's Open Source projects.

Service Outage Impact

If the service experiences problems:

  • Year in Review, Metrics Dashboard, and Slack Reports Repo will be unable to update.

Build

Environment Setup

  1. Clone Repo
    $ git clone https://github.com/twitter/metrics.git  
    $ cd ./metrics

Tracking new repositories and orgs

Edit repos-to-include.md

To track an org and all of its repositories hosted at github.com/<org_name>, add <org_name>/* as a new line in repos-to-include.md. To track only some repositories of an org, add <org_name>/<repo_name> on its own line for each public repo you want to include.
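The two entry forms above can be told apart with a few lines of Python. This is only a minimal sketch of the idea, not the parsing code used by the scripts: the function name parse_include_lines and the sample entries are hypothetical, and it assumes one entry per line.

```python
def parse_include_lines(lines):
    """Split include-list entries into whole orgs and individual repos."""
    orgs, repos = [], []
    for raw in lines:
        entry = raw.strip()
        if not entry or "/" not in entry:
            continue  # skip blank lines and anything that is not owner/name
        owner, name = entry.split("/", 1)
        if name == "*":
            orgs.append(owner)           # <org_name>/* tracks the whole org
        else:
            repos.append((owner, name))  # <org_name>/<repo_name> tracks one repo
    return orgs, repos


# Hypothetical sample entries, for illustration only.
orgs, repos = parse_include_lines(["some-org/*", "other-org/some-repo", ""])
print(orgs, repos)
```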

Run The Scripts

$ python scripts/fetch_all_metrics.py

  • Reads all the repositories and orgs listed in repos-to-include.md
  • Queries the GitHub GraphQL API
  • Creates one JSON file for each repository, named METRICS-YYYY-MM-DD.json
  • Saves the file inside _data/<owner>/<repo>/
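The naming and directory layout described in the last two bullets can be sketched as follows. The helper names metrics_path and save_metrics are hypothetical and are not taken from scripts/fetch_all_metrics.py; the sketch only assumes the METRICS-YYYY-MM-DD.json name and the _data/<owner>/<repo>/ location stated above.

```python
import datetime
import json
import os


def metrics_path(owner, repo, day, root="_data"):
    """Return <root>/<owner>/<repo>/METRICS-YYYY-MM-DD.json for the given date."""
    filename = "METRICS-{}.json".format(day.strftime("%Y-%m-%d"))
    return os.path.join(root, owner, repo, filename)


def save_metrics(metrics, owner, repo, day, root="_data"):
    """Write one metrics snapshot as JSON, creating the directories if needed."""
    path = metrics_path(owner, repo, day, root)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        json.dump(metrics, f, indent=2)
    return path
```

Calling save_metrics(data, owner, repo, datetime.date.today()) from the repository root would place the file under _data, matching the note about running the scripts from there.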

$ python scripts/fetch_year_in_review.py

  • Calls the Augur aggregate_summary endpoint
  • Creates one JSON file that includes the metrics from the endpoint (watchers, stars, counts, merged PRs, committers, commits)
  • Saves the file inside _metadata/augur/

$ python scripts/gen_weekly_report.py

  • Iterates over every project listed inside _data
  • Picks the latest two metrics files that are at least 6 days apart
  • Generates a report based on these two metrics files
  • Saves the JSON inside the _data directory corresponding to each project, named WEEKLY-YYYY-MM-DD.json
  • Creates a _post for this report with some specific variables and the layout version
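The "latest two metrics files at least 6 days apart" step can be read in more than one way. The sketch below assumes one reading: take the newest METRICS file, then the most recent older file dated at least 6 days before it. The function name pick_latest_pair is hypothetical and this is not the selection code from scripts/gen_weekly_report.py.

```python
import datetime
import re

# Matches the snapshot naming used above, e.g. METRICS-2020-01-08.json
METRICS_RE = re.compile(r"^METRICS-(\d{4}-\d{2}-\d{2})\.json$")


def pick_latest_pair(filenames, min_gap_days=6):
    """Return (latest, previous) file names, or None if no valid pair exists."""
    dated = []
    for name in filenames:
        match = METRICS_RE.match(name)
        if match:
            day = datetime.datetime.strptime(match.group(1), "%Y-%m-%d").date()
            dated.append((day, name))
    dated.sort(reverse=True)  # newest first
    if not dated:
        return None
    latest_day, latest_name = dated[0]
    for day, name in dated[1:]:
        if (latest_day - day).days >= min_gap_days:
            return latest_name, name
    return None
```

Files that do not match the METRICS naming, such as the WEEKLY reports stored in the same directory, are ignored by the regular expression.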

Additional Notes

  • GitHub Actions Config

    • Environment variables

      • OAUTH_TOKEN: Personal Access Token with repo access of a GitHub account.
      • GH_USERNAME: Username of the GitHub account.
  • Use Python 3.

  • _data contains all the data files

  • Files in _posts leverage _layouts and _data and generate HTML files

  • Don't change HTML files inside _layouts. Create a new layout with a new version instead.

  • Version the metrics layouts: see METRICS_VERSION inside the report-generation script, and create a new _layout for each metrics version. If you add more data, publish the new posts on a new version so that previous pages don't break.

  • Use the repos-to-include.md and repos-to-exclude.md files to add or exclude an org or repository, respectively.

  • Prepend {{ site.url }}{{ site.baseurl }} to relative URLs

    • e.g. {{ site.url }}{{ site.baseurl }}/css/main.css
  • Execute all the scripts from the root of the repository, e.g. python3 scripts/fetch_all_metrics.py
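For local runs, the OAUTH_TOKEN and GH_USERNAME variables listed above have to be set in the shell before the scripts are started. The sketch below shows one way a script might read them and prepare an authenticated request to the GitHub GraphQL API. The helper names load_credentials and build_graphql_request are hypothetical and are not taken from the repository; the request is only built here, not sent.

```python
import json
import os
import urllib.request

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"


def load_credentials(env=os.environ):
    """Read the two variables named in the notes above; fail early if missing."""
    token = env.get("OAUTH_TOKEN")
    username = env.get("GH_USERNAME")
    if not token or not username:
        raise RuntimeError("Set OAUTH_TOKEN and GH_USERNAME before running the scripts")
    return username, token


def build_graphql_request(query, token):
    """Build (but do not send) a POST request carrying a GraphQL query."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=body,  # supplying data makes urllib issue a POST
        headers={
            "Authorization": "bearer " + token,
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send the query; keeping the construction separate makes the token handling easy to check without network access.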