This project analyzes a Git project's commit activity. We use the PyDriller Python library to traverse Git commits, store the data in Google Cloud Storage, and analyze it with an interactive Rill dashboard.
Here's the dashboard deployed for the DuckDB repository: https://ui.rilldata.com/demo/rill-github-analytics/duckdb_commits
Answer questions like:
- What parts of your codebase are the most active? What parts have the most churn?
- How large are commits? What do commits that touch many files have in common?
- How productive are your contributors? How does productivity change week over week?
- What parts of your codebase are different contributors working on? With what programming languages?
Follow the instructions below to analyze your own Git project.
To start, you'll want to clone this repository so you can edit the files and run the scripts.
git clone https://github.com/rilldata/rill-examples.git
cd rill-examples/rill-github-analytics
- Create a bucket in Google Cloud Storage
- See these instructions for setting up a GCS service account: https://docs.rilldata.com/deploy/credentials/gcs#how-to-create-a-service-account-using-the-google-cloud-console
- Save the service account key as a JSON file
- Edit the following variables in download_commits.py:
  - REPO_SLUG
  - REPO_URL (if your repo isn't on GitHub)
  - BUCKET_PATH
  - GCP_SERVICE_ACCOUNT_KEY_FILE
- Run the script locally or set up a cron job to run it periodically.
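As a rough illustration, the variables listed above might be filled in like this. Every value is a placeholder. The "owner/repo" slug format and the idea that REPO_URL is built from REPO_SLUG for GitHub-hosted repositories are assumptions inferred from the list, not taken from the script, so check download_commits.py for the actual defaults and expected formats.

```python
# Hypothetical values for the variables in download_commits.py.
# Replace every value with your own; the formats shown are assumptions.

# Assumed "<owner>/<repo>" form of the repository to analyze.
REPO_SLUG = "duckdb/duckdb"

# Assumed default: build the clone URL from the slug for GitHub-hosted repos.
# Set this explicitly if your repo isn't on GitHub.
REPO_URL = f"https://github.com/{REPO_SLUG}"

# Placeholder bucket location. Whether the script expects a "gs://" prefix
# is not confirmed here.
BUCKET_PATH = "my-bucket/github-analytics/duckdb"

# Path to the service account key JSON file saved in the earlier step.
GCP_SERVICE_ACCOUNT_KEY_FILE = "service-account-key.json"
```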
The project uses Poetry to manage its Python virtual environment. Install Poetry and run the following commands:
poetry install
poetry run python3 download_commits.py
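For orientation, the kind of traversal the script performs with PyDriller can be sketched as follows. This is a minimal sketch, not the script's actual code: the function names and the row field names (commit_hash, path, and so on) are hypothetical, and writing the rows to Parquet and uploading them to the bucket are left out.

```python
def commit_to_row(commit):
    """Flatten one PyDriller Commit into a dict of summary fields."""
    return {
        "commit_hash": commit.hash,
        "author_name": commit.author.name,
        "author_date": commit.author_date,
        "message": commit.msg,
        "insertions": commit.insertions,
        "deletions": commit.deletions,
        "files_changed": commit.files,
    }


def modified_file_rows(commit):
    """Yield one dict per file touched by the commit."""
    for mf in commit.modified_files:
        yield {
            "commit_hash": commit.hash,
            # new_path is None for deleted files, so fall back to old_path.
            "path": mf.new_path or mf.old_path,
            "added_lines": mf.added_lines,
            "deleted_lines": mf.deleted_lines,
        }


def collect(repo_url):
    """Traverse a repository and return (commit rows, modified-file rows)."""
    # Imported lazily so the helpers above work without PyDriller installed.
    from pydriller import Repository

    commits, files = [], []
    for commit in Repository(repo_url).traverse_commits():
        commits.append(commit_to_row(commit))
        files.extend(modified_file_rows(commit))
    return commits, files
```

Calling collect with a local path or a remote URL walks the repository's history; for a remote URL, PyDriller clones the repository to a temporary directory first, which can take a while for large projects.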
- Upon completion, find the following files at your provided BUCKET_PATH:
  - commits/commits_{TIMESTAMP}.parquet
  - commits/modified_files_{TIMESTAMP}.parquet
- Copy the sources/duckdb_commits_source.yaml and sources/duckdb_modified_files_source.yaml files and edit them to point to your bucket.
- Copy the models/duckdb_commits_model.sql file and edit it to point to your new sources.
- Copy the dashboards/duckdb_commits.yaml file and edit it to point to your new model.
- Configure your storage credentials: https://docs.rilldata.com/deploy/credentials/
- Install and start Rill
curl -s https://cdn.rilldata.com/install.sh | bash
rill start
- Explore your dashboard!
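To sanity-check what the dashboard reports, a metric such as churn can be recomputed by hand. The sketch below assumes churn means added plus deleted lines and that each modified-file record has path, added_lines, and deleted_lines fields. Those field names are hypothetical, so adapt them to the actual columns in your Parquet files. The sample rows are made up for illustration.

```python
from collections import Counter
from pathlib import PurePosixPath


def churn_by_top_level_dir(rows):
    """Sum added + deleted lines per top-level directory."""
    churn = Counter()
    for row in rows:
        parts = PurePosixPath(row["path"]).parts
        # Files at the repository root have no parent directory.
        top = parts[0] if len(parts) > 1 else "(root)"
        churn[top] += row["added_lines"] + row["deleted_lines"]
    return churn


# Made-up sample records standing in for rows from the modified-files data.
rows = [
    {"path": "src/parser/expr.cpp", "added_lines": 40, "deleted_lines": 10},
    {"path": "src/planner/binder.cpp", "added_lines": 5, "deleted_lines": 5},
    {"path": "test/sql/join.test", "added_lines": 20, "deleted_lines": 0},
    {"path": "README.md", "added_lines": 1, "deleted_lines": 1},
]

# Print directories from most to least churn.
for directory, lines in churn_by_top_level_dir(rows).most_common():
    print(directory, lines)
```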
Run rill deploy in your project directory and follow the instructions. See docs.