Building reports from the Augur database schema is an important way for catalogouing questions that our users ask, as well as how we can make the most of the extensive, validated data Augur gathers. Everything is not "dashboard material". Over time, reports people find useful will become APIS in Augur that can be used to construct automated community reports. This is a repository for engaged requirements gathering in the agile spirit.
The template directory is where we begin to understand what the reports "are", as we expect individual users to create their own directories where they make new reports, work with us to create new reports, and then share them with the CHAOSS community. The goal is to then build generalized augur queries/APIs, etc, and make them available as tools.
After finishing setup, use the START_HERE.ipynb file to identify the Repository ID's available in your database.. You will, in most cases, be interested in a subset of all available repositories at any given moment, and this notebook lets you grab the ID info you will need to run those reports.
- Python 3.x
- pip
- virtualenv package
pip3 install virtualenv
- Install
geckodriver
for your platform if you want to write annotated PNG files out. This is a great way to automate report generation!- osx:
brew install geckodriver
- Linux, Windows: Download the latest geckodriver release for your platform from
https://github.com/mozilla/geckodriver/releases
and follow installation instructions. You can also get source code from that link.
- osx:
- Fork the augur-community-reports repository
- Clone your fork of the repository locally
git clone https://github.com/<your-fork>/augur-community-reports
- Create your python virtual environment wherever you routinely store them. We use a
virtualenvs
directory.
virtualenv --python=python3 virtualenvs/augur-community-reports
- Activate your virtual environment
source ../../virtualenvs/augur-community-reports/bin/activate
- Install the necessary Python libraries
pip install -r requirements.txt
- Change into the directory of your clone
cd augur-community-reports
CREATE USER chaoss WITH PASSWORD 'port88';
GRANT CONNECT ON DATABASE augur TO chaoss;
GRANT USAGE ON SCHEMA augur_data TO chaoss;
GRANT USAGE ON SCHEMA spdx TO chaoss;
GRANT USAGE ON SCHEMA augur_operations TO chaoss;
GRANT SELECT ON ALL TABLES IN SCHEMA augur_data TO chaoss;
GRANT SELECT ON ALL TABLES IN SCHEMA spdx TO chaoss;
GRANT SELECT ON ALL TABLES IN SCHEMA augur_operations TO chaoss;
ALTER DEFAULT PRIVILEGES IN SCHEMA augur_data
GRANT SELECT ON TABLES TO chaoss;
There are two directories the project starts with:
CHAOSS-Example
, which is an example against a publicly available Augur database of the CHAOSS Project's organization on GitHub andtemplates
, which is a copy of the same notebooks found inCHAOSS-Example
that we intend you to make a copy of for your project, which you can do on most linux based systems by running the commandcp -R templates my-project-name
(consider replacingmy-project-name
with a meaningful project name).- In your new directory, edit the
config.json
file in a text editor so that it contains credentials for your Augur database. In the directory where you want to run Jupyter Lab from, create a file called "config.json":
{
"connection_string": "sqlite:///:memory:",
"database": "augur_mine",
"host": "servernameorip.augurlabs.io",
"password": "<your password>",
"port": 5433,
"schema": "augur_data",
"user": "augur",
"user_type": "read_only"
}
- From your augur-community-reports home directory, with your local python virtual environment activated, and requirements installed, run the command:
jupyter lab
.
In any NEW jupyter notebooks, place this text as the first cell, if you do not begin by copying one of the existing notebooks:
import psycopg2
import pandas as pd
import sqlalchemy as salc
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
import datetime
import json
warnings.filterwarnings('ignore')
with open("config.json") as config_file:
config = json.load(config_file)
database_connection_string = 'postgres+psycopg2://{}:{}@{}:{}/{}'.format(config['user'], config['password'], config['host'], config['port'], config['database'])
dbschema='augur_data'
engine = salc.create_engine(
database_connection_string,
connect_args={'options': '-csearch_path={}'.format(dbschema)})
In both the pull request and contributor templates the control cell is used to configure what is in the report.
- Repo_set: Takes a list of repo_ids you need visualizations for.
- Display_grouping: Can be set as 'repo' or 'competitors'. 'repo' groups the visualizations by repo, and 'competitors' groups the visualizations by chart, so data can be easily compared against other repos.
- Not_alised_repos: Takes a list of repo_ids you do not want aliased, when display_grouping is set to 'competitors'
- Save_files: Can be set to True or False, when True all the visualizations will be export as PNG's
- Begin_date and End_date: Take a string in date form, i.e. '2020-03-30'
- Group_by: Determines how data is grouped in bar charts. Can be set to 'year', 'quarter', or 'month'
- Time and Num_contributions_required: Constraints for a repeat contributor. Indicate the number of contributions a contributor must make in the time to be considered a repeat contributor. Time is in days.
- scatter_plot_outliers_removed: Indicates the number of outliers you would like to remove on the days_to_first_response scatter plot. When you have a small number of outliers, this variable is useful for improving the utility of the visualizations.
Copyright © 2020 University of Nebraska at Omaha, University of Missouri and CHAOSS Project at the Linux Foundation
Notebooks in intial release Authored by Andrew Brain, and Gabe Heim. Don't believe everything you see in a commit history. ;)
Augur is free software: you can redistribute it and/or modify it under the terms of the MIT License as published by the Open Source Initiative. See the LICENSE file for more details.
This work has been funded through the Alfred P. Sloan Foundation, Mozilla, The Reynolds Journalism Institute, and 9 Google Summer of Code Students.