Proj_Mgmt_Project

Commissioned project

Primary language: Jupyter Notebook

Objective

In this project, we are writing a web application using the GitHub REST API. The app includes a Scrum-style visualisation (burndown chart) and improved issue management.
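
For context, a minimal sketch of pulling open issues through the GitHub REST API, using the requests library; the owner, repo and token values are placeholders, not part of this project:

```python
# Minimal sketch: fetch open issues for a repository through the GitHub REST API.
# "owner", "repo" and the token are placeholders, not values from this project.
import requests

def fetch_open_issues(owner, repo, token=None):
    """Return a list of open issues (pull requests filtered out)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    url = f"https://api.github.com/repos/{owner}/{repo}/issues"
    issues, page = [], 1
    while True:
        resp = requests.get(url, headers=headers,
                            params={"state": "open", "per_page": 100, "page": page})
        resp.raise_for_status()
        batch = resp.json()
        if not batch:
            break
        # The issues endpoint also returns pull requests; skip them.
        issues.extend(i for i in batch if "pull_request" not in i)
        page += 1
    return issues
```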

Description (by Janice)

Purpose of Burndown Chart

  • Predict completion time -> avoid missing the deadline
  • Visualisation of progress for the whole team
  • Display the remaining work over time (a sketch of the computation follows this list)
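
A burndown series can be derived from each issue's open and close dates. The sketch below assumes every issue counts as one unit of work and uses the GitHub API field names created_at and closed_at:

```python
# Minimal sketch: compute remaining-work counts per day for a burndown chart.
# Assumes each issue is one unit of work; "issues" is a list of GitHub issue dicts.
from datetime import date, timedelta

def burndown_series(issues, start, end):
    """Return {day: number of issues still open on that day} for the sprint."""
    series = {}
    day = start
    while day <= end:
        remaining = 0
        for issue in issues:
            created = date.fromisoformat(issue["created_at"][:10])
            closed = issue.get("closed_at")
            closed = date.fromisoformat(closed[:10]) if closed else None
            # An issue counts as remaining if it was opened by this day and not yet closed.
            if created <= day and (closed is None or closed > day):
                remaining += 1
        series[day] = remaining
        day += timedelta(days=1)
    return series
```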

Limitations

  • Burndown chart does not show work being added or removed (+/-); e.g. the chart may flatten when completed work equals newly added work

    • Does not show the impact of scope changes
  • Cannot show dependent work (bottlenecks) or blocked issues (long WIP)

    • Reducing WIP: get more stories done with fewer defects found in production
  • Cannot show imbalanced work (productivity imbalance)

    • We can make use of this point if assigning issues to developers is one of our functions
    • Solution: burndown chart per developer

Solution

  • Burn-up chart (a plotting sketch follows this list)

    • inclusion of the scope line
    • tracks when work has been added to or removed from the project
  • Phil Goodwin and Russ Rufer’s Modified Burn Down Chart

    • Alternative way to show the extra work
    • New Velocity prediction
  • Adding a WIP line for chained issues? (bottleneck)
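
A hedged sketch of what a burn-up chart with a scope line could look like, using matplotlib and purely illustrative numbers:

```python
# Minimal sketch: burn-up chart with a scope line (illustrative data, not project output).
import matplotlib.pyplot as plt

days = list(range(1, 11))
completed = [0, 2, 3, 5, 6, 8, 9, 11, 13, 15]     # cumulative work done
scope = [15, 15, 17, 17, 18, 18, 18, 20, 20, 20]  # total scope, rises when work is added

plt.plot(days, completed, marker="o", label="Completed work")
plt.plot(days, scope, linestyle="--", label="Total scope")
plt.xlabel("Day of iteration")
plt.ylabel("Story points")
plt.title("Burn-up chart with scope line")
plt.legend()
plt.show()
```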

Terminologies

  • Velocity: total effort estimates associated with user stories that were completed during an iteration
  • Scope creep: Uncontrolled growth in a project’s scope

Issue Classification (by Alex)

Most of the software life cycle is spent on maintenance rather than on planning, design, implementation, etc.

The informality of Modern Code Review has its upsides and downsides. One of the challenges is to keep MCR manageable for a large project. 'Issues' is a vague term that actually encompasses many things, and not all reports are bona fide. Thus, having too many issue reports is a severe distraction for large projects.

There are many ways to classify issues. We override GitHub's default labelling system with our own (a label-to-action sketch follows the list):

  1. software bug reports (Action: Assign the issue to the developer team.)
  2. documentation errors (e.g. broken links, unclear wording, typos. Action: Assign the issue to the documentation team.)
  3. performance issues (e.g. slow, huge memory consumption. Action: Assign the issue to the tester team, who feed back to the developer team.)
  4. questions/technical support (e.g. how to ..., cannot install. Action: Direct the user to the user manual or the ops team.)
  5. feature requests (Action: Save for later, when there is time, capacity and resources.)
  6. invalid/spam (Action: Close the issue immediately.)
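
The label-to-action routing above can be expressed directly in code. The label strings and team names below are placeholders chosen for illustration, not the project's actual label set:

```python
# Minimal sketch: map each issue label to a routing action (labels and teams are placeholders).
LABEL_ACTIONS = {
    "bug": "assign to developer team",
    "documentation": "assign to documentation team",
    "performance": "assign to tester team (feedback to developers)",
    "question": "point the user to the manual or the ops team",
    "feature-request": "park in the backlog until capacity allows",
    "invalid": "close immediately",
}

def route_issue(label):
    """Return the action for a label, defaulting to manual triage."""
    return LABEL_ACTIONS.get(label, "needs manual triage")
```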

We use supervised learning to classify these issues with the spaCy library's en_core_web_sm pipeline (a small CNN-based trained pipeline that includes a tagger and a lemmatizer). As in multi-class classification, the output is a confidence score for each category, not the category itself. An issue is tagged only if the confidence is high, above 50%.
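
A minimal prediction sketch, assuming a spaCy pipeline with a trained text-classification ("textcat") component has been saved under a placeholder name:

```python
# Minimal sketch: tag an issue only when the classifier is confident enough.
# Assumes a spaCy pipeline with a trained "textcat" component saved to "issue_classifier"
# (a placeholder path, not this project's artefact).
import spacy

THRESHOLD = 0.5
nlp = spacy.load("issue_classifier")

def classify_issue(title, body):
    """Return the predicted label, or None when no category clears the threshold."""
    doc = nlp(f"{title}\n{body}")
    label, score = max(doc.cats.items(), key=lambda kv: kv[1])
    return label if score >= THRESHOLD else None
```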

With such classifications, it is also easier to find duplicate issues. If we continue to improve the application, pull requests could also be classified in the same way (except for "question/technical support").
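
One possible way to surface duplicates inside a category is a simple text-similarity pass. The sketch below uses TF-IDF cosine similarity from scikit-learn, which is an assumed approach rather than the project's actual method:

```python
# Minimal sketch: flag likely duplicate issues inside one category
# using TF-IDF cosine similarity (an assumed approach, not the project's code).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def likely_duplicates(issue_texts, threshold=0.8):
    """Return index pairs of issues whose texts are very similar."""
    matrix = TfidfVectorizer(stop_words="english").fit_transform(issue_texts)
    sims = cosine_similarity(matrix)
    pairs = []
    for i in range(len(issue_texts)):
        for j in range(i + 1, len(issue_texts)):
            if sims[i, j] >= threshold:
                pairs.append((i, j))
    return pairs
```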

>> See our unit test here <<

Implementation

We use the Flask framework, since many functions can be added by calling Python directly, including NLP (for issue classification) and complex visualisations such as class diagrams.
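
A minimal Flask sketch of the idea, exposing open GitHub issues through one endpoint; the route path is illustrative, not the project's actual API:

```python
# Minimal sketch: a Flask endpoint that proxies open issues from the GitHub REST API.
from flask import Flask, jsonify
import requests

app = Flask(__name__)

@app.route("/repos/<owner>/<repo>/issues")
def open_issues(owner, repo):
    """Return the open issues of a repository as JSON (pull requests filtered out)."""
    resp = requests.get(
        f"https://api.github.com/repos/{owner}/{repo}/issues",
        headers={"Accept": "application/vnd.github+json"},
        params={"state": "open"},
    )
    resp.raise_for_status()
    return jsonify([
        {"number": i["number"], "title": i["title"]}
        for i in resp.json() if "pull_request" not in i
    ])

if __name__ == "__main__":
    app.run(debug=True)
```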

Flow

(Flow diagram)

Schema design

Everything GitHub already supports is stored on GitHub; everything it does not support is stored in our own small database. We keep the following data:

Collection roles

  • owner: str
  • reponame: str
  • collaborator: str
  • role: str ('developer' | 'documentation' | 'tester' | 'support')

Collection tasks

  • owner: str
  • reponame: str
  • githubIssueID: int
  • startdate: date
  • enddate: date
  • originalenddate: date (absent if 'status' == 'normal')
  • assignee: str
  • status: str ('normal' | 'resolved' | 'delayed')

Collection backlogs

  • owner: str
  • reponame: str
  • githubIssueID: int
  • log: str
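
The "collection" wording suggests a document store. Assuming MongoDB accessed via pymongo (an assumption on our part, not confirmed above), inserting a task document could look like this:

```python
# Minimal sketch: writing a task document into the schema above.
# Assumes MongoDB via pymongo, which is an assumption, not confirmed by the project.
from datetime import date
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["proj_mgmt"]

task = {
    "owner": "octocat",            # placeholder repository owner
    "reponame": "hello-world",     # placeholder repository name
    "githubIssueID": 42,
    "startdate": date(2021, 3, 1).isoformat(),
    "enddate": date(2021, 3, 14).isoformat(),
    "assignee": "janice",
    "status": "normal",            # 'originalenddate' is omitted while status == 'normal'
}
db.tasks.insert_one(task)
```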