/scrum

Releng agile sprint process


Releng Agile Process Tracking

This repo contains documentation and issues related to how the Releng team at Mozilla runs its agile development process.

Based heavily on the Taskcluster team's repo.



Definitions

In order to avoid confusion when discussing our process, it's helpful to define the various terms we use to categorize our work at different levels of granularity and how they roll up into each other.

We generally follow the Atlassian model.

Theme -> Initiative -> Sprint -> Epic -> Story (-> Subtask)

Theme

A theme is a large focus area that spans the Mozilla organization and is pertinent to and addressable by the Releng team. Themes are roughly analogous to organization objectives. Themes are gathered here.

Initiative

An initiative is a collection of one or more sprints that, taken together, address one or more themes. Initiatives map to major roadmap projects that the Releng team would like to accomplish, and can be either new functionality or substantial reworks of existing functionality. Depending on the project area, an initiative can be a thin wrapper around a single sprint if that sprint is high value and self-contained. Initiatives are gathered here.

Sprint

A sprint is made up of a small collection of epics, and should be resolvable within a period of ~3 weeks. A sprint can be a thin wrapper around a single epic if the user story is large and/or important enough. Sprints can address one or more initiatives; the epics in a given sprint do not all need to service the same initiative. Sprints are tracked in Jira.

Epic

An epic is a collection of user stories that describes plainly a feature of Releng infrastructure.

Story

A story is a single, cohesive task as represented by a Jira story. Larger stories should be decomposed into smaller subtasks. No single story should take more than one engineer-week to accomplish, not counting time to review and deploy. Each story should address a single epic.

A well-defined story is the basis for all of our estimation. It cannot be overstated how important it is to make sure tasks are decomposed to the point where they can be estimated accurately. It underpins everything else.
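The roll-up described above can be sketched as a small data model. This is only an illustration, not how Jira stores the data: the class and field names are hypothetical, and initiatives are referenced by name because a sprint may address more than one of them.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    title: str
    points: int  # story-point estimate for this story
    subtasks: list[str] = field(default_factory=list)


@dataclass
class Epic:
    title: str
    stories: list[Story] = field(default_factory=list)

    def points(self) -> int:
        # An epic's size is the sum of its stories' estimates.
        return sum(story.points for story in self.stories)


@dataclass
class Sprint:
    name: str
    epics: list[Epic] = field(default_factory=list)
    # Names of the initiatives this sprint addresses (one or more).
    initiatives: list[str] = field(default_factory=list)

    def points(self) -> int:
        # A sprint's size is the sum of its epics' sizes.
        return sum(epic.points() for epic in self.epics)


# Hypothetical data, for illustration only.
example_sprint = Sprint(
    name="example-sprint",
    epics=[Epic("example epic", [Story("first story", 2), Story("second story", 3)])],
    initiatives=["example-initiative"],
)
```

Because estimates live only on stories, the totals for epics and sprints are always derived, which mirrors the point that well-defined stories underpin every other estimate.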

An Example

Here is a top-down categorization example from an actual sprint:

We should revisit this once we have clearer examples.

Tools

All Releng work for sprints is tracked in Jira.

Sometimes Bugzilla bugs or GitHub issues will be part of our sprint. When this happens, we will create a corresponding story or epic in Jira and reference the original bug or issue.

We are using Jira to manage our agile process. Jira allows for:

  • dependency tracking
  • estimate tracking
  • sprints
  • burndown charts

Any issues that are purely process-based (e.g. scrum documentation) should be filed in the scrum repo. Epics are represented in Jira as specially-tagged issues. If we provide a label that matches the scrum repo's initiative, those epics will show up in the appropriate initiative's search link.

Roles

We have three defined roles in our agile process:

1. Product owner

The Product owner manages the product backlog and keeps the rest of the team working on the most important thing. They have the final decision on matters of scope. They can also change the scope of the current sprint or, at their discretion, end the sprint early.

1a. Epic owner

For sprints composed of multiple, unrelated epics, a different Product owner can be assigned per epic.

2. Scrum master

The Scrum master deals with the scrum process itself. They run the kick-off meetings, the daily stand-ups, and the wrap-up meetings (review & retrospective) for every sprint. They prompt developers for status and follow up when developers are blocked. They support the Product owner in whatever way the Product owner deems necessary. The Scrum master also works with Future sprint champions to ensure high-quality, well-scoped milestones for future sprints.

3. Future sprint champion

The Future sprint champion is responsible for organizing issues in the Product Backlog into cohesive epics that the team can work towards in a future sprint. They are expected to devote a few hours every week to triaging issues in the backlog to refine the scope for the future sprint. Note: there can be multiple Future sprint champions active at one time.

Sprint process

Planning a sprint

During the previous sprint, one or more Future sprint champions do the work of organizing issues into epics, and epics into new sprints. User stories are discussed and converted into epics. New issues are filed or existing issues are tagged against the upcoming sprint.

Jira allows setting dependencies between issues. We need to be careful not to have too many dependencies within a given sprint, since they can block work, but we should leverage dependencies to help reveal the critical path for sprints.
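As a rough illustration of how dependencies reveal the critical path, the sketch below finds the heaviest chain of dependent issues by story points. It does not talk to Jira: the issue keys, the `points` and `blocked_by` dictionaries, and the function name are all hypothetical, and every issue named in `blocked_by` is assumed to also appear in `points`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(points, blocked_by):
    """Return (total_points, [issue keys]) for the heaviest dependency chain.

    points: dict of issue key -> story points
    blocked_by: dict of issue key -> set of issue keys it depends on
    """
    graph = {key: set(blocked_by.get(key, ())) for key in points}
    # best[key] = (points along the heaviest chain ending at key, previous issue)
    best = {}
    # static_order() yields every issue after all of its dependencies.
    for key in TopologicalSorter(graph).static_order():
        prev = max(graph[key], key=lambda dep: best[dep][0], default=None)
        base = best[prev][0] if prev is not None else 0
        best[key] = (base + points[key], prev)
    end = max(best, key=lambda key: best[key][0])
    total = best[end][0]
    # Walk back from the end of the heaviest chain to its start.
    path = []
    while end is not None:
        path.append(end)
        end = best[end][1]
    return total, path[::-1]


# Hypothetical issues: D depends on B and C, which both depend on A.
example_points = {"A": 1, "B": 3, "C": 2, "D": 1}
example_blocked_by = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
example_result = critical_path(example_points, example_blocked_by)
```

A long critical path relative to the sprint's total points is a hint that the sprint has too many dependencies and that work is likely to be blocked.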

When planning for upcoming sprints, we should focus on the needs of Releng and the community deployment. We can act on specific requests or use cases from the Firefox deployment, but we should not assume anything on their behalf. This will help avoid Mozilla-specific solutions and potential re-work later on.

As much as possible, RFCs should be written outside the sprint process, with only implementation happening within the context of sprints. If the amount of work required for implementation is unclear, estimates should be higher.

Estimation

The Releng team uses Fibonacci-scale story points for estimation. Each team member is assumed to be able to deliver on 3-4 story points in a given week. This gives individuals time to deal with interrupts and also to pursue some professional development time. Care should also be taken to plan around upcoming holidays and vacation to avoid over-packing the next sprint.

The first step for sprint estimation is adding up the available story points for all team members over the coming weeks to see what the maximum achievable amount of work is. Combined point estimates for milestones need to fit within that limit to be achievable and realistic.
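The capacity check described here is simple arithmetic, sketched below. The figure of 3 points per person per week is the conservative end of the 3-4 range mentioned above; the function names and the `weeks_available` structure are assumptions made for illustration.

```python
POINTS_PER_WEEK = 3  # conservative end of the 3-4 points per person per week range


def sprint_capacity(weeks_available, points_per_week=POINTS_PER_WEEK):
    """Sum the available story points across the team.

    weeks_available: dict of team member -> working weeks in the sprint,
    already reduced for holidays and vacation.
    """
    return sum(weeks * points_per_week for weeks in weeks_available.values())


def fits(estimates, weeks_available):
    """True if the combined point estimates fit within the team's capacity."""
    return sum(estimates) <= sprint_capacity(weeks_available)


# Hypothetical team: two people available all 3 weeks, one away for a week.
example_team = {"alice": 3, "bob": 3, "carol": 2}
example_capacity = sprint_capacity(example_team)
```

With this hypothetical team the ceiling is 24 points, so a set of milestones estimated at 26 points would need to be trimmed before the sprint starts.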

Story point estimates are made on each issue, often through consultation with other team members. Care should be taken to avoid adding too much work to a given sprint, and the issue estimates are a good tool to gauge how conservative to be.

Some areas of the Releng code are easier to deal with than others. If the issues or epics involve modifying code in a service that is less well-known or difficult to test (e.g. auth), story point estimates should be higher by default to recognize the need for closer inspection on review.
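One possible way to apply "higher by default" is to move the estimate up one step on the Fibonacci scale. Both the scale values and the one-step rule in this sketch are assumptions, not something the team has specified.

```python
# Assumed scale; the values the team actually uses may differ.
FIBONACCI_SCALE = [1, 2, 3, 5, 8, 13, 21]


def bump_estimate(points, steps=1):
    """Move an estimate up the scale by `steps`, capped at the largest value."""
    index = FIBONACCI_SCALE.index(points)  # raises ValueError for off-scale values
    return FIBONACCI_SCALE[min(index + steps, len(FIBONACCI_SCALE) - 1)]
```

For example, a change that would normally be estimated at 3 points becomes 5 when it touches a hard-to-test service.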

Choosing the next sprint

As we near the end of the current sprint, the milestone for the next sprint is often obvious based on organizational needs or follow-on work that builds on the current sprint. If there are multiple milestones possible, they should all be added to the roadmap. This helps keep customers and external parties informed.

If multiple future sprints are ready and there are no other factors to aid decision, the Scrum master will decide which sprint will come next.

Workspace configuration

TBD

Starting the sprint

At the start of the sprint, the Scrum master will update Jira to create a sprint with the sprint Epics. The Product owner and Scrum master will decide how and when they want to meet to review sprint progress, possibly leveraging an existing 1x1 if one exists.

We will hold a kick-off meeting where the Product owner will provide enough context for people to get started and will assist developers in choosing their first issue to work on. The Scrum master will remind everyone of any process changes being adopted for the new sprint. Work begins.

During the sprint

The team will hold daily stand-up meetings for the duration of the sprint. Our daily stand-ups take place in Matrix asynchronously at the end of our day. The format of the update is:

  • DONE: work that has been completed
  • TODO: work that is coming up next
  • BLOCKERS: what is preventing you from getting the work done
  • INTERRUPTS: what is delaying you from getting the work done
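For anyone who wants to script their update, the format above could be rendered with a small helper like the one below. The helper name, the keyword arguments, and the "none" placeholder for empty sections are assumptions; the four section names come from the list above.

```python
SECTIONS = ("DONE", "TODO", "BLOCKERS", "INTERRUPTS")


def format_standup(**items):
    """Render a stand-up update; keyword names are the lower-cased section names."""
    unknown = set(items) - {name.lower() for name in SECTIONS}
    if unknown:
        raise ValueError(f"unknown sections: {sorted(unknown)}")
    lines = []
    for name in SECTIONS:
        # Sections that are missing or empty are reported explicitly as "none".
        entries = items.get(name.lower()) or ["none"]
        lines.append(f"{name}: " + "; ".join(entries))
    return "\n".join(lines)


# Hypothetical update, for illustration only.
example_update = format_standup(
    done=["landed example fix"],
    todo=["review example patch"],
)
```

Always emitting all four sections, even when empty, makes it easy for the Product owner and Scrum master to scan for blockers and interrupts.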

The Product owner is paying attention to these updates to make sure that the most important things are being worked on and are being picked up next.

The Scrum master is paying attention to these updates to make sure that blockers are highlighted and interrupts are kept to a minimum.

Once a week, the Product owner and the Scrum master will meet to assess sprint progress. Using the burndown chart, they figure out whether the sprint is still on track and can be completed within the allotted time. If necessary, the critical path can be refined to allow partial delivery of the original objective. Under extraordinary circumstances, they can opt to end the current sprint.
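The "on track" judgement from the burndown chart can be approximated by comparing the remaining points against a straight-line ideal. This is a sketch under assumptions: Jira draws the chart itself, and the function names, the 10% tolerance, and the 15-working-day sprint used in the example are illustrative choices.

```python
def ideal_remaining(total_points, sprint_days, day):
    """Points a straight-line burndown expects to remain after `day` working days."""
    return total_points * (sprint_days - day) / sprint_days


def on_track(total_points, sprint_days, day, actual_remaining, tolerance=0.1):
    """True if remaining work is within `tolerance` (a fraction of the total) of the ideal line."""
    allowed = ideal_remaining(total_points, sprint_days, day) + tolerance * total_points
    return actual_remaining <= allowed


# Hypothetical sprint: 30 points over 15 working days, checked after day 5.
example_ideal = ideal_remaining(30, 15, 5)
```

In this example the ideal line expects 20 points to remain after day 5, so 22 remaining is within tolerance while 25 would prompt a discussion about refining the critical path.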

Ending the sprint

As the sprint draws to a close, the Scrum master will schedule two meetings: a Review and a Retrospective. These meetings are usually scheduled back-to-back for convenience.

In the Review meeting, the Product owner will lead the discussion. The focus will be on the content of the sprint and the outcomes achieved. Feedback on the sprint content will also be collected. At the end of the Review meeting, the team will triage any outstanding issues from the sprint. The Product owner will engage with developers who still have work marked as In Progress to figure out how and when to wrap up that work during the Time between sprints. All issues that have not been started will be moved back into the Product Backlog.

In the Retrospective meeting, the Scrum master will lead the discussion. The focus will be on the sprint process itself in an attempt to uncover grey areas and process improvements. If the team agrees to adopt a particular process change, it should be clearly marked in the meeting notes with an ACTION label and the person responsible for that action, if appropriate.

Notes for both meetings should be taken using the templates provided:

As process changes are adopted, this document should be updated to reflect the new process.

Time between sprints

We allow ourselves one week between sprints. This buffer allows us to analyze the previous sprint without the pressure of immediately jumping into something new. Pragmatically speaking, it also gives us the time to finish off sprint items that may still require deployment, to address unanticipated fallout from the previous sprint, and to deal with the backlog of small, timely requests that may have accumulated during the sprint.
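For calendar planning, the cadence implied above (a sprint of roughly three weeks followed by a one-week buffer) can be computed as follows. The function name and the exact three-week default are assumptions, since sprint length is only approximate.

```python
from datetime import date, timedelta


def next_sprint_start(sprint_start, sprint_weeks=3, buffer_weeks=1):
    """Start date of the following sprint, after the sprint and the buffer week."""
    return sprint_start + timedelta(weeks=sprint_weeks + buffer_weeks)


# Hypothetical start date, for illustration only.
example_next_start = next_sprint_start(date(2024, 1, 8))
```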