libreoffice-ci

GSoC Project: LibreOffice CI Test Selection with Machine Learning

The goal of this project is to select unit tests based on (patch,test) pair. Three models (testlabelselect, testfailure, testoverall) are trained to predict unit tests results given a patch on different levels.

The work is based on Mozilla's bugbug and rust-code-analysis.

Models

testlabelselect model predicts the failing probability of each unit test given the patch.

	Fail (Predicted)	Pass (Predicted)
Fail (Actual)	3860	203
Pass (Actual)	191593	1109768

testfailure model predicts the overall failing probability of a patch based on patch features only.

	Fail (Predicted)	Pass (Predicted)
Fail (Actual)	614	527
Pass (Actual)	2155	4863

testoverall model improves upon testfailure by using testlabelselect predictions to predict whether a patch will fail any unit test.

	Fail (Predicted)	Pass (Predicted)
Fail (Actual)	810	331
Pass (Actual)	2413	4605

A smart inference is built based on testlabelselect and testoverall predictions. By setting a threshold for the number of failed unit tests, 91% of failures can be captured, while reducing computation by 57%.

	Fail (Predicted)	Pass (Predicted)
Fail (Actual)	10617	1054
Pass (Actual)	30103	39815

Currently, the smart inference is integrated into Jenkins to save computation. If a patch is likely to fail any unit test, the sequential fast track will be run because it is assumed that the patch will fail some unit tests and there is no need to run everything. If it is likely to pass, the normal track will be run to ensure code correctness.

testlabelselect is not directly used to select unit tests because it is not able to capture all failures, about 5% failures will escape and it could cause severe problem.

Environment

Install build-essential and zstd:

sudo apt install build-essential
sudo apt install zstd

Clone libreoffice:

git clone https://gerrit.libreoffice.org/core libreoffice

Install rust:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
export PATH="~/.cargo/bin:$PATH"

Install rust-code-analysis:

cargo install rust-code-analysis-cli rust-code-analysis-web

Install conda:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

Clone libreoffice-ci:

git clone https://github.com/baolef/libreoffice-ci.git
cd libreoffice-ci

Install Python dependencies:

conda env create -f environment.yml
conda activate libreoffice-ci

Data

To extract features for past gerrit pushes, extract data/jenkinsfullstats.csv from data/jenkinsfullstats.csv.xz first, and then run:

python dataset/mining.py --path ../libreoffice

To extract all unit tests, extract pushes features data/commits.json first, and then run:

python dataset/mapping.py

To extract features for unit tests, extract pushes features data/commits.json and data/tests.json first, and then run:

python dataset/test_history.py --path data/commits.json

To convert one database format (eg. data/commits.json) into another (eg. data/commits.pickle.zstd):

python dataset/convert.py data/commits.json data/commits.pickle.zstd

Training

To train a model (eg. testlabelselect, testoverall) after extracting necessary data:

python train.py testlabelselect
python train.py testoverall

Training a model with full dataset may be time and memory consuming, --limit argument can be used to train a subset:

python train.py testlabelselect --limit 16384

Detailed training scripts are available for ungrouped data scripts/train.sh and grouped data scripts/train_group.sh.

Inference

To inference a model (eg. testlabelselect) after training necessary models (eg.testlabelselect, testoverall) for a commit hash (eg. a772976f047882918d5386a3ef9226c4aa2aa118):

python test.py testlabelselect --revision a772976f047882918d5386a3ef9226c4aa2aa118

If a commit hash is not specified, it will perform inference on the last commit.

Detailed inference script is available in scripts/test.sh.