The static source code metrics were collected previously as part of the SmartSHARK ecosystem. To recreate the collection process, a combination of the plugins vcsSHARK and mecoSHARK is needed, optionally orchestrated via serverSHARK.
We provide several entry points for this replication kit. The first starts with a database dump as the source and extracts the data from the database. The second starts with the already extracted data and only re-creates the plots and tables in the paper. In addition, we provide our fine-tuned model, which can be tried live on the website.
The raw data used in this paper comes from a SmartSHARK database dump containing 54 Apache Java projects used in a previous study. If you only want to re-create the plots and tables, you can skip this section.
Download the database dump and import it into your local MongoDB. You may have to create a MongoDB user first. While we provide a dump that contains only our study subjects, the data is also contained in any official SmartSHARK release > 2.1.
wget https://mediocre.hosting/smartshark_emse.agz
# restore mongodb dump
mongorestore -uUSER -p'PASSWORD' --authenticationDatabase=admin --gzip --archive=smartshark_emse.agz
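To check that the restore worked, a quick look at the number of projects is enough. The following is only a sketch: it assumes the pymongo package, the database name 'smartshark', and the 'project' collection of the SmartSHARK schema; adjust credentials and names to your local setup.
# optional sanity check of the restored dump; database name and credentials are assumptions
from pymongo import MongoClient

client = MongoClient('mongodb://USER:PASSWORD@localhost:27017/?authSource=admin')
db = client['smartshark']

# the SmartSHARK schema keeps one document per project in the 'project' collection
print(db['project'].count_documents({}))  # should roughly match the 54 study subjects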
After restoring the MongoDB dump, the changes need to be extracted.
python diff_metrics.py
This creates one pickle file per project, which is not very space efficient: it produces about 4 GB of data, which is also the reason we do not bundle it.
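If you want to verify one of the generated files, a plain pickle load is enough. This is a minimal sketch; the file name below is hypothetical, use any of the per-project files written by 'diff_metrics.py'.
import pickle

# inspect one of the per-project outputs of diff_metrics.py (file name is hypothetical)
with open('commons-math.pickle', 'rb') as f:
    changes = pickle.load(f)

print(type(changes))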
The raw data from the manual classification phase is available in './data/change_type_label_export2.pickle'. This is an export from the visualSHARK frontend.
This is further aggregated in the Jupyter Notebook 'notebooks/ReadManualData.ipynb'. The results of this step and the pickled files from 'diff_metrics.py' are then aggregated in the Jupyter Notebook 'notebooks/CreateDataset.ipynb'.
The metrics data with available ground truth from the manual classification is now available in 'data/only_changes.csv' and 'data/all_changes.csv'. To complement the ground truth with predictions for all other data, we first fetch the fine-tuned model. The fine-tuning step itself is explained in the model evaluation part.
cd ft/fine_tuned
wget https://smartshark2.informatik.uni-goettingen.de/sebert/seBERT_fine_tuned_commit_intent.tar.gz
tar -xzf seBERT_fine_tuned_commit_intent.tar.gz
Then we add the predictions of the fine-tuned model via the 'notebooks/AddPredictions.ipynb' notebook.
This part uses all of the aggregated data to create the plots and tables.
python -m venv .
source bin/activate
pip install -r requirements.txt
Due to GitHub's restriction on files larger than 100 MB, we do not bundle these files.
cd data
wget https://zenodo.org/record/5494134/files/all_changes_sebert.csv.gz
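The file does not need to be unpacked: pandas reads the gzip-compressed CSV directly. A minimal sketch, with column names intentionally not assumed:
import pandas as pd

# pandas infers the gzip compression from the .gz suffix
df = pd.read_csv('all_changes_sebert.csv.gz')
print(df.shape)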
The final data is distributed in the data directory. For all plots and tables only 'notebooks/CreatePlotsTables.ipynb' is needed.
source bin/activate
cd notebooks
jupyter lab
To evaluate whether the pre-trained seBERT model can be adapted to this task via fine-tuning, we run 100 fine-tuning iterations and measure the classification performance.
cd ft/models
wget https://smartshark2.informatik.uni-goettingen.de/sebert/seBERT_pre_trained.tar.gz
tar -xzf seBERT_pre_trained.tar.gz
The next step generates the data for each training and testing step. As we distribute the model evaluation, this ensures that every model uses the same data for each run.
cd ft
python generate_multi_label_folds.py
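The following is only a sketch of the idea behind 'generate_multi_label_folds.py': pre-compute and store the splits once so that seBERT and the RandomForest baseline see exactly the same data in every run. The input path, the output layout, and the plain (non-stratified) split are assumptions; the script itself is authoritative.
import os
import pandas as pd
from sklearn.model_selection import train_test_split

data = pd.read_csv('../data/all_changes.csv')  # assumed input
os.makedirs('folds', exist_ok=True)            # hypothetical output layout

for run in range(100):
    # one fixed split per run, seeded by the run number so every model sees the same data
    train, test = train_test_split(data, test_size=0.2, random_state=run)
    train.to_pickle(f'folds/train_{run}.pickle')
    test.to_pickle(f'folds/test_{run}.pickle')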
This generates submission scripts for a SLURM HPC system: one submission script for each of the 100 runs of seBERT fine-tuning and of the RandomForest baseline. After generating the scripts, they can be submitted to the HPC system. Be aware that this uses SLURM and that you need about 900 GB of disk space, as quite a bit of model data is generated. The HPC system also needs GPU nodes; in our case we evaluated the model on Nvidia Quadro RTX 5000 GPU nodes.
cd ft
python generate_commit_intent_ml_scripts.py
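As a rough illustration, the sketch below writes one sbatch file per run. The partition name, resource limits, and the invoked training script are assumptions; the real scripts are produced by 'generate_commit_intent_ml_scripts.py'.
# writes one hypothetical SLURM submission script per run
SBATCH_TEMPLATE = """#!/bin/bash
#SBATCH --job-name=sebert_run_{run}
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --time=24:00:00

python fine_tune_run.py --run {run}  # hypothetical training entry point
"""

for run in range(100):
    with open(f'submit_run_{run}.sh', 'w') as f:
        f.write(SBATCH_TEMPLATE.format(run=run))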
We also provide a Jupyter notebook, 'notebooks/FineTuneModel.ipynb', which shows the fine-tuning and evaluation steps.
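A compressed sketch of one fine-tuning run is shown below. It assumes the Hugging Face Transformers library, the pre-trained checkpoint extracted to 'ft/models/seBERT_pre_trained', three intent classes, and a multi-label setup; all of these are assumptions for illustration, the notebook and the generated run scripts are authoritative.
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

class CommitDataset(Dataset):
    """Tokenized commit messages with multi-hot intent labels."""
    def __init__(self, messages, labels, tokenizer):
        self.enc = tokenizer(messages, truncation=True, padding=True, max_length=128)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.enc.items()}
        item['labels'] = torch.tensor(self.labels[idx], dtype=torch.float)
        return item

model_dir = 'ft/models/seBERT_pre_trained'  # assumed extraction path of the pre-trained archive
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(
    model_dir, num_labels=3, problem_type='multi_label_classification')  # number of classes is an assumption

# toy stand-ins for one pre-computed fold
train_ds = CommitDataset(['fix npe in parser', 'add new config option'],
                         [[1, 0, 0], [0, 1, 0]], tokenizer)

args = TrainingArguments(output_dir='ft_out', num_train_epochs=3, per_device_train_batch_size=8)
Trainer(model=model, args=args, train_dataset=train_ds).train()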
Once we are satisfied with the evaluated model performance, we generate the final model using all available ground truth data.
cd ft
python generate_sebert_intent_model.py
If you just want to try out the final fine-tuned model, you can use the live version on the website or our provided Jupyter Notebook 'notebooks/AddPredictions.ipynb'. However, you need to download and extract the fine-tuned model first to use the notebook.
cd ft/fine_tuned
wget https://smartshark2.informatik.uni-goettingen.de/sebert/seBERT_fine_tuned_commit_intent.tar.gz
tar -xzf seBERT_fine_tuned_commit_intent.tar.gz
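After extracting the archive, a single commit message can be classified along these lines. The extraction path and the interpretation of the outputs are assumptions; 'notebooks/AddPredictions.ipynb' shows the exact usage.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_dir = 'seBERT_fine_tuned_commit_intent'  # assumed extraction path, adjust to where the archive unpacked
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(model_dir)
model.eval()

inputs = tokenizer('fix NPE when parsing empty configuration files',
                   return_tensors='pt', truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits)  # map to intent labels via the model config (id2label) as done in the notebook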