Resource limitations frequently hinder the deployment of robust machine learning models in real-world applications. This research proposes BugBERT, a solution for binary classification of software issues as bugs or non-bugs in resource-constrained environments. BugBERT builds on TinyBERT, a distilled variant of the BERT language representation model, and aims for high classification accuracy while remaining computationally efficient. The study contributes to the field of efficient deep learning models for specialized classification tasks on devices with limited resources.
Dataset 📜: JSON files containing the 5,591 tickets used in the experiments. The dataset is split into 7 files to stay within the 1 MB limit that anonymous.4open.science applies when anonymising files. The dataset comes from the conference paper "It’s not a bug, it’s a feature: how misclassification impacts bug prediction" by Kim Herzig, Sascha Just, and Andreas Zeller.
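Since the dataset is split across several JSON files, it has to be merged before use. The sketch below shows one way to do that with the Python standard library. The function name `load_tickets`, the `*.json` file pattern, and the assumption that each part file holds either a list of ticket objects or a single ticket object are illustrative only; check them against the actual files.

```python
import json
from pathlib import Path


def load_tickets(dataset_dir, pattern="*.json"):
    """Merge every JSON part file found in dataset_dir into one list of tickets."""
    tickets = []
    # Sort the part files so the merge order is the same on every run.
    for part in sorted(Path(dataset_dir).glob(pattern)):
        with part.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Assumption: a part file is either a list of tickets or one ticket object.
        if isinstance(data, list):
            tickets.extend(data)
        else:
            tickets.append(data)
    return tickets
```

Calling `load_tickets("path/to/dataset")` with the directory that holds the 7 part files should return a single list. If the layout assumption holds, its length can be compared against the 5,591 tickets stated above as a sanity check.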
RaspberryPiUtils: Contains the scripts used to create the Docker image and run the API that wraps BugBert. Additionally contains a sample script used to hit the API.

BugBertTest.py: Script to test the BugBert model.

BugBertImplementation: Jupyter notebook to build BugBert.
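For orientation, a client that sends an issue to the API could look roughly like the sketch below. This is not the sample script from RaspberryPiUtils: the `/predict` path, the `title` and `body` payload fields, and the JSON response are all hypothetical placeholders, so consult the actual scripts for the real endpoint and schema.

```python
import json
import urllib.request


def build_request(base_url, title, body):
    """Build a POST request carrying one issue as JSON (hypothetical schema)."""
    payload = json.dumps({"title": title, "body": body}).encode("utf-8")
    return urllib.request.Request(
        # "/predict" is an assumed endpoint name, not taken from the repository.
        url=f"{base_url.rstrip('/')}/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify_issue(base_url, title, body, timeout=10.0):
    """Send the issue to the API and return the decoded JSON response."""
    request = build_request(base_url, title, body)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from the network call lets the payload be inspected or tested without a running Raspberry Pi.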