3244-pulmonary-fibrosis-notebooks

This repository contains the notebooks we used in our CS3244 Machine Learning project to predict the prognosis of pulmonary fibrosis from patient CT scans and metadata. It includes our linear regression and ridge regression models as well as all of our deep learning models. Ablation study notebooks are prefixed with ablation. These do not use the optimal weight file, fold-1.h5; we trained each of them on the Kaggle GPU and tested them separately.

Our playground notebook for gathering the data used in our meso-analysis is meso_analysis_rmse_values.ipynb. A few plots in our report were not generated by Python code but were made manually in Tableau.

To run the notebooks outside Kaggle, you will have to update the paths inside each notebook: all the Jupyter notebooks were exported from our Kaggle notebooks and still use the Kaggle paths.
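
As a rough illustration of the path update, the helper below rewrites a Kaggle-style input path so that it points into a local folder. It is only a sketch: the prefixes `/kaggle/input/` and `../input/` are the ones Kaggle notebooks commonly use and are assumed here rather than taken from the notebooks, and `to_local_path`, `local_root`, and the dataset name `some-dataset` are hypothetical names introduced for the example.

```python
from pathlib import Path

# Prefixes commonly used for attached datasets in Kaggle notebooks.
# Assumption: the notebooks in this repository use one of these.
KAGGLE_PREFIXES = ("/kaggle/input/", "../input/")


def to_local_path(kaggle_path: str, local_root: str = "data") -> Path:
    """Map a Kaggle-style input path to the same relative path under local_root.

    Paths that do not start with a known Kaggle prefix are returned unchanged.
    """
    for prefix in KAGGLE_PREFIXES:
        if kaggle_path.startswith(prefix):
            # Keep everything after the Kaggle prefix and re-root it locally.
            return Path(local_root) / kaggle_path[len(prefix):]
    return Path(kaggle_path)


# Hypothetical usage: "some-dataset" stands in for whatever dataset folder
# the notebook actually reads from.
train_csv = to_local_path("/kaggle/input/some-dataset/train.csv", local_root="data")
```

With this approach the downloaded data would be placed under `data/` using the same folder names as on Kaggle, so only the path prefix changes in each notebook.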

If you would like to run the notebooks on Kaggle itself, all versions of our notebooks and experiments are also available here: https://www.kaggle.com/wekandoit/layers-changed-training-notebook-efficientnet-b1

The best model weights

The best model's weights are saved in fold-1.h5. To run inference with this weight file, first run all the import and helper-function cells that come before the training step in the notebook. Then run the build_model function in the testing chunk of the code for inference.
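
The inference setup described above could be wrapped roughly as follows. This is a minimal sketch, not the notebook's actual code: `load_best_model` is a hypothetical helper, the arguments that `build_model` expects are not documented here and are simply passed through as `build_kwargs`, and it assumes `build_model` returns a Keras model, whose `load_weights` method accepts a path to an `.h5` file.

```python
from pathlib import Path


def load_best_model(build_model, weights_path="fold-1.h5", **build_kwargs):
    """Build the model with the notebook's build_model and load saved weights.

    build_model  -- the function defined in the notebook (passed in as a callable)
    weights_path -- location of the weight file; adjust to where fold-1.h5 lives
    build_kwargs -- whatever arguments build_model needs (unknown here)
    """
    weights = Path(weights_path)
    if not weights.is_file():
        # Fail early with a clear message instead of a low-level loading error.
        raise FileNotFoundError(f"Weight file not found: {weights}")

    model = build_model(**build_kwargs)
    # Assumes a Keras-style model exposing load_weights(filepath).
    model.load_weights(str(weights))
    return model
```

In a notebook this would be called after the import and helper cells have run, for example `model = load_best_model(build_model, "fold-1.h5")`, before running the prediction cells.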

Acknowledgements

Some of the notebooks here heavily reference and are built upon Khoong Wei Hao's work here: https://www.kaggle.com/khoongweihao/efficientnets-quantile-regression-inference

Contributors and Teammates

  • Vanessa Tan
  • Li Huihui
  • Lee Hui Qi
  • Chattoraj Ayush
  • Mani Mekala Sharad Hosur
  • Aiden Low