
A project investigating the fairness of machine learning approaches to predicting the risk of stillbirth and preterm birth.

This project uses both R and Python, so dependency management and interoperability need extra care.

Python setup

For Python, I've been working in a virtual environment named "venv" in the root directory of this project. Create one with:

$ python -m venv venv

To run anything Python-related from the command line, activate the virtual environment first:

$ source venv/bin/activate
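
If it is unclear whether the venv is actually active, a quick check from Python can help. This is a minimal sketch, not part of the project's scripts; the function name `in_virtualenv` is made up for illustration. It relies on the fact that inside a venv `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("venv active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

When the venv is active, the interpreter path printed should sit under the project's venv directory.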

To install the same packages across machines, run the following (with the venv activated):

$ pip install -r requirements.txt

Or, to save an updated dependency list:

$ pip freeze > requirements.txt
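
To confirm that a machine matches the saved list, something like the following could compare the pins in requirements.txt with what is installed. It is a rough sketch under assumptions: it only understands plain `name==version` lines as written by pip freeze, and the helper names `parse_pins` and `mismatches` are hypothetical, not part of this project.

```python
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path


def parse_pins(text):
    """Extract {package: version} from 'name==version' lines, skipping everything else."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line or line.startswith("-"):
            continue  # skip blank lines, options such as '-e', and unpinned entries
        name, pinned = line.split("==", 1)
        pins[name.strip()] = pinned.strip()
    return pins


def mismatches(pins):
    """Return {package: (pinned, installed)} for packages whose installed version differs."""
    result = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            result[name] = (pinned, installed)
    return result


if __name__ == "__main__":
    req = Path("requirements.txt")
    if req.exists():
        for name, (pinned, installed) in mismatches(parse_pins(req.read_text())).items():
            print(f"{name}: pinned {pinned}, installed {installed}")
```

Run it from the project root with the venv activated; it prints nothing when every pinned package is installed at the pinned version.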

If the pip install fails, I believe the packages currently in use are: numpy, pandas, tensorflow, scikit-learn, and lightgbm.
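
A short script can report which of the packages listed above are missing from the current environment. This is only a sketch: the function name `missing_packages` is hypothetical, and the package list simply mirrors the sentence above, so it may be incomplete. Note that scikit-learn is imported under the name `sklearn`.

```python
from importlib.util import find_spec

# Maps the pip package name to the name used in an import statement.
PACKAGES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "tensorflow": "tensorflow",
    "scikit-learn": "sklearn",
    "lightgbm": "lightgbm",
}


def missing_packages(packages=PACKAGES):
    """Return the pip names of packages that cannot be found in this environment."""
    return [
        pip_name
        for pip_name, import_name in packages.items()
        if find_spec(import_name) is None  # None means the module is not importable
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All listed packages are available.")
```

Any names it reports can then be installed individually with pip inside the venv.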

R setup

For our R scripts and mixed R/Python scripts, RStudio serves as both editor and interpreter. We currently require the tidyverse, caret, and fakeR packages, as well as reticulate, which provides Python interoperability.

For package versions, here is the output of utils::sessionInfo() on one machine:

