Primary language: Jupyter Notebook. License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause).

PDPilot User Study

Contents

This repository contains the material for the user study on PDPilot.

  • introduction-and-interview-slides.pdf: Slides that introduce the study and review PDP and ICE plots, followed by slides for the semi-structured interview.
  • protocol.pdf: A document that outlines the study protocol. In particular, it includes the script for the tutorial on using PDPilot, the questions that participants were asked during training, and the prompt for the model analysis.
  • ames: The Ames, Iowa housing dataset, trained model, and notebooks for preprocessing the dataset, training the model, calculating the plots, and running PDPilot. This dataset is for the main model analysis that the participant performs.
  • bike-rental: The Bike Sharing dataset and notebook for training the model and running PDPilot. This dataset is used during training.
  • churn: The Churn dataset and notebook for training the model and running PDPilot. This dataset is used during the tutorial.
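To confirm that a clone contains everything listed above, a short check along the following lines may help. This is a minimal sketch, not part of the repository: the helper name missing_items is hypothetical, and it assumes the listed files and folders sit directly under the repository root.

```python
from pathlib import Path

# Names taken from the contents list above.
EXPECTED_ITEMS = [
    "introduction-and-interview-slides.pdf",
    "protocol.pdf",
    "ames",
    "bike-rental",
    "churn",
]


def missing_items(repo_root):
    """Return the expected items that are not present under repo_root.

    Hypothetical helper: assumes every item lives directly under the root.
    """
    root = Path(repo_root)
    return [name for name in EXPECTED_ITEMS if not (root / name).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the root of the cloned repository.
    missing = missing_items(".")
    print("Missing items:", missing if missing else "none")
```

An empty result means all five items were found. It says nothing about whether the notebooks inside each folder run.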

Installation

To run the code for the study, you'll need Python 3.8 to 3.11, XGBoost, JupyterLab or Jupyter Notebook, and pdpilot. Below are examples of setting up an environment using conda and venv.

conda:

git clone https://github.com/DanielKerrigan/PDPilotUserStudy.git
cd PDPilotUserStudy
conda env create -f environment.yml
conda activate pdpilot-study
jupyter-notebook

venv:

git clone https://github.com/DanielKerrigan/PDPilotUserStudy.git
cd PDPilotUserStudy
python -m venv pdpilot-study
source pdpilot-study/bin/activate
python -m pip install -r requirements.txt
jupyter-notebook

Additionally, download this JSON file containing pre-computed PDP and ICE plots and place it in the ames folder.
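Once the environment is active, a quick sanity check can confirm that it matches the requirements stated above. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the import names xgboost and pdpilot are assumed to match the package names.

```python
import importlib.util
import sys

# Supported Python range, as stated in the installation instructions.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 11)

# Assumption: the import names match the package names.
REQUIRED_MODULES = ["xgboost", "pdpilot"]


def python_version_ok(version_info=sys.version_info):
    """Return True if the major.minor version is within the supported range."""
    return MIN_VERSION <= tuple(version_info[:2]) <= MAX_VERSION


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version supported:", python_version_ok())
    print("Missing modules:", missing_modules() or "none")
```

Run it with the same interpreter that launches Jupyter. The check only looks modules up and does not import them, so it does not verify that the installed versions work together.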