propensity.jl

WIP: A propensity score utility for Julia

Demo (WIP)

Installation

$ julia -e 'using Pkg; pkg"add https://github.com/XXX";'

Load Demo Data

Subjects: 400 subjects (male) with suspected MI (myocardial infarction) from a hospital-based retrospective cohort study
Outcome: 30-day mortality (death=1)
Intervention: Rapid administration of a new clot-busting drug (trt=1) versus a standard therapy (trt=0)
Source: http://web.hku.hk/~bcowling/data/propensity.csv

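using Propensity
using CSV
using DataFrames  # needed for DataFrame, select, and Not unless Propensity re-exports them

# Load the demo data into a DataFrame
df = CSV.File("../../data/propensity.csv") |> DataFrame;

# Drop the death (outcome) and male columns; keep trt and the remaining covariates
df = select(df, Not([:death, :male]))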

Fit logit function for propensity of intervention

# Fit function
logit = fit_logit("trt", df)

# Assign propensity scores
df = assign_propensity_scores(df, logit)
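
The internals of fit_logit and assign_propensity_scores are not shown in this README. As a rough sketch of the underlying idea only, and not the package's implementation, a propensity score can be taken as the fitted probability of treatment from a logistic regression on the covariates. The helpers sketch_fit_logit and sketch_scores below are hypothetical names; they fit the model by Newton's method using only the standard library, on made-up toy data.

```julia
using LinearAlgebra

# Logistic (sigmoid) function: maps a linear predictor to a probability.
sigmoid(z) = 1 / (1 + exp(-z))

# Hypothetical helper (not part of Propensity): fit logistic regression
# coefficients by Newton's method.
# X is an n×p covariate matrix without an intercept column; t is a 0/1 vector.
function sketch_fit_logit(X::AbstractMatrix, t::AbstractVector; iters::Int = 25)
    A = hcat(ones(size(X, 1)), X)        # prepend an intercept column
    β = zeros(size(A, 2))
    for _ in 1:iters
        p = sigmoid.(A * β)              # current fitted probabilities
        w = p .* (1 .- p)                # per-row weights p(1 - p)
        grad = A' * (t .- p)             # gradient of the log-likelihood
        H = A' * (w .* A) + 1e-8 * I     # negative Hessian plus a tiny ridge for stability
        β += H \ grad                    # Newton step
    end
    return β
end

# Hypothetical helper: propensity score = fitted P(trt = 1 | covariates).
sketch_scores(X::AbstractMatrix, β::AbstractVector) =
    sigmoid.(hcat(ones(size(X, 1)), X) * β)

# Toy data, made up for illustration (not the demo dataset).
X = reshape([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], :, 1)
t = [0, 0, 1, 0, 1, 1]
β = sketch_fit_logit(X, t)
scores = sketch_scores(X, β)
```

Each entry of scores lies strictly between 0 and 1, and in this toy data the scores rise with the covariate because treated rows tend to have larger values.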

Inspect Propensity Scores by Intervention Status

# Label each row by treatment status for plotting
df.Treatment = ifelse.(df.trt .== 1, "Treatment", "No Treatment")

plot_prop_by_factor(df, "Treatment")

Inspect Propensity Scores by Covariates

df = assign_quartile(df, "age", "age_quartiles")

plot_prop_by_covariate(
    df,
    "Treatment",
    "age",
    "age_quartiles"
)
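
How assign_quartile bins values is not documented here. One plausible reading, shown as a sketch under that assumption, is to cut the covariate at its 25th, 50th, and 75th percentiles and label each row 1 to 4. The helper sketch_quartiles is a hypothetical name and the ages are made up.

```julia
using Statistics

# Hypothetical helper (not part of Propensity): label each value with its
# quartile, from 1 (lowest) to 4 (highest).
function sketch_quartiles(x::AbstractVector{<:Real})
    cuts = quantile(x, [0.25, 0.5, 0.75])   # the three quartile cut points
    # Count how many cut points each value strictly exceeds, then add 1.
    return [1 + count(c -> v > c, cuts) for v in x]
end

ages = [34, 51, 47, 62, 70, 58, 45, 66]     # made-up ages
q = sketch_quartiles(ages)
```

Values equal to a cut point fall into the lower bin in this sketch; the package may handle ties differently.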

WIP Outline:

Interface

select covariates (artifact -> list of covariates)

Core

calculate propensity score
- fit scores with logit (artifact -> trained model object)
  - fit with sampling from majority class
  - fit n models (when the sampling above is repeated more than once)
- predict score with trained model (artifact -> scores per instance table)
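
The "sampling from majority class" step above could be sketched as random undersampling: keep every row of the smaller treatment group and draw an equally sized random subset of the larger one. This is an assumption about the planned design, and sketch_undersample is a hypothetical name.

```julia
using Random

# Hypothetical helper (not part of Propensity): return row indices that keep
# all minority-class rows plus an equal-sized random sample of majority-class rows.
function sketch_undersample(trt::AbstractVector{<:Integer};
                            rng::AbstractRNG = Random.default_rng())
    ones_idx  = findall(==(1), trt)
    zeros_idx = findall(==(0), trt)
    if length(ones_idx) <= length(zeros_idx)
        minority, majority = ones_idx, zeros_idx
    else
        minority, majority = zeros_idx, ones_idx
    end
    sampled = shuffle(rng, majority)[1:length(minority)]   # random majority subset
    return sort(vcat(minority, sampled))
end

trt = [1, 0, 0, 0, 1, 0, 0, 0]              # made-up treatment indicator
idx = sketch_undersample(trt; rng = MersenneTwister(1))
```

Calling the helper n times with different random states would give n balanced samples, one per model.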

Stratification/matching

Inspect covariates by propensity scores (quartiles)
find matches by propensity score (if exists)
- methods: greedy/random, nearest (knn), threshold (based on distance)
- (artifact -> matched/sampled training instances table)
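
As an illustration of the greedy and threshold ideas listed above, the sketch below pairs each treated row with the closest unmatched control, and accepts the pair only if the scores differ by no more than a caliper. The name sketch_greedy_match, the default caliper of 0.1, and the scores are all assumptions made for this example.

```julia
# Hypothetical helper (not part of Propensity): greedy 1:1 nearest-neighbor
# matching without replacement. Returns (treated_index, control_index) pairs
# whose propensity scores differ by at most `caliper`.
function sketch_greedy_match(scores::AbstractVector{<:Real},
                             treated::AbstractVector{Bool};
                             caliper::Real = 0.1)
    controls = findall(.!treated)            # indices of still-unmatched controls
    pairs = Tuple{Int,Int}[]
    for i in findall(treated)
        isempty(controls) && break
        # Closest remaining control by absolute score distance.
        dist, k = findmin(abs.(scores[controls] .- scores[i]))
        if dist <= caliper
            push!(pairs, (i, controls[k]))
            deleteat!(controls, k)           # each control is used at most once
        end
    end
    return pairs
end

scores  = [0.80, 0.30, 0.75, 0.35, 0.60, 0.10]   # made-up propensity scores
treated = [true, false, false, true, true, false]
pairs = sketch_greedy_match(scores, treated)
```

Greedy matching depends on the order in which treated rows are visited; a treated row with no control inside the caliper (row 5 here) is left unmatched.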

Propensity Score Weighting
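
One common form of propensity score weighting is inverse probability of treatment weighting, where a treated row with score e gets weight 1/e and an untreated row gets 1/(1 - e). Whether the package will use this form is not stated, so the sketch below is an assumption, and sketch_ipw is a hypothetical name.

```julia
# Hypothetical helper (not part of Propensity): inverse probability of
# treatment weights. Treated rows get 1/e, untreated rows get 1/(1 - e),
# where e is the row's propensity score.
sketch_ipw(e::AbstractVector{<:Real}, trt::AbstractVector{<:Integer}) =
    [t == 1 ? 1 / s : 1 / (1 - s) for (s, t) in zip(e, trt)]

e   = [0.8, 0.25, 0.5]     # made-up propensity scores
trt = [1, 0, 1]
w = sketch_ipw(e, trt)
```

Scores very close to 0 or 1 produce very large weights, so in practice scores are often trimmed or the weights stabilized before use.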

Analysis

plot scores
tests

Resources

reycn/propensity-fix
