WIP: A propensity score utility for Julia
Installation
$ julia -e 'using Pkg; pkg"add https://github.com/XXX";'
Load Demo Data
- Subjects: 400 subjects (male) with suspected MI, from a hospital-based retrospective cohort study
- Outcome: 30-day mortality (death=1)
- Intervention: Rapid administration of a new clot-busting drug (trt=1) versus standard therapy (trt=0)
- Source: http://web.hku.hk/~bcowling/data/propensity.csv
using Propensity
using CSV
using DataFrames  # DataFrame, select and Not come from DataFrames.jl
df = CSV.File("../../data/propensity.csv") |> DataFrame;
# drop the outcome (death) and the male column; keep trt and the remaining covariates
df = select(df, Not([:death, :male]))
Fit a logit model for the propensity of intervention
# Fit the logit model with trt as the response
logit = fit_logit("trt", df)
# Assign propensity scores
df = assign_propensity_scores(df, logit)
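For readers who want to see what the two steps above amount to, here is a minimal sketch of a logit fit and score prediction in plain Julia. It is not the package's implementation: `fit_logit_sketch`, `predict_propensity`, the small `ridge` term and the toy data are all hypothetical names and values chosen for illustration.

```julia
using LinearAlgebra

# Logistic function: maps a linear predictor to a probability.
sigmoid(z) = 1 / (1 + exp(-z))

# Fit logistic-regression coefficients with Newton's method (IRLS).
# X is an n-by-p covariate matrix without an intercept column,
# t is a 0/1 treatment vector. The ridge term only stabilizes the
# linear solve (an assumption of this sketch).
function fit_logit_sketch(X::AbstractMatrix, t::AbstractVector; iters=25, ridge=1e-8)
    A = hcat(ones(size(X, 1)), X)         # prepend the intercept column
    beta = zeros(size(A, 2))
    for _ in 1:iters
        p = sigmoid.(A * beta)            # current predicted propensities
        w = p .* (1 .- p)                 # IRLS weights
        grad = A' * (t .- p)              # gradient of the log-likelihood
        H = A' * (w .* A) + ridge * I     # negative Hessian, lightly regularized
        beta += H \ grad                  # Newton step
    end
    return beta
end

# Propensity score = predicted probability of treatment given the covariates.
predict_propensity(beta, X) = sigmoid.(hcat(ones(size(X, 1)), X) * beta)

# Toy data (made up): one covariate, eight subjects.
X = reshape([1.0, 2, 3, 4, 5, 6, 7, 8], :, 1)
t = [0, 0, 1, 0, 1, 0, 1, 1]
beta = fit_logit_sketch(X, t)
scores = predict_propensity(beta, X)
```

Each entry of `scores` is the estimated probability that the corresponding subject received the intervention.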
Inspect Propensity Scores by Intervention Status
df[!, Symbol("Treatment")] .= ifelse.(
df.trt .== 1, "Treatment", "No Treatment")
plot_prop_by_factor(df, "Treatment")
Inspect Propensity Scores by Covariates
df = assign_quartile(df, "age", "age_quartiles")
plot_prop_by_covariate(
df,
"Treatment",
"age",
"age_quartiles"
)
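The quartile step above can be sketched with the `Statistics` standard library. This is an illustration only; `quartile_bins` and `mean_by_bin` are hypothetical helpers, not functions exported by this package, and the binning rule (cut points at the 25th, 50th and 75th percentiles) is an assumption.

```julia
using Statistics

# Assign each value to a quartile bin (1 = lowest quarter, 4 = highest).
# Cut points are the 25th, 50th and 75th percentiles of x.
function quartile_bins(x::AbstractVector{<:Real})
    cuts = quantile(x, [0.25, 0.5, 0.75])
    # searchsortedfirst returns 1 + the number of cut points strictly below v
    return [searchsortedfirst(cuts, v) for v in x]
end

# Mean propensity score within each bin, for a quick covariate check.
mean_by_bin(scores, bins) =
    Dict(b => mean(scores[bins .== b]) for b in sort(unique(bins)))

# Toy ages (made up)
ages = [30, 40, 50, 60, 70, 80, 90, 100]
bins = quartile_bins(ages)
```

Comparing `mean_by_bin` between treated and untreated subjects gives a rough numeric counterpart to the plot.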
WIP Outline:
Interface
- select covariates (artifact -> list of covariates)
Core
- calculate propensity score
- fit scores with logit (artifact -> trained model object)
- fit with sampling from majority class
- fit n models (when more than one majority-class sample is drawn)
- predict score with trained model (artifact -> scores per instance table)
Stratification/matching
- Inspect covariates by propensity scores (quartiles)
- find matches by propensity score (if exists)
- methods: greedy/random, nearest (knn), threshold (based on distance)
- (artifact -> matched/sampled training instances table)
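As a rough illustration of the greedy, threshold-based option listed above, the sketch below does 1:1 nearest-neighbor matching on the propensity score without replacement. The function name `greedy_match`, the `caliper` keyword and its default of 0.1 are assumptions for this example, not a settled design for the package.

```julia
# Greedy 1:1 nearest-neighbor matching on the propensity score, without replacement.
# Returns (treated_index, control_index) pairs. A treated unit stays unmatched
# when no unused control lies within `caliper` of its score.
function greedy_match(scores::AbstractVector{<:Real},
                      treated::AbstractVector{Bool}; caliper=0.1)
    controls = findall(.!treated)
    used = falses(length(scores))          # marks controls already matched
    pairs = Tuple{Int,Int}[]
    for i in findall(treated)
        best, best_dist = 0, Inf
        for j in controls
            used[j] && continue
            d = abs(scores[i] - scores[j])
            if d < best_dist               # ties keep the earlier control
                best, best_dist = j, d
            end
        end
        if best != 0 && best_dist <= caliper
            push!(pairs, (i, best))
            used[best] = true
        end
    end
    return pairs
end

# Toy scores (made up): two treated subjects, three controls.
scores = [0.80, 0.30, 0.78, 0.35, 0.10]
treated = [true, true, false, false, false]
pairs = greedy_match(scores, treated)
```

Greedy matching depends on the order in which treated units are visited; shuffling that order would give the random variant mentioned in the outline.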
Propensity Score Weighting
- TBD
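Until this part is designed, one common form of propensity score weighting is inverse probability of treatment weighting. The sketch below is only a placeholder illustration under that assumption; `ipw_weights` and `ipw_ate` are hypothetical names, and the normalized (weighted-mean) estimator is one choice among several.

```julia
# Inverse probability of treatment weights:
# 1/e for treated subjects, 1/(1 - e) for untreated subjects,
# where e is the propensity score and t is the 0/1 treatment indicator.
ipw_weights(e, t) = @. t / e + (1 - t) / (1 - e)

# Difference in weighted mean outcome between treated and untreated subjects
# (normalized form: each group's weights are rescaled to sum to one).
function ipw_ate(y, t, e)
    w = ipw_weights(e, t)
    treated_mean = sum(w .* t .* y) / sum(w .* t)
    control_mean = sum(w .* (1 .- t) .* y) / sum(w .* (1 .- t))
    return treated_mean - control_mean
end
```

Scores very close to 0 or 1 produce very large weights, so in practice the scores are often trimmed or the weights stabilized before use.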
Analysis
- plot scores
- tests
Resources