WIP: Propensity score utilities for Julia

Primary language: Julia. License: MIT.

propensity.jl

WIP: A propensity score utility for Julia

Demo (WIP)

Installation

$ julia -e 'using Pkg; Pkg.add(url="https://github.com/XXX")'

Load Demo Data

  • Subjects: 400 male subjects with suspected myocardial infarction (MI), from a retrospective hospital cohort study
  • Outcome: 30-day mortality (death=1)
  • Intervention: Rapid administration of a new clot-busting drug (trt=1) versus a standard therapy (trt=0)
  • Source: http://web.hku.hk/~bcowling/data/propensity.csv
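using Propensity
using CSV
using DataFrames  # needed for DataFrame, select, and Not unless Propensity re-exports them

# Load the demo data into a DataFrame
df = CSV.File("../../data/propensity.csv") |> DataFrame

# Drop the outcome (death) and male (all subjects are male), keeping
# the treatment indicator and the remaining covariates
df = select(df, Not([:death, :male]))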

Fit a logit model for the propensity of intervention

# Fit a logit model with trt as the response
logit = fit_logit("trt", df)

# Assign a propensity score to each row
df = assign_propensity_scores(df, logit)
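
For intuition about what fitting and scoring involve, here is a minimal, self-contained sketch of logistic regression fitted by Newton's method, using only the standard library. It is not the implementation behind fit_logit or assign_propensity_scores; the names fit_logit_sketch and propensity_scores_sketch are hypothetical, and the toy data is made up.

```julia
using LinearAlgebra

# Logistic (sigmoid) function.
sigmoid(z) = 1 / (1 + exp(-z))

# Fit logistic-regression coefficients by Newton's method.
# X: n×p covariate matrix (no intercept column), t: 0/1 treatment vector.
function fit_logit_sketch(X::AbstractMatrix, t::AbstractVector; iters::Int = 25)
    A = hcat(ones(size(X, 1)), X)       # prepend an intercept column
    β = zeros(size(A, 2))
    for _ in 1:iters
        p = sigmoid.(A * β)             # current fitted probabilities
        w = p .* (1 .- p)               # per-row Newton weights
        grad = A' * (t .- p)            # gradient of the log-likelihood
        H = A' * (w .* A) + 1e-8 * I    # negative Hessian plus a tiny ridge for stability
        β += H \ grad                   # Newton step
    end
    return β
end

# A propensity score is the predicted probability of treatment.
propensity_scores_sketch(β, X) = sigmoid.(hcat(ones(size(X, 1)), X) * β)

# Toy data (made up for illustration; not the demo dataset)
X = reshape([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], :, 1)
t = [0, 0, 1, 0, 1, 1]
β = fit_logit_sketch(X, t)
scores = propensity_scores_sketch(β, X)
```

Because the model includes an intercept, the fitted scores sum to the number of treated rows at convergence, which is a quick sanity check on any fit.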

Inspect Propensity Scores by Intervention Status

# Label each row by treatment status for plotting
df.Treatment = ifelse.(df.trt .== 1, "Treatment", "No Treatment")

plot_prop_by_factor(df, "Treatment")

Inspect Propensity Scores by Covariates

df = assign_quartile(df, "age", "age_quartiles")

plot_prop_by_covariate(
    df,
    "Treatment",
    "age",
    "age_quartiles"
)
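
As a rough illustration of the quartile step, the sketch below bins a numeric covariate into quartiles 1 to 4 using the empirical 25%, 50%, and 75% cut points. The function name quartile_labels_sketch is hypothetical and the ages are made up; assign_quartile may use a different binning rule.

```julia
using Statistics

# Assign each value a quartile label 1..4: one plus the number of
# cut points (25%, 50%, 75% quantiles) that the value exceeds.
function quartile_labels_sketch(x::AbstractVector{<:Real})
    cuts = quantile(x, [0.25, 0.5, 0.75])
    return [1 + count(c -> v > c, cuts) for v in x]
end

ages = [34, 41, 47, 52, 58, 63, 70, 77]   # made-up ages
q = quartile_labels_sketch(ages)
```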

WIP Outline:

Interface

  • select covariates (artifact -> list of covariates)

Core

  • calculate propensity score
    • fit scores with logit (artifact -> trained model object)
      • fit with sampling from majority class
      • fit n models (when more than one majority-class sample is drawn)
    • predict score with trained model (artifact -> scores per instance table)
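
One possible reading of "sampling from majority class" is random undersampling: keep every row of the smaller group and draw an equally sized random subset of the larger group, repeating the draw n times so that one model can be fit per sample. The sketch below shows only the index selection under that assumption; the function names are hypothetical and the treatment vector is made up.

```julia
using Random

# Return row indices for a balanced sample: all minority-class rows plus an
# equally sized random draw (without replacement) from the majority class.
function undersample_indices_sketch(t::AbstractVector; rng = Random.default_rng())
    treated = findall(==(1), t)
    control = findall(==(0), t)
    minority, majority =
        length(treated) <= length(control) ? (treated, control) : (control, treated)
    sampled = shuffle(rng, majority)[1:length(minority)]
    return sort(vcat(minority, sampled))
end

# Draw n balanced samples; one model could then be fit on each.
balanced_samples_sketch(t, n; rng = Random.default_rng()) =
    [undersample_indices_sketch(t; rng = rng) for _ in 1:n]

t = [1, 0, 0, 0, 1, 0, 0, 0]   # made-up treatment indicator
samples = balanced_samples_sketch(t, 3; rng = MersenneTwister(1))
```

Scores from the n fitted models could then be combined, for example by averaging, although the outline does not specify how.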

Stratification/matching

  • inspect covariates by propensity scores (quartiles)
  • find matches by propensity score (if any exist)
    • methods: greedy/random, nearest (knn), threshold (based on distance)
    • (artifact -> matched/sampled training instances table)
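
To make the matching options concrete, here is a minimal sketch that combines two of the listed ideas: greedy nearest-neighbor matching, with a distance threshold (often called a caliper). Each treated row, in order, takes the closest unused control if it lies within the threshold. The function name, the caliper keyword, and the scores are all hypothetical; this is not the package's matching code.

```julia
# Greedy 1:1 matching without replacement: each treated unit, in order, takes
# the closest still-available control whose score is within `caliper`.
function greedy_match_sketch(scores::AbstractVector{<:Real}, t::AbstractVector;
                             caliper::Real = 0.1)
    treated = findall(==(1), t)
    available = findall(==(0), t)          # control indices not yet matched
    pairs = Tuple{Int,Int}[]
    for i in treated
        isempty(available) && break
        d, pos = findmin(abs.(scores[available] .- scores[i]))
        if d <= caliper
            push!(pairs, (i, available[pos]))
            deleteat!(available, pos)      # each control is used at most once
        end
    end
    return pairs
end

scores = [0.80, 0.75, 0.30, 0.35, 0.78, 0.10]   # made-up propensity scores
t = [1, 0, 1, 0, 0, 0]
pairs = greedy_match_sketch(scores, t)           # (treated index, control index) pairs
```

Greedy matching depends on the order in which treated rows are visited, so shuffling that order first gives the "random" variant mentioned in the outline.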

Propensity Score Weighting

  • TBD

Analysis

  • plot scores
  • tests

Resources