/R4Econ

R Codes from Various Projects using Panel Data

This is a work-in-progress website of support files for panel data statistics and econometric analysis, maintained by Fan. The materials are gathered from various projects that use R. Where possible, only base R and tidyverse packages are used, to reduce dependencies. The goal of this repository is to make code written for various projects easier to find and reuse.

R files are linked below by section. Functions are stored in the corresponding .R files. To use them, clone the repository and then source the preamble.R file. Some files also have examples and instructions written as Jupyter notebooks and rendered as HTML files. See here for GitHub setup.

Bullet points show which base R, tidyverse or other functions/commands are used to achieve various objectives.

Please contact FanWangEcon for issues or problems.

1. Summary Statistics

1.1 Tabulation and Counting

  1. Tabulation Categorical as Matrix: ipynb | R | html | pdf
    • Many-Category Categorical Variable, Tabulation shown as Matrix.
    • core: group_by + summarise(freq = n()) + mutate + min(ceiling(sqrt(count))) + substring + dim/reshape
  2. By Groups, Count Variables Observations: ipynb | R | html | pdf
    • By Groups, Count non-NA observations of All Variables.
    • core: group_by + summarise_if(is.numeric, funs(sum(is.na(.)==0)))
  3. By Groups, Count Unique Individuals: ipynb | R | html | pdf
    • By Groups, Count Unique Individuals and non-NA observations of other Variables.
    • core: group_by + mutate_if + mutate + n_distinct + slice(1L)
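
The by-group counting pattern listed above can be sketched as follows. This is a minimal illustration, not the repository's own function: the data frame and its column names (id, group, hgt, wgt) are hypothetical, and across() (dplyr 1.0.0 or later) is used in place of the summarise_if/funs() style named in the bullets, since funs() is deprecated in current dplyr.

```r
library(dplyr)

# Hypothetical panel: individual id, a grouping variable, two outcomes with NAs
df <- tibble(
  id    = c(1, 1, 2, 2, 3, 3),
  group = c("a", "a", "a", "a", "b", "b"),
  hgt   = c(50, NA, 52, 55, NA, NA),
  wgt   = c(3.1, 3.5, NA, 4.0, 3.3, 3.9)
)

# By group: number of unique individuals and non-NA counts of each outcome
counts <- df %>%
  group_by(group) %>%
  summarise(
    n_id = n_distinct(id),
    across(c(hgt, wgt), ~ sum(!is.na(.x)), .names = "{.col}_n"),
    .groups = "drop"
  )

print(counts)
```

Each row of counts is one group; n_id counts individuals once even when they contribute several observations.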

1.2 Averaging

  1. All Variables Summary Stats: ipynb | R | html | pdf
    • All Variables: N + NAcount + Mean + SD + Percentiles.
    • core: summarise_if(is.numeric) + gather + separate + spread + select
  2. By Groups, One Variable All Statistics: ipynb | R | html | pdf
    • Pick stats, overall, and by multiple groups, stats as matrix or wide row with name=(ctsvar + catevar + catelabel).
    • core: group_by + summarize_at(, funs()) + rename(!!var := !!sym(var)) + mutate(!!var := paste0(var,'str',!!!syms(vars))) + gather + unite + spread(varcates, value)
  3. By Groups, Multiple Variables Mean + SD + N: ipynb | R | html | pdf
    • By Groups, All Numeric Variables Mean + SD + N, groups = rows, variables = columns
    • core: group_by + summarise_if(is.numeric, fun) + gather + separate + spread + mutate + select + spread + unite
  4. By Groups, Multiple Variables Mean + SD + Percentiles: ipynb | R | html | pdf
    • By Groups, All Numeric Variables Mean + SD + Percentiles, groups = row-groups, variables = rows
    • core: summarise_if(is.numeric) + gather + separate + spread + select
  5. By within Individual Groups Variables, Averages: ipynb | R | html | pdf
    • By Multiple within Individual Groups Variables; Averages for All Numeric Variables within All Groups of All Group Variables; Long to Wide to very Wide.
    • core: gather + group_by + summarise_if(is.numeric, funs(mean(., na.rm = TRUE))) + mutate(all_m_cate = paste0(variable, '_c', value)) + gather + unite + spread (note: gather twice, spread at end)
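
As a rough sketch of the by-group averaging entries above, the snippet below computes mean, SD and non-NA N for every numeric variable within each group, with groups as rows and variable-statistic pairs as columns. The data and the column-naming scheme are assumptions made for illustration, and across() stands in for the summarise_if/funs() calls listed in the bullets.

```r
library(dplyr)

# Hypothetical data: one grouping variable and two numeric variables
df <- tibble(
  group = c("a", "a", "a", "b", "b"),
  x     = c(1, 2, 3, 10, 20),
  y     = c(2, 4, NA, 1, 3)
)

# One row per group; columns named <variable>_<statistic>
stats <- df %>%
  group_by(group) %>%
  summarise(
    across(
      where(is.numeric),
      list(
        mean = ~ mean(.x, na.rm = TRUE),
        sd   = ~ sd(.x, na.rm = TRUE),
        n    = ~ sum(!is.na(.x))
      ),
      .names = "{.col}_{.fn}"
    ),
    .groups = "drop"
  )

print(stats)
```

The wide result can be reshaped further (for example with tidyr) if variables are wanted as rows instead of columns.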

2. Data/Variable Generation

  1. Quantiles from Multiple Variables: ipynb | R | html | pdf
    • Dataframe of Variables' Quantiles by Panel Groups; Quantile Categorical Variables for Panel within Group Observations; Quantile cut variable suffix and quantile labeling; Joint Quantile Categorical Variable with Linear Index.
    • core: group_by + slice(1L) + lapply(enframe(quantiles())) + reduce(full_join) + mutate_at(funs(q=f_cut(.,cut))) + levels() + rename_at + unlist(lapply) + mutate(!!var.qjnt.grp.idx := group_indices(., !!!syms(vars.quantile.cut.all)))
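
The quantile-category idea above can be illustrated in base R. This is only a sketch under assumptions: quantile_cut is a hypothetical helper (it is not the repository's f_cut), the vectors are made up, and interaction() is swapped in for group_indices() to build the joint index.

```r
# Hypothetical helper: cut a variable at its own quantiles.
# Note: cut() fails if ties make two quantile breaks identical.
quantile_cut <- function(x, probs) {
  breaks <- quantile(x, probs = probs, na.rm = TRUE)
  cut(x, breaks = breaks, include.lowest = TRUE,
      labels = paste0("q", seq_len(length(probs) - 1)))
}

# Made-up data: two continuous variables
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y <- c(10, 9, 8, 7, 6, 5, 4, 3, 2, 1)

x_q <- quantile_cut(x, probs = c(0, 0.25, 0.5, 0.75, 1))  # quartiles of x
y_q <- quantile_cut(y, probs = c(0, 0.5, 1))              # halves of y

# Joint quantile category as a single integer index over observed combinations
joint_idx <- as.integer(interaction(x_q, y_q, drop = TRUE))

print(table(x_q))
print(joint_idx)
```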

3. Linear Regressions

  1. IV/OLS Regression: ipynb | R | html | pdf
    • IV/OLS Regression store all Coefficients and Diagnostics as Dataframe Row.
    • core: library(AER) + ivreg(as.formula, diagnostics = TRUE) + gather + drop_na + unite
  2. M Outcomes and N RHS Alternatives: ipynb | R | html | pdf
    • There are M outcome variables and N alternative explanatory variables. Regress all M outcome variables on N endogenous/independent right hand side variables one by one, with controls and/or IVs, collect coefficients.
    • core: bind_rows(lapply(listx, function(x)(bind_rows(lapply(listy, regf.iv))))) + select/starts_with/ends_with + reduce(full_join)
  3. Regression Decomposition: ipynb | R | html | pdf
    • Post multiple regressions, fraction of outcome variables' variances explained by multiple subsets of right hand side variables.
    • core: gather + group_by(variable) + mutate_at(vars, funs(mean = mean(.))) + rowSums(matmat) + mutate_if(is.numeric, funs(frac = (./value_var)))
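
The "M outcomes and N right-hand-side alternatives" loop can be sketched in base R as below. To keep the example free of extra packages it uses lm() on the built-in mtcars data instead of AER::ivreg(), so it shows only the OLS case; run_one, the variable choices and the single control are hypothetical and are not the repository's regf.iv.

```r
# Hypothetical setup on the built-in mtcars data
outcomes <- c("mpg", "qsec")   # M outcome variables
rhs_vars <- c("hp", "wt")      # N alternative right-hand-side variables
controls <- "cyl"              # control included in every regression

# Run one regression and return the coefficient of interest as a one-row data frame
run_one <- function(y, x) {
  fml <- as.formula(paste(y, "~", x, "+", controls))
  fit <- lm(fml, data = mtcars)
  est <- summary(fit)$coefficients[x, ]
  data.frame(
    outcome   = y,
    rhs       = x,
    estimate  = est[["Estimate"]],
    std_error = est[["Std. Error"]],
    r_squared = summary(fit)$r.squared
  )
}

# Loop over all outcome and right-hand-side pairs, stacking the rows
results <- do.call(rbind, lapply(outcomes, function(y) {
  do.call(rbind, lapply(rhs_vars, function(x) run_one(y, x)))
}))

print(results)
```

Storing each regression as a row makes it straightforward to filter or reshape the collected coefficients afterwards.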

4. Optimization

4.1 Planner's Problem

  1. CES Objective Function: ipynb | R | html | pdf
    • Constant Elasticity of Substitution Planner Welfare Objective Function.
    • core: prod/mean/pow, logspace, geom_bar+identity+dodge
  2. CES Subsidy Optimization Over Panel Groups: ipynb | R | html | pdf
    • Constant Elasticity of Substitution Planner Welfare Subsidies Optimizer Over Quantile/Individual Groups.
    • core: optim(x, obj, func.params), do.call(func_str, func.params); setNames+list+append
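
A small sketch of the planner's problem with optim() and do.call() follows. The functional form is one common CES aggregator and is an assumption, as are the linear outcome technology, the parameter values and all function names; the repository's own objective may be parameterized differently.

```r
# Assumed CES aggregator: W = (sum_i beta_i * y_i^rho)^(1 / rho)
ces_welfare <- function(y, beta, rho) {
  sum(beta * y^rho)^(1 / rho)
}

# Negative welfare as a function of an unconstrained parameter x.
# plogis(x) maps x to a share in (0, 1) of the subsidy budget for individual 1.
neg_welfare <- function(x, base, budget, beta, rho) {
  s <- plogis(x)
  y <- base + budget * c(s, 1 - s)   # hypothetical linear outcome technology
  -ces_welfare(y, beta, rho)
}

# Extra parameters are kept in a named list and passed through optim via do.call
params <- list(base = c(1, 1.5), budget = 1, beta = c(0.5, 0.5), rho = -1)
fit <- do.call(optim, c(list(par = 0, fn = neg_welfare, method = "BFGS"), params))

s_opt <- plogis(fit$par)
print(s_opt)
```

With equal weights and rho below one, the planner in this toy setup equalizes outcomes, so the optimal share for individual 1 should be close to 0.75.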

4.2 Optimization Support

  1. Constrained Share Parameters to Unconstrained Parameters: ipynb | R | html | pdf
    • Constrained: a + b + c = Z, a >= 0, b >= 0, c >= 0; Unconstrained maximands of a and b for optimization.
    • core: f - f/(1+exp(x)), while, runif + qexp + qnorm/dnorm
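
The mapping f - f/(1+exp(x)) listed above sends any real x into the interval (0, f). One way to use it for three shares that sum to Z is to apply it sequentially, as sketched here; the sequential construction and the function name are assumptions for illustration.

```r
# Map two unconstrained reals x[1], x[2] to shares a, b, c with a + b + c = Z
shares_from_unconstrained <- function(x, Z) {
  a   <- Z - Z / (1 + exp(x[1]))        # a in (0, Z)
  rem <- Z - a                          # what is left after a
  b   <- rem - rem / (1 + exp(x[2]))    # b in (0, Z - a)
  c(a = a, b = b, c = Z - a - b)        # c takes the remainder
}

shares <- shares_from_unconstrained(c(0, 0), Z = 10)
print(shares)
# a = 5, b = 2.5, c = 2.5
```

An optimizer can then search freely over x while the implied shares always satisfy the constraints.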

5. Graphing

  1. Line Plot with Two Categories, as Color and Subplot: ipynb | R | html | pdf
    • Optimal choices/value-function along states. Asset as X-axis, shocks as color, potentially another state as subplots.
    • core: unique + mutate(var := as.factor(var)) + ggplot + facet_wrap + geom_line + geom_point + labs + theme(axis.text.x = element_text(angle = 90, hjust = 1))
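
The plot layout described above (asset on the x-axis, shocks as colors, another state as subplots) might look like the sketch below. The data are simulated and the variable names and the choice rule are hypothetical.

```r
library(ggplot2)

# Simulated policy-function data: one row per asset, shock and state combination
df <- expand.grid(
  asset = seq(0, 10, by = 2),
  shock = c(-1, 0, 1),
  state = c("low", "high")
)
df$choice <- with(df, 0.5 * asset + shock + ifelse(state == "high", 1, 0))
df$shock  <- as.factor(df$shock)   # discrete colors rather than a gradient

p <- ggplot(df, aes(x = asset, y = choice, colour = shock)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ state) +
  labs(x = "Asset", y = "Optimal choice", colour = "Shock") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

print(p)
```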

6. Tools

  1. List of List to Dataframe: ipynb | R | html | pdf
    • Results stored as nested named list (with different keys in sub-lists).
    • core: as.data.frame + gather + separate(sep = '\\.', extra = 'merge') + spread + column_to_rownames
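
For a nested named list whose sub-lists have different keys, a base R alternative to the gather/separate/spread route above is sketched here. The list contents and key names are made up, and the approach assumes every stored value is a single number.

```r
# Hypothetical results: each sub-list holds a different set of named scalars
results <- list(
  model_a = list(beta = 0.9, rho = 0.5),
  model_b = list(beta = 0.8, sigma = 2)
)

# Union of all keys across sub-lists
keys <- unique(unlist(lapply(results, names)))

# One numeric row per sub-list; keys missing from a sub-list become NA
rows <- lapply(results, function(sub) {
  vapply(keys, function(k) {
    if (is.null(sub[[k]])) NA_real_ else sub[[k]]
  }, numeric(1))
})

# Stack rows: sub-list names become row names, keys become columns
df <- as.data.frame(do.call(rbind, rows))
print(df)
```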

7. Support

  1. Installations: ipynb | R | html | pdf
    • Conda R Package Installations.
  2. Controls: ipynb | R | html | pdf
    • Graph Sizing, Warnings, Table Col/Row Max Display, Timer.