This R script implements Shapley value decomposition. It is based on the Shapley value decomposition formulation presented by Anthony F. Shorrocks (2013).
Anthony F. Shorrocks. Decomposition procedures for distributional analysis: a unified framework based on the Shapley value. J Econ Inequal (2013) 11:99–126. DOI: 10.1007/s10888-011-9214-z
- Shapley Value Decomposition runs in two modes:
  - Un-parallelized mode uses the `ShapleyValue.Decomposition()` function. This is suitable for a small dataset.
  - Parallelized mode uses the `ShapleyValue.Decomposition.parallel()` function. This is most suitable for a large dataset. It depends on the `foreach` and `doParallel` R packages.
- See `run_demo_script.R` for details.
# UN-PARALLELIZED MODE ---
source("ShapleyValueDecomposition.R")
ShapleyValue.Decomposition(dat)
# PARALLELIZED MODE ---
source("ShapleyValueDecomposition_parallel.R")
ShapleyValue.Decomposition.parallel(dat, n_cores=4)
- `dat`: R data frame containing the input data, with `Group` and `Observation` columns. See the input file format below for details.
- `n_cores`: The number of cores to use for parallel execution.
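Before calling the parallelized mode, it can help to confirm that the two packages it depends on are installed and to pick a value for `n_cores`. The following is a minimal sketch, not part of the repository scripts. The variable names (`required_pkgs`, `missing_pkgs`, `available`) are illustrative, and leaving one core free is a convention assumed here rather than a requirement of the script.

```r
# Packages the parallelized mode depends on.
required_pkgs <- c("foreach", "doParallel")

# requireNamespace() returns FALSE (instead of raising an error) when a package is absent.
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]
if (length(missing_pkgs) > 0) {
  message("Install before using parallel mode: ", paste(missing_pkgs, collapse = ", "))
}

# detectCores() can return NA on some platforms, so fall back to a single core.
# Assumption: keep one core free for the rest of the system.
available <- parallel::detectCores()
n_cores <- if (is.na(available)) 1L else max(1L, available - 1L)
```

The resulting value could then be passed as `ShapleyValue.Decomposition.parallel(dat, n_cores = n_cores)`.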
- The function `ShapleyValue.Decomposition` returns an R list with the following elements:
  - `G`: Gini index (global)
  - `G_k`: Gini index per group
  - `W`: Within-group inequality decomposition
  - `W_k`: Within-group inequality per group
  - `B`: Between-group inequality decomposition
  - `O`: Overlap effect
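To make the `G` and `G_k` entries concrete, here is a minimal sketch that computes a global and a per-group Gini index for the sample observations listed at the end of this README. It is an independent illustration, not the repository's code. The helper `gini()` is hypothetical and uses the population normalization (mean absolute difference over all pairs, divided by twice the mean), so the script's own values may differ if it normalizes differently.

```r
# Gini index: mean absolute difference over all ordered pairs, divided by twice the mean.
# Assumption: population normalization (n^2); the repository script may use another.
gini <- function(x) {
  n <- length(x)
  sum(abs(outer(x, x, "-"))) / (2 * n^2 * mean(x))
}

# Sample observations copied from the example data in this README.
dat <- data.frame(
  Group = c("A", "A", "B", "C", "B", "A", "C", "C", "C", "B"),
  Observation = c(2, 6, 10, 18, 20, 25, 30, 50, 55, 84)
)

G <- gini(dat$Observation)                        # global Gini index
G_k <- tapply(dat$Observation, dat$Group, gini)   # one Gini index per group
```

Under this normalization the sample data gives a global index of about 0.441, and per-group indices of about 0.465 (A), 0.433 (B), and 0.214 (C).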
- The input is a tab-separated file with two columns.
  - The first column has the header `Group`. This column contains group labels.
  - The second column has the header `Observation`. This column contains numeric values, each indicating an individual observation or income amount. These values do not need to be sorted.
- See `dummy_observations.tsv` for details.
Group Observation
A 2
A 6
B 10
C 18
B 20
A 25
C 30
C 50
C 55
B 84
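A data frame in this format can be read with base R. The sketch below is one way to do it and is not taken from the repository: it parses an inline copy of the rows above so that it runs without the file on disk, and the name `tsv_text` is illustrative.

```r
# Inline copy of the example rows so the sketch runs without the file on disk.
tsv_text <- "Group\tObservation\nA\t2\nA\t6\nB\t10\nC\t18\nB\t20\nA\t25\nC\t30\nC\t50\nC\t55\nB\t84"

# read.delim() defaults to tab-separated input with a header row.
dat <- read.delim(text = tsv_text, stringsAsFactors = FALSE)

# Basic checks against the file format described above.
stopifnot(
  identical(names(dat), c("Group", "Observation")),
  is.numeric(dat$Observation),
  !anyNA(dat$Observation)
)
```

For the actual file, the analogous call would be `read.delim("dummy_observations.tsv")`, assuming the file is in the working directory.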