/practical-statistics-for-data-scientists

Code repository for O'Reilly book

Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0)


Code repository

Practical Statistics for Data Scientists:

50+ Essential Concepts Using R and Python

by Peter Bruce, Andrew Bruce, and Peter Gedeck

R

Run the following commands in R to install all required packages:

if (!require(vioplot)) install.packages('vioplot')
if (!require(corrplot)) install.packages('corrplot')
if (!require(gmodels)) install.packages('gmodels')
if (!require(matrixStats)) install.packages('matrixStats')

if (!require(lmPerm)) install.packages('lmPerm')
if (!require(pwr)) install.packages('pwr')

if (!require(FNN)) install.packages('FNN')
if (!require(klaR)) install.packages('klaR')
if (!require(DMwR)) install.packages('DMwR')

if (!require(xgboost)) install.packages('xgboost')

if (!require(ellipse)) install.packages('ellipse')
if (!require(mclust)) install.packages('mclust')
if (!require(ca)) install.packages('ca')

Python

We recommend using a conda environment to run the Python code:

conda create -n sfds python
conda activate sfds

pip install jupyter
pip install pandas
pip install matplotlib
pip install scipy
pip install statsmodels
pip install wquantiles
pip install seaborn
pip install scikit-learn
pip install pygam
pip install dmba
pip install pydotplus

pip install imbalanced-learn
pip install prince

conda install --yes -c conda-forge xgboost
conda install --yes graphviz
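Once the installation has finished, a quick check from inside the activated environment can confirm that everything is in place. The following is a minimal sketch and not part of the repository: the helper names `missing_modules` and `graphviz_available` are hypothetical, and the list of import names is an assumption derived from the packages above (some import names differ from the install name, for example `scikit-learn` is imported as `sklearn` and `imbalanced-learn` as `imblearn`). The Graphviz check assumes the conda package puts the `dot` executable on the PATH.

```python
import importlib.util
import shutil

# Import names assumed for the pip/conda packages listed above.
# Some differ from the install name (scikit-learn -> sklearn,
# imbalanced-learn -> imblearn); jupyter is checked via jupyter_core.
REQUIRED_MODULES = [
    "jupyter_core",
    "pandas",
    "matplotlib",
    "scipy",
    "statsmodels",
    "wquantiles",
    "seaborn",
    "sklearn",
    "pygam",
    "dmba",
    "pydotplus",
    "imblearn",
    "prince",
    "xgboost",
]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def graphviz_available():
    """Return True if the Graphviz 'dot' executable is found on the PATH."""
    return shutil.which("dot") is not None


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing Python modules:", ", ".join(missing) if missing else "none")
    print("Graphviz 'dot' on PATH:", graphviz_available())
```

The check only locates each module without importing it, so it runs quickly and does not fail on the first missing package. Any name it reports can be installed with the corresponding command above.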

See also