
Data Analytics, Statistics, Visualization (R / Python)

Primary language: HTML. License: MIT.

Business-Analytics


This is one of two continuously updated repositories that document a personal journey of learning data science topics. The content is currently split across the two repositories as described in the table below.

Repository | Documentation Focus
machine-learning | Machine learning, algorithms and programming (mainly in Python)
Business-Analytics | All other data analytics topics, e.g. concepts, statistics, visualizations (R, Python)

Within each section, documents are listed in reverse chronological order of their start date, i.e. the date when the first notebook in that folder was created. If a notebook has been updated since then, its actual date appears at the top of the notebook. Each document is independent of the others unless specified otherwise.

Documentation Listings

finding_groups : 2015.11.10

Examples of how hierarchical clustering algorithms can be used to find similar patterns in the supply chain and human resources business fields.

Statistics

frequentist_statistics : 2016.07.27

  • Notes on frequentist statistical inference (t-test, ANOVA, proportion test, chi-square test, power). [Rmarkdown]
  • Bonferroni correction for multiple hypothesis testing. [nbviewer]
  • Spearman rank correlation. [nbviewer]
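As a quick illustration of the Bonferroni correction mentioned above, the following is a minimal sketch that is not taken from the linked notebook. The function name `bonferroni` and the example p-values are made up for illustration. Each p-value is multiplied by the number of tests (capped at 1) and compared against the significance level.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and reject/keep decisions.

    Hypothetical helper for illustration: with m tests, each raw
    p-value is multiplied by m (capped at 1.0) before comparing it
    to alpha, which controls the family-wise error rate.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, reject


# made-up p-values from four hypothetical tests
p_values = [0.001, 0.02, 0.04, 0.30]
adjusted, reject = bonferroni(p_values)
print(adjusted)
print(reject)
```

Only the first test stays significant here, even though three of the raw p-values are below 0.05, which is the conservative behavior the correction is known for.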

bandits : 2016.06.02

Multi-armed bandit algorithms, a possible alternative to A/B testing for short-term tests or extremely long tests. For those who are not familiar with Bayesian statistics, it is recommended to go through the first two documents in the bayesian_statistics folder first.

  • Epsilon Greedy, Softmax, Upper Confidence Bound and Thompson Sampling from scratch. [nbviewer]
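To give a flavor of what "from scratch" can look like, here is a minimal epsilon-greedy sketch using only the standard library. It is an illustrative assumption rather than the notebook's code: the function name, the Bernoulli reward model and the example conversion rates are all hypothetical.

```python
import random


def epsilon_greedy(true_rates, n_rounds=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit on Bernoulli arms.

    true_rates are the (normally unknown) success probabilities of
    each arm; they are only used here to simulate rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    counts = [0] * n_arms    # number of pulls per arm
    values = [0.0] * n_arms  # running mean reward per arm

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # explore: pick an arm uniformly at random
            arm = rng.randrange(n_arms)
        else:
            # exploit: pick the arm with the highest estimated reward
            arm = max(range(n_arms), key=lambda a: values[a])

        reward = 1 if rng.random() < true_rates[arm] else 0
        counts[arm] += 1
        # incremental update of the running mean
        values[arm] += (reward - values[arm]) / counts[arm]

    return counts, values


counts, values = epsilon_greedy([0.05, 0.10, 0.30])
print(counts)
print(values)
```

A fixed `epsilon` keeps exploring forever; the other algorithms in the list (Softmax, Upper Confidence Bound, Thompson Sampling) differ mainly in how they trade off exploration against exploitation.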

ab_tests : 2016.06.01

Includes Bayesian and frequentist A/B testing. For those who are not familiar with Bayesian statistics, it is recommended to go through all the documents in the bayesian_statistics folder first.

  • Bayesian A/B testing, beta hierarchical model with pymc. [nbviewer]
  • Frequentist A/B testing. [Rmarkdown]
  • Template and caveats for the A/B testing process (applicable for both types of testing). [nbviewer]
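The notebook above uses a hierarchical beta model fitted with pymc. As a much simpler stand-in, the sketch below assumes independent Beta(1, 1) priors on each variant's conversion rate and uses the conjugate Beta posterior to estimate the probability that variant B beats variant A. The function name and the conversion counts are hypothetical.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A).

    Assumes a Beta(1, 1) prior per variant, so the posterior is
    Beta(1 + conversions, 1 + non-conversions). This is NOT the
    hierarchical model from the notebook, only a simplified sketch.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_draws):
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / n_draws


# made-up data: 100/1000 conversions for A, 150/1000 for B
prob = prob_b_beats_a(100, 1000, 150, 1000)
print(prob)
```

A hierarchical model would additionally share information across variants through common hyperparameters, which matters most when many variants are tested at once.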

bayesian_statistics : 2016.04.21

If you are new to Bayesian statistics, read the documents in the order listed.

  • Bayes' theorem basics. [nbviewer]
  • Beta distribution, empirical Bayes estimation, credible intervals and false discovery rate. [Rmarkdown]
  • Markov Chain Monte Carlo (MCMC) - Metropolis-Hastings algorithm. [Rmarkdown]
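For readers who want a feel for the Metropolis-Hastings algorithm before opening the document, here is a minimal random-walk sketch. The target (a standard normal, given as an unnormalized log density), the step size and the burn-in length are assumptions chosen for illustration and are not taken from the linked Rmarkdown.

```python
import math
import random


def metropolis_hastings(log_target, n_samples=20000, start=0.0, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings for a one-dimensional target.

    log_target returns the log of an unnormalized density. The
    Gaussian proposal is symmetric, so the acceptance ratio reduces
    to the ratio of target densities.
    """
    rng = random.Random(seed)
    current = start
    current_logp = log_target(current)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_logp = log_target(proposal)
        log_ratio = proposal_logp - current_logp
        # accept with probability min(1, target(proposal) / target(current))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current, current_logp = proposal, proposal_logp
        samples.append(current)
    return samples


def log_std_normal(x):
    # log density of N(0, 1) up to an additive constant
    return -0.5 * x * x


samples = metropolis_hastings(log_std_normal)
burned = samples[2000:]  # drop an (assumed) burn-in period
mean = sum(burned) / len(burned)
print(mean)
```

Because only the ratio of densities is needed, the normalizing constant of the target never has to be computed, which is what makes the method practical for Bayesian posteriors.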

General

Articles

  • Continuously updated non-technical articles. [Rmarkdown]
  • 2017.01.17 | Data Science advice (mentality, problem solving and presentation template). [Rmarkdown]
  • 2016.07.02 | Some ways of addressing data hygiene. [Rmarkdown]

Visualizations

  • 2016.07.09 | Production-ready calendar heatmap. [Rmarkdown]
  • 2016.05.12 | Production-ready scatter plot. [Rmarkdown]
  • 2016.05.12 | Production-ready faceted bar plot. [Rmarkdown]

R

  • 2017.01.13 | Unit testing and setting up a basic R package. [Rmarkdown]
  • 2017.01.13 | Efficient (parallel) looping in R. [Rmarkdown]
  • 2017.01.13 | Rmarkdown quickstart. [Rmarkdown]
  • 2017.01.13 | data.table joining and other tricks. [Rmarkdown]