/DataScienceProjects

A code repository for projects and tutorials in R and Python covering a variety of topics in data visualization, statistics, sports analytics, and general applications of probability theory.

Primary language: Jupyter Notebook

Overview

In this repository, you will find the source code for various projects that I have completed or that are still works in progress. Most of the projects are accompanied by a Medium blog post at tuannguyen-doan.medium.com. I have published almost exclusively in the Towards Data Science publication through Medium's Partner Program, so please check out these articles as a way to support me and my future projects. Alternatively, you can find my blog posts on my personal website here.

My interests lie at the intersection of statistical techniques, data visualization, and sports (especially football). All the code is written in Python or R. I don't have a strong preference for either language, and I make no concerted effort to stick to a specific language or platform. The choice mostly depends on how well each language supports the functionality a project needs (for example, scraping in Python and data processing with dplyr piping in R).

I. Statistical application:

The statistics of modern football:

A collection of projects that explore the intricate statistical aspects of the Beautiful Game.

Statistical theory and its application:

II. External Collaborations:

Published papers:

III. General tutorials with Python and R:

Data visualization:

Machine Learning practicals: