Learn Data Science Through Sports

Journey through the world of data science, from beginner to advanced concepts and methodologies. In this repository, I will walk you through data science skills, hints, tips, and tricks, all with sports-relevant examples!

I will provide walkthroughs, notebooks, and code snippets weekly, using some of my favorite examples from the sports world. We will go over concepts such as Python fundamentals, SQL fundamentals, data visualization, data cleaning, data mining, algorithm development, automation, machine learning, AI, and more!

Love sports? Want to get into the world of data science? You've come to the right place!

Please connect with me on LinkedIn: https://www.linkedin.com/in/anthony-vessicchio/

And check out my personal website: https://ant-vessicchio.github.io/

Before We Begin

The following curriculum is hosted in a Jupyter notebook environment. If you are not familiar with Jupyter or do not have it installed, follow this tutorial: https://www.dataquest.io/blog/jupyter-notebook-tutorial/

All of my notebooks in the curriculum are .ipynb files (Jupyter notebook files). Once you have this environment set up, you can easily follow along with the entire curriculum!

The Curriculum

(Read through each of my descriptions and run each cell yourself. Try to make some modifications to each cell and enter some of your own content ideas!)

Rookie 1.0 (Your first steps in Python)

  1. Introduction to Python

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Introduction_To_Python.ipynb

  2. Variables and Naming

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Variables%20and%20Naming.ipynb

  3. Data Types

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Data_Types.ipynb

  4. Data Types Advanced

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Data_Types_Advanced.ipynb

Rookie 2.0 (Let's start with some logic!)

  1. If/Else Statements

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/If_Else.ipynb

  2. While Loops

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/While_Loops.ipynb

  3. For Loops

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/For_Loops.ipynb

  4. Python Functions

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Python_Functions.ipynb

  5. String Formatting

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/String_Formatting.ipynb
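
To give a flavor of how the Rookie 2.0 topics fit together, here is a minimal sketch (not taken from the notebooks) that combines an if/else chain, a function, a for loop, a while loop, and f-string formatting. The prospect names, 40-yard dash times, and cutoffs are all made up for illustration.

```python
def classify_forty(seconds):
    """Label a 40-yard dash time using arbitrary, made-up cutoffs."""
    if seconds < 4.4:
        return "elite"
    elif seconds < 4.7:
        return "solid"
    else:
        return "needs work"


# Hypothetical prospects and times, invented for this example.
times = {"Prospect A": 4.35, "Prospect B": 4.58, "Prospect C": 4.91}

# For loop + string formatting: build one report line per prospect.
report = []
for name, seconds in times.items():
    report.append(f"{name}: {seconds:.2f}s ({classify_forty(seconds)})")

# While loop: step through repeated attempts until one is under 4.50s.
attempts = [4.62, 4.55, 4.48, 4.41]
i = 0
while i < len(attempts) and attempts[i] >= 4.5:
    i += 1

for line in report:
    print(line)
print(f"First attempt under 4.50s was attempt number {i + 1}")
```

Try changing the cutoffs or adding your own prospects to see how the labels change.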

Rookie Challenge 1 (NFL Combine)

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Beginner_Challenge.ipynb

Rookie 3.0 (Let's "Look" at Some Data using Matplotlib)

  1. Plotting

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Plotting_in_Matplotlib.ipynb

  2. Scatterplots and Bar Graphs

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Scatterplots_and_Bar_Graphs.ipynb

  3. Histograms and Pie Charts

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Histograms_and_Pie_Charts.ipynb

  4. Rookie Final Challenge (Telling a story about a baseball roster)

https://github.com/ant-vessicchio/learn_data_science_through_sports/blob/main/Rookie_Final_Challenge.ipynb

Congrats on completing the Rookie Section!

By now, you can see how Python can be a great foundational tool in your sports data science journey. The next portion of the curriculum is heavily "learn by example". I feel this is not only an engaging way to learn, but also the best way to get the creative juices flowing!

I will be introducing new concepts, modules, and methodologies in the following examples without specifically dedicating a learning section to them. With repetition and "doing", these concepts will naturally stick! You will also begin to develop an analytical mindset and become familiar with some very common data science processes and workflows. I will also include a list of skills you will learn under each project name!

Novice Section (Let's begin some real-life projects)

(MLB Databank from 1871-2015)

The following Novice Projects are centered around a databank composed of several dataset files containing information on baseball players, teams, and games from 1871 to 2015. The first project serves as an "exploratory" look into the data and teaches you some valuable workflow skills for dealing with a dataset you've never seen before. As we move further into this section, the projects will become more difficult (but also more useful, applicable, and creative!)

Novice Project 1.0 (Exploration)

https://github.com/ant-vessicchio/learn_data_science_through_sports/tree/main/Baseball_Databank_Exploration

Skills Used: Ingesting datasets from Kaggle, reading CSV files into DataFrames, exploring DataFrames, .loc and .iloc (accessing DataFrame elements), extracting data from specific columns, merging tables, intro to cleaning data, simple visualizations from a cleaned dataset.
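
As a rough preview of several of these skills, here is a small pandas sketch. It is not the project notebook itself: the two tables, their column names (`player_id`, `name`, `year`, `home_runs`), and every value are invented stand-ins for the real databank files, and the CSV text is read from memory so the example runs without downloading anything.

```python
import io

import pandas as pd

# Made-up CSV text standing in for two databank files (not real statistics).
players_csv = io.StringIO(
    "player_id,name\n"
    "p01,Player A\n"
    "p02,Player B\n"
    "p03,Player C\n"
)
batting_csv = io.StringIO(
    "player_id,year,home_runs\n"
    "p01,1990,12\n"
    "p02,1990,30\n"
    "p02,1991,25\n"
    "p04,1991,7\n"
)

# Reading CSV data into DataFrames (a file path works the same way).
players = pd.read_csv(players_csv)
batting = pd.read_csv(batting_csv)

# Exploring: how many rows and columns are there?
n_rows, n_cols = batting.shape

# .loc selects by label or boolean mask; .iloc selects by integer position.
big_seasons = batting.loc[batting["home_runs"] >= 25, ["player_id", "home_runs"]]
first_row = batting.iloc[0]

# Merging tables on a shared key. An inner join keeps only matching ids,
# so p03 (no batting rows) and p04 (no player row) drop out.
merged = players.merge(batting, on="player_id", how="inner")

# Extracting one column and summarizing it per player.
totals = merged.groupby("name")["home_runs"].sum()

print(merged)
print(totals)
```

Switching `how="inner"` to `how="left"` or `how="outer"` is a quick way to see which rows each table is missing, which is often the first step in cleaning.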

Novice Project 2.0 (Was Babe Ruth Really That Good?)