The Buddy Bass Tournament on Williamstown Lake was started in 1985 by twin brothers Bob and Sam Perry. They held their final tournament season in 2020.
In this notebook, I take a look at the dataset from the Buddy Bass Tournament 2020 season. The data comes from dataset/Buddy_Bass_Tournament_2020.xlsx, which I received from my dad, Sam Perry.
via Anaconda
- Install Anaconda if you don't already have it (https://docs.anaconda.com/anaconda/install/index.html)
- Clone this repository from GitHub: https://github.com/istarlet/buddy_bass_tournament/
- Open Jupyter Notebook in Anaconda
- Open the cloned repo and run 'buddy_bass_.ipynb'
- matplotlib
- pandas
Read in data from a local csv, excel file, json, or any other file type.
I read in an Excel file with data from my dad's "Buddy Bass Tournament".
Use built-in pandas or numpy functions to do things like remove 0’s and null values where they don’t belong in your dataset.
- skiprows - I used skiprows when reading in the dataset to skip the title row, so the dataframe starts at the column headings.
- pd.Series(pd.date_range()) - I created a series from date_range that starts on 06/10/2020 and repeats every Wednesday for 15 periods.
- .drop() - I used .drop() to remove the "Big Fish/Year (LBS)" column from the dataframe.
- .fillna() - I used .fillna() to replace all instances of NaN in the dataframe with 0.
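The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the values are made up, an in-memory CSV stands in for the Excel file (pd.read_excel takes the same skiprows argument), the "Date" column name is hypothetical, and the start date assumes the season opened on Wednesday, June 10, 2020.

```python
import io

import pandas as pd

# Stand-in for the Excel sheet: one title row, then the column headings.
# The numbers are invented for illustration; they are not tournament results.
raw = io.StringIO(
    "Buddy Bass Tournament 2020,,,\n"
    "Number of Boats,Total No of Fish Caught,Total Weight in Pounds,Big Fish/Year (LBS)\n"
    "28,40,75.5,\n"
    "31,,82.0,5.2\n"
    "35,52,,\n"
)

# skiprows=1 skips the title row so the headings become the column names.
df = pd.read_csv(raw, skiprows=1)

# One date per tournament: weekly on Wednesdays from the assumed start date.
# The notebook used 15 periods; here the count matches the sample rows.
dates = pd.Series(pd.date_range(start="2020-06-10", periods=len(df), freq="W-WED"))
df.insert(0, "Date", dates)  # "Date" is a hypothetical column name

# Remove the column that is not needed for the analysis.
df = df.drop(columns=["Big Fish/Year (LBS)"])

# Replace every remaining NaN with 0.
df = df.fillna(0)

print(df)
```

Note that .drop() and .fillna() return new dataframes by default, so the results are assigned back to df.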
Use at least 5 different built-in Python functions to find out something about your data.
Do 5 basic calculations with Pandas
- .info() - I used .info() to display information about the columns, including data type and number of missing values, if any
- .shape - I used .shape to return the number of rows and columns in the data
- .describe() - I used .describe() to display summary statistics for each column
- .groupby() + .sum() - I used .groupby() to group the data by "Month" and calculate the sum of "Number of Boats", "Total No of Fish Caught", and "Total Weight in Pounds" for each month
- Boolean filtering - I used buddy_bass_2020[buddy_bass_2020["Number of Boats"] < 30] to find the tournament dates where the number of boats was less than 30
- .median() - I used .median() to find the median number of fish caught
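As a rough sketch of these calculations, the example below runs them on a small made-up dataframe that uses the column names mentioned above. The values are invented for illustration and are not the real tournament results.

```python
import pandas as pd

# Made-up sample in the shape described above; not real tournament data.
buddy_bass_2020 = pd.DataFrame(
    {
        "Month": ["June", "June", "July", "July"],
        "Number of Boats": [28, 31, 35, 29],
        "Total No of Fish Caught": [40, 44, 52, 38],
        "Total Weight in Pounds": [75.5, 82.0, 96.5, 70.0],
    }
)

# Column types and non-null counts, then summary statistics.
buddy_bass_2020.info()
print(buddy_bass_2020.describe())

# Number of rows and columns.
rows, cols = buddy_bass_2020.shape

# Totals per month for the three numeric columns.
monthly_totals = buddy_bass_2020.groupby("Month")[
    ["Number of Boats", "Total No of Fish Caught", "Total Weight in Pounds"]
].sum()

# Tournaments with fewer than 30 boats.
small_fields = buddy_bass_2020[buddy_bass_2020["Number of Boats"] < 30]

# Median number of fish caught per tournament.
median_fish = buddy_bass_2020["Total No of Fish Caught"].median()

print(monthly_totals)
print(small_fields)
print(median_fish)
```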
Make 2 basic plots with matplotlib, seaborn, or any other kind of visualization library that you think looks interesting.
Fig. 1 - Scatter Plot
Fig. 2 - Bar Plot
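A scatter plot and a bar plot like these could be drawn with matplotlib along the following lines. This is only a sketch: the choice of what goes on each axis (boats against fish caught, and total weight per month) is my assumption for illustration, the data is made up, and the output filename is hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for scripts; not needed inside Jupyter

import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data; not real tournament results.
df = pd.DataFrame(
    {
        "Month": ["June", "June", "July", "July"],
        "Number of Boats": [28, 31, 35, 29],
        "Total No of Fish Caught": [40, 44, 52, 38],
        "Total Weight in Pounds": [75.5, 82.0, 96.5, 70.0],
    }
)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Scatter plot: one point per tournament.
ax1.scatter(df["Number of Boats"], df["Total No of Fish Caught"])
ax1.set_xlabel("Number of Boats")
ax1.set_ylabel("Total No of Fish Caught")
ax1.set_title("Fish caught vs. boats")

# Bar plot: total weight per month, keeping the months in calendar order.
monthly_weight = df.groupby("Month", sort=False)["Total Weight in Pounds"].sum()
ax2.bar(monthly_weight.index, monthly_weight.values)
ax2.set_xlabel("Month")
ax2.set_ylabel("Total Weight in Pounds")
ax2.set_title("Total weight by month")

fig.tight_layout()
fig.savefig("buddy_bass_plots.png")  # hypothetical output filename
```

Passing sort=False to .groupby() keeps the months in the order they first appear, rather than sorting them alphabetically.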
Write markdown cells in Jupyter explaining your thought process and code.
Throughout my Jupyter Notebook you will find comments and markdown cells where I explain my thought process and code.