This repo contains the code and report for the EDA and data analysis done during the project.
The following Exploratory Data Analysis tasks were accomplished:
- Finding the maximum number of wins by any team in a given season.
- Which stadium hosted the most IPL matches?
- Which team has won the most matches (by number and by percentage)?
- Which player has won the most Man of the Match (MoM) awards?
- Which team has won the most tosses?
- What are the top 10 greatest victories (by runs and by wickets)?
- Most 50s and 100s scored.
- Made a Plotly scatter plot for comparing any given batsmen.
- Made a scatter plot with mean strike rate per over on the y-axis, over number on the x-axis, color by batsman (top 20-30 batsmen by score), and marker size by number of balls faced (as a rough indicator of experience).
- Made a bi-histogram plot for a given Team1 vs Team2 pair with years on the x-axis, to estimate how the two teams perform against each other over the years. The plot is wrapped in a function so that any two teams can be compared easily.
- Used Tableau Dashboard feature to get some plots for top batsmen and bowlers.
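
Several of the questions above reduce to counting over match records. The sketch below illustrates three of them (top team per season, head-to-head wins by year, and most MoM awards) in plain Python. The records and the field names (`season`, `team1`, `team2`, `winner`, `player_of_match`) are hypothetical placeholders, not the schema actually used in this repo.

```python
from collections import Counter, defaultdict

# Toy match records. Field names and values are assumptions for illustration only.
matches = [
    {"season": 2016, "team1": "A", "team2": "B", "winner": "A", "player_of_match": "P1"},
    {"season": 2016, "team1": "A", "team2": "C", "winner": "A", "player_of_match": "P1"},
    {"season": 2016, "team1": "B", "team2": "C", "winner": "C", "player_of_match": "P3"},
    {"season": 2017, "team1": "A", "team2": "B", "winner": "B", "player_of_match": "P2"},
    {"season": 2017, "team1": "B", "team2": "A", "winner": "B", "player_of_match": "P2"},
    {"season": 2017, "team1": "A", "team2": "B", "winner": "A", "player_of_match": "P1"},
]


def top_team_per_season(rows):
    """Return {season: (team, wins)} for the team with the most wins in each season."""
    wins = defaultdict(Counter)
    for row in rows:
        wins[row["season"]][row["winner"]] += 1
    return {season: counts.most_common(1)[0] for season, counts in sorted(wins.items())}


def head_to_head(rows, team_a, team_b):
    """Return {season: {team_a: wins, team_b: wins}} for matches between the two teams."""
    table = defaultdict(lambda: {team_a: 0, team_b: 0})
    for row in rows:
        same_pair = {row["team1"], row["team2"]} == {team_a, team_b}
        if same_pair and row["winner"] in (team_a, team_b):
            table[row["season"]][row["winner"]] += 1
    return dict(sorted(table.items()))


# Player with the most Man of the Match awards, as a (player, awards) pair.
most_mom = Counter(row["player_of_match"] for row in matches).most_common(1)[0]

print(top_team_per_season(matches))
print(head_to_head(matches, "A", "B"))
print(most_mom)
```

The per-season dictionary returned by `head_to_head` is the kind of data a two-team, year-by-year bar plot would be drawn from; swapping the team arguments is all that is needed to compare a different pair.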
Used different Machine Learning models for a classification task (predicting the winner of a match) and a regression task (final score prediction), then compared their accuracies to select the best model for each task.
- Models used for the classification task: Logistic Regression, Support Vector Machine, Decision Tree, and Random Forest
- Models used for the regression task: Linear Regression, Random Forest, and Linear SVR
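
For reference, the two metrics behind this kind of comparison can be written out directly. The sketch below is a minimal, dependency-free version of classification accuracy and the R2 score, plus a helper that picks the highest-scoring model. The model names and prediction values are made-up toy data, not results from this project.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are identical")
    return 1 - ss_res / ss_tot


def best_model(scores):
    """Return the name of the model with the highest score."""
    return max(scores, key=scores.get)


# Toy classification example: true winners vs. predictions of two hypothetical models.
winners = ["A", "B", "A", "A"]
clf_scores = {
    "model_1": accuracy(winners, ["A", "B", "B", "A"]),
    "model_2": accuracy(winners, ["B", "B", "B", "A"]),
}

# Toy regression example: true final scores vs. predicted final scores.
final_scores = [150, 160, 170, 180]
reg_r2 = r2_score(final_scores, [152, 158, 171, 179])

print(clf_scores, best_model(clf_scores))
print(reg_r2)
```

Both metrics are "higher is better", so the same `best_model` helper works for either task as long as every model is scored on the same held-out data.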
Designed neural networks for the same two tasks, observed the accuracy and R2 score vs. epoch trends on the training and validation sets, and tuned the hyperparameters accordingly.
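
One simple way to read such curves is to find the epoch where the validation score peaks and to look at how far the training score runs ahead of it afterwards. The sketch below assumes the per-epoch scores are available as plain lists; the numbers are invented for illustration and the helper names are hypothetical.

```python
def best_epoch(val_scores):
    """Return the 1-based epoch with the highest validation score (first one on ties)."""
    if not val_scores:
        raise ValueError("val_scores must not be empty")
    best_index = max(range(len(val_scores)), key=val_scores.__getitem__)
    return best_index + 1


def generalization_gap(train_scores, val_scores):
    """Per-epoch difference between training and validation scores."""
    return [t - v for t, v in zip(train_scores, val_scores)]


# Invented accuracy-per-epoch curves: training keeps improving, validation peaks early.
train_acc = [0.60, 0.70, 0.78, 0.85, 0.91]
val_acc = [0.58, 0.66, 0.71, 0.70, 0.68]

print(best_epoch(val_acc))
print(generalization_gap(train_acc, val_acc))
```

A gap that keeps widening after the validation peak is a common sign of overfitting, which usually points toward fewer epochs, stronger regularization, or a smaller network.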