EDA - Exploratory Data Analysis
It is a crucial step in the data analysis process.
EDA helps in comprehending the dataset's characteristics, including its variables, distributions, and any missing values.
EDA involves tasks like handling missing values, removing duplicates, and addressing outliers to ensure data quality.
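As a minimal sketch of these first two points, assuming pandas and a small made-up dataset (the column names `age`, `city`, and `income` are hypothetical), the following inspects the data and then handles duplicates and missing values. Median and placeholder imputation are just one possible choice, not a fixed rule.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 41, 32, 58],
    "city": ["Pune", "Delhi", "Delhi", None, "Delhi", "Mumbai"],
    "income": [30000, 52000, 47000, 61000, 52000, 75000],
})

# Understand the dataset: size, missing values per column, summary statistics.
n_rows, n_cols = df.shape
missing_per_column = df.isna().sum()
summary = df.describe()

# Remove exact duplicate rows (the rows at index 1 and 4 are identical here).
deduped = df.drop_duplicates()

# Handle missing values: median for the numeric column,
# a placeholder label for the categorical one.
cleaned = deduped.assign(
    age=deduped["age"].fillna(deduped["age"].median()),
    city=deduped["city"].fillna("Unknown"),
)

print(missing_per_column)
print(cleaned)
```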
Identifying Patterns and Relationships:
EDA uses visualizations to uncover patterns, trends, and relationships within the data.
EDA generates hypotheses about the data, which can be further tested using statistical methods or machine learning algorithms.
EDA informs feature selection and engineering decisions to create meaningful predictors for modeling.
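The points above on relationships, hypotheses, and feature engineering could look roughly like this in pandas. The data and the column names (`hours_studied`, `score`, `group`, `score_per_hour`) are assumptions made up for illustration; a plot would normally accompany these numbers.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "score": [52, 55, 61, 68, 74, 79],
    "group": ["A", "A", "B", "B", "A", "B"],
})

# Relationship between two numeric variables (Pearson correlation).
corr = df["hours_studied"].corr(df["score"])

# Pattern across a categorical variable: mean score per group.
mean_score_by_group = df.groupby("group")["score"].mean()

# The strong correlation suggests a hypothesis to test later:
# more study hours go together with higher scores.

# Feature engineering: derive a candidate predictor from existing columns.
df["score_per_hour"] = df["score"] / df["hours_studied"]

print(corr)
print(mean_score_by_group)
```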
Model Selection and Validation:
EDA assists in selecting appropriate models by understanding the data distribution and model assumptions.
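One way to check a distributional assumption before choosing a model is to look at skewness. The sketch below uses a made-up series whose values double at each step, so a log transform makes it symmetric; real data will rarely be this tidy, and the log transform is only one of several possible remedies.

```python
import numpy as np
import pandas as pd

# Hypothetical right-skewed variable: each value is double the previous one.
values = pd.Series([1, 2, 4, 8, 16, 32, 64, 128], name="values")

# Skewness near 0 suggests a roughly symmetric distribution;
# a clearly positive value indicates a long right tail.
skew_raw = values.skew()

# A log transform is one common way to reduce right skew.
# Here the logged values are evenly spaced, so the skew is essentially 0.
skew_log = np.log(values).skew()

print(skew_raw, skew_log)
```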
EDA communicates insights to stakeholders through interpretable visualizations and summaries.
EDA is iterative, allowing for continuous refinement of analysis techniques based on new insights.
Domain Knowledge Integration:
EDA benefits from incorporating domain knowledge to interpret findings in context.
EDA is flexible and does not have fixed rules, allowing analysts to choose techniques based on the dataset and analysis goals.
It involves processes such as:
Univariate Analysis
Multivariate Analysis
Creating New Columns
Modifying Existing Ones
Detecting Outliers
Removing Outliers
The entire process is highly iterative.
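The steps listed above can be sketched on a small made-up dataset as follows. The column names (`height_cm`, `weight_kg`, `bmi`) are hypothetical, and the 1.5 × IQR rule used for outliers is one common convention, not the only option.

```python
import pandas as pd

# Hypothetical dataset; names and values are illustrative only.
df = pd.DataFrame({
    "height_cm": [150, 160, 165, 170, 172, 175, 180, 250],
    "weight_kg": [50, 58, 63, 68, 70, 74, 80, 82],
})

# Univariate analysis: summarize one column at a time.
height_summary = df["height_cm"].describe()

# Multivariate analysis: how the columns vary together.
corr_matrix = df.corr()

# Creating a new column from existing ones.
df["bmi"] = df["weight_kg"] / (df["height_cm"] / 100) ** 2

# Modifying an existing column: round it to one decimal place.
df["bmi"] = df["bmi"].round(1)

# Detecting outliers with the 1.5 * IQR rule on height.
q1 = df["height_cm"].quantile(0.25)
q3 = df["height_cm"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = (df["height_cm"] < lower) | (df["height_cm"] > upper)

# Removing outliers: keep only the rows inside the bounds.
df_no_outliers = df[~is_outlier]

print(df[is_outlier])
print(df_no_outliers.shape)
```

In practice these steps are repeated: removing outliers changes the summaries and correlations, which may prompt further new columns or transformations.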