It is important to understand the data well before using it to build a model. We start with no knowledge of the data, so we need to explore it thoroughly. Exploratory Data Analysis (EDA) is the task of analyzing data with statistical methods, visualization plots, and other techniques such as linear regression and rule-based methods in order to build a thorough understanding of the data. It is the first step of any data modeling or model-building effort, and an important one.
- Get the number of features (columns) and the number of records (rows).
- Get the feature (independent variable) names.
- Get the unique class (dependent variable) names.
- Get the number of data points (feature values) that belong to each class.
- Check whether the dataset is balanced or not.
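
The checks above can be sketched with Pandas. This is only a minimal illustration: the toy `DataFrame`, its column names, and the label column `species` are hypothetical and stand in for whatever dataset is being explored.

```python
import pandas as pd

# Hypothetical toy dataset: two feature columns and one class label column.
df = pd.DataFrame({
    "petal_length": [1.4, 1.3, 4.7, 4.5, 5.1, 5.9],
    "petal_width": [0.2, 0.2, 1.4, 1.5, 1.8, 2.1],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# Number of records (rows) and columns (features plus the label column).
n_rows, n_cols = df.shape

# Feature names: every column except the (assumed) label column.
feature_names = [c for c in df.columns if c != "species"]

# Unique class names.
class_names = sorted(df["species"].unique())

# Number of data points that belong to each class.
class_counts = df["species"].value_counts()

# Ratio of the largest class to the smallest; 1.0 means perfectly balanced.
imbalance_ratio = class_counts.max() / class_counts.min()

print(n_rows, n_cols)
print(feature_names)
print(class_counts)
print(imbalance_ratio)
```

How large the ratio may grow before a dataset counts as imbalanced is a judgment call that depends on the problem.
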
- Mean
- Standard Deviation
- Median
- Quantile & Percentile
- Median Absolute Deviation (MAD)
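
As a rough sketch, the statistics listed above can be computed with NumPy. The sample values are hypothetical, and the last one is a deliberate outlier to show how the mean and standard deviation react compared with the median and MAD. The MAD here is the plain, unscaled version; some libraries multiply it by a constant.

```python
import numpy as np

# Hypothetical sample; 100.0 is a deliberate outlier.
values = np.array([2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 100.0])

mean = values.mean()
std = values.std()            # population standard deviation (ddof=0)
median = np.median(values)

# 25th and 75th percentiles, i.e. the first and third quartiles.
q1, q3 = np.percentile(values, [25, 75])

# Median Absolute Deviation: median of absolute deviations from the median.
mad = np.median(np.abs(values - median))

print(mean, std)      # both pulled upward by the outlier
print(median, mad)    # barely affected by the outlier
print(q1, q3)
```
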
- Univariate Analysis
- Bivariate Analysis
- Multivariate Analysis
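
One way to picture the difference between the three kinds of analysis is by how many variables are examined at once. The sketch below uses NumPy with three hypothetical variables `x`, `y`, and `z`; correlation is only one of many possible bivariate and multivariate summaries.

```python
import numpy as np

# Hypothetical variables measured on the same five records.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2 * x + 1                              # rises exactly with x
z = np.array([5.0, 3.0, 4.0, 1.0, 2.0])    # tends to fall as x rises

# Univariate: summarize one variable on its own.
x_mean, x_std = x.mean(), x.std()

# Bivariate: relationship between two variables (Pearson correlation).
r_xy = np.corrcoef(x, y)[0, 1]

# Multivariate: all variables together, here as a 3x3 correlation matrix
# (each row of the stacked array is treated as one variable).
corr_matrix = np.corrcoef(np.vstack([x, y, z]))

print(x_mean, x_std)
print(r_xy)
print(corr_matrix)
```
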
- Histogram Plot
- Probability Density Function (PDF) and Cumulative Distribution Function (CDF)
- Scatter Plot
- Pair Plot
- Box Plot
- Violin Plot
- Contour Plot
- Dist Plot
- Joint Plot
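
Several of these plots are built on binned counts. As a minimal sketch, the numbers behind a histogram, an empirical PDF, and an empirical CDF can be computed with NumPy before any plotting library is involved. The sample values and the choice of four bins are hypothetical.

```python
import numpy as np

# Hypothetical sample of one numeric feature.
values = np.array([1.0, 1.5, 2.0, 2.5, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0])

# Histogram: count how many points fall into each of 4 equal-width bins.
counts, bin_edges = np.histogram(values, bins=4)

# Empirical PDF: fraction of points per bin. The bins here have width 1,
# so this equals the density; for other widths, divide by the bin width.
pdf = counts / counts.sum()

# Empirical CDF: running total of the per-bin fractions.
cdf = np.cumsum(pdf)

print(counts)
print(bin_edges)
print(pdf)
print(cdf)
```

Passing `bin_edges[1:]` together with `pdf` or `cdf` to Matplotlib's `plt.plot` would draw the two curves.
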
- NumPy
- Seaborn
- Matplotlib
- Pandas