/Analysis_Avacado

Training and testing dataset of avocado prices collected directly from retailers. Prices vary widely across the dataset. The goal is to predict future prices and to plot the price variations.

Primary Language: Jupyter Notebook

Analysis_Avacado

The Avocado dataset came directly from retailers' cash registers and is based on actual retail sales of Hass avocados. The average price reflects the cost per unit, even when multiple units are sold in bags. No missing values were found during dataset exploration. The total volume column was dropped because it is the sum of the Hass avocado volumes for the various PLU codes plus total bags. The total bags column was dropped as well because it is the sum of the various avocado bag sizes. The date column was dropped because it has too many levels (it is recorded as day, month and year), and the region column was dropped for the same reason: the dataset contains fifty-five regions, too many levels for the purpose of our analysis.
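The preprocessing described above can be sketched with pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`Total Volume`, `Total Bags`, `Date`, `region`, and the PLU and bag-size columns) and the two sample rows are assumptions made up for the example, since the text does not list them.

```python
import pandas as pd

# Assumed column names; the README does not spell them out.
DROPPED_COLUMNS = ["Total Volume", "Total Bags", "Date", "region"]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Check for missing values, then drop the aggregate and high-cardinality columns."""
    missing = int(df.isna().sum().sum())
    if missing:
        raise ValueError(f"expected no missing values, found {missing}")
    return df.drop(columns=DROPPED_COLUMNS)


# Tiny made-up sample, only to show the shape of the data.
sample = pd.DataFrame(
    {
        "Date": ["2017-01-01", "2017-01-08"],
        "AveragePrice": [1.25, 1.40],
        "Total Volume": [3700.0, 2810.0],  # PLU volumes + Total Bags
        "4046": [1000.0, 800.0],
        "4225": [2000.0, 1500.0],
        "4770": [100.0, 50.0],
        "Total Bags": [600.0, 460.0],  # Small + Large + XLarge
        "Small Bags": [500.0, 400.0],
        "Large Bags": [100.0, 50.0],
        "XLarge Bags": [0.0, 10.0],
        "region": ["Albany", "Boston"],
    }
)

cleaned = preprocess(sample)
print(list(cleaned.columns))
```

Dropping the two aggregate columns removes features that are exact sums of other columns, so the remaining PLU and bag-size columns carry the same information without the redundancy.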