Welcome to the Walmart Sales Insights repository! This project involves analyzing and visualizing Walmart sales data to derive meaningful insights and answer key business questions. The analysis is performed using Python with libraries such as pandas, matplotlib, and seaborn.
Walmart Sales Insights explores Walmart sales data to identify trends and patterns that can support business decisions. The project covers data preprocessing, analysis, and visualization to give an overview of sales performance across dimensions such as time, store location, and external factors.
- Data preprocessing and cleaning
- Trend analysis of total weekly sales over time
- Comparison of sales during holiday and non-holiday weeks
- Identification of top-performing and underperforming stores
- Analysis of the impact of external factors (temperature, fuel price, CPI, unemployment) on sales
- Visualization of sales distribution and correlation between variables
- Heatmap of sales by store and month
- Monthly and yearly sales trends
- Interactive and static visualizations
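As a rough illustration of two of the features above (the holiday comparison and the store ranking), the following minimal sketch uses pandas `groupby`. The rows are made up for the example and are not taken from the actual dataset; only the column names come from this project.

```python
import pandas as pd

# Made-up rows for illustration only; they are not from the real dataset.
df = pd.DataFrame({
    "Store": [1, 1, 2, 2, 3, 3],
    "Weekly_Sales": [1500.0, 1800.0, 900.0, 1100.0, 2000.0, 2600.0],
    "Holiday_Flag": [0, 1, 0, 1, 0, 1],
})

# Mean weekly sales for non-holiday (0) and holiday (1) weeks.
holiday_means = df.groupby("Holiday_Flag")["Weekly_Sales"].mean()

# Total sales per store, sorted from highest to lowest.
store_totals = (
    df.groupby("Store")["Weekly_Sales"].sum().sort_values(ascending=False)
)

# First and last entries of the sorted totals are the best and worst stores.
top_store = store_totals.index[0]
bottom_store = store_totals.index[-1]

print(holiday_means)
print(store_totals)
```

On the real data, `store_totals.head(n)` and `store_totals.tail(n)` would give the top and bottom `n` stores.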
The dataset used in this project includes the following columns:
- Store: The store number
- Date: The week end date
- Weekly_Sales: Sales for the given store and week
- Holiday_Flag: Indicates whether the week is a special holiday week (1) or not (0)
- Temperature: Temperature in the region
- Fuel_Price: Cost of fuel in the region
- CPI: Consumer Price Index
- Unemployment: Unemployment rate
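Before running the analysis, it can help to confirm that a file actually has these columns. The sketch below is one possible check, not part of the project's own scripts: `check_schema` and `EXPECTED_COLUMNS` are hypothetical names, and the two CSV rows are invented for the example. The day-month-year date format is assumed from the loading step in this README.

```python
import io

import pandas as pd

# Column names as listed in this README.
EXPECTED_COLUMNS = [
    "Store", "Date", "Weekly_Sales", "Holiday_Flag",
    "Temperature", "Fuel_Price", "CPI", "Unemployment",
]

# Invented sample rows standing in for the real CSV file.
sample_csv = io.StringIO(
    "Store,Date,Weekly_Sales,Holiday_Flag,Temperature,Fuel_Price,CPI,Unemployment\n"
    "1,05-02-2010,1500.0,0,42.3,2.57,211.1,8.1\n"
    "1,12-02-2010,1800.0,1,38.5,2.55,211.2,8.1\n"
)
df = pd.read_csv(sample_csv)


def check_schema(frame):
    """Return the expected columns that are missing from the DataFrame."""
    return [col for col in EXPECTED_COLUMNS if col not in frame.columns]


missing = check_schema(df)

# Parse dates as day-month-year and confirm the flag only holds 0 or 1.
df["Date"] = pd.to_datetime(df["Date"], format="%d-%m-%Y")
flags_ok = df["Holiday_Flag"].isin([0, 1]).all()

print("Missing columns:", missing)
print("Holiday_Flag values valid:", flags_ok)
```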
To run this project, you need Python installed along with the necessary libraries. You can install the required libraries with pip:
pip install pandas matplotlib seaborn psycopg2
- Load Data: Load the Walmart sales data from the CSV file.

    import pandas as pd

    # Load data from CSV file into a DataFrame
    csv_file_path = 'path_to_your_csv_file.csv'
    df = pd.read_csv(csv_file_path)

    # Convert 'Date' column to datetime format
    df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%Y')
- Data Preprocessing: Perform necessary data preprocessing steps such as converting date formats and extracting year and month.

    # Extract Year and Month from Date
    df['Year'] = df['Date'].dt.year
    df['Month'] = df['Date'].dt.month
- Visualization: Create various visualizations to analyze the data.

    import matplotlib.pyplot as plt
    import seaborn as sns

    # Overall Sales Trends
    plt.figure(figsize=(12, 6))
    sns.lineplot(data=df, x='Date', y='Weekly_Sales', estimator='sum')
    plt.title('Total Weekly Sales Over Time')
    plt.xlabel('Date')
    plt.ylabel('Total Weekly Sales')
    plt.show()

    # Add more visualizations as needed...
- Run Analysis: Execute the Python script to generate insights and visualizations.
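The preprocessing step feeds directly into the monthly aggregations. As a hedged sketch of how that might look, the example below derives `Year` and `Month` and then builds monthly totals and a store-by-month table with `pivot_table`. The four rows are invented for illustration, and the variable names `monthly_totals` and `store_month` are assumptions, not names from the project's script.

```python
import pandas as pd

# Invented rows for illustration only; dates use the day-month-year format.
df = pd.DataFrame({
    "Store": [1, 1, 2, 2],
    "Date": ["05-02-2010", "05-03-2010", "05-02-2010", "12-02-2010"],
    "Weekly_Sales": [1500.0, 1800.0, 900.0, 1100.0],
})

# Same preprocessing as in the steps above.
df["Date"] = pd.to_datetime(df["Date"], format="%d-%m-%Y")
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month

# Total sales for each (year, month) pair across all stores.
monthly_totals = df.groupby(["Year", "Month"])["Weekly_Sales"].sum()

# One row per store, one column per month; missing combinations become 0.
store_month = df.pivot_table(
    index="Store",
    columns="Month",
    values="Weekly_Sales",
    aggfunc="sum",
    fill_value=0,
)

print(monthly_totals)
print(store_month)
```

A table shaped like `store_month` can be passed to `sns.heatmap` to draw a store-by-month heatmap.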
The project includes the following visualizations:
- Total Weekly Sales Over Time
- Weekly Sales: Holiday vs Non-Holiday Weeks
- Top Performing Stores
- Sales vs. Temperature
- Sales vs. Fuel Price
- Sales vs. CPI
- Sales vs. Unemployment
- Total Monthly Sales
- Heatmap of Sales by Store and Month
- Distribution of Weekly Sales
- Sales Trends by Store
- Total Yearly Sales
- Correlation Matrix Heatmap
- Sales Trends on Holidays vs. Non-Holidays
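As one example from this list, a correlation matrix heatmap could be produced roughly as follows. This is a sketch under assumptions rather than the project's actual plotting code: `plot_correlation_heatmap` and `NUMERIC_COLUMNS` are hypothetical names, the styling choices are arbitrary, and the five rows are made up.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Numeric columns from the dataset description in this README.
NUMERIC_COLUMNS = ["Weekly_Sales", "Temperature", "Fuel_Price", "CPI", "Unemployment"]


def plot_correlation_heatmap(frame):
    """Draw a heatmap of pairwise Pearson correlations; return (corr, ax)."""
    corr = frame[NUMERIC_COLUMNS].corr()
    fig, ax = plt.subplots(figsize=(8, 6))
    # Fix the color scale to [-1, 1] so colors are comparable across plots.
    sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm",
                vmin=-1, vmax=1, ax=ax)
    ax.set_title("Correlation Matrix Heatmap")
    return corr, ax


# Made-up rows for illustration only; they are not from the real dataset.
df = pd.DataFrame({
    "Weekly_Sales": [1500.0, 1800.0, 900.0, 1100.0, 2000.0],
    "Temperature": [42.3, 38.5, 55.0, 60.2, 47.8],
    "Fuel_Price": [2.57, 2.55, 2.70, 2.75, 2.62],
    "CPI": [211.1, 211.2, 211.5, 211.6, 211.3],
    "Unemployment": [8.1, 8.1, 7.9, 7.8, 8.0],
})

corr, ax = plot_correlation_heatmap(df)
```

Calling `plt.show()` afterwards displays the figure. Returning the matrix alongside the axes keeps the numbers available for inspection without reading them off the plot.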
Contributions are welcome! If you have any suggestions or improvements, feel free to open an issue or submit a pull request.
- Fork the repository.
- Create a new branch: git checkout -b my-feature-branch
- Make your changes and commit them: git commit -m 'Add some feature'
- Push to the branch: git push origin my-feature-branch
- Open a pull request.
This project is licensed under the MIT License. See the LICENSE file for more details.