This project aims to perform customer segmentation and revenue prediction for a gaming company based on customer attributes. The company wants to create persona-based customer definitions and segment customers based on these personas to estimate how much potential customers can generate in revenue.
The dataset used for this project is "persona.csv," which contains records of customer transactions with the following columns:
Price
: The amount spent by the customer.Source
: The type of device used by the customer.Sex
: The gender of the customer.Country
: The country of the customer.Age
: The age of the customer.
- Loaded the dataset and displayed general information about it.
- Determined the number of unique sources and their frequencies.
- Calculate the number of unique prices and display their frequencies.
- Counted the occurrences of each price point.
- Counted the number of sales from each country.
- Calculate the total revenue from sales in each country.
- Grouped sales counts by source.
- Calculate the average price for each country.
- Calculate the average price for each source.
Perform customer segmentation based on the following attributes: COUNTRY
, SOURCE
, SEX
, and AGE
. The resulting DataFrame (agg_df
) includes the mean price for each unique combination of these attributes.
Sort the agg_df
DataFrame based on the PRICE
column in descending order.
Rename the index names of the DataFrame to variable names.
Convert the numeric AGE
variable into a categorical variable with user-defined age categories (AGECAT
) and add it to the agg_df
DataFrame.
Create a new variable called customers_level_based
that combines customer attributes for segmentation. Duplicate values were removed.
Segment new customers based on their predicted revenue into four segments: 'A', 'B', 'C', and 'D', using quartiles of the PRICE
column.
Implement a function mean_revenue_prediction
to predict the revenue and segment for new customers based on their attributes. Users can input their age, source, country, and sex, and the function will provide the segment and estimated revenue.
- Clone this repository to your local machine.
- Ensure you have Python and the required libraries (NumPy, Pandas, Seaborn, Matplotlib) installed.
- Run the Jupyter Notebook or Python script to execute the code.
- To predict revenue for new customers, use the
mean_revenue_prediction
function by passing the customer's age, source, country, and sex as arguments.
Enjoy exploring customer segmentation and revenue prediction with this project!