Portfolio Information | Description |
---|---|
Language | Python |
Libraries Used | sklearn, NLTK, statsmodels, NumPy, Pandas, re (regex), Matplotlib, seaborn, wordcloud |
Projects Count | 6 |
Author | Ileana Cabada |
Dataset | Electronic_Pricedataset, AB Testing dataset |
About Portfolio | Own-price and cross-price elasticities of advertisement demand, feature engineering for machine learning (product category labelling) with natural language processing, A/B player retention testing and games-played probability, price exploratory analysis |

In the following analysis, we select Best Buy products as the main data sample for the price elasticity analysis. For future reference, this model can be applied to any kind of vendor, e-commerce or brick-and-mortar, by measuring sales demand.
Hypothesis Proposed
Using the 2017 Best Buy laptop sample data: is ad impression demand sensitive to changes in the product's own price? If so, by how much?
- Linear Regression
- statsmodels, NumPy, Pandas, Matplotlib
Laptop, Desktop Price Elasticity |
---|
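
A common way to put a number on the question above is a log-log regression, where the slope of ln(impressions) on ln(price) is read as the own-price elasticity. The block below is a minimal sketch of that idea, not the portfolio's actual notebook code: the data are synthetic, the function name `own_price_elasticity` is hypothetical, and NumPy's least-squares solver stands in for a statsmodels OLS fit (which would give the same coefficients plus standard errors).

```python
import numpy as np

def own_price_elasticity(prices, impressions):
    """Slope of ln(impressions) on ln(price) from an ordinary least-squares fit."""
    log_p = np.log(np.asarray(prices, dtype=float))
    log_q = np.log(np.asarray(impressions, dtype=float))
    # Design matrix: intercept column plus ln(price).
    X = np.column_stack([np.ones_like(log_p), log_p])
    coef, *_ = np.linalg.lstsq(X, log_q, rcond=None)
    return float(coef[1])

# Synthetic data (an assumption for illustration only): demand generated
# with a true elasticity of -1.2 and small multiplicative noise.
rng = np.random.default_rng(0)
prices = rng.uniform(300, 1500, size=200)
impressions = 5e6 * prices ** -1.2 * np.exp(rng.normal(0, 0.05, size=200))

elasticity = own_price_elasticity(prices, impressions)
print(f"estimated own-price elasticity: {elasticity:.2f}")
```

An estimate below -1 would suggest elastic demand (impressions fall proportionally faster than price rises), while a value between -1 and 0 would suggest inelastic demand.
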
Hypothesis Proposed
How much is ad impression demand influenced by main competitors' price changes? This model helps us understand the nature of the competition between the price of our own advertised product and changes in the prices of the main competitors' products.
- Multiple Linear Regression
- statsmodels, NumPy, Pandas, Matplotlib
Cross-Price Elasticity of 12 Mac Book |
---|
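
Cross-price elasticities can be estimated by adding the competitors' log prices as extra regressors next to the product's own log price. The sketch below assumes a log-log specification and uses synthetic data; the product keys (`own`, `competitor_a`, `competitor_b`) and the function name are hypothetical, and NumPy least squares is used here in place of a statsmodels multiple regression.

```python
import numpy as np

def cross_price_elasticities(own_impressions, prices_by_product):
    """Regress ln(own impressions) on the ln(price) of every product.

    Returns one coefficient per product: the entry for the product itself
    is its own-price elasticity, the others are cross-price elasticities.
    """
    names = list(prices_by_product)
    log_prices = np.column_stack(
        [np.log(np.asarray(prices_by_product[n], dtype=float)) for n in names]
    )
    X = np.column_stack([np.ones(log_prices.shape[0]), log_prices])
    y = np.log(np.asarray(own_impressions, dtype=float))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return {name: float(c) for name, c in zip(names, coef[1:])}

# Synthetic data (illustrative assumption): competitor_a behaves like a
# substitute (+0.6), competitor_b like a complement (-0.3).
rng = np.random.default_rng(1)
n = 300
prices = {
    "own": rng.uniform(600, 1800, n),
    "competitor_a": rng.uniform(600, 1800, n),
    "competitor_b": rng.uniform(600, 1800, n),
}
impressions = (
    2e4
    * prices["own"] ** -1.0
    * prices["competitor_a"] ** 0.6
    * prices["competitor_b"] ** -0.3
    * np.exp(rng.normal(0, 0.05, n))
)

estimates = cross_price_elasticities(impressions, prices)
for name, value in estimates.items():
    print(f"{name}: {value:+.2f}")
```

A positive cross-price coefficient is usually read as a substitute relationship (demand for our product rises when the competitor raises its price), and a negative one as a complement.
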
- Poisson Distribution, Bootstrap Distribution
- statsmodels, NumPy, Pandas, Matplotlib
A/B Testing Distribution | Poisson Distribution |
---|---|
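
The bootstrap step can be illustrated by resampling each A/B group with replacement and recording the difference in mean retention on every draw. This is a sketch under stated assumptions: the retention flags are synthetic, and the function name, group sizes, and number of resamples are illustrative choices rather than values taken from the portfolio.

```python
import numpy as np

def bootstrap_retention_diff(control, treatment, n_boot=2000, seed=0):
    """Bootstrap distribution of mean(treatment) - mean(control)."""
    rng = np.random.default_rng(seed)
    control = np.asarray(control, dtype=float)
    treatment = np.asarray(treatment, dtype=float)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # Resample each group independently, with replacement.
        c = rng.choice(control, size=control.size, replace=True)
        t = rng.choice(treatment, size=treatment.size, replace=True)
        diffs[i] = t.mean() - c.mean()
    return diffs

# Synthetic 0/1 retention flags (an assumption for illustration only).
data_rng = np.random.default_rng(42)
control = (data_rng.random(5000) < 0.45).astype(int)
treatment = (data_rng.random(5000) < 0.44).astype(int)

diffs = bootstrap_retention_diff(control, treatment)
observed_diff = treatment.mean() - control.mean()
prob_treatment_better = float((diffs > 0).mean())
ci_low, ci_high = np.percentile(diffs, [2.5, 97.5])

print(f"observed difference: {observed_diff:+.4f}")
print(f"share of resamples where treatment retains better: {prob_treatment_better:.2f}")
print(f"95% percentile interval: [{ci_low:+.4f}, {ci_high:+.4f}]")
```

The share of resamples with a positive difference gives a rough, simulation-based sense of how likely the treatment is to retain more players than the control.
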
Because the dataset has no category labels for further price analysis between similar products (e.g. tablets, headphones), an unsupervised text clustering model was implemented to create product category label segments. It uses text preprocessing techniques such as lemmatization, regex and tokenization, followed by TF-IDF vectorization and the K-means algorithm.
The Category_name and Cluster features were created from unique product names with their respective product descriptions.
- K-means
- NLTK, sklearn, re (regex), WordCloud, Matplotlib, Pandas and NumPy
WordCloud | Electronic Category Label Clusters |
---|---|
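
The clustering pipeline described above can be sketched roughly as follows. The product names are made up for illustration, the `clean_text` helper is hypothetical, and lemmatization with NLTK is left out so the example runs without downloading corpora; the regex cleanup, TF-IDF vectorization, and K-means steps use scikit-learn's `TfidfVectorizer` and `KMeans`.

```python
import re

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def clean_text(text):
    """Lowercase, drop digits and punctuation, collapse whitespace."""
    text = re.sub(r"[^a-z ]+", " ", text.lower())
    return " ".join(text.split())

# Made-up product names (an assumption for illustration only).
product_names = [
    "Apple MacBook Pro 13.3-inch Laptop",
    "Dell Inspiron 15 Laptop",
    "HP Pavilion 14 Laptop",
    "Sony WH-1000XM2 Wireless Headphones",
    "Bose QuietComfort Wireless Headphones",
    "JBL Everest Wireless Headphones",
]

cleaned = [clean_text(name) for name in product_names]

# TF-IDF turns each cleaned name into a weighted bag-of-words vector.
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(cleaned)

# K-means groups the vectors; the number of clusters is chosen by hand here.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(tfidf)

for name, label in zip(product_names, labels):
    print(label, name)
```

The integer cluster labels carry no meaning by themselves; a readable category name would still need to be assigned per cluster, for example after inspecting the most frequent terms in each group.
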
This price exploratory analysis was carried out ahead of the price elasticity calculations with the multilinear regression model, for the following reasons:
- Product Condition Selection
- Price Outlier Detection
- Price Distribution Analysis
- Discount Price Correlation with Impression Total Count per Category
- Merchant (e-commerce) Impression Time Analysis
- seaborn, Matplotlib, Pandas and Numpy
Price Distribution Plot | Price Discount Correlation Heatmap |
---|---|
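
As an illustration of the price outlier detection step, the sketch below flags prices that fall outside the usual 1.5 × IQR fences. The IQR rule is an assumption here, since the portfolio does not state which outlier criterion was used, and the prices are invented.

```python
import numpy as np

def iqr_outlier_mask(prices, k=1.5):
    """Boolean mask marking prices outside [Q1 - k*IQR, Q3 + k*IQR]."""
    prices = np.asarray(prices, dtype=float)
    q1, q3 = np.percentile(prices, [25, 75])
    iqr = q3 - q1
    return (prices < q1 - k * iqr) | (prices > q3 + k * iqr)

# Invented laptop prices (illustrative assumption); the last one is extreme.
prices = [499, 529, 549, 579, 599, 629, 649, 4999]
mask = iqr_outlier_mask(prices)

outliers = [p for p, flagged in zip(prices, mask) if flagged]
print("flagged as outliers:", outliers)
```

Flagged prices are candidates for review rather than automatic removal, since a very high price can also be a legitimate premium product.
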
Managing null values, dropping unused features, text normalization
- re (regex), Matplotlib, Pandas and NumPy
Null, Unique and Datatype column values table |
---|
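
The cleaning steps listed above might look roughly like this in Pandas. The column names (`name`, `price`, `upc`), the sample rows, and both helper functions are hypothetical and only meant to show the shape of the workflow.

```python
import re

import pandas as pd

def normalize_text(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())

def clean_products(df, unused_columns):
    """Drop unused columns, drop rows missing key fields, normalize names."""
    out = df.drop(columns=unused_columns, errors="ignore")
    out = out.dropna(subset=["name", "price"]).copy()
    out["name"] = out["name"].map(normalize_text)
    return out.reset_index(drop=True)

# Invented raw rows (an assumption for illustration only).
raw = pd.DataFrame(
    {
        "name": ['  Apple MacBook Pro 13" ', None, "Dell  Inspiron-15", "HP Pavilion"],
        "price": [1299.0, 899.0, 749.0, None],
        "upc": ["1", "2", "3", "4"],
    }
)

# Summary of nulls, unique values, and dtypes per column.
summary = pd.DataFrame(
    {"nulls": raw.isna().sum(), "unique": raw.nunique(), "dtype": raw.dtypes.astype(str)}
)
print(summary)

cleaned = clean_products(raw, unused_columns=["upc"])
print(cleaned)
```

Rows with a missing name or price are dropped here for simplicity; imputing them would be an alternative, depending on how many rows are affected.
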
Contact Source | Information |
---|---|
Email | ileana.cabada@gmail.com |
LinkedIn | https://www.linkedin.com/in/ileana-c-24666159/ |