Picture Source: Unsplash
In this project, a recommendation model was developed using association rule analysis, based on the (randomly generated) products preferred by customers. Inferences were drawn from the shopping carts of people who shopped among 9 different products. The aim is to find and interpret the links between the products people prefer. Association rule extraction can be very useful for markets.
- Market
- Machine Learning
- Association Rules Analysis
- Apriori
- Association Rule Mining
The strength of an association rule is measured by two metrics: support and confidence.
Support
: Support indicates how frequently an itemset appears in the dataset. For a rule A → B, it is calculated as the proportion of all transactions that contain both itemset A and itemset B.
Confidence
: Confidence measures the reliability of the association rule. For a rule A → B, it is calculated as the proportion of transactions containing itemset A that also contain itemset B.
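The two definitions above can be sketched in a few lines of plain Python. The `transactions` list and the helper names `support` and `confidence` are hypothetical and only serve as an illustration; they are not taken from the project's code.

```python
# Toy transactions (hypothetical data, not the project's dataset).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Milk", "Butter"},
    {"Milk"},
    {"Bread", "Milk", "Eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(A and B) / support(A): how often B appears when A does."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Bread and Milk appear together in 3 of the 5 baskets.
print(support({"Bread", "Milk"}, transactions))
# Of the 4 baskets containing Bread, 3 also contain Milk.
print(confidence({"Bread"}, {"Milk"}, transactions))
```

Note that confidence is directional: the confidence of Bread → Milk generally differs from that of Milk → Bread, while support is the same for both.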
Additionally, two other metrics are often used in association rules analysis:
Lift
: Lift measures the strength of the association rule as the ratio of the observed support of A and B together to the support expected if itemset A and itemset B were independent of each other. A lift greater than 1 indicates a positive association, a lift less than 1 indicates a negative association, and a lift of exactly 1 indicates that the two itemsets are independent.
Conviction
: Conviction measures the degree of implication of the rule. It is calculated as the ratio of the frequency with which A would be expected to appear without B if itemset A and itemset B were independent of each other to the observed frequency with which A appears without B. Higher conviction values indicate stronger implications.
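As a minimal sketch of lift and conviction, the snippet below uses the standard formulas lift = support(A and B) / (support(A) × support(B)) and conviction = (1 − support(B)) / (1 − confidence(A → B)). The toy `transactions` and the function names are assumptions made for illustration only.

```python
import math

# Toy transactions (hypothetical data, not the project's dataset).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Milk", "Butter"},
    {"Milk"},
    {"Bread", "Milk", "Eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def lift(a, b, transactions):
    """Observed joint support divided by the support expected under independence."""
    joint = support(set(a) | set(b), transactions)
    return joint / (support(a, transactions) * support(b, transactions))


def conviction(a, b, transactions):
    """(1 - support(B)) / (1 - confidence(A -> B)); infinite when the rule never fails."""
    conf = support(set(a) | set(b), transactions) / support(a, transactions)
    if math.isclose(conf, 1.0):
        return math.inf
    return (1 - support(b, transactions)) / (1 - conf)


print(lift({"Bread"}, {"Milk"}, transactions))
print(conviction({"Bread"}, {"Milk"}, transactions))
```

In this toy data both values come out slightly below 1, so Bread and Milk co-occur a little less often than independence would predict.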
Association rules analysis helps uncover hidden patterns and relationships in the data, which can be used for various purposes. For example, in retail, it can be used to make product recommendations, optimize store layouts, plan promotional strategies, and improve inventory management.
By identifying the frequent itemsets and generating meaningful association rules, businesses can gain insights into customer behavior, improve decision-making, and enhance the overall customer experience.
Source: Wikipedia
The data set was generated from completely random values. It was created for demonstration (prototype) purposes only, as no real data is available. The values in the dataset can be changed as desired.
The variety of products can be adjusted as desired.
items = ['Coffee', 'Bread', 'Milk', 'Eggs', 'Butter', 'Juice', 'Cereal', 'Yogurt', 'Cheese', 'Pasta', 'Rice', 'Chicken', 'Beef', 'Fish', 'Apples', 'Bananas', 'Oranges', 'Grapes', 'Strawberries', 'Tomatoes', 'Potatoes', 'Carrots', 'Onions', 'Lettuce', 'Broccoli', 'Cucumber', 'Soap', 'Shampoo', 'Toothpaste']
The number of customers shopping and the number of products purchased can be adjusted as desired.
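import numpy as np
import pandas as pd

np.random.seed(42)  # fixed seed for reproducibility
data = []
df_range = 1000 #@param {type:"number"}
for _ in range(df_range):
    # One row per customer: each item is bought (1) with probability 0.3, otherwise 0
    customer = [int(pd.Series([0, 1]).sample(n=1, weights=[0.7, 0.3]).iloc[0]) for _ in range(len(items))]
    data.append(customer)
df = pd.DataFrame(data, columns=items)
df.insert(0, 'CustomerID', range(1, df_range+1))  # 1-based customer IDs as the first column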
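To show how rules could be read off a 0/1 basket table of this shape, here is a dependency-free sketch. It is a brute-force enumeration of item pairs, not the full Apriori algorithm, and it is not the project's own mining code. The function `pair_rules`, its thresholds, the shortened item list and the `random`-based generator are all assumptions chosen for illustration.

```python
import random
from itertools import combinations

random.seed(42)
items = ["Coffee", "Bread", "Milk", "Eggs", "Butter"]  # shortened list for illustration
# Each row is one customer; 1 means the item was bought (probability 0.3).
rows = [[1 if random.random() < 0.3 else 0 for _ in items] for _ in range(1000)]


def pair_rules(rows, items, min_support=0.05, min_confidence=0.2):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs."""
    n = len(rows)
    # Number of baskets containing each single item.
    item_count = [sum(r[i] for r in rows) for i in range(len(items))]
    rules = []
    for i, j in combinations(range(len(items)), 2):
        both = sum(1 for r in rows if r[i] and r[j])
        supp = both / n
        if both == 0 or supp < min_support:
            continue  # the pair is too rare to produce a rule
        # Each frequent pair yields two candidate rules: i -> j and j -> i.
        for a, b in ((i, j), (j, i)):
            conf = both / item_count[a]
            if conf >= min_confidence:
                lift = conf / (item_count[b] / n)
                rules.append((items[a], items[b], supp, conf, lift))
    # Strongest associations first.
    return sorted(rules, key=lambda r: r[4], reverse=True)


for antecedent, consequent, supp, conf, lift in pair_rules(rows, items)[:5]:
    print(f"{antecedent} -> {consequent}: "
          f"support={supp:.3f}, confidence={conf:.3f}, lift={lift:.3f}")
```

Because the purchases here are drawn independently, the lift values should stay close to 1; real transaction data would be needed to surface meaningful associations.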
If you have anything to say to me, please contact me:
- Twitter: Doguilmak
- Mail address: doguilmak@gmail.com