KeyError while generating rules
The apriori algorithm throws a KeyError
when generating rules from itemsets.
Generating itemsets.
Counting itemsets of length 1.
Found 457 candidate itemsets of length 1.
Found 190 large itemsets of length 1.
Counting itemsets of length 2.
Found 17955 candidate itemsets of length 2.
Found 16611 large itemsets of length 2.
Counting itemsets of length 3.
Found 67 candidate itemsets of length 3.
Found 63 large itemsets of length 3.
Counting itemsets of length 4.
Found 0 candidate itemsets of length 4.
Itemset generation terminated.
Generating rules from itemsets.
Generating rules of size 2.
Generating rules of size 3.
Traceback (most recent call last):
  File "basket.py", line 26, in <module>
    itemsets, rules = apriori(transactions, min_support=.6, min_confidence=.5, verbosity=1)
  File "/home/.../lib/python3.5/site-packages/efficient_apriori/apriori.py", line 56, in apriori
    return itemsets, list(rules)
  File "/home/.../lib/python3.5/site-packages/efficient_apriori/rules.py", line 329, in generate_rules_apriori
    conf = count(itemset) / count(lhs)
  File "/home/.../lib/python3.5/site-packages/efficient_apriori/rules.py", line 303, in count
    return itemsets[len(itemset)][itemset]
KeyError: ('Item 3', 'Item 5')
Is this a Python version issue?
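To make the traceback easier to read, here is a minimal sketch of the lookup it shows. The counts below are invented for illustration, and the `confidence` helper is a hypothetical name; only the shape `itemsets[len(itemset)][itemset]` and the division `count(itemset) / count(lhs)` come from the traceback. It shows that computing the confidence of a rule needs the left-hand side to be present among the counted itemsets of its own length, and that a missing entry surfaces as exactly this kind of KeyError. It does not explain why the entry is missing on Python 3.5.

```python
# Invented counts, shaped like the lookup in the traceback:
# itemsets[length][itemset_tuple] -> number of transactions containing it.
itemsets = {
    1: {("Item 3",): 8, ("Item 5",): 7, ("Item 7",): 6},
    2: {("Item 3", "Item 7"): 6},  # ("Item 3", "Item 5") is deliberately absent
    3: {("Item 3", "Item 5", "Item 7"): 6},
}


def count(itemset):
    """Look up the count of an itemset by its length, then by its tuple."""
    return itemsets[len(itemset)][itemset]


def confidence(itemset, lhs):
    """Confidence of a rule whose left-hand side is lhs (hypothetical helper)."""
    return count(itemset) / count(lhs)


missing = None
try:
    confidence(("Item 3", "Item 5", "Item 7"), ("Item 3", "Item 5"))
except KeyError as err:
    # The left-hand side was never stored among the length-2 itemsets.
    missing = err.args[0]
    print("KeyError:", missing)
```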
efficient_apriori is not tested on Python 3.5, so the version might indeed be the issue.
- Can you try on Python 3.6 or 3.7?
If that does not work, I will need an example input on which the program fails. Then I can debug it. But try on Python 3.6 or 3.7 first and see if that works.
It is indeed related to the Python version. Works on versions 3.6 and 3.7.
Is it possible to enable Python 3.5 support?
Hi @dwy904. Unfortunately, the program uses features that are not compatible with Python 3.5, so it would need to be backported. That would take some work and would make the code base less clean.
- I suggest you upgrade your Python installation if possible.
- If you cannot upgrade due to other packages, consider using conda to create a separate environment for using Efficient-Apriori.
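Since the failure on 3.5 shows up only as a KeyError deep inside rule generation, a script such as basket.py could check the interpreter version up front and fail with a clearer message. The following is a minimal sketch, assuming 3.6 as the lowest working version because that is the oldest one reported to work in this thread; `MIN_PYTHON`, `python_is_supported`, and `require_supported_python` are hypothetical names, not part of the library.

```python
import sys

# Assumption: 3.6 is the oldest version reported to work in this thread.
MIN_PYTHON = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


def require_supported_python():
    """Raise early with a readable message instead of a later KeyError."""
    if not python_is_supported():
        found = ".".join(str(part) for part in sys.version_info[:3])
        needed = ".".join(str(part) for part in MIN_PYTHON)
        raise RuntimeError(
            "Python {} or newer is needed, found {}".format(needed, found)
        )


# Run the check before importing or calling the mining code.
require_supported_python()
```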
@dwy904 It should not take that long to get some decent results. Remember to:
- Get the data into memory, i.e. do not use the data_generator solution.
- Start with a large value for min_support, let the algorithm converge, then decrease it.
- Use verbosity to see output.
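The tips above can be sketched as a small loop. This is one possible reading, not the maintainer's code: `sweep_min_support` and its `mine` argument are hypothetical names, the threshold values are arbitrary, and stopping at the first threshold that yields rules is an assumption about what "decrease it" should lead to. The keyword arguments `min_support`, `min_confidence`, and `verbosity` are the ones used in the call shown in the traceback.

```python
def sweep_min_support(mine, transactions, supports=(0.9, 0.7, 0.5),
                      min_confidence=0.5, verbosity=1):
    """Try decreasing min_support values until some rules are found.

    mine is any callable with the same keyword arguments as the apriori
    call in the traceback; it must return (itemsets, rules).
    """
    # Materialise the data once so every pass reads it from memory
    # instead of re-running a generator.
    transactions = list(transactions)

    # Start strict (few itemsets, fast) and relax step by step.
    for min_support in sorted(supports, reverse=True):
        itemsets, rules = mine(
            transactions,
            min_support=min_support,
            min_confidence=min_confidence,
            verbosity=verbosity,
        )
        if rules:
            return min_support, itemsets, rules

    # No threshold in the list produced any rule.
    return None, {}, []
```

With the library installed, apriori imported from efficient_apriori could be passed as mine, for example sweep_min_support(apriori, transactions). Starting high keeps the early passes cheap, because the number of candidate itemsets grows quickly as min_support drops.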