This is yet another variation of the well-known word2vec method, proposed by Mikolov et al., applied to unordered sequences, which are commonly referred to as itemsets.
The contribution of itembed is twofold:
- Modifying the base algorithm to handle unordered sequences, which has an impact on the definition of context windows;
- Using the two embedding sets introduced in word2vec for supervised learning.
A similar philosophy is described by Wu et al. in StarSpace and by Barkan and Koenigstein in item2vec.
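To illustrate how dropping item order changes the context definition, here is a minimal sketch. It assumes that, without a meaningful order, every other item in the same itemset counts as context for a given item, as in item2vec. The helper name `itemset_pairs` is hypothetical and is not part of the itembed API; the library's actual sampling strategy may differ.

```python
from itertools import permutations


def itemset_pairs(itemset):
    """Return every ordered (center, context) pair of distinct items.

    Hypothetical helper for illustration only. Because an itemset has
    no order, there is no sliding window: each item is paired with all
    the other items of the same set.
    """
    # Deduplicate and sort so the result is deterministic.
    items = sorted(set(itemset))
    return list(permutations(items, 2))


pairs = itemset_pairs({"apple", "bread", "milk"})
for center, context in pairs:
    print(center, "->", context)
```

An itemset of n distinct items yields n * (n - 1) pairs, so large itemsets are usually subsampled in practice rather than enumerated exhaustively.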
itembed uses Numba to achieve high performance.
Install from PyPI:
pip install itembed
Or install from source to get the latest version:
pip install git+https://github.com/sdsc-innovation/itembed.git
Please refer to the documentation for detailed explanations and examples.