/data2cooc2emb2ann

Learning embeddings from item co-occurrence statistics and building an approximate nearest neighbour index

Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0).

Learning embeddings from co-occurrence statistics

This repository includes examples of how to:

  1. Create item co-occurrence statistics from tabular data using Apache Beam
  2. Learn item embeddings using TensorFlow tf.estimator APIs
  3. Extract the trained embeddings and build an approximate nearest neighbours index using ANNOY
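To make the first step concrete, here is a minimal sketch of what "item co-occurrence statistics" could look like, written in plain Python rather than Apache Beam so it runs without any dependencies. The input schema (rows of `(context_id, item_id)`, for example a user and an item they interacted with) and the function name `cooccurrence_counts` are assumptions made for illustration; the notebooks in this repository may define co-occurrence differently.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(rows):
    """Count how often each unordered item pair shares a context.

    rows: iterable of (context_id, item_id) tuples (hypothetical schema).
    Returns a Counter keyed by (item_a, item_b) with item_a < item_b.
    """
    # Group the distinct items seen in each context (e.g. per user).
    items_by_context = defaultdict(set)
    for context_id, item_id in rows:
        items_by_context[context_id].add(item_id)

    # Emit every unordered pair of items within a context and sum the counts.
    counts = Counter()
    for items in items_by_context.values():
        for item_a, item_b in combinations(sorted(items), 2):
            counts[(item_a, item_b)] += 1
    return counts


# Toy data: three contexts, three items.
rows = [
    ("u1", "A"), ("u1", "B"), ("u1", "C"),
    ("u2", "A"), ("u2", "B"),
    ("u3", "B"), ("u3", "C"),
]
counts = cooccurrence_counts(rows)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

In a Beam pipeline the same logic would typically be split into a group-by-context stage, a pair-emitting stage, and a per-key sum, which lets it scale beyond a single machine. The resulting pair counts are the kind of input an embedding model can be trained on in the second step.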