topicmodeling

Overview -- Unsupervised learning: Finding important features (NMF)

This sprint uses non-negative matrix factorization (NMF) to discover topics in our NYT corpus. Like k-means and hierarchical clustering, NMF is an unsupervised technique for uncovering latent properties (features) in the data that a human reader might not otherwise notice.

Goals

  • Matrix factorization
  • Dimensionality reduction
  • Latent properties
  • Linear combination of features
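
As a rough illustration of how these goals fit together, here is a minimal sketch of NMF using multiplicative updates in NumPy. It is not the sprint's solution: the toy document-term matrix, the vocabulary, the helper name `nmf`, and the choice of update rule are all assumptions made for the example. The matrix V (documents by words) is factored into W (documents by topics) and H (topics by words), so each document is approximated by a non-negative linear combination of a small number of latent topics.

```python
import numpy as np

def nmf(V, k, n_iter=500, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n x m) into W (n x k) and H (k x m).

    Hypothetical helper: uses multiplicative updates that keep W and H
    non-negative while reducing the Frobenius reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Update H with W held fixed, then W with H held fixed.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy document-term matrix (made up): rows are documents, columns are words.
vocab = ["ball", "team", "score", "vote", "senate", "law"]
V = np.array([
    [3, 2, 1, 0, 0, 0],
    [2, 3, 2, 0, 0, 0],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 0, 3, 2, 2],
], dtype=float)

W, H = nmf(V, k=2)

# Reconstruction error of the rank-2 approximation.
error = np.linalg.norm(V - W @ H)
print("reconstruction error:", error)

# Each row of H is a latent topic; list its highest-weighted words.
for topic_idx, weights in enumerate(H):
    top_words = [vocab[i] for i in np.argsort(weights)[::-1][:3]]
    print("topic", topic_idx, top_words)
```

Here W reduces each document from six word counts to two topic weights (dimensionality reduction), and row i of V is approximated by `W[i] @ H`, a linear combination of the topic rows in H.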

The exercise is in pair.md.