Text clustering in Spark with Scala, using an LDA model on a TF-IDF matrix.

Primary language: Scala. License: MIT.

Spark-Text-Clustering

This project demonstrates how to use an LDA (Latent Dirichlet Allocation) model in Scala on Spark to cluster texts into topics. Each text is first converted into a TF-IDF vector, and the LDA model is then fitted on the resulting TF-IDF matrix.
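The flow described above can be sketched with the `spark.ml` pipeline API. This is a minimal illustration and not necessarily how this repository implements it: the object name `TextClusteringSketch`, the helper `topTermsPerTopic`, the sample sentences, and all parameter values (`k`, `maxIter`, the seed) are assumptions made for the example.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.clustering.{LDA, LDAModel}
import org.apache.spark.ml.feature.{CountVectorizer, CountVectorizerModel, IDF, StopWordsRemover, Tokenizer}
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: names and parameter values are illustrative only.
object TextClusteringSketch {

  /** Fits tokenizer -> stop-word filter -> TF -> IDF -> LDA and
    * returns the top terms of each of the k topics. */
  def topTermsPerTopic(spark: SparkSession, texts: Seq[String], k: Int, topN: Int): Array[Seq[String]] = {
    import spark.implicits._
    val docs = texts.toDF("text")

    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
    val remover   = new StopWordsRemover().setInputCol("words").setOutputCol("filtered")
    // Term frequencies (TF), then inverse document frequency weighting (IDF).
    val tf  = new CountVectorizer().setInputCol("filtered").setOutputCol("tf")
    val idf = new IDF().setInputCol("tf").setOutputCol("features")
    // LDA is fitted on the TF-IDF vectors in the "features" column.
    val lda = new LDA().setK(k).setMaxIter(20).setSeed(42L).setFeaturesCol("features")

    val model = new Pipeline()
      .setStages(Array(tokenizer, remover, tf, idf, lda))
      .fit(docs)

    // Stage 2 holds the vocabulary, stage 4 the fitted LDA model.
    val vocabulary = model.stages(2).asInstanceOf[CountVectorizerModel].vocabulary
    val ldaModel   = model.stages(4).asInstanceOf[LDAModel]

    // Map each topic's term indices back to the words they stand for.
    ldaModel.describeTopics(topN).collect().map { row =>
      row.getAs[Seq[Int]]("termIndices").map(i => vocabulary(i))
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("TextClusteringSketch")
      .master("local[*]") // assumption: local run for experimentation
      .getOrCreate()

    val texts = Seq(
      "spark distributes computation across cluster nodes",
      "cluster nodes run spark executors in parallel",
      "the recipe needs flour sugar butter and eggs",
      "bake the cake with butter sugar and flour"
    )

    topTermsPerTopic(spark, texts, k = 2, topN = 3).zipWithIndex.foreach {
      case (terms, topic) => println(s"topic $topic: ${terms.mkString(", ")}")
    }
    spark.stop()
  }
}
```

Calling `transform` on the fitted pipeline model would additionally append a `topicDistribution` column, from which each text can be assigned to its most probable topic. Which terms end up in which topic depends on the data, the seed, and the optimizer, so no particular output is shown here.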

Requirements

Java (jdk-13.0.1)
Scala (scala-sdk-2.12.10)
Spark (Spark-3.0.0)
sbt (sbt-1.3.10)
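As a rough guide, the versions listed above could be pinned in a `build.sbt` like the following. This is a sketch and may differ from the build file shipped with the repository: the project name is hypothetical, and the choice of the `spark-sql` and `spark-mllib` modules is an assumption based on the project using LDA and TF-IDF.

```scala
// build.sbt (illustrative; the project name is an assumption)
name := "spark-text-clustering"

scalaVersion := "2.12.10"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"   % "3.0.0",
  "org.apache.spark" %% "spark-mllib" % "3.0.0"
)
```

The sbt version itself is set separately, in `project/build.properties`, with the line `sbt.version=1.3.10`.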