

Primary language: Scala. License: BSD 2-Clause "Simplified" License (BSD-2-Clause).

spark-lda-coherence

An example of topic coherence calculation for an LDA (Latent Dirichlet Allocation) model in Apache Spark.

The example calculates topic coherence with pointwise mutual information (PMI). Specifically, it uses the intrinsic UMass measure.
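As a rough illustration of the UMass measure, the sketch below scores one topic's top words against a small in-memory corpus. It uses the commonly cited form of the measure: the sum, over pairs of top words, of log((D(w_m, w_l) + 1) / D(w_l)), where D counts the documents containing the given words and w_l is ranked above w_m. The sketch does not use Spark and is not taken from this project; the names UMassSketch, docFreq, and umass are hypothetical.

```scala
// Minimal, Spark-free sketch of the UMass coherence for a single topic.
// All names here are hypothetical and not part of this project's API.
object UMassSketch {

  /** Number of documents that contain every one of the given words. */
  def docFreq(docs: Seq[Set[String]], words: String*): Int =
    docs.count(d => words.forall(d.contains))

  /** UMass score for a topic's top words, ordered from most to least probable.
    * Assumes every top word occurs in at least one document; otherwise
    * D(w_l) is zero and the score is not finite.
    */
  def umass(docs: Seq[Set[String]], topWords: Seq[String]): Double = {
    // Pair each word w_m with every higher-ranked word w_l.
    val pairs = for {
      m <- 1 until topWords.length
      l <- 0 until m
    } yield (topWords(m), topWords(l))

    pairs.map { case (wm, wl) =>
      val joint  = docFreq(docs, wm, wl) // D(w_m, w_l)
      val single = docFreq(docs, wl)     // D(w_l)
      math.log((joint + 1.0) / single)   // +1 smoothing avoids log(0)
    }.sum
  }
}
```

Higher scores indicate top words that co-occur in more documents. A Spark version would typically compute the same document counts over a distributed collection instead of a local Seq.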

How to use

A usage example is available in the CoherenceTest file.

You can also build the project and add it as a dependency to your own project:

For example, publish it to the local Maven repository: sbt publishM2

Then add the dependency to your sbt build:

libraryDependencies += "io.github.gnupinguin" %% "ldacoherence_2.12" % "1.0"
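A note on the dependency line: in sbt, `%%` appends the Scala binary version (for example `_2.12`) to the artifact name, so a name that already ends in `_2.12` would be looked up with the suffix applied twice. The fragment below is a sketch of how the setup might look if the published artifact is named `ldacoherence_2.12`. That name is an assumption, so check the coordinates that `sbt publishM2` actually prints. The `Resolver.mavenLocal` line lets sbt resolve artifacts from the local Maven repository.

```scala
// build.sbt (sketch; the artifact coordinates below are assumptions)

// Look in the local Maven repository (~/.m2), where `sbt publishM2` publishes.
resolvers += Resolver.mavenLocal

// With `%%`, sbt adds the Scala binary version suffix itself,
// so the suffix is normally left out of the artifact name:
libraryDependencies += "io.github.gnupinguin" %% "ldacoherence" % "1.0"

// Explicit form with a single `%` (same artifact when scalaVersion is 2.12.x):
// libraryDependencies += "io.github.gnupinguin" % "ldacoherence_2.12" % "1.0"
```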