t-Distributed Stochastic Neighbor Embedding (t-SNE) in Go

Primary language: Go. License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause).

go-tsne

A Go implementation of t-Distributed Stochastic Neighbor Embedding (t-SNE), a prize-winning technique for dimensionality reduction particularly well suited for visualizing high-dimensional datasets.

[Figures: mnist2d and mnist3d example embeddings]

Usage

Import this library:

    import "github.com/danaugrs/go-tsne/tsne"

Create the TSNE object:

    t := tsne.NewTSNE(2, 300, 100, 300, true)

The parameters are, in order:

  • Number of output dimensions
  • Perplexity
  • Learning rate
  • Max number of iterations
  • Verbosity
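
Of these parameters, perplexity is the least self-explanatory. In the t-SNE formulation, the perplexity of a neighbor distribution P is 2^H(P), where H(P) is the Shannon entropy in bits, so it can be read loosely as an effective number of neighbors per point. The sketch below only illustrates that definition; it does not use go-tsne, and `perplexity` is a hypothetical helper written for this example.

```go
package main

import (
	"fmt"
	"math"
)

// perplexity returns 2^H(p), where H(p) is the Shannon entropy of p in bits.
// Hypothetical helper for illustration; not part of go-tsne.
// p is assumed to be non-negative and to sum to 1.
func perplexity(p []float64) float64 {
	h := 0.0
	for _, v := range p {
		if v > 0 { // terms with zero probability contribute nothing
			h -= v * math.Log2(v)
		}
	}
	return math.Exp2(h)
}

func main() {
	uniform := []float64{0.25, 0.25, 0.25, 0.25}
	peaked := []float64{0.97, 0.01, 0.01, 0.01}
	fmt.Printf("uniform over 4 neighbors: %.2f\n", perplexity(uniform))
	fmt.Printf("one dominant neighbor:    %.2f\n", perplexity(peaked))
}
```

A uniform distribution over k neighbors has perplexity k, while a distribution dominated by a single neighbor has perplexity close to 1.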

There are two ways to start the t-SNE embedding optimization. The regular way is to provide an n by d matrix where each row is a datapoint and each column is a dimension:

    Y := t.EmbedData(X, nil)

The alternative is to provide a distance matrix directly:

    Y := t.EmbedDistances(D, nil)

In either case, the returned matrix Y will contain the final embedding.
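
To make the relationship between the two inputs concrete, here is a minimal sketch of how a pairwise distance matrix can be derived from an n by d data matrix. It is not taken from go-tsne: `pairwiseDistances` is a hypothetical helper, it uses plain slices rather than a matrix type so that it has no dependencies, and it assumes plain (not squared) Euclidean distances. Which metric and scaling EmbedDistances expects is not stated here, so check the library documentation before feeding it a matrix built this way.

```go
package main

import (
	"fmt"
	"math"
)

// pairwiseDistances returns the n by n matrix of Euclidean distances
// between the rows of x. Hypothetical helper; not part of go-tsne.
// All rows of x are assumed to have the same length.
func pairwiseDistances(x [][]float64) [][]float64 {
	n := len(x)
	d := make([][]float64, n)
	for i := range d {
		d[i] = make([]float64, n)
	}
	for i := 0; i < n; i++ {
		for j := i + 1; j < n; j++ {
			sum := 0.0
			for k := range x[i] {
				diff := x[i][k] - x[j][k]
				sum += diff * diff
			}
			dist := math.Sqrt(sum)
			d[i][j] = dist // the matrix is symmetric,
			d[j][i] = dist // so fill both triangles
		}
	}
	return d
}

func main() {
	// Three 2-dimensional datapoints, one per row.
	x := [][]float64{
		{0, 0},
		{3, 4},
		{6, 8},
	}
	for _, row := range pairwiseDistances(x) {
		fmt.Println(row)
	}
}
```

The result is symmetric with zeros on the diagonal. To pass it to the library, the values would still need to be copied into whatever matrix type D is expected to be.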

For more fine-grained control, a step function can be provided in either case:

    Y := t.EmbedData(X, func(iter int, divergence float64, embedding mat.Matrix) bool {
    	fmt.Printf("Iteration %d: divergence is %v\n", iter, divergence)
    	return false
    })

The step function is called with the current iteration number, the current divergence, and the embedding optimized so far. Return true to halt the optimization early, or false to let it continue.
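
One possible use of the return value is early stopping once the divergence stops improving. The sketch below shows that logic in isolation, with no dependency on go-tsne: `newPlateauStopper`, its `tol` and `patience` parameters, and the divergence values in `main` are all made up for illustration.

```go
package main

import "fmt"

// newPlateauStopper returns a function that reports true once the divergence
// has failed to improve on the best value seen by more than tol for
// `patience` consecutive calls. Hypothetical helper; not part of go-tsne.
func newPlateauStopper(tol float64, patience int) func(divergence float64) bool {
	best := 0.0
	seen := false
	stale := 0
	return func(divergence float64) bool {
		if !seen || best-divergence > tol {
			// First call, or a meaningful improvement: reset the counter.
			best = divergence
			seen = true
			stale = 0
			return false
		}
		stale++
		return stale >= patience
	}
}

func main() {
	stop := newPlateauStopper(1e-3, 2)
	// Made-up divergence values standing in for a real optimization run.
	for iter, div := range []float64{3.0, 2.0, 1.5, 1.4999, 1.4998, 1.4997} {
		fmt.Printf("iter %d: divergence %.4f, stop=%v\n", iter, div, stop(div))
	}
}
```

Inside a step function, the stopper would be created once before the call to EmbedData or EmbedDistances, and the step function would end with `return stop(divergence)`.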

Examples

Two examples are provided: mnist2d and mnist3d. Both use the same data, a subset of MNIST with 2500 handwritten digits. mnist2d generates plots throughout the optimization process, and mnist3d shows the optimization happening in real time, in 3D. mnist3d depends on G3N. To run an example, cd to the example's directory and go run it, e.g.:

    cd examples/mnist2d
    go run mnist2d.go

Support

I hope you enjoy using and learning from go-tsne as much as I enjoyed writing it.

If you come across any issues, please report them.