
A set of recipes for using Clojure on Machine Learning tasks


Clojure Machine Learning Cookbook

A set of recipes and notes that I developed or used while applying Clojure to Machine Learning. The recipes are not tied to a single library or platform.

Using Clojure for Machine Learning on Apache Spark

A short introduction to Machine Learning on Apache Spark with Clojure, based on support I added to the Sparkling project. Sparkling is an open source project that makes Spark usable from Clojure; the languages Spark supports by default are limited to Java, Scala, Python and R. Currently, it can:

  1. Train an ML classifier using some of the classifiers available on Spark
  2. Use multi-fold cross-validation to evaluate scores on different validation sets
  3. Use train-test splits to evaluate scores on a single validation set, with different percentage splits
  4. Use grid search to find the combination of hyperparameters that achieves the lowest error on the desired metric
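
The ideas behind cross-validation and grid search can be sketched in plain Clojure, without Spark. This is only an illustration under stated assumptions: the function names `k-fold-splits`, `param-grid` and `grid-search` are hypothetical and are not part of Sparkling's API, and `error-fn` stands in for whatever routine trains a model and returns its error.

```clojure
;; Illustrative sketch only: these names are hypothetical, not Sparkling's API.

;; Split the indices 0..n-1 into k folds. Each fold serves once as the
;; validation set while the remaining indices form the training set.
(defn k-fold-splits
  [n k]
  (let [folds (map (fn [i] (filter #(= i (mod % k)) (range n)))
                   (range k))]
    (for [i (range k)]
      {:validation (vec (nth folds i))
       :train      (vec (sort (apply concat
                                     (keep-indexed #(when (not= %1 i) %2)
                                                   folds))))})))

;; Expand a map of parameter name -> candidate values into every combination.
(defn param-grid
  [grid]
  (reduce (fn [combos [k vs]]
            (for [c combos, v vs] (assoc c k v)))
          [{}]
          grid))

;; Return the parameter map with the lowest error. error-fn is assumed to be
;; a function from a parameter map to a number, for example the mean error
;; over the validation folds.
(defn grid-search
  [error-fn grid]
  (apply min-key error-fn (param-grid grid)))
```

For example, `(k-fold-splits 6 3)` yields three train/validation splits, and `(param-grid {:a [1 2] :b [3 4]})` yields four parameter maps for `grid-search` to score.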

Word similarity analysis using Word2Vec

Word2Vec is a tool that converts words into high-dimensional vectors. Here's a short introduction that uses an implementation available at the Bridgei2i GitHub repo.
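
Once words are represented as vectors, similarity between two words is commonly measured with the cosine of the angle between their vectors. The sketch below shows that calculation in plain Clojure. The three-dimensional `toy-vectors` are made up for illustration (real Word2Vec vectors come from a trained model), and the function names are hypothetical rather than taken from the Bridgei2i implementation.

```clojure
;; Illustrative sketch only: toy data and hypothetical names.

(defn dot [a b] (reduce + (map * a b)))

(defn norm [a] (Math/sqrt (dot a a)))

;; Cosine similarity: 1.0 for vectors pointing the same way, 0.0 for
;; orthogonal vectors. Not meaningful if either vector is all zeros.
(defn cosine-similarity
  [a b]
  (/ (dot a b) (* (norm a) (norm b))))

;; Made-up vectors standing in for a trained model's output.
(def toy-vectors
  {"king"  [0.9 0.8 0.1]
   "queen" [0.85 0.82 0.15]
   "apple" [0.1 0.2 0.95]})

;; Return the word whose vector is closest to the given word's vector.
(defn most-similar
  [vectors word]
  (let [v (get vectors word)]
    (->> (dissoc vectors word)
         (map (fn [[w u]] [w (cosine-similarity v u)]))
         (sort-by second >)
         ffirst)))
```

With these toy vectors, `(most-similar toy-vectors "king")` picks `"queen"` over `"apple"`.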

Displaying Core.matrix datasets as HTML tables in Gorilla-repl

Gorilla-repl is an extremely useful tool for Data Analysis in Clojure. Since I work a lot with core.matrix, I need to eyeball a dataset as it goes through data transformations. I developed a plugin that displays core.matrix datasets as HTML tables, with options to control the number of rows and columns displayed. An example gorilla-repl worksheet is at this link.
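
The core of such a view, turning column names and rows into an HTML table while capping how many rows and columns are shown, might look roughly like the sketch below. It is not the plugin's actual code: `html-table`, `escape-html` and the `:max-rows` / `:max-cols` options are hypothetical names, and the input is a plain vector of rows rather than a core.matrix dataset.

```clojure
;; Illustrative sketch only: hypothetical names, plain Clojure data.
(require '[clojure.string :as str])

;; Escape the characters that would otherwise be read as markup.
(defn escape-html
  [s]
  (-> (str s)
      (str/replace "&" "&amp;")
      (str/replace "<" "&lt;")
      (str/replace ">" "&gt;")))

;; Build an HTML table string, showing at most max-rows rows and
;; max-cols columns (both default to 10).
(defn html-table
  [column-names rows & {:keys [max-rows max-cols]
                        :or   {max-rows 10 max-cols 10}}]
  (let [cell (fn [tag v] (str "<" tag ">" (escape-html v) "</" tag ">"))
        row  (fn [tag vs]
               (str "<tr>"
                    (str/join (map #(cell tag %) (take max-cols vs)))
                    "</tr>"))]
    (str "<table>"
         (row "th" column-names)
         (str/join (map #(row "td" %) (take max-rows rows)))
         "</table>")))
```

A call such as `(html-table ["a" "b" "c"] [[1 2 3] [4 5 6]] :max-rows 1 :max-cols 2)` returns a string containing only the first row and the first two columns; the string would still need to be passed to the notebook's HTML rendering facility to be displayed.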