Re-usable low-level ML components

Primary language: Java. License: Apache License 2.0.

ML Components

Repository for low-level, production-grade ML inference. The current motivating example is the CRF inference component, which is used in the AI2 fork of Grobid and in Science Parse. The code is currently 100% Java, but Scala can be added as well.

Getting Started

This project currently targets Java 8 and is built with Gradle. If you use Homebrew, install Gradle with brew install gradle. You can then run:

> gradle test # Run unit tests
> gradle idea # Generate IntelliJ project

Project Conventions

Documentation

You can use Markdown in your Javadoc using Pegdown.
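As a rough illustration of what Markdown in Javadoc can look like, here is a minimal sketch. The LogSumExp class is a hypothetical helper invented for this example and is not part of this repository; the comment relies only on common Markdown constructs (indented code block, bullet list, bold, inline code), on the assumption that the Pegdown doclet renders them in the usual way.

```java
/**
 * Numerically stable log-sum-exp, a common building block of CRF inference.
 *
 * Hypothetical helper used only to illustrate Markdown in Javadoc;
 * it is not part of this repository.
 *
 * Usage:
 *
 *     double z = LogSumExp.of(new double[] {0.0, 0.0});
 *
 * Notes:
 *
 * - Returns **negative infinity** for an empty array
 * - Subtracts the maximum before exponentiating to avoid overflow
 */
public final class LogSumExp {
    private LogSumExp() {}

    /**
     * Computes `log(sum(exp(x_i)))` over the given scores.
     *
     * @param scores log-space scores; may be empty
     * @return the log of the summed exponentials
     */
    public static double of(double[] scores) {
        // Find the largest score so every exponent below is <= 0.
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        // Empty input (or all scores at negative infinity): nothing to sum.
        if (max == Double.NEGATIVE_INFINITY) {
            return max;
        }
        double sum = 0.0;
        for (double s : scores) {
            sum += Math.exp(s - max);
        }
        return max + Math.log(sum);
    }
}
```

The Java code compiles with or without the doclet; only the rendering of the comments in the generated documentation changes.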

Test Coverage

Run gradle jacoco to produce a test coverage report:

> gradle jacoco
> open build/reports/jacoco/test/html/index.html

Benchmark Tests

> gradle jmh
> open build/reports/jmh/results.txt

Lombok

This project uses Lombok, which requires annotation processing to be enabled in your IDE. For IntelliJ, install the Lombok plugin and enable annotation processing in the compiler settings.

Lombok has a lot of useful annotations that give you some of the nice things in Scala:

  • val declares a final local variable whose type is inferred from the right-hand-side expression
  • Check out @Data, which generates getters, setters, equals, hashCode, and toString for a class
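To make the two annotations above concrete, the sketch below shows roughly the plain Java that Lombok saves you from writing. The Token class and its fields are hypothetical names chosen for illustration, and the hand-written methods only approximate what @Data generates; the Lombok form appears in comments so the block compiles without Lombok on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// With Lombok, this whole class could be written as:
//
//   @Data
//   public class Token {
//       private final String text;
//       private int label;
//   }
//
// Hand-written approximation of what @Data provides (hypothetical class):
public class Token {
    private final String text;
    private int label;

    // @Data adds a constructor taking the final fields.
    public Token(String text) {
        this.text = text;
    }

    public String getText() { return text; }
    public int getLabel() { return label; }
    public void setLabel(int label) { this.label = label; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Token)) return false;
        Token other = (Token) o;
        return label == other.label && Objects.equals(text, other.text);
    }

    @Override
    public int hashCode() {
        return Objects.hash(text, label);
    }

    @Override
    public String toString() {
        return "Token(text=" + text + ", label=" + label + ")";
    }

    // Builds one unlabeled token per input string.
    public static List<Token> of(String... texts) {
        // With Lombok: val tokens = new ArrayList<Token>();
        final ArrayList<Token> tokens = new ArrayList<Token>();
        for (String t : texts) {
            tokens.add(new Token(t));
        }
        return tokens;
    }
}
```

The val line is the smaller saving; @Data is where most of the boilerplate disappears.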

Efficient Primitive Collections

This project uses GS Collections, which has been found to be as efficient as the best libraries across a wide range of tasks (in particular, much faster than Trove).