Machine learning and natural language processing with Apache Pig

Primary language: Java. License: Apache License 2.0.

Varaha

A set of Apache Pig scripts and UDFs (User Defined Functions) for machine learning and natural language processing. Why should Mahout have all the fun?

Build

Build the UDFs before doing anything else. From the project root, run:


mvn clean package

The rest

See the individual README files under the scripts directory for instructions on running each script.

Why is it called Varaha?

Evidently, Varaha is an avatar of the Hindu god Vishnu in the form of a boar.