yet-another-hive-udf-lib

Hive UDF lib - implementation of UDFs not present in HiveSwarm, brickhouse and others.

Primary language: Java

Hive UDF library

As the name suggests, this is yet another Hive UDF library. It consists of algorithms that don't already exist in HiveSwarm, brickhouse, etc.

Hive UDFs implemented

Levenshtein Distance
Damerau-Levenshtein Distance
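
For readers unfamiliar with the two metrics, here is a minimal plain-Java sketch of what they compute. It is not this library's source: the class and method names (`EditDistanceSketch`, `levenshtein`, `damerauLevenshtein`) are hypothetical, it has no Hive dependency, and it does not handle NULL inputs. The Damerau-Levenshtein method uses the restricted "optimal string alignment" variant, which is an assumption; the library may implement the unrestricted variant instead.

```java
public final class EditDistanceSketch {

    private EditDistanceSketch() {}

    /** Levenshtein distance: minimum number of insertions, deletions and substitutions. */
    public static int levenshtein(String a, String b) {
        // Only two rows of the dynamic-programming table are needed at a time.
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a to b[0..j)
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance from a[0..i) to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Damerau-Levenshtein distance, optimal string alignment variant (assumed):
     * like Levenshtein, but swapping two adjacent characters counts as one edit.
     */
    public static int damerauLevenshtein(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i;
        }
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                    d[i - 1][j - 1] + cost);
                // Adjacent transposition: "ab" -> "ba" is a single edit.
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));  // 3
        System.out.println(levenshtein("ab", "ba"));           // 2
        System.out.println(damerauLevenshtein("ab", "ba"));    // 1
    }
}
```

The difference shows up on transposed characters: plain Levenshtein needs two substitutions to turn "ab" into "ba", while Damerau-Levenshtein counts the swap as one edit, which often suits typo matching on names.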

Compile

mvn compile

Test

mvn test

Build

mvn assembly:single

Run

%> hive
hive> ADD JAR target/NAME_OF_ASSEMBLED.jar;
hive> SOURCE sourceAll.hql;
hive> select ldistance(full_name, first_name) from people limit 10;

Credits

-> This great article walks through creating Java UDFs in Hive.