Implement Spark Native UDF interface
myui opened this issue · 1 comments
myui commented
Related to #345: Hive UDF invocation is slow in Spark.
We can do better, at least for UDFs (not yet for UDAFs/UDTFs), by having each function implement Spark's Java UDF interface (UDF1 through UDF22) in addition to Hive's UDF, for example:
class AngularDistanceUDF extends GenericUDF implements org.apache.spark.sql.api.java.UDF2
https://github.com/myui/hivemall/blob/master/core/src/main/java/hivemall/knn/distance/AngularDistanceUDF.java
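To make the idea concrete, here is a minimal sketch of the Spark side only. It rests on several assumptions: the `UDF2` interface below is a local stand-in that mirrors the shape of `org.apache.spark.sql.api.java.UDF2` so the snippet compiles without Spark; `AngularDistanceSketch` is a hypothetical class, not the actual Hivemall code; inputs are simplified to dense `double[]` vectors; and angular distance is taken as `acos(cosine similarity) / pi`, one common definition that may differ from Hivemall's.

```java
import java.io.Serializable;

// Local stand-in with the same shape as org.apache.spark.sql.api.java.UDF2,
// so this sketch compiles without Spark on the classpath.
interface UDF2<T1, T2, R> extends Serializable {
    R call(T1 t1, T2 t2) throws Exception;
}

// Hypothetical class. In Hivemall it would also extend Hive's GenericUDF,
// with both entry points delegating to the same static computation.
class AngularDistanceSketch implements UDF2<double[], double[], Double> {
    private static final long serialVersionUID = 1L;

    // Shared logic: acos(cosine similarity) / pi, in the range [0, 1].
    static double angularDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        double denom = Math.sqrt(normA) * Math.sqrt(normB);
        // Assumption: a zero vector is treated as orthogonal to everything.
        double cos = (denom == 0.0) ? 0.0 : dot / denom;
        // Clamp to guard against rounding slightly outside [-1, 1].
        cos = Math.max(-1.0, Math.min(1.0, cos));
        return Math.acos(cos) / Math.PI;
    }

    // Spark-style entry point: no Hive ObjectInspector handling on this path.
    @Override
    public Double call(double[] a, double[] b) {
        return angularDistance(a, b);
    }

    public static void main(String[] args) {
        double d = angularDistance(new double[] {1, 0}, new double[] {0, 1});
        System.out.println("angular distance of orthogonal vectors: " + d);
    }
}
```

With the real Spark interface, a class like this could presumably be registered through `spark.udf().register(name, udf, returnType)`; the argument types Spark actually passes for array columns would need to be checked.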
We could also add some helper methods for the Spark API to
https://github.com/myui/hivemall/blob/master/core/src/main/java/hivemall/UDFWithOptions.java
@maropu What do you think?
maropu commented
yea, I think it's a good idea. I'll try later.