swoop-inc/spark-alchemy

README example on how to use NativeFunctionRegistration

Closed · 1 comment

@ssimeonov - I read your blog post and took a look at the NativeFunctionRegistration trait.

Can you provide a README example on how to use the NativeFunctionRegistration trait that's a little easier to follow for newbies than the HyperLogLog source code? I'd like to understand the basics and then dive into the HyperLogLog code!

This library looks really cool. Can't wait to understand all this code!

@MrPowers yes, we have to add more docs. :)

As the comment in NativeFunctionRegistration says, this is code pulled from FunctionRegistry in OSS Spark.

Registering native functions is only necessary if you want to call them from Spark SQL. To use them from Scala, you only need a Column-oriented API that instantiates the expressions, the way Spark does in org.apache.spark.sql.functions.
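A minimal sketch of such a Column-oriented API, assuming Spark 2.x/3.x, where Column has a public constructor that takes a Catalyst Expression. MyFunctions and shout are hypothetical names, and the built-in Upper expression stands in for a custom native function:

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.catalyst.expressions.Upper

// Hypothetical helper object, in the style of org.apache.spark.sql.functions.
object MyFunctions {
  // Wrap the Catalyst expression in a Column so it can be used from the DataFrame API.
  // Upper is only a stand-in here; a real native function would use its own Expression class.
  def shout(col: Column): Column = new Column(Upper(col.expr))
}

object ColumnApiExample {
  def run(spark: SparkSession): Array[String] = {
    import spark.implicits._
    // No registration is needed to call the function from Scala.
    Seq("a", "b").toDF("s").select(MyFunctions.shout($"s")).as[String].collect()
  }
}
```

Nothing here touches the function registry, which is why this path works without NativeFunctionRegistration.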

Once you have built one or more native functions, you create a registration object that extends NativeFunctionRegistration and implements expressions, e.g., the way HLLFunctionRegistration does.
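A rough sketch of what such a registration object could look like. The package path and the expression[T](name) helper are assumptions based on how Spark's FunctionRegistry is written, so check them against the trait's source. MyFunctionRegistration and the SQL name shout are hypothetical, and Upper again stands in for a custom expression:

```scala
// Assumed package for the trait; verify against the spark-alchemy source.
import com.swoop.alchemy.spark.expressions.NativeFunctionRegistration
import org.apache.spark.sql.catalyst.analysis.FunctionRegistry.FunctionBuilder
import org.apache.spark.sql.catalyst.expressions.{ExpressionInfo, Upper}

// Hypothetical registration object, modeled on HLLFunctionRegistration.
object MyFunctionRegistration extends NativeFunctionRegistration {
  // Maps each SQL function name to its metadata and builder.
  // expression[T] is assumed to mirror the helper of the same name in FunctionRegistry.
  val expressions: Map[String, (ExpressionInfo, FunctionBuilder)] = Map(
    expression[Upper]("shout")
  )
}
```

Each additional native function would be one more expression[...] entry in the map.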

To use the functions from SparkSQL, you have to register them by calling the equivalent of HLLFunctionRegistration.register(spark).
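For example, registering the HLL functions and calling them from Spark SQL might look like the sketch below. It assumes HLLFunctionRegistration lives in the package shown and exposes register(spark) as described above; the method name may differ between versions of the library, so treat both as assumptions. RegisterExample is a hypothetical name.

```scala
// Assumed package; verify against the spark-alchemy source.
import com.swoop.alchemy.spark.expressions.hll.HLLFunctionRegistration
import org.apache.spark.sql.SparkSession

object RegisterExample {
  def run(spark: SparkSession): Long = {
    // Make the native HLL functions visible to Spark SQL for this session.
    HLLFunctionRegistration.register(spark)

    // Build an HLL sketch over the ids and read back the approximate distinct count.
    spark
      .sql("SELECT hll_cardinality(hll_init_agg(id)) AS approx_count FROM range(100)")
      .head()
      .getLong(0)
  }
}
```

The result is an estimate, so it should be close to, but not necessarily exactly, the true distinct count.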

That's all there is to it.