
Functions as column extensions

Closed this issue · 8 comments

@snithish @gorros @kirubakarrs @oscarvarto - I’ve always thought it’s a bit random how Spark defines some functionality as Column functions and other functionality as SQL functions. Here’s an example of the inconsistency:

lower(col("blah").substr(0, 2))

Having two SQL functions would look like this:

lower(substr(col("blah"), 0, 2))

Having two Column functions would look like this:

col("blah").substr(0, 2).lower()

I like the Column functions syntax, so I started monkey patching the SQL functions onto the Column class. Let me know your thoughts.
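For readers who have not seen the pattern, here is a minimal sketch of what "monkey patching" SQL functions onto Column can look like in Scala, using an implicit class. The names `ColumnExt`, `ColumnMethods`, and `ColumnExtExample` are hypothetical and are not taken from the project's code. Only `lower` and `trim` are shown, and every further function would need its own one-line wrapper.

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions

// Hypothetical holder object; importing ColumnExt._ brings the extension methods into scope.
object ColumnExt {
  // Wraps a Column so that selected SQL functions can be called as methods on it.
  implicit class ColumnMethods(val c: Column) extends AnyVal {
    def lower(): Column = functions.lower(c) // delegates to the SQL function
    def trim(): Column = functions.trim(c)   // delegates to the SQL function
  }
}

object ColumnExtExample {
  import ColumnExt._
  import org.apache.spark.sql.functions.col

  // Column-function style, enabled by the implicit class above.
  val chained: Column = col("blah").substr(0, 2).lower()

  // The mixed style that Spark supports out of the box.
  val nested: Column = functions.lower(col("blah").substr(0, 2))
}
```

Both values should describe the same expression; only the call syntax differs.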

@MrPowers I prefer this syntax: col("blah").substr(0, 2).lower(). But I am curious whether there is a way to do that for all functions without explicitly defining each one.

@MrPowers but I can assist in monkey patching :)

@gorros - Thanks for the help! Let me know if you find a clever way to do this without explicitly defining all the functions. In the meantime, I'm going to keep adding functions in the pattern you laid out in PR #51. Thanks!

@MrPowers non-explicit solutions rely on reflection, and I am not sure whether they will work with implicit conversions. Also, I am not a fan of reflection. But I will try some more.

@MrPowers I came up with another idea. Would you like to have the following syntax for the methods that take no additional arguments
col(" SOME_String ")|trim|lower
without rewriting each method?
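One way the pipe syntax could be implemented is sketched below. This is an assumption about the idea, not the code that was actually proposed: a single `|` extension method that applies any `Column => Column` function, so existing SQL functions can be passed in without a wrapper per function. The names `PipeSyntax`, `ColumnPipe`, and `PipeExample` are hypothetical.

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{col, lower, trim}

// Hypothetical holder object; importing PipeSyntax._ enables the | operator on Column.
object PipeSyntax {
  implicit class ColumnPipe(val c: Column) extends AnyVal {
    // Applies a one-argument SQL function to the wrapped column.
    def |(f: Column => Column): Column = f(c)
  }
}

object PipeExample {
  import PipeSyntax._

  // Each SQL function is passed as a function value and applied left to right.
  val piped: Column = col(" SOME_String ") | trim | lower

  // The same expression written with nested SQL function calls.
  val nested: Column = lower(trim(col(" SOME_String ")))
}
```

One design caveat: in Scala, `|` has lower precedence than comparison operators such as `===`, so piped expressions that are mixed with other operators may need parentheses.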

My 2 cents. I like fluent interfaces over operator overloading as I find it creates more defensible code wrt keeping things as simple as possible.

However, I find the pipe overloading rather elegant in Scala, and would support that.

One thing that might be worth it to try is the Dynamic trait

I don't have much experience with it, but it might allow the fluent interface with little code.

Probably a concern: As of Scala 2.10, defining direct or indirect subclasses of this trait is only possible if the language feature dynamics is enabled.
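To make the Dynamic suggestion concrete, here is a rough sketch under several assumptions. It uses an explicit wrapper class (hypothetically named `DynCol`) because Column itself does not extend `Dynamic`, and it looks up the SQL function by name with Java reflection, so a misspelled name fails at runtime rather than at compile time. Only one-argument functions of type `Column => Column` are handled.

```scala
import scala.language.dynamics // language feature required to define a subclass of Dynamic

import org.apache.spark.sql.Column
import org.apache.spark.sql.functions
import org.apache.spark.sql.functions.col

// Hypothetical wrapper: any unknown member name is treated as a one-argument SQL function.
class DynCol(val c: Column) extends Dynamic {
  // x.foo is rewritten by the compiler to x.selectDynamic("foo") when foo is not a real member.
  def selectDynamic(name: String): DynCol = {
    // Look up functions.<name>(Column) reflectively; throws NoSuchMethodException if absent.
    val method = functions.getClass.getMethod(name, classOf[Column])
    new DynCol(method.invoke(functions, c).asInstanceOf[Column])
  }
}

object DynExample {
  // trim and lower are resolved by name at runtime; .c unwraps the resulting Column.
  val dynamic: Column = new DynCol(col("blah")).trim.lower.c
}
```

The trade-off is the one raised earlier in the thread: very little code, but it relies on reflection and gives up compile-time checking of function names.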

@eclosson Thanks for info about Dynamic, I will check it out.

@MrPowers did you have a chance to review my suggestion above?