facebookarchive/hive-io-experimental

Type check for HiveIO

greg1github opened this issue · 5 comments

Let's add a check to verify that when we call getLong(i) on a HiveReadableRecord, the type of column i is bigint (and not any other type, maybe int). We can add similar checks for other types. We can also add getLongDoubleMap(i) and a few more methods corresponding to typical Hive cell formats that return typed values. Then the caller does not have to cast (and suppress warnings), and the methods will internally check that the type of the column matches the signature of the get method.
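A minimal sketch of the kind of check being proposed, not the actual HiveIO code. ColumnType, CheckedRecord, and requireType are hypothetical names made up for illustration; the real record and schema classes may look quite different. Whether getLong should also accept an int column is left open here, so the sketch takes the strict reading and rejects it.

```java
import java.util.List;

// Hypothetical column types; the real HiveIO schema classes may differ.
enum ColumnType { BIGINT, INT, DOUBLE, STRING }

// Hypothetical stand-in for a record that knows the declared type of each column.
class CheckedRecord {
    private final List<ColumnType> schema;
    private final Object[] values;

    CheckedRecord(List<ColumnType> schema, Object... values) {
        if (schema.size() != values.length) {
            throw new IllegalArgumentException("Schema and value counts differ");
        }
        this.schema = schema;
        this.values = values;
    }

    // Fail loudly instead of silently returning a default such as 0.
    private void requireType(int i, ColumnType expected) {
        ColumnType actual = schema.get(i);
        if (actual != expected) {
            throw new IllegalArgumentException(
                "Column " + i + " is " + actual + ", not " + expected);
        }
    }

    long getLong(int i) {
        requireType(i, ColumnType.BIGINT);
        return (Long) values[i]; // a null cell would throw here; null handling is omitted
    }

    String getString(int i) {
        requireType(i, ColumnType.STRING);
        return (String) values[i];
    }

    void setLong(int i, long value) {
        requireType(i, ColumnType.BIGINT);
        values[i] = value;
    }
}
```

With a check like this, calling getLong on a string column throws at the call site rather than producing a zero that surfaces much later in the computation.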

The problem is that yesterday I accidentally fed a String column to a reader that was getting a Long, and the returned long was always zero, which subverted my computation. Debugging this kind of case is hard, especially since the user (me) expects the framework to be "typed."

To be clear, you want the type check in both the get() and set() methods, I presume? That is, preventing you from feeding a String into a Long column, and making sure that when you call getLong() the underlying column is of BIGINT type.

Yes, both input and output.

I would consider removing the get/set Object methods, so as to force the user into type-safe programming.

How do you remove get()/set() yet still support retrieving an arbitrary object? If anything, we can replace them with just getMap() and getList(), as I'm pretty sure those are the only other types that will be in there. Getting the full data type, though, will require some generics magic. Do you have any ideas?
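One possible shape for the "generics magic", sketched under assumptions: pass class tokens for the key and value types and verify each entry at runtime. TypedAccess and this getMap signature are hypothetical, not part of HiveIO, and the approach only covers one level of nesting (a map of lists would need something richer than a plain Class token).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; in a real record this would take a column index instead of a raw cell.
class TypedAccess {
    // Checks every entry against the class tokens and returns a typed copy.
    static <K, V> Map<K, V> getMap(Object cell, Class<K> keyType, Class<V> valueType) {
        if (!(cell instanceof Map)) {
            throw new ClassCastException("Cell is not a map: " + cell);
        }
        Map<K, V> typed = new LinkedHashMap<>();
        for (Map.Entry<?, ?> e : ((Map<?, ?>) cell).entrySet()) {
            // Class.cast throws ClassCastException if an entry has the wrong type.
            typed.put(keyType.cast(e.getKey()), valueType.cast(e.getValue()));
        }
        return typed;
    }
}
```

A caller would write something like `Map<Long, Double> m = TypedAccess.getMap(cell, Long.class, Double.class);` with no cast and no suppressed warning. The cost is a copy and a per-entry check on every read.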

How about we list the types for get/set we have seen so far first. Then we can decide if get/set Object is needed.