spark-root/laurelin

Implement AsStrings

jpivarski opened this issue · 3 comments

This one is important. Do this before AsDouble32, AsSTLBitSet, etc.

The string type that Spark wants is org.apache.spark.unsafe.types.UTF8String. It can be built directly from an array of bytes, if you want to skip making a Java String first: https://github.com/apache/spark/blob/master/common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java#L110

This is the quandary. For ease of use in Spark, we'd like it to be some canonical string object, but ROOT doesn't specify an encoding. Bytestrings would be appropriate, but bytestrings in Java are even less pleasant than they are in Python 3: who wants to unpack a byte[]?
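
To make the trade-off concrete, here is a rough sketch of what happens to raw bytes under two possible choices of encoding. It uses only the JDK, not Spark or Laurelin, and the class and method names (RootStringDecoding, decodeUtf8, decodeLatin1, roundTripsAsUtf8) are made up for illustration. The sample bytes are invented too, not read from a real ROOT file. The assumption is that a string branch hands us a plain byte[] with no declared charset.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Laurelin or Spark.
public class RootStringDecoding {

    // Decode as UTF-8. Malformed sequences are silently replaced with U+FFFD,
    // so the original bytes may not be recoverable afterwards.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Decode as ISO-8859-1. Every byte maps to exactly one char, so this is
    // lossless, but non-ASCII text that was really UTF-8 will look wrong.
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // True if decoding as UTF-8 and encoding again gives back the same bytes.
    static boolean roundTripsAsUtf8(byte[] raw) {
        byte[] again = decodeUtf8(raw).getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(raw, again);
    }

    public static void main(String[] args) {
        // Pure ASCII: both decodings agree.
        byte[] ascii = {0x6d, 0x75, 0x6f, 0x6e}; // "muon"
        // Invented example: 0xE9 is a lone Latin-1 byte, which is malformed UTF-8.
        byte[] ambiguous = {0x6d, 0x75, (byte) 0xe9};

        System.out.println(decodeUtf8(ascii).equals(decodeLatin1(ascii)));
        System.out.println(roundTripsAsUtf8(ascii));
        System.out.println(roundTripsAsUtf8(ambiguous));
    }
}
```

If the bytes are handed to Spark without going through a Java String at all (UTF8String has a static fromBytes(byte[]) for that), nothing gets replaced at read time, but anything downstream that treats the value as UTF-8 still has to cope with bytes that may not be valid UTF-8.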