tweag/inline-java

Updates for text 2.x package

Closed this issue · 5 comments

The jni and jvm packages assume that the text package represents Text internally as UTF-16. That assumption no longer holds as of text 2.x, which uses UTF-8 internally.

https://github.com/tweag/inline-java/blob/master/jni/src/common/Foreign/JNI/Unsafe/Internal/Introspection.hs#L104
https://github.com/tweag/inline-java/blob/master/jvm/src/common/Language/Java/Unsafe.hs#L753
https://github.com/tweag/inline-java/blob/master/jvm/src/common/Language/Java/Unsafe.hs#L759

For strings from Java we can't simply cast a pointer anymore; they need to be properly encoded and decoded.

jni:

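import Control.Exception (bracket)
import qualified Data.ByteString as BS
import qualified Data.Text as Text
import qualified Data.Text.Encoding as Text
import Foreign.Ptr (castPtr)

-- Copy the JVM's UTF-16 code units into a ByteString and decode them,
-- replacing invalid input with '?'. Note: decodeUtf16LE assumes a
-- little-endian host, since JNI hands out jchars in native byte order.
toText :: JString -> IO Text.Text
toText obj = bracket
  (getStringChars obj)
  (releaseStringChars obj) $
  \cs -> do
    -- getStringLength counts 16-bit code units, so the byte length is twice that.
    sz <- fromIntegral <$> getStringLength obj
    Text.decodeUtf16LEWith (\_ _ -> Just '?') <$> BS.packCStringLen (castPtr cs, sz * 2)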

jvm:

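import qualified Data.ByteString as BS
import qualified Data.Text.Encoding as Text
import Foreign (Ptr, Storable, withForeignPtr, castPtr)

instance Reify Text where
  reify jobj = do
      -- Length in 16-bit code units; the buffer is twice that many bytes.
      sz <- getStringLength jobj
      cs <- getStringChars jobj
      -- packCStringLen copies the bytes, so releasing cs afterwards is safe.
      txt <- Text.decodeUtf16LEWith (\_ _ -> Just '?') <$> BS.packCStringLen (castPtr cs, fromIntegral sz * 2)
      releaseStringChars jobj cs
      return txt

instance Reflect Text where
  reflect x =
      -- Encode to UTF-16LE and pass the length in code units, not bytes.
      BS.useAsCStringLen (Text.encodeUtf16LE x) $ \(ptr, len) ->
        newString (castPtr ptr) (fromIntegral len `div` 2)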
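
For what it's worth, the conversion logic can be checked without a JVM. Below is a minimal sketch that uses only the text and bytestring packages. The names toUtf16Units, fromUtf16Bytes and roundTrips are made up for illustration and are not part of jni or jvm.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as Text
import qualified Data.Text.Encoding as Text

-- | Hypothetical helper mirroring the Reflect instance: UTF-16LE bytes plus
-- the number of 16-bit code units, which is the length newString is given.
toUtf16Units :: Text.Text -> (BS.ByteString, Int)
toUtf16Units t = (bytes, BS.length bytes `div` 2)
  where
    bytes = Text.encodeUtf16LE t

-- | Hypothetical helper mirroring the Reify instance: decode UTF-16LE bytes,
-- substituting '?' for anything that is not valid UTF-16.
fromUtf16Bytes :: BS.ByteString -> Text.Text
fromUtf16Bytes = Text.decodeUtf16LEWith (\_ _ -> Just '?')

-- | True when a Text survives the encode/decode round trip unchanged.
roundTrips :: Text.Text -> Bool
roundTrips t = fromUtf16Bytes (fst (toUtf16Units t)) == t
```

In GHCi, roundTrips (Text.pack "h\233llo") should evaluate to True, and snd (toUtf16Units (Text.pack "\x1D54F")) to 2, because a character outside the Basic Multilingual Plane takes a surrogate pair. That is why the length handed to newString is the byte count divided by two rather than the number of characters.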

Note that compiling with a recent Template Haskell also needs a version of distributed-closure that isn't on Hackage:

In cabal.project:

source-repository-package
    type: git
    location: https://github.com/tweag/distributed-closure.git
    tag: 0eaace06ad1e9d80d13287b4e3b1e03f314082ed

Thanks for the report @user16332!

Just for my info, do you intend to submit a PR?

> Note that compiling with a recent Template Haskell also needs a version of distributed-closure that isn't on Hackage:

I just uploaded these changes in a new release of distributed-closure (0.5.0.0).