Provide decoding directly from InputStream (and probably Reader) for JVM
gnp opened this issue · 0 comments
There should be a `decodeInputStream` method on `JsonDecoder` for JVM. I'll provide some notes on my use case (and performance) and how I wrote an equivalent in user-space.
I have code that obtains an InputStream from an object stored in a Zip file. I want to decode that JSON.
Since there was no API in ZIO JSON for reading from an `InputStream`, I first had to build a `String` from the `InputStream` and then parse the JSON (to JSON AST). That took about 230 +/- 3 ms for my use case. I did that with `new String(is.readAllBytes, StandardCharsets.UTF_8).fromJson[Json]`.
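For reference, the `String`-based approach can be sketched as a small helper. The object and method names (`StringDecodeSketch`, `decodeViaString`) are made up for illustration; only `fromJson` and `Json` come from ZIO JSON, and `readAllBytes` assumes Java 9 or later.

```scala
import java.io.InputStream
import java.nio.charset.StandardCharsets
import zio.json._
import zio.json.ast.Json

object StringDecodeSketch {
  // Hypothetical helper: reads the entire stream into memory as a String,
  // then parses that String into the JSON AST.
  def decodeViaString(is: InputStream): Either[String, Json] =
    new String(is.readAllBytes(), StandardCharsets.UTF_8).fromJson[Json]
}
```

The whole input is held in memory twice here (once as bytes, once as a `String`), which is the cost the rest of this issue tries to avoid.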
It seemed I should not have to put the whole input in memory though, so with a pointer from @erikvanoosten on Discord, I made a version that did this: `JsonDecoder[Json].decodeJsonStreamInput(ZStream.fromInputStream(is), StandardCharsets.UTF_8)`. But that took 955 +/- 12 ms, roughly four times slower than the `String` version. Upon investigation, I discovered that the implementation of `decodeJsonStreamInput` takes my `ZStream`, converts it back to an `InputStream`, and wraps that in a `Reader`.
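A minimal sketch of the `ZStream`-based variant, including one way to run it synchronously for a quick experiment. The names `StreamDecodeSketch`, `decodeViaZStream` and `runSample` are hypothetical, and the code assumes ZIO 2.x with a matching ZIO JSON release.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import java.nio.charset.StandardCharsets
import zio.{Runtime, Unsafe, ZIO}
import zio.json.JsonDecoder
import zio.json.ast.Json
import zio.stream.ZStream

object StreamDecodeSketch {
  // Wraps the InputStream in a ZStream of bytes and hands it to the decoder.
  def decodeViaZStream(is: InputStream): ZIO[Any, Throwable, Json] =
    JsonDecoder[Json].decodeJsonStreamInput(ZStream.fromInputStream(is), StandardCharsets.UTF_8)

  // Hypothetical driver: decodes an in-memory document on the default runtime.
  def runSample(json: String): Json = {
    val is = new ByteArrayInputStream(json.getBytes(StandardCharsets.UTF_8))
    Unsafe.unsafe { implicit unsafe =>
      Runtime.default.unsafe.run(decodeViaZStream(is)).getOrThrowFiberFailure()
    }
  }
}
```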
So, I took what I learned from the above and built a user-level solution out of things I found and copied from the underlying private implementation details in ZIO JSON. For my use case this ran in about 330 +/- 6 ms. Much better than using `decodeJsonStreamInput`, though admittedly still materially slower than just building and parsing the `String`! Here is the implementation I'm using outside ZIO JSON, in user code:
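import java.io.{BufferedReader, InputStream, InputStreamReader, Reader}
import java.nio.charset.{Charset, StandardCharsets}
import zio.ZIO
import zio.json.{JsonDecoder, JsonError}
import zio.json.internal.WithRetractReader

final def decodeInputStream[R, A](
  decoder: JsonDecoder[A],
  is: InputStream,
  charset: Charset = StandardCharsets.UTF_8,
  bufferSize: Int = 8192 // Taken from BufferedInputStream.DEFAULT_BUFFER_SIZE
): ZIO[R, Throwable, A] = {
  // Local copy of the library's private class. Note that this copy is a
  // separate type from the library's class of the same name.
  final class UnexpectedEnd
      extends Exception(
        "if you see this a dev made a mistake using OneCharReader"
      )
      with scala.util.control.NoStackTrace

  // Runs the blocking decode and rethrows decoder failures as plain exceptions.
  def readAll(reader: Reader): ZIO[Any, Throwable, A] =
    ZIO.attemptBlocking {
      try decoder.unsafeDecode(Nil, new WithRetractReader(reader))
      catch {
        case JsonDecoder.UnsafeJson(trace) => throw new Exception(JsonError.render(trace))
        case _: UnexpectedEnd              => throw new Exception("unexpected end of input")
      }
    }

  // The scope closes the reader, and with it the underlying InputStream, once decoding finishes.
  ZIO.scoped[R] {
    ZIO
      .fromAutoCloseable(
        ZIO.succeed(new BufferedReader(new InputStreamReader(is, charset), bufferSize))
      )
      .flatMap(readAll)
  }
}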
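To illustrate the Zip use case described at the top, here is a rough sketch of obtaining an entry's `InputStream` inside a ZIO scope and handing it to a decoding function. `ZipEntrySketch`, `withZipEntry` and its parameters are hypothetical names, not part of ZIO JSON; the `use` function stands in for something like the user-level `decodeInputStream`.

```scala
import java.io.{FileNotFoundException, InputStream}
import java.util.zip.ZipFile
import zio.ZIO

object ZipEntrySketch {
  // Hypothetical helper: opens the zip file and the named entry, passes the
  // entry's stream to `use`, and closes both resources when the scope ends.
  def withZipEntry[A](zipPath: String, entryName: String)(
    use: InputStream => ZIO[Any, Throwable, A]
  ): ZIO[Any, Throwable, A] =
    ZIO.scoped[Any] {
      for {
        zip   <- ZIO.fromAutoCloseable(ZIO.attemptBlocking(new ZipFile(zipPath)))
        // getEntry returns null when the entry does not exist.
        entry <- ZIO
                   .fromOption(Option(zip.getEntry(entryName)))
                   .orElseFail(new FileNotFoundException(s"$entryName not found in $zipPath"))
        is    <- ZIO.fromAutoCloseable(ZIO.attemptBlocking(zip.getInputStream(entry)))
        a     <- use(is)
      } yield a
    }
}
```

A call might then look like `ZipEntrySketch.withZipEntry("data.zip", "doc.json")(is => decodeInputStream(JsonDecoder[Json], is))`, where the file and entry names are placeholders. The decoding effect must run inside the scope, since the stream is closed as soon as the scope exits.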
Notes:
- `UnexpectedEnd` is a copy of the private class in package `zio.json.internal` (from readers.scala).
- `readAll` is a copy of the private method of that name from the JVM `JsonDecoderPlatformSpecific`.
- I experimented with putting buffering just on the `InputStream` level, just on the `Reader` level, or both. For my (single) test case, best performance was with buffering just at the `Reader` level as shown here. Javadoc for `InputStreamReader` makes this same recommendation "for top efficiency" (though initially I was expecting it to be better done at the `InputStream` level).
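The three buffering arrangements mentioned in the last note can be written out with plain JDK classes. This is only an illustration of the variants, not a benchmark; the object and method names are made up for this sketch.

```scala
import java.io.{BufferedInputStream, BufferedReader, InputStream, InputStreamReader, Reader}
import java.nio.charset.Charset

object BufferingSketch {
  // Buffering only at the byte level, below the charset decoder.
  def bufferedAtStream(is: InputStream, cs: Charset, size: Int): Reader =
    new InputStreamReader(new BufferedInputStream(is, size), cs)

  // Buffering only at the character level (the arrangement used above).
  def bufferedAtReader(is: InputStream, cs: Charset, size: Int): Reader =
    new BufferedReader(new InputStreamReader(is, cs), size)

  // Buffering at both levels.
  def bufferedAtBoth(is: InputStream, cs: Charset, size: Int): Reader =
    new BufferedReader(new InputStreamReader(new BufferedInputStream(is, size), cs), size)

  // Reads a Reader to the end and closes it; handy for comparing the variants.
  def drain(reader: Reader): String = {
    val sb  = new java.lang.StringBuilder
    val buf = new Array[Char](1024)
    var n   = reader.read(buf)
    while (n != -1) {
      sb.append(buf, 0, n)
      n = reader.read(buf)
    }
    reader.close()
    sb.toString
  }
}
```

All three produce the same characters; they differ only in where the copying and batching happen, so any performance difference has to be measured on the actual workload.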