evolution-gaming/kryo-macros

OOM on some BigDecimal inputs during serialization

migesok opened this issue · 5 comments

For a serializer generated for a case class which contains a BigDecimal field, if a number with sufficiently big scale supplied (like BigDecimal("3E1000000000")), serialization logic blows in memory causing OOM.
This could be a possible source for DoS attacks.

It seems that using of toPlainString wasn't the best idea:

q"output.writeString($arg.underlying.toPlainString)"

But even in case of using of toString users can be affected after parsing of such values. Just try to pass the parsed value of that big decimal number to the following function and see what will happen:

scala> val f = (x: BigDecimal) => x + 1
f: BigDecimal => scala.math.BigDecimal = $$Lambda$1210/93981118@790ac3e0

So, we should avoid returning of parsed BigDecimal with too big exponent or with MathContext.UNLIMITED.

As example to mitigate possible DoS attacks jsoniter-scala uses safe defaults that limit a scale value and number of significant digits. Also, it sets MathContext.DECIMAL128 by default:

https://github.com/plokhotnyuk/jsoniter-scala/blob/41573be3df9df4a6f9d29c8b5931cec0c951c7ce/jsoniter-scala-macros/src/main/scala/com/github/plokhotnyuk/jsoniter_scala/macros/JsonCodecMaker.scala#L65-L67

@plokhotnyuk I think safe or unsafe to use is out of scope for a serializer. I believe any valid value of any supported type should be reversely serializable without loosing data, that is:

deserialize(serialize a) shouldEqual a

Using toString solves this issue, also the change is perfectly format-backward-compatible.

@migesok so will there be a pull request ? :)

@t3hnar what about changing the representation from textual to binary?

As example, will it be acceptable if BigDecimal values be serialized as byte arrays of mantissas followed by int of scale?

In any case, the value of MathContext and a limit for scale can be configured by some system property or by introducing some configuration parameters for the macros, isn't it?

@plokhotnyuk why not? it is a binary format at the end.