travisbrown/dhallj

Type checker is not stack-safe for (extremely) deep records

travisbrown opened this issue · 0 comments

Similar to #2, although the cause is different, and the point at which it becomes a problem is much deeper.

The type checker will happily give you a type for a record many thousands of nestings deep (although 20k layers takes about a minute on my laptop, so it's not fast):

scala> import org.dhallj.core.Expr
import org.dhallj.core.Expr

scala> val deepRecord = (0 to 20000).foldLeft(Expr.Constants.TRUE) {
     |   case (expr, _) => Expr.makeRecordLiteral("foo", expr)
     | }
deepRecord: org.dhallj.core.Expr = {foo = {foo = {foo = {foo = {foo = {foo = ...

scala> deepRecord.typeCheck
res0: org.dhallj.core.Expr = {foo : {foo : {foo : {foo : {foo : {foo : {foo : ...

All operations except type checking work for arbitrarily deep values—e.g. normalizing or hashing a record 100k layers deep takes seconds:

scala> val deeperRecord = (0 to 100000).foldLeft(Expr.Constants.TRUE) {
     |   case (expr, _) => Expr.makeRecordLiteral("foo", expr)
     | }
deeperRecord: org.dhallj.core.Expr = {foo = {foo = {foo = {foo = {foo = {foo = ...

scala> deeperRecord.normalize
res2: org.dhallj.core.Expr = {foo = {foo = {foo = {foo = {foo = {foo = {foo = {foo = ...

scala> deeperRecord.hash
res3: String = 8a8477b86e27cd48496db13bbd71bb9845c700cb88b9a8bfacd2391541ff38cc

(I'm guessing dhall would agree on this hash, but it's been running for five minutes and I've not heard back from it yet.)

Type checking blows up somewhere between 20k and 100k:

scala> deeperRecord.typeCheck
java.lang.StackOverflowError
  at org.dhallj.core.typechecking.TypeCheck.onRecord(TypeCheck.java:404)
  at org.dhallj.core.typechecking.TypeCheck.onRecord(TypeCheck.java:27)
  at org.dhallj.core.Constructors$RecordLiteral.accept(constructors-gen.java:249)
  at org.dhallj.core.typechecking.TypeCheck.onRecord(TypeCheck.java:408)
  at org.dhallj.core.typechecking.TypeCheck.onRecord(TypeCheck.java:27)
  at org.dhallj.core.Constructors$RecordLiteral.accept(constructors-gen.java:249)
  ...

This is because the type checker is currently implemented as an external visitor, where the visitor drives the recursion manually, while all of the other core operations are implemented as internal visitors, where the driver manages the recursion and maintains its own stack.

20k layers should be enough for any record, so I'm not letting this block the 0.1.0 release, but I'm planning to rework the type checker as an internal visitor soon, anyway, and I'm opening this issue to track the problem.