typelead/eta-hackage

Handle string literals > 65535 bytes long

rahulmutt opened this issue · 1 comment

While it's hard to exceed the 65,535-byte limit for string literals with hand-written input, lexers and parsers generated by alex & happy emit their state tables as one giant primitive string literal, so a sufficiently large grammar may get close to this limit. For now, the compiler emits a warning when it compiles a string literal that occupies more than 65,000 bytes.

This problem can be fixed in general:
1.) Add a variant of the eta.runtime.io.MemoryManager.loadString method that takes an array of strings and copies them sequentially into an allocated byte buffer whose size is the sum of all the string sizes.
2.) Modify cgLit (MachStr _) in ETA.CodeGen.Utils to check the size of the ByteString. If it exceeds 65,000 bytes (say), split it into chunks of 65,000 bytes and generate multiple sconst instructions whose results are stored in an array and passed to the loadString variant added in (1).
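
A rough sketch of both steps in plain Java, to make the idea concrete. Everything named here is hypothetical: the class `ChunkedLiteral`, the methods `split` and `loadStrings`, and the use of a direct `ByteBuffer` are illustrative stand-ins, not the actual MemoryManager API. It also assumes each byte of the literal is stored as one char (Latin-1 style), which may not match how the compiler really encodes literals.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the Eta runtime.
final class ChunkedLiteral {
    // Chunk size taken from the 65,000 figure suggested above.
    static final int CHUNK_SIZE = 65000;

    // Step 2 (compile-time side): split the literal's bytes into chunks of
    // at most chunkSize bytes, one String constant per chunk.
    // Assumption: one char per byte, via ISO-8859-1.
    static String[] split(byte[] literal, int chunkSize) {
        List<String> chunks = new ArrayList<>();
        for (int off = 0; off < literal.length; off += chunkSize) {
            int len = Math.min(chunkSize, literal.length - off);
            chunks.add(new String(literal, off, len, StandardCharsets.ISO_8859_1));
        }
        return chunks.toArray(new String[0]);
    }

    // Step 1 (runtime side): allocate one buffer sized to the sum of all
    // chunk lengths and copy the chunks into it in order.
    static ByteBuffer loadStrings(String[] chunks) {
        int total = 0;
        for (String chunk : chunks) {
            total += chunk.length();
        }
        ByteBuffer buf = ByteBuffer.allocateDirect(total);
        for (String chunk : chunks) {
            buf.put(chunk.getBytes(StandardCharsets.ISO_8859_1));
        }
        buf.flip(); // make the buffer readable from position 0
        return buf;
    }
}
```

One caveat with this sketch: class files store string constants in modified UTF-8, where the zero byte and any char above 0x7F take two bytes, so a 65,000-char chunk of arbitrary binary data could still exceed 65,535 encoded bytes. A real fix would probably need to bound the encoded size of each chunk rather than its char count.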

Oops, wrong issue.