adam-mcdaniel/oakc

Improved error reporting

Opened this issue · 2 comments

I have been doing a bit of research on how to improve error reporting by capturing information about the original source file. LALRPOP does not support anything out of the box except for two pseudo-symbols, @L and @R, 'documented' here.

I took a look at how other programming languages handle this, and codespan seems to be a fairly mature solution. A more barebones solution like codemap might be easier to prototype with.

This would not only improve error reporting by showing which part of the source code contains the error, but would also set up a next step like sourcemaps. I think the JavaScript target is a great environment to test sourcemaps since it has great sourcemap support (at least in Chrome). It might even be possible to have the debugger work straight away if the sourcemaps are implemented correctly.

In my attempt to set up an MVP I ran into a few problems:

  • #68 needs to be resolved before a solid implementation can be included.
  • The first file is compiled in bin.rs, where the file name and contents are available. The rest of the files are only accessible inside tir/hir. This would make it a bit awkward to pass the metadata structure around.
  • It is a bit unclear who should own this data. Currently a new parser is created in lib.rs with parser::ProgramParser::new(). Maybe this data structure should be owned by the parser, and the instance should be kept and passed to any include statements, so that one parser owns the complete metadata of an Oak program?
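To make the ownership question a bit more concrete, here is a minimal sketch of a single registry that owns the name and contents of every file in a program. Whoever ends up owning it (the parser or the caller in bin.rs) would pass it along when an include is resolved. The names SourceFiles and FileId are hypothetical and do not exist in oakc; this is only one possible shape, not a proposal for the final API.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical handle identifying one loaded source file.
/// It is small and Copy, so AST/IR nodes can carry it cheaply.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FileId(usize);

/// Hypothetical registry that owns the path and contents of every file
/// that makes up a program (the entry file plus anything it includes).
#[derive(Debug, Default)]
pub struct SourceFiles {
    files: Vec<(PathBuf, String)>,
}

impl SourceFiles {
    pub fn new() -> Self {
        Self::default()
    }

    /// Register a file and return the id that later stages can store.
    pub fn add(&mut self, path: impl Into<PathBuf>, contents: impl Into<String>) -> FileId {
        self.files.push((path.into(), contents.into()));
        FileId(self.files.len() - 1)
    }

    /// Path of a previously registered file.
    pub fn path(&self, id: FileId) -> &Path {
        &self.files[id.0].0
    }

    /// Full contents of a previously registered file.
    pub fn contents(&self, id: FileId) -> &str {
        &self.files[id.0].1
    }
}
```

With something like this, bin.rs would register the entry file, and each include statement handled in tir/hir would call add on the same instance, so every later stage can look a file up by its id.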

Issue #68 has been fixed by PR #74, so now this might be something we could start working on.

The main problem is that 99% of the typechecking errors are in MIR, and by this time, most of the code has been twisted and manipulated into code that's different from the user's. Would codespan make it possible to do better error messages despite this?

Well, as far as I know we have to capture which tokens each AST node was created from and then pass that information down correctly. The @L and @R in LALRPOP are indexes into a &str, so those can be captured to form some kind of Token or MetaData structure. Those indexes plus a file name would be enough metadata to map a node back to the original source code, no matter how far down the IR you go. At least that is how I understand it should work.
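As a rough illustration of that idea, the sketch below stores a pair of byte offsets (the kind of values @L and @R yield) together with a file name, and maps them back to a line and column in the original text. Span, line_col and render are hypothetical names, not part of oakc, LALRPOP or codespan, and the code assumes the offsets are valid byte indexes on character boundaries of the source string.

```rust
/// Hypothetical span: start/end byte offsets into the source text,
/// plus the name of the file those offsets index into.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Span {
    pub file: String,
    pub start: usize,
    pub end: usize,
}

/// Convert a byte offset into a 1-based (line, column) pair.
/// Columns count characters, not bytes.
/// Assumes `offset` lies on a char boundary within `source`.
pub fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let before = &source[..offset];
    let line = before.matches('\n').count() + 1;
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let col = before[line_start..].chars().count() + 1;
    (line, col)
}

/// Render a small diagnostic: location, message, the offending source
/// line, and a caret underline beneath the spanned text.
pub fn render(source: &str, span: &Span, message: &str) -> String {
    let (line, col) = line_col(source, span.start);
    let text = source.lines().nth(line - 1).unwrap_or("");
    // Underline only up to the end of the first line of the span.
    let width = source[span.start..span.end]
        .chars()
        .take_while(|c| *c != '\n')
        .count()
        .max(1);
    format!(
        "{}:{}:{}: {}\n{}\n{}{}",
        span.file,
        line,
        col,
        message,
        text,
        " ".repeat(col - 1),
        "^".repeat(width)
    )
}
```

Because the span only holds offsets and a file name, it can be copied onto every node that is derived from the original one while lowering through the IRs. An error raised in MIR could then still be rendered against the text the user actually wrote, as long as each transformation keeps the span of the node it started from.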