calyxir/calyx

Report multiple errors

Opened this issue · 5 comments

At the moment, we only report the first error found in a file. This is the default way that error handling works in Rust because of how ? works. For the language server it would be super neat if we could report all the errors found in a file, so that we could display them all at once.

I'm not exactly sure what the best way to implement this is. I don't know if this is possible while using the ? operator. Something to look into.

So near as I can tell, this is one of those things that seems like it should be easy and is actually quite hard. I found a few blog posts that are kinda about the topic but haven't had a chance to read them yet so I won't link them here, but everything I've seen suggests a different general "shape" for programs that want to do this type of error collecting.

Instead of having the return type be Result<Program, Error> it needs to be more like (Result<Program, ()>, Vec<ErrorStuff>) and making the latter is not something that is easily bolted on. Or at least so sayeth: https://users.rust-lang.org/t/accumulating-multiple-errors-error-products/93730/4

The biggest problem of course is that this requires the program to continue an execution partially under a lot of circumstances which is not what ? is built to do, so you would need to unwrap and "vec-ify" a bunch of errors and then bubble that further up the stack by hand

Filament used to do this by passing a mutable DiagnosticsReporter responsible for accumulating errors and reporting them after a pass has completed. We can do something similar but have to figure out what to do in cases when we currently return Err(..) because we might need a recovery mechanism instead.

Another option is librarize-ing the analysis done by papercut and well-formed so we can write a different LSP analysis tool instead of relying on the passes to perform the checking for us.

Thanks for the thought, @sgpthomas! Yes, this would be a cool thing to do now that Calyx is kinda grown up.

Yep, @EclecticGriffin is right that there is no real shortcut here. Basically, grown-up compilers do what @rachitnigam alluded to: there is a "reporter" thing that gets passed around mutably one way or another, and specific passes/checks call a method on that to accumulate reports.

@rachitnigam, you make a good point that we could consider avoiding disturbing the compiler too much by doing what you say. Philosophically, this would amount to a recognition that the papercut pass and well-formedness pass are different from all other passes; other passes do not wish to report interesting human-visible diagnostics and would prefer to just abort on first error for efficiency reasons. So with that in mind, it does not make sense to change all the passes in the world (and the top-level compiler entry point) to support a use case that is actually much more narrow than the compiler as a whole.

Before I go to much further with my implementation, I'm curious to gather thoughts on what I've done so far. Work is in this PR: #1950.

I've introduced a DiagnosticContext which is a simple struct that supports mutably gathering results. Passes that are interested in reporting multiple errors, can include this in their pass struct like so. And then, instead of return Err(...), they can use self.diag.err(..) to report an error. Refactoring passes to use this has the side-effect that they no longer produce an Err ever. To handle this for now, I've introduce register_diagnostic into the PassManager which pulls out the first Error of the pass manager (if any), making it act as a normal single-error pass. Eventually, we can turn this into reporting multiple errors, but I'll leave that for a future PR.

This makes "multiple error reporting" an opt-in thing for passes, and doesn't require any changes for existing passes. Does this seem like a reasonable approach?

@sgpthomas I really like this approach! One tricky thing is making sure that the error reporting code does not assume that bail outs will occur. For example, if a particular program is malformed in some way, it might cause a cascade of errors that all boil down to the same error location.

Also, we should just take advantage of this in the normal compiler flow. No need to just report just the first error.