verona Error handling, the next level C++
This is a long story, not a quick-fix issue. It's an overall design "document".
In #168 / #177 we introduced LLVM's Error and Expected pattern for error handling. This works well in the MLIR generator, but we still have a few problems:
First, only the MLIR generator is using it, while the Parser is using its own handler. This makes it harder to collate errors into a single hierarchical structure.
Second, the error messages are baked in (in English) inside the generating code. This makes it harder to display errors in different settings (console, pop-ups, status bars).
Finally, the errors are final and can't be composed or returned as a list. This makes it harder to do partial compilation or to ignore errors in order to get as many as possible to show the user.
These three problems are important to get right early on, because fixing them gets harder the more code we have. They're also an important part of our goals for the Verona compiler (partial compilation, helper tools, etc).
To fix the first problem, we just need to take a decision on which error handling we'll use for the project as a whole, and then propagate that to all interacting sub-projects. For now, that's mostly the AST parser and MLIR generator. But we need to take a decision that makes sense for both projects, and all future integrations later (driver, LLVM handler).
To fix the second problem, the common and simple approach is to make ParsingError and RuntimeError base classes for derived behaviour, with constructors that don't propagate the meaning as text, but as which derived class we use. The front-ends can then choose how to report them, using the internal structure to populate the error messages.
For example:
struct IncompatibleType : public ParsingError {
  AST type1, type2;
  ...
};
llvm::make_error<IncompatibleType>(type1, type2, loc);
The front-end would receive that object and use type1, type2 and loc directly instead of printing an already rendered string.
We'd probably need some object-to-text conversion that returns some UTF-8 string, but since those will match the source code, we don't need to worry about internationalisation in the error objects themselves.
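A minimal standalone sketch of that idea, written without the LLVM dependency so it compiles on its own. Everything except the names ParsingError and IncompatibleType is hypothetical: SourceLoc and TypeName stand in for the real location and AST types, UnknownSymbol is an invented second error class, and renderConsole / renderStatusBar are two imagined front-ends. In the real code these classes would presumably be carried inside llvm::Error rather than used directly.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-ins so the sketch compiles on its own.
struct SourceLoc {
  std::string file;
  int line;
  int column;
};

struct TypeName {
  std::string text;  // spelled as in the source code, assumed UTF-8
};

// Base class: carries only structure (the location), no rendered message.
struct ParsingError {
  SourceLoc loc;
  explicit ParsingError(SourceLoc l) : loc(std::move(l)) {}
  virtual ~ParsingError() = default;
};

// The meaning of the error is encoded by the derived class and its fields.
struct IncompatibleType : ParsingError {
  TypeName type1, type2;
  IncompatibleType(TypeName a, TypeName b, SourceLoc l)
      : ParsingError(std::move(l)), type1(std::move(a)), type2(std::move(b)) {}
};

struct UnknownSymbol : ParsingError {
  std::string name;
  UnknownSymbol(std::string n, SourceLoc l)
      : ParsingError(std::move(l)), name(std::move(n)) {}
};

// Console front-end: full message with file:line:column prefix.
std::string renderConsole(const ParsingError& e) {
  std::string where = e.loc.file + ":" + std::to_string(e.loc.line) + ":" +
                      std::to_string(e.loc.column);
  if (auto* it = dynamic_cast<const IncompatibleType*>(&e))
    return where + ": error: incompatible types '" + it->type1.text +
           "' and '" + it->type2.text + "'";
  if (auto* us = dynamic_cast<const UnknownSymbol*>(&e))
    return where + ": error: unknown symbol '" + us->name + "'";
  return where + ": error";
}

// Status-bar front-end: short summary built from the same error object.
std::string renderStatusBar(const ParsingError& e) {
  std::string line = " (line " + std::to_string(e.loc.line) + ")";
  if (dynamic_cast<const IncompatibleType*>(&e)) return "Type mismatch" + line;
  if (dynamic_cast<const UnknownSymbol*>(&e)) return "Unknown symbol" + line;
  return "Error" + line;
}
```

The same IncompatibleType object produces two different renderings, and any translation of the fixed wording would live in the front-end, not in the code that raised the error. A visitor could replace the dynamic_cast chain if the set of error classes grows.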
The third problem is harder. We'll need to change the whole structure of the code.
The parser should be simple, as it can "skip tokens" until some closing token (e.g. }) is found, and then continue.
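The "skip tokens" recovery could look roughly like the sketch below. The toy grammar (a sequence of name { word ; ... } blocks over pre-split string tokens) and all names (ParseError, ParseResult, parseBlock, parse) are assumptions made for illustration; this is not the Verona parser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ParseError {
  std::size_t tokenIndex;  // where parsing gave up
  std::string unexpected;  // the offending token, or "<eof>"
};

struct ParseResult {
  std::vector<std::string> blocks;  // names of blocks that parsed cleanly
  std::vector<ParseError> errors;   // one entry per failed block
};

bool isWord(const std::string& t) {
  return !t.empty() && t != "{" && t != "}" && t != ";";
}

// Tries to parse one block starting at toks[i]. On success, i is one past the
// closing brace. On failure, i is left at the offending token.
bool parseBlock(const std::vector<std::string>& toks, std::size_t& i,
                std::string& name) {
  if (i >= toks.size() || !isWord(toks[i])) return false;
  name = toks[i++];
  if (i >= toks.size() || toks[i] != "{") return false;
  ++i;
  while (i < toks.size() && toks[i] != "}") {
    if (!isWord(toks[i])) return false;
    ++i;
    if (i >= toks.size() || toks[i] != ";") return false;
    ++i;
  }
  if (i >= toks.size()) return false;  // missing closing brace
  ++i;                                 // consume "}"
  return true;
}

ParseResult parse(const std::vector<std::string>& toks) {
  ParseResult result;
  std::size_t i = 0;
  while (i < toks.size()) {
    std::string name;
    if (parseBlock(toks, i, name)) {
      result.blocks.push_back(name);
      continue;
    }
    result.errors.push_back(
        {i, i < toks.size() ? toks[i] : std::string("<eof>")});
    // Recovery: skip up to and including the next "}", then carry on.
    std::size_t before = i;
    while (i < toks.size() && toks[i] != "}") ++i;
    if (i < toks.size()) ++i;
    assert(i > before || i >= toks.size());  // recovery must make progress
  }
  return result;
}
```

With the tokens a { x ; } b { x y ; } c { z ; }, the sketch reports one error at the token y, skips to the end of b, and still parses c. Nested braces would need a depth counter during the skip, which is left out here.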
The MLIR generator, however, assumes that the AST is well formed, so any error in nested blocks will likely be type errors, or worse, transform errors, and will need to "skip the block", which may not be semantically valid to do at that late point (or easy to report back, with source code location).
The easy way out is to treat functions/classes as isolated constructs and only take one error per top-level construct. But a better approach would be to only add multiple error handling to the areas that make sense (loops, conditionals, lexical blocks). This will add complexity to the generator, so if we can think of an automated way of doing that (some form of dispatcher), that'd make things a lot easier.
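The "one error per top-level construct" option, with a small dispatcher doing the collecting, might be sketched as follows. This is standalone C++17 under stated assumptions: Result<T> is a std::variant standing in for llvm::Expected<T>, and Function, GenError, lowerFunction and lowerEach are hypothetical names, with the string "bad" playing the role of an operation that fails to lower.

```cpp
#include <string>
#include <variant>
#include <vector>

struct GenError {
  std::string construct;  // which top-level construct failed
  std::string reason;
};

// Hypothetical, heavily simplified AST node for a function.
struct Function {
  std::string name;
  std::vector<std::string> body;
};

using Lowered = std::string;  // stand-in for the generated MLIR

// Stand-in for llvm::Expected<T>: either a value or an error.
template <typename T>
using Result = std::variant<T, GenError>;

// Lowers one function and stops at the first error inside it.
Result<Lowered> lowerFunction(const Function& f) {
  std::string out = "func @" + f.name + " {";
  for (const auto& op : f.body) {
    if (op == "bad")  // stand-in for a type or transform error
      return GenError{f.name, "cannot lower '" + op + "'"};
    out += " " + op + ";";
  }
  return out + " }";
}

struct ModuleResult {
  std::vector<Lowered> lowered;
  std::vector<GenError> errors;
};

// Dispatcher: treats every item as isolated, keeps going after a failure,
// and records at most one error per item.
template <typename Item, typename Fn>
ModuleResult lowerEach(const std::vector<Item>& items, Fn lower) {
  ModuleResult result;
  for (const auto& item : items) {
    auto r = lower(item);
    if (auto* err = std::get_if<GenError>(&r))
      result.errors.push_back(*err);
    else
      result.lowered.push_back(std::get<Lowered>(r));
  }
  return result;
}
```

Calling lowerEach(functions, lowerFunction) returns every function that lowered cleanly plus a list of errors, so the generator code itself stays in the "first error wins" style. Extending this to loops, conditionals and lexical blocks would mean applying the same dispatcher at those nesting levels, which is where the extra complexity mentioned above comes in.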
We're writing a new parser which will probably change a lot in the current MLIR layer. While these points are still very relevant, the final solution might be completely different. Closing this for now. We can open a new one for the new pipeline when we get there.