Continue parsing after finding error
NoahTheDuke opened this issue · 2 comments
Is your feature request related to a problem? Please describe.
It's possible to recover from encountering mismatched brackets using `:edamame/expected-delimiter`, but not for other kinds of errors. I would like to be able to recover from other kinds of errors to continue parsing. I encountered this with non-octal numbers padded with zeros (`08` and `09`), but it would be helpful in all other contexts as well (keywords or symbols with multiple `/`, maps with an uneven number of entries, etc).
Describe the solution you'd like
Some mechanism to provide a "fix" and continue to parse. This could be a flag set in the parser when it's first called, or it could be an alternate code flow, or it could even be a side-effecting top-level function to alter the state of all parsers (like `(set! *warn-on-reflection* true)`), or the problems could be accrued in some sort of "broken state" map and returned alongside the correct code. Here are a couple ideas for how to solve this, after thinking about it for 5 seconds:
Idea: Errors could be thrown with the correctly parsed parts and the broken parts attached in some fashion. For example, uneven maps could be `{:edamame/uneven-map {:map {:a 1 :b 2} :leftovers :c}}` and duplicates could be `{:edamame/duplicate-map-entry {:map {:a 1 :b 2} :duplicates [{:a 3}]}}`. This would allow for granularity in how each is tackled.
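To make the first idea concrete, here is a minimal sketch in plain Clojure (JVM). It does not use edamame at all: `parse-map-entries` is a hypothetical stand-in for the parser step, and the `:edamame/uneven-map` payload is the shape proposed above, not something edamame attaches today.

```clojure
;; Hypothetical stand-in for the parser step that builds a map literal.
;; On an odd number of forms it throws, attaching the valid part and the
;; leftover form in the shape proposed in this issue.
(defn parse-map-entries [forms]
  (if (even? (count forms))
    (apply hash-map forms)
    (throw (ex-info "Map literal contains odd forms."
                    {:edamame/uneven-map
                     {:map       (apply hash-map (butlast forms))
                      :leftovers (last forms)}}))))

;; Caller-side recovery: inspect ex-data and decide what to do per error kind.
(defn parse-or-recover [forms]
  (try
    (parse-map-entries forms)
    (catch clojure.lang.ExceptionInfo e
      (if-let [{m :map extra :leftovers} (:edamame/uneven-map (ex-data e))]
        (assoc m extra :splint/missing-value) ; pad the dangling key
        (throw e)))))                         ; unknown error: rethrow

(parse-or-recover [:a 1 :b 2 :c])
;; => {:a 1, :b 2, :c :splint/missing-value}
```

The downside of this shape is that the exception still unwinds the parser, so the caller can repair one form but cannot resume parsing the rest of the input without extra support.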
Idea: Errors in code could be replaced with gensym-like keywords so they can be replaced as desired. For example, `(parse-string-all "(list 1 2 08 {:a 1 :b 2 :c})" {:gather-errors true})` would return `[[(list 1 2 :edamame/error-1 :edamame/error-2)] {:edamame/error-1 {:type :edamame/incorrect-number :string "08"} :edamame/error-2 {:type :edamame/uneven-map :string "{:a 1 :b 2 :c}"}}]`.
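As a rough illustration of how a caller might consume that return value, the sketch below swaps each placeholder keyword for a repaired value using `clojure.walk/postwalk-replace`. The `result` literal is hand-written to match the proposed shape (the `:gather-errors` option does not exist), and `fix` and `resolve-errors` are hypothetical names. It assumes the JVM because of `Long/parseLong`.

```clojure
(require '[clojure.walk :as walk])

;; Hand-written example of the proposed return shape: [forms errors].
(def result
  '[[(list 1 2 :edamame/error-1 :edamame/error-2)]
    {:edamame/error-1 {:type :edamame/incorrect-number :string "08"}
     :edamame/error-2 {:type :edamame/uneven-map :string "{:a 1 :b 2 :c}"}}])

;; Hypothetical per-error repair chosen by the caller.
(defn fix [{:keys [type string]}]
  (case type
    :edamame/incorrect-number (Long/parseLong string) ; "08" read as decimal
    :edamame/uneven-map       :splint/unparseable-map
    :splint/unknown-error))

;; Build a placeholder->replacement map, then substitute throughout the forms.
(defn resolve-errors [[forms errors] fix-fn]
  (let [replacements (into {} (map (fn [[k err]] [k (fix-fn err)])) errors)]
    (walk/postwalk-replace replacements forms)))

(resolve-errors result fix)
;; => [(list 1 2 8 :splint/unparseable-map)]
```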
Idea: Error-fixing functions can be included in the parser options, and the parser throws if a function doesn't return a non-nil value: `(parse-string-all "(list 1 2 08 {:a 1 :b 2 :c})" {:incorrect-number (fn [s] (when (str/starts-with? s "0") (subs s 1))) :uneven-map (fn [entries] (conj entries :splint/missing-value))})` would return `[(list 1 2 8 {:a 1 :b 2 :c :splint/missing-value})]`.
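The "throw unless the fixer returns non-nil" rule could look roughly like this. It is only a sketch in plain Clojure: `fixers` mirrors the option keys from the example above, and `apply-fixer` is a hypothetical helper showing where the parser would consult them, not an edamame function.

```clojure
(require '[clojure.string :as str])

;; Fixer functions keyed by error kind, as they might appear in the options map.
(def fixers
  {:incorrect-number (fn [s]
                       ;; Strip leading zeros but keep at least one digit.
                       (when (str/starts-with? s "0")
                         (str/replace s #"^0+(?=\d)" "")))
   :uneven-map       (fn [entries]
                       (conj entries :splint/missing-value))})

;; Hypothetical hook: call the fixer for `kind` if there is one, and fall back
;; to throwing the original error when it is absent or returns nil.
(defn apply-fixer [fixers kind value original-ex]
  (let [fixed (when-let [f (get fixers kind)]
                (f value))]
    (if (some? fixed)
      fixed
      (throw original-ex))))

(apply-fixer fixers :incorrect-number "08" (ex-info "Invalid number" {}))
;; => "8"
(apply-fixer fixers :uneven-map [:a 1 :b 2 :c] (ex-info "Map literal contains odd forms." {}))
;; => [:a 1 :b 2 :c :splint/missing-value]
```

Because the default path rethrows the original exception, callers that pass no fixers would see the same behavior as before.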
Describe alternatives you've considered
- Do nothing. Can't know what was intended, so must exit immediately.
- Don't fix the parsing state, just delete the offending token and move on.
Additional context
The goal is to be able to analyze a whole file and provide feedback even when it's not exactly correct, because it's still worthwhile to check the rest of the file. It's annoying to only see one broken piece of code at a time, instead of being able to review/fix them all at once.
Lots of ideas and possibilities here. It would help (and save time) if you could make a table of anything that could go wrong during parsing (e.g. unbalanced parens, uneven amount of key/vals in map, duplicate set elements) and how this would be solved on a case by case basis (and/or by configuration).
Excluding all of the feature throws ("Syntax quote not allowed." etc) and unmatched delimiters (those are already handled):
fn | msg | kind |
---|---|---|
read-num | Invalid number | :invalid-char |
parse-string | EOF while reading, expected X to match | :eof |
parse-to-delimiter | EOF while reading, expected X to match | :eof |
read-regex-pattern | Error while parsing regex | :eof |
parse-set | X literal contains duplicate key | :duplicate |
parse-first-matching-condition | Feature should be a keyword | :invalid-type |
parse-first-matching-condition | EOF while reading, expected X | :eof |
read-symbol | Invalid symbol | :invalid-char |
parse-namespaced-map | namespaced map must specify a namespace | :syntax |
parse-sharp | Unexpected EOF | :eof |
parse-sharp | EOF while reading | :eof |
parse-map | Map literal contains odd forms. | :uneven-pairs |
parse-keyword | Invalid token | :invalid-char |
dispatch | EOF while reading | :eof |
- `:eof` is hard to know how to handle. Maybe punt for now for simplicity (shouldn't happen during most usages).
- `:invalid-char` is my catch-all for "used a wrong character for one of the literals": disallowed letters in a number, disallowed characters in a keyword or symbol, etc. Maybe the string so far and the type (`{:type :symbol :string "cool-symbol:"}`) could be passed to a provided function and the valid type must be returned.
- `:duplicate` is pretty easy: if the fixer fn is provided, pass the vector of forms to the function. Let it create a valid object of the required type.
- `:syntax` felt less specific than `:namespaced-map-error`, but it's the only one lol. I don't know how to fix this except to pass the string plus following map to the function and let it fix it. Maybe punt? Most people don't use namespaced maps.
- `:uneven-pairs` can be fixed in the same way as `:duplicate`: pass the vector of forms to the provided function and then assert it returns a valid map.
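For the `:invalid-char` case, the "valid type must be returned" check could be sketched like this in plain Clojure. The names `valid-result?` and `fix-invalid-char` are hypothetical, and the token map is the shape suggested above; none of this exists in edamame.

```clojure
(require '[clojure.string :as str])

;; Which predicate the fixer's return value must satisfy, per literal type.
(def valid-result?
  {:symbol  symbol?
   :keyword keyword?
   :number  number?})

;; Call the user-provided handler with the token, then verify that it produced
;; a value of the type the parser was trying to read.
(defn fix-invalid-char [handler {:keys [type] :as token}]
  (let [result (handler token)
        valid? (get valid-result? type)]
    (if (and valid? (valid? result))
      result
      (throw (ex-info "Fixer returned a value of the wrong type"
                      {:token token :result result})))))

;; Example handler: drop trailing colons and read the rest as a symbol.
(fix-invalid-char (fn [{:keys [string]}]
                    (symbol (str/replace string #":+$" "")))
                  {:type :symbol :string "cool-symbol:"})
;; => cool-symbol
```

The same check-after-fix pattern would cover `:duplicate` and `:uneven-pairs` by swapping in `set?` or `map?` as the predicate.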
What do you think?