Simplify the tokenize_line function
Opened this issue · 2 comments
ShivamSarodia commented
The tokenize_line lexer function is long and becoming difficult to read and maintain. It should be broken up into multiple parts, perhaps as part of a refactor of the lexer as a whole.
eriols commented
I was looking a little bit at this one. It may be a separate issue, but I thought about isolating the preprocessing a bit more, so that it does (1) trigraph replacement with single characters and (2) line splicing/joining before going into (3) tokenization and then (4) macro expansion.
That is, trigraph conversion does not really fit in this function, since it requires looking at three "symbol_kinds" instead of two. So perhaps refactoring it without thinking about trigraphs would only be a "short-term" win?
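To make the idea concrete, here is a rough sketch of steps 1 and 2 as standalone passes that run before tokenization. The names replace_trigraphs, splice_lines and preprocess are hypothetical and not taken from the existing lexer. The sketch also assumes the whole source is available as one string, which may not match how tokenize_line receives its input today.

```python
# Hypothetical helpers, not part of the existing lexer.

# The nine trigraph sequences defined by the C standard.
TRIGRAPHS = {
    "??=": "#", "??(": "[", "??/": "\\", "??)": "]", "??'": "^",
    "??<": "{", "??!": "|", "??>": "}", "??-": "~",
}


def replace_trigraphs(text):
    """Step 1: replace each trigraph with its single-character equivalent."""
    out = []
    i = 0
    while i < len(text):
        candidate = text[i:i + 3]
        if candidate in TRIGRAPHS:
            out.append(TRIGRAPHS[candidate])
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def splice_lines(text):
    """Step 2: join lines that end in a backslash followed by a newline."""
    return text.replace("\\\n", "")


def preprocess(text):
    """Run steps 1 and 2 in order; the result would feed the tokenizer."""
    return splice_lines(replace_trigraphs(text))
```

With something like this in place, the tokenizer would never see a trigraph, so it could keep looking at two symbols at a time. One cost to keep in mind is that splicing changes line numbers, so error reporting would need some way to map back to the original lines.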
ShivamSarodia commented