Better errors for literal parsing

Question


ISibboI opened this issue 2 years ago · 6 comments

Right now, when parsing a literal fails, evalexpr simply assumes it is supposed to be an identifier. We should introduce the basic rule that identifiers need to start with a letter and numeric literals with a digit (as in many major programming languages).

Answer 1 · 2023-06-04T04:38:57.000Z

@ISibboI Is it okay if I do this? I am planning to change the part of the code that converts a partial token into a token. If the literal starts with a letter or an underscore, it will be parsed as an identifier. Otherwise, it will try parsing it as a float and then as an integer.

Answer 2 · 2023-06-04T04:51:46.000Z

Sure, that sounds good!


Answer 3 · 2023-06-04T05:26:21.000Z

Actually, the rule should be more like this: when the literal starts with a digit, go for the number variants, and otherwise go for an identifier. That way, identifiers can be arbitrary Unicode as long as they do not start with an (Arabic) digit.
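
A minimal sketch of this rule, for illustration only. It is not evalexpr's actual tokenizer code: `Literal` and `classify` are hypothetical names, and trying the integer parse before the float parse is an assumption made here so that `42` stays an integer.

```rust
/// Hypothetical result type for this sketch; evalexpr's real token types differ.
#[derive(Debug, PartialEq)]
enum Literal {
    Int(i64),
    Float(f64),
    Identifier(String),
}

/// Classify a raw literal: a leading ASCII digit commits to a number,
/// anything else is treated as an identifier.
fn classify(raw: &str) -> Result<Literal, String> {
    match raw.chars().next() {
        None => Err("empty literal".to_string()),
        Some(c) if c.is_ascii_digit() => {
            // Assumption: try the integer parse first so "42" does not become 42.0.
            if let Ok(i) = raw.parse::<i64>() {
                Ok(Literal::Int(i))
            } else if let Ok(f) = raw.parse::<f64>() {
                Ok(Literal::Float(f))
            } else {
                // Starts with a digit but is not a number: report an error
                // instead of silently falling back to an identifier.
                Err(format!("invalid numeric literal: {}", raw))
            }
        }
        Some(_) => Ok(Literal::Identifier(raw.to_string())),
    }
}
```

With this sketch, `classify("1abc")` returns an error rather than an identifier, which is the improved behavior the issue asks for.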


Answer 4 · 2023-06-04T05:29:13.000Z

Don't Rust and other programming languages only support English letters and the underscore as the starting character? Do they support other Unicode characters?

Answer 5 · 2023-06-04T05:47:29.000Z

Ideally, we would mimic Rust identifiers: https://doc.rust-lang.org/reference/identifiers.html However, I am not sure if the standard library allows, e.g., checking whether a character is XID_Start, and I would not want to add any dependency for that. But if the standard library has a way to check whether a string is a Rust identifier, then that would of course be great. If not, or if that is too much effort, then the following would be great: any sequence of characters with the `Alphabetic` property (see https://doc.rust-lang.org/std/primitive.char.html#method.is_alphabetic), as well as the underscore character, is an identifier, except that a single `_` is not an identifier.
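
The fallback rule could be sketched with the standard library alone, roughly as follows. `is_identifier` is a hypothetical helper name, not an existing evalexpr function. Note that the rule as literally stated does not allow digits anywhere in an identifier; permitting them after the first character would be an extension beyond what is written here.

```rust
/// Fallback rule as stated above: only alphabetic characters and
/// underscores are allowed, and a lone `_` is rejected.
/// Hypothetical helper for illustration only.
fn is_identifier(s: &str) -> bool {
    !s.is_empty()
        && s != "_"
        // `char::is_alphabetic` checks the Unicode `Alphabetic` property.
        && s.chars().all(|c| c.is_alphabetic() || c == '_')
}
```

Under this sketch, `is_identifier("_bar")` and `is_identifier("größe")` hold, while `is_identifier("_")` and `is_identifier("x1")` do not.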


Answer 6 · 2023-06-05T04:38:37.000Z

@ISibboI I am not so sure about this. I made some small changes. Please check and see what's missing. I will open a PR for it.