tonybaloney/cpython-book-samples

failure to compile when adding almost-equal operator (Lexing and Parsing)

jonkiparsky opened this issue · 5 comments

Attempting to work the almost-equal example in the Lexing and Parsing section, I get an error.

Steps:

  • modify Grammar/python.gram and Tokens (see diff below)
  • execute make regen-token regen-pegen
  • confirm Parser/token.c shows the expected change
  • recompile with make -j2 -s
  • get error:
Parser/pegen/parse.c:9034:51: error: use of undeclared identifier 'AlE'
            _res = _PyPegen_cmpop_expr_pair ( p , AlE , a );
                                                  ^
1 error generated.
make: *** [Parser/pegen/parse.o] Error 1
make: *** Waiting for unfinished jobs....

Note: experienced this error on both Mac and Linux machines.

Diff before regenerating headers:

$ git diff
diff --git a/Grammar/Tokens b/Grammar/Tokens
index 9de2da5d15..424e388cd6 100644
--- a/Grammar/Tokens
+++ b/Grammar/Tokens
@@ -53,6 +53,7 @@ ATEQUAL                 '@='
 RARROW                  '->'
 ELLIPSIS                '...'
 COLONEQUAL              ':='
+ALMOSTEQUAL            '~='
 
 OP
 AWAIT
diff --git a/Grammar/python.gram b/Grammar/python.gram
index 0e12b5cb96..21474deab8 100644
--- a/Grammar/python.gram
+++ b/Grammar/python.gram
@@ -410,6 +410,7 @@ compare_op_bitwise_or_pair[CmpopExprPair*]:
     | in_bitwise_or
     | isnot_bitwise_or
     | is_bitwise_or
+    | ale_bitwise_or
 eq_bitwise_or[CmpopExprPair*]: '==' a=bitwise_or { _PyPegen_cmpop_expr_pair(p, Eq, a) }
 noteq_bitwise_or[CmpopExprPair*]:
     | (tok='!=' { _PyPegen_check_barry_as_flufl(p, tok) ? NULL : tok}) a=bitwise_or {_PyPegen_cmpop_expr_pair(p, NotEq, a) }
@@ -421,6 +422,7 @@ notin_bitwise_or[CmpopExprPair*]: 'not' 'in' a=bitwise_or { _PyPegen_cmpop_expr_
 in_bitwise_or[CmpopExprPair*]: 'in' a=bitwise_or { _PyPegen_cmpop_expr_pair(p, In, a) }
 isnot_bitwise_or[CmpopExprPair*]: 'is' 'not' a=bitwise_or { _PyPegen_cmpop_expr_pair(p, IsNot, a) }
 is_bitwise_or[CmpopExprPair*]: 'is' a=bitwise_or { _PyPegen_cmpop_expr_pair(p, Is, a) }
+ale_bitwise_or[CmpopExprPair*]: '~=' a=bitwise_or { _PyPegen_cmpop_expr_pair(p, AlE, a) }
 
 bitwise_or[expr_ty]:
     | a=bitwise_or '|' b=bitwise_xor { _Py_BinOp(a, BitOr, b, EXTRA) }

AlE is defined in Parser/Python.asdl (page 119); you need to add the new operator to cmpop there.
After running make regen-ast (page 120), Include/Python-ast.h should include the new operator in the list. Then run make regen-pegen to pick it up.
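
One way to confirm that the regen steps picked up the change, offered as a sketch rather than something from the book: after rebuilding, the new interpreter's ast module should expose the operator as a subclass of ast.cmpop. The helper names cmpop_names and has_cmpop are hypothetical.

```python
import ast


def cmpop_names():
    """Return the names of the comparison-operator node classes in this build."""
    return sorted(cls.__name__ for cls in ast.cmpop.__subclasses__())


def has_cmpop(name):
    """Report whether the running interpreter's AST defines the given operator."""
    return name in cmpop_names()


if __name__ == "__main__":
    # Run this with the freshly built interpreter, not the system Python.
    # "AlE" is the node name used in the grammar action in the diff above.
    print(cmpop_names())
    print("AlE present:", has_cmpop("AlE"))
```

If AlE is missing from the output, the ASDL change was most likely not regenerated or the build is stale.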

In that case, the steps are in the wrong sequence. This is an error in the book: the regen steps in CPython changed multiple times while the book was being edited.

Thanks for the quick response!

When I add the token to Parser/Python.asdl, I can indeed get the modified python to compile. However, this does not seem to be the end of the matter, as now I get an unexpected and initially amusing error on testing the new operator's existence:

Python 3.9.5+ (heads/3.9-dirty:9bcb76c24f, Jun  1 2021, 01:32:19) 
[Clang 11.0.0 (clang-1100.0.33.17)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 1 ~= 2
Fatal Python error: compiler_addcompare: We've reached an unreachable state. Anything is possible.
The limits were in our heads all along. Follow your dreams.
https://xkcd.com/2200
Python runtime state: initialized

Current thread 0x000000010f54e5c0 (most recent call first):
<no Python frame>
Abort trap: 6

Specifically, starting from the previous state, I

  • added token to Python.asdl
  • ran make regen-ast
  • ran make regen-pegen
  • compiled with make -j2 -s
  • started python.exe and executed 1 ~= 2, with the results above.

That's expected. There is a comment at the end of the chapter about the compiler not understanding it. You'll update the compiler in the next chapter :-)

p.s. yes that error is funny (follow the link)

> That's expected.

Okay, I see it now. Thanks. As I recall, the first time I got into this state while trying to troubleshoot this on my own, ast.parse also failed in some way that I don't remember now, so I didn't think to try it again this time.

As long as you're touching this text, it might not hurt to make it explicit that at this stage testing the operator directly will blow up in this way, in case anyone else gets cross-threaded in these parts.
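
To make that concrete, here is a minimal sketch of checking the parser change without involving the compiler. ast.parse stops once the AST is built, so on the modified build it should not reach compiler_addcompare. The helper name parse_only is hypothetical. On an unmodified interpreter 1 ~= 2 is a syntax error, so the helper returns None there.

```python
import ast


def parse_only(source):
    """Parse an expression to an AST dump without compiling it; None on a syntax error."""
    try:
        tree = ast.parse(source, mode="eval")
    except SyntaxError:
        return None
    return ast.dump(tree)


if __name__ == "__main__":
    # Run with the freshly built interpreter to exercise the new grammar rule.
    print(parse_only("1 == 2"))
    print(parse_only("1 ~= 2"))
```

On the modified build, the second dump should contain a comparison node that uses the new operator. Evaluating 1 ~= 2 at the prompt will still abort until the compiler is updated in the next chapter.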

Didn't need to follow the link, the comic is a classic :) The error message itself is also funny... the first time.