mathialo/bython

py2by breaks on a lot of valid indentation cases

lynnpepin opened this issue · 1 comments

py2by fails on many valid Python indentation cases, including but not limited to:

py2by has no way to handle mixed indentation widths, which are valid in Python. E.g.:

if (1==1):
    if(2==2):
                print("Oops!")

Indentation that does not impact nesting level. E.g.:

print("Sometimes strings that are too long,",
    "are broken up on different lines.")

I'm considering rewriting py2by to use Python's tokenizer to detect INDENT and DEDENT tokens. That should also be more future-proof, should Python's indentation rules ever change.
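A rough sketch of the idea, not the actual py2by rewrite: count INDENT and DEDENT tokens to track nesting depth, and run that over the two failing cases above. The helper name `nesting_changes` and the variables `mixed` and `continued` are made up for illustration.

```python
import io
from tokenize import generate_tokens, INDENT, DEDENT


def nesting_changes(source):
    """Return (line_number, new_depth) for every INDENT/DEDENT token.

    Hypothetical helper: py2by itself does not contain this function.
    """
    depth = 0
    changes = []
    # generate_tokens() takes a readline callable that returns str.
    for tok in generate_tokens(io.StringIO(source).readline):
        if tok.type == INDENT:
            depth += 1
        elif tok.type == DEDENT:
            depth -= 1
        else:
            continue
        changes.append((tok.start[0], depth))
    return changes


# Case 1: inner block indented by 12 spaces instead of 4.
mixed = 'if (1==1):\n    if(2==2):\n                print("Oops!")\n'

# Case 2: continuation line inside parentheses.
continued = (
    'print("Sometimes strings that are too long,",\n'
    '    "are broken up on different lines.")\n'
)

print(nesting_changes(mixed))
print(nesting_changes(continued))
```

In the first case the tokenizer emits exactly one INDENT per nested block regardless of how many spaces are used, and in the second case it emits none, because the extra whitespace sits inside parentheses. That is the information a brace-inserting converter needs.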

Did a bit of research; these few lines of code should be enough to nicely show off the tokenizer. Change "simpletest.py" to your own filename and check it out!

from tokenize import tokenize, tok_name

# tokenize() expects a readline callable that returns bytes, hence "rb".
with open("simpletest.py", "rb") as pyfile:
    tokens = list(tokenize(pyfile.readline))

# Print each token's line number, exact type name, and text.
for token in tokens:
    print(token.start[0], tok_name[token.exact_type], token.string)

Explanation: We open the file as "rb" because tokenize() expects a readline callable that returns bytes.
token.start and token.end are (line number, column) tuples indicating where a token starts and ends. token.exact_type is an integer identifying the token type (for operators, the specific operator), while tok_name maps that integer to a string. And token.string gives us the text of the token itself, while token.line holds the full line of Python code it came from! :)
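To make those attributes concrete, here is a small self-contained check on a one-line source. It reads from an in-memory `io.BytesIO` instead of a file, which is just a convenience for this example; the variable names are illustrative.

```python
import io
from tokenize import tokenize, tok_name

source = b"x = 1\n"
tokens = list(tokenize(io.BytesIO(source).readline))

# tokenize() always yields an ENCODING token first.
encoding_tok = tokens[0]
name_tok = tokens[1]   # the NAME token for "x"
op_tok = tokens[2]     # the OP token for "="

print(tok_name[encoding_tok.type], encoding_tok.string)
print(name_tok.start, name_tok.end)           # (row, column) tuples
print(repr(name_tok.string), repr(name_tok.line))
# .type is the generic OP; .exact_type names the specific operator.
print(tok_name[op_tok.type], tok_name[op_tok.exact_type])
```

Note that line numbers start at 1 while columns start at 0.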

EDIT: Forgot to mention, working on this right now