JuliaLang/Tokenize.jl

JET self test

goerch opened this issue · 5 comments

JET self test reports many possible problems with CSTParser. These in turn seem to lead to Tokenize.jl. Is there an easy way to resolve these?

As far as I understand it, returning EMPTY_TOKEN(token_type(l)) is not compatible with CSTParser's ParseState

mutable struct ParseState
    l::Lexer{Base.GenericIOBuffer{Array{UInt8,1}},RawToken}
    done::Bool # Remove this
    lt::RawToken
    t::RawToken
    nt::RawToken
    nnt::RawToken
    lws::RawToken
    ws::RawToken
    nws::RawToken
    nnws::RawToken
    closer::Closer
    errored::Bool
end

expecting RawToken. Any recommendations on how this should be fixed?

Edit: I missed

EMPTY_TOKEN(::Type{Token}) = _EMPTY_TOKEN
EMPTY_TOKEN(::Type{RawToken}) = _EMPTY_RAWTOKEN
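
For readers following along, here is a minimal sketch of how that pair of methods resolves the concern: dispatching on the token type itself means a lexer parameterized on a raw token type gets a raw empty token back. All names prefixed with `Demo` (and `empty_token`, `token_type` as written here) are hypothetical stand-ins, not Tokenize's actual definitions, whose structs carry more fields.

```julia
# Hypothetical stand-ins for Tokenize's token types; the real structs
# carry more fields (kind, positions, and so on).
struct DemoToken
    val::String
end

struct DemoRawToken
    startbyte::Int
    endbyte::Int
end

const _DEMO_EMPTY_TOKEN = DemoToken("")
const _DEMO_EMPTY_RAWTOKEN = DemoRawToken(0, -1)

# One method per token type: the argument is the type itself, not an instance.
empty_token(::Type{DemoToken}) = _DEMO_EMPTY_TOKEN
empty_token(::Type{DemoRawToken}) = _DEMO_EMPTY_RAWTOKEN

# A lexer parameterized on its token type (assumed shape, for illustration only).
struct DemoLexer{T}
    io::IOBuffer
end

# Recover the token type from the lexer's type parameter.
token_type(::DemoLexer{T}) where {T} = T

l = DemoLexer{DemoRawToken}(IOBuffer("x"))
t = empty_token(token_type(l))
println(t isa DemoRawToken)
```

Because the method is selected from the lexer's type parameter, the returned value has a concrete type whenever the lexer's type is known, which is what a struct field declared as a raw token needs.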

I am a bit confused about all these JET issues being opened everywhere. This issue mentions JET, CSTParser.jl and Tokenize. So is there an issue with Tokenize? In that case, is there a repro?

That is what I'm trying to understand. Example JET error is

┌ @ C:\Users\Win10\Documents\GitHub\CSTParser.jl\src\lexer.jl:47 CSTParser.next(ps)
│┌ @ C:\Users\Win10\Documents\GitHub\CSTParser.jl\src\lexer.jl:80 t = Base.getproperty(Base.getproperty(CSTParser.Tokenize, :Lexers), :next_token)(Base.getproperty(ps, :l))
││┌ @ C:\Users\Win10\.julia\packages\Tokenize\FGrTw\src\lexer.jl:807 Tokenize.Lexers.read_string(l, Tokenize.Tokens.STRING)
│││┌ @ C:\Users\Win10\.julia\packages\Tokenize\FGrTw\src\lexer.jl:863 goto %50 if not Tokenize.Lexers.==(Tokenize.Tokens.kind(t), Tokenize.Tokens.ENDMARKER)
││││ for 1 of 2 union split cases, non-boolean (Missing) used in boolean context: goto %50 if not Tokenize.Lexers.==(Tokenize.Tokens.kind::typeof(Tokenize.Tokens.kind)(t::Any)::Any, Tokenize.Tokens.ENDMARKER::Tokenize.Tokens.Kind)::Union{Missing, Bool}

That just looks like something (a token) is not inferred properly so == infers to Union{Missing, Bool}. Not really an issue with Tokenize.jl I think.
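
As background on where `Union{Missing, Bool}` comes from: Base's `==` follows three-valued logic and returns `missing` when either side is `missing`, so when an argument's type is not inferred, the `Missing` case has to be considered. A small sketch using only Base (nothing here is specific to Tokenize):

```julia
# `==` propagates `missing`, so its result is not always a Bool.
r = missing == 1
println(r === missing)

# Using such a result as a condition raises a TypeError at run time;
# this is the situation the JET report warns about.
threw = try
    if missing == 1
        println("not reached")
    end
    false
catch err
    err isa TypeError
end
println(threw)

# `===` checks egality and always returns a Bool.
println(missing === 1)
```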

However, to "fix" it these could be changed to use ===:

Tokenize.jl/src/lexer.jl

Lines 863 to 867 in bed7f32

if Tokens.kind(t) == Tokens.ENDMARKER
    return false
elseif Tokens.kind(t) == Tokens.LPAREN
    o += 1
elseif Tokens.kind(t) == Tokens.RPAREN
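
A rough sketch of what the `===` variant could look like, written against a hypothetical miniature enum rather than Tokenize's real `Kind`. The function `paren_balance` and the `DemoKind` enum are invented for illustration and are not the code in lexer.jl; the point is only that `===` yields a `Bool` even when the compared value's type is unknown.

```julia
# Hypothetical miniature of a token-kind enum; the real one has many more members.
@enum DemoKind ENDMARKER LPAREN RPAREN OTHER

# Track parenthesis depth up to the end marker, comparing kinds with `===`.
# For enum values `===` and `==` agree, but `===` always returns a Bool,
# even if the element type is not inferred.
function paren_balance(kinds)
    depth = 0
    for k in kinds
        if k === ENDMARKER
            break
        elseif k === LPAREN
            depth += 1
        elseif k === RPAREN
            depth -= 1
        end
    end
    return depth
end

# A Vector{Any} mimics the poorly inferred case from the JET report.
kinds = Any[LPAREN, LPAREN, RPAREN, ENDMARKER, RPAREN]
println(paren_balance(kinds))
```

With `==` in place of `===`, a stray `missing` in `kinds` would throw in the `if`; with `===` it simply fails every comparison and is skipped.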

Sorry for the noise: I was running the self test in Juno/Atom, which brought in Tokenize as a preloaded package. Nevertheless, this is a possible problem for users who want to run JET in an IDE.