TRLC performance analysis and improvements

florianschanda opened this issue a year ago · 0 comments

The worst offenders for tests-system/bulk are:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   31.589   31.589 trlc/trlc.py:21(<module>)
        1    0.000    0.000   31.571   31.571 trlc/trlc.py:490(main)
        1    0.000    0.000   31.558   31.558 trlc/trlc.py:422(process)
        1    0.000    0.000   27.343   27.343 trlc/trlc.py:364(parse_trlc_files)
       85    0.037    0.000   27.338    0.322 trlc/parser.py:1705(parse_trlc_file)
    63859    0.044    0.000   27.257    0.000 trlc/parser.py:1577(parse_trlc_entry)
    63859    0.887    0.000   27.084    0.000 trlc/parser.py:1530(parse_record_object_declaration)
  1217740    0.433    0.000   19.282    0.000 trlc/parser.py:167(match)
  1217930    0.643    0.000   18.853    0.000 trlc/parser.py:139(advance)
  1217930    3.994    0.000   18.210    0.000 trlc/lexer.py:332(token)
   319609    0.692    0.000   10.996    0.000 trlc/parser.py:1352(parse_value)
  6214992    3.212    0.000    4.665    0.000 trlc/lexer.py:215(is_alnum)
   581360    0.591    0.000    4.360    0.000 trlc/ast.py:3060(lookup_direct)
        1    0.017    0.017    4.212    4.212 trlc/trlc.py:391(resolve_record_references)
    63859    0.144    0.000    4.160    0.000 trlc/ast.py:2873(resolve_references)
   109334    0.077    0.000    4.030    0.000 trlc/ast.py:1063(resolve_references)
    74685    0.032    0.000    3.853    0.000 trlc/ast.py:904(resolve_references)
  9752278    3.774    0.000    3.774    0.000 trlc/lexer.py:238(advance)
       10    0.298    0.030    3.659    0.366 /usr/lib/python3.8/difflib.py:688(get_close_matches)
    94872    0.136    0.000    2.933    0.000 trlc/parser.py:328(parse_qualified_name)
   638740    1.912    0.000    2.813    0.000 /usr/lib/python3.8/difflib.py:647(quick_ratio)
  1217930    0.876    0.000    2.442    0.000 trlc/lexer.py:232(skip_whitespace)
    63859    0.096    0.000    2.229    0.000 trlc/ast.py:2816(__init__)
  1217844    1.020    0.000    1.905    0.000 trlc/lexer.py:71(__init__)
    63859    0.531    0.000    1.846    0.000 trlc/ast.py:2821(<dictcomp>)
  1021744    0.720    0.000    1.316    0.000 trlc/ast.py:546(__init__)
  1217844    0.765    0.000    1.242    0.000 trlc/lexer.py:162(__init__)
  1217844    0.798    0.000    1.188    0.000 trlc/lexer.py:200(is_alpha)

This is not unexpected:

token() is the worst offender at 18s (number crunching)
parse_trlc_files() takes around 9s once you remove the lexing (which is likely unavoidable)
the rest of process() takes about 4 seconds, which is entirely due to resolve_record_references (unavoidable, this is work that needs to happen sooner or later)
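
The three figures above follow from the cumulative times in the profile. A small sketch of the arithmetic, assuming that all of the time spent in token() is reached through parse_trlc_files() (the call counts for match, advance and token suggest this, but the profile does not prove it):

```python
# Cumulative times in seconds, copied from the profile above.
cumtime = {
    "process": 31.558,
    "parse_trlc_files": 27.343,
    "token": 18.210,
    "resolve_record_references": 4.212,
}

# Assumption: every token() call happens underneath parse_trlc_files().
lexing = cumtime["token"]
parsing_without_lexing = cumtime["parse_trlc_files"] - lexing
rest_of_process = cumtime["process"] - cumtime["parse_trlc_files"]

print(f"lexing:                 {lexing:.1f}s")                  # 18.2s
print(f"parsing without lexing: {parsing_without_lexing:.1f}s")  # 9.1s
print(f"rest of process():      {rest_of_process:.1f}s")         # 4.2s
```

The remainder of process() (about 4.2s) matches the 4.212s cumulative time of resolve_record_references almost exactly.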

There are some immediate ideas:

is_alpha, is_alnum, and is_digit could be replaced by builtin-style functions (but we need to take care of Unicode, so it's not as easy as just using the builtins)
implement #47
implement #48
token() could be optimised in some other way
token() could be replaced by a hand-written C lexer (but this adds portability concerns)
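
To illustrate the first and fourth ideas, here is a minimal sketch of why the bare builtins are not a drop-in replacement, and of one possible way to avoid millions of per-character calls. The names is_alnum_ascii, scan_identifier and IDENTIFIER are hypothetical, and the identifier rule (an ASCII letter followed by ASCII letters, digits or underscores) is an assumption; the real TRLC lexer may use different rules.

```python
import re

# Assumption: identifiers start with an ASCII letter and continue with
# ASCII letters, digits or underscores. The real lexer rules may differ.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_alnum_ascii(char):
    """Builtin-based check restricted to ASCII letters and digits."""
    # str.isalnum() alone also accepts non-ASCII letters and digits,
    # so it is guarded with str.isascii() (available from Python 3.7).
    return char.isascii() and char.isalnum()


def scan_identifier(text, pos):
    """Return the end offset of the identifier at pos, or pos if none."""
    match = IDENTIFIER.match(text, pos)
    return match.end() if match else pos


# The bare builtins are broader than ASCII:
print("é".isalnum(), is_alnum_ascii("é"))  # True False
print("²".isdigit(), is_alnum_ascii("7"))  # True True

# One regex call consumes the whole identifier instead of one
# Python-level call per character.
source = "Requirement_42 { }"
end = scan_identifier(source, 0)
print(source[:end])  # Requirement_42
```

Whether the regex variant is actually faster for this workload would need to be measured against the same tests-system/bulk profile.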

There is one more issue that could manifest on Windows with large repos: if you have millions of files (most of which are not TRLC files), then the initial traversal for register_dir could take a lot of time.
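
One possible mitigation is to prune directories during the traversal so that large irrelevant subtrees are never listed. The sketch below is not how register_dir works today; find_trlc_files, EXTENSIONS and SKIP_DIRS are hypothetical names, and both the extension list and the set of skipped directories are assumptions.

```python
import os

# Assumptions (not from the issue): relevant files end in .trlc or .rsl,
# and directories such as .git never need to be visited.
EXTENSIONS = (".trlc", ".rsl")
SKIP_DIRS = {".git", "node_modules", "__pycache__"}


def find_trlc_files(root):
    """Yield paths under root that have a relevant extension."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending
        # into the pruned directories at all.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if name.endswith(EXTENSIONS):
                yield os.path.join(dirpath, name)
```

Pruning only helps when most of the irrelevant files live in directories that can be identified by name. Every directory that is still visited has to be listed in full.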