emrgnt-cmplxty/automata

Update embedding scripts to remove 'missing' symbols

emrgnt-cmplxty opened this issue · 0 comments

Embeddings can be refreshed with the commands below:

# Build/refresh the code embeddings
automata run-code-embedding

# "L1" docs are the docstrings written into the code
# "L2" docs are generated from the L1 docs + symbol context
# Build/refresh and embed the L2 docs
automata run-doc-embedding-l2

# "L3" docs are generated from the L2 docs + symbol context
# Build/refresh and embed the L3 docs
automata run-doc-embedding-l3

Unfortunately, because of how these scripts are structured, they do not remove symbols from the embeddings once those symbols no longer exist in the codebase. Over time, this causes a creeping growth in errors like these:

...
ERROR:automata.core.symbol.graph:Error processing scip-python python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0 `automata.core.coding.py_coding.navigation`/find_syntax_tree_node().: Symbol(scip-python python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0 `automata.core.coding.py_coding.navigation`/find_syntax_tree_node()., scip-python, Package(python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0), (Descriptor(automata.core.coding.py_coding.navigation, 1), Descriptor(find_syntax_tree_node, 4)))
ERROR:automata.core.symbol.graph:Error processing scip-python python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0 `automata.tests.unit.test_python_writer_tool`/test_extend_module_with_new_function().: Symbol(scip-python python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0 `automata.tests.unit.test_python_writer_tool`/test_extend_module_with_new_function()., scip-python, Package(python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0), (Descriptor(automata.tests.unit.test_python_writer_tool, 1), Descriptor(test_extend_module_with_new_function, 4)))
ERROR:automata.core.symbol.graph:Error processing scip-python python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0 `automata.tests.unit.test_python_writer_tool`/test_extend_module_with_documented_new_function().: Symbol(scip-python python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0 `automata.tests.unit.test_python_writer_tool`/test_extend_module_with_documented_new_function()., scip-python, Package(python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0), (Descriptor(automata.tests.unit.test_python_writer_tool, 1), Descriptor(test_extend_module_with_documented_new_function, 4)))
ERROR:automata.core.symbol.graph:Error processing scip-python python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0 `automata.core.coding.py_coding.module_tree`/LazyModuleTreeMap#put_module().: Symbol(scip-python python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0 `automata.core.coding.py_coding.module_tree`/LazyModuleTreeMap#put_module()., scip-python, Package(python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0), (Descriptor(automata.core.coding.py_coding.module_tree, 1), Descriptor(LazyModuleTreeMap, 2), Descriptor(put_module, 4)))
ERROR:automata.core.symbol.graph:Error processing scip-python python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0 `automata.core.coding.py_coding.module_tree`/DotPathMap#contains_dotpath().: Symbol(scip-python python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0 `automata.core.coding.py_coding.module_tree`/DotPathMap#contains_dotpath()., scip-python, Package(python automata 9db05b7e7ebd49f93703df45accd7e5f9d5cedb0), (Descriptor(automata.core.coding.py_coding.module_tree, 1), Descriptor(DotPathMap, 2), Descriptor(contains_dotpath, 4)))

We should modify the refresh scripts to remove these 'missing' symbols.
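As a starting point for discussion, here is a minimal sketch of the pruning step. It assumes the stored embeddings can be treated as a mapping from a symbol key to a vector, and that the refresh script can produce the set of symbols currently present in the codebase. The function name `prune_missing_symbols`, the plain-dict storage, and the example keys are all hypothetical and are not taken from the existing automata code.

```python
from typing import Dict, Iterable, List


def prune_missing_symbols(
    embeddings: Dict[str, List[float]],
    supported_symbols: Iterable[str],
) -> List[str]:
    """Delete embeddings whose symbol is no longer in the codebase.

    Mutates `embeddings` in place and returns the removed keys (sorted),
    so the caller can log what was dropped.
    """
    supported = set(supported_symbols)
    # Collect stale keys first; deleting while iterating a dict raises an error.
    stale = sorted(key for key in embeddings if key not in supported)
    for key in stale:
        del embeddings[key]
    return stale


# Hypothetical data: one symbol still exists, one has been removed from the code.
embeddings = {
    "example.module/still_present().": [0.1, 0.2],
    "example.module/removed_function().": [0.3, 0.4],
}
supported = ["example.module/still_present()."]

removed = prune_missing_symbols(embeddings, supported)
```

Running this step before the refresh scripts build the symbol graph would mean stale entries are dropped rather than surfacing as "Error processing" log lines. Where exactly it belongs in each script is an open question.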

Feel free to post any questions or concerns you have about this implementation. Your contribution to this project is highly appreciated!