AutoImport.generate_modules_cache can be sped up by 2x
tkrabel opened this issue · 1 comments
Describe the bug
I played around with AutoImport.generate_modules_cache, as it is perceived to be slow. I logged which packages get imported from a conda env I created and linked to the project, and it seems that many unnecessary duplicate entries are added to the database.
You can see the imports in this file: sorted_packages_with_site_packages.txt.
I see the following pattern of duplicate entries in that file.
Name(name=<import_name>, modname=<mod_name>, package=<package>, ...)
Name(name=<import_name>, modname=site-packages.<mod_name>, package='site-packages', ...)
One example of a duplicate:
Name(name='BaseName', modname='jedi.api.classes', package='jedi', source=<Source.SITE_PACKAGE: 4>, name_type=<NameType.Class: 7>)
Name(name='BaseName', modname='site-packages.jedi.api.classes', package='site-packages', source=<Source.SITE_PACKAGE: 4>, name_type=<NameType.Class: 7>)
From that, it seems the issue is that we don't exclude the top-level site-packages directory itself from our search tree. It is therefore treated as a package in its own right, and every package inside it is counted twice.
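A minimal sketch of the kind of filter this suggests, assuming the duplication comes from a search root (such as site-packages) also showing up among the candidate package directories. The helper name `drop_search_roots` and both of its parameters are hypothetical and are not part of rope's API:

```python
from pathlib import Path
from typing import Iterable, List


def drop_search_roots(
    candidates: Iterable[Path], search_roots: Iterable[Path]
) -> List[Path]:
    """Return the candidates that are not themselves search roots.

    Hypothetical helper: a directory like site-packages is a place to
    look for packages, so it should not be indexed as a package itself.
    """
    # Resolve both sides so that equivalent spellings of a path compare equal.
    roots = {Path(root).resolve() for root in search_roots}
    return [c for c in candidates if Path(c).resolve() not in roots]
```

With site-packages passed as the only search root, the site-packages directory itself is dropped while the packages inside it (for example jedi) are kept, so each name would be indexed once.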
To Reproduce
- Change the code in sqlite.py:
diff --git a/rope/contrib/autoimport/sqlite.py b/rope/contrib/autoimport/sqlite.py
index eb7c27de..42447edc 100644
--- a/rope/contrib/autoimport/sqlite.py
+++ b/rope/contrib/autoimport/sqlite.py
@@ -371,6 +371,7 @@ class AutoImport:
             return
         self._add_packages(packages)
         job_set = task_handle.create_jobset("Generating autoimport cache", 0)
+        end_names = []
         if single_thread:
             for package in packages:
                 for module in get_files(package, underlined):
@@ -383,9 +384,11 @@ class AutoImport:
                 get_future_names(packages, underlined, job_set)
             ):
                 self._add_names(future_name.result())
+                end_names.append(future_name.result())
                 job_set.finished_job()
 
         self.connection.commit()
+        return end_names
 
     def _get_packages_from_modules(self, modules: List[str]) -> Iterator[Package]:
         for modname in modules:
- Run the following code in an env that has the changes from the first step applied:
from rope.base.project import Project
from rope.contrib.autoimport.sqlite import AutoImport
project = Project(".")
autoimport = AutoImport(project, memory=True)
autoimport.generate_modules_cache()
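- Look at the result: with the patch applied, generate_modules_cache() returns the collected names, which include the duplicate entries described above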
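To make the duplicates easier to spot than by reading the raw output, the collected names can be checked for the pattern described above. This is only a sketch: `Name` here is a hypothetical stand-in carrying just the three fields needed for the comparison, and `find_site_packages_duplicates` is an illustrative helper, not something rope provides.

```python
from collections import namedtuple
from typing import Iterable, List

# Hypothetical stand-in for the Name records shown in the issue;
# only the fields needed for the comparison are included.
Name = namedtuple("Name", ["name", "modname", "package"])

PREFIX = "site-packages."


def find_site_packages_duplicates(names: Iterable[Name]) -> List[Name]:
    """Return entries whose modname is 'site-packages.<mod_name>' and for
    which an entry with the same name and plain '<mod_name>' also exists."""
    names = list(names)
    seen = {(n.name, n.modname) for n in names}
    return [
        n
        for n in names
        if n.package == "site-packages"
        and n.modname.startswith(PREFIX)
        and (n.name, n.modname[len(PREFIX):]) in seen
    ]


sample = [
    Name("BaseName", "jedi.api.classes", "jedi"),
    Name("BaseName", "site-packages.jedi.api.classes", "site-packages"),
]
duplicates = find_site_packages_duplicates(sample)
```

If the patched generate_modules_cache returns one collection of names per finished future, as the diff suggests, the result would first need to be flattened (for example with itertools.chain.from_iterable) before being passed to this helper.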
We also need to address this comment: #723 (comment)