agentos-project/agentos

VirtualEnvs should preserve sys.path changes

Closed this issue · 0 comments

Sometimes when imported, a library will make changes to the sys.path upon which other code in that library depends (I believe ray or one of its dependencies does this).

Right now we use an idiom like the following for importing and running managed objects:

with venv:
    import_managed_obj()

This can cause errors because when we exit the venv context, we lose all changes that the managed object has made to the environment. Instead, we should preserve changes to the environment.
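
One way this could look is sketched below. It is only an illustration, not the actual agentos venv implementation: `preserving_venv` and its `venv_paths` argument are hypothetical names, and the sketch assumes that "changes to the environment" means entries appended to sys.path while the context is active.

```python
import sys
from contextlib import contextmanager


@contextmanager
def preserving_venv(venv_paths):
    """Hypothetical stand-in for the venv context manager (assumed API)."""
    original = list(sys.path)
    # Activate: put the venv entries first so they take priority.
    sys.path[:0] = venv_paths
    activated = list(sys.path)
    try:
        yield
    finally:
        # Entries that code inside the block added on its own
        # (de-duplicated, order preserved).
        added = list(dict.fromkeys(p for p in sys.path if p not in activated))
        # Drop the venv entries, but keep what the block added.
        sys.path[:] = original + added
```

With this shape, a path that a library appends to sys.path during import is still present after the `with` block exits, while the venv's own entries are removed.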

Additionally, I think we want to generally encourage Components to activate the shared virtualenv once (and not deactivate it) due to potentially unexpected behavior around PCS not being able to provide full environment isolation. I'm not sure if this once-only behavior should be explicitly enforced.
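
If the once-only behavior were enforced, a minimal sketch might look like the following. The `activate_once` function and the module-level `_active_venv` record are hypothetical and not part of PCS; the sketch assumes activation amounts to prepending a single site-packages directory to sys.path.

```python
import sys

# Hypothetical module-level record of the venv activated in this process.
_active_venv = None


def activate_once(site_packages):
    """Activate a venv path at most once per process; never deactivate."""
    global _active_venv
    if _active_venv is None:
        sys.path.insert(0, site_packages)
        _active_venv = site_packages
    elif _active_venv != site_packages:
        # A second, different venv would need isolation we cannot provide.
        raise RuntimeError(
            f"venv {_active_venv!r} is already active; "
            f"cannot switch to {site_packages!r}"
        )
    return _active_venv
```

Repeated calls with the same path are no-ops, so Components can call it defensively; only an attempt to switch to a different venv fails.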

Related: Why PCS can't provide full environment isolation

The Just-in-time virtual env design doc goes into more depth on some of the limitations of Python virtual env manipulation. From the doc:

  • Python has facilities for extension modules. These are compiled C and C++ libraries (.so files on Linux systems) that you can import into your Python code in the same way you would import a pure Python module.
  • These extension modules, however, are treated differently than pure Python modules. Many of these modules use shared process-wide state (instead of module-local state like pure Python modules).
  • This shared process-wide state prevents us from cleanly reloading these modules when, for example, we want to import a different version of an already imported extension module so that a Component can run against this new version.
  • Currently, there are no facilities for reloading extension modules, although there is a proposal (and maybe active work) on making extension modules default to a per-module state model instead of a per-process state model. This would potentially make reloading more feasible, but presumably all extension modules that use the current per-process style would have to be rewritten.
  • As a result, we are only able to isolate a Component’s environment from other Component environments when all modules are pure Python (i.e. no extension modules are used).
  • Unfortunately, many useful real-world libraries require extension modules (e.g. protobuf which is used in MLflow, tensorflow).
  • Trying our environment isolation techniques when extension modules are loaded results in unexpected errors (e.g. classes that should be the same are not the same, data that should only be initialized once gets re-initialized) and breaks the running of our Components.
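
Given the limitations above, it may help to check whether any extension modules are already loaded before attempting isolation. The sketch below is an assumption about how such a check could be written, not something taken from the design doc. The helper names are hypothetical; `importlib.machinery.EXTENSION_SUFFIXES` is the standard-library list of file suffixes for extension modules on the current platform.

```python
import sys
from importlib.machinery import EXTENSION_SUFFIXES


def is_extension_module(module):
    """Return True if the module was loaded from a compiled extension file."""
    # Built-in modules (e.g. sys) have no __file__ and are not detected here.
    path = getattr(module, "__file__", None) or ""
    return path.endswith(tuple(EXTENSION_SUFFIXES))


def loaded_extension_modules():
    """List names of currently imported modules backed by extension files."""
    # Copy the items first: sys.modules can change while we iterate.
    return sorted(
        name
        for name, module in list(sys.modules.items())
        if is_extension_module(module)
    )
```

A non-empty result from `loaded_extension_modules()` would suggest that swapping environments in the running process is unsafe. Note that statically linked built-in modules also carry process-wide state but are not caught by this file-suffix check.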