Error with custom framework

Question

sedol1339 opened this issue a year ago · 7 comments

Hello! I'm running the following command from an unmodified AMLB benchmark repo:

python automlbenchmark/runbenchmark.py catboost amlb_full 1h_1fold \
    --indir /data/osedukhin/shared/openml_cache \
    --outdir /data/osedukhin/shared/amlb_results \
    --userdir /data/osedukhin/shared/amlb_configs \
    --exit-on-error

User dir contains the following config:

frameworks:
  definition_file:
    - '{root}/resources/frameworks.yaml'
    - '{root}/examples/custom/frameworks.yaml'
    - '{user}/frameworks.yaml'
benchmarks:
  definition_dir:
    - '{root}/resources/benchmarks'
    - '{user}/benchmarks'
  constraints_file:
    - '{root}/resources/constraints.yaml'
    - '{user}/constraints.yaml'

{user}/frameworks.yaml contains the following:

CatBoost:
  module: extensions.CatBoost

Directory {user}/frameworks/extensions/CatBoost contains the following __init__.py:

def version():
    from catboost import __version__
    return __version__

from amlb.utils import call_script_in_same_dir

print('CatBoost __init.py__')

def setup(*args, **kwargs):
    print('Installing CatBoost')
    call_script_in_same_dir(__file__, "setup.sh", *args, **kwargs)
    
def run(*args, **kwargs):
    from .exec import run
    return run(*args, **kwargs)

Also {user}/frameworks/extensions/CatBoost contains the following setup.sh:

#!/usr/bin/env bash
shopt -s expand_aliases
HERE=$(dirname "$0")

. "$HERE/.setup_env"
. "$AMLB_ROOT/frameworks/shared/setup.sh" "$HERE" true
PIP install -r "$HERE/requirements.txt"

PY -c "from catboost import __version__; print(__version__)" >> "${HERE}/.installed"

When I start runbenchmark.py using the above command, I get the following output:

Running benchmark `catboost` on `amlb_full` framework in `local` mode.
Loading frameworks definitions from ['/data/osedukhin/shared/automlbenchmark/resources/frameworks.yaml', '/data/osedukhin/shared/automlbenchmark/examples/custom/frameworks.yaml', '/data/osedukhin/shared/amlb_configs/frameworks.yaml'].
Loading benchmark constraint definitions from ['/data/osedukhin/shared/automlbenchmark/resources/constraints.yaml', '/data/osedukhin/shared/amlb_configs/constraints.yaml'].
Loading benchmark definitions from /data/osedukhin/shared/amlb_configs/benchmarks/amlb_full.yaml.
CatBoost __init.py__
fatal: not a git repository (or any parent up to mount point /data/osedukhin)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

[MONITORING] [python [1098707]] CPU Utilization: 2.9%

------------------------------------------------------------
Starting job local.amlb_full.1h_1fold.eucalyptus.0.CatBoost.
............

After this, it fails with the error No module named 'catboost'.

So, I guess the problem is that __init__.py is run, then fatal: not a git repository occurs, and the setup method in __init__.py is never run. What may be the problem? I don't understand which git repository it wants.
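
A note on the fatal: not a git repository message: it is printed by git itself, which looks for a repository by walking upward from the current working directory. The command above launches automlbenchmark/runbenchmark.py from outside the clone, so that search presumably starts in a directory that is not part of the repository. The sketch below imitates the upward search in plain Python. find_git_root is a hypothetical helper for illustration only, not part of AMLB or git, and it ignores the filesystem-boundary rule mentioned in the log.

```python
from pathlib import Path
from typing import Optional


def find_git_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains `.git`."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / ".git").exists():
            return candidate
    return None


if __name__ == "__main__":
    # Prints None when the current directory is not inside a git checkout.
    print(find_git_root(Path.cwd()))
```

Running this from the directory where the benchmark is started shows whether git would be able to find the clone from there.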

Answer 1 · 2023-10-04T08:19:41.000Z

I changed python automlbenchmark/runbenchmark.py to cd automlbenchmark; python runbenchmark.py, and the error fatal: not a git repository is gone. However, it still does not work. The full output:

(venv) osedukhin@ns-bd-v100-08:~/shared/automlbenchmark$ python runbenchmark.py catboost amlb_full 1h_1fold     --indir /data/osedukhin/shared/openml_cache     --outdir /data/osedukhin/shared/amlb_results     --userdir /data/osedukhin/shared/amlb_configs     --exit-on-error
Running benchmark `catboost` on `amlb_full` framework in `local` mode.
Loading frameworks definitions from ['/data/osedukhin/shared/automlbenchmark/resources/frameworks.yaml', '/data/osedukhin/shared/automlbenchmark/examples/custom/frameworks.yaml', '/data/osedukhin/shared/amlb_configs/frameworks.yaml'].
Loading benchmark constraint definitions from ['/data/osedukhin/shared/automlbenchmark/resources/constraints.yaml', '/data/osedukhin/shared/amlb_configs/constraints.yaml'].
Loading benchmark definitions from /data/osedukhin/shared/amlb_configs/benchmarks/amlb_full.yaml.
CatBoost __init.py__
[MONITORING] [python [1102918]] CPU Utilization: 2.2%

------------------------------------------------------------
Starting job local.amlb_full.1h_1fold.eucalyptus.0.CatBoost.
[MONITORING] [python [1102918]] Memory Usage: 5.1%
Assigning 32 cores (total=32) for new task eucalyptus.
[MONITORING] [python [1102918]] Disk Usage: 79.9%
Assigning 120205 MB (total=128826 MB) for new eucalyptus task.
Running task eucalyptus on framework CatBoost with config:
TaskConfig({'framework': 'CatBoost', 'framework_params': {}, 'framework_version': '1.2.2', 'type': 'classification', 'name': 'eucalyptus', 'openml_task_id': 359954, 'test_server': False, 'fold': 0, 'metric': 'logloss', 'metrics': ['logloss', 'acc', 'balacc'], 'seed': 1405310980, 'job_timeout_seconds': 7200, 'max_runtime_seconds': 3600, 'cores': 32, 'max_mem_size_mb': 120205, 'min_vol_size_mb': -1, 'input_dir': '/data/osedukhin/shared/openml_cache', 'output_dir': '/data/osedukhin/shared/amlb_results/catboost.amlb_full.1h_1fold.local.20231004T081930', 'output_predictions_file': '/data/osedukhin/shared/amlb_results/catboost.amlb_full.1h_1fold.local.20231004T081930/predictions/eucalyptus/0/predictions.csv', 'tag': None, 'command': 'runbenchmark.py catboost amlb_full 1h_1fold --indir /data/osedukhin/shared/openml_cache --outdir /data/osedukhin/shared/amlb_results --userdir /data/osedukhin/shared/amlb_configs --exit-on-error', 'git_info': {'repo': 'https://github.com/openml/automlbenchmark', 'branch': 'master', 'commit': '386cfb66baa576ca9891ca18007c8d298380da3e', 'tags': [], 'status': ['## master...origin/master']}, 'measure_inference_time': False, 'ext': {}, 'quantile_levels': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9], 'type_': 'multiclass', 'output_metadata_file': '/data/osedukhin/shared/amlb_results/catboost.amlb_full.1h_1fold.local.20231004T081930/predictions/eucalyptus/0/metadata.json'})
Job `local.amlb_full.1h_1fold.eucalyptus.0.CatBoost` failed with error: No module named 'catboost'
Traceback (most recent call last):
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 120, in start
    result = self._run()
  File "/data/osedukhin/shared/automlbenchmark/amlb/utils/process.py", line 744, in profiler
    return fn(*args, **kwargs)
  File "/data/osedukhin/shared/automlbenchmark/amlb/benchmark.py", line 578, in run
    meta_result = self.benchmark.framework_module.run(self._dataset, task_config)
  File "/data/osedukhin/shared/amlb_configs/extensions/CatBoost/__init__.py", line 14, in run
    from .exec import run
  File "/data/osedukhin/shared/amlb_configs/extensions/CatBoost/exec.py", line 3, in <module>
    import catboost
ModuleNotFoundError: No module named 'catboost'
Job `local.amlb_full.1h_1fold.eucalyptus.0.CatBoost` did not stop gracefully: Job `local.amlb_full.1h_1fold.eucalyptus.0.CatBoost` was interrupted.
Traceback (most recent call last):
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 226, in start
    self._run()
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 324, in _run
    result = job.start()
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 120, in start
    result = self._run()
  File "/data/osedukhin/shared/automlbenchmark/amlb/utils/process.py", line 744, in profiler
    return fn(*args, **kwargs)
  File "/data/osedukhin/shared/automlbenchmark/amlb/benchmark.py", line 578, in run
    meta_result = self.benchmark.framework_module.run(self._dataset, task_config)
  File "/data/osedukhin/shared/amlb_configs/extensions/CatBoost/__init__.py", line 14, in run
    from .exec import run
  File "/data/osedukhin/shared/amlb_configs/extensions/CatBoost/exec.py", line 3, in <module>
    import catboost
ModuleNotFoundError: No module named 'catboost'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 147, in stop
    self._cancel()
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 184, in _cancel
    raise_in_thread(self.thread_id, CancelledError(f"Job `{self.name}` was interrupted."))
  File "/data/osedukhin/shared/automlbenchmark/amlb/utils/process.py", line 437, in raise_in_thread
    ret = ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, exc_class)
amlb.utils.process.CancelledError: Job `local.amlb_full.1h_1fold.eucalyptus.0.CatBoost` was interrupted.
All jobs executed in 0.062 seconds.
[MONITORING] [python [1102918]] CPU Utilization: 2.0%
[MONITORING] [python [1102918]] Memory Usage: 5.1%
[MONITORING] [python [1102918]] Disk Usage: 79.9%
No module named 'catboost'
Traceback (most recent call last):
  File "runbenchmark.py", line 189, in <module>
    res = bench.run(args.task, args.fold)
  File "/data/osedukhin/shared/automlbenchmark/amlb/benchmark.py", line 211, in run
    results = self._run_jobs(jobs)
  File "/data/osedukhin/shared/automlbenchmark/amlb/benchmark.py", line 253, in _run_jobs
    self.job_runner.start()
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 226, in start
    self._run()
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 324, in _run
    result = job.start()
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 120, in start
    result = self._run()
  File "/data/osedukhin/shared/automlbenchmark/amlb/utils/process.py", line 744, in profiler
    return fn(*args, **kwargs)
  File "/data/osedukhin/shared/automlbenchmark/amlb/benchmark.py", line 578, in run
    meta_result = self.benchmark.framework_module.run(self._dataset, task_config)
  File "/data/osedukhin/shared/amlb_configs/extensions/CatBoost/__init__.py", line 14, in run
    from .exec import run
  File "/data/osedukhin/shared/amlb_configs/extensions/CatBoost/exec.py", line 3, in <module>
    import catboost
ModuleNotFoundError: No module named 'catboost'

Answer 2 · 2023-10-04T08:22:35.000Z

It looks like the setup method in __init__.py is not called.

Answer 3 · 2023-10-04T08:25:23.000Z

I added -s force. Now it tries to install CatBoost, but I get the following error output:

Running cmd `/data/osedukhin/shared/amlb_configs/extensions/CatBoost/setup.sh stable`
/data/osedukhin/shared/amlb_configs/extensions/CatBoost/setup.sh: line 5: /data/osedukhin/shared/amlb_configs/extensions/CatBoost/.setup_env: No such file or directory
/data/osedukhin/shared/amlb_configs/extensions/CatBoost/setup.sh: line 6: /frameworks/shared/setup.sh: No such file or directory
/data/osedukhin/shared/amlb_configs/extensions/CatBoost/setup.sh: line 7: PIP: command not found
/data/osedukhin/shared/amlb_configs/extensions/CatBoost/setup.sh: line 9: PY: command not found

Answer 4 · 2023-10-04T08:37:08.000Z

Finally got it. I added AMLB_ROOT=. prefix before python runbenchmark.py ... and it now works.

However, there is nothing about these problems in the docs. Why do they happen?
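
One likely piece of the explanation, judging from the error output above: a POSIX shell expands an unset variable to an empty string, so with AMLB_ROOT undefined the path $AMLB_ROOT/frameworks/shared/setup.sh collapses to /frameworks/shared/setup.sh, and the PIP and PY commands that the shared script presumably defines never exist. The snippet below is a minimal illustration of that expansion rule. expand_like_shell is a hypothetical helper that handles only the simple $NAME and ${NAME} forms.

```python
import re
from typing import Mapping

# Matches ${NAME} (group 1) or $NAME (group 2).
_VAR = re.compile(r"\$(?:\{(\w+)\}|(\w+))")


def expand_like_shell(text: str, env: Mapping[str, str]) -> str:
    """Expand $NAME and ${NAME}; unset names become "" as in a POSIX shell."""
    return _VAR.sub(lambda m: env.get(m.group(1) or m.group(2), ""), text)


path = "$AMLB_ROOT/frameworks/shared/setup.sh"
print(expand_like_shell(path, {}))                  # /frameworks/shared/setup.sh
print(expand_like_shell(path, {"AMLB_ROOT": "."}))  # ./frameworks/shared/setup.sh
```

The second call mirrors the AMLB_ROOT=. workaround: once the variable has a value, the shared script can be found relative to the repository root.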

Answer 5 · 2023-10-04T09:29:39.000Z

Thank you for opening an issue. I had to make a few modifications in order to reproduce your error:

I used predefined benchmark and constraints, as the amlb_full and 1h_1fold definitions were not supplied. This shouldn't have an effect since the error should be unrelated.
My catboost integration files were at {user}/extensions/CatBoost instead of {user}/frameworks/extensions/CatBoost, and I added a requirements.txt with the requirement catboost.
I added extensions/CatBoost/exec.py (see end of message)

It looks like our setup example is outdated. The script should have

- . "$HERE/.setup_env"
+ . "$HERE/.setup/setup_env"

This script sets AMLB_ROOT so the remainder of the installation should work after this modification.

edit: updated the exec.py script to ensure it works with datasets that have categorical features. It should now successfully run on the test benchmark.

{user}/extensions/CatBoost/exec.py:

import logging
import catboost

from amlb.benchmark import TaskConfig
from amlb.data import Dataset
from amlb.datautils import impute_array
from amlb.results import save_predictions
from amlb.utils import Timer

log = logging.getLogger(__name__)


def run(dataset: Dataset, config: TaskConfig):
    log.info(f"\n**** CatBoost v{catboost.__version__} ****\n")

    is_classification = config.type == 'classification'

    X_train, X_test = dataset.train.X, dataset.test.X
    y_train, y_test = dataset.train.y, dataset.test.y

    estimator = catboost.CatBoostClassifier if is_classification else catboost.CatBoostRegressor
    categorical_features = list(X_train.select_dtypes(include=['category']).columns)

    for feature in categorical_features:
        X_train[feature] = X_train[feature].cat.add_categories("missing")
        X_test[feature] = X_test[feature].cat.add_categories("missing")
    X_train.loc[:, categorical_features] = X_train.loc[:, categorical_features].fillna('missing')
    X_test.loc[:, categorical_features] = X_test.loc[:, categorical_features].fillna('missing')

    predictor = estimator(
        random_state=config.seed,
        cat_features=categorical_features if categorical_features else None,
        **config.framework_params)

    with Timer() as training:
        predictor.fit(X_train, y_train)
    predictions = predictor.predict(X_test)
    probabilities = predictor.predict_proba(X_test) if is_classification else None

    save_predictions(dataset=dataset,
                     output_file=config.output_predictions_file,
                     probabilities=probabilities,
                     predictions=predictions,
                     truth=y_test)

    return dict(
        models_count=1,
        training_duration=training.duration
    )

Answer 6 · 2023-10-04T11:31:49.000Z

@PGijsbers thank you! My current setup.sh is the following:

#!/usr/bin/env bash
shopt -s expand_aliases
HERE=$(dirname "$0")

. "$HERE/.setup/setup_env"
. "$AMLB_ROOT/frameworks/shared/setup.sh" "$HERE" true
PIP install -r "$HERE/requirements.txt"

PY -c "from catboost import __version__; print(__version__)" >> "${HERE}/.installed"

__init__.py is the following:

print('CatBoost __init.py__')

from amlb.utils import call_script_in_same_dir

def setup(*args, **kwargs):
    print('Installing CatBoost')
    call_script_in_same_dir(__file__, "setup.sh", *args, **kwargs)
    
def run(*args, **kwargs):
    from .exec import run
    return run(*args, **kwargs)

And the command is the following:

python runbenchmark.py catboost amlb_full 1h_1fold \
    --indir /data/osedukhin/shared/openml_cache \
    --outdir /data/osedukhin/shared/amlb_results \
    --userdir /data/osedukhin/shared/amlb_configs \
    --exit-on-error -s force

Also I added debug output to the beginning of exec.py:

import sys
print('sys.executable =', sys.executable)

However, it seems to install catboost in the created venv, but not to use this venv. The error log is the following:

(venv) osedukhin@ns-bd-v100-08:~/shared/automlbenchmark$ python runbenchmark.py catboost amlb_full 1h_1fold \
>     --indir /data/osedukhin/shared/openml_cache \
>     --outdir /data/osedukhin/shared/amlb_results \
>     --userdir /data/osedukhin/shared/amlb_configs \
>     --exit-on-error -s force
Running benchmark `catboost` on `amlb_full` framework in `local` mode.
Loading frameworks definitions from ['/data/osedukhin/shared/automlbenchmark/resources/frameworks.yaml', '/data/osedukhin/shared/automlbenchmark/examples/custom/frameworks.yaml', '/data/osedukhin/shared/amlb_configs/frameworks.yaml'].
Loading benchmark constraint definitions from ['/data/osedukhin/shared/automlbenchmark/resources/constraints.yaml', '/data/osedukhin/shared/amlb_configs/constraints.yaml'].
Loading benchmark definitions from /data/osedukhin/shared/amlb_configs/benchmarks/amlb_full.yaml.
CatBoost __init.py__
Setting up framework CatBoost.
Installing CatBoost
Running cmd `/data/osedukhin/shared/amlb_configs/extensions/CatBoost/setup.sh stable`
shared/setup.sh /data/osedukhin/shared/amlb_configs/extensions/CatBoost true
Collecting pip
  Using cached pip-23.2.1-py3-none-any.whl (2.1 MB)
Collecting wheel
  Using cached wheel-0.41.2-py3-none-any.whl (64 kB)
Installing collected packages: pip, wheel
  Attempting uninstall: pip
    Found existing installation: pip 20.0.2
    Uninstalling pip-20.0.2:
      Successfully uninstalled pip-20.0.2
Successfully installed pip-23.2.1 wheel-0.41.2
PY=/data/osedukhin/shared/amlb_configs/extensions/CatBoost/venv/bin/python -W ignore
PIP=/data/osedukhin/shared/amlb_configs/extensions/CatBoost/venv/bin/python -m pip
Looking in indexes: http://mirrors.tools.huawei.com/pypi/simple
Collecting numpy==1.24.2 (from -r /data/osedukhin/shared/automlbenchmark/frameworks/shared/requirements.txt (line 7))
..................
Installing collected packages: pytz, zipp, tzdata, tenacity, six, scipy, pyparsing, pillow, packaging, kiwisolver, graphviz, fonttools, cycler, contourpy, python-dateutil, plotly, importlib-resources, pandas, matplotlib, catboost
Successfully installed catboost-1.2.2 contourpy-1.1.1 cycler-0.12.0 fonttools-4.43.0 graphviz-0.20.1 importlib-resources-6.1.0 kiwisolver-1.4.5 matplotlib-3.7.3 packaging-23.2 pandas-2.0.3 pillow-10.0.1 plotly-5.17.0 pyparsing-3.1.1 python-dateutil-2.8.2 pytz-2023.3.post1 scipy-1.10.1 six-1.16.0 tenacity-8.2.3 tzdata-2023.3 zipp-3.17.0



Setup of framework CatBoost completed successfully.
[MONITORING] [python [1147288]] CPU Utilization: 1.6%

------------------------------------------------------------
Starting job local.amlb_full.1h_1fold.eucalyptus.0.CatBoost.
[MONITORING] [python [1147288]] Memory Usage: 5.1%
Assigning 32 cores (total=32) for new task eucalyptus.
[MONITORING] [python [1147288]] Disk Usage: 80.0%
Assigning 120242 MB (total=128826 MB) for new eucalyptus task.
Running task eucalyptus on framework CatBoost with config:
TaskConfig({'framework': 'CatBoost', 'framework_params': {}, 'framework_version': 'stable', 'type': 'classification', 'name': 'eucalyptus', 'openml_task_id': 359954, 'test_server': False, 'fold': 0, 'metric': 'logloss', 'metrics': ['logloss', 'acc', 'balacc'], 'seed': 526914963, 'job_timeout_seconds': 7200, 'max_runtime_seconds': 3600, 'cores': 32, 'max_mem_size_mb': 120242, 'min_vol_size_mb': -1, 'input_dir': '/data/osedukhin/shared/openml_cache', 'output_dir': '/data/osedukhin/shared/amlb_results/catboost.amlb_full.1h_1fold.local.20231004T114404', 'output_predictions_file': '/data/osedukhin/shared/amlb_results/catboost.amlb_full.1h_1fold.local.20231004T114404/predictions/eucalyptus/0/predictions.csv', 'tag': None, 'command': 'runbenchmark.py catboost amlb_full 1h_1fold --indir /data/osedukhin/shared/openml_cache --outdir /data/osedukhin/shared/amlb_results --userdir /data/osedukhin/shared/amlb_configs --exit-on-error -s force', 'git_info': {'repo': 'https://github.com/openml/automlbenchmark', 'branch': 'master', 'commit': '386cfb66baa576ca9891ca18007c8d298380da3e', 'tags': [], 'status': ['## master...origin/master', ' M frameworks/shared/setup.sh']}, 'measure_inference_time': False, 'ext': {}, 'quantile_levels': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9], 'type_': 'multiclass', 'output_metadata_file': '/data/osedukhin/shared/amlb_results/catboost.amlb_full.1h_1fold.local.20231004T114404/predictions/eucalyptus/0/metadata.json'})
sys.executable = /data/osedukhin/shared/automlbenchmark/venv/bin/python
Job `local.amlb_full.1h_1fold.eucalyptus.0.CatBoost` failed with error: No module named 'catboost'
Traceback (most recent call last):
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 120, in start
    result = self._run()
  File "/data/osedukhin/shared/automlbenchmark/amlb/utils/process.py", line 744, in profiler
    return fn(*args, **kwargs)
  File "/data/osedukhin/shared/automlbenchmark/amlb/benchmark.py", line 578, in run
    meta_result = self.benchmark.framework_module.run(self._dataset, task_config)
  File "/data/osedukhin/shared/amlb_configs/extensions/CatBoost/__init__.py", line 10, in run
    from .exec import run
  File "/data/osedukhin/shared/amlb_configs/extensions/CatBoost/exec.py", line 6, in <module>
    import catboost
ModuleNotFoundError: No module named 'catboost'
Job `local.amlb_full.1h_1fold.eucalyptus.0.CatBoost` did not stop gracefully: Job `local.amlb_full.1h_1fold.eucalyptus.0.CatBoost` was interrupted.
Traceback (most recent call last):
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 226, in start
    self._run()
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 324, in _run
    result = job.start()
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 120, in start
    result = self._run()
  File "/data/osedukhin/shared/automlbenchmark/amlb/utils/process.py", line 744, in profiler
    return fn(*args, **kwargs)
  File "/data/osedukhin/shared/automlbenchmark/amlb/benchmark.py", line 578, in run
    meta_result = self.benchmark.framework_module.run(self._dataset, task_config)
  File "/data/osedukhin/shared/amlb_configs/extensions/CatBoost/__init__.py", line 10, in run
    from .exec import run
  File "/data/osedukhin/shared/amlb_configs/extensions/CatBoost/exec.py", line 6, in <module>
    import catboost
ModuleNotFoundError: No module named 'catboost'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 147, in stop
    self._cancel()
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 184, in _cancel
    raise_in_thread(self.thread_id, CancelledError(f"Job `{self.name}` was interrupted."))
  File "/data/osedukhin/shared/automlbenchmark/amlb/utils/process.py", line 437, in raise_in_thread
    ret = ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, exc_class)
amlb.utils.process.CancelledError: Job `local.amlb_full.1h_1fold.eucalyptus.0.CatBoost` was interrupted.
All jobs executed in 0.073 seconds.
[MONITORING] [python [1147288]] CPU Utilization: 2.1%
[MONITORING] [python [1147288]] Memory Usage: 5.1%
[MONITORING] [python [1147288]] Disk Usage: 80.0%
No module named 'catboost'
Traceback (most recent call last):
  File "runbenchmark.py", line 189, in <module>
    res = bench.run(args.task, args.fold)
  File "/data/osedukhin/shared/automlbenchmark/amlb/benchmark.py", line 211, in run
    results = self._run_jobs(jobs)
  File "/data/osedukhin/shared/automlbenchmark/amlb/benchmark.py", line 253, in _run_jobs
    self.job_runner.start()
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 226, in start
    self._run()
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 324, in _run
    result = job.start()
  File "/data/osedukhin/shared/automlbenchmark/amlb/job.py", line 120, in start
    result = self._run()
  File "/data/osedukhin/shared/automlbenchmark/amlb/utils/process.py", line 744, in profiler
    return fn(*args, **kwargs)
  File "/data/osedukhin/shared/automlbenchmark/amlb/benchmark.py", line 578, in run
    meta_result = self.benchmark.framework_module.run(self._dataset, task_config)
  File "/data/osedukhin/shared/amlb_configs/extensions/CatBoost/__init__.py", line 10, in run
    from .exec import run
  File "/data/osedukhin/shared/amlb_configs/extensions/CatBoost/exec.py", line 6, in <module>
    import catboost
ModuleNotFoundError: No module named 'catboost'

Answer 7 · 2023-10-04T12:24:16.000Z

I got it! I need to use run_in_venv
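
The sys.executable line in the log above explains the failure: exec.py was imported by the benchmark's own interpreter (automlbenchmark/venv), while catboost had been installed into the framework's separate venv. A helper such as run_in_venv presumably hands the work to that other interpreter. The sketch below is not AMLB's implementation. It only illustrates the general idea with the standard library, and run_with_interpreter is a hypothetical name.

```python
import subprocess
import sys


def run_with_interpreter(python_exe: str, code: str) -> str:
    """Run `code` in a child process started with `python_exe`; return its stdout."""
    completed = subprocess.run(
        [python_exe, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child exits with an error
    )
    return completed.stdout.strip()


if __name__ == "__main__":
    # Substitute the framework's venv/bin/python for sys.executable to make
    # imports resolve against the packages installed in that environment.
    print(run_with_interpreter(sys.executable, "import sys; print(sys.executable)"))
```

The point of the design is that the importing process and the process that needs catboost do not have to be the same: only the child interpreter must be able to import the framework's packages.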