arcus-azure/arcus.azureml

Import error when launching run script

Describe the bug
The run is scheduled in an Experiment in Azure ML and Docker builds the image correctly, but executing the train.py script fails with the following error:

  File "train.py", line 15, in <module>
    from arcus.ml.images import *
  File "/azureml-envs/azureml_c8679ff754035121fa7879f8b571ce9a/lib/python3.6/site-packages/arcus/ml/images/io.py", line 8, in <module>
    from cv2 import imread, imdecode, IMREAD_COLOR
  File "/azureml-envs/azureml_c8679ff754035121fa7879f8b571ce9a/lib/python3.6/site-packages/cv2/__init__.py", line 5, in <module>
    from .cv2 import *
ImportError: libGL.so.1: cannot open shared object file: No such file or directory

It seems that the libGL.so.1 shared library, which cv2 loads at import time, is missing from the image and needs to be added to the build.
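To confirm the diagnosis from inside the built environment, a minimal sketch like the following asks the dynamic loader whether a shared library can be opened. The helper name `shared_library_available` is hypothetical and not part of arcus or Azure ML; it only uses `ctypes` from the standard library.

```python
import ctypes


def shared_library_available(name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library.

    Hypothetical helper for illustration; not part of arcus or Azure ML.
    """
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False


if __name__ == "__main__":
    # libGL.so.1 is the library named in the ImportError above.
    print("libGL.so.1 available:", shared_library_available("libGL.so.1"))
```

If this prints `False` in the run's environment, the cv2 import would be expected to fail in the same way as the traceback shows.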

To Reproduce
Launch a default arcus training run with any train.py script:

from arcus.azureml.environment.aml_environment import AzureMLEnvironment

work_env = AzureMLEnvironment.Create(config_file="../.azureml/config.json")

training_name = 'your_training_name'
trainer = work_env.start_experiment(training_name)
trainer.setup_training(training_name, overwrite=False)

dataset_name = 'your_dataset_name'

arguments = {
    '--epochs': 75,
    '--batch_size': 256,
    '--es_patience': 20,
    '--train_test_split_ratio': 0.08
}
trainer.start_training(
    training_name,
    estimator_type='tensorflow',
    input_datasets=[dataset_name],
    compute_target='your_instance',
    gpu_compute=True,
    script_parameters=arguments,
)

Expected behavior
Training should launch just as it does locally, but in the cloud, as a run within an experiment.

This is related to the requirements.txt file, which needs to be updated.
Things to do:

  • Include the correct OpenCV (cv2) dependency in the arcus pip package
  • Update the requirements.txt in the training resources

Working requirements.txt file:

pip==20.3.1
arcus-azureml>=1.1.2.2a2020061514
arcus-ml>=1.0.11.1
opencv-python==3.3.0.9
azureml-telemetry
azureml-widgets
tensorflow
azureml-dataprep
azureml-train
numpy
pandas
azureml-core
tqdm
joblib
scikit-learn
matplotlib
seaborn
scikit-image
inference-schema
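
A hand-maintained requirements file can pick up repeated package entries over time. The sketch below is one way to spot them before updating the training resources; the function names and the sample text are illustrative assumptions, and the parsing is deliberately simple (it only reads the leading package name on each line, not the full requirement syntax).

```python
import re
from collections import Counter


def requirement_names(text: str) -> list:
    """Extract normalized package names from requirements-style text."""
    names = []
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # The package name is the leading run of name characters,
        # which stops at a version specifier such as == or >=.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            # Treat '-', '_' and '.' as equivalent and ignore case.
            names.append(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


def duplicated_requirements(text: str) -> list:
    """Return the sorted package names that appear more than once."""
    counts = Counter(requirement_names(text))
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    # Illustrative sample, not the full file from this issue.
    sample = "azureml-core\ntqdm\nopencv-python==3.3.0.9\nazureml-core\ntqdm\n"
    print(duplicated_requirements(sample))
```

In practice the text would be read from the requirements.txt in the training resources rather than from an inline string.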