Azure/azureml-examples

Example Pipeline with Azure OpenAI CommandComponents Fails to Import Data

sraza-onshape opened this issue · 0 comments

Operating System

Linux

Version Information

Python Version: 3.10.11
azure-ai-ml package version: 1.8.0 (and also 1.12.1)
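
For anyone trying to reproduce this, the versions above can be confirmed from inside the notebook kernel. This is a minimal sketch using only the standard library; `report_versions` is a hypothetical helper name, and the package list passed to it is just an example.

```python
import sys
from importlib import metadata


def report_versions(packages):
    """Return a dict of package name -> installed version (None if not installed)."""
    # Record the interpreter version alongside the package versions.
    versions = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the active kernel.
            versions[name] = None
    return versions


# Example package names; adjust to whatever the notebook imports.
print(report_versions(["azure-ai-ml", "azure-identity", "azure-core"]))
```

Running this in the same kernel as the notebook helps rule out the case where the kernel and the shell environment have different package versions installed.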

Steps to reproduce

  1. Create an Azure ML workspace
  2. Inside the workspace, create a compute instance (I'm using a Standard_DS11_v2 VM)
  3. Start the instance and connect to it from a local VS Code session via a websocket (using the Azure ML extension)
  4. Clone this repo
  5. Find the credentials for your workspace
  6. Run the code in the notebook

Expected behavior

The code should create a new pipeline job in the Azure ML workspace that fine-tunes a GPT-3.5 Turbo model using a user-defined dataset (stored in a directory in this repo).

Actual behavior

The pipeline job is created, but it fails.

It has 2 nodes; the failure occurs at the first one, "Data Import".

The error at this node says: UserError: Failed to submit job due to Exception: Response status code does not indicate success: 404 (Could not find datastore: azureml_managed_openaidevaulttrainingdata.). Microsoft.RelInfra.Common.Exceptions.ErrorResponseException: Could not find datastore: azureml_managed_openaidevaulttrainingdata...

No logs or code are available for this node.
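
One way to narrow this down is to check whether the datastore named in the error is visible in the workspace at all. The sketch below is an assumption-laden diagnostic, not a fix: `missing_datastores` and `build_client` are hypothetical helper names, the subscription, resource group, and workspace values are placeholders, and it is an assumption that a managed datastore like this one would show up in `ml_client.datastores.list()`.

```python
def missing_datastores(ml_client, expected_names):
    """Return the expected datastore names that the workspace does not list."""
    # datastores.list() yields datastore objects that expose a .name attribute.
    available = {ds.name for ds in ml_client.datastores.list()}
    return [name for name in expected_names if name not in available]


def build_client(subscription_id, resource_group, workspace_name):
    """Create an MLClient (requires azure-ai-ml and azure-identity)."""
    # Imported lazily so the helper above can be used without the Azure SDK.
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    return MLClient(
        credential=DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace_name,
    )


# Usage (placeholders, not real values):
# client = build_client("<subscription-id>", "<resource-group>", "<workspace-name>")
# print(missing_datastores(client, ["azureml_managed_openaidevaulttrainingdata"]))
```

If the name comes back as missing, that would be consistent with the 404 above, though it would not by itself say whether the notebook or the workspace setup is responsible for creating that datastore.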

Additional information

The notebook code I am referring to is here.

I'm not sure whether the root cause of this error is in the code or in the Azure ML workspace. In case it's the former, here is a list of all the packages and versions in the Jupyter kernel I'm using to run the notebook:

  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - asttokens=2.2.1=pyhd8ed1ab_0
  - backcall=0.2.0=pyh9f0ad1d_0
  - backports=1.0=pyhd8ed1ab_3
  - backports.functools_lru_cache=1.6.4=pyhd8ed1ab_0
  - bzip2=1.0.8=h7b6447c_0
  - ca-certificates=2023.5.7=hbcca054_0
  - debugpy=1.5.1=py310h295c915_0
  - decorator=5.1.1=pyhd8ed1ab_0
  - entrypoints=0.4=pyhd8ed1ab_0
  - executing=1.2.0=pyhd8ed1ab_0
  - ipykernel=6.15.0=pyh210e3f2_0
  - ipython=8.14.0=pyh41d4057_0
  - jedi=0.18.2=pyhd8ed1ab_0
  - jupyter_client=7.3.4=pyhd8ed1ab_0
  - jupyter_core=5.3.1=py310hff52083_0
  - ld_impl_linux-64=2.38=h1181459_1
  - libffi=3.4.4=h6a678d5_0
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libsodium=1.0.18=h36c2ea0_1
  - libstdcxx-ng=11.2.0=h1234567_1
  - libuuid=1.41.5=h5eee18b_0
  - matplotlib-inline=0.1.6=pyhd8ed1ab_0
  - ncurses=6.4=h6a678d5_0
  - nest-asyncio=1.5.6=pyhd8ed1ab_0
  - openssl=3.0.9=h7f8727e_0
  - parso=0.8.3=pyhd8ed1ab_0
  - pexpect=4.8.0=pyh1a96a4e_2
  - pickleshare=0.7.5=py_1003
  - pip=23.1.2=py310h06a4308_0
  - platformdirs=3.6.0=pyhd8ed1ab_0
  - prompt-toolkit=3.0.38=pyha770c72_0
  - prompt_toolkit=3.0.38=hd8ed1ab_0
  - psutil=5.9.0=py310h5eee18b_0
  - ptyprocess=0.7.0=pyhd3deb0d_0
  - pure_eval=0.2.2=pyhd8ed1ab_0
  - pygments=2.15.1=pyhd8ed1ab_0
  - python=3.10.11=h955ad1f_3
  - python-dateutil=2.8.2=pyhd8ed1ab_0
  - python_abi=3.10=2_cp310
  - pyzmq=25.1.0=py310h6a678d5_0
  - readline=8.2=h5eee18b_0
  - setuptools=67.8.0=py310h06a4308_0
  - six=1.16.0=pyh6c4a22f_0
  - sqlite=3.41.2=h5eee18b_0
  - stack_data=0.6.2=pyhd8ed1ab_0
  - tk=8.6.12=h1ccaba5_0
  - tornado=6.1=py310h5764c6d_3
  - traitlets=5.9.0=pyhd8ed1ab_0
  - typing-extensions=4.6.3=hd8ed1ab_0
  - typing_extensions=4.6.3=pyha770c72_0
  - wcwidth=0.2.6=pyhd8ed1ab_0
  - wheel=0.38.4=py310h06a4308_0
  - xz=5.4.2=h5eee18b_0
  - zeromq=4.3.4=h9c3ff4c_1
  - zlib=1.2.13=h5eee18b_0
  - pip:
      - adal==1.2.7
      - aiosignal==1.3.1
      - alembic==1.11.1
      - argcomplete==2.1.2
      - attrs==23.1.0
      - azure-ai-ml==1.12.1
      - azure-common==1.1.28
      - azure-core==1.27.1
      - azure-graphrbac==0.61.1
      - azure-identity==1.13.0
      - azure-mgmt-authorization==3.0.0
      - azure-mgmt-containerregistry==10.1.0
      - azure-mgmt-core==1.4.0
      - azure-mgmt-keyvault==10.2.2
      - azure-mgmt-resource==22.0.0
      - azure-mgmt-storage==21.0.0
      - azure-storage-blob==12.16.0
      - azure-storage-file-datalake==12.11.0
      - azure-storage-file-share==12.12.0
      - azureml-core==1.51.0.post1
      - azureml-dataprep==4.12.1
      - azureml-dataprep-native==38.0.0
      - azureml-dataprep-rslex==2.19.2
      - azureml-fsspec==1.2.0
      - azureml-mlflow==1.51.0
      - backports-tempfile==1.0
      - backports-weakref==1.0.post1
      - bcrypt==4.0.1
      - blinker==1.6.2
      - cachetools==5.3.1
      - certifi==2023.5.7
      - cffi==1.15.1
      - charset-normalizer==3.1.0
      - click==8.0.4
      - cloudpickle==2.2.1
      - colorama==0.4.6
      - contextlib2==21.6.0
      - contourpy==1.1.0
      - cryptography==41.0.1
      - cycler==0.11.0
      - cython==0.29.35
      - databricks-cli==0.17.7
      - distlib==0.3.6
      - distro==1.8.0
      - docker==6.1.3
      - dotnetcore2==3.1.23
      - filelock==3.12.2
      - flask==2.3.2
      - fonttools==4.40.0
      - frozenlist==1.3.3
      - fsspec==2023.6.0
      - gitdb==4.0.10
      - gitpython==3.1.31
      - google-api-core==2.11.1
      - google-auth==2.20.0
      - googleapis-common-protos==1.59.1
      - greenlet==2.0.2
      - grpcio==1.43.0
      - gunicorn==20.1.0
      - humanfriendly==10.0
      - idna==3.4
      - imageio==2.31.1
      - importlib-metadata==6.7.0
      - isodate==0.6.1
      - itsdangerous==2.1.2
      - jeepney==0.8.0
      - jinja2==3.1.2
      - jmespath==1.0.1
      - joblib==1.2.0
      - jsonpickle==3.0.1
      - jsonschema==4.17.3
      - kiwisolver==1.4.4
      - knack==0.10.1
      - lazy-loader==0.2
      - mako==1.2.4
      - markdown==3.4.3
      - markupsafe==2.1.3
      - marshmallow==3.19.0
      - matplotlib==3.7.1
      - mldesigner==0.1.0b13
      - mlflow==2.4.1
      - mlflow-skinny==2.4.1
      - mltable==1.4.1
      - msal==1.22.0
      - msal-extensions==1.0.0
      - msgpack==1.0.5
      - msrest==0.7.1
      - msrestazure==0.6.4
      - ndg-httpsclient==0.5.1
      - networkx==3.1
      - numpy==1.25.0
      - oauthlib==3.2.2
      - opencensus==0.11.2
      - opencensus-context==0.1.3
      - opencensus-ext-azure==1.1.9
      - packaging==23.0
      - pandas==2.0.2
      - paramiko==3.2.0
      - pathspec==0.11.1
      - pillow==9.5.0
      - pkginfo==1.9.6
      - portalocker==2.7.0
      - protobuf==3.20.3
      - pyarrow==12.0.1
      - pyasn1==0.5.0
      - pyasn1-modules==0.3.0
      - pycparser==2.21
      - pydash==7.0.5
      - pyjwt==2.7.0
      - pynacl==1.5.0
      - pyopenssl==23.2.0
      - pyparsing==3.1.0
      - pyrsistent==0.19.3
      - pysocks==1.7.1
      - pytz==2023.3
      - pywavelets==1.4.1
      - pyyaml==6.0
      - querystring-parser==1.2.4
      - ray==2.0.0
      - requests==2.31.0
      - requests-oauthlib==1.3.1
      - rsa==4.9
      - scikit-image==0.21.0
      - scikit-learn==1.2.2
      - scipy==1.10.1
      - secretstorage==3.3.3
      - smmap==5.0.0
      - sqlalchemy==2.0.16
      - sqlparse==0.4.4
      - strictyaml==1.7.3
      - tabulate==0.9.0
      - threadpoolctl==3.1.0
      - tifffile==2023.4.12
      - tqdm==4.65.0
      - tzdata==2023.3
      - urllib3==1.26.16
      - virtualenv==20.23.1
      - websocket-client==1.6.0
      - werkzeug==2.3.6
      - zipp==3.15.0