chicago-taxi
Rework the current chicago-taxi solution. The current implementation has dependency conflicts on AI Platform. We may need to upgrade the runtime and the Keras version, and change the deployment strategy.
@truongc2 and @bobbyjacob, is anyone looking at this?
I pushed the chicago-taxi demo that I got working for the GCP MLOps assessment to a new branch in this repo: gcp_ml_assessment_chicago_taxi.
That demo works, but only in the Jupyter notebook environment on the Vertex AI Workbench instance that I set up for the assessment: python-20220624-164215-bjacob.
If I clone the repo to another Workbench instance, the demo breaks due to Python dependency issues, because the new notebook has a different set of default installed packages. I tried creating conda and venv versions of the working notebook environment, but I can't get either working on another instance.
I pushed a new branch, gcp_ml_assessment_chicago_taxi_aaa. In a new Workbench environment, I ran these commands:
pip install tensorflow==2.8.4
pip install tensorflow_data_validation==1.8.0
pip install tensorflow_transform==1.8.0
pip install cloudml-hypertune
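To confirm that a fresh Workbench instance really ended up with the pinned versions above, a small check like the following might help. It is only a sketch: the `PINS` dictionary and the `check_pins` helper are hypothetical names, not part of the repo, and `importlib.metadata` assumes Python 3.8 or newer in the notebook kernel.

```python
from importlib import metadata  # assumption: Python 3.8+ in the notebook kernel

# Pins copied from the pip commands above (hypothetical dictionary name).
PINS = {
    "tensorflow": "2.8.4",
    "tensorflow_data_validation": "1.8.0",
    "tensorflow_transform": "1.8.0",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that does not match.

    `found` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Running this in both the working and the new instance would show which packages differ between them. It only covers the notebook environment, not the container that runs the training job.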
This notebook completes without error: 01-dataset-management.ipynb.
This one fails with
CustomJob projects/354621994428/locations/us-central1/customJobs/2464840774366265344 current state:
JobState.JOB_STATE_RUNNING
CustomJob projects/354621994428/locations/us-central1/customJobs/2464840774366265344 current state:
JobState.JOB_STATE_RUNNING
CustomJob projects/354621994428/locations/us-central1/customJobs/2464840774366265344 current state:
JobState.JOB_STATE_FAILED
"The replica workerpool0-0 exited with a non-zero status of 1.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/.local/lib/python3.7/site-packages/src/model_training/task.py", line 27, in <module>
    from src.model_training import defaults, trainer, exporter
  File "/root/.local/lib/python3.7/site-packages/src/model_training/trainer.py", line 18, in <module>
    import tensorflow_transform as tft
  File "/opt/conda/lib/python3.7/site-packages/tensorflow_transform/__init__.py", line 19, in <module>
    from tensorflow_transform.analyzers import *
  File "/opt/conda/lib/python3.7/site-packages/tensorflow_transform/analyzers.py", line 39, in <module>
    from tensorflow_transform import analyzer_nodes
  File "/opt/conda/lib/python3.7/site-packages/tensorflow_transform/analyzer_nodes.py", line 36, in <module>
    from tensorflow_transform import nodes
  File "/opt/conda/lib/python3.7/site-packages/tensorflow_transform/nodes.py", line 33, in <module>
    from future.utils import with_metaclass
ModuleNotFoundError: No module named 'future'
To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=354621994428&resource=ml_job%2Fjob_id%2F2464840774366265344&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%222464840774366265344%22"
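The traceback shows tensorflow_transform importing `future.utils` inside the training container, where the `future` package is not installed. Adding `future` to the training package's dependencies is the likely fix, but how this repo declares those dependencies is not shown here. As a sketch, a preflight check like the one below could surface this kind of gap with a clearer message. The `REQUIRED_MODULES` list and the `missing_modules` helper are hypothetical, and where to call it (for example at the top of task.py) is an assumption.

```python
import importlib.util
import sys

# Assumption: top-level modules the training code needs at import time.
# The list is illustrative; extend it to match the real imports.
REQUIRED_MODULES = ["tensorflow", "tensorflow_transform", "future"]


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def preflight(names=REQUIRED_MODULES):
    """Exit with a readable message if any required module is absent."""
    missing = missing_modules(names)
    if missing:
        sys.exit("Missing modules in this environment: " + ", ".join(missing))
```

Calling `preflight()` before the heavy imports would make the job fail with a single line naming every missing module, rather than with a traceback that stops at the first one.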