GoogleCloudPlatform/vertex-pipelines-end-to-end-samples

Make run command is broken

rauerhans opened this issue · 3 comments

After setting everything up (Terraform IaC, Python envs, env vars), executing `make run pipeline=training` fails with this error message:

luegnix-gcp on main via 🐍 v3.7.12 on ☁️ (eu-central-1) on ☁️ asterix.versus.caesar@proton.me
❯ make run pipeline=training
/Users/hans.rauer/Repos/luegnix-gcp/.venv/lib/python3.7/site-packages/kfp/v2/compiler/compiler.py:1266: FutureWarning: APIs imported from the v1 namespace (e.g. kfp.dsl, kfp.components, etc) will not be supported by the v2 compiler since v2.0.0
  category=FutureWarning,

WARNING: gsutil rsync uses hashes when modification time is not available at
both the source and destination. Your crcmod installation isn't using the
module's C extension, so checksumming will run very slowly. If this is your
first rsync since updating gsutil, this rsync can take significantly longer than
usual. For help installing the extension, please see "gsutil help crcmod".

Building synchronization state...
If you experience problems with multiprocessing on MacOS, they might be related to https://bugs.python.org/issue33725. You can disable multiprocessing by editing your .boto config or by adding the following flag to your command: `-o "GSUtil:parallel_process_count=1"`. Note that multithreading is still available even if you disable multiprocessing.

Starting synchronization...
If you experience problems with multiprocessing on MacOS, they might be related to https://bugs.python.org/issue33725. You can disable multiprocessing by editing your .boto config or by adding the following flag to your command: `-o "GSUtil:parallel_process_count=1"`. Note that multithreading is still available even if you disable multiprocessing.

usage: main.py [-h] [--template_path TEMPLATE_PATH]
               [--enable_caching ENABLE_CACHING]
main.py: error: unrecognized arguments: --pipeline=./training.json
make: *** [run] Error 2
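
For context, the failure is plain argparse behaviour rather than anything Vertex-specific: `parse_args()` aborts with exit code 2 on any flag that was never registered on the parser, which is the Error 2 that make propagates. A minimal standalone sketch reproducing it (hypothetical, not the repo's code):

import argparse

# Register the same two options that the trigger's main.py defines.
parser = argparse.ArgumentParser()
parser.add_argument("--template_path", type=str)
parser.add_argument("--enable_caching", type=str, default=None)

# The flag the Makefile sends is unknown to the parser, so this prints
#   error: unrecognized arguments: --pipeline=./training.json
# and raises SystemExit(2).
parser.parse_args(["--pipeline=./training.json"])

(`parse_known_args()` would tolerate the stray flag, but the proper fix is on the Makefile side.)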

The included targets `make compile` and `make sync-assets` work as expected. `make run` clearly fails because the Python trigger code does not define a `--pipeline` option. Looking at the code, this is not unexpected:

import argparse
import logging
from typing import List

from google.cloud import aiplatform

# trigger_pipeline_from_payload is defined elsewhere in the trigger
# module (omitted here for brevity).


def sandbox_run(args: List[str] = None) -> aiplatform.PipelineJob:
    """Trigger a Vertex Pipeline run from a (local) compiled pipeline definition.
    Returns the PipelineJob object of the triggered pipeline run.
    Usage: python main.py --template_path=pipeline.json --enable_caching=true
    """
    logging.basicConfig(level=logging.DEBUG)

    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--template_path", help="Path to the payload JSON file", type=str
    )
    parser.add_argument("--enable_caching", type=str, default=None)

    # Get commandline args
    args = parser.parse_args(args)

    # If empty value for enable_caching provided on commandline default to None
    if args.enable_caching == "":
        args.enable_caching = None

    payload = {
        "attributes": {
            "template_path": args.template_path,
            "enable_caching": args.enable_caching,
        }
        # "data" omitted as pipeline params are taken from the default args
        # in compiled JSON pipeline
    }

    return trigger_pipeline_from_payload(payload)


if __name__ == "__main__":
    sandbox_run()
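
So the parser only accepts `--template_path` and `--enable_caching`; a direct invocation following the docstring, `python main.py --template_path=./training.json --enable_caching=true`, parses cleanly, which pins the bug on the Makefile's `--pipeline` flag.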

Could you please try adjusting the Makefile, replacing `--pipeline` with `--template_path`?
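
For illustration, the corrected target might look like the sketch below; the target name and prerequisites are inferred from the log (compilation and asset sync visibly run first), and `$(enable_caching)` is an assumed variable, empty by default, which main.py already maps to None:

# Hypothetical sketch of the fixed run target -- the repo's real
# Makefile likely differs. The essential change is passing
# --template_path (which main.py defines) instead of --pipeline.
run: compile sync-assets
	python main.py --template_path=./$(pipeline).json --enable_caching=$(enable_caching)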

That's exactly what I did just now; I drew the same connection :D

Update: this works.
I guess it would need a fix before you close this issue, right?