hello-gbdx

A GBDX task that lists the files in its input directory and writes that list, along with a user-defined message, to the file out.txt.

Run

This section walks through a sample run of the hello-gbdx task. Sample inputs are provided on S3 at the location specified below.

  1. In a Python terminal, create a GBDX interface and specify the task input location:

    from gbdxtools import Interface
    from os.path import join
    import uuid
    
    gbdx = Interface()
    
    input_location = 's3://gbd-customer-data/32cbab7a-4307-40c8-bb31-e2de32f940c2/platform-stories/hello-gbdx'

  2. Create a task instance and set the required inputs:

    # create task object
    hello_task = gbdx.Task('hello-gbdx')
    hello_task.inputs.data_in = join(input_location, 'data_in')
    hello_task.inputs.message = 'This is my message!'

  3. Initialize a workflow and specify where to save the output:

    # Define a single-task workflow
    workflow = gbdx.Workflow([hello_task])
    random_str = str(uuid.uuid4())
    output_location = join('platform-stories/trial-runs', random_str)
    
    workflow.savedata(hello_task.outputs.data_out, output_location)

  4. Execute the workflow and track its status as follows:

    workflow.execute()
    workflow.status
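
Checking workflow.status by hand works for a quick run, but a small polling helper can be more convenient. The sketch below is not part of gbdxtools: wait_for_completion is a hypothetical helper, and it assumes that the status is a dict whose 'state' key becomes 'complete' when the workflow finishes. Check the actual status output before relying on it.

```python
import time


def wait_for_completion(get_status, poll_seconds=30.0, timeout_seconds=3600.0):
    """Poll get_status() until the reported state is 'complete'.

    get_status is any zero-argument callable returning a dict such as
    {'state': 'complete', 'event': 'succeeded'} (assumed status shape).
    Raises TimeoutError if the state is not 'complete' before the deadline.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status.get('state') == 'complete':
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError('workflow did not complete in time: %r' % (status,))
        time.sleep(poll_seconds)


# Usage with the workflow created above (requires GBDX credentials):
# final_status = wait_for_completion(lambda: workflow.status)
```

Passing a callable rather than the workflow object keeps the helper independent of gbdxtools, so it can be tried out with a stub before running a real workflow.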

Input Ports

The task has the following input ports. GBDX input ports can only be of "Directory" or "String" type. Note that booleans, integers and floats must be passed to the task as strings, e.g., "True", "10", "0.001".
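
Because every non-directory value travels as a string, it can help to convert values explicitly on both sides. The following is a minimal sketch; to_port_value and parse_bool are hypothetical helper names, not part of gbdxtools or of this task.

```python
def to_port_value(value):
    """Convert a Python scalar to the string form expected by a string port."""
    if isinstance(value, str):
        return value
    if isinstance(value, (bool, int, float)):
        return str(value)
    raise TypeError('unsupported port value type: %s' % type(value).__name__)


def parse_bool(text):
    """Parse a boolean port value inside the task.

    Note that bool('False') is True in Python, so compare the text instead.
    """
    return text.strip().lower() == 'true'


print(to_port_value(True))    # True
print(to_port_value(10))      # 10
print(to_port_value(0.001))   # 0.001
print(parse_bool('False'))    # False
```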

Name     Type       Description              Required
message  string     A user-defined message.  True
data_in  directory  Input data directory.    True

Output Ports

Name      Type       Description
data_out  directory  Output data directory. Contains out.txt, which has a list of files in data_in and the user-defined message.
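
To make the relationship between the ports concrete, here is a rough sketch of what a task like this might do inside its container. It is not the actual hello-gbdx.py. It assumes Python 3 and the usual GBDX layout, where directory ports are mounted under /mnt/work/input/<port name>, string ports are delivered in /mnt/work/input/ports.json, and outputs are written under /mnt/work/output/<port name>. The function name write_listing and the exact layout of out.txt are illustrative choices.

```python
import json
import os


def write_listing(work_dir='/mnt/work'):
    """List the files in data_in and write them, plus the message, to out.txt."""
    input_dir = os.path.join(work_dir, 'input')
    data_in = os.path.join(input_dir, 'data_in')
    data_out = os.path.join(work_dir, 'output', 'data_out')

    # Assumption: string ports arrive as a JSON object in input/ports.json
    with open(os.path.join(input_dir, 'ports.json')) as f:
        message = json.load(f)['message']

    file_names = sorted(os.listdir(data_in))

    os.makedirs(data_out, exist_ok=True)
    out_path = os.path.join(data_out, 'out.txt')
    with open(out_path, 'w') as f:
        for name in file_names:
            f.write(name + '\n')
        f.write(message + '\n')
    return out_path


# Inside the container this would be called as write_listing();
# pass a different work_dir to try it against a local directory tree.
```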

Development

Build the Docker Image

You need to install Docker.

Clone the repository:

git clone https://github.com/platformstories/hello-gbdx

Then build the image:

cd hello-gbdx
docker build -t hello-gbdx .

Try it out locally

Create a container in interactive mode and mount the sample input under /mnt/work/input/ (replace full/path/to/sample-input with the absolute path to the sample input on your machine):

docker run --rm -v full/path/to/sample-input:/mnt/work/input -it hello-gbdx

Then, within the container:

python /hello-gbdx.py

Docker Hub

Log in to Docker Hub:

docker login

Tag your image using your username and push it to Docker Hub:

docker tag hello-gbdx yourusername/hello-gbdx
docker push yourusername/hello-gbdx

The image name should be the same as the image name under containerDescriptors in hello-gbdx.json.
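
A quick way to catch a mismatch is to compare the pushed image name with the definition before registering. The sketch below assumes the common GBDX task definition layout, in which each entry of containerDescriptors carries the image name under properties.image. The inline definition is a trimmed, hypothetical example rather than the contents of the real file, and container_images is an illustrative helper name.

```python
import json


def container_images(definition):
    """Return the Docker image names listed under containerDescriptors."""
    return [
        descriptor.get('properties', {}).get('image')
        for descriptor in definition.get('containerDescriptors', [])
    ]


# Hypothetical, trimmed-down definition used only to illustrate the check.
example_definition = json.loads('''
{
  "name": "hello-gbdx",
  "version": "0.0.1",
  "containerDescriptors": [
    {"type": "DOCKER", "properties": {"image": "yourusername/hello-gbdx"}}
  ]
}
''')

pushed_image = 'yourusername/hello-gbdx'
assert pushed_image in container_images(example_definition)
```

For the real file, load it with json.load(open(...)) and run the same check.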

Alternatively, you can link this repository to a Docker automated build. Every time you push a change to the repository, the Docker image is rebuilt automatically.

Register on GBDX

In a Python terminal:

from gbdxtools import Interface
gbdx = Interface()
gbdx.task_registry.register(json_filename="hello-gbdx-definition.json")

Note: If you change the task image, you must re-register the task with a higher version number for the new image to take effect. Keep this in mind especially if you use a Docker automated build.
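
Bumping the version can be scripted. The following sketch assumes that the task definition stores its version as a "major.minor.patch" string under a top-level "version" key. Both helper names are hypothetical, so verify the format against your own definition file before using them.

```python
def bump_patch(version):
    """Return the version string with its patch component incremented."""
    major, minor, patch = (int(part) for part in version.split('.'))
    return '%d.%d.%d' % (major, minor, patch + 1)


def bump_definition(definition):
    """Return a copy of the task definition dict with a higher version."""
    updated = dict(definition)
    updated['version'] = bump_patch(definition['version'])
    return updated


print(bump_patch('0.0.1'))  # 0.0.2
```

After writing the updated definition back to disk, register it again as shown above.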