A GBDX task that obtains a list of the task input files and writes this list to the file out.txt, along with a user-defined message.
Here we run a sample execution of the hello-gbdx task. Sample inputs are provided on S3 in the locations specified below.
- In a Python terminal, create a GBDX interface and specify the task input location:
from gbdxtools import Interface
from os.path import join
import uuid
gbdx = Interface()
input_location = 's3://gbd-customer-data/32cbab7a-4307-40c8-bb31-e2de32f940c2/platform-stories/hello-gbdx'
- Create a task instance and set the required inputs:
# create task object
hello_task = gbdx.Task('hello-gbdx')
hello_task.inputs.data_in = join(input_location, 'data_in')
hello_task.inputs.message = 'This is my message!'
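One caveat about building S3 locations with `os.path.join`: on Windows it inserts backslashes, which are not valid separators in an S3 path. A minimal sketch of a platform-independent alternative uses `posixpath.join` from the standard library; the bucket and prefix below are hypothetical placeholders, not the sample location used above.

```python
import posixpath

# Hypothetical location, used only to illustrate the join behaviour.
example_location = 's3://example-bucket/example-prefix'

# posixpath.join always uses '/' regardless of the operating system.
example_data_in = posixpath.join(example_location, 'data_in')
print(example_data_in)
```

On Linux and macOS `os.path.join` behaves the same way, so this only matters if the workflow script may be run from Windows.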
- Initialize a workflow and specify where to save the output:
# Define a single-task workflow
workflow = gbdx.Workflow([hello_task])
random_str = str(uuid.uuid4())
output_location = join('platform-stories/trial-runs', random_str)
workflow.savedata(hello_task.outputs.data_out, output_location)
- Execute the workflow and track its status as follows:
workflow.execute()
workflow.status
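Since `workflow.status` only reports the state at the moment it is read, a long-running workflow is usually polled. The following is a minimal sketch of one way to do that. `wait_until_complete` is a hypothetical helper, and it assumes that the workflow object exposes a boolean `complete` attribute; check this against your installed gbdxtools version.

```python
import time


def wait_until_complete(workflow, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll a workflow-like object until it reports completion.

    Assumption: `workflow.complete` is truthy once the workflow has finished.
    Returns True if completion was seen within `max_polls` checks, else False.
    """
    for _ in range(max_polls):
        if workflow.complete:
            return True
        # Wait before asking again to avoid hammering the API.
        sleep(poll_seconds)
    return False
```

Calling `wait_until_complete(workflow)` after `workflow.execute()` blocks until the workflow finishes or the polling budget runs out; `workflow.status` can then be inspected for the final state.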
The task input ports are listed below. GBDX input ports can only be of "Directory" or "String" type, so booleans, integers and floats must be passed to the task as strings, e.g., "True", "10", "0.001".
Name | Type | Description | Required |
---|---|---|---|
message | string | A user-defined message. | True |
data_in | directory | Input data directory. | True |
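The string-only rule for non-directory ports can be handled with two small helpers, sketched below. Both `to_port_string` and `parse_bool` are hypothetical names introduced for illustration; they are not part of gbdxtools.

```python
def to_port_string(value):
    """Convert a Python value to the string form passed to a string port."""
    if isinstance(value, str):
        return value
    # bool, int and float have readable str() forms such as "True" or "10".
    return str(value)


def parse_bool(text):
    """Inside the task, turn a string port value back into a bool.

    Assumption: only the word "true" (in any letter case) means True.
    """
    return text.strip().lower() == "true"
```

For example, a flag would be set on the workflow side with `to_port_string(True)` and read inside the task with `parse_bool(...)`.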

The task output ports:

Name | Type | Description |
---|---|---|
data_out | directory | Output data directory. Contains out.txt, which has a list of files in data_in and the user-defined message. |
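The behaviour described by these ports can be sketched as follows. This is not the repository's actual hello-gbdx.py: `write_listing` is a hypothetical function, and the exact layout of out.txt (one file name per line, followed by the message) is an assumption. Inside a container the input directory would sit under /mnt/work/input; the output directory is left as a parameter here.

```python
import os


def write_listing(input_dir, output_dir, message):
    """Write the names of the files in `input_dir`, then `message`, to out.txt."""
    os.makedirs(output_dir, exist_ok=True)
    # Sort so the listing is deterministic across runs and platforms.
    filenames = sorted(
        name for name in os.listdir(input_dir)
        if os.path.isfile(os.path.join(input_dir, name))
    )
    out_path = os.path.join(output_dir, "out.txt")
    with open(out_path, "w") as f:
        for name in filenames:
            f.write(name + "\n")
        f.write(message + "\n")
    return out_path
```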
You need to install Docker.
Clone the repository:
git clone https://github.com/platformstories/hello-gbdx
Then build the image:
cd hello-gbdx
docker build -t hello-gbdx .
Create a container in interactive mode and mount the sample input under /mnt/work/input/:
docker run --rm -v full/path/to/sample-input:/mnt/work/input -it hello-gbdx
Then, within the container:
python /hello-gbdx.py
Log in to Docker Hub:
docker login
Tag your image using your username and push it to Docker Hub:
docker tag hello-gbdx yourusername/hello-gbdx
docker push yourusername/hello-gbdx
The image name should be the same as the image name under containerDescriptors in hello-gbdx.json.
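A quick way to check this is to read the image names out of the task definition before pushing. The sketch below assumes that each entry under containerDescriptors stores its image under `properties.image`; the trimmed-down definition is a hypothetical example, not the contents of the repository's JSON file.

```python
import json


def image_names(definition):
    """Return the image names listed under containerDescriptors.

    Assumption: each descriptor keeps its image under properties.image.
    """
    return [
        descriptor.get("properties", {}).get("image")
        for descriptor in definition.get("containerDescriptors", [])
    ]


# Hypothetical, trimmed-down definition used only for illustration.
example = json.loads("""
{
  "name": "hello-gbdx",
  "containerDescriptors": [
    {"type": "DOCKER", "properties": {"image": "yourusername/hello-gbdx"}}
  ]
}
""")

print(image_names(example))
```

To check a real definition, load the JSON file with `json.load` and compare the result with the name used in `docker tag`.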
Alternatively, you can link this repository to a Docker automated build. Every time you push a change to the repository, the Docker image is updated automatically.
To register the task, run the following in a Python terminal:
from gbdxtools import Interface
gbdx = Interface()
gbdx.task_registry.register(json_filename="hello-gbdx-definition.json")
Note: If you change the task image, you need to reregister the task with a higher version number in order for the new image to take effect. Keep this in mind especially if you use Docker automated build.
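Raising the version number before reregistering can be scripted. The helper below is a minimal sketch that assumes the version is a dot-separated string such as "0.0.1" stored in the task definition; `bump_patch` is a hypothetical name, and where the version lives in the JSON file should be checked against your own definition.

```python
def bump_patch(version):
    """Return `version` with its last dot-separated number increased by one."""
    parts = version.split(".")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)


print(bump_patch("0.0.1"))  # prints 0.0.2
```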