This repository contains a set of scripts that create the minimal CDP assets needed for a demo from a single wrapper script, including:
- Cloud pre-requisites (bucket, policies, roles, network)
- Cloud CDP Environment
- CDP Data Lake
- Any CDP Data Hub cluster definition
- AWS CLI (Instructions)
  - You must run aws configure after install, and ensure your region is set
- AWS ssh key (Instructions); alternatively, you can use the field or_field keys set up in our AWS SE accounts
- Azure CLI (Instructions)
  - Use az login after install to log in
- ssh Key: you will need to paste the public key into your parameters file
Note: Azure CML is not supported yet, so don't add it in your parameters file :)
- CDP CLI (Instructions)
- CDP Credential (Instructions)
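The prerequisites above can be checked before running anything. The following is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper name, and the only external call used is `aws configure get region`, which prints the region saved by `aws configure`.

```shell
# Pre-flight sketch: report which of the given CLIs are not on PATH.
# Prints one missing command name per line; prints nothing if all are found.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Example for an AWS deployment; swap "aws" for "az" when deploying to Azure.
missing="$(missing_tools aws cdp)"
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing CLIs:" $missing >&2
else
  # Show the default region, or warn if none has been configured yet.
  aws configure get region || echo "No default AWS region set; run 'aws configure'" >&2
fi
```

The check only confirms that the commands exist and that a default region is stored; it does not verify credentials or the CDP credential itself.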
{
  Required parameters:
  "required": {
    Prefix used for cdp assets creation:
    "prefix": "pvi",
    Name of credential to use:
    "credential": "pvidal-aws-se-credential",
    Region to use (should also be the default region of your cloud provider cli profile):
    "region": "us-east-1",
    ssh key to use for cdp instances setup:
    "key": "field",
    Workload password to use in CDP:
    "workload_pwd": "cdpw0rksh0p",
    Array of datahub to setup (can be empty):
    "datahub_list": [
      Element 1:
      {
        Definition from cdp-cluster-definitions folder:
        "definition": "data-mart.json",
        Custom script from cdp-dh-custom-scripts folder:
        "custom_script": ""
      },
      Element 2:
      {
        Definition from cdp-cluster-definitions folder:
        "definition": "cdp-mod-workshop.json",
        Custom script from cdp-dh-custom-scripts folder:
        "custom_script": "cdp_mod_wkp.sh"
      }
    ],
    Array of ml workspaces to setup (can be empty):
    "ml_workspace_list": [
      Element 1:
      {
        Definition from cml-workspace-definitions folder:
        "definition": "small_workspace.json",
        Flag to enable monitoring, governance and model metrics (possible values: yes, no):
        "enable_workspace": "no"
      }
    ],
    Array of op database to setup (can be empty):
    "op_db_list": [
      Element 1:
      {
        Name of the database you want to create:
        "database_name": "your_db_name"
      }
    ],
    Array of CDW vw to setup (can be empty):
    "dw_list": [
      Element 1:
      {
        Name of the vw you want to create:
        "name": "vw-name",
        Type of vw you want to create:
        "type": "hive"
      }
    ]
  },
  Optional (defaulted) parameters (can be empty):
  "optional": {
    Cloud provider (default: aws, possible values: aws, az):
    "cloud_provider": "aws",
    Cloud provider cli profile (AWS: your profile name / AZ: your subscription name or ID) (default: default):
    "cloud_profile": "default",
    CDP cli profile (default: default):
    "cdp_profile": "default",
    Flag to create cdp credential or not (default: no, possible values: yes, no):
    "generate_credential": "no",
    NOT SUPPORTED YET. Flag to generate minimal cross account role policy or not (default: no, possible values: yes, no):
    "generate_minimal_cross_account": "no",
    Flag to create network in cloud provider or not (default: no, possible values: yes, no):
    "create_network": "no",
    CIDR to open in the security group of your network (ports 443, 22 and 9443 will be open to this):
    "sg_cidr": "0.0.0.0/0",
    Use private IPs for env deployment (default: no, possible values: yes, no). NB: for AWS, if this is set to "yes" and "create_network" is set to "no", you must currently use the DEV CDP CLI:
    "use_priv_ips": "no",
    Use existing network for env deployment (path to the network file, see examples in parameters_sample):
    "existing_network_file": "[path_to_network_file]",
    The Data Lake scale you'd like to have (default: LIGHT_DUTY, possible values: LIGHT_DUTY, MEDIUM_DUTY_HA):
    "scale": "[LIGHT_DUTY]",
    If creating an environment with private IPs, create a bastion in one of the public subnets that you can proxy through to access all the UIs (default: no, possible values: no, yes):
    "create_bastion": "yes",
    Enable workload analytics (i.e. WXM) (default: --no-enable-workload-analytics, possible values: --enable-workload-analytics, --no-enable-workload-analytics):
    "workload_analytic": "--enable-workload-analytics",
    Array of custom tags to setup (if empty, the scripts will generate project, owner, end_date and deploytool tags):
    "tags": [
      { "key": "my_tag", "value": "my_value" },
      { "key": "my_other_tag", "value": "my_other_value" }
    ]
  }
}
See the parameters_sample folder for examples.
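As an illustration of the parameter layout, the sketch below writes the smallest file the description suggests: every required key present, every array left empty, and an empty optional block. All values are hypothetical placeholders, the output path is an arbitrary choice, and whether the scripts accept all arrays empty at once has not been verified here. The final check assumes jq is installed and is skipped otherwise.

```shell
# Write a minimal parameter file: all required keys, empty arrays, no optional overrides.
# Every value below is a placeholder (hypothetical), not a real credential or key name.
param_file="${TMPDIR:-/tmp}/minimal_params.json"
cat > "$param_file" <<'EOF'
{
  "required": {
    "prefix": "demo",
    "credential": "my-cdp-credential",
    "region": "us-east-1",
    "key": "my-ssh-key",
    "workload_pwd": "change-me",
    "datahub_list": [],
    "ml_workspace_list": [],
    "op_db_list": [],
    "dw_list": []
  },
  "optional": {}
}
EOF

# Optional sanity check: if jq happens to be installed, confirm the file parses as JSON.
if command -v jq >/dev/null 2>&1; then
  jq -e '.required.prefix' "$param_file" >/dev/null && echo "parameter file parses"
fi
```

Unlike the annotated listing, a real parameter file must be plain JSON with no descriptive lines between the keys.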
Run the wrapper script:
cdp_create_all_the_things.sh <your_param_file>
Run the deletion script:
cdp_delete_all_the_things.sh <your_param_file>
cdp_aws_pre_reqs.sh <your_param_file>
cdp_aws_sdx.sh <your_param_file> [<network_file>]
cdp_az_pre_reqs.sh <your_param_file>
cdp_az_sdx.sh <your_param_file>
cdp_create_datahub_things.sh <your_param_file>
cdp_create_ml_things.sh <your_param_file>
cdp_create_opdb_things.sh <your_param_file>
cdp_create_dw_things.sh <your_param_file>
cdp_stop_all_the_things.sh <your_param_file>
cdp_start_all_the_things.sh <your_param_file>
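The pre-requisite and SDX scripts come in one variant per cloud provider. The sketch below maps the `cloud_provider` parameter to the matching script names listed above; `provider_scripts` is a hypothetical helper, not something the repository ships, and it only returns names without running anything.

```shell
# Map the "cloud_provider" parameter to the provider-specific script names.
# An empty or missing value falls back to aws, the documented default.
provider_scripts() {
  case "${1:-aws}" in
    aws) echo "cdp_aws_pre_reqs.sh cdp_aws_sdx.sh" ;;
    az)  echo "cdp_az_pre_reqs.sh cdp_az_sdx.sh" ;;
    *)   echo "unsupported cloud_provider: $1" >&2; return 1 ;;
  esac
}

# Prints: cdp_az_pre_reqs.sh cdp_az_sdx.sh
provider_scripts az
```

Any value other than aws or az returns a non-zero status, so a caller can stop before invoking a script that does not exist.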
Note: some flags require the dev CLI, which is not for public consumption; use at your own risk.
--no-cost-check: removes the cost check
--no-db-ha: does not create the DB HA backend
--no-sync-users: does not launch sync users to free-ipa
- Add support for Azure ML
- Add support for minimal set of policies for AWS
- Add dynamic definition updates
- Create a nifi flow wrapper?
Paul Vidal - LinkedIn
Dan Chaffelson - LinkedIn
Chris Perro - LinkedIn
André Araújo - LinkedIn
Nathan Anthony - LinkedIn
Steffen Maerkl - LinkedIn
Mike Riggs - LinkedIn
Ryan Cicak - LinkedIn
Alex Moundalexis - LinkedIn