This repository contains the instructions to create the ARTIS High Performance Computer (HPC) on Amazon Web Services (AWS) and run the ARTIS model. There are two scenarios for using this repository:
- Setting up a new ARTIS HPC on AWS
- Running an Existing ARTIS HPC Setup
All commands will be run in the terminal/command line and are indicated with a "$" before the command or contained within a code block.
- An AWS root user has been created (to create one, see the AWS documentation)
- The AWS root user has created an admin user group with "AdministratorAccess" permissions
- The AWS root user has created IAM users
- The AWS root user has added the IAM users to the admin group
- The AWS IAM users have their `AWS_ACCESS_KEY` and `AWS_SECRET_ACCESS_KEY`. To create an AWS IAM user, follow the instructions here: Create AWS IAM user

Note: If you have created ANY AWS RESOURCES for ARTIS manually, not including root and IAM users, please delete these before continuing.
- Technologies Used
- Update ARTIS Model Scripts and Model Inputs
- Installations
- Assumptions
- AWS CLI Setup
- Clear Existing AWS Resources
- Python Installation
- Setting Up a New ARTIS HPC on AWS
- Running an Existing ARTIS HPC Setup
- Combine All ARTIS Model Outputs into Database Ready CSVs
- Download Results, Clean Up AWS and Docker Environments
- Create AWS IAM user
Terraform
- Creates all the AWS infrastructure needed for the ARTIS HPC.
- Destroys all AWS infrastructure for the ARTIS HPC after the ARTIS model has finished to save on unnecessary costs.
Docker
- Creates a docker image that our HPC jobs will use to run the ARTIS model code.
Python
- Uses the Docker and AWS Python (boto3) clients to:
- Push all model input data to AWS S3
- Build the docker image needed for the AWS Batch jobs to run the ARTIS model
- Push docker image to AWS ECR
- Submit jobs to ARTIS HPC
R
- Pulls all model output data
- Copy the `00-aws-hpc-setup.R` script to `artis-hpc/data_s3_upload/ARTIS_model_code/`
- Copy the `02-artis-pipeline.R` script to `artis-hpc/data_s3_upload/ARTIS_model_code/`
- Copy the `03-combine-tables.R` script to `artis-hpc/data_s3_upload/ARTIS_model_code/`
- Run $ `export HS_VERSIONS="[HS VERSIONS YOU ARE RUNNING, NO SPACES]"` to specify which HS versions to run, i.e. $ `export HS_VERSIONS="02,07,12,17,96"` or $ `export HS_VERSIONS="17"`
- Run $ `./create_pipeline_versions.sh` to create a new version of `02-artis-pipeline.R` and `00-aws-hpc-setup.R` in `artis-hpc/data_s3_upload/ARTIS_model_code/` for every HS version specified in `HS_VERSIONS`
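As a rough sketch of what `create_pipeline_versions.sh` likely does with `HS_VERSIONS` (this is an assumption about its internals, not the actual script): it splits the comma-separated list and stamps out one pipeline script name per HS version.

```shell
# Hypothetical sketch of the per-version fan-out in create_pipeline_versions.sh:
# split the comma-separated HS_VERSIONS list and emit one pipeline name per version.
export HS_VERSIONS="02,07,12,17,96"
for v in $(echo "$HS_VERSIONS" | tr ',' ' '); do
  echo "02-artis-pipeline_${v}.R"
done
```

With the example list above, this prints one filename per version, `02-artis-pipeline_02.R` through `02-artis-pipeline_96.R`.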
- Copy the most up-to-date set of `model_inputs` to the `artis-hpc/data_s3_upload/` directory. Retain the folder name `model_inputs`
- Copy the most up-to-date ARTIS `R/` package folder to `artis-hpc/data_s3_upload/ARTIS_model_code/`
- Copy the most up-to-date ARTIS R package `NAMESPACE` file to `artis-hpc/data_s3_upload/ARTIS_model_code/`
- Copy the most up-to-date ARTIS R package `DESCRIPTION` file to `artis-hpc/data_s3_upload/ARTIS_model_code/`
- Copy the most up-to-date `.Renviron` file to `artis-hpc/data_s3_upload/ARTIS_model_code/`
If running on a new Apple arm64 chip:
- Copy the `arm64_venv_requirements.txt` file from the root directory to `artis-hpc/docker_image_files_original/`
- Rename the file `artis-hpc/docker_image_files_original/arm64_venv_requirements.txt` to `artis-hpc/docker_image_files_original/requirements.txt`
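The copy-and-rename step above can be done in two commands; a self-contained sketch (it uses a scratch directory under `/tmp` and a stand-in requirements file so it runs anywhere — substitute the real repo paths when you run it):

```shell
# Stand-ins for the real repo layout so this sketch is runnable anywhere.
mkdir -p /tmp/artis-hpc/docker_image_files_original
echo "boto3" > /tmp/arm64_venv_requirements.txt   # stand-in for the real file

# Copy the arm64 requirements file in, then rename it to requirements.txt.
cp /tmp/arm64_venv_requirements.txt /tmp/artis-hpc/docker_image_files_original/
mv /tmp/artis-hpc/docker_image_files_original/arm64_venv_requirements.txt \
   /tmp/artis-hpc/docker_image_files_original/requirements.txt
cat /tmp/artis-hpc/docker_image_files_original/requirements.txt
```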
- Homebrew
- AWS CLI
- Terraform CLI
- Python
- Python packages:
  - docker
  - boto3
Note: If you already have Homebrew installed please still confirm by following step 3 below. Both instructions should run without an error message.
- Install homebrew - run $ `/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"`
- Close the existing terminal window where the installation command was run and open a new terminal window
- Confirm homebrew has been installed - run $ `brew --version`. No error message should appear.
If after homebrew installation you get a message stating `brew: command not found`:
- Edit the zsh config file, run $ `vim ~/.zshrc`
- Type `i` to enter edit mode
- Copy & paste this line into the file you opened: `export PATH=/opt/homebrew/bin:$PATH`
- Press `Esc` to leave edit mode, then press `Shift` and `:`
- Type `wq`
- Press `Enter`
- Source the new config file, run $ `source ~/.zshrc`
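To confirm the PATH edit took effect, one quick check (assuming the standard Apple-silicon Homebrew prefix `/opt/homebrew/bin`, as in the line above):

```shell
# Prepend the Homebrew bin directory (as the ~/.zshrc line does), then confirm
# it appears as its own entry in PATH.
export PATH=/opt/homebrew/bin:$PATH
echo "$PATH" | tr ':' '\n' | grep -x '/opt/homebrew/bin'
```

If the directory prints back, `brew` should now resolve in new shells.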
Following instructions from AWS
Note: If you already have AWS CLI installed please still confirm by following step 3 below. Both instructions should run without an error message.
The following instructions are for macOS users:
- Run $ `curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"`
- Run $ `sudo installer -pkg AWSCLIV2.pkg -target /`
- Confirm AWS CLI has been installed:
  - Run $ `which aws`
  - Run $ `aws --version`
Note: If you already have homebrew installed please confirm by running $ `brew --version`; no error message should occur.
To install terraform on MacOS we will be using homebrew. If you do not have homebrew installed on your computer please follow the installation instructions here, before continuing.
Based on Terraform CLI installation instructions provided here.
- Run $ `brew tap hashicorp/tap`
- Run $ `brew install hashicorp/tap/terraform`
- Run $ `brew update`
- Run $ `brew upgrade hashicorp/tap/terraform`
If this has been unsuccessful you might need to install xcode command line tools, try:
- Run $ `sudo xcode-select --install`
- Run $ `export AWS_ACCESS_KEY=[YOUR_AWS_ACCESS_KEY]` - sets a terminal environment variable. Replace `[YOUR_AWS_ACCESS_KEY]` with your value
- Run $ `export AWS_SECRET_ACCESS_KEY=[YOUR_AWS_SECRET_ACCESS_KEY]` - sets a terminal environment variable. Replace `[YOUR_AWS_SECRET_ACCESS_KEY]` with your value
- Run $ `export AWS_REGION=us-east-1` - sets a terminal environment variable
- Run $ `aws configure set aws_access_key_id $AWS_ACCESS_KEY` - writes the value to the AWS credentials file (`~/.aws/credentials`)
- Run $ `aws configure set aws_secret_access_key $AWS_SECRET_ACCESS_KEY` - writes the value to the AWS credentials file (`~/.aws/credentials`)
- Run $ `aws configure set region $AWS_REGION` - writes the value to the AWS config file (`~/.aws/config`)
To check set values:
- Run $ `echo $AWS_ACCESS_KEY` to display the local environment variable value set with the `export` command.
- Likewise, run $ `aws configure get aws_access_key_id` to print the value stored in the AWS credentials file.
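For reference, the `aws configure set` commands above produce INI-style files shaped roughly like this (placeholder values shown, not real keys):

```shell
# Print the expected shape of the files that `aws configure set` writes.
cat <<'EOF'
# ~/.aws/credentials
[default]
aws_access_key_id = YOUR_AWS_ACCESS_KEY
aws_secret_access_key = YOUR_AWS_SECRET_ACCESS_KEY

# ~/.aws/config
[default]
region = us-east-1
EOF
```

If `aws configure get` returns nothing, inspect these two files directly to see which section or key is missing.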
Log onto AWS to check if there are any model outputs that need to be retained.
In order to run `initial_setup.py` we need to create a virtual environment to run the script in. Note: Please make sure that your terminal is currently in your working directory, which should end in `artis-hpc`, by running the terminal command $ `pwd`.
- Run $ `python3 -m venv venv` to create a virtual environment
- Run $ `source venv/bin/activate` to open the virtual environment
- Run $ `pip3 install -r requirements.txt` to install all required python modules
- Run $ `pip3 list` to check that all python modules have been downloaded. Check that all modules in the `requirements.txt` file are included.
If an error occurs please follow these instructions:
- Upgrade your version of pip, run $ `pip install --upgrade pip`
- Install all required python modules, run $ `pip3 install -r requirements.txt`
- If errors still occur, install each python package in the `requirements.txt` file individually, run $ `pip3 install [PACKAGE NAME]`, i.e. $ `pip3 install urllib3`.
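The package-by-package fallback can be scripted rather than typed by hand. A dry-run sketch (it writes a stand-in requirements file so the loop is self-contained, and echoes the `pip3` commands instead of executing them — drop the `echo` to install for real):

```shell
# Stand-in requirements file with a few packages from this repo's setup.
printf 'docker\nboto3\nurllib3\n' > /tmp/requirements_example.txt

# Echo one install command per package; a single failing package then
# cannot block the rest of the list.
while read -r pkg; do
  echo "pip3 install $pkg"   # dry run: echoed, not executed
done < /tmp/requirements_example.txt
```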
The `initial_setup.py` script will create all necessary AWS infrastructure with terraform, upload all model inputs to an AWS S3 bucket `artis-s3-bucket`, create and upload a docker image `artis-image` built by default from the files in the `docker_image_files_original/` directory, and submit jobs to AWS Batch. The files in `docker_image_files_original/` allow the docker image to import all R scripts and model inputs from `artis-s3-bucket/ARTIS_model_code/`. Anytime there are edits or changes to the ARTIS model codebase there is no need to recreate the docker image; skip to Running an Existing ARTIS HPC Setup.
- Open Docker Desktop
- Take note of any existing docker images and containers relating to other projects, then:
  - Delete all docker containers relating to ARTIS
  - Delete all docker images relating to ARTIS
- Create AWS infrastructure, upload model inputs, and create a new ARTIS docker image, run:
python3 initial_setup.py -chip [YOUR CHIP INFRASTRUCTURE] -aws_access_key [YOUR AWS KEY] -aws_secret_key [YOUR AWS SECRET KEY] -s3 artis-s3-bucket -ecr artis-image
- Details:
  - If you are using an Apple Silicon chip (M1, M2, M3, etc.) your chip will be `arm64`; otherwise, for Intel chips it will be `x86`
  - If you have an existing docker image you would like to use, include the `-di [existing docker image name]` flag with the command shown below.
    - Recommendation: the default options will create a docker image called `artis-image`, so if you want to use the previously created default docker image you would include `-di artis-image`.
    - Note: The AWS docker image repository and the docker image created with default options both have the name `artis-image`; however, they are two different resources.
python3 initial_setup.py -chip [YOUR CHIP INFRASTRUCTURE] -aws_access_key [YOUR AWS KEY] -aws_secret_key [YOUR AWS SECRET KEY] -s3 artis-s3-bucket -ecr artis-image -di artis-image:latest
Example command (using credentials stored in local environmental variables set above and creating the docker image from scratch):
python3 initial_setup.py -chip arm64 -aws_access_key $AWS_ACCESS_KEY -aws_secret_key $AWS_SECRET_ACCESS_KEY -s3 artis-s3-bucket -ecr artis-image
Note: If terraform states that it created all resources but you cannot see them when you log into the AWS console to confirm, they have most likely been created under another account. Run $ `terraform destroy -auto-approve` on the command line, then confirm you have followed the AWS CLI setup instructions with the correct set of keys (AWS access key and AWS secret access key).
Note: All AWS infrastructure has already been created and there are only edits to the model input files or ARTIS model code.
- Make sure to put all new R scripts or model inputs in the relevant `data_s3_upload` directory
- Run $ `python3 s3_upload.py` to upload local model code and inputs to the AWS S3 bucket `artis-s3-bucket`
- Run $ `python3 submit_artis_jobs.py` to submit batch jobs on AWS. This loops through the designated HS versions and runs the corresponding shell scripts to source `docker_image_artis_pkg_download.R` and `02-artis-pipeline_[hs version].R`
Check status of jobs submitted to AWS batch
- Navigate to AWS in your browser and log in to your IAM account.
- Use the search bar at the top of the page to search for "batch" and click on the Batch service result.
- Under "job queue overview" you will be able to see job statuses.
Troubleshoot "failed" jobs
- Click on the number below the "failed" column of the job queue
- Identify and open the relevant failed job. Inspect "Job attempts" for the "status reason" value.
- Search for "cloudwatch" in the search bar and click on the CloudWatch service result
- In the left-hand nav bar click on "Logs", then "Log groups", and then "/aws/batch/job"
- Inspect the "log stream" for timestamps and messages from running the model code.
- Run $ `python3 submit_combine_tables_job.py`
- Run $ `python3 s3_download.py` to download the "outputs" folder from AWS
- Run $ `terraform destroy` to destroy all AWS resources and dependencies created
- Open the Docker Desktop app
- Delete all containers created
- Delete all images created
- Run $ `deactivate` to close the python environment
FIXIT: include screenshots for creating an IAM user with the correct admin permissions.
Note: the directory layout below is a placeholder and needs to be corrected.

```
/home/ec2-user/artis/
├── model_inputs/
│   ├── code_max_resolved.csv
│   └── (Other input files)
├── ARTIS_model_code/
│   ├── 02-artis-pipeline.R
│   ├── 00-aws-hpc-setup.R
│   ├── R/
│   │   └── (Various R scripts)
│   └── (Other model code files)
└── output/ (Optional)
    └── (Output files and logs)
```