Deploying MATLAB generated CUDA code on AWS

Requirements

  • MATLAB®
  • MATLAB Coder™
  • GPU Coder™
  • Parallel Computing Toolbox™
  • Deep Learning Toolbox™
  • Image Processing Toolbox™
  • An Amazon Web Services™ (AWS) account
  • An SSH Key Pair for your AWS account in the US East (N. Virginia) region. For more information, see Amazon EC2 Key Pairs.

Costs

You are responsible for the cost of the AWS services used when implementing this demo. Resource settings, such as instance type, affect the cost of deployment. The AMI used for this guide, with the recommended instance type, costs $1/hour. For cost estimates, see the pricing pages for each AWS service you will be using. Prices are subject to change.

Introduction

The following guide demonstrates how to generate CUDA code using GPU Coder, then build and run an executable on an EC2 instance.

Prepare your AWS Account

  1. If you don't have an AWS account, create one at https://aws.amazon.com by following the on-screen instructions.
  2. Use the region selector in the navigation bar to choose the US East (N. Virginia) region, where you will deploy the generated code.
  3. Create a key pair in that region. The key pair is necessary as it is the only way to connect to the instance as an administrator.
  4. If necessary, request a service limit increase for the Amazon EC2 instance type or VPCs. You might need to do this if you already have existing deployments that use that instance type or you think you might exceed the default limit with this deployment.

Architecture overview

[Figure: architecture overview diagram]

Steps

  1. Write an entry-point function in MATLAB
  2. Generate a static library using GPU Coder
  3. Set up the AWS environment
  4. Deploy generated code to an S3 bucket
  5. Deploy code in the S3 bucket to one or more EC2 instances or an Auto Scaling group
  6. Build the executable on the EC2 instance
  7. Create a simple web app to interact with the executable

Step 1. Write an entry point function

See the MATLAB script alexnet_predict.m for more information.

Step 2. Generate a static library using GPU Coder

See the MATLAB script test_codegen.m for more information.

Step 3. Set up the AWS environment

AWS component steps
IAM Roles
  • Two IAM roles are needed:
    • One that allows EC2 instances to access services like S3 and CodeDeploy
    • One that allows CodeDeploy to access services like S3 and EC2
  • Search for and open IAM from Services
  • Click Roles -> Create role and name your role
  • Click the Permissions tab -> Attach policies
  • For the EC2 role, add ‘AmazonEC2RoleforAWSCodeDeploy’ and ‘AmazonS3FullAccess’ from the list of policies
  • For the CodeDeploy role, add ‘AWSCodeDeployRole’ and ‘AmazonS3FullAccess’ from the list of policies
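
The console steps above can also be scripted with the AWS CLI. The following is a minimal sketch, assuming the AWS CLI is installed and configured with permission to manage IAM. The function names, the role names in the example calls, and the trust-policy file names are hypothetical. The commands are wrapped in functions so that nothing is created until you call them. When a role is created from the CLI rather than the console, the EC2 role also needs an instance profile before it can be attached to an instance; that step is omitted here.

```shell
# Write a trust policy that lets the given AWS service assume a role.
write_trust_policy() {
  service="$1"   # e.g. ec2.amazonaws.com or codedeploy.amazonaws.com
  file="$2"      # where to write the JSON document
  cat > "$file" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "$service" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Create a role trusted by one service and attach managed policies to it.
# Usage: create_role <role-name> <service-principal> <policy-arn>...
create_role() {
  role="$1"
  principal="$2"
  shift 2
  write_trust_policy "$principal" "trust-$role.json"
  aws iam create-role --role-name "$role" \
    --assume-role-policy-document "file://trust-$role.json" || return 1
  for policy_arn in "$@"; do
    aws iam attach-role-policy --role-name "$role" --policy-arn "$policy_arn" || return 1
  done
}

# Example calls (role names are hypothetical):
# create_role CodeDeployEC2Role ec2.amazonaws.com \
#   arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforAWSCodeDeploy \
#   arn:aws:iam::aws:policy/AmazonS3FullAccess
# create_role CodeDeployServiceRole codedeploy.amazonaws.com \
#   arn:aws:iam::aws:policy/service-role/AWSCodeDeployRole \
#   arn:aws:iam::aws:policy/AmazonS3FullAccess
```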
EC2 Instance(s)
  • Configure your EC2 instance for CodeDeploy to work using the steps described here
  • Create an EC2 instance using the recommended AMI and the p2.xlarge instance type
  • Verify that the CodeDeploy agent is running on the instance. The steps are described here
S3 Bucket
  • Create an S3 bucket from which the generated code files can be deployed
  • For this demo, the S3 bucket was created in the same region as the EC2 instance and the CodeDeploy deployment group. However, a cross-region AWS architecture is also possible
  • Go to the S3 Properties tab and enable Versioning
  • Go to the Permissions tab -> Bucket Policy
  • See the sample bucket policy in the repository
  • Replace the ARNs with your own
  • You can find the AWS account number for your ARN under the Support tab in the AWS console
  • See this AWS documentation page for more information
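
For readers who prefer the command line, the bucket steps above can be sketched with the AWS CLI. This assumes a configured AWS CLI and a local copy of the sample bucket policy with your own ARNs filled in. The function names, the bucket name, and the policy file name are hypothetical.

```shell
# Print the AWS account number used in ARNs (an alternative to the console).
account_id() {
  aws sts get-caller-identity --query Account --output text
}

# Create a versioned bucket in US East (N. Virginia) and apply a bucket policy.
# Usage: create_codegen_bucket <bucket-name> <policy-file>
create_codegen_bucket() {
  bucket="$1"        # bucket names must be globally unique
  policy_file="$2"   # sample bucket policy edited with your ARNs
  aws s3 mb "s3://$bucket" --region us-east-1 || return 1
  aws s3api put-bucket-versioning --bucket "$bucket" \
    --versioning-configuration Status=Enabled || return 1
  aws s3api put-bucket-policy --bucket "$bucket" \
    --policy "file://$policy_file"
}

# Example call (names are hypothetical):
# create_codegen_bucket my-codegen-bucket bucket-policy.json
```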
CodeDeploy

You can also use scp to upload the generated code files. However, AWS CodeDeploy is a free service that lets you deploy applications from your development machine to one or more EC2 instances at once.

  • Go to the AWS CodeDeploy console
  • Click on ‘Create application’
  • Name your application and select EC2 from the compute platform options
  • Click on ‘Create deployment group’ and name it
  • Select the service/IAM role created previously from the service role drop-down
  • Choose the deployment type based on your requirements
  • Select EC2 instances or Auto Scaling groups in the environment configuration based on your requirements. This demo uses an EC2 instance
  • If you are using just one EC2 instance, uncheck load balancing and click Create deployment group
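
The same application and deployment group can be created from the AWS CLI. The sketch below assumes the instance is identified by a Name tag, which is an assumption and not something this guide specifies. The function name and all argument values in the example call are hypothetical placeholders.

```shell
# Create a CodeDeploy application for EC2 and a deployment group that
# targets instances by their Name tag.
# Usage: create_codedeploy_app <app> <group> <service-role-arn> <instance-name-tag>
create_codedeploy_app() {
  app="$1"
  group="$2"
  service_role_arn="$3"   # ARN of the CodeDeploy role created earlier
  instance_name="$4"      # value of the instance's Name tag (assumed tagging scheme)
  aws deploy create-application --application-name "$app" \
    --compute-platform Server || return 1
  aws deploy create-deployment-group \
    --application-name "$app" \
    --deployment-group-name "$group" \
    --service-role-arn "$service_role_arn" \
    --ec2-tag-filters "Key=Name,Value=$instance_name,Type=KEY_AND_VALUE"
}

# Example call (all values are hypothetical):
# create_codedeploy_app CodegenApp CodegenGroup \
#   arn:aws:iam::<account number>:role/CodeDeployServiceRole gpu-instance
```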

Step 4. Deploy generated code to an S3 bucket

  • You need to do this from the command line using the AWS CLI
  • Follow the steps here to set up the AWS CLI on your development machine
  • Place your codegen directory and the appspec.yml file (described in Step 5) in one directory, and navigate to that directory from the command line
  • The directory structure would look like:
    • temp
      • codegen
      • appspec.yml
  • Execute the following command, replacing codegens3 with the name of your S3 bucket: aws deploy push --application-name <name of your CodeDeploy App> --s3-location s3://codegens3/codegen.zip --ignore-hidden-files
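
A small wrapper can catch a wrong directory layout before anything is uploaded. The sketch below checks for the codegen directory and appspec.yml, then runs the same push command with an explicit --source so it does not depend on the current directory. The function name and the example arguments are hypothetical.

```shell
# Verify the bundle layout, then zip and upload it as a CodeDeploy revision.
# Usage: push_revision <bundle-dir> <codedeploy-app> <bucket>
push_revision() {
  bundle_dir="$1"   # directory holding codegen/ and appspec.yml (e.g. temp)
  app="$2"
  bucket="$3"
  if [ ! -d "$bundle_dir/codegen" ]; then
    echo "missing directory: $bundle_dir/codegen" >&2
    return 1
  fi
  if [ ! -f "$bundle_dir/appspec.yml" ]; then
    echo "missing file: $bundle_dir/appspec.yml" >&2
    return 1
  fi
  aws deploy push \
    --application-name "$app" \
    --s3-location "s3://$bucket/codegen.zip" \
    --source "$bundle_dir" \
    --ignore-hidden-files
}

# Example call (names are hypothetical):
# push_revision temp CodegenApp my-codegen-bucket
```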

Step 5. Deploy code in the S3 bucket to one or more EC2 instances or an Auto Scaling group

  • Create a YAML file named ‘appspec.yml’ in the same directory that contains the codegen directory
  • This file tells AWS CodeDeploy the source and destination of your files
  • An example appspec.yml file is shown here
  • The deployment will fail if:
    • The YAML file has incorrect syntax
    • You don’t have path permissions at the destination
  • In the CodeDeploy console, click on ‘Create deployment’
  • Select S3 under Revision type
  • The revision location would be: s3://your-bucket-name/codegen.zip
  • Complete by clicking on ‘Create deployment’
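
To make the appspec.yml step concrete, here is a minimal sketch that writes a basic AppSpec file for an EC2 deployment and starts a deployment from the CLI instead of the console. The destination path /home/ubuntu/codegen is an assumption, not the path used in the repository's example file, and the function names and example arguments are hypothetical.

```shell
# Write a minimal appspec.yml that copies codegen/ to a destination
# directory on the instance.
# Usage: write_appspec <destination-dir-on-instance>
write_appspec() {
  dest="$1"   # assumed destination; the deploy user must be able to write here
  cat > appspec.yml <<EOF
version: 0.0
os: linux
files:
  - source: /codegen
    destination: $dest
EOF
}

# Start a deployment of the zipped revision stored in S3.
# Usage: create_deployment <codedeploy-app> <deployment-group> <bucket>
create_deployment() {
  app="$1"
  group="$2"
  bucket="$3"
  aws deploy create-deployment \
    --application-name "$app" \
    --deployment-group-name "$group" \
    --s3-location "bucket=$bucket,key=codegen.zip,bundleType=zip"
}

# Example calls (all values are hypothetical):
# write_appspec /home/ubuntu/codegen
# create_deployment CodegenApp CodegenGroup my-codegen-bucket
```

YAML is indentation-sensitive, so the two spaces before "- source" and the four before "destination" matter; a misaligned file is one of the syntax errors that makes the deployment fail.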

Step 6. Build the executable on the EC2 instance

  • SSH into the EC2 instance using the instructions here
  • Navigate to the directory containing all the code generated files
  • Find and edit the makefile (.mk extension) to update the paths to the source files
  • Execute the following commands to create the static library using the updated paths:
    make -f <path to makefile> clean followed by make -f <path to makefile>
  • Execute the following command from the command line: nvcc -arch sm_35 -o classifier <relative path to main.cu> <relative path to inputFile.a> -I<relative path to codegen directory> -L"./<relative path to codegen directory>" -lmwjpegreader -lcudart -lcudnn -lcublas
  • -lmwjpegreader is needed if you use the MATLAB imread function in your MATLAB source code. See this documentation page for more information on how to use image processing functions in code generation. Alternatively, you can use the OpenCV library to read images.
  • See this documentation page for more information on nvcc.
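
The build steps above can be collected into one function so the paths are written once. This is a sketch that reuses the make and nvcc invocations from this step unchanged; the function name and its argument order are hypothetical, and the paths you pass depend on where your generated files landed.

```shell
# Rebuild the static library, then link it with main.cu into ./classifier.
# Usage: build_classifier <codegen-dir> <makefile> <main.cu> <static-lib.a>
build_classifier() {
  codegen_dir="$1"   # relative path to the codegen directory
  makefile="$2"      # generated makefile (.mk) with updated source paths
  main_cu="$3"       # relative path to main.cu
  static_lib="$4"    # relative path to the static library (.a)
  make -f "$makefile" clean || return 1
  make -f "$makefile" || return 1
  nvcc -arch sm_35 -o classifier "$main_cu" "$static_lib" \
    -I"$codegen_dir" -L"./$codegen_dir" \
    -lmwjpegreader -lcudart -lcudnn -lcublas
}

# Example call, with the same placeholders as the steps above:
# build_classifier <relative path to codegen directory> <path to makefile> \
#   <relative path to main.cu> <relative path to inputFile.a>
```

Stopping at the first failed command keeps a stale static library from being linked into the executable.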

Step 7. Create a simple web app to interact with the executable

  • Install the Apache web server on the EC2 instance: sudo apt-get install apache2
  • The web app uses PHP. To install PHP: sudo apt-get install php libapache2-mod-php
  • Change the server root to /var/www/ by editing the file /etc/apache2/sites-enabled/000-default.conf
  • Grant group read/write permissions on the server root folder: sudo chmod -R g+rw /var/www
  • Restart the server: sudo service apache2 restart
  • Create a dummy PHP file in the server folder and check that the server is running by going to <EC2 public DNS>/filename.php
  • Copy the files in var/www from the repository into your server root
  • The single-page PHP web app lets the user upload a JPEG file
  • Clicking the predict button calls the CUDA executable
  • The web app parses and displays the classification result
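
The server setup commands above can be run in one pass. The sketch below assumes an Ubuntu-based instance whose default site file contains the line "DocumentRoot /var/www/html"; if your file differs, edit it by hand instead of using the sed line. The function name is hypothetical.

```shell
# Install Apache and PHP, point the server root at /var/www, and restart.
setup_web_server() {
  sudo apt-get update &&
  sudo apt-get install -y apache2 php libapache2-mod-php &&
  # Assumes the default DocumentRoot line; adjust if your site file differs.
  sudo sed -i 's|DocumentRoot /var/www/html|DocumentRoot /var/www|' \
    /etc/apache2/sites-enabled/000-default.conf &&
  sudo chmod -R g+rw /var/www &&
  sudo service apache2 restart
}

# Example call, run on the EC2 instance:
# setup_web_server
```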