Welcome to the AWS Summit San Francisco 2022 Inferentia Workshop! During this workshop, we will create two endpoints, each serving one HuggingFace model, and use them for paraphrase detection, an NLP classification task. The two endpoints will have the following configurations: a) a CPU-based endpoint, where we deploy the model with no changes; and b) an Inf1 instance-based endpoint, where we prepare and compile the model using SageMaker Neo before deploying. Finally, we will compare the latency and throughput of both endpoints.
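The notebook you will run later implements the latency and throughput comparison itself; as a rough sketch of the kind of measurement involved, the snippet below times repeated calls to a `predict_fn`. Here `fake_predict` is a hypothetical stand-in for a real SageMaker `Predictor.predict` call against the CPU or Inf1 endpoint, and the percentile/throughput math is a minimal illustration, not the notebook's exact code.

```python
import time
import statistics

def measure(predict_fn, payload, n=100):
    """Time n sequential calls to predict_fn and summarize the latencies."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        predict_fn(payload)
        latencies.append(time.perf_counter() - start)
    return {
        "p50_s": statistics.median(latencies),
        "p90_s": statistics.quantiles(latencies, n=10)[8],  # 90th percentile cut point
        "throughput_rps": n / sum(latencies),
    }

# Hypothetical stand-in for Predictor.predict on a deployed endpoint.
def fake_predict(payload):
    time.sleep(0.001)  # simulate network + model time
    return {"label": "paraphrase"}

stats = measure(fake_predict, {"inputs": ["sentence a", "sentence b"]}, n=20)
print(stats)
```

Running the same harness against both endpoints (with the real `predict` in place of `fake_predict`) gives directly comparable numbers.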
- Access Event Engine: open a browser and go to the URL shared during the workshop.
- Click on "Agree" (make sure the event passcode is displayed).
- Click on “Email One-Time Password (OTP)”
- Complete with your email and click “Send Passcode”
- Retrieve the OTP passcode from your email, copy and paste it into the "Passcode 9 digit code" field, and press "Sign In".
- Once logged in, you will see the "Team Dashboard". Click on "AWS Console".
- Then click on “Open AWS Console"
- Inside the console, type "SageMaker" in the search box, then click on the "Amazon SageMaker" service. If you prefer, you can navigate directly to the SageMaker console.
- Once you see the SageMaker dashboard, click on "Studio" in the "SageMaker Domain" menu on the left.
- Click on the "Launch App" button, then on "Studio".
- Once inside SageMaker Studio, go to "File > New > Terminal" to open a terminal.
- Type the following command to clone the repo:

      git clone https://github.com/aws-samples/aws-inferentia-huggingface-workshop.git
- Once the repo is cloned, open the Jupyter notebook named aws_summit_2022_inf1_bert_compile_and_deploy.ipynb. To do this, find the file browser on the left, click on "aws-inferentia-huggingface-workshop", then double-click on the file name aws_summit_2022_inf1_bert_compile_and_deploy.ipynb.
- You will see a kernel-selection pop-up. Make sure you select the Python 3 (PyTorch 1.8 Python 3.6 CPU Optimized) kernel when prompted.
This code is released under the MIT-0 license. Please refer to the license for applicable terms.