awslabs/mountpoint-s3

Auto-configure Network Interface Card (NIC) throughput

jamesbornholt opened this issue · 5 comments

Remove the need for customers to manually configure Mountpoint for Amazon S3 to achieve high throughput on the instance type they are using.

A few of us have now misconfigured the connector by not specifying a target throughput on EC2 instances with large NICs. This seems like something we should be able to autodiscover and set correctly by default.

The CRT S3 client has the beginnings of this interface (awslabs/aws-c-s3#70), but as far as I can tell it needs to be manually invoked and only carries data for c5n.18xlarge right now.
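To make the static-table idea concrete, here is a minimal sketch of a per-instance-type lookup. It is not the CRT's actual data or interface: the function name `lookup_gbps` is hypothetical, and the entries are only the values that appear in this thread.

```shell
# Hypothetical lookup table mapping an instance type to a target throughput in Gbps.
# The entries are taken from values quoted in this thread, not from the CRT.
lookup_gbps() {
    case "$1" in
        c5n.18xlarge|m6idn.16xlarge) echo 100 ;;
        c5.18xlarge|c5.24xlarge|c5.metal) echo 25 ;;
        c5.12xlarge) echo 12 ;;
        c5.9xlarge) echo 10 ;;
        *) return 1 ;;  # unknown instance type: the caller decides what to fall back to
    esac
}
```

A table like this has to be updated for every new instance type, which is the main argument for discovering the value at runtime instead.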

I wrote a script to auto-detect this based on EC2 instance metadata and the describe-instance-types API. Would love to see this integrated somehow:

#!/bin/bash
# Arguments: $1 = mount directory, $2 = bucket name

# Look up the instance type and region from the EC2 instance metadata service
instance_type=$(curl -s http://169.254.169.254/latest/meta-data/instance-type)
region=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | grep region | awk -F\" '{print $4}')
# Ask the EC2 API for the network performance string and keep only the first number in it
network=$(aws ec2 --region "${region}" describe-instance-types --instance-types "${instance_type}" --query "InstanceTypes[].[NetworkInfo.NetworkPerformance]" --output text | grep -o '[0-9]\+' | head -n 1)

# Mount the S3 bucket with the detected throughput target
mkdir -p "${1}"
mount-s3 --throughput-target-gbps "${network}" "${2}" "${1}"
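The metadata lookups in the script use plain GET requests, which fail on instances that require IMDSv2 session tokens. A possible variant is sketched below, assuming `curl` is available; the helper name `imds_get` and the 60-second token TTL are illustrative choices, not part of the original script.

```shell
# Hypothetical helper: read one metadata path using an IMDSv2 session token.
imds_get() {
    _base="http://169.254.169.254/latest"
    # IMDSv2 requires a session token, obtained with a PUT request.
    _token=$(curl -s --connect-timeout 2 -X PUT "${_base}/api/token" \
        -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
    [ -n "${_token}" ] || return 1
    # Present the token on the actual metadata request.
    curl -s --connect-timeout 2 -H "X-aws-ec2-metadata-token: ${_token}" "${_base}/$1"
}

# Usage on an EC2 instance:
#   instance_type=$(imds_get meta-data/instance-type)
#   region=$(imds_get meta-data/placement/region)
```

Reading the region from `meta-data/placement/region` also avoids parsing the identity document with `grep` and `awk`.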

Hey there, we are implementing this by calling the CRT IMDS client.

Cool. Keep in mind you'll likely need to filter the value down to a number. The describe-instance-types API returns values like:

$ aws ec2 describe-instance-types --filters "Name=instance-type,Values=c5.*" --query "InstanceTypes[].[InstanceType, NetworkInfo.NetworkPerformance]" --output table
-------------------------------------
|       DescribeInstanceTypes       |
+--------------+--------------------+
|  c5.4xlarge  |  Up to 10 Gigabit  |
|  c5.xlarge   |  Up to 10 Gigabit  |
|  c5.12xlarge |  12 Gigabit        |
|  c5.24xlarge |  25 Gigabit        |
|  c5.9xlarge  |  10 Gigabit        |
|  c5.2xlarge  |  Up to 10 Gigabit  |
|  c5.large    |  Up to 10 Gigabit  |
|  c5.metal    |  25 Gigabit        |
|  c5.18xlarge |  25 Gigabit        |
+--------------+--------------------+

I added a regex and filter to get down to just the number.

$ instance_type=c5n.18xlarge
$ region=us-east-1
$ aws ec2 --region ${region} describe-instance-types --instance-types ${instance_type} --query "InstanceTypes[].[NetworkInfo.NetworkPerformance]" --output text | grep -o '[0-9]\+'
100
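The bare `grep -o '[0-9]\+'` filter prints every run of digits in the string and prints nothing for descriptions that carry no number. A slightly stricter sketch is below; the function name `parse_gbps` is hypothetical, and it assumes the figure of interest is the number directly before the word "Gigabit".

```shell
# Hypothetical helper: reduce a NetworkPerformance string to a whole number of Gbps.
# Prints the number that directly precedes "Gigabit"; returns 1 if there is none.
parse_gbps() {
    _gbps=$(printf '%s\n' "$1" \
        | grep -oE '[0-9]+ +Gigabit' \
        | head -n 1 \
        | grep -oE '^[0-9]+') || true
    [ -n "${_gbps}" ] || return 1
    printf '%s\n' "${_gbps}"
}

# Usage:
#   parse_gbps "Up to 10 Gigabit"   # prints 10
#   parse_gbps "25 Gigabit"         # prints 25
```

Note that "Up to 10 Gigabit" is treated the same as "10 Gigabit" here, so the result is an upper bound rather than a sustained rate. Returning a non-zero status for strings without a figure lets the caller fall back to a default instead of passing an empty value to `mount-s3`.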

Added in #184!

[ec2-user@ip-172-31-17-77 data]$ TOKEN=`curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"` && curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-type
m6idn.16xlarge
[ec2-user@ip-172-31-17-77 data]$ ./mount-s3 bornholt-test-bucket ~/mnt -f
2023-04-27T01:19:26.070099Z  INFO mount_s3: target network throughput 100 Gbps
2023-04-27T01:19:26.410418Z  WARN mount_s3: bucket bornholt-test-bucket is in region us-west-2, not us-east-1. redirecting...
2023-04-27T01:19:26.531456Z  INFO mount_s3: successfully mounted "/home/ec2-user/mnt"
2023-04-27T01:19:26.531570Z  INFO mountpoint_s3::fuse::session: fuse worker 0 is thread ID 1030411

Sweeet!