s3tester - S3 Performance Benchmarking
The goal of s3tester is to be a lightweight S3 performance testing utility. It is solely focused on S3 testing.
This tool is under active development - please submit feature requests on the issues page.
Minimum Requirements
- Go 1.7 or higher
Installation
$ go get github.com/s3tester/s3tester
If you don't want to build from source you can download the compiled version of s3tester for Windows or Linux from github.com/s3tester/s3tester/releases
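Note: newer Go toolchains (1.17 and later) no longer build and install binaries via go get. If your toolchain is module-aware and the project publishes a module, the equivalent command would be:
$ go install github.com/s3tester/s3tester@latest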
Usage
Setting your S3 credentials
There are multiple options for setting credentials.
- Using environment variables:
$ export AWS_ACCESS_KEY_ID=AKIAINZFCN46TISVUUCA
$ export AWS_SECRET_ACCESS_KEY=VInXxOfGtEIwVck4AdtUDavmJf/qt3jaJEAvSKZO
- Using an AWS credential file: see the --profile option below for details.
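For example, the --profile option reads the standard AWS credentials file. A minimal ~/.aws/credentials might look like this (the profile name tester is illustrative; the keys are the sample values from above):

# "tester" is an illustrative profile name
[tester]
aws_access_key_id = AKIAINZFCN46TISVUUCA
aws_secret_access_key = VInXxOfGtEIwVck4AdtUDavmJf/qt3jaJEAvSKZO

Run s3tester with -profile=tester to use these credentials.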
Command line options
Parameter | Type | Note |
---|---|---|
addressing-style | string | Whether to use virtual-hosted style addresses (bucket name is in the hostname) or path-style addresses (bucket name is part of the path). Value must be one of virtual or path. Default: path |
bucket | string | Bucket name (mandatory). Default: test |
concurrency | int | Maximum concurrent requests. 0: scan concurrency (run with ulimit -n 16384). Default: 1 |
consistency | string | The StorageGRID consistency control to use for all requests. Does nothing against non-StorageGRID systems. (all, available, strong-global, strong-site, read-after-new-write, weak) |
cpuprofile | string | Write CPU profile to file |
days | int | The number of days that the restored object will be available for. Default: 1 |
debug | boolean | Print response body on request failure |
describe | boolean | Instead of running tests, show the consolidated list of test parameters that will be used when a test is run |
duration | int | Test duration in seconds. Mutually exclusive with requests |
endpoint | string | Target endpoint(s). If multiple endpoints are specified, separate them with a comma. Note: the concurrency must be a multiple of the number of endpoints. Default: "https://127.0.0.1:18082" |
header | - | Specify one or more headers of the form <header-name>: <header-value> |
json | boolean | Print the result in JSON format if this flag is provided. Default: false |
lockstep | boolean | Force all threads to advance at the same rate rather than run independently |
logdetail | string | Write detailed log to file |
loglatency | string | Write latency histogram to file |
metadata | string | The metadata to use for the objects. The string must be formatted as 'key1=value1&key2=value2'. Used for put, updatemeta, multipartput, putget and putget9010r |
metadata-directive | string | Specifies whether the metadata is copied from the source object or replaced with the metadata provided in the object copy request. Value must be one of COPY or REPLACE. Default: COPY |
mixed-workload | string | Path to a JSON file that specifies a mixture of operations |
no-sign-request | boolean | Do not sign requests. Credentials will not be loaded if this argument is provided |
operation | string | Operation type: put, multipartput, get, puttagging, updatemeta, randget, delete, options, head, restore. Default: put |
overwrite | int | Turns a PUT/GET/HEAD into an operation on the same S3 key. 1: all writes/reads are to the same object; 2: threads clobber each other, but each write/read is to unique objects |
partsize | int | Size of each part in bytes. Only has an effect when a multipart put is used. Min: 5242880 (5 MiB). Default: 5242880 (5 MiB) |
prefix | string | Object name prefix. Default: testobject |
profile | string | Use a specific profile from the AWS CLI credential file |
query-params | string | Specify one or more custom query parameters of the form <queryparam-name>=<queryparam-value> or <queryparam-name>, separated by ampersands |
random-range | string | Used to perform random range GET requests. Format is <min>-<max>/<size>, where <size> is the number of bytes per GET request and <min>-<max> is an inclusive byte range within the object. Example: use 0-399/100 to perform random 100-byte reads within the first 400 bytes of an object |
range | string | Specify range header for GET requests |
ratelimit | float | The total number of operations per second across all threads. Default: 1.7976931348623157e+308 (effectively unlimited) |
region | string | Region to send requests to. Default: us-east-1 |
repeat | int | Repeat each S3 operation this many times. Default: 0 (do not repeat) |
requests | int | Total number of requests. Mutually exclusive with duration. Default: 1000 |
retries | int | Number of retry attempts. Default: 0 |
retrysleep | int | How long to sleep between each retry, in milliseconds. Default: 0 (exponential backoff) |
rr | - | Reduced redundancy storage for PUT requests |
size | int | Object size in bytes. Default: 30720 |
tagging | string | The tag-set for the object. The tag-set must be formatted as 'tag1=value1&tag2=value2'. Used for put, puttagging, putget and putget9010r |
tagging-directive | string | Specifies whether the object tag-set is copied from the source object or replaced with the tag-set provided in the object copy request. Value must be one of COPY or REPLACE. Default: COPY |
tier | string | The retrieval option for restoring an object. One of expedited, standard, or bulk. The AWS default is standard if not specified. Default: standard |
uniformDist | string | Generates a uniform distribution of object sizes given a min-max size, formatted as <min>-<max> (e.g. 10-20) |
verify | int | Verify the retrieved data on a get operation. 0: disable verify (default); 1: normal put data; 2: multipart put data. If verify equals 2, partsize is required (default partsize is 5242880 bytes) |
workload | string | File path to a JSON file that describes a workload to be run. The file is parsed with the Go template package and must produce JSON that is valid according to the workload schema |
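As an illustration of combining these options, a multipart PUT with an explicit part size and custom metadata might look like the following (the bucket and endpoint values are placeholders):

./s3tester -operation=multipartput -partsize=5242880 -size=20971520 -requests=100 -concurrency=8 -bucket=test -metadata='genre=test' -endpoint="https://127.0.0.1:18082"

This uploads each 20,971,520-byte object in four 5 MiB parts.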
workload JSON Sample File
{
"global": {
"concurrency": 4,
"prefix": "test",
"requests": 20
},
"workload": [
{
"bucket": "b1",
"operation": "put"
},
{
"bucket": "b2",
"copy-source-bucket": "b1",
"operation": "copy"
},
{
"bucket": "b2",
"operation": "get"
},
{
"bucket": "b2",
"operation": "head"
},
{
"bucket": "b1",
"operation": "delete"
},
{
"bucket": "b2",
"operation": "delete"
}
]
}
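Assuming the file above is saved as workload.json (the name is illustrative), the workload could be run with:

./s3tester -workload=workload.json -endpoint="https://127.0.0.1:18082"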
NOTE: Additional file samples and template support are documented separately.
mixedWorkload JSON Sample File
{
"mixedWorkload": [{
"operationType": "put",
"ratio": 25
}, {
"operationType": "get",
"ratio": 25
}, {
"operationType": "updatemeta",
"ratio": 25
}, {
"operationType": "delete",
"ratio": 25
}]
}
NOTE: Requests are generated in the order in which the operations are specified. That is, if you specify a DELETE followed by a PUT but there are no objects on your grid to delete, all of your deletes will fail.
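A run using this mixture might look like the following sketch (the file name mixed.json is illustrative):

./s3tester -mixed-workload=mixed.json -requests=1000 -concurrency=4 -bucket=test -endpoint="https://127.0.0.1:18082"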
Exit codes
- 1: one or more requests have failed
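This makes it straightforward to gate scripted runs on success; a minimal sketch:

./s3tester -operation=put -requests=1000 -bucket=test || echo "one or more requests failed"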
Examples
Writing objects into a bucket
./s3tester -concurrency=128 -size=20000000 -operation=put -requests=20000 -endpoint="https://10.96.105.5:18443" -prefix=3
- Starts writing objects into the default bucket test.
- The bucket needs to be created prior to running s3tester.
- The ingested objects will be named 3-object#, where 3 is the prefix specified and object# is a sequential number starting from zero and going up to the number of requests.
- This command will perform a total of 20,000 PUT requests (in this case slightly fewer, because 20,000 is not evenly divisible by 128).
- The object size is 20,000,000 bytes.
- Replace the sample IP/port combination with the one you are using.
Reading objects from a bucket (and other operations)
./s3tester -concurrency=128 -operation=get -requests=200000 -endpoint="https://10.96.105.5:18443" -prefix=3
- Matches the write request above and will read the same objects in the same sequence.
- If you use the randget operation, the objects will be read in random order, simulating a random-access workload.
- If you use the head operation, the S3 HEAD operation will be performed against the objects in sequence.
- If you use the delete operation, the objects will be deleted.
As of version 2.1.0, the concurrency of a retrieval operation can differ from the concurrency used to ingest the objects. The goal is to save time by ingesting data once and retrieving it at different concurrencies to observe the impact on performance. However, the number of requests must match the number that was actually ingested. For example, if you ingest with concurrency 1000 and requests set to 1100, only 1000 requests (1100 - 1100%1000) will actually be ingested, to keep the number of requests per client thread equal. When performing the retrieval, the number of requests specified must therefore be 1000, not 1100.
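For instance, a sketch of ingesting at one concurrency and reading back at another (same endpoint and prefix placeholders as the examples above):

./s3tester -operation=put -concurrency=1000 -requests=1000 -endpoint="https://10.96.105.5:18443" -prefix=c1000
./s3tester -operation=get -concurrency=100 -requests=1000 -endpoint="https://10.96.105.5:18443" -prefix=c1000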
Interpreting the results
--- Total Results ---
Operation: put
Concurrency: 64
Total number of requests: 99968
Total number of unique objects: 99968
Failed requests: 0
Total elapsed time: 2m43.251246249s
Average request time: 101.057175ms
Minimum request time: 13.84ms
Maximum request time: 712.75ms
Nominal requests/s: 633.3
Actual requests/s: 612.4
Content throughput: 2.392018 MB/s
Average Object Size: 4096
Response Time Percentiles
50 : 93.91 ms
75 : 114.68 ms
90 : 140.4 ms
95 : 166 ms
99 : 331.71 ms
99.9 : 492.57 ms
Latency(ms) : Operations
0 - 1 : 0 |
2 - 3 : 0 |
4 - 7 : 0 |
8 - 15 : 7 |
16 - 31 : 945 ||
32 - 63 : 12093 ||||||||||||||
64 - 127 : 71662 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
128 - 255 : 13671 ||||||||||||||||
256 - 511 : 1505 ||
512 - 713 : 85 |
- Nominal requests/s is calculated ignoring any client-side overheads. This number will always be higher than the actual requests/s. If those two numbers diverge significantly, it can be an indication that the client machine isn't capable of generating the required workload, and you may want to consider using multiple machines.
- Actual requests/s is the total number of requests divided by the total elapsed time in seconds.
- Content throughput is the total amount of data ingested and retrieved in MB divided by the total elapsed time in seconds.
- Total number of unique objects is the total number of unique objects operated on successfully.
For per-request details, s3tester can be run with the -logdetail option to capture all the request latencies into a .csv file.
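For example (the output file name latency.csv is illustrative):

./s3tester -operation=put -requests=1000 -concurrency=64 -logdetail=latency.csv -endpoint="https://10.96.105.5:18443"

The per-request latencies can then be analyzed with standard CSV tooling.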