This project aims to replicate a variety of benchmarks taken from various sources.
A sample benchmark report run on a MacBook Pro (Retina, 13-inch, Early 2015, 2.7 GHz Intel Core i5, 16 GB 1867 MHz DDR3) can be seen at https://davidkellis.github.io/benchmarks/.
Every benchmark implementation runs in a self-contained Docker container. Necessary build and runtime artifacts are copied into the container.
- Install Docker Community Edition (see https://store.docker.com/search?type=edition&offering=community)
- Install Ruby 2.0+
- Clone the repository and run the benchmarks:
git clone git@github.com:davidkellis/benchmarks.git
cd benchmarks
./run.rb
- Open ./index.html in a web browser to view the results.
Many benchmark ideas and implementations are taken from:
- https://benchmarksgame-team.pages.debian.net/benchmarksgame/
- https://github.com/kostya/benchmarks
- https://github.com/kostya/crystal-benchmarks-game
- https://github.com/attractivechaos/plb
- https://github.com/drujensen/fib
- https://github.com/trizen/language-benchmarks
- https://github.com/JuliaLang/Microbenchmarks
- https://github.com/archer884/i-before-e
Attribution is captured in each benchmark program implementation.
The base64 benchmark (1) constructs a string consisting of 10,000,000 'a' characters, (2) base64 encodes the string 100 times, (3) base64 decodes the encoded string 100 times, and then (4) compares the unencoded string generated in step (1) with the decoded string in step (3) and prints whether they match or not.
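The four steps can be sketched in Ruby as follows. This is a scaled-down illustration, not any of the project's implementations: the helper name base64_roundtrip and its two parameters are hypothetical, and it uses Array#pack("m0") and String#unpack("m0") from Ruby core for strict base64 encoding and decoding.

```ruby
# Scaled-down sketch of the base64 benchmark's four steps.
# length and iterations are parameters so the example finishes quickly;
# the benchmark itself uses 10,000,000 characters and 100 iterations.
def base64_roundtrip(length, iterations)
  original = "a" * length                                   # step 1

  encoded = nil
  iterations.times { encoded = [original].pack("m0") }      # step 2

  decoded = nil
  iterations.times { decoded = encoded.unpack("m0").first } # step 3

  original == decoded                                       # step 4
end

puts base64_roundtrip(1_000, 3) ? "match" : "mismatch"
```

Passing 10_000_000 and 100 would reproduce the workload size described above.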
The binarytrees benchmark constructs perfect binary trees of various heights in an attempt to exercise memory allocation and deallocation performance.
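As a rough illustration of the allocation pattern, the sketch below builds perfect binary trees and counts their nodes. The names Node, build_tree, and count_nodes are hypothetical, and height is assumed to be counted in edges, so a tree of height 0 is a single node.

```ruby
# A tree node; leaves have nil children.
Node = Struct.new(:left, :right)

# Allocate a perfect binary tree of the given height (height 0 = one node).
def build_tree(height)
  return Node.new(nil, nil) if height.zero?
  Node.new(build_tree(height - 1), build_tree(height - 1))
end

# Walk the whole tree, which forces every allocated node to be visited.
def count_nodes(node)
  return 0 if node.nil?
  1 + count_nodes(node.left) + count_nodes(node.right)
end

(0..4).each do |height|
  puts "height #{height}: #{count_nodes(build_tree(height))} nodes"
end
```

A perfect tree of height h holds 2^(h+1) - 1 nodes, so the allocation count doubles with each additional level.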
The fib benchmark calculates and prints the 45th Fibonacci number, 1,134,903,170.
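The text above does not say which algorithm the implementations use. Assuming the naive doubly recursive definition with fib(0) = 0 and fib(1) = 1 (under which fib(45) is 1,134,903,170), a minimal sketch looks like this:

```ruby
# Naive recursive Fibonacci: fib(0) = 0, fib(1) = 1.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

# A small argument keeps the example fast; fib(45) takes far longer.
puts fib(20)
```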
The i-before-e benchmark, taken from https://github.com/archer884/i-before-e, implements a variation of the r/dailyprogrammer challenge for 2018-06-11 (see https://www.reddit.com/r/dailyprogrammer/comments/8q96da/20180611_challenge_363_easy_i_before_e_except/).
Specifically, this benchmark lists the words from the enable1 word list that are exceptions to the rule "i before e except after c".
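One way to express the check, assuming the challenge's definition that a word breaks the rule when it contains "ei" not immediately preceded by "c", or "ie" immediately preceded by "c". The method name exception? and the sample word list are illustrative only; the benchmark reads the enable1 word list instead.

```ruby
# True when the word is an exception to "i before e except after c":
# it contains "cie", or it contains "ei" that does not follow a "c".
def exception?(word)
  word.match?(/cie|(?<!c)ei/)
end

words = %w[fiery hierarchy receive veil icier science]
puts words.select { |word| exception?(word) }
```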
The json benchmark reads a json document from disk, parses the json document, extracts 3-dimensional coordinates from the in-memory json document, calculates the arithmetic average of the coordinates, and then prints the averages to STDOUT.
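The layout of the JSON document is not described here. As a sketch, assume a top-level "coordinates" array whose entries have numeric "x", "y", and "z" fields; both that layout and the helper name average_coordinates are assumptions. The example parses an inline string rather than a file so that it is self-contained.

```ruby
require "json"

# Parse the document and return the arithmetic mean of x, y and z.
# Assumes the layout {"coordinates": [{"x": ..., "y": ..., "z": ...}, ...]}.
def average_coordinates(json_text)
  coords = JSON.parse(json_text).fetch("coordinates")
  count = coords.size.to_f
  %w[x y z].map { |axis| coords.sum { |c| c.fetch(axis) } / count }
end

document = '{"coordinates": [{"x": 1, "y": 2, "z": 3}, {"x": 3, "y": 4, "z": 5}]}'
x, y, z = average_coordinates(document)
puts x, y, z
```

A benchmark implementation would obtain the text with File.read on the generated document instead of using an inline string.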
The matrixmultiply benchmark calculates the product of two 1000x1000 element matrices using the standard iterative matrix multiplication algorithm described at https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm.
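The iterative algorithm is the familiar triple loop. The sketch below shows it on nested Ruby arrays; the function name matmul is hypothetical, and a 2x2 input stands in for the 1000x1000 matrices.

```ruby
# Standard iterative matrix multiplication: c[i][j] = sum over k of a[i][k] * b[k][j].
def matmul(a, b)
  rows  = a.size
  inner = b.size
  cols  = b.first.size
  c = Array.new(rows) { Array.new(cols, 0) }

  rows.times do |i|
    cols.times do |j|
      sum = 0
      inner.times { |k| sum += a[i][k] * b[k][j] }
      c[i][j] = sum
    end
  end
  c
end

product = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
product.each { |row| puts row.join(" ") }
```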
The matrixmultiply-fast benchmark is identical to the matrixmultiply benchmark, with the exception that optimized matrix multiplication libraries and optimized matrix multiplication algorithms may be used in place of the naively implemented standard iterative algorithm that is mandated in the matrixmultiply benchmark.
The pidigits benchmark uses arbitrary precision arithmetic to implement the step-by-step "spigot" algorithm described in http://web.comlab.ox.ac.uk/oucl/work/jeremy.gibbons/publications/spigot.pdf to generate the first 10,000 digits of Pi.
The quicksort benchmark reads a comma-delimited list of double precision floating point values from a text file on disk into memory, and then sorts the list of floating point values in ascending order using the Quicksort sorting algorithm.
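A minimal sketch of the two stages, parsing and sorting, is shown below. It assumes an in-place Quicksort with a middle-element pivot, which is only one of several common variants; the name quicksort! is hypothetical, and an inline string stands in for the text file.

```ruby
# In-place Quicksort using the middle element as the pivot.
def quicksort!(values, lo = 0, hi = values.size - 1)
  return values if lo >= hi

  pivot = values[(lo + hi) / 2]
  i = lo
  j = hi
  while i <= j
    i += 1 while values[i] < pivot
    j -= 1 while values[j] > pivot
    next if i > j
    values[i], values[j] = values[j], values[i]
    i += 1
    j -= 1
  end

  quicksort!(values, lo, j)
  quicksort!(values, i, hi)
  values
end

# Parse a comma-delimited line into floats, then sort ascending.
numbers = "3.5,-1.0,2.25,0.0".split(",").map(&:to_f)
puts quicksort!(numbers).join(",")
```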
The sudoku benchmark reads a puzzle file consisting of 20 sudoku puzzles from disk, solves them sequentially, and then prints out the solved puzzles.
The tapelang-alphabet benchmark tests how quickly a Tapelang interpreter can execute a Tapelang program that prints the alphabet in reverse order, from Z to A.
Every directory in the top level project directory corresponds to a benchmark suite. For example:
benchmarks
- base64
- json
- pidigits
- ...
Within the directory for a given benchmark suite, there are five things: (1) benchmark implementation directories, (2) a README.md that defines the benchmark specification, including any relevant rules/guidelines for implementing the benchmark, (3) an optional setup.rb script, (4) an optional teardown.rb script, and (5) any extra artifacts that are necessary for the benchmark suite to be executed.
Each benchmark implementation is a directory with a Dockerfile that defines how to build and run the benchmark implementation.
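Given this layout, the implementations of every suite can be enumerated by looking for Dockerfiles two levels below the project root. The sketch below illustrates that idea; it is not the logic used by run.rb, and the function name find_implementations is hypothetical.

```ruby
# Map each suite directory to the implementation directories that contain
# a Dockerfile, e.g. {"base64" => ["crystal", "ruby"]}.
def find_implementations(root)
  pattern = File.join(root, "*", "*", "Dockerfile")
  Dir.glob(pattern).sort.each_with_object({}) do |dockerfile, suites|
    impl_dir = File.dirname(dockerfile)
    suite = File.basename(File.dirname(impl_dir))
    (suites[suite] ||= []) << File.basename(impl_dir)
  end
end

find_implementations(".").each do |suite, implementations|
  puts "#{suite}: #{implementations.join(', ')}"
end
```

Subdirectories without a Dockerfile, such as ones holding extra artifacts, are skipped.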
If we're going to add a new benchmark, count_to_a_million, then we do the following:
- Add a top-level directory in the project's root directory. For example, mkdir count_to_a_million.
- If there needs to be any prep work done prior to the execution of the benchmark implementations, then create a script called setup.rb in the count_to_a_million directory, and add to it whatever setup logic needs to run before the implementations for that benchmark suite are executed.
- If there needs to be any teardown work performed after all the benchmark implementations for the benchmark suite are run, then create a script called teardown.rb in the count_to_a_million directory.
- Add benchmark implementations.
If we're going to add a new ruby 2.5.0 implementation of the count_to_a_million benchmark, then we would create a new subdirectory within the count_to_a_million benchmark suite directory (i.e. <project root>/count_to_a_million/ruby2.5.0), and then add the source code that implements the benchmark, as well as the Dockerfile for that implementation.
The Dockerfile for every benchmark implementation must copy the time.sh script into the container, and execute the benchmark via the time.sh script. For example:
FROM ruby:2.5.0
RUN apt-get update && apt-get install -y time
WORKDIR /app
# keep in mind that the docker build command is being run from the benchmarks/ project's root directory
# and the context directory is being specified as the benchmarks/ root directory
# copy the time.sh script into the container's working directory (i.e. /app)
COPY time.sh .
# copy the implementation of the benchmark into the container's working directory (i.e. /app)
COPY count_to_a_million/ruby2.5.0/run.rb .
# run the benchmark implementation via the time.sh script
CMD ["./time.sh", "ruby", "run.rb"]
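Running every implementation through time.sh is presumably what makes the per-process metrics in the report comparable. In the sample metrics further down, elapsed time appears as strings such as "0:46.03" and "4:22.63", which look like an [hours:]minutes:seconds format. Assuming that format, the sketch below converts such values to seconds and ranks a few entries copied from the sample run; the helper name real_time_seconds is hypothetical.

```ruby
# Convert "[hours:]minutes:seconds" (e.g. "4:22.63") into seconds.
def real_time_seconds(text)
  text.split(":").map(&:to_f).reduce(0.0) { |total, part| total * 60 + part }
end

# Keys follow the "suite:implementation:metric" pattern of the sample metrics.
metrics = {
  "pidigits:ruby:process_real_time"       => "0:08.80",
  "pidigits:go:process_real_time"         => "0:01.26",
  "matrixmultiply:ruby:process_real_time" => "4:22.63"
}

ranked = metrics.map do |key, value|
  suite, implementation, _metric = key.split(":")
  ["#{suite}:#{implementation}", real_time_seconds(value)]
end

ranked.sort_by { |_name, seconds| seconds }.each do |name, seconds|
  puts format("%-24s %8.2f s", name, seconds)
end
```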
The following sample run was performed on a MacBook Pro (Retina, 13-inch, Early 2015), 2.7 GHz Intel Core i5, 16 GB 1867 MHz DDR3.
The corresponding html report can be viewed at https://davidkellis.github.io/benchmarks/.
$ ./run.rb
Running binarytrees benchmark suite
binarytrees/ruby
binarytrees/gcc8.1.0
binarytrees/go
binarytrees/rust
Running helloworld benchmark suite
helloworld/ruby
helloworld/crystal
Running base64 benchmark suite
base64/ruby
base64/crystal
base64/rust
Running matrixmultiply benchmark suite
matrixmultiply/gogonum
matrixmultiply/ruby
matrixmultiply/go
Running json benchmark suite
Running setup for json benchmark
json/ruby
json/crystal
Running teardown for json benchmark
Running pidigits benchmark suite
pidigits/ruby
pidigits/crystal
pidigits/gcc8.1.0
pidigits/go
pidigits/rust
Running quicksort benchmark suite
Running setup for quicksort benchmark
quicksort/ruby
quicksort/crystal
Running teardown for quicksort benchmark
All benchmarks run in 511.381802 seconds.
Metrics (also written to index.html):
{"binarytrees:ruby:process_user_time"=>"140.26",
"binarytrees:ruby:process_system_time"=>"3.05",
"binarytrees:ruby:process_real_time"=>"0:46.03",
"binarytrees:ruby:process_percent_cpu_time"=>"311%",
"binarytrees:ruby:process_max_rss_mb"=>411.49609375,
"binarytrees:gcc8.1.0:process_user_time"=>"7.53",
"binarytrees:gcc8.1.0:process_system_time"=>"0.21",
"binarytrees:gcc8.1.0:process_real_time"=>"0:02.37",
"binarytrees:gcc8.1.0:process_percent_cpu_time"=>"325%",
"binarytrees:gcc8.1.0:process_max_rss_mb"=>130.984375,
"binarytrees:go:process_user_time"=>"92.16",
"binarytrees:go:process_system_time"=>"9.60",
"binarytrees:go:process_real_time"=>"0:31.37",
"binarytrees:go:process_percent_cpu_time"=>"324%",
"binarytrees:go:process_max_rss_mb"=>386.21875,
"binarytrees:rust:process_user_time"=>"9.55",
"binarytrees:rust:process_system_time"=>"2.52",
"binarytrees:rust:process_real_time"=>"0:03.36",
"binarytrees:rust:process_percent_cpu_time"=>"359%",
"binarytrees:rust:process_max_rss_mb"=>167.734375,
"helloworld:ruby:time"=>"1.74e-05s",
"helloworld:ruby:process_user_time"=>"0.05",
"helloworld:ruby:process_system_time"=>"0.02",
"helloworld:ruby:process_real_time"=>"0:00.10",
"helloworld:ruby:process_percent_cpu_time"=>"68%",
"helloworld:ruby:process_max_rss_mb"=>8.765625,
"helloworld:crystal:time"=>"00:00:00.000036000s",
"helloworld:crystal:process_user_time"=>"0.00",
"helloworld:crystal:process_system_time"=>"0.00",
"helloworld:crystal:process_real_time"=>"0:00.01",
"helloworld:crystal:process_percent_cpu_time"=>"0%",
"helloworld:crystal:process_max_rss_mb"=>2.98828125,
"base64:ruby:process_user_time"=>"3.03",
"base64:ruby:process_system_time"=>"0.36",
"base64:ruby:process_real_time"=>"0:03.40",
"base64:ruby:process_percent_cpu_time"=>"99%",
"base64:ruby:process_max_rss_mb"=>180.84375,
"base64:crystal:process_user_time"=>"2.53",
"base64:crystal:process_system_time"=>"0.02",
"base64:crystal:process_real_time"=>"0:02.57",
"base64:crystal:process_percent_cpu_time"=>"99%",
"base64:crystal:process_max_rss_mb"=>57.26953125,
"base64:rust:process_user_time"=>"1.94",
"base64:rust:process_system_time"=>"1.26",
"base64:rust:process_real_time"=>"0:03.21",
"base64:rust:process_percent_cpu_time"=>"99%",
"base64:rust:process_max_rss_mb"=>37.078125,
"matrixmultiply:gogonum:process_user_time"=>"0.97",
"matrixmultiply:gogonum:process_system_time"=>"0.01",
"matrixmultiply:gogonum:process_real_time"=>"0:00.33",
"matrixmultiply:gogonum:process_percent_cpu_time"=>"294%",
"matrixmultiply:gogonum:process_max_rss_mb"=>25.58203125,
"matrixmultiply:ruby:process_user_time"=>"262.24",
"matrixmultiply:ruby:process_system_time"=>"0.03",
"matrixmultiply:ruby:process_real_time"=>"4:22.63",
"matrixmultiply:ruby:process_percent_cpu_time"=>"99%",
"matrixmultiply:ruby:process_max_rss_mb"=>93.76171875,
"matrixmultiply:go:process_user_time"=>"17.36",
"matrixmultiply:go:process_system_time"=>"0.18",
"matrixmultiply:go:process_real_time"=>"0:17.52",
"matrixmultiply:go:process_percent_cpu_time"=>"100%",
"matrixmultiply:go:process_max_rss_mb"=>26.421875,
"json:ruby:time"=>"7.7473012s",
"json:ruby:process_user_time"=>"7.75",
"json:ruby:process_system_time"=>"0.60",
"json:ruby:process_real_time"=>"0:08.37",
"json:ruby:process_percent_cpu_time"=>"99%",
"json:ruby:process_max_rss_mb"=>805.09765625,
"json:crystal:time"=>"00:00:02.415821000s",
"json:crystal:process_user_time"=>"2.78",
"json:crystal:process_system_time"=>"0.50",
"json:crystal:process_real_time"=>"0:02.58",
"json:crystal:process_percent_cpu_time"=>"126%",
"json:crystal:process_max_rss_mb"=>1035.5234375,
"pidigits:ruby:process_user_time"=>"8.18",
"pidigits:ruby:process_system_time"=>"0.60",
"pidigits:ruby:process_real_time"=>"0:08.80",
"pidigits:ruby:process_percent_cpu_time"=>"99%",
"pidigits:ruby:process_max_rss_mb"=>158.4921875,
"pidigits:crystal:process_user_time"=>"8.09",
"pidigits:crystal:process_system_time"=>"2.26",
"pidigits:crystal:process_real_time"=>"0:10.16",
"pidigits:crystal:process_percent_cpu_time"=>"101%",
"pidigits:crystal:process_max_rss_mb"=>7.8359375,
"pidigits:gcc8.1.0:process_user_time"=>"0.76",
"pidigits:gcc8.1.0:process_system_time"=>"0.00",
"pidigits:gcc8.1.0:process_real_time"=>"0:00.77",
"pidigits:gcc8.1.0:process_percent_cpu_time"=>"98%",
"pidigits:gcc8.1.0:process_max_rss_mb"=>2.125,
"pidigits:go:process_user_time"=>"1.23",
"pidigits:go:process_system_time"=>"0.06",
"pidigits:go:process_real_time"=>"0:01.26",
"pidigits:go:process_percent_cpu_time"=>"101%",
"pidigits:go:process_max_rss_mb"=>9.01953125,
"pidigits:rust:process_user_time"=>"0.76",
"pidigits:rust:process_system_time"=>"0.00",
"pidigits:rust:process_real_time"=>"0:00.76",
"pidigits:rust:process_percent_cpu_time"=>"99%",
"pidigits:rust:process_max_rss_mb"=>4.40234375,
"quicksort:ruby:process_user_time"=>"4.21",
"quicksort:ruby:process_system_time"=>"0.08",
"quicksort:ruby:process_real_time"=>"0:04.30",
"quicksort:ruby:process_percent_cpu_time"=>"99%",
"quicksort:ruby:process_max_rss_mb"=>151.21484375,
"quicksort:crystal:process_user_time"=>"1.78",
"quicksort:crystal:process_system_time"=>"0.16",
"quicksort:crystal:process_real_time"=>"0:01.65",
"quicksort:crystal:process_percent_cpu_time"=>"117%",
"quicksort:crystal:process_max_rss_mb"=>103.734375}