A small benchmark, currently only comparing GHC and Koka on basic allocation-heavy examples.
Installation and running:
- For GHC: install `stack` (from its website or through `ghcup`), then run `stack run` in the `ghc` directory. You can change RTS options via `stack run -- +RTS <options>`.
- For Koka: install Koka, then run `koka bench.kk -O2 -e` in the `koka` directory.
There's a normalization-by-evaluation implementation for pure lambda calculus, with a normalizer and a beta-conversion checker. We use Church-encoded binary trees for benchmarks.
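For concreteness, here is a minimal Haskell sketch of such a setup; the names (`Tm`, `churchTree`) and the exact construction are assumptions for illustration, not the repo's actual code.

```haskell
-- Hypothetical sketch: object-language terms and a Church-encoded
-- complete binary tree of a given depth.
data Tm
  = Var Int   -- de Bruijn index
  | App Tm Tm
  | Lam Tm

-- \node leaf. node (node ... leaf ...) (node ... leaf ...)
-- Under the two binders, `node` is Var 1 and `leaf` is Var 0.
churchTree :: Int -> Tm
churchTree depth = Lam (Lam (go depth))
  where
    go 0 = Var 0                                        -- leaf
    go d = App (App (Var 1) (go (d - 1))) (go (d - 1))  -- node, subtrees built twice
```

The `share` variants would instead compute each subtree once (`let t = go (d - 1) in App (App (Var 1) t) t`), so subtrees are shared as in-memory values.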
- `Tree NF`: normalizes a complete binary tree of depth 20, without any in-memory sharing of subtrees.
- `Tree Conv`: checks beta-conversion of the same tree against itself, again with no structure sharing.
- `Tree force`: folds over the same tree without actually building a normalized result in memory; this is possible because of the Church coding.
- `Tree NF share`: same as `Tree NF`, except subtrees are shared in memory.
- `Tree Conv share`: same as `Tree Conv`, except subtrees are shared in memory.
The practical relevance here is that beta-conversion of lambda terms is used heavily in typechecking for every dependently typed language.
Reuse estimation is not easy for the NbE benchmarks. We can do the following: look at GHC heap allocation just for `Tree NF` and `Tree force`. We know that the difference between them is the quotation of the resulting normal tree, and we also know that quotation should reuse the `Var` and `App` nodes in Koka, so essentially all allocation in quotation is reused. From that we get a rough estimate of reuse.
- `Tree NF` heap allocation: 9060M
- `Tree force` heap allocation: 7550M

Hence (9060 - 7550) / 9060 ≈ 16.7% of allocations should be reused in `Tree NF` in Koka.
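To make the quotation step concrete, here is a hedged NbE sketch in Haskell (HOAS-style values; hypothetical names, not the repo's actual code). The point is that quotation is the phase that allocates the `Var` and `App` term nodes discussed above:

```haskell
-- Hypothetical NbE sketch.
data Tm = Var Int | App Tm Tm | Lam Tm

data Val
  = VVar Int            -- de Bruijn level
  | VApp Val Val
  | VLam (Val -> Val)   -- closures as host functions

-- Quotation reads a value back into a term. Every VVar/VApp in the value
-- becomes a fresh Var/App term node: this is the allocation that Koka's
-- reuse analysis can serve from the cells being consumed, while GHC
-- performs it as ordinary heap allocation.
quote :: Int -> Val -> Tm
quote l (VVar x)   = Var (l - x - 1)   -- level to index
quote l (VApp t u) = App (quote l t) (quote l u)
quote l (VLam f)   = Lam (quote (l + 1) (f (VVar l)))
```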
Second, we switch over to binary trees in the host language, defined as ADTs. Here we look at an extremely simple benchmark where we can precisely control the ratio of allocation reuse in Koka. We create a tree and then map over it N times.
The creation of the initial tree has no reuse in Koka, but every subsequent mapping has 100% reused allocations. Hence, mapping once yields 1/2 total reuse, mapping twice 2/3, and so on.
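A minimal Haskell sketch of this benchmark, with hypothetical names (the Koka version would be the analogous recursive definitions, with reuse happening inside the map):

```haskell
-- Hypothetical sketch of the ADT benchmark.
data Tree = Leaf !Int | Node Tree Tree

-- Build the initial tree: in Koka, this allocation cannot be reused.
build :: Int -> Tree
build 0 = Leaf 0
build d = Node (build (d - 1)) (build (d - 1))

-- Map over the tree: in Koka, with a uniquely owned argument, every Leaf/Node
-- allocated here can reuse the cell that was just consumed.
mapT :: (Int -> Int) -> Tree -> Tree
mapT f (Leaf n)   = Leaf (f n)
mapT f (Node l r) = Node (mapT f l) (mapT f r)

-- Build once, then map n times: n out of the n + 1 tree-sized allocations are
-- reusable in Koka, giving the 1/2, 2/3, ... ratios mentioned above.
bench :: Int -> Int -> Tree
bench depth n = go n (build depth)
  where
    go 0 t = t
    go k t = go (k - 1) (mapT (+ 1) t)
```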
GHC has no reuse, so it copies the tree once whenever the young-generation heap fills up. Hence the size of the young generation is the key factor for overheads.
- OS: Ubuntu 22.04
- CPU: AMD Ryzen 7700X, stock voltage & frequency, no thermal throttling (cooler: Arctic LF III 280mm AIO).
- RAM: DDR5 6000 MT/s 32-38-38 2x16GB, using Buildzoid secondary timings (https://www.youtube.com/watch?v=dlYxmRcdLVw)
Build used:
- GHC 9.8.2 with `-O1 -threaded -fworker-wrapper-cbv -rtsopts`.
- Koka 3.1.1 with gcc 11.4.0, with `-O2`.
RSS (resident set size) is reported by `/usr/bin/time -v` in all cases; I ran `/usr/bin/time -v` on the generated executables in the `.stack` and `.koka` directories.
RESULTS TO BE UPDATED