haskell/cabal

Building a package is slow

Opened this issue · 4 comments

I recently discovered that building packages is surprisingly more costly than building modules. See commercialhaskell/stack#2831 for a benchmark and results. The takeaway is this:

I chased that behavior down to cabal configure and cabal build, so the problem isn't really with Stack. That made me wonder about how long it takes to build a package with Cabal, so I wrote a benchmark for that. Here are the raw results:

benchmarking cabal sandbox init; cabal sandbox delete
time                 152.6 ms   (146.5 ms .. 156.6 ms)
                     0.998 R²   (0.992 R² .. 1.000 R²)
mean                 155.7 ms   (152.5 ms .. 159.4 ms)
std dev              4.734 ms   (2.695 ms .. 6.772 ms)
variance introduced by outliers: 12% (moderately inflated)

benchmarking cabal sandbox init; cabal build; cabal sandbox delete
time                 1.414 s    (1.362 s .. NaN s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 1.433 s    (1.422 s .. 1.443 s)
std dev              17.97 ms   (271.9 as .. 18.36 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking cabal build
time                 495.2 ms   (448.5 ms .. 532.5 ms)
                     0.999 R²   (0.996 R² .. 1.000 R²)
mean                 502.3 ms   (495.4 ms .. 507.5 ms)
std dev              8.000 ms   (0.0 s .. 9.039 ms)
variance introduced by outliers: 19% (moderately inflated)

What I take away from that is this: Building an empty package with a cold cache takes 1.4 seconds on my machine. A warm cache brings that down to 0.5 seconds. Both of those seem high, but especially the former. Are these results expected?

Hi @tfausak; no, it is a bug that Cabal is slow :) I assume by cold/warm cache you mean, whether or not you have configured previously?

I'm not going to ask you to submit patches to fix the performance, but if you're interested in drilling down and finding out exactly what it is that is taking up this time, that would be very helpful information.

I should have been more specific about cold/warm caches. By "cold cache" I mean building the package in a fresh sandbox. The benchmark ran cabal sandbox init; cabal build; cabal sandbox delete in a loop. By "warm cache" I mean that the package had already been built, then the benchmark ran cabal build in a loop.

I did run a build with --verbose=3 (as you can see in the Dockerfile), but it wasn't as useful as I'd hoped. Cabal doesn't display timing information on log lines. I'll need to write a shell script to prepend each line with the time and see which parts are taking a while.

flamegraphs are easy to build:

// in cabal repo
stack build cabal-install --executable-profiling --library profiling

// in dummy project repo after building sandbox
<path-to-cabal> build +RTS -p

cat cabal.prof | ghc-prof-flamegraph | flamegraph.pl > cabal.prof.svg

I looked around at one and didn't see anything obviously taking up way too much time. But, perhaps of note, 35% of the time is spent just getting to Main.buildAction.

#4093 suggests this might be happening at the GHC level rather than the Cabal level.