Travis Build Times

We noticed some slowness on Travis, so I figured I'd take a look.

The goal here was to calculate the likelihood that a build will time out, given how long the build has already been running.
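As a rough illustration of that calculation, here is a minimal sketch that estimates the likelihood empirically: of all builds that ran at least as long as the elapsed time, what fraction ended in a timeout? The `Build` struct, its field names, and the sample records are made up for the example, and this is not necessarily how the scripts in this repo compute it.

```ruby
# Hypothetical build record: total duration in minutes and whether it timed out.
Build = Struct.new(:duration_min, :timed_out)

# Estimate P(timeout | build has already run for at least `elapsed_min`).
# Returns nil when no build in the sample lasted that long.
def timeout_likelihood(builds, elapsed_min)
  still_running = builds.select { |b| b.duration_min >= elapsed_min }
  return nil if still_running.empty?

  still_running.count(&:timed_out).to_f / still_running.size
end

# Made-up sample data, for illustration only.
sample_builds = [
  Build.new(12, false),
  Build.new(25, false),
  Build.new(38, false),
  Build.new(50, true),
  Build.new(50, true)
]

[0, 20, 30, 40].each do |minute|
  puts format("minute %2d: %.2f", minute, timeout_likelihood(sample_builds, minute))
end
```

Evaluating this at every minute and plotting the result gives a curve of the kind discussed under Result.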

The data was split into two sets:

'control', which is all of our Travis builds between 2016/03 and 2016/11/18
'upgradedVM', which is our builds after Travis upgraded our machine, from 2016/11/18 to 2016/12/20
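A split like this can be sketched as a simple partition on the build's start date. The record shape (a hash with a `:started_at` date) is hypothetical, and because 11/18 appears in both ranges above, the sketch assumes builds on the cutover day belong to 'upgradedVM'.

```ruby
require "date"

# Cutover date taken from the ranges above.
UPGRADE_DATE = Date.new(2016, 11, 18)

# Split builds into the two sets by start date.
# Assumption: builds started on the cutover day count as 'upgradedVM'.
def split_by_upgrade(builds)
  control, upgraded = builds.partition { |b| b[:started_at] < UPGRADE_DATE }
  { "control" => control, "upgradedVM" => upgraded }
end

# Made-up records, for illustration only.
sample = [
  { started_at: Date.new(2016, 5, 1) },
  { started_at: Date.new(2016, 11, 18) },
  { started_at: Date.new(2016, 12, 1) }
]

split_by_upgrade(sample).each do |name, set|
  puts "#{name}: #{set.size} build(s)"
end
```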

To run:

bundle install
make local

You'll need rvm, ruby, and gnuplot installed.

Result

When we moved from the 'open source' VM pool to the 'premium' VM, we did see an improvement.

A few things to note:

the curve went 'down', suggesting that the overall frequency of timeouts dropped (the area under the curve is smaller)
the curve went 'right'
- previously, a build reached a 90% likelihood of failing at minute 34
- now it reaches that point at minute 48
- this suggests that the VM still has resources, is doing work, and can complete before the timeout
the curve got less 'steep', suggesting that the VM makes real progress for the entire time the build is running
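To read a figure like "90% at minute 34" off a curve, one option is to scan the curve for the first minute at which the likelihood reaches the threshold. The sketch below assumes the curve is available as a hash of minute to likelihood; the helper name and the sample values are hypothetical, not taken from the real data.

```ruby
# Return the first minute at which the likelihood reaches `threshold`,
# or nil if the curve never gets there.
# `curve` is assumed to be a Hash of minute => likelihood (0.0 to 1.0).
def first_minute_at_or_above(curve, threshold)
  minute, _likelihood = curve.sort.find { |_, likelihood| likelihood >= threshold }
  minute
end

# Made-up curve, for illustration only.
sample_curve = { 10 => 0.2, 20 => 0.5, 30 => 0.92, 40 => 1.0 }

puts first_minute_at_or_above(sample_curve, 0.9).inspect
```

Running the same lookup on the 'control' and 'upgradedVM' curves gives the two minutes being compared.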

Questions

can we measure the frequency of restarted builds?
is there more resource contention on Fridays (anecdotally, we believe there is)?

Todo

I didn't quite finish setting this up as a Pachyderm pipeline. With the services feature about to land, it would be cool to have the final 'pipeline' be a job hosting the PNGs generated by gnuplot.

sjezewski/travis_build_times
