googleapis/google-cloud-cpp-common

Proposal: cache build artifacts for Kokoro builds

Closed this issue · 3 comments

Problem Statement

Our builds on Kokoro are slow: they rebuild all our code even if nothing has changed, or if the change affects only a single target. This limits the maximum tolerable size of a repository, discourages the creation of more tests (particularly integration tests), and slows down our development cycles.

We would like to speed up the builds, particularly for pull requests.

Overview

For CMake-based builds

CMake can integrate with ccache to avoid rebuilding targets when the dependencies have not changed, even if the timestamps have. The build scripts would download the cache from GCS, and at the end of a successful build we would refresh the cache.
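One common way to wire ccache into a CMake build is the `CMAKE_<LANG>_COMPILER_LAUNCHER` variable. The sketch below only assembles the configure command line; the helper name `cmake_configure_command` and the directory arguments are hypothetical, and the proposal does not say which integration mechanism the build scripts would actually use.

```python
def cmake_configure_command(source_dir, build_dir, use_ccache=True):
    """Build the argument list for a CMake configure step.

    Hypothetical helper: the proposal does not name the mechanism, this
    sketch assumes the CMAKE_<LANG>_COMPILER_LAUNCHER variables.
    """
    # The -S/-B options require CMake 3.13 or newer.
    command = ["cmake", "-S", source_dir, "-B", build_dir]
    if use_ccache:
        # Ask CMake to prefix every C and C++ compile command with ccache.
        command.append("-DCMAKE_C_COMPILER_LAUNCHER=ccache")
        command.append("-DCMAKE_CXX_COMPILER_LAUNCHER=ccache")
    return command


if __name__ == "__main__":
    print(" ".join(cmake_configure_command(".", "cmake-out")))
```

The resulting list can be passed to `subprocess.run` from a build script. Because ccache keys on the preprocessed source and compiler flags rather than file timestamps, a fresh checkout can still hit the downloaded cache.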

For Bazel-based builds

Bazel maintains its own cache, so we can just use that. As with CMake, we would download the cache from GCS, and at the end of a successful build we would refresh the cache.

Details

The initial implementation would just update the cache when the master branch is rebuilt as part of a continuous build.

We would cache both $HOME/.ccache (the ccache cache location) and $HOME/.cache (the Bazel cache location). These directories would be stored as tarballs (tar.gz files) in GCS.
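As a rough illustration of the packing step, the sketch below archives whichever of the two cache directories exist into one `tar.gz` file and restores them later. The function names `pack_cache` and `unpack_cache` are hypothetical, and putting both directories in a single tarball is an assumption; the proposal does not say whether each directory gets its own tarball. Uploading to and downloading from GCS is left out.

```python
import os
import tarfile


def pack_cache(home, tarball_path, subdirs=(".ccache", ".cache")):
    """Archive the cache directories under `home` into a gzip tarball.

    Hypothetical helper. Returns the names of the directories that were
    actually present and packed.
    """
    packed = []
    with tarfile.open(tarball_path, "w:gz") as tar:
        for name in subdirs:
            path = os.path.join(home, name)
            if os.path.isdir(path):
                # Store paths relative to $HOME so the tarball can be
                # restored under a different home directory.
                tar.add(path, arcname=name)
                packed.append(name)
    return packed


def unpack_cache(tarball_path, home):
    """Restore a tarball created by pack_cache under `home`."""
    # The tarball is produced by our own builds, so it is treated as
    # trusted input here.
    with tarfile.open(tarball_path, "r:gz") as tar:
        tar.extractall(home)
```

A build script would call `unpack_cache` after downloading the tarball and `pack_cache` once the build has succeeded.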

We would use ${BUCKET_NAME}/build-cache/${REPOSITORY_NAME}/ as the folder to store the cache tarballs.

The name of the tarball would include the distribution name, the distribution version, and the name of the build. These constitute enough information to make it unique.
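The naming scheme above could look something like the following sketch. The function name `cache_object_url`, the order of the name components, and the `-` separator are all assumptions made for illustration; the proposal only states which pieces of information the name contains.

```python
def cache_object_url(bucket_name, repository_name, distro, distro_version,
                     build_name):
    """Return the GCS URL of the cache tarball for one build configuration.

    Hypothetical helper: the component order and the '-' separator are
    illustrative choices, not part of the proposal.
    """
    tarball = f"{distro}-{distro_version}-{build_name}.tar.gz"
    return f"gs://{bucket_name}/build-cache/{repository_name}/{tarball}"


if __name__ == "__main__":
    # Example values only; the bucket name is made up.
    print(cache_object_url("example-bucket", "google-cloud-cpp-common",
                           "ubuntu", "18.04", "clang-tidy"))
```

Two builds that differ in distribution, version, or build name map to different objects, so they never overwrite each other's cache.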

Future Work

Pull requests sometimes see multiple revisions. Ideally we would cache the results of a previous build on the PR. We may want to create a new GCS folder for each PR (e.g. named after the PR number), try to download the tarball from that folder, and fall back to the folder for master if that fails. On a successful PR build we would upload the new cache contents to the PR folder. This would also require setting up an object lifecycle policy in GCS to clear these folders after N days without any use. Note that these policies are per-bucket and would affect any other files in that bucket.
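The lookup order described here can be sketched as below. The `download` argument is a hypothetical callable that takes a GCS URL and returns `True` on success, and the `pr-<number>` and `master` subfolder names are assumptions; the proposal does not fix the folder layout.

```python
def fetch_cache(download, cache_folder, tarball_name, pr_number=None):
    """Try the per-PR cache first, then fall back to the master cache.

    `download` is a caller-supplied function (hypothetical) that fetches
    one URL and returns True on success. Returns the URL that was
    downloaded, or None if every candidate failed.
    """
    candidates = []
    if pr_number is not None:
        # Assumed layout: one subfolder per pull request.
        candidates.append(f"{cache_folder}/pr-{pr_number}/{tarball_name}")
    # Assumed layout: continuous builds of master write here.
    candidates.append(f"{cache_folder}/master/{tarball_name}")
    for url in candidates:
        if download(url):
            return url
    return None
```

A build with no cache at all (both downloads fail) simply proceeds as a full rebuild, so a missing or expired PR folder costs build time but never breaks the build.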

Alternatives Considered

The tarball name includes the build name, e.g. clang-tidy. From time to time the exact definition of what a clang-tidy build is changes; for example, we might use a new version of the compiler. We could try to capture this information in some sort of SHA and append it to the cache tarball name. It seems better to trust ccache and Bazel to do their caching correctly.

Fixed ci/kokoro/docker/ for both -cpp and -spanner. I am not planning to make more changes about this for a while.

@coryan I'm thinking this issue could be closed. These changes have basically been made in -cpp, and this repo is moving there imminently. If you agree, I'll let you close.

Agree.