fullstaq-ruby/server-edition

Use sccache instead of ccache

FooBarWidget opened this issue · 1 comments

We currently use ccache to cache compilation results, but the GitHub Actions cache is tiny: only 2 GB. We should use sccache, which is like ccache but stores its cache in cloud storage. This way the cache size is effectively unlimited.

  • We will use a different cache per distribution build target. However, all jobs on all branches, as long as they target the same distribution, share the same cache.
  • The cache is to be located in a new bucket gs://fullstaq-ruby-server-edition-ci-cache/sccache/<distro> on Azure Blob Storage. Objects in the storage container will use the key prefix sccache/<distro>. This requires mozilla/sccache#1109.
  • The storage container will use lifecycle rules to automatically delete objects that haven't been accessed for 90 days (daysAfterLastAccessTimeGreaterThan). This requires hashicorp/terraform-provider-azurerm#15407.
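The per-distribution cache layout above could be wired up roughly as follows. This is a minimal sketch, not the project's actual CI code: the container name `ci-cache` and the helper `sccache_env` are hypothetical, and the three `SCCACHE_AZURE_*` variable names are taken from my reading of sccache's Azure documentation (the key prefix being the feature requested in mozilla/sccache#1109).

```python
# Hypothetical container name; the issue does not name the Azure container.
CONTAINER = "ci-cache"


def sccache_env(distro, connection_string):
    """Return the environment variables for an sccache run that targets one distro.

    All jobs building for the same distro get the same key prefix, so they
    share one cache regardless of branch.
    """
    return {
        "SCCACHE_AZURE_CONNECTION_STRING": connection_string,
        "SCCACHE_AZURE_BLOB_CONTAINER": CONTAINER,
        # Assumed to scope all cache objects under sccache/<distro>.
        "SCCACHE_AZURE_KEY_PREFIX": f"sccache/{distro}",
    }


if __name__ == "__main__":
    # Placeholder connection string, for illustration only.
    for name, value in sccache_env("centos-8", "<connection string>").items():
        print(f"{name}={value}")
```

A job would merge this dictionary into its environment before invoking the compiler through sccache.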

On second thought, storing the cache on Google Cloud Storage is not a good idea. GitHub Actions runners are hosted on Azure, and it looks like the latencies between Google Cloud and Azure are abysmal. We should store the cache on Azure Blob Storage instead, in the west US location.

An added benefit of Azure Blob Storage is that we get to use lifecycle management rules to automatically delete objects that haven't been accessed for a given period. Google Cloud Storage's lifecycle management only allows deleting objects based on age since creation, not age since last access.
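For illustration, a last-access-based deletion rule can be expressed as an Azure Storage lifecycle management policy document. The sketch below builds one as a plain Python dictionary; the rule name, the helper `lifecycle_policy`, and the `ci-cache/sccache/` prefix are hypothetical, and it assumes last access time tracking is enabled on the storage account, which as far as I know is a prerequisite for this condition.

```python
import json


def lifecycle_policy(prefix, days=90):
    """Build a lifecycle policy that deletes blobs not accessed for `days` days."""
    return {
        "rules": [
            {
                "enabled": True,
                "name": "expire-unused-sccache-objects",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    # prefixMatch entries start with the container name.
                    "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
                    "actions": {
                        "baseBlob": {
                            "delete": {"daysAfterLastAccessTimeGreaterThan": days}
                        }
                    },
                },
            }
        ]
    }


print(json.dumps(lifecycle_policy("ci-cache/sccache/"), indent=2))
```

The same structure is what a Terraform resource would ultimately need to express, which is why support for this condition in the provider matters.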


I ran a benchmark on GitHub Actions in which I compiled Jemalloc in three ways:

  1. Without sccache
  2. Sccache backed by Google Cloud Storage
  3. Sccache backed by Azure Blob Storage

I ran make with a concurrency of 2. Jemalloc was compiled twice: the first time against an empty cache (the cold run), then a second time against a full cache (the hot run).
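A cold/hot measurement of this kind can be sketched as below. This is an assumption about the procedure, not the script that produced the numbers: the helper names are hypothetical, and it assumes the build outputs are cleaned between the two runs while the sccache storage is left untouched.

```python
import subprocess
import time


def timed_run(cmd, cwd=None):
    """Run a command to completion and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, cwd=cwd, check=True)
    return time.perf_counter() - start


def cold_and_hot(build_cmd, clean_cmd, cwd=None):
    """Time the same build twice, removing build outputs before each run.

    The first run populates the compiler cache (cold); the second run
    should be served from it (hot).
    """
    subprocess.run(clean_cmd, cwd=cwd, check=True)
    cold = timed_run(build_cmd, cwd)
    subprocess.run(clean_cmd, cwd=cwd, check=True)
    hot = timed_run(build_cmd, cwd)
    return cold, hot
```

For example, `cold_and_hot(["make", "-j2"], ["make", "clean"], cwd="jemalloc")` would time two builds in a source tree that was configured with the compiler wrapped by sccache.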

The run times were as follows:

Without sccache

  • Run 1:
    • Runner location: Tappahannock, Virginia (52.224.111.185)
    • Cold run: 19s
    • Hot run: 18s
  • Run 2:
    • Runner location: Washington state (20.69.111.165)
    • Cold run: 15s
    • Hot run: 14s
  • Run 3:
    • Runner location: Washington state (20.80.185.87)
    • Cold run: 18s
    • Hot run: 17s
  • Run 4:
    • Runner location: Tappahannock, Virginia (40.71.187.160)
    • Cold run: 12s
    • Hot run: 12s
  • Run 5:
    • Runner location: Washington state (20.94.195.54)
    • Cold run: 15s
    • Hot run: 13s

Average: cold=15.8, hot=14.8

Google Cloud Storage

Bucket location: us-east4 (Ashburn, Virginia)

  • Run 1:
    • Runner location: Tappahannock, Virginia (137.117.97.31)
    • Cold run: 18s
    • Hot run: 16s
  • Run 2:
    • Runner location: Washington state (40.91.105.134)
    • Cold run: 24s
    • Hot run: 23s
  • Run 3:
    • Runner location: Washington state (52.191.132.176)
    • Cold run: 22s
    • Hot run: 20s
  • Run 4:
    • Runner location: Des Moines, Iowa (40.122.163.69)
    • Cold run: 19s
    • Hot run: 18s
  • Run 5:
    • Runner location: Washington state (52.137.116.25)
    • Cold run: 25s
    • Hot run: 23s

Average: cold=21.6, hot=20.0

Azure Blob Storage

Container location: East US 2 (Virginia)

  • Run 1:
    • Runner location: Washington state (52.229.9.182)
    • Cold run: 24s
    • Hot run: 10s
  • Run 2:
    • Runner location: Washington state (52.183.80.160)
    • Cold run: 23s
    • Hot run: 10s
  • Run 3:
    • Runner location: Des Moines, Iowa (13.67.160.59)
    • Cold run: 19s
    • Hot run: 7s
  • Run 4:
    • Runner location: Washington state (20.114.23.7)
    • Cold run: 23s
    • Hot run: 11s
  • Run 5:
    • Runner location: Tappahannock, Virginia (20.120.36.65)
    • Cold run: 21s
    • Hot run: 7s
  • Run 6:
    • Runner location: Tappahannock, Virginia (20.127.110.110)
    • Cold run: 22s
    • Hot run: 8s

Average: cold=22.0, hot=8.8
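The averages can be recomputed directly from the per-run figures listed above. The sketch below does that and also reports each configuration's change relative to the no-sccache baseline; the dictionary keys and helper names are my own.

```python
from statistics import mean

# (cold, hot) wall-clock seconds per run, copied from the lists above.
RUNS = {
    "no sccache": [(19, 18), (15, 14), (18, 17), (12, 12), (15, 13)],
    "gcs": [(18, 16), (24, 23), (22, 20), (19, 18), (25, 23)],
    "azure": [(24, 10), (23, 10), (19, 7), (23, 11), (21, 7), (22, 8)],
}


def averages(runs):
    """Return (average cold seconds, average hot seconds) for a list of runs."""
    cold = mean(c for c, _ in runs)
    hot = mean(h for _, h in runs)
    return cold, hot


def percent_change(value, baseline):
    """Signed change of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100


base_cold, base_hot = averages(RUNS["no sccache"])
for name, runs in RUNS.items():
    cold, hot = averages(runs)
    print(
        f"{name}: cold={cold:.1f}s ({percent_change(cold, base_cold):+.0f}%), "
        f"hot={hot:.1f}s ({percent_change(hot, base_hot):+.0f}%)"
    )
```

A positive percentage means the configuration took longer than the baseline, a negative one means it took less time.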

Analysis

  • Cold compilation times are significantly slower (roughly 37 to 39%) when sccache is used.

    • Cold compilation times are similar between Google Cloud Storage and Azure Blob Storage.
  • Using Google Cloud Storage slows down hot compilation: about 35% slower on average! There is no point in using Google Cloud Storage.

    • The only exception is when the GitHub Actions runner is located in Virginia (same location as the Google Cloud Storage bucket). In this case, hot compilation time is nearly the same as when sccache is not used. But it still doesn't make things faster.
  • Using Azure Blob Storage speeds up hot compilation: it takes about 40% less time on average.

    • Latency between west US (Washington state) and east US 2 (Virginia) is low. Hot compilation times in Washington state are only slightly slower than hot compilation times in Virginia, so there is still an overall speedup.
  • The runners' geographic distributions are as follows:

    • 9x Washington state (west US)
    • 5x Tappahannock, Virginia (east US)
    • 2x Des Moines, Iowa (central US)

    Thus it makes most sense to place the storage container in either west US (because runners are most likely to be in Washington state) or central US (balanced latency between east and west US).
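To see how much the runner's region matters for the Azure-backed cache, the hot-run times listed above can be grouped by runner location. This is a small sketch over the six Azure runs only; the helper name is my own, and with so few samples the per-region averages are indicative at best.

```python
from collections import defaultdict
from statistics import mean

# (runner region, hot-run seconds) for the Azure Blob Storage runs above.
AZURE_HOT = [
    ("Washington state", 10),
    ("Washington state", 10),
    ("Des Moines, Iowa", 7),
    ("Washington state", 11),
    ("Tappahannock, Virginia", 7),
    ("Tappahannock, Virginia", 8),
]


def hot_by_region(samples):
    """Average hot-run time per runner region."""
    grouped = defaultdict(list)
    for region, seconds in samples:
        grouped[region].append(seconds)
    return {region: mean(times) for region, times in grouped.items()}


for region, avg in hot_by_region(AZURE_HOT).items():
    print(f"{region}: {avg:.1f}s")
```

Comparing these per-region averages against the no-sccache hot average shows whether a runner far from the storage container still benefits from the cache.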