quantumlib/qsim

Increased runtime for Docker tests

Closed this issue · 4 comments

Starting with #379 (merged on July 26), Docker tests began taking ~10 minutes longer to complete on average. The likely cause is either that PR or some other change made around the same time. We should investigate whether we can bring the runtime back down, as the slowdown is starting to limit how quickly we can merge PRs.

This appears to be caused by the release of Cirq v0.11.1. Sample logs from PRs before and after the change:

Before (~7 min for Cirq install):

2021-07-21T15:32:46.7853502Z Step 3/9 : RUN pip3 install cirq --force
2021-07-21T15:32:46.8099970Z  ---> Running in 563d9e25a03d
2021-07-21T15:32:48.2058400Z Collecting cirq
{...}
2021-07-21T15:39:28.1570052Z Successfully installed cachetools-4.2.2 certifi-2021.5.30 charset-normalizer-2.0.3 cirq-0.11.0 cirq-core-0.11.0 cirq-google-0.11.0 cycler-0.10.0 google-api-core-1.31.0 google-auth-1.33.1 googleapis-common-protos-1.53.0 grpcio-1.38.1 idna-3.2 kiwisolver-1.3.1 matplotlib-3.4.2 mpmath-1.2.1 networkx-2.6.1 numpy-1.21.1 packaging-21.0 pandas-1.3.0 pillow-8.3.1 protobuf-3.13.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 pyparsing-2.4.7 python-dateutil-2.8.2 pytz-2021.1 requests-2.26.0 rsa-4.7.2 scipy-1.7.0 setuptools-57.4.0 six-1.16.0 sortedcontainers-2.4.0 sympy-1.8 tqdm-4.61.2 typing-extensions-3.10.0.0 urllib3-1.26.6
2021-07-21T15:39:41.3545220Z Removing intermediate container 563d9e25a03d
2021-07-21T15:39:41.3546401Z  ---> 28ff747448f4
2021-07-21T15:39:41.3547021Z Step 4/9 : COPY ./pybind_interface/ /qsim/pybind_interface/

After (~15 min for Cirq install):

2021-07-26T19:44:17.5926968Z Step 3/9 : RUN pip3 install cirq --force
2021-07-26T19:44:17.6139183Z  ---> Running in f9f9b5f04dcb
2021-07-26T19:44:18.7984364Z Collecting cirq
{...}
2021-07-26T19:59:13.0063285Z Successfully installed cachetools-4.2.2 certifi-2021.5.30 charset-normalizer-2.0.3 cirq-0.11.1 cirq-core-0.11.1 cirq-google-0.11.1 cycler-0.10.0 google-api-core-1.31.0 google-auth-1.33.1 googleapis-common-protos-1.53.0 grpcio-1.39.0 idna-3.2 kiwisolver-1.3.1 matplotlib-3.4.2 mpmath-1.2.1 networkx-2.6.1 numpy-1.21.1 packaging-21.0 pandas-1.3.1 pillow-8.3.1 protobuf-3.13.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 pyparsing-2.4.7 python-dateutil-2.8.2 pytz-2021.1 requests-2.26.0 rsa-4.7.2 scipy-1.7.0 setuptools-57.4.0 six-1.16.0 sortedcontainers-2.4.0 sympy-1.8 tqdm-4.61.2 typing-extensions-3.10.0.0 urllib3-1.26.6
2021-07-26T19:59:24.1299382Z Removing intermediate container f9f9b5f04dcb
2021-07-26T19:59:24.1300689Z  ---> ad6147be6d18
2021-07-26T19:59:24.1301565Z Step 4/9 : COPY ./pybind_interface/ /qsim/pybind_interface/

Cirq is installed twice during the Docker tests (once for Docker, and once for the "install test"), so the overall increase of 10-15 minutes roughly matches the ~8-minute per-install difference seen above.
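As a quick check of that arithmetic, here is a minimal sketch that computes the duration of the Cirq install step from the timestamps in the two log excerpts above. The helper names (parse_ts, step_minutes) are made up for illustration, and the "Successfully installed" lines are shortened to their timestamps since only those matter here.

```python
from datetime import datetime


def parse_ts(line: str) -> datetime:
    """Parse the leading timestamp of a CI log line (hypothetical helper)."""
    stamp = line.split(" ", 1)[0]  # e.g. 2021-07-21T15:32:46.7853502Z
    head, frac = stamp.rstrip("Z").split(".")
    # The logs carry 7 fractional digits; %f accepts at most 6, so trim.
    return datetime.strptime(f"{head}.{frac[:6]}", "%Y-%m-%dT%H:%M:%S.%f")


def step_minutes(start_line: str, end_line: str) -> float:
    """Minutes elapsed between two log lines."""
    return (parse_ts(end_line) - parse_ts(start_line)).total_seconds() / 60


# Start of "Step 3/9" and the matching "Successfully installed" line.
before = step_minutes(
    "2021-07-21T15:32:46.7853502Z Step 3/9 : RUN pip3 install cirq --force",
    "2021-07-21T15:39:28.1570052Z Successfully installed (shortened)",
)
after = step_minutes(
    "2021-07-26T19:44:17.5926968Z Step 3/9 : RUN pip3 install cirq --force",
    "2021-07-26T19:59:13.0063285Z Successfully installed (shortened)",
)

extra_per_install = after - before
print(f"before: {before:.1f} min, after: {after:.1f} min")
print(f"extra per install: {extra_per_install:.1f} min")
print(f"extra for two installs: {2 * extra_per_install:.1f} min")
```

The two-install total comes out slightly above the 10-15 minute range, which is consistent with "roughly matches" given that these are single samples.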

Specifically, most of the time in these tests is lost to building wheels:

  • A grpcio wheel is built in both the old and new logs, but grpcio is only required by cirq-google.
  • A pandas wheel is built only in the new logs, even though pandas has been a Cirq dependency for a long time.

This suggests a couple of things:

  • We can skip the grpcio wheel by only installing cirq-core, saving some time.
  • Cirq 0.11.1 may have introduced compatibility issues with pandas, as 0.11.0 didn't need to build the wheel.

I am unable to reproduce the wheel-building step locally, even with the pip cache disabled. My system consistently retrieves the wheel files for grpcio, pandas, etc., whereas these Docker tests instead pick up tarballs and build them.

It might be possible to avoid this with pip's --prefer-binary flag, which makes pip prefer binary packages over source packages even when the source packages are newer.

--prefer-binary works! This reduces the runtime of Docker tests enough that they are no longer the bottleneck; instead, "Build all wheels for testing" is the longest-running check at ~30 minutes for the Windows build.
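For reference, a minimal sketch of what the flag looks like in an install command. The thread does not show the exact change that was made, so the helper below (pip_install_command) is hypothetical; in the Dockerfile the equivalent would presumably be adding --prefer-binary to the existing pip3 install line.

```python
import shlex
import sys


def pip_install_command(packages, prefer_binary=True):
    """Build a pip install command line (hypothetical helper, not from the repo)."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if prefer_binary:
        # Prefer wheels over source tarballs, even if the tarball is newer.
        cmd.append("--prefer-binary")
    cmd.extend(packages)
    return cmd


cmd = pip_install_command(["cirq"])
print(shlex.join(cmd))
```

To actually run the install, the list can be passed to subprocess.run(cmd, check=True). Swapping "cirq" for "cirq-core" would also apply the earlier suggestion of skipping the grpcio wheel, assuming nothing in the tests needs cirq-google.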