jenkinsci/docker-swarm-plugin

How to re-use the cache?


Hi,
I'm trying to use the caching feature, which could solve some of the problems I'm facing with Jenkins and Maven.
One of my use cases is the following:
I want to cache the Maven repository between Jenkins jobs to prevent Maven from re-downloading all the third-party libraries.

Jenkins' Maven instances are configured to put the local repository at /home/jenkins/.repository.
I then configure the Jenkins cloud template like this:

Docker Swarm Cloud Configuration
Cache driver name: rapt/cachedriver
Cache Dirs (newline-separated): /home/jenkins/.repository

The plugin has been installed on all nodes, and the folder /cache exists on all the host nodes.
Everything is fine: Maven does its thing and downloads jars & POMs into the /home/jenkins/.repository folder, and after the build the repository gets persisted into /cache/part_of_xxx_»_yyy#zzz.
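I can check this on any host node after a build finishes:

# on a host node, after a build:
ls /cache/          # one directory per build shows up here
docker plugin ls    # the cache volume driver should be listed as enabled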

I've got several questions regarding the reusability of the cache:

  • How do I reuse the cache that has just been persisted by the previous build?
  • What happens if the next job spins up the container on another node? (Will the cache be duplicated? Reused?)

Thanks

I have solved it in a different way.

  • I have an NFS fileserver that shares a directory /srv/shares/m2repo
  • All my cluster nodes mount this share under /mnt/m2repo
  • In my agent template I configured the following host bind: /mnt/m2repo:/home/jenkins/.repository

In this way, all build containers use the NFS share for their repository, so they share it no matter which node the build is executed on.
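Concretely, the setup looks roughly like this (a sketch: the server name fileserver and the export options are illustrative, not my exact config):

# On the NFS server: export the shared Maven repository
# (illustrative /etc/exports entry; tune the options for your environment)
#   /srv/shares/m2repo  *(rw,sync,no_subtree_check)

# On every cluster node: mount the share at the same path
mount -t nfs fileserver:/srv/shares/m2repo /mnt/m2repo

# In the Docker Swarm agent template, bind the mount into the container:
#   Host bind: /mnt/m2repo:/home/jenkins/.repository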

Thanks for your insight. That is indeed a solution I considered and tested (not with an NFS fileserver, but with a simple host bind of the repository), but I would not consider it as good a cache as the one provided by rapt/cachedriver.

The data inside the repository would indeed not be wiped after each job, but I still have one issue:
when multiple instances of Maven hit the same .m2 repository, they try to access the artifacts (or metadata files) at the same time and fail to do so. (Maven is not built to be accessed by multiple processes at the same time.)
This is a use case specified by the plugin: https://github.com/suryagaddipati/docker-cache-volume-plugin#build-caches-in-a-continuous-integration-system
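For context, the generic workaround when builds must not share a live repository is to give each build its own local repository, which is effectively what the cache driver's per-build copy provides (a plain Maven sketch, unrelated to the plugin itself):

# generic mitigation: point each build at its own local repository so two
# Maven processes never mutate the same files; the cost is a cold cache on
# every build, which is exactly what the cache driver is meant to avoid
mvn -B -Dmaven.repo.local="$WORKSPACE/.repository" clean verify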

Hey, I'm trying to use caching too, without any luck, and I'm wondering: does it actually work?

I'm trying to cache the whole Jenkins workspace directory, just to give it a shot before narrowing things down, so I put /tmp/workspace in the cache dir config. I started a build and, after it finished, I could see my workspace on the host in /cache/part_of_XXX#NN/YYY/, where XXX is the job name, NN the build number, and YYY something related to the agent name used. So far, everything seems perfect.

Then I started a new build and it created a new directory /cache/part_of_XXX#MM/, where MM=NN+1 is the new build number. Because it's a new directory, nothing is cached from the previous build. If I understand correctly how rapt/cachedriver works, the volume name should be something like foo-AAA to create /cache/foo. Transposed to this plugin, I assume the volume name is part_of_XXX#NN-YYY, and that's wrong: it must not contain the build number for the cache to work across builds.
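To illustrate the convention I mean, from my reading of the driver's README (an untested sketch):

# untested sketch of rapt/cachedriver's naming convention as I read it:
# the part before the last "-" is the cache key, so this volume should be
# backed by /cache/foo on the host and survive the volume's removal
docker volume create -d rapt/cachedriver foo-build41
docker run --rm -v foo-build41:/data alpine sh -c 'date > /data/stamp'
docker volume rm foo-build41   # /cache/foo should remain on the host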

Maybe I am mistaken?

After some testing, I found that caching works across builds, but only with freestyle jobs (classic jobs), not pipeline jobs.

Here is a cached file (named date) from a freestyle job (this file persists after the build finishes):
<DOCKER_ROOT>/plugins/b65da049a252b87f137302ef4b493c61757e97ed83b11c10d110e4b7f648823e/propagated-mount/test_docker_swarm_freestyle/_swarm_freestyle/date

Here is a cached file from a pipeline job (it exists only while the build is running):
<DOCKER_ROOT>/plugins/b65da049a252b87f137302ef4b493c61757e97ed83b11c10d110e4b7f648823e/propagated-mount/part_of_test_docker_swarm__17/docker_swarm_17/date
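(<DOCKER_ROOT> is /var/lib/docker on a default install, so these files can be located with:)

# find cached files under the cache driver's propagated mount,
# assuming the default docker data root of /var/lib/docker
find /var/lib/docker/plugins -path '*propagated-mount*' -name date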

As @mpnico noticed, BUILD_NUMBER is incorrectly used as part of the volume name, but only for pipeline jobs.

I can look into a PR if @Roemer can confirm that this project is still being maintained.

Side note: I think it would be helpful to provide a way to customize the volume name so that caching is not per-job. For example, in the modern world, I may have 100 jobs for one project via a Multibranch Pipeline (one per branch, PR, etc.), and I may be OK with sharing a cache between them (or some subset).

I use Docker in a pipeline, but my Docker usage is:
stage("build") {
    sh 'docker pull harbor.devops.narwal.com/t1-docker/cross_release_env_t1:latest'
    // run the image that was just pulled (including the registry prefix,
    // otherwise docker run looks for a different local image)
    sh """ docker run -i --rm -v /workspace:/home/root harbor.devops.narwal.com/t1-docker/cross_release_env_t1:latest bash -c "./custom_compile_arm.sh" """
}
How can I reuse the build cache in the next build's Docker container?
I saw sccache; is there some help for my question there?
https://github.com/mozilla/sccache/blob/main/docs/Jenkins.md#sccache-on-jenkins
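From that doc, I understand the pattern would be to keep sccache's cache directory on the host and mount it into every build container, something like this (untested; /var/cache/sccache is just an illustrative host path, and I'm assuming sccache is installed in the image and that the compile script honors CC/CXX):

# persist sccache's on-disk cache across builds by bind-mounting a host
# directory and pointing SCCACHE_DIR at it inside the container
docker run -i --rm \
    -e SCCACHE_DIR=/sccache \
    -v /var/cache/sccache:/sccache \
    -v /workspace:/home/root \
    harbor.devops.narwal.com/t1-docker/cross_release_env_t1:latest \
    bash -c 'export CC="sccache gcc" CXX="sccache g++" && ./custom_compile_arm.sh'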