jenkinsci/nomad-plugin

RFC: job-specific workers configuration

multani opened this issue · 12 comments

Disclaimer: I'm not sure where to ask this, hopefully that's the right place.

We are currently using Jenkins like the following:

  • we have a Docker image, containing a lot of development tools, built to act as a Jenkins worker
  • our jobs are using Jenkinsfile-based configuration
  • they declare something like:
node("nomad-4GB") {
...
}
  • they expect there's a nomad-4GB Nomad slave template configured, with the correct version of the Docker image to run the job with.

In order to give our developers more flexibility over the worker configuration and to track more closely the changes in their requirements, I would like to be able to:

  1. set resource limits and/or other kinds of constraints differently on different jobs, so that each job can be properly scheduled onto the machines and use resources accordingly:

    • most of our jobs are using less than 4 GB of memory
    • we have C++ jobs using up to 15GB of memory (they need a dedicated slave template for that)
    • we also have small jobs using ~100 MB of memory
  2. give the developers the opportunity to specify the environment in which their jobs get executed (we are running our jobs inside Docker containers.):

    • right now, we have a single image containing C++, Java, Python, Javascript development tools
    • it's too big, difficult to update without breaking all the other jobs
    • it doesn't give enough flexibility to our developers (recently, our JS devs wanted a very recent package, which required updating the base image of this worker image, which had a lot of consequences for the C++ devs)

We could already solve that today by creating a Nomad slave template for every need, using labels as a kind of versioning mechanism. We would have something like:

  • nomad-js-dev-14 configured with 2 GB of memory + Docker image internal/jenkins-worker-js:14
  • nomad-cpp-dev-5 configured with 20 GB of memory + Docker image internal/jenkins-worker-cpp:5
  • etc.

That could work, but it has several disadvantages:

  • the configuration screen is "difficult" to reach (you need to open Jenkins > Manage > System Settings + scroll down at the bottom)
  • the documentation is ... sparse :)
  • there are a lot of fields to fill in, although (our) developers would probably only be interested in the memory + CPU reservation and the Docker image to use as the base image
  • there's no versioning of the templates. Since it's difficult to manage these templates (per the above points), developers would probably just change the Docker image for nomad-js-dev-14, and the next build might fail even though nothing has changed in the project being built.

Something that would be cool to have, although I'm not sure:

  1. if it's possible to do :)
  2. if that makes any sense at all

would be to be able to specify my build as follows:

node("nomad-worker", image: "internal/jenkins-worker-js:14", memory_resource: "4GB") {
    // build steps
}

Behind the scenes, Jenkins would send a new Nomad job to the scheduler, based on a template whose placeholders would be filled in with the extra parameters specified in the node() call.
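
To make the placeholder idea concrete, here is a minimal sketch in Python (chosen only for brevity; it is not how any existing plugin works). The function names `parse_memory_mb` and `render_worker_job` are hypothetical, the job skeleton is deliberately incomplete, and the field names are assumed to follow Nomad's JSON job format:

```python
# Hypothetical mapping from the units used in the proposed node() call to MB.
_UNITS_MB = {"MB": 1, "GB": 1024}


def parse_memory_mb(value):
    """Convert a string such as "4GB" or "100MB" to megabytes."""
    text = value.strip().upper()
    for suffix, factor in _UNITS_MB.items():
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * factor)
    raise ValueError(f"unsupported memory value: {value!r}")


def render_worker_job(job_id, image, memory_resource):
    """Fill a Nomad job skeleton with the per-build parameters.

    The skeleton is an assumption for illustration: a real job would also
    need datacenters, a CPU reservation, the agent start command, etc.
    """
    return {
        "Job": {
            "ID": job_id,
            "Name": job_id,
            "Type": "batch",
            "TaskGroups": [
                {
                    "Name": "jenkins-worker",
                    "Count": 1,
                    "Tasks": [
                        {
                            "Name": "worker",
                            "Driver": "docker",
                            # Placeholder filled from node(image: ...)
                            "Config": {"image": image},
                            # Placeholder filled from node(memory_resource: ...)
                            "Resources": {
                                "MemoryMB": parse_memory_mb(memory_resource)
                            },
                        }
                    ],
                }
            ],
        }
    }
```

Calling `render_worker_job("jenkins-build-1", "internal/jenkins-worker-js:14", "4GB")` would produce a job whose single task uses that image with a 4096 MB memory reservation; the resulting dictionary could then be serialized to JSON and submitted to the scheduler.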

As far as I know, no Jenkins plugin does something like this. But it would give both the flexibility and the traceability that other CI tools such as Bitbucket Pipelines or Travis offer for these specific settings.

What do you guys think?

jippi commented

That would be pretty cool!

I'm probably going to add a template text area with a full job spec, and maybe rework it to support task groups rather than separate jobs per slave, since I want to be able to specify constraints like "only one slave per instance", something that isn't possible with the current setup.

jippi commented

> You mean: you want to have a single job instance in Nomad for Jenkins jobs, but a new build scheduled in Jenkins would create a new task group in the job?

Yep :)

jippi commented

> If you know any prior art, I'd be interested to have a look :D

I don't. I've never written any kind of Jenkins code before, and it's been years since I've done Java; it's a miracle the local changes I've made at $job didn't blow everything up :P

The Kubernetes plugin is doing something similar in this field: https://github.com/jenkinsci/kubernetes-plugin/blob/master/README.md

I especially like that example, it's very similar to what I had in mind.

Exactly. Just wanted to mention the Kubernetes plugin. I agree this would provide much more flexibility. And the current way of manually managing multiple slave templates sucks. Wish I had time to look into implementing this, but I don't see a chance of touching the Nomad plugin any time soon. Maybe worth trying to get some HashiCorp folks on board.

jippi commented

I've moved away from the Jenkins Nomad plugin and gone with a simple Nomad job instead: in my case, 3 static, persistent Jenkins slaves.

The names of the slaves follow the pattern slave-$index, where $index is $NOMAD_ALLOC_INDEX in the job spec.
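
As a rough illustration of that naming scheme (a sketch, not the actual job spec; the helper `agent_name` and its `prefix` parameter are made up here), the agent name can be derived from the `NOMAD_ALLOC_INDEX` variable that Nomad sets for each allocation:

```python
import os


def agent_name(env=None, prefix="slave-"):
    """Build the Jenkins agent name from Nomad's allocation index.

    `env` defaults to the process environment; passing a dict makes the
    function easy to try outside of a Nomad allocation.
    """
    env = os.environ if env is None else env
    return f"{prefix}{env['NOMAD_ALLOC_INDEX']}"
```

With a group count of 3, this should yield slave-0, slave-1 and slave-2, assuming the index runs from 0 to count - 1.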

It's way more reliable, and Jenkins is fine with a node coming and going dynamically

@jippi, what were the reliability issues that you were seeing with the plugin?

(FWIW, I can schedule jobs with a standalone job similar to yours, but doing it via the Nomad plugin seems to fail (Jenkins 2.77-alpine container) without any kind of log, except "This agent is offline because Jenkins failed to launch the agent process on it".)

For those who are following this thread: I've been working on a completely new plugin, inspired by (cough hacked from cough) the Kubernetes plugin, which allows Jenkins workers to be defined as Nomad jobs from a Jenkins pipeline definition.

Basically, it lets you write something like this (simpler things can be written as well):

def label = "job-${UUID.randomUUID().toString()}"

nomadJobTemplate(
    label: label,
    taskGroups: [
      taskTemplate(
        name: 'jnlp',
        image: 'jenkins/jnlp-slave:alpine',
        command: 'sh',
        args: ['-c', 'java -jar /local/slave.jar -jnlpUrl $JENKINS_JNLP_URL -secret $JENKINS_SECRET']
      )
    ]
) {
    node(label) {
        stage("Run shell command") {
            echo "Hello world!"
        }
    }
}

It's more or less working at this stage. Actually, it's working pretty well :)

I'm going to test this more thoroughly internally in the coming weeks (it's still eaaarrly stage and there are still a few bugs I need to fix), polish the syntax, fix all the packaging stuff and I'll publish a public version. I'll post an update here when it's done.

See https://github.com/multani/nomad-pipeline