Customisation script and files in `/home/runner`

Question

eliandoran opened this issue 10 months ago · 7 comments

Hi,

I'm trying to set up a cluster using GitHub Runner Controller (gha-runner-scale-set), using Kubernetes mode.

By default GitHub Runner Controller uses the official ghcr.io/actions/actions-runner:latest image, but I find it quite lacking (not even git is available on it), so I tried this image instead.

It's working just fine, except when I try to run a GitHub Docker action where it will fail with:

##[debug]System.Exception: Executing the custom container implementation failed. Please contact your self hosted runner administrator.
##[debug] ---> System.IO.FileNotFoundException: File not found at '/home/runner/k8s/index.js'. Set ACTIONS_RUNNER_CONTAINER_HOOKS to the path of an existing file.
##[debug]   at GitHub.Runner.Worker.Container.ContainerHooks.ContainerHookManager.ValidateHookExecutable()
##[debug]   at GitHub.Runner.Worker.Container.ContainerHooks.ContainerHookManager.ExecuteHookScript[T](IExecutionContext context, HookInput input, ActionRunStage stage, String prependPath)
##[debug]   --- End of inner exception stack trace ---
##[debug]   at GitHub.Runner.Worker.Container.ContainerHooks.ContainerHookManager.ExecuteHookScript[T](IExecutionContext context, HookInput input, ActionRunStage stage, String prependPath)
##[debug]   at GitHub.Runner.Worker.Container.ContainerHooks.ContainerHookManager.RunContainerStepAsync(IExecutionContext context, ContainerInfo container, String dockerFile)
##[debug]   at GitHub.Runner.Worker.Handlers.ContainerActionHandler.RunAsync(ActionRunStage stage)
##[debug]   at GitHub.Runner.Worker.ActionRunner.RunAsync()
##[debug]   at GitHub.Runner.Worker.StepsRunner.RunStepAsync(IStep step, CancellationToken jobCancellationToken)

I've tracked this error a bit and I found in the GitHub Actions docs there is a description of this:

The custom script must be located on the runner, but should not be stored in the self-hosted runner application directory (that is, the directory into which you downloaded and unpacked the runner software). The scripts are executed in the security context of the service account that's running the runner service.

Note: The triggered script is processed synchronously, so it will block job execution while running.

The script is automatically executed when the runner has the following environment variable containing an absolute path to the script:

ACTIONS_RUNNER_CONTAINER_HOOKS: The script defined in this environment variable is triggered when a job has been assigned to a runner, but before the job starts running.

I managed to look at the official image using docker run -it --entrypoint="/bin/bash" ghcr.io/actions/actions-runner:latest and I found out that /home/runner is populated with a few files, including the /home/runner/k8s/index.js that I presumably need in order to run containers in Kubernetes mode.
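The file check behind the error above can be reproduced by hand. This is a minimal sketch, assuming a POSIX shell is available inside the image; `check_container_hook` is a hypothetical helper name and not part of the runner.

```shell
# Hypothetical helper: reports whether the container hook script exists.
# Takes the path as its first argument, otherwise falls back to the
# ACTIONS_RUNNER_CONTAINER_HOOKS environment variable.
check_container_hook() {
  hook_path="${1:-${ACTIONS_RUNNER_CONTAINER_HOOKS:-}}"
  if [ -z "$hook_path" ]; then
    echo "no hook path given and ACTIONS_RUNNER_CONTAINER_HOOKS is unset" >&2
    return 2
  fi
  if [ -f "$hook_path" ]; then
    echo "hook found: $hook_path"
  else
    echo "hook missing: $hook_path" >&2
    return 1
  fi
}
```

Inside a container started with the `docker run` command above, `check_container_hook /home/runner/k8s/index.js` can be used to compare the official image with a custom one.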

What is your opinion, should this k8s container hook be part of this Docker image? What about the other files that are available in /home/runner?

Answer 1 · 2024-02-05T19:55:30.000Z

If people find that this image would suit the actions scaler then that's fine but it's not built that way

I'm happy to review a PR as a gut check, but I don't currently have any opinion other than the one I stated previously: I don't wish to directly tie this to any orchestrator (k8s, nomad, etc)

I don't mind adding in the functionality but don't wish to start that work myself. I'd think that a new set of workflow scripts that build off the current artifacts and add a /k8s or similar might be a decent middle ground

Answer 2 · 2024-02-05T20:06:22.000Z

@myoung34 ,

Thank you for your prompt response and I think it's a fair take.

I'll have a look to see what I can do and if I can find more information about this script.

Answer 3 · 2024-05-22T20:23:30.000Z

Hi @eliandoran, roughly 4 months have passed. Did you get this beautiful image to work with the actions runner controller?

I tried to use it myself where I used this as my base image. In my Dockerfile I did add the runner-container-hooks binary.
I used the installation that is used in the official image:
https://github.com/actions/runner/blob/main/images%2FDockerfile#L20

Then I did set the group to docker with chgrp -R, as is the case with the official image too.
And then in the values.yaml file of the runner-scale-set, I point to the myoung image that I built upon, and there I set all the environment variables that are required for myoung.
But I also set the ACTIONS_RUNNER_CONTAINER_HOOKS which in my case points to /actions-runner/k8s/index.js
This is also done in the original chart's values file.

And as you mentioned, I too use containerMode.type: "kubernetes".
And my error is not the same as yours, it is equal to the one mentioned in this issue #349.
As said in that issue, that's related with Docker stuff, but with the official image it does work (it spins up a separate container suffixed with workflow).
Also, the image works for all non-container jobs.

I also tried setting securityContext.fsGroup to 121, the group ID defined here: https://github.com/myoung34/docker-github-actions-runner/blob/master/Dockerfile.base#L90.

And then I also tried adding volumes for the working directory but without success 😢

So, still curious how you solved it.

Maybe we can set an env var ACTIONS_RUNNER_CONTROLLER so that these are installed only when people need it and for backwards compatibility.
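One possible reading of that proposal, as a sketch only: `ACTIONS_RUNNER_CONTROLLER` is the variable name suggested above and is not known to be implemented in any image. The function name, the `HOOKS_DIR` variable, and the default version and target directory are assumptions for illustration; the download URL pattern follows the runner-container-hooks release naming.

```shell
# Hypothetical opt-in installer: downloads the k8s container hooks only when
# ACTIONS_RUNNER_CONTROLLER is set to "true", so existing users are unaffected.
install_k8s_hooks_if_enabled() {
  if [ "${ACTIONS_RUNNER_CONTROLLER:-false}" != "true" ]; then
    echo "ACTIONS_RUNNER_CONTROLLER is not 'true'; skipping k8s hooks"
    return 0
  fi
  # Assumed defaults; override via environment variables as needed.
  hooks_version="${RUNNER_CONTAINER_HOOKS_VERSION:-0.6.1}"
  target_dir="${HOOKS_DIR:-/actions-runner/k8s}"
  archive="$(mktemp)"
  curl -f -L -o "$archive" \
    "https://github.com/actions/runner-container-hooks/releases/download/v${hooks_version}/actions-runner-hooks-k8s-${hooks_version}.zip" &&
    unzip -o "$archive" -d "$target_dir" &&
    rm -f "$archive"
}
```

Whether such a step would belong at image build time or in the entrypoint is left open by the proposal; the sketch works in either place as long as `curl` and `unzip` are present.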

Answer 4 · 2024-05-23T07:51:30.000Z

@VincentVerweij , on my side I ended up trying a fork of the official image which is compatible with gha-runner-scale-set: eliandoran/runner@cba275f

It worked to bypass the issue I was having, but I ended up abandoning it since it did not work for my use case:

Error: Error: Building container actions is not currently supported
Error: Process completed with exit code 1.
Error: Executing the custom container implementation failed. Please contact your self hosted runner administrator.

Answer 5 · 2024-06-23T02:21:27.000Z

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

Answer 6 · 2024-06-29T02:11:57.000Z

This issue was closed because it has been stalled for 5 days with no activity.

Answer 7 · 2024-07-05T07:57:18.000Z

I managed to get this working for Actions Runner Controller.

The Dockerfile I am using looks as follows:

# The base image's source can be found here:
# https://github.com/myoung34/docker-github-actions-runner
FROM myoung34/github-runner:2.317.0-ubuntu-jammy

# Copied from https://github.com/actions/runner/blob/70746ff593636b07ad251a1525a3fabd1a7a36e9/images/Dockerfile#L37-L40
ENV DEBIAN_FRONTEND=noninteractive
ENV RUNNER_MANUALLY_TRAP_SIG=1
ENV ACTIONS_RUNNER_PRINT_LOG_TO_STDOUT=1
ENV ImageOS=ubuntu22

# Install necessary packages for the customer
RUN apt-get update && apt-get install --no-install-recommends -y \
        vim \
        git && \
        # Install Azure CLI
        mkdir -p /etc/apt/keyrings && \
        curl -sLS https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor | tee /etc/apt/keyrings/microsoft.gpg > /dev/null && \
        chmod go+r /etc/apt/keyrings/microsoft.gpg && \
        AZ_DIST=$(lsb_release -cs) && \
        echo "deb [arch=`dpkg --print-architecture` signed-by=/etc/apt/keyrings/microsoft.gpg] https://packages.microsoft.com/repos/azure-cli/ $AZ_DIST main" \
              | tee /etc/apt/sources.list.d/azure-cli.list && \
        AZ_VERSION=2.61.0 && \
        apt-get update && \
        apt-get install --no-install-recommends -y azure-cli=$AZ_VERSION-1~$AZ_DIST && \
        az aks install-cli --client-version "1.28.5" && \
        # Install Actions Runner Controller - Container Hooks which is used for containermode Kubernetes
        # Versions can be found at https://github.com/actions/runner-container-hooks/tags
        RUNNER_CONTAINER_HOOKS_VERSION=0.6.1 && \
        pushd /actions-runner && \
        # Lines below taken from https://github.com/actions/runner/blob/main/images/Dockerfile#L20
        curl -f -L -o runner-container-hooks.zip https://github.com/actions/runner-container-hooks/releases/download/v${RUNNER_CONTAINER_HOOKS_VERSION}/actions-runner-hooks-k8s-${RUNNER_CONTAINER_HOOKS_VERSION}.zip && \
        unzip ./runner-container-hooks.zip -d ./k8s && \
        rm runner-container-hooks.zip && \
        # Change this dir and file to the docker group, as is the case with the default GitHub runner image
        chgrp -R docker ./k8s && \
        # Go back to the initial directory before we changed to '/actions-runner'
        popd

CMD ["./bin/Runner.Listener", "run", "--startuptype", "service"]

I am not sure whether the chgrp -R docker ./k8s is required, as it was added during my troubleshooting, so it might be possible to remove it.
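A quick way to confirm the hooks landed where expected, and to see which group owns them, is to build the image and list the file. This is a sketch under assumptions: the tag `github-runner:arc-test` and the build context `.` are placeholders, and `build_and_verify` is a hypothetical helper name.

```shell
# Placeholder tag; replace with the registry/tag you actually push to.
IMAGE_TAG="${IMAGE_TAG:-github-runner:arc-test}"
# Path the Dockerfile above unzips the hooks into.
HOOK_PATH="/actions-runner/k8s/index.js"

# Builds the image from the current directory, then lists the hook file
# with its owner and group by overriding the image entrypoint with ls.
build_and_verify() {
  docker build -t "$IMAGE_TAG" . &&
    docker run --rm --entrypoint ls "$IMAGE_TAG" -l "$HOOK_PATH"
}
```

Building once with and once without the `chgrp` line and comparing the `ls -l` output shows what the group change actually does; whether ARC needs it is still something to test on the cluster.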

For the ARC controller I did not alter anything and just took this values file: https://github.com/actions/actions-runner-controller/blob/master/charts/gha-runner-scale-set-controller/values.yaml

For the runner-scale-set, I had to modify some things.

# Comes originally from: https://github.com/actions/actions-runner-controller/blob/master/charts/gha-runner-scale-set/values.yaml
githubConfigUrl: "https://github.com/your-org"
githubConfigSecret: some-secret-that-exists-in-my-k8s
minRunners: 1
maxRunners: 5
# !!! Must be created in GitHub before the runner scale set is created !!!
runnerGroup: "arc-runner-group"
containerMode:
  type: "kubernetes"
  kubernetesModeWorkVolumeClaim:
    accessModes: ["ReadWriteOnce"]
    # Got this by running 'kubectl get storageclass -A'
    storageClassName: "default"
    resources:
      requests:
        storage: 1Gi
  kubernetesModeServiceAccount:
    annotations:

template:
  spec:
    securityContext:
      # This points to the group ID that is created in the myoung image
      # In there, in the Dockerfile.base, we have a groupadd -g 121 runner
      # so, the group below should point to that same group ID.
      fsGroup: 121
    containers:
      - name: runner
        image: someacr.azurecr.io/github-runner:2.317.0-somehash
        env:
          # This environment variable is set to true by default.
          # If it is set to true and the workflow job does not have a container section,
          # it will fail to run the job.
          # If it is set to false, the job will run even without a container section.
          - name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
            value: "false"
          # For our custom image, the container hooks are installed in a different directory than /home/runner/k8s/index.js
          # hence we need to set this environment variable to the correct path that corresponds to the image.
          # If not set, we get the following in debug lines:
          # ##[debug] ---> System.IO.FileNotFoundException: File not found at '/home/runner/k8s/index.js'. Set ACTIONS_RUNNER_CONTAINER_HOOKS to the path of an existing file.
          - name: ACTIONS_RUNNER_CONTAINER_HOOKS
            value: "/actions-runner/k8s/index.js"
          # The environment variables under this comment are specifically used for the myoung image that we are using
          - name: ORG_NAME
            value: "your-org"
          - name: RUNNER_SCOPE
            value: "org"
          - name: APP_ID
            valueFrom:
              secretKeyRef:
                name: some-secret-that-exists-in-my-k8s
                key: github_app_id
          - name: APP_PRIVATE_KEY
            valueFrom:
              secretKeyRef:
                name: some-secret-that-exists-in-my-k8s
                key: github_app_private_key
          - name: EPHEMERAL
            value: "true"
          # Required to be 'true' otherwise Actions Runner Controller will run one-by-one (even though multiple pods are created on the cluster)
          - name: RUN_AS_ROOT
            value: "true"
          - name: DISABLE_AUTO_UPDATE
            value: "true"
        ## The following volumeMounts and volumes have to be added, otherwise the following error is thrown during job execution:
        ## ---> System.Exception: The hook script at '/actions-runner/k8s/index.js' running command 'RunScriptStep' did not execute successfully
        volumeMounts:
          - name: work
            mountPath: /_work
    volumes:
      - name: work
        ephemeral:
          volumeClaimTemplate:
            spec:
              accessModes: [ "ReadWriteOnce" ]
              # Got this from running 'kubectl get storageclass -A'
              storageClassName: "default"
              resources:
                requests:
                  storage: 1Gi

I tried to add as many comments as possible, for my future self but also for whoever reads it here.

By using these I was able to get myoung's image working within ARC. It scales properly and it runs container jobs due to the use of hooks.
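For completeness, a values file like the one above is typically applied with Helm. This is a minimal sketch, assuming Helm 3.8 or newer (for OCI charts); the installation name `arc-runner-set`, the namespace `arc-runners`, and the file name `values.yaml` are placeholders, and `install_runner_scale_set` is a hypothetical wrapper.

```shell
# Placeholder names; adjust to your cluster conventions.
INSTALLATION_NAME="${INSTALLATION_NAME:-arc-runner-set}"
NAMESPACE="${NAMESPACE:-arc-runners}"
VALUES_FILE="${VALUES_FILE:-values.yaml}"

# Installs (or upgrades) the runner scale set chart using the values file.
install_runner_scale_set() {
  helm upgrade --install "$INSTALLATION_NAME" \
    --namespace "$NAMESPACE" --create-namespace \
    -f "$VALUES_FILE" \
    oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set
}
```

Workflows would then target the scale set with `runs-on: arc-runner-set`, since the Helm installation name doubles as the runner label unless `runnerScaleSetName` is set in the values file.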