evryfs/github-actions-runner-operator

"A runner exists with the same name" in runner logs


Greetings,
This is the same issue as #368: after a node restart, the runner pool pod does not start. Here are the logs:

$ kubectl logs runner-pool-pod-9rmfv -c runner -n github-actions-runner-operator 

# Runner removal

Cannot connect to server, because config files are missing. Skipping removing runner from the server.
Does not exist. Skipping Removing .credentials
Does not exist. Skipping Removing .runner


--------------------------------------------------------------------------------
|        ____ _ _   _   _       _          _        _   _                      |
|       / ___(_) |_| | | |_   _| |__      / \   ___| |_(_) ___  _ __  ___      |
|      | |  _| | __| |_| | | | | '_ \    / _ \ / __| __| |/ _ \| '_ \/ __|     |
|      | |_| | | |_|  _  | |_| | |_) |  / ___ \ (__| |_| | (_) | | | \__ \     |
|       \____|_|\__|_| |_|\__,_|_.__/  /_/   \_\___|\__|_|\___/|_| |_|___/     |
|                                                                              |
|                       Self-hosted runner registration                        |
|                                                                              |
--------------------------------------------------------------------------------

# Authentication


√ Connected to GitHub

# Runner Registration




A runner exists with the same name
A runner exists with the same name runner-pool-pod-9rmfv.

Indeed, the Runners section of the GitHub repo does list a runner with the same name, in the "Offline" state.

Here's the CR:

apiVersion: garo.tietoevry.com/v1alpha1
kind: GithubActionRunner
metadata:
  name: runner-pool
  namespace: github-actions-runner-operator
spec:
  minRunners: 1
  maxRunners: 6
  organization: myorg
  reconciliationPeriod: 1m
  repository: "myrepo"
  podTemplateSpec:
    metadata:
      annotations:
        "prometheus.io/scrape": "true"
        "prometheus.io/port": "3903"
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchExpressions:
                    - key: garo.tietoevry.com/pool
                      operator: In
                      values:
                        - runner-pool
      containers:
        - name: runner
          env:
            - name: RUNNER_DEBUG
              value: "true"
            - name: DOCKER_TLS_CERTDIR
              value: /certs
            - name: DOCKER_HOST
              value: tcp://localhost:2376
            - name: DOCKER_TLS_VERIFY
              value: "1"
            - name: DOCKER_CERT_PATH
              value: /certs/client
            - name: GH_ORG
              value: myorg
          #if runner for repo:
            - name: GH_REPO
              value: myrepo
          envFrom:
            - secretRef:
                name: runner-pool-regtoken
          # find the fixed-in-time tags at https://quay.io/repository/evryfs/github-actions-runner?tab=tags if you want to avoid pulling on a moving tag
          # due to https://github.com/actions/runner/issues/246 the runner sw needs to be recent
          # you can subscribe to release-feeds at https://github.com/evryfs/github-actions-runner/releases.atom
          image: quay.io/evryfs/github-actions-runner:master
          imagePullPolicy: Always
          resources: {}
          volumeMounts:
            - mountPath: /certs
              name: docker-certs
            - mountPath: /home/runner/_diag
              name: runner-diag
            - mountPath: /home/runner/_work
              name: runner-work
        - name: docker
          env:
            - name: DOCKER_TLS_CERTDIR
              value: /certs
          image: docker:stable-dind
          imagePullPolicy: Always
          args:
            # See linked issues from: https://github.com/evryfs/github-actions-runner-operator/issues/39
            - --mtu=1430
          resources: {}
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /var/lib/docker
              name: docker-storage
            - mountPath: /certs
              name: docker-certs
            - mountPath: /home/runner/_work
              name: runner-work
        - name: exporter
          image: quay.io/evryfs/github-actions-runner-metrics:v0.0.6
          ports:
            - containerPort: 3903
              protocol: TCP
          volumeMounts:
            - name: runner-diag
              mountPath: /_diag
              readOnly: true
      volumes:
        - name: runner-work
          emptyDir: {}
        - name: runner-diag
          emptyDir: {}
        - name: docker-storage
          emptyDir: {}
        - name: docker-certs
          emptyDir: {}

I installed the operator via the Helm chart:

helm install github-actions-runner-operator evryfs-oss/github-actions-runner-operator --namespace github-actions-runner-operator --set githubapp.existingSecret=github-runner-app --set githubapp.enabled=true

If I delete the runner from GitHub and then delete the runner-pool pod, the pod is recreated and works normally. However, my cluster node restarts daily, so this manual cleanup is not a viable solution for me. Can this be fixed?
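For reference, the manual workaround above could be scripted roughly as follows. This is only a sketch under stated assumptions, not something the operator provides: it assumes a token with permission to manage the repository's self-hosted runners, uses the GitHub REST endpoints for listing and deleting repository runners, and assumes `kubectl` is on the PATH. The function names (`offline_runner_ids`, `github`, `cleanup`) are hypothetical helpers made up for illustration.

```python
import json
import subprocess
import urllib.request

API = "https://api.github.com"


def offline_runner_ids(payload, name):
    """Return IDs of runners in a 'list runners' response that have the
    given name and are reported as offline."""
    return [
        runner["id"]
        for runner in payload.get("runners", [])
        if runner.get("name") == name and runner.get("status") == "offline"
    ]


def github(method, path, token):
    """Send one authenticated request to the GitHub REST API.
    Returns the decoded JSON body, or None when the body is empty."""
    request = urllib.request.Request(
        API + path,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        body = response.read()
    return json.loads(body) if body else None


def cleanup(owner, repo, pod, namespace, token):
    """Remove the stale offline runner named after the pod, then delete
    the pod so it gets recreated and can register again."""
    # Only the first page (up to 100 runners) is inspected in this sketch.
    runners = github(
        "GET", f"/repos/{owner}/{repo}/actions/runners?per_page=100", token
    )
    for runner_id in offline_runner_ids(runners, pod):
        github("DELETE", f"/repos/{owner}/{repo}/actions/runners/{runner_id}", token)
    subprocess.run(["kubectl", "delete", "pod", pod, "-n", namespace], check=True)
```

With the values from this report, the call would be `cleanup("myorg", "myrepo", "runner-pool-pod-9rmfv", "github-actions-runner-operator", token)`. Running this after every node restart is still a workaround; ideally the operator or the runner image would handle the stale registration itself.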