Help debugging runner deployment "Operation cannot be fulfilled..."
brian-pickens opened this issue · 5 comments
Hey @davidkarlsen,
I appreciate that you are so active in this project. Its been a very useful tool for us. Recently after trying to update my runner image which is a custom image based on your runner (just to install a few extra things), I am now unable to get a runner to start. I've verified that the image is being pulled properly. Its not a runner startup issue as the operator seems to be unable to even start a runner. The following logs is the only error I have come across. I've tried destroying the entire deployment and namespace and rebuilding everything including using my original image, but it won't spin up any new images. One recent change that may or may not be relevant is we just enabled SSO for our org but I don't know if or how that would effect a github app. I did regenerate and apply a new app certificate to rule that out.
2022-04-04T12:06:39.631Z ERROR util.api unable to update status {"error": "Operation cannot be fulfilled on githubactionrunners.garo.tietoevry.com \"runner-pool\": the object has been modified; please apply your changes to the latest version and try again"}
Here is the full log
2022-04-04T12:06:21.807Z INFO controller-runtime.metrics metrics server is starting to listen {"addr": ":8080"}
2022-04-04T12:06:21.809Z INFO setup starting manager
I0404 12:06:21.810499 33 leaderelection.go:248] attempting to acquire leader lease actions-runner/4ef9cd91.tietoevry.com...
2022-04-04T12:06:21.810Z INFO starting metrics server {"path": "/metrics"}
I0404 12:06:21.827065 33 leaderelection.go:258] successfully acquired lease actions-runner/4ef9cd91.tietoevry.com
2022-04-04T12:06:21.827Z DEBUG events Normal {"object": {"kind":"ConfigMap","namespace":"actions-runner","name":"4ef9cd91.tietoevry.com","uid":"00b457a5-2ab3-45a8-96ca-3ac1c998feab","apiVersion":"v1","resourceVersion":"37026813"}, "reason": "LeaderElection", "message": "github-actions-runner-operator-bf669486f-xht8h_7e4e34ca-f5e4-4e68-8898-974ba7c1c90a became leader"}
2022-04-04T12:06:21.827Z INFO controller.githubactionrunner Starting EventSource {"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner", "source": "kind source: /, Kind="}
2022-04-04T12:06:21.827Z INFO controller.githubactionrunner Starting EventSource {"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner", "source": "kind source: /, Kind="}
2022-04-04T12:06:21.827Z INFO controller.githubactionrunner Starting EventSource {"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner", "source": "kind source: /, Kind="}
2022-04-04T12:06:21.827Z INFO controller.githubactionrunner Starting Controller {"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner"}
2022-04-04T12:06:21.929Z INFO controller.githubactionrunner Starting workers {"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner", "worker count": 1}
2022-04-04T12:06:38.603Z INFO controllers.GithubActionRunner Reconciling GithubActionRunner {"githubactionrunner": "actions-runner/runner-pool"}
2022-04-04T12:06:39.130Z INFO controllers.GithubActionRunner Registration secret not found, creating {"githubactionrunner": "actions-runner/runner-pool"}
2022-04-04T12:06:39.371Z INFO controllers.GithubActionRunner Pods and runner API not in sync, returning earl{"githubactionrunner": "actions-runner/runner-pool"}
2022-04-04T12:06:39.389Z INFO controllers.GithubActionRunner Reconciling GithubActionRunner {"githubactionrunner": "actions-runner/runner-pool"}
2022-04-04T12:06:39.621Z INFO controllers.GithubActionRunner Pods and runner API not in sync, returning earl{"githubactionrunner": "actions-runner/runner-pool"}
2022-04-04T12:06:39.631Z ERROR util.api unable to update status {"error": "Operation cannot be fulfilled on githubactionrunners.garo.tietoevry.com \"runner-pool\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/evryfs/github-actions-runner-operator/controllers.(*GithubActionRunnerReconciler).manageOutcome
/workspace/controllers/githubactionrunner_controller.go:181
github.com/evryfs/github-actions-runner-operator/controllers.(*GithubActionRunnerReconciler).handleScaling
/workspace/controllers/githubactionrunner_controller.go:122
github.com/evryfs/github-actions-runner-operator/controllers.(*GithubActionRunnerReconciler).Reconcile
/workspace/controllers/githubactionrunner_controller.go:97
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.2/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.2/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.2/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.2/pkg/internal/controller/controller.go:227
2022-04-04T12:06:39.631Z ERROR controller.githubactionrunner Reconciler error {"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner", "name": "runner-pool", "namespace": "actions-runner", "error": "Operation cannot be fulfilled on githubactionrunners.garo.tietoevry.com \"runner-pool\": the object has been modified; please apply your changes to the latest version and try again"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.2/pkg/internal/controller/controller.go:227
2022-04-04T12:06:39.637Z INFO controllers.GithubActionRunner Reconciling GithubActionRunner {"githubactionrunner": "actions-runner/runner-pool"}
2022-04-04T12:06:39.868Z INFO controllers.GithubActionRunner Pods and runner API not in sync, returning earl{"githubactionrunner": "actions-runner/runner-pool"}
2022-04-04T12:07:39.884Z INFO controllers.GithubActionRunner Reconciling GithubActionRunner {"githubactionrunner": "actions-runner/runner-pool"}
Here is the describe on my runner pool
Name: runner-pool
Namespace: actions-runner
Labels: <none>
Annotations: <none>
API Version: garo.tietoevry.com/v1alpha1
Kind: GithubActionRunner
Metadata:
Creation Timestamp: 2022-04-04T12:06:38Z
Generation: 2
Managed Fields:
API Version: garo.tietoevry.com/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:kubectl.kubernetes.io/last-applied-configuration:
f:spec:
.:
f:deletionOrder:
f:maxRunners:
f:minRunners:
f:minTtl:
f:organization:
f:podTemplateSpec:
.:
f:metadata:
.:
f:annotations:
.:
f:prometheus.io/port:
f:prometheus.io/scrape:
f:sensor.falcon-system.crowdstrike.com/injection:
f:spec:
.:
f:affinity:
.:
f:podAntiAffinity:
.:
f:preferredDuringSchedulingIgnoredDuringExecution:
f:containers:
f:volumes:
f:reconciliationPeriod:
Manager: kubectl-client-side-apply
Operation: Update
Time: 2022-04-04T12:06:38Z
API Version: garo.tietoevry.com/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:conditions:
.:
k:{"type":"ReconcileSuccess"}:
.:
f:lastTransitionTime:
f:message:
f:observedGeneration:
f:reason:
f:status:
f:type:
f:currentSize:
Manager: manager
Operation: Update
Time: 2022-04-04T12:06:39Z
Resource Version: 37028843
UID: 96a6197c-6da3-438d-b424-c0104b3f21bd
Spec:
Deletion Order: LeastRecent
Max Runners: 8
Min Runners: 2
Min Ttl: 0m
Organization: foo
Pod Template Spec:
Metadata:
Annotations:
prometheus.io/port: 3903
prometheus.io/scrape: true
sensor.falcon-system.crowdstrike.com/injection: disabled
Spec:
Affinity:
Pod Anti Affinity:
Preferred During Scheduling Ignored During Execution:
Pod Affinity Term:
Label Selector:
Match Expressions:
Key: foo.com/pool
Operator: In
Values:
runner-pool
Topology Key: kubernetes.io/hostname
Weight: 100
Containers:
Env:
Name: ACTIONS_RUNNER_INPUT_LABELS
Value: nonprod
Name: DOCKER_TLS_CERTDIR
Value: /certs
Name: DOCKER_HOST
Value: tcp://localhost:2376
Name: DOCKER_TLS_VERIFY
Value: 1
Name: DOCKER_CERT_PATH
Value: /certs/client
Name: GH_ORG
Value: fooorg
Env From:
Secret Ref:
Name: runner-pool-regtoken
Image: myrepo/github-actions-runner:latest
Image Pull Policy: Always
Name: runner
Resources:
Volume Mounts:
Mount Path: /certs
Name: docker-certs
Mount Path: /home/runner/_diag
Name: runner-diag
Mount Path: /home/runner/_work
Name: runner-work
Args:
--mtu=1430
Env:
Name: DOCKER_TLS_CERTDIR
Value: /certs
Image: docker:stable-dind
Image Pull Policy: Always
Name: docker
Resources:
Security Context:
Privileged: true
Volume Mounts:
Mount Path: /var/lib/docker
Name: docker-storage
Mount Path: /certs
Name: docker-certs
Mount Path: /home/runner/_work
Name: runner-work
Image: quay.io/evryfs/github-actions-runner-metrics:v0.0.3
Name: exporter
Ports:
Container Port: 3903
Protocol: TCP
Volume Mounts:
Mount Path: /_diag
Name: runner-diag
Read Only: true
Volumes:
Empty Dir:
Name: runner-work
Empty Dir:
Name: runner-diag
Empty Dir:
Name: mvn-repo
Empty Dir:
Name: docker-storage
Empty Dir:
Name: docker-certs
Reconciliation Period: 1m
Status:
Conditions:
Last Transition Time: 2022-04-04T12:15:42Z
Message:
Observed Generation: 2
Reason: LastReconcileCycleSucceded
Status: True
Type: ReconcileSuccess
Current Size: 0
Events: <none>
Another thing, I have two runner operators running in different clusters. A 'prod' instance, and a 'nonprod' instance. Could it be that it is picking up the runner from the other operator? How would it differentiate between them?
the problem is this:
2022-04-04T12:06:39.868Z INFO controllers.GithubActionRunner Pods and runner API not in sync, returning earl{"githubactionrunner": "actions-runner/runner-pool"}
check for broken runners in the GH UI and forcefully delete them.
AFAIR the pools you create in the two clusters need to have different names.
I appreciate the quick follow up. Here are my current runners. The first runner is from our 'prod' pool. The other three runners are just running locally from my workstation so our CI isn't completely down while I'm troubleshooting this issue. Those three extra runners don't seem to make a difference if they are there either way.
Well what do you know. I renamed my runner pool to something more specific and now the runners are starting up.
Closing issue since this was resolved. Feel free to open another issue if anything needs to be fixed.