Bug: k8s auto upgrade fails when using an all-in-one hub with http (i.e. without https)
Closed this issue · 1 comments
dlarson04 commented
Describe the bug.
Setup: all-in-one hub with http (not https)
Push one version into the hub (e.g. 2.31.0-1534)
Push a newer version to the hub (e.g. 2.31.0-1540)
Install agent 2.31.0-1534 on an edge cluster
Create NMP to upgrade to 2.31.0-1540
Agent logs
I0617 23:31:14.178271 15 worker.go:353] CommandDispatcher: NodeManagement command processor blocking for commands
I0617 23:31:14.188107 15 node_management_status.go:40] Putting node management policy status for node myorg/my-edge-agent and policy MyNmpFor1540. Status is: AgentUpgrade: ScheduledTime: 2024-06-17T23:26:06Z, ActualStartTime: 2024-06-17T23:26:31Z, CompletionTime: , UpgradedVersions: SoftwareVersion: 2.31.0-1540, CertVersion: , ConfigVersion: 1.0.0, Status: initiated, K8S: <nil>, ErrorMessage: , BaseWorkingDirectory: /var/horizon/nmp, AgentUpgradeInternal: <nil>.
I0617 23:31:14.188338 15 rpc.go:94] Exchange RPC Invoking exchange PUT at http://9.46.84.245:3090/v1/orgs/myorg/nodes/my-edge-agent/managementStatus/MyNmpFor1540 with AgentUpgrade: ScheduledTime: 2024-06-17T23:26:06Z, ActualStartTime: 2024-06-17T23:26:31Z, CompletionTime: , UpgradedVersions: SoftwareVersion: 2.31.0-1540, CertVersion: , ConfigVersion: 1.0.0, Status: initiated, K8S: <nil>, ErrorMessage: , BaseWorkingDirectory: , AgentUpgradeInternal: <nil>
I0617 23:31:14.220752 15 cluster_upgrade_worker.go:603] Cluster upgrade worker: reading in agent config file: /var/horizon/nmp/myorg/MyNmpFor1540/agent-install.cfg
I0617 23:31:14.220912 15 cluster_install_files.go:53] Cluster upgrade worker: get HZN_EXCHANGE_URL=http://9.46.84.245:3090/v1
I0617 23:31:14.220927 15 cluster_install_files.go:53] Cluster upgrade worker: get HZN_FSS_CSSURL=http://9.46.84.245:9443/
I0617 23:31:14.220954 15 cluster_install_files.go:53] Cluster upgrade worker: get HZN_AGBOT_URL=http://9.46.84.245:3111
I0617 23:31:14.220963 15 cluster_install_files.go:53] Cluster upgrade worker: get HZN_FDO_SVC_URL=http://9.46.84.245:9008/api
I0617 23:31:14.220971 15 cluster_install_files.go:53] Cluster upgrade worker: get AGENT_NAMESPACE=openhorizon-agent
I0617 23:31:14.220984 15 cluster_install_files.go:53] Cluster upgrade worker: get HZN_CONFIG_VERSION=1.0.0
I0617 23:31:14.221012 15 kubeClient.go:62] Cluster upgrade worker: Read configmap value openhorizon-agent-config under agent namespace my-edge
I0617 23:31:14.221050 15 kubeClient.go:55] Cluster upgrade worker: Get configmap openhorizon-agent-config under agent namespace my-edge
I0617 23:31:14.237151 15 kubeClient.go:100] Cluster upgrade worker: In configmap openhorizon-agent-config find HZN_EXCHANGE_URL=http://9.46.84.245:3090/v1
I0617 23:31:14.237180 15 kubeClient.go:100] Cluster upgrade worker: In configmap openhorizon-agent-config find HZN_FSS_CSSURL=http://9.46.84.245:9443/
I0617 23:31:14.237188 15 kubeClient.go:100] Cluster upgrade worker: In configmap openhorizon-agent-config find HZN_AGBOT_URL=http://9.46.84.245:3111
I0617 23:31:14.237195 15 kubeClient.go:100] Cluster upgrade worker: In configmap openhorizon-agent-config find HZN_FDO_SVC_URL=http://9.46.84.245:9008/api
I0617 23:31:14.237205 15 kubeClient.go:100] Cluster upgrade worker: In configmap openhorizon-agent-config find HZN_DEVICE_ID=my-edge-agent
I0617 23:31:14.237212 15 kubeClient.go:100] Cluster upgrade worker: In configmap openhorizon-agent-config find HZN_NODE_ID=my-edge-agent
I0617 23:31:14.237218 15 kubeClient.go:100] Cluster upgrade worker: In configmap openhorizon-agent-config find HZN_AGENT_PORT=8510
I0617 23:31:14.237225 15 kubeClient.go:100] Cluster upgrade worker: In configmap openhorizon-agent-config find HZN_CONFIG_VERSION=
I0617 23:31:14.237235 15 cluster_upgrade_worker.go:637] Cluster upgrade worker: agent install config is same: false
I0617 23:31:14.240257 15 cluster_install_files.go:239] Cluster upgrade worker: configmap.needChange is set to true in status file
I0617 23:31:14.240280 15 cluster_upgrade_worker.go:645] Cluster upgrade worker: reading in agent cert file: /var/horizon/nmp/myorg/MyNmpFor1540/agent-install.crt
I0617 23:31:14.240304 15 cluster_upgrade_worker.go:443] Cluster upgrade worker: configIsSame: false, certIsSame: true, will need to validate config and cert for nmp myorg/MyNmpFor1540
E0617 23:31:14.240400 15 cluster_upgrade_worker.go:459] Cluster upgrade worker: Failed to validate exchangeURL and/or cert for nmp: myorg/MyNmpFor1540, error: open /etc/default/cert/agent-install.crt: no such file or directory
I0617 23:31:14.240415 15 cluster_upgrade_worker.go:259] Cluster upgrade worker: Set status to precheck failed in db and status file for nmp myorg/MyNmpFor1540
I0617 23:31:14.242427 15 node_management_status.go:15] Saving nmp status AgentUpgrade: ScheduledTime: 2024-06-17T23:26:06Z, ActualStartTime: 2024-06-17T23:26:31Z, CompletionTime: , UpgradedVersions: SoftwareVersion: 2.31.0-1540, CertVersion: , ConfigVersion: 1.0.0, Status: precheck failed, K8S: <nil>, ErrorMessage: Failed to validate exchangeURL and/or cert for nmp: myorg/MyNmpFor1540, error: open /etc/default/cert/agent-install.crt: no such file or directory, BaseWorkingDirectory: /var/horizon/nmp, AgentUpgradeInternal: AllowDowngrade: false, Manifest: IBM/edgeNodeFiles_manifest_2.31.0-1540, ScheduledUnixTime: 2024-06-17 23:26:06 +0000 UTC, LatestMap: SoftwareLatest: false, ConfigLatest: false, CertLatest: false
I0617 23:31:14.249051 15 node_management_status.go:40] Putting node management policy status for node myorg/my-edge-agent and policy MyNmpFor1540. Status is: AgentUpgrade: ScheduledTime: 2024-06-17T23:26:06Z, ActualStartTime: 2024-06-17T23:26:31Z, CompletionTime: , UpgradedVersions: SoftwareVersion: 2.31.0-1540, CertVersion: , ConfigVersion: 1.0.0, Status: precheck failed, K8S: <nil>, ErrorMessage: Failed to validate exchangeURL and/or cert for nmp: myorg/MyNmpFor1540, error: open /etc/default/cert/agent-install.crt: no such file or directory, BaseWorkingDirectory: /var/horizon/nmp, AgentUpgradeInternal: <nil>.
I0617 23:31:14.249207 15 rpc.go:94] Exchange RPC Invoking exchange PUT at http://9.46.84.245:3090/v1/orgs/myorg/nodes/my-edge-agent/managementStatus/MyNmpFor1540 with AgentUpgrade: ScheduledTime: 2024-06-17T23:26:06Z, ActualStartTime: 2024-06-17T23:26:31Z, CompletionTime: , UpgradedVersions: SoftwareVersion: 2.31.0-1540, CertVersion: , ConfigVersion: 1.0.0, Status: precheck failed, K8S: <nil>, ErrorMessage: Failed to validate exchangeURL and/or cert for nmp: myorg/MyNmpFor1540, error: open /etc/default/cert/agent-install.crt: no such file or directory, BaseWorkingDirectory: , AgentUpgradeInternal: <nil>
I0617 23:31:14.289990 15 cluster_upgrade_worker.go:286] Cluster upgrade worker: Status is updated to AgentUpgrade: ScheduledTime: 2024-06-17T23:26:06Z, ActualStartTime: 2024-06-17T23:26:31Z, CompletionTime: , UpgradedVersions: SoftwareVersion: 2.31.0-1540, CertVersion: , ConfigVersion: 1.0.0, Status: precheck failed, K8S: <nil>, ErrorMessage: Failed to validate exchangeURL and/or cert for nmp: myorg/MyNmpFor1540, error: open /etc/default/cert/agent-install.crt: no such file or directory, BaseWorkingDirectory: /var/horizon/nmp, AgentUpgradeInternal: AllowDowngrade: false, Manifest: IBM/edgeNodeFiles_manifest_2.31.0-1540, ScheduledUnixTime: 2024-06-17 23:26:06 +0000 UTC, LatestMap: SoftwareLatest: false, ConfigLatest: false, CertLatest: false for nmp myorg/MyNmpFor1540
I0617 23:31:14.290041 15 worker.go:325] CommandDispatcher: ClusterUpgrade handled command (*clusterupgrade.ClusterUpgradeCommand)
I0617 23:31:14.290061 15 worker.go:353] CommandDispatcher: ClusterUpgrade command processor blocking for commands
Since this hub uses http, the cert check should be skipped when the NMP only performs a software update or a config update.
I would expect a cert update to simply be ignored in that case.
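To make the suggestion concrete, here is a minimal Go sketch of a scheme check that could gate the cert validation. It is not the agent's actual code: `needsCertValidation` is a hypothetical helper name, and `https://hub.example.com/v1` is a made-up URL used only for contrast with the http exchange URL from the logs.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// needsCertValidation is a hypothetical helper (not part of the agent code).
// It reports whether the exchange URL uses https, which is the only case
// where validating agent-install.crt against the exchange would make sense.
func needsCertValidation(exchangeURL string) (bool, error) {
	u, err := url.Parse(exchangeURL)
	if err != nil {
		return false, fmt.Errorf("parse exchange URL %q: %w", exchangeURL, err)
	}
	return strings.EqualFold(u.Scheme, "https"), nil
}

func main() {
	urls := []string{
		"http://9.46.84.245:3090/v1", // HZN_EXCHANGE_URL from the logs above
		"https://hub.example.com/v1", // assumed https hub, for comparison only
	}
	for _, raw := range urls {
		need, err := needsCertValidation(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> validate cert: %t\n", raw, need)
	}
}
```

With a check like this, the precheck would only try to open the cert file when the exchange is reached over https. Whether that is the right behavior when the manifest itself carries a cert is a separate question.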
Describe the steps to reproduce the behavior.
No response
Expected behavior.
No response
Screenshots.
No response
Operating Environment
Linux
Additional Information
No response
dlarson04 commented
Closing this. The manifest had to upgrade the certificate, which was invalid in this case.