(nucleus): greengrass system service cannot launch
Closed this issue · 3 comments
Describe the bug
The greengrass system service stuck in an activating state and kept exiting.
To Reproduce
- Create a deployment that has only one component (zipped artifact 20MB ) and that component only has
Install
script to run. - After the deployment is sent to an on-premise device but before it starts to execute bootstraps, I ran the following commands
sudo systemctl restart greengrass.service
andsudo systemctl stop greengrass.service
3 times. - The greengrass system service stuck in an
activating
state.
burro@burro-8-883:~$ sudo systemctl status greengrass.service
● greengrass.service - Greengrass Core
Loaded: loaded (/etc/systemd/system/greengrass.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since Tue 2022-10-18 18:25:27 EDT; 1s ago
Process: 13146 ExecStart=/bin/sh /greengrass/v2/alts/current/distro/bin/loader (code=exited, status=127)
Main PID: 13146 (code=exited, status=127)
- After rebooting, the greengrass system service still cannot launch.
- In the end, I need to follow the instrruction to uninstall the greengrass service, delete the greengrass IOT device/core device on the cloud, then reinstall the greengrass service back.
Expected behavior
The greengrass service should be alive after rebooting the computer
Actual behavior
The greengrass system service is stuck in an activating state
Environment
- OS: Ubuntu 20.04
- JDK version: 8
- Nucleus version: probably 2.5.x ( not sure )
Additional context
The log I was looking into what happened to the greengrass service
burro@burro-8-883:~$ sudo tail -50 /greengrass/v2/logs/greengrass.log
2022-10-18T22:05:08.770Z [INFO] (TelemetryAgent-lifecycle) com.aws.greengrass.telemetry.TelemetryAgent: service-set-state. {serviceName=TelemetryAgent, currentState=STOPPING, newState=FINISHED}
2022-10-18T22:05:09.631Z [INFO] (DeploymentService-lifecycle) com.aws.greengrass.deployment.DeploymentService: Stopping backingTask start. {serviceName=DeploymentService, currentState=STOPPING}
2022-10-18T22:05:09.631Z [INFO] (DeploymentService-lifecycle) com.aws.greengrass.deployment.DeploymentService: service-set-state. {serviceName=DeploymentService, currentState=STOPPING, newState=FINISHED}
2022-10-18T22:05:09.631Z [WARN] (pool-2-thread-14) com.aws.greengrass.deployment.DeploymentService: service-run-interrupted. Service interrupted while running startup. {serviceName=DeploymentService, currentState=STOPPING}
2022-10-18T22:05:09.640Z [INFO] (Serialized listener processor) com.aws.greengrass.lifecyclemanager.GenericExternalService: service-config-change. Requesting restart for component. {configNode=services.burro-configurator.lifecycle.Bootstrap.Script, serviceName=burro-configurator, currentState=FINISHED}
2022-10-18T22:05:09.640Z [INFO] (pool-2-thread-4) com.aws.greengrass.deployment.DeploymentDirectoryManager: Persist configuration snapshot. {file=/greengrass/v2/deployments/50d5cf45-d057-4c58-8292-9b5e5fcbe796/target_config.tlog}
2022-10-18T22:05:09.643Z [INFO] (Serialized listener processor) com.aws.greengrass.lifecyclemanager.GenericExternalService: service-config-change. Requesting reinstallation for component. {configNode=services.burro-configurator.version, serviceName=burro-configurator, currentState=FINISHED}
2022-10-18T22:05:09.646Z [INFO] (Serialized listener processor) com.aws.greengrass.lifecyclemanager.GenericExternalService: service-config-change. Requesting restart for component. {configNode=services.main.lifecycle, serviceName=main, currentState=FINISHED}
2022-10-18T22:05:09.663Z [INFO] (pool-2-thread-4) com.aws.greengrass.deployment.bootstrap.BootstrapManager: Saving bootstrap task list to file. {filePath=/greengrass/v2/deployments/50d5cf45-d057-4c58-8292-9b5e5fcbe796/bootstrap_task.json}
2022-10-18T22:05:09.667Z [INFO] (pool-2-thread-4) com.aws.greengrass.deployment.bootstrap.BootstrapManager: Bootstrap task list is saved to file. {filePath=/greengrass/v2/deployments/50d5cf45-d057-4c58-8292-9b5e5fcbe796/bootstrap_task.json}
2022-10-18T22:05:09.667Z [INFO] (pool-2-thread-4) com.aws.greengrass.lifecyclemanager.KernelAlternatives: Setting up launch directory for new Nucleus. {}
2022-10-18T22:05:09.669Z [INFO] (pool-2-thread-4) com.aws.greengrass.lifecyclemanager.KernelAlternatives: Finished setup of launch directory for new Nucleus. {}
2022-10-18T22:05:09.669Z [INFO] (pool-2-thread-4) com.aws.greengrass.deployment.bootstrap.BootstrapManager: Execute component bootstrap step. {componentName=burro-configurator}
2022-10-18T22:05:27.085Z [WARN] (Thread-1) com.aws.greengrass.lifecyclemanager.Kernel: Shutting down Nucleus due to external signal. {}
2022-10-18T22:05:27.085Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: system-shutdown. {main=services.main:FINISHED}
2022-10-18T22:05:27.087Z [INFO] (pool-2-thread-4) com.aws.greengrass.deployment.bootstrap.BootstrapManager: Saving bootstrap task list to file. {filePath=/greengrass/v2/deployments/50d5cf45-d057-4c58-8292-9b5e5fcbe796/bootstrap_task.json}
2022-10-18T22:05:27.097Z [INFO] (pool-2-thread-4) com.aws.greengrass.deployment.bootstrap.BootstrapManager: Bootstrap task list is saved to file. {filePath=/greengrass/v2/deployments/50d5cf45-d057-4c58-8292-9b5e5fcbe796/bootstrap_task.json}
2022-10-18T22:05:27.097Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.Kernel: effective-config-dump-complete. {file=/greengrass/v2/config/effectiveConfig.yaml}
2022-10-18T22:05:27.097Z [INFO] (pool-2-thread-20) com.aws.greengrass.status.FleetStatusService: service-close. Service is now closing. {serviceName=FleetStatusService, currentState=FINISHED}
2022-10-18T22:05:27.097Z [INFO] (pool-2-thread-9) com.aws.greengrass.lifecyclemanager.UpdateSystemPolicyService: service-close. Service is now closing. {serviceName=UpdateSystemPolicyService, currentState=FINISHED}
2022-10-18T22:05:27.097Z [INFO] (pool-2-thread-14) com.aws.greengrass.lifecyclemanager.GenericExternalService: service-close. Service is now closing. {serviceName=aws.greengrass.Nucleus, currentState=FINISHED}
2022-10-18T22:05:27.097Z [INFO] (pool-2-thread-16) com.aws.greengrass.lifecyclemanager.GenericExternalService: service-close. Service is now closing. {serviceName=main, currentState=FINISHED}
2022-10-18T22:05:27.097Z [INFO] (pool-2-thread-2) com.aws.greengrass.deployment.DeploymentService: service-close. Service is now closing. {serviceName=DeploymentService, currentState=FINISHED}
2022-10-18T22:05:27.097Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: Waiting for services to shutdown. {}
2022-10-18T22:05:27.097Z [INFO] (pool-2-thread-9) com.aws.greengrass.lifecyclemanager.GenericExternalService: service-close. Service is now closing. {serviceName=burro-configurator, currentState=FINISHED}
2022-10-18T22:05:27.097Z [INFO] (pool-2-thread-14) com.aws.greengrass.telemetry.TelemetryAgent: service-close. Service is now closing. {serviceName=TelemetryAgent, currentState=FINISHED}
2022-10-18T22:05:27.097Z [INFO] (pool-2-thread-4) com.aws.greengrass.deployment.activator.DeploymentActivator: merge-config. Rolling back failed deployment. {deploymentId=50d5cf45-d057-4c58-8292-9b5e5fcbe796}
com.aws.greengrass.deployment.exceptions.ServiceUpdateException: Fail to execute bootstrap step for burro-configurator, exit code: 143
at com.aws.greengrass.deployment.bootstrap.BootstrapManager.executeAllBootstrapTasksSequentially(BootstrapManager.java:435)
at com.aws.greengrass.deployment.activator.KernelUpdateActivator.activate(KernelUpdateActivator.java:84)
at com.aws.greengrass.deployment.DeploymentConfigMerger.updateActionForDeployment(DeploymentConfigMerger.java:150)
at com.aws.greengrass.deployment.DeploymentConfigMerger.lambda$mergeInNewConfig$0(DeploymentConfigMerger.java:102)
at com.aws.greengrass.lifecyclemanager.UpdateSystemPolicyService.runUpdateActions(UpdateSystemPolicyService.java:95)
at com.aws.greengrass.lifecyclemanager.UpdateSystemPolicyService.lambda$startup$0(UpdateSystemPolicyService.java:165)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
2022-10-18T22:05:27.097Z [INFO] (pool-2-thread-4) com.aws.greengrass.deployment.DeploymentDirectoryManager: Persist deployment metadata. {file=/greengrass/v2/deployments/50d5cf45-d057-4c58-8292-9b5e5fcbe796/deployment_metadata.json, deploymentId=50d5cf45-d057-4c58-8292-9b5e5fcbe796}
2022-10-18T22:05:27.098Z [WARN] (FleetStatusService-lifecycle) com.aws.greengrass.status.FleetStatusService: service-state-transition-interrupted. Service lifecycle thread interrupted. Thread will exit now. {serviceName=FleetStatusService, currentState=FINISHED}
2022-10-18T22:05:27.098Z [WARN] (pool-2-thread-17) com.aws.greengrass.mqttclient.MqttClient: Shutting down spooler task. {}
2022-10-18T22:05:27.098Z [INFO] (pool-2-thread-5) com.aws.greengrass.deployment.DeploymentService: deployment-task-execution. Cancelled deployment merge future due to interrupt, update may not get cancelled if it is already being applied. {deploymentId=50d5cf45-d057-4c58-8292-9b5e5fcbe796, serviceName=DeploymentService, currentState=FINISHED}
2022-10-18T22:05:27.099Z [INFO] (Serialized listener processor) com.aws.greengrass.lifecyclemanager.KernelLifecycle: executor-service-shutdown-initiated. {}
2022-10-18T22:05:27.099Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: Waiting for executors to shutdown. {}
2022-10-18T22:05:27.099Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: executor-service-shutdown-complete. {executor-terminated=false, scheduled-executor-terminated=true}
2022-10-18T22:05:27.099Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: context-shutdown-initiated. {}
2022-10-18T22:05:27.102Z [INFO] (Thread-3) com.aws.greengrass.mqttclient.AwsIotMqttClient: Connection interrupted. {clientId=burro-8-883}
2022-10-18T22:05:27.102Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: context-shutdown-complete. {}
I then checked why the ExecStart
script kept failing, but the /greengrass/v2/alts/current
symlink is gone
burro@burro-8-883:~$ /bin/sh /greengrass/v2/alts/current/distro/bin/loader
/bin/sh: 0: Can't open /greengrass/v2/alts/current/distro/bin/loader
burro@burro-8-883:~$ sudo su
root@burro-8-883:/home/burro# /bin/sh /greengrass/v2/alts/
50d5cf45-d057-4c58-8292-9b5e5fcbe796/ fe7fed58-cc49-4b27-8c96-51dd1cef1fef/
broken/ old/
The impact of the bug is that I need to manually remove the greengrass system service, delete cloud agents and install the greengrass system service back.
E.g. what is the impact of the bug?
Hello @DeboBurro,
Could you share the full greengrass log?
From what you shared, I can see the deployment was executing a bootstrap step when it receives external shutdown signal. Because of that, deployment would fail after Nucleus restart.
With more log I can understand better what you mean by greengrass system service stuck in an activating state
.
Hi @junfuchen99 , I don't have the log anymore since I already reinstalled the greengrass service entirely.
The Nucleus was trying to activate but failed to activate immediately, I think it's because the /greengrass/v2/alts/current
symbolic link is removed somehow during the shutdown process.
We can reopen this issue next time I encounter again.
Thanks, please reach out to us if you run into this issue again.