Agent starts running and then exits with Cannot translate JSON, ERROR is exit status 1
altonotch opened this issue · 4 comments
Describe the bug
The agent is installed on a Raspberry Pi, successfully fetches the config, and reports it as valid. About 30 seconds after it starts running, it exits with an error: Cannot translate JSON, ERROR is exit status 1
Steps to reproduce
Install the agent:
wget https://amazoncloudwatch-agent.s3.amazonaws.com/ubuntu/arm64/latest/amazon-cloudwatch-agent.deb && \
dpkg -i -E ./amazon-cloudwatch-agent.deb
Have the SSM agent installed as well. Update common-config.toml to use shared_credential_profile = "default".
Use the following config JSON in SSM:
{
  "agent": {
    "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log",
    "metrics_collection_interval": 10
  },
  "logs": {
    "force_flush_interval": 15,
    "log_stream_name": "{instance_id}",
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/ecs/ecs-agent.log*",
            "log_group_name": "group/ecs-agent",
            "log_stream_name": "external/ecs-agent.log",
            "timezone": "UTC"
          },
          {
            "file_path": "/var/log/ecs/ecs-init.log*",
            "log_group_name": "group/ecs-init",
            "log_stream_name": "external/ecs-init.log",
            "timezone": "UTC"
          },
          {
            "file_path": "/var/log/ecs/audit.log*",
            "log_group_name": "group/audit",
            "log_stream_name": "external/audit.log",
            "timezone": "UTC"
          },
          {
            "file_path": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log*",
            "log_group_name": "group/cloudwatch-agent",
            "log_stream_name": "external/cw-agent.log",
            "timezone": "UTC"
          }
        ]
      }
    }
  }
}
Start the agent:
amazon-cloudwatch-agent-ctl -m onPrem -s -c ssm:AmazonCloudWatch-Config-External -a fetch-config
See output:
****** processing amazon-cloudwatch-agent ******
Got Home directory: /root
I! Set home dir Linux: /root
I! SDKRegionWithCredsMap region: us-east-2
Region: us-east-2
credsConfig: map[shared_credential_profile:default]
Successfully fetched the config and saved in /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/ssm_AmazonCloudWatch-Config-External.tmp
Start configuration validation...
2023/12/05 21:36:26 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/ssm_AmazonCloudWatch-Config-External.tmp ...
2023/12/05 21:36:26 I! Valid Json input schema.
2023/12/05 21:36:26 Configuration validation first phase succeeded
I! Detecting run_as_user...
Got Home directory: /root
I! Set home dir Linux: /root
I! SDKRegionWithCredsMap region: us-east-2
Got Home directory: /root
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -schematest -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml
Configuration validation second phase succeeded
Configuration validation succeeded
Check status after 30 seconds:
amazon-cloudwatch-agent-ctl -m onPremise -a status
Says:
{
"status": "stopped",
"starttime": "",
"configstatus": "configured",
"version": "1.300031.0b313"
}
The agent is now in a loop of start-stop.
What did you expect to see?
Running agent.
What did you see instead?
Stopped
What version did you use?
Version: 1.300031.0b313
What config did you use?
See above.
Environment
OS: Raspberry Pi, 20231012~bullseye
Additional context
The same config used to work on an older version of the agent.
I downloaded the nightly build. It failed to run with this error:
****** processing amazon-cloudwatch-agent ******
Got Home directory: /root
I! Set home dir Linux: /root
I! SDKRegionWithCredsMap region: us-east-2
Region: us-east-2
credsConfig: map[shared_credential_profile:default]
Successfully fetched the config and saved in /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/ssm_AmazonCloudWatch-Config-External.tmp
Start configuration validation...
2023/12/05 22:18:45 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/ssm_AmazonCloudWatch-Config-External.tmp ...
2023/12/05 22:18:45 I! Valid Json input schema.
I! Detecting run_as_user...
Got Home directory: /root
Got Home directory: /root
I! Set home dir Linux: /root
I! SDKRegionWithCredsMap region: us-east-2
No csm configuration found.
No metric configuration found.
Configuration validation first phase succeeded
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -schematest -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml
Configuration validation second phase succeeded
Configuration validation succeeded
amazon-cloudwatch-agent has already been stopped
Tail of the CloudWatch agent log:
I! Set home dir Linux: /root
I! SDKRegionWithCredsMap region:
Got Home directory: /root
2023/12/05 22:18:52 E! Failed to generate configuration validation content.
2023/12/05 22:18:52 E! Failed to generate configuration validation content.
2023/12/05 22:18:52 Under path : /agent/ruleRegion/ | Error : Region info is missing for mode: onPrem
2023/12/05 22:18:52 Configuration validation first phase failed. Agent version: 1.0. Verify the JSON input is only using features supported by this version.
2023/12/05 22:18:52 I! Return exit error: exit code=1
2023/12/05 22:18:52 E! Cannot translate JSON config into TOML, ERROR is exit status 1
I modified the JSON config to include region under the agent section, and now it runs without failing.
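For anyone hitting the same error, the change might look like the fragment below. This is a sketch, not the exact config used: it assumes the region is us-east-2 (the value the credential lookup reported in the output above), and only the "region" line is new. The "logs" section stays as it was.

```json
{
  "agent": {
    "region": "us-east-2",
    "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log",
    "metrics_collection_interval": 10
  }
}
```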
Not officially, but it will work. You need to set the region in the config, since this is considered an on-prem install.
Oh, I see you set the region and it worked. That is as expected. When the agent runs outside AWS infrastructure, you have to pass in a region, because it cannot work out a default from where the agent instance is running.
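Since the first validation phase passed here even without a region, a quick local check before uploading the config can catch this earlier. The helper below is a hypothetical sketch (has_region is not part of the agent tooling) and does a crude text match rather than parsing the JSON, so it only confirms that some non-empty "region" string is present somewhere in the file.

```shell
#!/bin/sh
# Hypothetical pre-flight check, not part of the CloudWatch agent tooling.
# Usage: has_region FILE
# Returns 0 if FILE contains a "region" key with a non-empty string value.
# This is a plain text match, not a JSON parser, so it does not verify
# that the key sits under the "agent" section.
has_region() {
    grep -Eq '"region"[[:space:]]*:[[:space:]]*"[^"]+"' "$1"
}
```

For example, `has_region ./config.json || echo "region missing: required for onPrem mode"` (with ./config.json standing in for a local copy of the config) would print a warning before the config goes to SSM.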
Hello,
Did adding the region to the config fix your issue? I will close this issue due to inactivity. If your issue persists, please re-open this issue and we will assist you as soon as possible.