Agent 1.8.3-1 stopped after 24h
Closed this issue · 3 comments
Ubuntu 24.04.1 LTS (GNU/Linux 6.8.0-51-generic x86_64)
Python 3.12.3 (main, Nov 6 2024, 18:32:19) [GCC 13.2.0]
agent version=1.8.3-1
I installed the agent yesterday and started it at 08:40 GMT.
Today the agent stopped working at 07:42 GMT.
From the log, it looks like it was triggered by a 500 error:
2025-01-15 07:42:15,452 [1158647] supervisor failed POST "https://receiver.amplify.nginx.com:443/1.4/225879d76ec3b6f1b68a881c6421a66e/agent/", exception: "500 Server Error: INTERNAL SERVER ERROR for url: https://receiver.amplify.nginx.com:443/1.4/225879d76ec3b6f1b68a881c6421a66e/agent/"
2025-01-15 07:42:15,452 [1158647] supervisor [None] post https://receiver.amplify.nginx.com:443/1.4/225879d76ec3b6f1b68a881c6421a66e/agent/ 500 94 0 0.114
2025-01-15 07:42:15,452 [1158647] supervisor uncaught exception during run time
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/amplify/agent/supervisor.py", line 356, in talk_to_cloud
context.http_client.post('agent/', data=root_object)
File "/usr/lib/python3/dist-packages/amplify/agent/common/util/http.py", line 128, in post
return self.make_request(url, 'post', data=data, timeout=timeout, json=json)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/amplify/agent/common/util/http.py", line 110, in make_request
raise e
File "/usr/lib/python3/dist-packages/amplify/agent/common/util/http.py", line 102, in make_request
r.raise_for_status()
File "/usr/lib/python3/dist-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: INTERNAL SERVER ERROR for url: https://receiver.amplify.nginx.com:443/1.4/225879d76ec3b6f1b68a881c6421a66e/agent/
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/amplify/agent/main.py", line 147, in run
daemon_runner.do_action()
File "/usr/lib/python3/dist-packages/amplify/agent/common/runner.py", line 42, in do_action
self.app.run()
File "/usr/lib/python3/dist-packages/amplify/agent/supervisor.py", line 277, in run
self.talk_to_cloud(root_object=context.objects.root_object.definition)
File "/usr/lib/python3/dist-packages/amplify/agent/supervisor.py", line 375, in talk_to_cloud
self.cloud_talk_delay = exponential_delay(self.cloud_talk_fails)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/amplify/agent/common/util/backoff.py", line 35, in exponential_delay
return randint(0, period_size - 1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/random.py", line 336, in randint
return self.randrange(a, b+1)
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/random.py", line 312, in randrange
istop = _index(stop)
^^^^^^^^^^^^
TypeError: 'float' object cannot be interpreted as an integer
2025-01-15 07:42:15,549 [1158647] supervisor [f6570d50a942b622dc832fab71898fc2] post https://receiver.amplify.nginx.com:443/1.4/225879d76ec3b6f1b68a881c6421a66e/update/ 202 1925 0 0.091
2025-01-15 07:42:15,557 [1158647] supervisor agent stopped, version=1.8.3-1 pid=1158647 uuid=6675ae55021f56ceb9a413de8016376b
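For anyone hitting the same traceback: the 500 response only triggers the backoff path; the exception that actually stops the agent is the final TypeError, raised because `random.randint` receives a float upper bound. Python 3.12 no longer converts non-integer types in `randrange`/`randint`, while older versions accepted integer-valued floats. The sketch below is not the agent's real `backoff.py`; the name `exponential_delay_sketch`, its parameters, and its defaults are assumptions made purely for illustration. It shows one way to keep the bounds integral.

```python
import random


def exponential_delay_sketch(fails, base=2, max_period=300):
    """Return a random delay in seconds for the given failure count.

    Hypothetical stand-in for the agent's backoff helper; the name,
    parameters, and defaults are assumptions, not the real code.
    """
    # The period may end up a float (for example after a division or a
    # float cap), so coerce it to int before calling randint.
    period_size = int(min(base ** fails, max_period))
    # Guard against a zero-sized period so randint(0, -1) never happens.
    period_size = max(period_size, 1)
    return random.randint(0, period_size - 1)


def float_bound_fails():
    """Report whether this interpreter rejects a float bound in randint."""
    try:
        random.randint(0, 8.0 - 1)
    except TypeError:
        return True   # Python 3.12+ behaviour, as in the traceback above
    return False      # older interpreters still accepted this call
```

Calling `exponential_delay_sketch(3, max_period=300.0)` returns an integer delay even though the cap is a float, which is the case that would otherwise reach `randint` unconverted.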
@oburlaca thanks for submitting this.
Question: were there any log entries between this
2025-01-15 07:42:15,452 [1158647] supervisor uncaught exception during run time [..]
and this
2025-01-15 09:42:15,549 [1158647] supervisor [f6570d50a942b622dc832fab71898fc2] post https://receiver.amplify.nginx.com:443/1.4/225879d76ec3b6f1b68a881c6421a66e/update/ 202 1925 0 0.091
2025-01-15 09:42:15,557 [1158647] supervisor agent stopped, version=1.8.3-1 pid=1158647 uuid=6675ae55021f56ceb9a413de8016376b
in the agent log you've provided? That ~2hrs gap looks weird.
My mistake: I manually changed the timestamps to GMT for clarity, hoping that would help find the error on the Amplify server (the time on my server is GMT+2).
There is no time gap. I've updated my first post (the error occurred at 07:42:15 GMT).
I've now started the agent again and will post here if it crashes again.
I've installed Agent 1.8.3-1 on 4 servers with Ubuntu 24.04, and all have been running OK for a couple of days,
including the server where the agent stopped.
Closing the issue, thanks @defanator.