BigBoot/AutoKuma

Possible memory leak

Closed this issue · 17 comments

I've been running AutoKuma for a couple of days and I noticed that its memory usage is always increasing:

[screenshot]

I'm running the latest version of AutoKuma (0.6.0) on Docker.

Hmm, yes, that definitely seems wrong. Do you get any frequent warnings/errors in the logs?
Memory usage should generally be quite low:
[screenshot]

Unfortunately, the logs don't show any errors. I limited the memory to 1 GB hoping that it would trigger an OOM error, but it seems it kept running until it restarted "by itself."

[screenshots]
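
(For anyone who wants to run the same experiment: a hard cap can be set directly in the compose file. This is just a sketch; it assumes the Compose v2 CLI, which applies deploy limits outside Swarm as well.)

  autokuma:
    image: ghcr.io/bigboot/autokuma:latest
    deploy:
      resources:
        limits:
          memory: 1G   # processes in the container get OOM-killed once usage exceeds 1 GB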

Yes, I'm dealing with the same problem. I just restarted one of them at around 3.5 GB of RAM usage.

I used Valgrind to run AutoKuma and got this:

...
==22== 377,936 bytes in 23 blocks are possibly lost in loss record 1,109 of 1,122
==22==    at 0x48407B4: malloc (vg_replace_malloc.c:381)
==22==    by 0x5254B0: hashbrown::map::HashMap<K,V,S>::with_capacity_and_hasher (in /usr/local/bin/autokuma)
==22==    by 0x42416F: serde_json::value::de::visit_object (in /usr/local/bin/autokuma)
==22==    by 0x3CF7D5: serde_json::value::de::<impl serde::de::Deserializer for serde_json::value::Value>::deserialize_map (in /usr/local/bin/autokuma)
==22==    by 0x447592: kuma_client::client::Worker::on_event::{{closure}} (in /usr/local/bin/autokuma)
==22==    by 0x44284E: kuma_client::client::Worker::connect::{{closure}}::{{closure}}::{{closure}}::{{closure}} (in /usr/local/bin/autokuma)
==22==    by 0x472968: tokio::runtime::task::core::Core<T,S>::poll (in /usr/local/bin/autokuma)
==22==    by 0x4FE9DE: tokio::runtime::task::harness::Harness<T,S>::poll (in /usr/local/bin/autokuma)
==22==    by 0x86FF4A: tokio::runtime::scheduler::multi_thread::worker::Context::run_task (in /usr/local/bin/autokuma)
==22==    by 0x86F03B: tokio::runtime::scheduler::multi_thread::worker::Context::run (in /usr/local/bin/autokuma)
==22==    by 0x87BC36: tokio::runtime::context::scoped::Scoped<T>::set (in /usr/local/bin/autokuma)
==22==    by 0x87BA2E: tokio::runtime::context::runtime::enter_runtime (in /usr/local/bin/autokuma)

I'll try to run further tests. How can I enable the trace logs?

You can enable trace logs using the env var RUST_LOG="trace", but this also enables trace logs for all dependencies, which is a lot. Alternatively, you can enable them just for AutoKuma using RUST_LOG="kuma_client=trace, autokuma=trace".
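
In a Docker setup like the ones in this thread, that's just an extra environment entry on the autokuma service, e.g. (minimal sketch, everything else omitted):

  autokuma:
    image: ghcr.io/bigboot/autokuma:latest
    environment:
      # trace logs for kuma_client and autokuma only
      RUST_LOG: "kuma_client=trace, autokuma=trace"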

I'm not sure how useful memcheck is going to be here, though. I suspect there are some tokio tasks getting stuck, but I still can't reproduce this here...

I've added tokio-console support on master. You can run it like this: RUSTFLAGS="--cfg tokio_unstable" cargo run --bin autokuma -F tokio-console. After that, just run tokio-console and it should connect. There should be around 2-5 running tasks at most times, with a bunch of very short-lived ones spawning every now and then:

[screenshot]

tokio-console is showing an increasing number of IDLE tasks:

[screenshot]

Commit: 1c77d7418f15809d59953416f80719b62eb83660

I saw a similar increase over the weekend.

[screenshot]

It seemed to be related to Uptime Kuma pegging at 100% CPU (either causing it, caused by it, or just correlated), and eventually the two couldn't communicate.

[screenshot]

I also saw warn messages similar to @undaunt's and I never understood why I get them. So this might be related to Uptime Kuma's connection handling.

I wanted to add that I'm experiencing the same behavior running on Unraid. Unfortunately no logs. I'll see if I can capture something.

[screenshot]

Can confirm I'm seeing its memory use creep up way too much.

But there's not a single log line posted by AutoKuma during the increase. I assume it's a lack of garbage collection on something?

I don't see the CPU doing anything crazy, though.

Getting similar memory leak issues here on version 0.7.0

I'm not able to reproduce this on my end. Can someone affected either provide a reproducible example (i.e. a docker-compose file etc.) or be willing to dissect the issue (by stopping all containers except uptime-kuma/autokuma, seeing if the issue is gone, and then starting them back up one by one until the issue occurs again)?

[screenshot]

Yes, absolutely, here's my compose:

version: "3.3"

services:
  uptime-kuma:
    restart: unless-stopped
    image: louislam/uptime-kuma:1
    container_name: uptime_kuma
    ports:
      - "3003:3001"
    volumes:
      - ./config:/app/data
      - /var/run/docker.sock:/var/run/docker.sock
 
  autokuma:
    container_name: autokuma
    image: ghcr.io/bigboot/autokuma:latest
    restart: unless-stopped
    environment:
      AUTOKUMA__KUMA__URL: http://uptime-kuma:3001
      AUTOKUMA__KUMA__USERNAME: <USERNAME>
      AUTOKUMA__KUMA__PASSWORD: <PASSWORD>
      AUTOKUMA__KUMA__CALL_TIMEOUT: 5
      AUTOKUMA__KUMA__CONNECT_TIMEOUT: 5
      AUTOKUMA__TAG_NAME: AutoKuma
      AUTOKUMA__TAG_COLOR: "#42C0FB"
      AUTOKUMA__DEFAULT_SETTINGS: |- 
         docker.docker_container: {{container_name}}
         http.max_redirects: 10
         *.max_retries: 3
         *.notification_id_list: { "1": true }      # Discord
      AUTOKUMA__SNIPPETS__DOCKER: |- 
         {{container_name}}_docker.docker.name: {{container_name}}
         {{container_name}}_docker.docker.docker_container: {{container_name}}
         {{container_name}}_docker.docker.docker_host: 1
      AUTOKUMA__DOCKER__SOCKET: /var/run/docker.sock
    depends_on:
      uptime-kuma:
        condition: service_healthy
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

  # Getting the leak with and without a container being monitored via snippets
  # test:
  #   container_name: test
  #   image: busybox
  #   restart: unless-stopped
  #   command: sleep infinity
  #   labels:
  #     - "kuma.__docker"

[screenshot]

Hello, I have the same issue (v0.7.0). I see that AutoKuma connects to Uptime Kuma every ~6-7 seconds. Do these connections remain active in AutoKuma?
[screenshots]

Same issue here; I stopped the container after 15 GB of RAM usage.

By deleting the configuration line by line, I managed to avoid the memory leak with this config:

  SUPERVISION_AutoKuma_2:
    container_name: SUPERVISION_AutoKuma_2
    image: ghcr.io/bigboot/autokuma:0.7.0
    environment:
      AUTOKUMA__KUMA__URL: https://uptime-kuma.domain.com

Just add the username + password back and the memory leak will resume.
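
Concretely, putting just these two variables back into the stripped-down service above (sketch, credentials redacted) is enough to bring the leak back:

  SUPERVISION_AutoKuma_2:
    container_name: SUPERVISION_AutoKuma_2
    image: ghcr.io/bigboot/autokuma:0.7.0
    environment:
      AUTOKUMA__KUMA__URL: https://uptime-kuma.domain.com
      # re-adding the credentials is what makes the leak reappear
      AUTOKUMA__KUMA__USERNAME: <USERNAME>
      AUTOKUMA__KUMA__PASSWORD: <PASSWORD>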

Here are the logs during a memory leak, with the environment variable RUST_LOG="kuma_client=trace, autokuma=trace":

DEBUG [kuma_client::client] Connection opened!
DEBUG [kuma_client::client] Waiting for Kuma to get ready...
DEBUG [kuma_client::client] Waiting for Kuma to get ready...
TRACE [kuma_client::client] Client::on_any(Custom("info"), Text([Object {"primaryBaseURL": Null, "serverTimezone": String("Europe/Paris"), "serverTimezoneOffset": String("+02:00")}]))
DEBUG [kuma_client::client] call login -> Text([Array [Object {"ok": Bool(true), "token": String("eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtZSI6ImFudG9pbmUucGFpbGxhcmQiLCJoIjoiZWViYTUxYTQyMmM0OTdkNjYyOTBlZWVhOWU3MmU0MTgiLCJpYXQiOjE3MjQzMjI0OTB9.nSJAV0tYIPC4uVsOLzJdsUfDIvBIiyMWf8vtwwe39tM")}]])
DEBUG [kuma_client::client] Logged in as username!
TRACE [kuma_client::client] Client::on_any(Custom("monitorList"), Text([Object {}]))
TRACE [kuma_client::client] Client::on_any(Custom("maintenanceList"), Text([Object {}]))
TRACE [kuma_client::client] Client::on_any(Custom("info"), Text([Object {"isContainer": Bool(true), "latestVersion": String("1.23.11"), "primaryBaseURL": Null, "serverTimezone": String("Europe/Paris"), "serverTimezoneOffset": String("+02:00"), "version": String("1.23.13")}]))
TRACE [kuma_client::client] Client::on_any(Custom("notificationList"), Text([Array [Object {"active": Bool(true), "config": String("{\"name\":\"NTFY\",\"type\":\"ntfy\",\"isDefault\":true,\"ntfyserverurl\":\"https://ntfy.domain.com\",\"ntfyPriority\":5,\"ntfyAuthenticationMethod\":\"usernamePassword\",\"ntfytopic\":\"supervision\",\"ntfyusername\":\"username_ntfy\",\"ntfypassword\":\"password_ntfy\",\"ntfyIcon\":\"https://test.domain.com\",\"applyExisting\":true}"), "id": Number(1), "isDefault": Bool(true), "name": String("NTFY"), "userId": Number(1)}]]))
TRACE [kuma_client::client] Client::on_any(Custom("proxyList"), Text([Array []]))
TRACE [kuma_client::client] Client::on_any(Custom("dockerHostList"), Text([Array [Object {"dockerDaemon": String("/var/run/docker.sock"), "dockerType": String("socket"), "id": Number(1), "name": String("Socket Docker"), "userID": Number(1)}]]))
TRACE [kuma_client::client] Client::on_any(Custom("apiKeyList"), Text([Array []]))
DEBUG [kuma_client::client] Waiting for Kuma to get ready...
DEBUG [kuma_client::client] Waiting for Kuma to get ready...
TRACE [kuma_client::client] Client::on_any(Custom("statusPageList"), Text([Object {"1": Object {"customCSS": String("body {\n  \n}\n"), "description": Null, "domainNameList": Array [], "footerText": Null, "googleAnalyticsId": Null, "icon": String("/icon.svg"), "id": Number(1), "published": Bool(true), "showCertificateExpiry": Bool(false), "showPoweredBy": Bool(false), "showTags": Bool(false), "slug": String("dashboard"), "theme": String("auto"), "title": String("Page")}}]))
DEBUG [kuma_client::client] Connected!
DEBUG [kuma_client::client] call getTags -> Text([Array [Object {"ok": Bool(true), "tags": Array [Object {"color": String("#42C0FB"), "id": Number(1), "name": String("AutoKuma")}]}]])
WARN [kuma_client::util] Using DOCKER_HOST=None
WARN [autokuma::sync] Encountered error during sync: Error in the hyper legacy client: client error (Connect)
DEBUG [kuma_client::client] Connection closed!
DEBUG [kuma_client::client] Waiting for connection
DEBUG [kuma_client::client] Connection opened!
DEBUG [kuma_client::client] Waiting for Kuma to get ready...
DEBUG [kuma_client::client] Waiting for Kuma to get ready...
TRACE [kuma_client::client] Client::on_any(Custom("info"), Text([Object {"primaryBaseURL": Null, "serverTimezone": String("Europe/Paris"), "serverTimezoneOffset": String("+02:00")}]))
DEBUG [kuma_client::client] call login -> Text([Array [Object {"ok": Bool(true), "token": String("eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtZSI6ImFudG9pbmUucGFpbGxhcmQiLCJoIjoiZWViYTUxYTQyMmM0OTdkNjYyOTBlZWVhOWU3MmU0MTgiLCJpYXQiOjE3MjQzMjI0OTd9.c2uiq5IuYcFvC5EhHa5b5NqpJ3xgE1NW263W2pJQl_Y")}]])
DEBUG [kuma_client::client] Logged in as username!
TRACE [kuma_client::client] Client::on_any(Custom("monitorList"), Text([Object {}]))
TRACE [kuma_client::client] Client::on_any(Custom("maintenanceList"), Text([Object {}]))
TRACE [kuma_client::client] Client::on_any(Custom("info"), Text([Object {"isContainer": Bool(true), "latestVersion": String("1.23.11"), "primaryBaseURL": Null, "serverTimezone": String("Europe/Paris"), "serverTimezoneOffset": String("+02:00"), "version": String("1.23.13")}]))
TRACE [kuma_client::client] Client::on_any(Custom("notificationList"), Text([Array [Object {"active": Bool(true), "config": String("{\"name\":\"NTFY\",\"type\":\"ntfy\",\"isDefault\":true,\"ntfyserverurl\":\"[https://ntfy.domain.com\",\"ntfyPriority\":5,\"ntfyAuthenticationMethod\":\"usernamePassword\",\"ntfytopic\":\"supervision\",\"ntfyusername\":\"username_ntfy\",\"ntfypassword\":\"password_ntfy\",\"ntfyIcon\":\"https://test.domain.com\",\"applyExisting\":true}"),](https://ntfy.domain.com/%22,/%22ntfyPriority/%22:5,/%22ntfyAuthenticationMethod/%22:/%22usernamePassword/%22,/%22ntfytopic/%22:/%22supervision/%22,/%22ntfyusername/%22:/%22username_ntfy/%22,/%22ntfypassword/%22:/%22password_ntfy/%22,/%22ntfyIcon/%22:/%22https://test.domain.com/%22,/%22applyExisting/%22:true%7D%22),) "id": Number(1), "isDefault": Bool(true), "name": String("NTFY"), "userId": Number(1)}]]))
TRACE [kuma_client::client] Client::on_any(Custom("proxyList"), Text([Array []]))
TRACE [kuma_client::client] Client::on_any(Custom("dockerHostList"), Text([Array [Object {"dockerDaemon": String("/var/run/docker.sock"), "dockerType": String("socket"), "id": Number(1), "name": String("Socket Docker"), "userID": Number(1)}]]))
TRACE [kuma_client::client] Client::on_any(Custom("apiKeyList"), Text([Array []]))
DEBUG [kuma_client::client] Waiting for Kuma to get ready...
DEBUG [kuma_client::client] Waiting for Kuma to get ready...
TRACE [kuma_client::client] Client::on_any(Custom("statusPageList"), Text([Object {"1": Object {"customCSS": String("body {\n  \n}\n"), "description": Null, "domainNameList": Array [], "footerText": Null, "googleAnalyticsId": Null, "icon": String("/icon.svg"), "id": Number(1), "published": Bool(true), "showCertificateExpiry": Bool(false), "showPoweredBy": Bool(false), "showTags": Bool(false), "slug": String("dashboard"), "theme": String("auto"), "title": String("Page")}}]))
DEBUG [kuma_client::client] Connected!
DEBUG [kuma_client::client] call getTags -> Text([Array [Object {"ok": Bool(true), "tags": Array [Object {"color": String("#42C0FB"), "id": Number(1), "name": String("AutoKuma")}]}]])
WARN [kuma_client::util] Using DOCKER_HOST=None
WARN [autokuma::sync] Encountered error during sync: Error in the hyper legacy client: client error (Connect)
DEBUG [kuma_client::client] Connection closed!
DEBUG [kuma_client::client] Waiting for connection
DEBUG [kuma_client::client] Connection opened!

Thank you @ITM-AP, I was finally able to find the problem and will issue a fix later today.