`on_track` increases memory usage over time if files have cover
vitoyucepi opened this issue · 12 comments
Describe the bug
For testing purposes, I'm running liquidsoap in memory-constrained containers.
It's not clear to me if memory is leaking, but eventually all containers were OOM-killed, even when I set a larger memory limit.
I tried calling the garbage collector manually, but it doesn't seem to make any difference.
Journal log
Nov 07 08:12:54 arch kernel: liquidsoap invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=200
Nov 07 08:12:54 arch kernel: CPU: 5 PID: 46290 Comm: liquidsoap Tainted: P OE 6.5.9-arch2-1 #1 f3e31240753f1687edd57c7c0d0f930d7e8c4a55
Nov 07 08:12:54 arch kernel: Hardware name:
Nov 07 08:12:54 arch kernel: Call Trace:
Nov 07 08:12:54 arch kernel: <TASK>
Nov 07 08:12:54 arch kernel: dump_stack_lvl+0x47/0x60
Nov 07 08:12:54 arch kernel: dump_header+0x4a/0x240
Nov 07 08:12:54 arch kernel: oom_kill_process+0xf9/0x190
Nov 07 08:12:54 arch kernel: out_of_memory+0x246/0x590
Nov 07 08:12:54 arch kernel: mem_cgroup_out_of_memory+0x134/0x150
Nov 07 08:12:54 arch kernel: try_charge_memcg+0x6ed/0x7d0
Nov 07 08:12:54 arch kernel: charge_memcg+0x39/0xf0
Nov 07 08:12:54 arch kernel: __mem_cgroup_charge+0x2c/0x80
Nov 07 08:12:54 arch kernel: do_anonymous_page+0x89/0x3a0
Nov 07 08:12:54 arch kernel: __handle_mm_fault+0xacc/0xbb0
Nov 07 08:12:54 arch kernel: ? smp_call_function_single_async+0x22/0x50
Nov 07 08:12:54 arch kernel: handle_mm_fault+0x155/0x350
Nov 07 08:12:54 arch kernel: do_user_addr_fault+0x15f/0x640
Nov 07 08:12:54 arch kernel: exc_page_fault+0x7f/0x180
Nov 07 08:12:54 arch kernel: asm_exc_page_fault+0x26/0x30
Nov 07 08:12:54 arch kernel: RIP: 0033:0x7fcfd795d62c
Nov 07 08:12:54 arch kernel: Code: 7c 10 8a 0e 74 0a 0f b7 74 16 fe 66 89 74 17 fe 88 0f c3 8b 4c 16 fc 8b 36 89 4c 17 fc 89 37 c3 c5 fa 6f 06 c5 fa 6f 4c 16 f0 <c5> fa 7f 07 c5 fa 7f 4c 17 f0 c3 66 0f 1f 84 00>
Nov 07 08:12:54 arch kernel: RSP: 002b:00007fcfad1fb1a8 EFLAGS: 00010202
Nov 07 08:12:54 arch kernel: RAX: 00007fcfa0a9e000 RBX: 00007fcfafb9e018 RCX: 00007fcfa0a6a220
Nov 07 08:12:54 arch kernel: RDX: 0000000000000018 RSI: 00007fcfafb9e000 RDI: 00007fcfa0a9e000
Nov 07 08:12:54 arch kernel: RBP: 0000000000000018 R08: 0000000000000000 R09: 00007fcfbcf69000
Nov 07 08:12:54 arch kernel: R10: 00000000001c5be0 R11: 0000000000000001 R12: 00007fcfb0cb9000
Nov 07 08:12:54 arch kernel: R13: 00007fcfaf001000 R14: 0000000000000002 R15: 00007fcfbcbd9a18
Nov 07 08:12:54 arch kernel: </TASK>
Nov 07 08:12:54 arch kernel: memory: usage 307200kB, limit 307200kB, failcnt 285
Nov 07 08:12:54 arch kernel: swap: usage 0kB, limit 307200kB, failcnt 0
Nov 07 08:12:54 arch kernel: Memory cgroup stats for /user.slice/user-1000.slice/user@1000.service/user.slice/docker-eaec52440d1b63e469f909bd15d7f7d2b489830e0ced84f2a8766b2a42122ab6.scope:
Nov 07 08:12:54 arch kernel: anon 309870592
Nov 07 08:12:54 arch kernel: file 1851392
Nov 07 08:12:54 arch kernel: kernel 2850816
Nov 07 08:12:54 arch kernel: kernel_stack 180224
Nov 07 08:12:54 arch kernel: pagetables 1331200
Nov 07 08:12:54 arch kernel: sec_pagetables 0
Nov 07 08:12:54 arch kernel: percpu 2912
Nov 07 08:12:54 arch kernel: sock 0
Nov 07 08:12:54 arch kernel: vmalloc 12288
Nov 07 08:12:54 arch kernel: shmem 1851392
Nov 07 08:12:54 arch kernel: zswap 0
Nov 07 08:12:54 arch kernel: zswapped 0
Nov 07 08:12:54 arch kernel: file_mapped 0
Nov 07 08:12:54 arch kernel: file_dirty 0
Nov 07 08:12:54 arch kernel: file_writeback 0
Nov 07 08:12:54 arch kernel: swapcached 0
Nov 07 08:12:54 arch kernel: anon_thp 266338304
Nov 07 08:12:54 arch kernel: file_thp 0
Nov 07 08:12:54 arch kernel: shmem_thp 0
Nov 07 08:12:54 arch kernel: inactive_anon 24793088
Nov 07 08:12:54 arch kernel: active_anon 286883840
Nov 07 08:12:54 arch kernel: inactive_file 0
Nov 07 08:12:54 arch kernel: active_file 0
Nov 07 08:12:54 arch kernel: unevictable 0
Nov 07 08:12:54 arch kernel: slab_reclaimable 434336
Nov 07 08:12:54 arch kernel: slab_unreclaimable 839144
Nov 07 08:12:54 arch kernel: slab 1273480
Nov 07 08:12:54 arch kernel: workingset_refault_anon 0
Nov 07 08:12:54 arch kernel: workingset_refault_file 0
Nov 07 08:12:54 arch kernel: workingset_activate_anon 0
Nov 07 08:12:54 arch kernel: workingset_activate_file 0
Nov 07 08:12:54 arch kernel: workingset_restore_anon 0
Nov 07 08:12:54 arch kernel: workingset_restore_file 0
Nov 07 08:12:54 arch kernel: workingset_nodereclaim 0
Nov 07 08:12:54 arch kernel: pgscan 0
Nov 07 08:12:54 arch kernel: pgsteal 0
Nov 07 08:12:54 arch kernel: pgscan_kswapd 0
Nov 07 08:12:54 arch kernel: pgscan_direct 0
Nov 07 08:12:54 arch kernel: pgscan_khugepaged 0
Nov 07 08:12:54 arch kernel: pgsteal_kswapd 0
Nov 07 08:12:54 arch kernel: pgsteal_direct 0
Nov 07 08:12:54 arch kernel: pgsteal_khugepaged 0
Nov 07 08:12:54 arch kernel: pgfault 282000785
Nov 07 08:12:54 arch kernel: pgmajfault 0
Nov 07 08:12:54 arch kernel: pgrefill 0
Nov 07 08:12:54 arch kernel: pgactivate 0
Nov 07 08:12:54 arch kernel: pgdeactivate 0
Nov 07 08:12:54 arch kernel: pglazyfree 0
Nov 07 08:12:54 arch kernel: pglazyfreed 0
Nov 07 08:12:54 arch kernel: zswpin 0
Nov 07 08:12:54 arch kernel: zswpout 0
Nov 07 08:12:54 arch kernel: thp_fault_alloc 19602
Nov 07 08:12:54 arch kernel: thp_collapse_alloc 33
Nov 07 08:12:54 arch kernel: Tasks state (memory values in pages):
Nov 07 08:12:54 arch kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Nov 07 08:12:54 arch kernel: [ 46251] 109999 46251 618 288 40960 0 200 tini
Nov 07 08:12:54 arch kernel: [ 46282] 109999 46282 253883 96246 1306624 0 200 liquidsoap
Nov 07 08:12:54 arch kernel: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=docker-eaec52440d1b63e469f909bd15d7f7d2b489830e0ced84f2a8766b2a42122ab6.scope,mems_allowed=0,oom_memcg=/user.slice/user-1>
Nov 07 08:12:54 arch kernel: Memory cgroup out of memory: Killed process 46282 (liquidsoap) total-vm:1015532kB, anon-rss:302296kB, file-rss:82688kB, shmem-rss:0kB, UID:109999 pgtables:1276kB oom_score_adj:200
Nov 07 08:12:54 arch systemd[623]: docker-eaec52440d1b63e469f909bd15d7f7d2b489830e0ced84f2a8766b2a42122ab6.scope: A process of this unit has been killed by the OOM killer.
Nov 07 08:12:54 arch systemd[623]: docker-eaec52440d1b63e469f909bd15d7f7d2b489830e0ced84f2a8766b2a42122ab6.scope: Consumed 50min 31.232s CPU time.
Nov 07 08:12:54 arch systemd[623]: user.slice: A process of this unit has been killed by the OOM killer.
Nov 07 08:12:54 arch systemd[1]: user@1000.service: A process of this unit has been killed by the OOM killer.
To Reproduce
- Generate tracks with covers.
mkdir music
for i in $(seq 1 9); do
  convert -size 600x600 plasma: cover.png
  ffmpeg \
    -f lavfi -i "sine=frequency=${i}00:duration=1" \
    -i "cover.png" \
    -map 0:0 -map 1:0 \
    -c:v copy \
    -c:a libmp3lame -b:a 128k \
    -id3v2_version 4 \
    -metadata "title=$i" \
    -metadata:s:v title="Album cover" \
    -metadata:s:v comment="Cover (front)" \
    "music/$i.mp3"
  rm cover.png
done
compose.yaml
version: "3.8" services: liquidsoap: image: savonet/liquidsoap:v2.2.2 command: - "/tmp/test/main.liq" deploy: resources: limits: memory: 300M logging: driver: "json-file" options: max-size: "1m" max-file: "3" volumes: - ./:/tmp/test - type: tmpfs target: /tmp tmpfs: size: 100M
main.liq
s = mksafe(playlist("/tmp/test/music", mode="normal"))
s.on_track(ignore)
s = clock(sync="none", s)
output.dummy(s)
thread.run(runtime.gc.compact, every=60., fast=false)
- Run docker compose up -d
- Observe container stats using docker stats
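To turn the docker stats readings into something comparable over time, the reported sizes can be parsed to bytes and checked for a trend (leak-like growth vs. a plateau). This is a hypothetical helper for illustration, not part of the original reproduction; the sample values are made up:

```python
import re

# Convert docker-stats style sizes such as "96.2MiB" or "1.5GiB" to bytes.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}

def parse_mem(text: str) -> int:
    m = re.fullmatch(r"([\d.]+)\s*(B|KiB|MiB|GiB)", text.strip())
    if m is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(float(m.group(1)) * UNITS[m.group(2)])

def is_growing(samples, tolerance=0.05):
    """True if the last sample exceeds the first by more than `tolerance`."""
    return samples[-1] > samples[0] * (1 + tolerance)

if __name__ == "__main__":
    # Hypothetical readings taken once a minute from `docker stats`.
    samples = [parse_mem(s) for s in ["120MiB", "180MiB", "250MiB", "295MiB"]]
    print(is_growing(samples))  # → True: rising steadily toward the 300M limit
```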
Expected behavior
Process memory consumption stays at a certain level.
Version details
- OS: Debian in docker container
- Version: 2.2.2, 2.2.1
Install method
savonet/liquidsoap:v2.2.2
Common issues
#3502
#3402 (comment)
Hi @vitoyucepi! Probably a dumb question, but what if we change on_track to on_metadata? Would it behave differently?
Thanks for reporting this one. I'll have a look asap.
I've run the cover manager extractor function (#3402 (reply in thread)) with s.on_metadata for a few days. The result is the same as with on_track.
The liquidsoap process memory slowly increases, I would say by roughly the size of the cover art (20-150 KB) with each new track; after a couple of days, memory had increased by roughly 500 MB.
Hi @toots! Have you looked into this issue? Is there any chance of solving it?
I know this is not a priority, but still. I'm not rushing you in any way, I'm just curious.
I'm able to reproduce. So far this looks like an issue with the GC. I generated a very large 6000x6000 pixel cover art and ran the script with aggressive GC parameters and regular full_major calls.
I'm able to trace the issue down to the memory allocating the string for the cover (and also probably found the cause for #3502).
I'm also able to trace all the memory from the requests being collected.
However, the bad news is that memory does not seem to leak but, rather, to plateau at a very high value.
For scale, the cover art is about 168 MB.
I'm gonna see what we can do to optimize the GC's behavior on this.
I see. Thanks for your work!
We've identified the underlying source as coming from the output's list of the last 10 metadata, at least in my reproduction test. We're working on short-term and long-term mitigation plans.
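If outputs retain the last 10 metadata entries and each entry carries the raw cover bytes, the plateau is roughly the window length times the per-track cover payload. A back-of-the-envelope sketch (the function name and figures are illustrative, not from the report):

```python
# If outputs keep the last N metadata entries and each one holds the raw
# cover bytes, that many covers stay reachable at once.
def retained_bytes(window: int, cover_bytes: int) -> int:
    return window * cover_bytes

# With the ~168 MB test cover and a 10-entry window, about 1.68 GB can
# stay pinned even though nothing is leaking in the strict sense.
print(retained_bytes(10, 168_000_000))  # → 1680000000
```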
Hi @toots!
So I upgraded to 2.2.3 and have been running it for more than 3 days with Vito's cover manager function switched back on:
def cover_files_manager(~id=null(), ~default_cover_file="", ~covers_keep=2)
  id = string.id.default(id, default="cover")
  # log = logger(id)
  mime_types = [
    ("image/gif", "gif"),
    ("image/jpeg", "jpeg"),
    ("image/jpg", "jpeg"),
    ("image/png", "png"),
    ("image/webp", "webp")
  ]
  cover_files = ref([])
  current_cover_file = ref(default_cover_file)

  def remove_file(filename)
    log.info("Remove #{string.quote(filename)}")
    file.remove(filename)
  end

  def rotate_cover_files(cover_file)
    current_cover_file := cover_file
    cover_files := cover_file::cover_files()
    if list.length(cover_files()) > covers_keep then
      remove_file(list.last(cover_files()))
      cover_files := list.prefix(covers_keep, cover_files())
    end
  end

  def save_cover_to_file(cover)
    mime = cover.mime
    extension = mime_types[mime]
    log.debug("mime=#{mime}, extension=#{extension}")
    cover_file =
      if extension != "" then
        cover_file = file.temp("#{id}_", ".#{extension}")
        file.write(cover_file, data=cover)
        cover_file
      else
        null()
      end
    (cover_file, mime)
  end

  def extract_cover_from_metadata(_metadata)
    filename = _metadata["filename"]
    log.info("Extracting cover from #{string.quote(filename)}.")
    cover = metadata.cover(_metadata)
    let (cover_file, mime) = null.case(cover, {(null(), "")}, save_cover_to_file)
    if cover == null() then
      log.important("File #{string.quote(filename)} has no cover.")
      current_cover_file := default_cover_file
    elsif cover_file == null() then
      log.important("File #{string.quote(filename)} has unknown mime type \
                     #{string.quote(mime)}.")
      current_cover_file := default_cover_file
    else
      cover_file = null.get(cover_file)
      log.important("Cover for #{string.quote(filename)} saved to \
                     #{string.quote(cover_file)}.")
      rotate_cover_files(cover_file)
    end
  end

  def get_cover()
    log.debug("current_cover_file=#{current_cover_file()}")
    current_cover_file()
  end

  get_cover.{set = extract_cover_from_metadata}
end
cover_files = cover_files_manager(default_cover_file=station_dir ^ "/img/cover_default.png")
radio.on_track(cover_files.set)
background = video.add_image(x=50, y=280, width=220, height=220, file=cover_files, background)
After 3 days of running, an extra 600 MB of RAM had been taken by the liquidsoap process, and it continued increasing bit by bit, so I switched the cover manager function off.
So the problem persists.