shelfio/libreoffice-lambda-layer

[soffice.bin] <defunct> processes

sheakelly opened this issue · 5 comments

When executing libreoffice we are getting lots of [soffice.bin] <defunct> processes. We are attempting to convert lots of files to PDF (one per lambda execution). As the lambda container is reused, we eventually run out of available processes and can no longer convert anything to PDF.

I have followed the instructions and successfully upgraded to v6.2.1.2 of libreoffice. We get fewer leaked pids; however, each lambda invocation still leaks a new pid.

Is this a known issue? Is there something we can do to prevent the leaking pids?

Example output running ps xl from inside the lambda function:

F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
4   482     1     0  20   0 1326212 222788 SyS_ep Ssl ?         0:24 /var/lang/bin/node --expose-gc --max-semi-space-size=150 --max-old-space-size=2707 /var/runtime/node_modules/awslambda/index.js
1   482    20     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482    31     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482    59     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482    91     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   123     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   155     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   187     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   219     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   251     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   283     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   315     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   347     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   379     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   411     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   443     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   475     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   507     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   539     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   571     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   603     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   635     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   667     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   699     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   731     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   763     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   795     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   827     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   859     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   891     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   923     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   955     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482   987     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1019     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1051     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1083     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1115     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1147     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1179     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1211     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1243     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1275     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1307     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1339     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1371     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1403     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1435     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1467     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1499     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1531     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1563     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1595     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1627     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1659     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1691     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1723     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1755     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1787     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1819     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1851     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1883     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1916     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1948     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  1980     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2012     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2044     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2076     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2108     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2140     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2172     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2204     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2236     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2268     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2300     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2332     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2364     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2396     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2428     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2460     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2492     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2524     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2556     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2588     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2620     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2652     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2684     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2716     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2748     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2780     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2812     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2844     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2876     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2908     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2940     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  2972     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3004     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3036     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3068     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3100     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3132     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3164     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3196     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3228     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3260     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3292     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3324     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3356     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3388     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3420     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3452     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3484     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3516     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3548     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3580     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3612     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3644     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3676     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3708     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3740     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3772     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3804     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3836     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3868     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3900     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3932     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3964     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  3996     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4028     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4060     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4092     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4124     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4156     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4188     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4220     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4252     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4284     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4316     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4348     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4380     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4412     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4444     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4476     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4508     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4540     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4572     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4604     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4636     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4668     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4700     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4732     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4764     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4796     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4828     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4860     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4892     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4924     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4956     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  4988     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5020     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5052     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5084     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5116     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5148     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5180     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5212     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5244     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5276     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5294     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5326     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5358     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5390     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5422     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5454     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5486     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5518     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5550     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5582     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5614     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5646     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5678     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5710     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5728     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5760     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5796     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5832     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5864     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5896     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5914     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5946     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5978     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  5996     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  6028     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  6060     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  6078     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  6110     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
1   482  6128     1  20   0      0     0 -      Z    ?          0:00 [soffice.bin] <defunct>
0   482  6145     1  20   0 115144  2208 -      R    ?          0:00 ps xl

Hey @sheakelly

Thanks for bringing this up, I haven't noticed such an issue before.

Do you think it would be a reasonable solution to kill those processes after file processing? That's something which could be implemented in https://github.com/shelfio/aws-lambda-libreoffice

Hi @vladgolubev, thanks for responding so quickly! I am not sure it is that simple. As they are zombie processes, they are already dead, and it seems the only way they get cleaned up is if the parent process ends. In the context of aws lambda the parent process is the node process, which remains for the duration of the lambda execution context.
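To check which processes are zombies and who their parent is without relying on ps, one option on Linux is to read /proc/<pid>/stat directly. The following is a minimal sketch, not part of the layer or of aws-lambda-libreoffice; the function names parseProcStat and listZombies are made up for illustration, and it assumes /proc is readable from inside the lambda container.

```javascript
const fs = require('fs');

// Parse one /proc/<pid>/stat line into { pid, comm, state, ppid }.
// The command name is wrapped in parentheses and may itself contain
// spaces or parentheses, so we split on the LAST ')' in the line.
function parseProcStat(statLine) {
  const open = statLine.indexOf('(');
  const close = statLine.lastIndexOf(')');
  const rest = statLine.slice(close + 2).split(' ');
  return {
    pid: Number(statLine.slice(0, open).trim()),
    comm: statLine.slice(open + 1, close),
    state: rest[0], // 'Z' means zombie (defunct)
    ppid: Number(rest[1]),
  };
}

// List all zombie processes visible under /proc (Linux only).
function listZombies(procDir = '/proc') {
  const zombies = [];
  for (const entry of fs.readdirSync(procDir)) {
    if (!/^\d+$/.test(entry)) continue; // only numeric entries are PIDs
    try {
      const line = fs.readFileSync(`${procDir}/${entry}/stat`, 'utf8');
      const info = parseProcStat(line);
      if (info.state === 'Z') zombies.push(info);
    } catch (err) {
      // The process may have vanished between readdir and readFile; skip it.
    }
  }
  return zombies;
}
```

In the ps output above, every defunct soffice.bin entry has PPID 1, which is the node process, so listZombies() would be expected to report ppid 1 for each of them.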

We have tried spawning another node process, as described in puppeteer/puppeteer#1825 (comment). It did not work. We were able to kill the spawned node process and the libreoffice process, but I think it is libreoffice itself that is the problem. There is a bug report that describes the problem with 6.1.0.0 of libreoffice: https://bugs.documentfoundation.org/show_bug.cgi?id=117523. The problem has been reduced in 6.2.1.2 but still exists.

Our current workaround is that we count the number of [soffice.bin] <defunct> processes, and when it reaches a threshold of 800 we call process.exit(). This kills the node process and releases the defunct processes. AWS lambda then retries and a new node process is created.
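For anyone wanting to try the same workaround, here is a rough sketch of what it could look like. This is not the exact code from our function; the helper names countDefunct and recycleIfNeeded are illustrative, and it assumes ps is available in the lambda runtime (it was in ours, as the output above shows). The threshold of 800 is the value mentioned above and may need tuning for your process limit.

```javascript
const { execSync } = require('child_process');

// Threshold taken from the workaround described above; adjust as needed.
const DEFUNCT_THRESHOLD = 800;

// Count the lines of `ps` output that describe a defunct soffice.bin process.
function countDefunct(psOutput) {
  return psOutput
    .split('\n')
    .filter((line) => line.includes('[soffice.bin] <defunct>')).length;
}

// Call after each conversion. Once too many zombies have piled up, exit the
// node process so the defunct entries are released with it.
function recycleIfNeeded(threshold = DEFUNCT_THRESHOLD) {
  const output = execSync('ps xl', { encoding: 'utf8' });
  const defunct = countDefunct(output);
  if (defunct >= threshold) {
    console.warn(`Found ${defunct} defunct soffice.bin processes, exiting`);
    process.exit(1);
  }
  return defunct;
}
```

Calling recycleIfNeeded() at the end of the handler, after the converted file has been uploaded, avoids throwing away a conversion that has already succeeded. Note that exiting fails the current invocation, so this relies on the invocation being retried.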

Thanks for such a valuable input and sharing a possible workaround, Shea!
Looks like it's still a good idea to recompile a newer LO version. No promises, but I'll try to prioritize this task in my list and let you know.

Hey @sheakelly

We've just compiled a new LibreOffice version: https://github.com/shelfio/libreoffice-lambda-layer#version-arns

Would you mind giving it a spin?

I can confirm that the new layer solves this issue. No more defunct processes :)