Python process gets killed when trying to export a large buffer
The VM has around 500 MB of RAM and 3 GB of free disk space.
I went into the psql prompt to check how many lines the buffer has.
psql query:
SELECT buffer.buffername, COUNT(buffer.buffername) AS counts
FROM buffer
INNER JOIN backlog ON buffer.bufferid = backlog.bufferid
GROUP BY buffer.buffername
ORDER BY counts
number of lines:
#ubuntu | 5505647
I execute ./quasselgrep -N 'Freenode' -b '#ubuntu' > ubuntu.txt
which gets killed. dmesg -T | grep process shows:
[Fri Jan 12 15:04:48 2018] [<ffffffff81151eae>] oom_kill_process+0x1ce/0x330
[Fri Jan 12 15:04:48 2018] Out of memory: Kill process 32263 (python) score 536 or sacrifice child
[Fri Jan 12 15:04:48 2018] Killed process 32263 (python) total-vm:948560kB, anon-rss:380396kB, file-rss:824kB
Here is the problem: quasselgrep/quasselgrep/query.py, line 301 in commit 502c88b.
For non-context queries it's probably an easy fix.
This should be fixed now. Could you check that your use-case works?
Still happens.
quasselgrep -N 'network' -b 'targetbuffer' > test.txt
returns 700k lines.
If I add the -i switch:
quasselgrep -N 'network' -b 'targetbuffer' -i > test.txt
it gets killed:
[Mon Jun 18 12:31:30 2018] [<ffffffff81151eae>] oom_kill_process+0x1ce/0x330
[Mon Jun 18 12:31:30 2018] Out of memory: Kill process 3813 (quasselgrep) score 483 or sacrifice child
[Mon Jun 18 12:31:30 2018] Killed process 3813 (quasselgrep) total-vm:1135348kB, anon-rss:670992kB, file-rss:0kB
The query below returns more than 5 million lines, which I am guessing is the amount that would be returned if I add the -i switch.
COPY (
SELECT back.time,sender.sender,back.message
FROM backlog AS back JOIN buffer AS buff ON buff.bufferid=back.bufferid
JOIN sender ON sender.senderid=back.senderid
WHERE buffername='#targetbuffer'
ORDER BY back.time ASC
) TO '/tmp/something.txt';
I have now finally managed to reproduce this! It somehow managed to kill my quasselcore at the same time as the python process, so that's annoying. I also think I know the cause now - turns out iterating over a postgres database cursor actually just slurps all the rows initially by default. Shouldn't be too hard to fix!
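The general pattern behind the fix — fetching rows in bounded batches instead of slurping the whole result set into memory — can be sketched as below. For a self-contained illustration this uses the standard library's sqlite3 with fetchmany(); with Postgres and psycopg2 (which is what quasselgrep talks to here) the equivalent is a named, server-side cursor created via conn.cursor(name="..."), since a plain client-side psycopg2 cursor fetches the entire result set up front. The table layout and helper name are illustrative only, not quasselgrep's actual code.

```python
import io
import sqlite3

def export_rows(conn, query, params, out, batch_size=1000):
    """Stream query results to `out` in fixed-size batches.

    Memory use is bounded by batch_size rows at a time. With psycopg2
    the same effect comes from a named, server-side cursor
    (conn.cursor(name="export")), optionally tuning cursor.itersize.
    """
    cur = conn.cursor()
    cur.execute(query, params)
    while True:
        rows = cur.fetchmany(batch_size)  # at most batch_size rows in memory
        if not rows:
            break
        for row in rows:
            out.write("\t".join(str(col) for col in row) + "\n")
    cur.close()

# Self-contained demo against an in-memory database with a toy schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE backlog (time TEXT, sender TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO backlog VALUES (?, ?, ?)",
    [("2018-01-12", "alice", f"msg {i}") for i in range(5)],
)

buf = io.StringIO()
export_rows(conn, "SELECT * FROM backlog WHERE sender = ?", ("alice",), buf,
            batch_size=2)
print(len(buf.getvalue().splitlines()))  # 5
```

The key point is that the per-batch loop replaces a single fetchall() (or a default psycopg2 cursor iteration), so a 5-million-row buffer no longer has to fit in the VM's 500 MB of RAM at once.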
OK @pitastrudl it should be fixed in master, can you test hopefully for the last time?
@pitastrudl did you ever see this again?
Hi @fish-face, sadly I haven't, but it's on my to-do list to test again. Since I posted this I've come to learn more about Postgres and Python, so maybe I'll be able to help more! Quassel also just put out a new release, so it will be more interesting to test. I'll try it out and let you know.
Happy new year!