Space leak when executing multiple `Text.XML.Stream.Parse.parseBytes` conduits
mbid opened this issue · 0 comments
I've encountered a weird space leak using `Text.XML.Stream.Parse`. I believe a reasonably minimal example is this:
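    -- Imports are assumed; the report does not show them.
    import Conduit
    import Text.XML.Stream.Parse (def)
    import qualified Text.XML.Stream.Parse as Text.XML.Stream

    doTwice :: Applicative f => f () -> f ()
    doTwice x = x *> x

    leakSpace :: IO ()
    leakSpace =
      runResourceT $ runConduit $
        doTwice (sourceFile "large-file.xml" .| Text.XML.Stream.parseBytes def) .|
        sinkNull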
If `large-file.xml` is large enough, this crashes with OOM, even before the second iteration over the file.
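To make the role of `doTwice` concrete: for any `Applicative`, `x *> x` performs the effects of `x` twice in sequence, so in the pipeline above the file is meant to be read and parsed twice, back to back. The sketch below uses only `base`; `countRuns` is a hypothetical helper that replaces the conduit with an `IORef` counter. It illustrates the sequencing only and does not reproduce the leak.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

doTwice :: Applicative f => f () -> f ()
doTwice x = x *> x

-- Hypothetical helper: counts how often the effect of the action is performed.
countRuns :: IO Int
countRuns = do
  counter <- newIORef (0 :: Int)
  doTwice (modifyIORef' counter (+ 1))
  readIORef counter

main :: IO ()
main = countRuns >>= print  -- prints 2
```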
If the line

    doTwice (sourceFile "large-file.xml" .| Text.XML.Stream.parseBytes def) .|

is replaced with either

    sourceFile "large-file.xml" .| Text.XML.Stream.parseBytes def .|

(i.e. only parsing the file once) or

    doTwice (sourceFile "large-file.xml.gz" .| ungzip) .|

(i.e. not parsing at all, just connecting `sourceFile` to something else), everything works as expected. This makes me think that the cause of the issue is somewhere in `parseBytes`.
I've encountered this issue when trying to combine multiple conduits with `for_`, i.e. something like

    for_ files $ \file -> sourceFile file .| Text.XML.Stream.parseBytes .| ...

and then I triggered the issue even when `files` was a singleton list. I've not been able to reproduce this with constant singleton lists though, perhaps because of optimization. If `files` was a dynamic `Maybe` instead, the issue did not occur.
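For reference, `for_` from `Data.Foldable` works over any `Foldable`, so a singleton list and a `Just` both run the body exactly once. In the sketch below, `processFile` and `visited` are hypothetical stand-ins that only record the path instead of running a conduit pipeline. It shows that the two variants are equivalent in what they execute, which is why the difference in memory behaviour is surprising; it does not reproduce the leak.

```haskell
import Data.Foldable (for_)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for the per-file pipeline: it only records the path.
processFile :: IORef [FilePath] -> FilePath -> IO ()
processFile seen file = modifyIORef' seen (file :)

-- Runs the stand-in over any Foldable container and returns the visited paths in order.
visited :: Foldable t => t FilePath -> IO [FilePath]
visited files = do
  seen <- newIORef []
  for_ files (processFile seen)
  reverse <$> readIORef seen

main :: IO ()
main = do
  viaList  <- visited ["large-file.xml"]
  viaMaybe <- visited (Just "large-file.xml")
  print (viaList == viaMaybe)  -- prints True
```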
I'm using Stackage's lts-12.5, i.e. xml-conduit-1.8.0.