Memory consumption depending on type annotation
Hi!
While experimenting with Haxl for our MSc thesis we ran into a weird memory issue that we can only reproduce with a combination of Haxl and stack-run. We cannot reproduce it with Haxl alone (without stack-run) or with stack-run alone (no Haxl involved), and we are not sure exactly where the problem lies.
A strict foldl' eats lots of memory depending on whether there is a type annotation for the result or not. See this gist for a minimal code example. The issue persists across different machines.
Hey @m0ar, thanks for reporting this. I was able to repro the memory usage issue and this is what I got.
With Type Signature:
6,400,166,472 bytes allocated in the heap
7,212,144 bytes copied during GC
63,872 bytes maximum residency (2 sample(s))
22,144 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Without Type Signature:
15,200,165,928 bytes allocated in the heap
18,804,184 bytes copied during GC
79,752 bytes maximum residency (2 sample(s))
18,048 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
So the runtime allocated about 2.4x as much memory on the heap without the type signature (15.2 GB versus 6.4 GB), although maximum residency is nearly the same in both runs. I took a look at the Core generated by the compiler and this is what I found:
@@ -306,25 +306,45 @@
@ ()
Data.Foldable.$fFoldable[]
GHC.Base.$fMonadIO
- (\ (ds5_dd51 :: BlockedFetch Heavy) ->
- case ds5_dd51
- of _ [Occ=Dead] { BlockedFetch @ a_ad2d req_a2Oi var_a2Oj ->
+ (\ (ds5_ddcy :: BlockedFetch Heavy) ->
+ case ds5_ddcy
+ of _ [Occ=Dead] { BlockedFetch @ a_ad9J req_a2Oi var_a2Oj ->
case req_a2Oi of _ [Occ=Dead] { Mock cobox0_acPy ->
- case foldl'
- @ []
- Data.Foldable.$fFoldable[]
- @ Integer
- @ Integer
- (+ @ Integer GHC.Num.$fNumInteger)
- 0
- (enumFromTo @ Integer GHC.Enum.$fEnumInteger 1 100000000)
- of n_a2do { __DEFAULT ->
putSuccess
- @ a_ad2d
+ @ a_ad9J
var_a2Oj
- (n_a2do
- `cast` (Sub (Sym cobox0_acPy) :: (Integer :: *) ~R# (a_ad2d :: *)))
- }
+ (foldl'
+ @ []
+ Data.Foldable.$fFoldable[]
+ @ a_ad9J
+ @ a_ad9J
+ (+ @ a_ad9J
+ (GHC.Num.$fNumInteger
+ `cast` ((Num (Sym cobox0_acPy))_R
+ :: (Num Integer :: Constraint) ~R# (Num a_ad9J :: Constraint))))
+ (fromInteger
+ @ a_ad9J
+ (GHC.Num.$fNumInteger
+ `cast` ((Num (Sym cobox0_acPy))_R
+ :: (Num Integer :: Constraint) ~R# (Num a_ad9J :: Constraint)))
+ 0)
+ (enumFromTo
+ @ a_ad9J
+ (GHC.Enum.$fEnumInteger
+ `cast` ((Enum (Sym cobox0_acPy))_R
+ :: (Enum Integer :: Constraint) ~R# (Enum a_ad9J :: Constraint)))
+ (fromInteger
+ @ a_ad9J
+ (GHC.Num.$fNumInteger
+ `cast` ((Num (Sym cobox0_acPy))_R
+ :: (Num Integer :: Constraint) ~R# (Num a_ad9J :: Constraint)))
+ 1)
+ (fromInteger
+ @ a_ad9J
+ (GHC.Num.$fNumInteger
+ `cast` ((Num (Sym cobox0_acPy))_R
+ :: (Num Integer :: Constraint) ~R# (Num a_ad9J :: Constraint)))
+ 100000000)))
}
})
reqs_a2Oh)
It looks like when the type signature is included, the entire foldl' expression gets specialized at Integer. The unspecialized version has to pass the Num type class dictionary around, making the code less performant.
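To illustrate the general difference between dictionary-passing and specialized code, here is a minimal sketch. The names sumPoly and sumInteger are made up for this example, and it is not the exact situation from the issue (there the dictionary is obtained by casting through the GADT coercion). Whether GHC actually leaves sumPoly unspecialized is an assumption that should be checked with -ddump-simpl.

```haskell
import Data.List (foldl')

-- Overloaded version (hypothetical name): callers hand it Num and Enum
-- dictionaries, and (+) is looked up through the dictionary at runtime.
-- NOINLINE is used here in the hope of keeping GHC from inlining or
-- specialising it; verify with -ddump-simpl.
{-# NOINLINE sumPoly #-}
sumPoly :: (Num a, Enum a) => a -> a
sumPoly n = foldl' (+) 0 [1 .. n]

-- Monomorphic version (hypothetical name): the element type is fixed to
-- Integer, so GHC can call the Integer operations directly.
sumInteger :: Integer -> Integer
sumInteger n = foldl' (+) 0 [1 .. n]

main :: IO ()
main = do
  print (sumPoly (1000 :: Integer))  -- prints 500500
  print (sumInteger 1000)            -- prints 500500
```

Both functions compute the same result; compiling with -O2 and comparing the +RTS -s output for larger inputs is one way to see whether the two versions allocate differently on a given GHC version.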
The underlying issue here seems to be in GHC, not the Haxl framework. I'm not super familiar with GHC internals, so I don't know if this is a feature or a bug. Either way, it may be worth bringing up to the maintainers of GHC. As for your specific code, I would recommend just leaving the type signature in there :)
I hope this helps!
Here's a simpler example without using Haxl:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE BangPatterns #-}
import Data.IORef
import Data.List
data Heavy a where
  Mock :: Heavy Integer

{-# NOINLINE runHeavy #-}
runHeavy :: Heavy a -> IORef a -> IO ()
runHeavy Mock var = do
  -- This runs as expected:
  -- let !n = foldl' (+) 0 [1..100000000] :: Integer
  -- Without the type annotation it eats RAM like crazy
  let !n = foldl' (+) 0 [1..100000000]
  writeIORef var n

main = do
  ref <- newIORef 0
  runHeavy Mock ref
  readIORef ref >>= print
The version with the type signature allocates less, but it actually runs slightly slower:
With type signature:
3,200,051,784 bytes allocated in the heap
Total time 2.328s ( 2.350s elapsed)
Without type signature:
8,800,051,816 bytes allocated in the heap
Total time 2.148s ( 2.167s elapsed)
But looking at the generated Core, the issue is that with the type signature the list is fused away, whereas without the type signature there's no fusion and the [1..100000000] is lifted to the top level. My guess is that this is just something related to the order of optimisations. In practice you won't be using a large constant list like this in your code anyway.
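For readers unfamiliar with fusion: when the list is fused away, the fold behaves roughly like a hand-written accumulator loop that never builds list cells. The sketch below shows such a loop; sumTo is a hypothetical name, and this is an approximation of the idea, not the Core GHC actually produces.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hand-written counterpart of foldl' (+) 0 [1..n] at Integer
-- (hypothetical helper). It does not depend on list fusion firing,
-- because no intermediate list is constructed in the first place.
sumTo :: Integer -> Integer
sumTo n = go 0 1
  where
    -- The bang keeps the accumulator evaluated on every step.
    go !acc i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)

main :: IO ()
main = print (sumTo 100000)  -- prints 5000050000
```

Writing the loop explicitly is one way to make the allocation behaviour independent of whether the optimiser fuses the list, at the cost of less declarative code.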
As @zilberstein said, this isn't something specific to Haxl. Feel free to raise a ticket over on the GHC Trac: https://ghc.haskell.org/trac/ghc/