Haskell Space Leak -
all.
while trying solve programming quiz: https://www.hackerrank.com/challenges/missing-numbers , came across space leak.
main function difference
, implements multi-set difference. i've found out list ':' , triples (,,) kept on heaps -ht option profiling. however, big lists difference
's 2 arguments, , shrinks difference
keeps on tail recursion. memory consumed lists keeps increasing program runs.
triples ephemeral array structure, used bookkeeping count of multiset's each element. memory consumed triples keeps increasing, , cannot find out why.
though i've browsed similar 'space leak' questions in stackoverflow, couldn't grasp idea. surely have study.
i appreciate comments. thank you.
p.s) executable compiled -o2 switch.
$ ./difference -ht < input04.txt stack space overflow: current size 8388608 bytes. $ ghc --version glorious glasgow haskell compilation system, version 7.6.3
.
import data.list import data.array -- array (non-zero-count, start-offset, array_data) array_size=101 myindex :: int -> int -> int myindex key offset | key >= offset = key - offset | otherwise = key - offset + array_size mylookup x (_,offset,arr) = arr ! idx idx = myindex x offset addorreplace :: int -> int -> (int, int, array int (int,int)) -> (int, int, array int (int,int)) addorreplace key value (count,offset,arr) = (count', offset, arr // [(idx,(key,value))]) idx = myindex key offset (_,prev_value) = arr ! idx count' = case (prev_value, value) of (0,0) -> count (0,_) -> count + 1 (_,0) -> count - 1 otherwise -> count difference :: (int,int,array int (int,int)) -> [int] -> [int] -> [int] difference (count,offset,arr) [] [] | count == 0 = [] | otherwise = [ k | x <- [0..array_size-1], let (k,v) = (arr ! x), v /= 0] difference m (x:xs) y = difference new_m xs y (_,v) = mylookup x m new_m = addorreplace x (v + 1) m difference m [] (y:ys) = difference new_m [] ys (_,v) = mylookup y m new_m = if v == 0 m else addorreplace y (v - 1) m main = n <- readln :: io int pp <- getline m <- readln :: io int qq <- getline let p = map (read :: string->int) . words $ pp q = map (read :: string->int) . words $ qq startarray = (0,head q, array (0,100) [(i,(0,0)) | <- [0..100]] ) putstrln . unwords . map show . sort $ difference startarray q p
[edit] seq'ed value , array carl's advice. attach heap diagram.
[original heap profiling] []1
[after seq'ing value v
]
difference m (x:xs) y = difference new_m xs y (_,v) = mylookup x m new_m = v `seq` addorreplace x (v + 1) m
[after seq'ing value v
, array
]
difference m (x:xs) y = new_m `seq` difference new_m xs y (_,v) = mylookup x m new_m = v `seq` addorreplace x (v + 1) m
i see 3 main problems code.
first (and not cause of memory use, cause of poor performance) array
horrible use case. o(1) lookups useless when updates o(n).
speaking of, values being stored in array
aren't forced while difference
looping on first input. thunks containing pointers unevaluated lookup in previous version of array. can ensure value evaluated @ same time array updated, in variety of ways. when difference
loops on second input, accidentally, in fact, comparing value against 0.
third, difference
doesn't force evaluation of new arrays being created while traversing first argument. nothing requires old array evaluated during portion of loop.
both of latter issues need resolved fix space leak. first issue doesn't cause space leak, higher overheads needed.
Comments
Post a Comment