Haskell Space Leak -


all.

while trying solve programming quiz: https://www.hackerrank.com/challenges/missing-numbers , came across space leak.

main function difference, implements multi-set difference. i've found out list ':' , triples (,,) kept on heaps -ht option profiling. however, big lists difference's 2 arguments, , shrinks difference keeps on tail recursion. memory consumed lists keeps increasing program runs.

triples ephemeral array structure, used bookkeeping count of multiset's each element. memory consumed triples keeps increasing, , cannot find out why.

though i've browsed similar 'space leak' questions in stackoverflow, couldn't grasp idea. surely have study.

i appreciate comments. thank you.

p.s) executable compiled -o2 switch.

$ ./difference -ht < input04.txt stack space overflow: current size 8388608 bytes. $ ghc --version glorious glasgow haskell compilation system, version 7.6.3 

.

import data.list import data.array  -- array (non-zero-count, start-offset, array_data) array_size=101  myindex :: int -> int -> int myindex key offset     | key >= offset = key - offset     | otherwise     = key - offset + array_size  mylookup x (_,offset,arr) = arr ! idx     idx = myindex x offset  addorreplace :: int -> int -> (int, int, array int (int,int)) -> (int, int, array int (int,int)) addorreplace key value (count,offset,arr) = (count', offset, arr // [(idx,(key,value))])     idx = myindex key offset           (_,prev_value) = arr ! idx           count' = case (prev_value, value) of                       (0,0) -> count                       (0,_) -> count + 1                       (_,0) -> count - 1                       otherwise -> count  difference :: (int,int,array int (int,int)) -> [int] -> [int] -> [int] difference (count,offset,arr) [] []     | count == 0 = []     | otherwise  = [ k | x <- [0..array_size-1], let (k,v) = (arr ! x), v /= 0] difference m (x:xs) y = difference new_m xs y     (_,v) = mylookup x m           new_m = addorreplace x (v + 1) m difference m [] (y:ys) = difference new_m [] ys     (_,v) = mylookup y m           new_m = if v == 0             m             else addorreplace y (v - 1) m  main =     n <- readln :: io int     pp <- getline     m <- readln :: io int     qq <- getline     let p = map (read :: string->int) . words $ pp         q = map (read :: string->int) . words $ qq         startarray = (0,head q, array (0,100) [(i,(0,0)) | <- [0..100]] )     putstrln . unwords . map show . sort $ difference startarray q p 

[edit] seq'ed value , array carl's advice. attach heap diagram.

[original heap profiling] [original heap]1

[after seq'ing value v]

difference m (x:xs) y = difference new_m xs y     (_,v) = mylookup x m           new_m = v `seq` addorreplace x (v + 1) m 

heap after seq'ing v

[after seq'ing value v , array]

difference m (x:xs) y = new_m `seq` difference new_m xs y     (_,v) = mylookup x m           new_m = v `seq` addorreplace x (v + 1) m 

heap after seq'ing v , array

i see 3 main problems code.

first (and not cause of memory use, cause of poor performance) array horrible use case. o(1) lookups useless when updates o(n).

speaking of, values being stored in array aren't forced while difference looping on first input. thunks containing pointers unevaluated lookup in previous version of array. can ensure value evaluated @ same time array updated, in variety of ways. when difference loops on second input, accidentally, in fact, comparing value against 0.

third, difference doesn't force evaluation of new arrays being created while traversing first argument. nothing requires old array evaluated during portion of loop.

both of latter issues need resolved fix space leak. first issue doesn't cause space leak, higher overheads needed.


Comments

Popular posts from this blog

sql - VB.NET Operand type clash: date is incompatible with int error -

SVG stroke-linecap doesn't work for circles in Firefox? -

python - TypeError: Scalar value for argument 'color' is not numeric in openCV -