Assume I have a function f
which accepts some input and produce a number. Within the function f
, a list is created according the input, which is then reduced (e.g. using foldl' g
) to produce the final output number. Because the intermediate list is to be reduced after all, is it possible to apply the reduce function g
without expressing the intermediate list. The goal here is to limit memory used for storing (or expressing, if 'storing' a less accurate word) the list.
To illustrate it, this function foldPairProduct
takes O(N1 * N2)
space for the intermediate list (the space consumed maybe more complicated due to expression and lazy evaluation, but I assume it is proportional or worse). Here N1, N2
are the size of the two input lists.
foldPairProduct :: (Num a, Ord a) => (a -> a -> a) -> [a] -> [a] -> a
foldPairProduct f xs ys = foldl1 f [ x*y | x <- xs, y <- ys]
An alternative implementation to the logic is foldPairProduct', which takes O(2 * 2)
space.
foldPairProduct' :: Num a => (Maybe a -> Maybe a -> Maybe a) -> [a] -> [a] -> Maybe a
foldPairProduct' _ _ [] = Nothing
foldPairProduct' _ [] _ = Nothing
foldPairProduct' f (x:xs) (y:ys) =
foldl1 f [Just $ x*y, foldPairProduct' f [x] ys, foldPairProduct' f xs [y],
foldPairProduct' f xs ys]
The situation is exacerbated for foldCrossProduct
whose implementation is similar to foldPairProduct
except that it accepts multiple lists as input. Its space complexity (still in imperative languages' sense) for the intermediate list is O(N1 * N2 * ...* Nk)
, where k
being the length of [[a]]
.
foldCrossProduct :: Num a => (a -> a -> a) -> [[a]] -> a
foldCrossProduct f xss = foldl1 f (crossProduct xss)
crossProduct :: Num a => [[a]] -> [a]
crossProduct [] = []
crossProduct (xs:[]) = xs
crossProduct (xs:xss) = [x * y | x <- xs, y <- crossProduct xss]
If we had followed the implementation idea of foldPairProduct'
, the space complexity would be k^2
, which is much more space efficient. My questions are:
I implemented
foldPairProduct'
for a pair of list. However, seems like implementing it for arbitrary number of list is not straightforward.I don't mean to compare Haskell to imperative language, but is there an implementation that would use constant space (or in another word, without expressing the intermediate list of the mentioned length)? Maybe Monad would help but I very new to it.
Does the compiler actually do its magic? That is, it notice the list is intermediate and to be reduced, and indeed figure out a way to evaluate it space-efficiently. After all, that is what I thought the lazy evaluation and compiler optimization is designed for.
Any comment is welcomed. Thanks.
Update 1
A performance test confirmed the analysis on 'space complexity' of foldPairProduct
and foldCrossProduct
, based on varying the input size N1, N2, N3
, and on observing the Bytes Copied by GC.
The performance test dis-proof the analysis on foldPairProduct'
which surprisingly showed N1 * N2
or even worse space usage. This is probably due to the recursive call is inefficiently evaluated. The results is attached below (with ghc settings same as what Yuras did).
Update 2
Updated some further experiment as I learn from the comments and answers. For foldPairProduct
, the total memory in use is consistent with the space complexity Daniel Fischer's explained.
For Following Daniel's advice, swapped the foldCrossProduct
, though Daniel's complexity analysis make sense to me, the results does not show a linear memory usage. x <- xs
and y <- crossproduct ys
, and it indeed achieve a linear space complexity.
For foldCrossProduct (max) [[1..100],[1..n], [1..1000]]
, with n = 100, 1000, 10000, 100000, the memory used are 2, 2, 3, 14 MB.
foldPairProduct [1..n] [1..10000]
n = 100
120,883,320 bytes allocated in the heap
56,867,728 bytes copied during GC
428,384 bytes maximum residency (50 sample(s))
98,664 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
n = 1000
1,200,999,280 bytes allocated in the heap
569,837,360 bytes copied during GC
428,384 bytes maximum residency (500 sample(s))
99,744 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
n = 10000
12,002,152,040 bytes allocated in the heap
5,699,468,024 bytes copied during GC
428,384 bytes maximum residency (5000 sample(s))
99,928 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
n = 100000
120,013,672,800 bytes allocated in the heap
56,997,625,608 bytes copied during GC
428,384 bytes maximum residency (50000 sample(s))
99,984 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
foldPairProduct [1..10000] [1..n]
n = 100
121,438,536 bytes allocated in the heap
55,920 bytes copied during GC
32,408 bytes maximum residency (1 sample(s))
19,856 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
n = 1000
1,201,511,296 bytes allocated in the heap
491,864 bytes copied during GC
68,392 bytes maximum residency (1 sample(s))
20,696 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
n = 10000
12,002,232,056 bytes allocated in the heap
5,712,004,584 bytes copied during GC
428,408 bytes maximum residency (5000 sample(s))
98,688 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
n = 100000
120,009,432,816 bytes allocated in the heap
81,694,557,064 bytes copied during GC
4,028,408 bytes maximum residency (10002 sample(s))
769,720 bytes maximum slop
14 MB total memory in use (0 MB lost due to fragmentation)
foldPairProduct [1..n] [1..n]
n = 100
1,284,024 bytes allocated in the heap
15,440 bytes copied during GC
32,336 bytes maximum residency (1 sample(s))
19,920 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
n = 1000
120,207,224 bytes allocated in the heap
114,848 bytes copied during GC
68,336 bytes maximum residency (1 sample(s))
24,832 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
n = 10000
12,001,432,024 bytes allocated in the heap
5,708,472,592 bytes copied during GC
428,336 bytes maximum residency (5000 sample(s))
99,960 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
n = 100000
1,200,013,672,824 bytes allocated in the heap
816,574,713,664 bytes copied during GC
4,028,336 bytes maximum residency (100002 sample(s))
770,264 bytes maximum slop
14 MB total memory in use (0 MB lost due to fragmentation)
foldCrossProduct (max) [[1..n], [1..100], [1..1000]]
n = 100
105,131,320 bytes allocated in the heap
38,697,432 bytes copied during GC
427,832 bytes maximum residency (34 sample(s))
209,312 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
n = 1000
1,041,254,480 bytes allocated in the heap
374,148,224 bytes copied during GC
427,832 bytes maximum residency (334 sample(s))
211,936 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
n = 10000
10,402,479,240 bytes allocated in the heap
3,728,429,728 bytes copied during GC
427,832 bytes maximum residency (3334 sample(s))
215,936 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
foldCrossProduct (max) [[1..100], [1..n], [1..1000]]
n = 100
105,131,344 bytes allocated in the heap
38,686,648 bytes copied during GC
431,408 bytes maximum residency (34 sample(s))
205,456 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
n = 1000
1,050,614,504 bytes allocated in the heap
412,084,688 bytes copied during GC
4,031,456 bytes maximum residency (53 sample(s))
1,403,976 bytes maximum slop
15 MB total memory in use (0 MB lost due to fragmentation)
n = 10000
quit after over 1362 MB total memory in use (0 MB lost due to fragmentation)
foldPairProduct' [1..n] [1..n]
n = 100
4,351,176 bytes allocated in the heap
59,432 bytes copied during GC
74,296 bytes maximum residency (1 sample(s))
21,320 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
n = 1000
527,009,960 bytes allocated in the heap
45,827,176 bytes copied during GC
211,680 bytes maximum residency (1 sample(s))
25,760 bytes maximum slop
2 MB total memory in use (0 MB lost due to fragmentation)