I'll ask this with a Scala example, but it may well be that this affects other languages which allow hybrid imperative and functional styles.
Here's a short example (UPDATED, see below):
def method: Iterator[Int] = {
  // construct some large intermediate value
  val huge = (1 to 1000000).toList
  val small = List.fill(5)(scala.util.Random.nextInt())
  // accidentally capture huge in a function literal
  small.iterator filterNot ( huge contains _ )
}
Now iterator.filterNot works lazily, which is great! As a result, we'd expect that the returned iterator won't consume much memory (indeed, O(1)). Sadly, however, we've made a terrible mistake: since filterNot is lazy, it keeps a reference to the function literal huge contains _, and that literal in turn holds a reference to huge.
Thus, while we expected the method to need a large amount of memory only while it was running, with that memory freed as soon as the method returned, in fact the memory stays reachable until we drop the returned Iterator.
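To make the retention chain concrete, here is a rough hand-written equivalent of what the function literal amounts to. This is only a sketch: ContainsHuge and methodDesugared are hypothetical names invented for illustration, and the compiler's actual encoding of closures differs between Scala versions. The point is just that the predicate is an object with a field pointing at huge, and the lazy iterator holds on to that object.

```scala
// Hypothetical hand-written stand-in for the literal `huge contains _`.
// The constructor parameter is used inside apply, so it is kept as a field:
// the function object keeps `huge` reachable for as long as it lives.
final class ContainsHuge(huge: List[Int]) extends (Int => Boolean) {
  def apply(x: Int): Boolean = huge.contains(x)
}

def methodDesugared: Iterator[Int] = {
  val huge = (1 to 1000000).toList
  val small = List.fill(5)(scala.util.Random.nextInt())
  // The lazy iterator stores the predicate, and the predicate stores huge.
  small.iterator.filterNot(new ContainsHuge(huge))
}
```

So the chain is: returned iterator, then predicate object, then the million-element list.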
(I just made such a mistake, which took a long time to track down! You can catch such things by looking at heap dumps ...)
What are best practices for avoiding this problem?
It seems that the only solution is to carefully check for function literals that survive the end of the scope and that capture intermediate variables. This is a bit awkward if you're constructing a non-strict collection and planning on returning it. Can anyone think of some nice tricks, Scala-specific or otherwise, that avoid this problem and let me write nice code?
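For reference, one possible workaround (a sketch, not necessarily the best practice being asked for) is to confine huge to an inner block and do the filtering strictly there, so that nothing returned from the method can still reach it. The name methodStrict is made up for this illustration, and the approach gives up the laziness of the filter, which only seems acceptable here because small really is small.

```scala
def methodStrict: Iterator[Int] = {
  val small = List.fill(5)(scala.util.Random.nextInt())
  // Confine huge to an inner block and filter strictly, so the predicate
  // has already run (and can be discarded) before anything is returned.
  val kept: List[Int] = {
    val huge = (1 to 1000000).toList
    small.filterNot(x => huge.contains(x))
  }
  // Only the small, already-filtered list is reachable from the result.
  kept.iterator
}
```

This does not help when the collection being returned genuinely needs to stay lazy, which is the awkward case described above.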
UPDATE: the example I'd given previously was stupid, as huynhjl's answer below demonstrates. It had been:
def method: Iterator[Int] = {
  val huge = (1 to 1000000).toList // construct some large intermediate value
  val n = huge.last // do some calculation based on it
  (1 to n).iterator map (_ + 1) // return some small value
}
In fact, now that I understand a bit better how these things work, I'm not so worried!