I have a Haskell program that generates ~280 MB of logging text data during a run inside the ST monad. This is where virtually all memory consumption goes (with logging disabled, the program allocates a grand total of 3 MB of real memory).
The problem is that I run out of memory. While the program runs, memory consumption exceeds 1.5 GB, and it finally runs out when it tries to write the log string to a file.
The log function takes a String and accumulates the log data into a string builder stored in an STRef in the environment:

    import qualified Data.ByteString.Lazy.Builder as BB
    ...
    myLogFunction s = do
        ...
        lift $ modifySTRef myStringBuilderRef (<> BB.stringUtf8 s)

I tried introducing strictness using bang patterns and modifySTRef', but this made memory consumption even worse.
I write the log string as recommended by the hPutBuilder documentation, like this:

    hSetBinaryMode h True
    hSetBuffering h $ BlockBuffering Nothing
    BB.hPutBuilder h trace

This consumes several additional GBs of memory. I tried different buffering settings and converting to a lazy ByteString first (slightly better).
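For reference, the write step above can be made self-contained roughly as follows. This is only a sketch: `writeTrace` is a hypothetical name, `withBinaryFile` stands in for the separate `hSetBinaryMode` call (it opens the handle in binary mode), and the module is imported under its newer name `Data.ByteString.Builder`. It reproduces the setup described above; it is not a fix for the memory use.

```haskell
import qualified Data.ByteString.Builder as BB
import System.IO (BufferMode (..), IOMode (..), hSetBuffering, withBinaryFile)

-- Hypothetical helper: write an accumulated Builder to a file.
-- withBinaryFile opens the handle in binary mode and closes it afterwards.
writeTrace :: FilePath -> BB.Builder -> IO ()
writeTrace path trace =
  withBinaryFile path WriteMode $ \h -> do
    -- Block buffering with the default buffer size, as in the snippet above.
    hSetBuffering h (BlockBuffering Nothing)
    BB.hPutBuilder h trace
```

A call such as `writeTrace "run.log" trace` (the file name is made up) would then replace the three lines above.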
Questions:
1. How can I minimize memory consumption while the program runs? I'd expect that, given a tight ByteString representation and the appropriate amount of strictness, I'd need little more memory than the ~280 MB of actual log data I'm storing.
2. How can I write the result to a file without allocating memory? I don't understand why Haskell needs GBs of memory to just stream some resident data into a file.
Edit:
Here's the memory profile for a small run (~42MB of log data). The total memory use is 3MB with logging disabled.

      15,632,058,700 bytes allocated in the heap
       4,168,127,708 bytes copied during GC
         343,530,916 bytes maximum residency (42 sample(s))
           7,149,352 bytes maximum slop
                 931 MB total memory in use (0 MB lost due to fragmentation)

                                        Tot time (elapsed)  Avg pause  Max pause
      Gen  0     29975 colls,     0 par    5.96s    6.15s     0.0002s    0.0104s
      Gen  1        42 colls,     0 par    6.01s    7.16s     0.1705s    1.5604s

      TASKS: 3 (1 bound, 2 peak workers (2 total), using -N1)

      SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

      INIT    time    0.00s  (  0.00s elapsed)
      MUT     time   32.38s  ( 33.87s elapsed)
      GC      time   11.97s  ( 13.31s elapsed)
      RP      time    0.00s  (  0.00s elapsed)
      PROF    time    0.00s  (  0.00s elapsed)
      EXIT    time    0.00s  (  0.00s elapsed)
      Total   time   44.35s  ( 47.18s elapsed)

      Alloc rate    482,749,347 bytes per MUT second

      Productivity  73.0% of total user, 68.6% of total elapsed

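To put these numbers in proportion, here is a small back-of-the-envelope calculation. It is only a sketch: the figures are copied from the statistics above, the 42 MB log size is the approximate value stated earlier (taken here as 42e6 bytes), and all names are made up for illustration.

```haskell
-- Approximate size of the log data for this run ("~42MB"), in bytes.
logBytes :: Double
logBytes = 42e6

-- "bytes maximum residency" from the statistics above.
maxResidencyBytes :: Double
maxResidencyBytes = 343530916

-- MUT, GC and Total times (user seconds) from the statistics above.
mutSeconds, gcSeconds, totalSeconds :: Double
mutSeconds = 32.38
gcSeconds = 11.97
totalSeconds = 44.35

-- Peak live heap per byte of log data.
residencyPerLogByte :: Double
residencyPerLogByte = maxResidencyBytes / logBytes

-- Fractions of total run time spent in the mutator and in the collector.
mutShare, gcShare :: Double
mutShare = mutSeconds / totalSeconds
gcShare = gcSeconds / totalSeconds
```

Under these assumptions the peak live heap is roughly 8 bytes per byte of log data, and `mutShare` comes out at about 0.73, which agrees with the reported productivity of 73.0% of total user time.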
Edit:
I ran a memory profile with a small log run as asked:
profile http://imageshack.us/a/img14/9778/6a5o.png
I tried adding bang patterns, $!, deepseq/$!!, force and such in the relevant places, but it doesn't seem to make any difference. How do I force Haskell to actually take my string / printf expression etc. and put it in a tight ByteString instead of keeping all those [Char] lists and unevaluated thunks around?
Edit:
Here's the actual full trace function

    trace s = do
        enable <- asks envTraceEnable
        when enable $ do
            envtrace <- asks envTrace
            let b = B8.pack s
            lift $ b `seq` modifySTRef' envtrace (<> BB.byteString b)

Is this 'strict' enough? Do I need to watch out for anything when I call this typeclass function inside my ReaderT/ST monad, so that it is actually executed and not deferred in any way? For example, is

    do
        trace $ printf "%i" myint

fine?
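For anyone who wants to reproduce this, the trace function above can be embedded in a self-contained module roughly as follows. This is a sketch under assumptions: the `Env` record, the `M` type synonym and `runWithTrace` are hypothetical scaffolding (only the field names `envTraceEnable` and `envTrace` come from my code), and the Builder module is imported under its newer name. Note that `B8.pack` truncates characters above 255, whereas `stringUtf8` encodes them as UTF-8, so the two are only interchangeable for Latin-1 log text.

```haskell
{-# LANGUAGE RankNTypes #-}

import Control.Monad (when)
import Control.Monad.ST (ST, runST)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Char8 as B8
import qualified Data.ByteString.Lazy as BL
import Data.STRef (STRef, modifySTRef', newSTRef, readSTRef)

-- Hypothetical environment; only the field names come from the question.
data Env s = Env
  { envTraceEnable :: Bool
  , envTrace :: STRef s BB.Builder
  }

type M s = ReaderT (Env s) (ST s)

-- Same body as the trace function above, with a type signature added.
trace :: String -> M s ()
trace s = do
  enable <- asks envTraceEnable
  when enable $ do
    ref <- asks envTrace
    let b = B8.pack s -- strict ByteString; truncates chars above 255
    -- Evaluate the packed ByteString before appending it to the Builder.
    lift $ b `seq` modifySTRef' ref (<> BB.byteString b)

-- Hypothetical runner: executes an action and returns the collected log.
runWithTrace :: Bool -> (forall s. M s a) -> (a, BL.ByteString)
runWithTrace enable action = runST $ do
  ref <- newSTRef mempty
  a <- runReaderT action (Env enable ref)
  out <- readSTRef ref
  pure (a, BB.toLazyByteString out)
```

With this scaffolding, `runWithTrace True (trace "a" >> trace "b")` returns the unit result together with the two-byte log.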
Thanks!
stringUtf8, then my suspicion is that the resulting Builder holds a large number of references to String, and that's where the memory goes. – Unconstitutional
byteString instead of stringUtf8 (i.e. doing String -> ByteString conversion before mappending to a Builder). – Unconstitutional
hp2ps and generated with the -hy flag. You'll almost certainly find that you're not releasing any memory at all until the end - your builder is holding on to entire object graphs, not just the ByteStrings. But as @MikhailGlushenkov pointed out, we need more data - and what you've provided isn't it. – Weathering
Builder representation, by the way. Forcing a Builder does nothing. Forcing the ByteString being added to it is vital if the ByteString has an object graph in its construction. – Weathering
Data.ByteString.Lazy.Builder was renamed to Data.ByteString.Builder in bytestring-0.10.2.0. – Unconstitutional
Strings, as I suspected. I agree with @Weathering that this looks like excessive laziness. BTW, you can generate color output with hp2ps -c. – Unconstitutional