Large Arrays, and LOH Fragmentation. What is the accepted convention?
I have another active question HERE regarding some hopeless memory issues that possibly involve LOH Fragmentation among possibly other unknowns.

What my question now is, what is the accepted way of doing things? If my app needs to be done in Visual C#, and needs to deal with large arrays to the tune of int[4000000], how can I not be doomed by the garbage collector's refusal to deal with the LOH?

It would seem that I am forced to make any large arrays global, and never use the word "new" around any of them. So, I'm left with ungraceful global arrays with "maxindex" variables instead of neatly sized arrays that get passed around by functions.

I've always been told that this was bad practice. What alternative is there?

Is there some kind of function to the tune of System.GC.CollectLOH("Seriously")? Is there possibly some way to outsource garbage collection to something other than System.GC?

Anyway, what are the generally accepted rules for dealing with large (>85 KB) variables?

Sandbank answered 8/3, 2010 at 15:15 Comment(0)
Firstly, the garbage collector does collect the LOH, so do not be immediately scared by its presence. The LOH gets collected when generation 2 gets collected.

The difference is that the LOH does not get compacted, which means that if you have an object in there that has a long lifetime then you will effectively be splitting the LOH into two sections — the area before and the area after this object. If this behaviour continues to happen then you could end up with the situation where the space between long-lived objects is not sufficiently large for subsequent assignments and .NET has to allocate more and more memory in order to place your large objects, i.e. the LOH gets fragmented.

Now, having said that, the LOH can shrink in size if the area at its end is completely free of live objects, so the only problem is if you leave objects in there for a long time (e.g. the duration of the application).

Starting with .NET Framework 4.5.1, the LOH can be compacted; see the GCSettings.LargeObjectHeapCompactionMode property.

Strategies to avoid LOH fragmentation are:

  • Avoid creating large objects that hang around. Basically this just means large arrays, or objects which wrap large arrays (such as the MemoryStream which wraps a byte array), as nothing else is that big (components of complex objects are stored separately on the heap so are rarely very big). Also watch out for large dictionaries and lists as these use an array internally.
  • Watch out for double arrays — the threshold for these going into the LOH is much, much smaller — I can't remember the exact figure but it's only a few thousand.
  • If you need a MemoryStream, consider making a chunked version that backs onto a number of smaller arrays rather than one huge array. You could also make custom versions of IList and IDictionary that use chunking to avoid stuff ending up in the LOH in the first place.
  • Avoid very long Remoting calls, as Remoting makes heavy use of MemoryStreams which can fragment the LOH during the length of the call.
  • Watch out for string interning — for some reason these are stored as pages on the LOH and can cause serious fragmentation if your application continues to encounter new strings to intern, i.e. avoid using string.Intern unless the set of strings is known to be finite and the full set is encountered early on in the application's life. (See my earlier question.)
  • Use Son of Strike to see what exactly is using the LOH memory. Again see this question for details on how to do this.
  • Consider pooling large arrays.
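For the Son of Strike step, a rough sketch of the typical commands (run under WinDbg or CDB; the exact module name to load SOS from varies by runtime version, so treat these as a starting point):

```
$$ Load SOS for the running CLR (.NET 4+; on .NET 2.x use: .loadby sos mscorwks)
.loadby sos clr
$$ Show GC heap segments, including the LOH segment sizes and free space
!eeheap -gc
$$ Summarize objects at or above the LOH threshold by type
!dumpheap -stat -min 85000
```

The `!eeheap -gc` output in particular shows how much of each LOH segment is free, which is the quickest way to spot fragmentation.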

Edit: the LOH threshold for double arrays appears to be 8k.
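As a quick empirical check (a sketch, not part of the answer above): on current runtimes, objects allocated on the LOH report as generation 2 immediately after allocation, so GC.GetGeneration tells you where an array landed.

```csharp
using System;

class Program
{
    static void Main()
    {
        // A freshly allocated small object starts in generation 0;
        // large objects go straight to the LOH, which reports as gen 2.
        var large = new int[4000000];   // ~16 MB -> large object heap
        var small = new int[1000];      // 4 KB   -> small object heap

        Console.WriteLine(GC.GetGeneration(large)); // 2
        Console.WriteLine(GC.GetGeneration(small)); // 0
    }
}
```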

Outspeak answered 8/3, 2010 at 15:59 Comment(4)
This is most likely a bad question... but how can I get CDB and SOS? Google doesn't seem to know. All links I follow to "download CDB" point me to some MSDN search result page listing "DDK3" among other things that don't sound like CDB.Sandbank
Download 'Debugging Tools for Windows'. CDB is the console debugger and WinDbg is the 'graphical' equivalent (it's still text-mode but rendered in an MDI window). SOS comes with .NET: '%systemroot%\microsoft.net\framework\v2.0.50727\sos.dll' (.NET 3 uses the .NET 2 SOS.dll, .NET 4 comes with its own version.)Outspeak
Also, you can apparently load SOS from the Immediate window in Visual Studio (avoiding need for CDB), but I have never tried.Outspeak
Well... maybe LOH Fragmentation isn't my problem after all. I don't see the word "free" next to anything...Sandbank
It's an old question, but I figure it doesn't hurt to update answers with changes introduced in .NET. It is now possible to defragment the Large Object Heap. Clearly the first choice should be to make sure the best design choices were made, but it is nice to have this option now.

https://msdn.microsoft.com/en-us/library/xe0c2357(v=vs.110).aspx

"Starting with the .NET Framework 4.5.1, you can compact the large object heap (LOH) by setting the GCSettings.LargeObjectHeapCompactionMode property to GCLargeObjectHeapCompactionMode.CompactOnce before calling the Collect method, as the following example illustrates."

GCSettings can be found in the System.Runtime namespace

GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(); 
Sides answered 8/4, 2015 at 16:20 Comment(1)
This is very useful - our applications call it (and an induced GC) after they do their large 'cache reloads' which enables .NET GC to reclaim all the old long-lived cached objects and compact all the new caches into the LOH. Most of the caches are large and created up-front / infrequently. However, dynamic caches that might spill over to LOH don't benefit much as it's complicated to know when a "good time" is to request .NET to compact the LOH.Bengt
The first thing that comes to mind is to split the array up into smaller ones, so they stay below the size at which the GC puts them on the LOH. You could split the arrays into smaller ones of, say, 10,000 elements, and build an object which would know which array to look in based on the indexer you pass.
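A minimal sketch of that indexer wrapper (the ChunkedIntArray name and chunk size are illustrative, not from the answer; 10,000 ints is 40 KB, safely under the 85,000-byte LOH threshold):

```csharp
using System;

// A large "virtual" int array backed by sub-arrays small enough
// to stay off the LOH.
class ChunkedIntArray
{
    private const int ChunkSize = 10000;   // 40 KB per chunk
    private readonly int[][] chunks;

    public ChunkedIntArray(int length)
    {
        Length = length;
        chunks = new int[(length + ChunkSize - 1) / ChunkSize][];
        for (int i = 0; i < chunks.Length; i++)
            chunks[i] = new int[ChunkSize];
    }

    public int Length { get; }

    // Map a flat index onto (chunk, offset).
    public int this[int index]
    {
        get => chunks[index / ChunkSize][index % ChunkSize];
        set => chunks[index / ChunkSize][index % ChunkSize] = value;
    }
}

class Demo
{
    static void Main()
    {
        var big = new ChunkedIntArray(4000000);
        big[3999999] = 42;
        Console.WriteLine(big[3999999]); // 42
    }
}
```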

Now I haven't seen the code, but I would also question why you need an array that large. I would potentially look at refactoring the code so all of that information doesn't need to be stored in memory at once.

Giule answered 8/3, 2010 at 15:18 Comment(2)
My application takes in images of potential size 1024x720, converts that to a matrix of "height" based on pixel intensity. Then it takes that matrix and renders a surface map in OpenGL. So, I really need all that data at the same time.Sandbank
FWIW: 10k elements is a 'good threshold value' because 10k * 8 bytes (64-bit object refs) < 85k LOH threshold. Arrays of double are moved to the LOH using separate calculations.Bengt
You have it wrong. You do NOT need to have an array of size 4000000 and you definitely do not need to call the garbage collector.

  • Write your own IList implementation. Like "PagedList"
  • Store items in arrays of 65536 elements.
  • Create an array of arrays to hold the pages.

This allows you to access basically all your elements with ONE redirection only. And, as the individual arrays are smaller, fragmentation is not an issue...

...if it is... then REUSE pages. Don't throw them away on dispose; put them on a static "PageList" and pull them from there first. All this can be transparently done within your class.

The really good thing is that this List is pretty dynamic in its memory usage. You may want to resize the holder array (the redirector). Even when not, it is only about 512 KB of data per page.

Second-level arrays hold basically 64k elements each: that is 8 bytes per element for class references (512 KB per page, 256 KB on 32-bit), or 64 KB per page for a one-byte struct.

Technically:

Turn int[] into int[][]

Decide whether 32 or 64 bit is better as you want ;) Both have advantages and disadvantages.

Dealing with ONE large array like that is unwieldy in any language - if you have to, then... basically.... allocate at program start and never recreate. Only solution.
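The page-reuse idea might look roughly like this (PagePool, PageSize and all other names here are hypothetical; the page size is kept well under the 85 KB threshold, smaller than the 65536 elements suggested above):

```csharp
using System;
using System.Collections.Generic;

// Returned pages go onto a static free list instead of back to the GC,
// so the same memory is recycled rather than re-allocated.
static class PagePool
{
    public const int PageSize = 8192;   // 32 KB of ints per page
    private static readonly Stack<int[]> Free = new Stack<int[]>();

    public static int[] Rent()
    {
        lock (Free)
        {
            return Free.Count > 0 ? Free.Pop() : new int[PageSize];
        }
    }

    public static void Return(int[] page)
    {
        Array.Clear(page, 0, page.Length);   // scrub before reuse
        lock (Free) { Free.Push(page); }
    }
}

class Demo
{
    static void Main()
    {
        int[] page = PagePool.Rent();
        page[0] = 123;
        PagePool.Return(page);

        int[] again = PagePool.Rent();                   // recycled instance
        Console.WriteLine(ReferenceEquals(page, again)); // True
        Console.WriteLine(again[0]);                     // 0 (cleared on return)
    }
}
```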

Kwei answered 8/3, 2010 at 15:29 Comment(3)
So, a jagged array int[][] doesn't need to use continuous memory? Does an array such as int[,] behave the same way?Sandbank
I believe new int[,] would allocate one contiguous block. Jagged arrays are allocated separately.Selaginella
Note: The LOH selection kicks in at memory size, not element count. The LOH threshold is 85k bytes, so about 10k (x8 byte object refs) elements on 64-bit. Even with int[], that is only ~20k. Both thresholds are less than the 65k elements proposed, which would still end up on the LOH.Bengt
This is an old question, but with .NET Standard 1.1 (.NET Core, .NET Framework 4.5.1+) there is another possible solution:

Using ArrayPool<T> in the System.Buffers package, we can pool arrays to avoid this problem.
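A minimal usage sketch (ArrayPool<T>.Shared, Rent and Return are the real System.Buffers API; the buffer size is just the example from the question):

```csharp
using System;
using System.Buffers;

class Program
{
    static void Main()
    {
        // Rent may hand back an array larger than requested, possibly
        // containing stale data - track the length you need yourself.
        int[] buffer = ArrayPool<int>.Shared.Rent(4000000);
        try
        {
            buffer[0] = 42;
            Console.WriteLine(buffer.Length >= 4000000); // True
        }
        finally
        {
            // Returning lets later Rent calls reuse this array instead of
            // allocating (and eventually fragmenting) fresh LOH memory.
            ArrayPool<int>.Shared.Return(buffer);
        }
    }
}
```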

Parboil answered 22/12, 2019 at 12:15 Comment(0)
I am adding an elaboration to the answer above, in terms of how the issue can arise. Fragmentation of the LOH is not only dependent on the objects being long-lived. If you have multiple threads and each of them is creating big lists going onto the LOH, then the first thread may need to grow its List while the next contiguous bit of memory is already taken up by a List from a second thread; the runtime will then allocate new memory for the first thread's List, leaving behind a rather big hole. This is what's happening currently on one project I've inherited: even though the LOH is approx 4.5 MB, the runtime has a total of 117 MB free memory, but the largest free memory segment is 28 MB.

Another way this could happen without multiple threads is if you've got more than one list being added to in some kind of loop: as each expands beyond the memory initially allocated to it, each leapfrogs the other as they grow beyond their allocated spaces.

A useful link is: https://www.simple-talk.com/dotnet/.net-framework/the-dangers-of-the-large-object-heap/

I'm still looking for a solution to this; one option may be to use some kind of pooled objects and request from the pool when doing the work. If you're dealing with large arrays, then another option is to develop a custom collection, e.g. a collection of collections, so that you don't have just one huge list but break it up into smaller lists, each of which avoids the LOH.

Olden answered 16/5, 2013 at 10:26 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.