Extremely high rates of paging active memory to disk but low constant memory usage

As the title states, I have a problem with high page file activity.

I am developing a program that processes a lot of images, which it loads from the hard drive. From every image it generates some data, which I store in a list. After every 3600 images I save the list to the hard drive; its size is about 5 to 10 MB. The program runs as fast as it can, so it maxes out one CPU thread.

The program works, it generates the data that it is supposed to, but when I analyze it in Visual Studio I get a warning saying: DA0014: Extremely high rates of paging active memory to disk.

The memory consumption of the program, according to Task Manager, is about 50 MB and seems to be stable. When I ran the program I had about 2 GB free out of 4 GB, so I guess I am not running out of RAM. Memory usage of my program: https://i.sstatic.net/TDAB0.png

The DA0014 rule description says "The number of Pages Output/sec is frequently much larger than the number of Page Writes/sec, for example. Because Pages Output/sec also includes changed data pages from the system file cache. However, it is not always easy to determine which process is directly responsible for the paging or why."

Does this mean that I get this warning simply because I read a lot of images from the hard drive, or is it something else? Not really sure what kind of bug I am looking for.

EDIT: Link to image inserted.

EDIT1: The images are about 300 KB each. I dispose of each one before loading the next.

UPDATE: Experiments suggest that the paging comes from just loading the large number of files. As I am no expert in C# or the underlying GDI+ API, I don't know which of the answers is most correct. I chose Andras Zoltan's answer as it was well explained and because it seems he did a lot of work to explain the reason to a newcomer like me :)

Gesellschaft answered 7/12, 2012 at 8:0 Comment(6)
You could try disabling the file cache and retesting to see if there is a difference. technet.microsoft.com/en-us/sysinternals/bb897561 – Worry
Could it be that the images are read using memory-mapped files? If yes, then the large number of page faults is the expected behavior, because that is how the images are effectively read. – Hixson
@Worry I tried setting the maximum size to something very low, but the current size just outgrew it. There was no change in the performance of the program, but Windows certainly acted weird :) – Gesellschaft
@Hixson To my knowledge the images are not memory-mapped. They are just ordinary images placed in a folder on the hard drive. – Gesellschaft
@AndersJørgensen: Memory-mapping is a technique for reading and writing files, not a special type of file. It can be done with any file. What functions or libraries are you using to read the files? – Hixson
@Hixson I am using Emgu (OpenCV for C#). The constructor for the Image class takes a path as input. Here is the reference guide: emgu.com/wiki/files/2.4.2/document/Index.html (sorry, I can't link directly to the constructor). – Gesellschaft

Updated following more info

The working set of your application might not be very big, but what about the virtual memory size? Paging can occur because of this and not just because of its physical size. See this screenshot from Process Explorer of VS2012 running on Windows 8:

VS 2012 Memory

And in Task Manager? Apparently the private working set for the same process is 305,376 KB.

We can take from this a) that Task Manager can't necessarily be trusted and b) an application's size in memory, as far as the OS is concerned, is far more complicated than we'd like to think.

You might want to take a look at this.

The paging is almost certainly because of what you do with the files, and the high final figures almost certainly because of the number of files you're working with. A simple test of that would be to experiment with different numbers of files and generate a dataset of final paging figures alongside those. If the number of files is causing the paging, then you'll see a clear correlation.

Then take out any processing you do (but keep the image loading) and compare again - note the difference.

Then stub out the image-loading code completely - note the difference.

Clearly you'll see the biggest drop in faults when you take out the image loading.
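The first step of that experiment can be sketched as follows. This is a minimal sketch in Python rather than C#, to keep it dependency-free. It assumes you supply a hypothetical `measure(n)` callable that runs your loader on `n` files and returns the final paging figure, for example one read from Performance Monitor. Nothing here measures paging by itself.

```python
from math import sqrt
from typing import Callable, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    # Raises ZeroDivisionError if either series is constant.
    return cov / (spread_x * spread_y)


def paging_vs_file_count(counts: Sequence[int],
                         measure: Callable[[int], float]) -> float:
    """Run one measurement per file count and correlate the results.

    `measure` is a placeholder for your own code: process `n` files,
    then return the paging figure you recorded for that run.
    """
    figures = [measure(n) for n in counts]
    return pearson(counts, figures)
```

A result close to 1.0 across runs with, say, 500, 1000, 2000 and 4000 files would support the idea that the paging scales with the number of files loaded.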

Now, looking at the Emgu.CV Image code, it uses the Image class internally to get the image bits - so that's firing up GDI+ via the function GdipLoadImageFromFile (second entry on this index) to decode the image (using system resources, plus potentially large byte arrays) - and then it copies the data to an uncompressed byte array containing the actual RGB values.

This byte array is allocated using GCHandle.Alloc (also surrounded by GC.AddMemoryPressure and GC.RemoveMemoryPressure) to create a pinned byte array to hold the image data (uncompressed). Now I'm no expert on .Net memory management, but it seems to me that we have a potential for heap fragmentation here, even if each file is loaded sequentially and not in parallel.

Whether that's causing the hard paging I don't know. But it seems likely.

In particular, the in-memory representation of the image could be specifically geared around displaying, as opposed to being the original file bytes. So if we're talking JPEGs, for example, then a 300 KB JPEG could be considerably larger in physical memory, depending on its dimensions. E.g. a 1024x768 32-bit image is 3 MB - and that's been allocated twice for each image, since it's loaded (first allocation) then copied (second allocation) into the EMGU image object before being disposed.
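As a rough check of that arithmetic, the decoded size is just width x height x bytes per pixel. Here is a minimal sketch in Python, taking 1024x768 at 32 bits per pixel as the example. The function name is made up for illustration, and it ignores any row padding or headers a real decoder may add.

```python
def decoded_size_bytes(width: int, height: int, bits_per_pixel: int = 32) -> int:
    """Size of an uncompressed bitmap, ignoring row padding and headers."""
    return width * height * (bits_per_pixel // 8)


size = decoded_size_bytes(1024, 768)  # 3,145,728 bytes
print(f"{size / 2**20:.1f} MiB per decoded copy")               # 3.0 MiB
print(f"{2 * size / 2**20:.1f} MiB if the pixels exist twice")  # 6.0 MiB
```

So each 300 KB file on disk can turn into roughly ten times as much memory traffic per decoded copy, even though only one image is alive at a time.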

But you have to ask yourself if it's necessary to find a way around the problem. If your application is not consuming vast amounts of physical RAM, then it will have much less of an impact on other applications; one process hitting the page file lots and lots won't badly affect another process that doesn't, if there's sufficient physical memory.

Yearly answered 7/12, 2012 at 9:32 Comment(3)
Thanks for your help. The images are about 300 KB each and I load them with EMGU (OpenCV for C#), so I don't know the underlying structure. But I access the image pixels, so the bytes must be accessible somehow. – Gesellschaft
@AndersJørgensen have updated - don't really have a different conclusion for you, but there's more detail. – Yearly
I tried removing the image processing code and it seems the paging is strongly related to just loading the files. I will look into whether I need to find a fix or just buy an extra disk to split data and pagefile. Thank you for the answer. – Gesellschaft

However, it is not always easy to determine which process is directly responsible for the paging or why.

The devil is in that cop-out note. Bitmaps are mapped into memory from the file that contains the pixel data using a memory-mapped file. That's an efficient way to avoid reading and writing the data directly into/from RAM: you only pay for what you use. The mechanism that keeps the file in sync with RAM is paging. So it is inevitable that if you process a lot of images, you'll see a lot of page faults. The tool you use just isn't smart enough to know that this is by design.

Feature, not a bug.
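To illustrate the general technique (not how GDI+ itself is implemented), here is a minimal Python sketch of reading part of a file through a memory mapping. The helper name and the temporary stand-in file are assumptions made for the example. The point is that mapping the file normally copies nothing, and the OS pages data in only when a range is actually touched.

```python
import mmap
import os
import tempfile


def read_slice_mapped(path: str, offset: int, length: int) -> bytes:
    """Return `length` bytes at `offset`, read through a read-only mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; creating the mapping does not
        # normally read the file contents.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            # Touching this range is what makes the OS page the data in.
            return view[offset:offset + length]


# Stand-in for an image file: 1 MiB of zeros followed by a marker.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (1 << 20) + b"PIXELS")
    path = tmp.name

try:
    print(read_slice_mapped(path, 1 << 20, 6))  # b'PIXELS'
finally:
    os.remove(path)
```

Every first touch of a mapped page shows up as a page fault, which is why a profiler counting faults can look alarming for code that is simply reading files this way.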

Elyssa answered 7/12, 2012 at 14:31 Comment(0)
