Page Faults and Programming In .Net
Asked Answered
I

3

5

I'm reading a book about Operating Systems and it gives some examples in C that I mostly understand. The example I'm looking at now shows two nearly identical pieces of code that will run on a fictitious system...

int i, j;
int [128] [128] data;

for (j = 0; j < 128; j++)
    for (i = 0; i < 128; i++)
        data [i] [j] = 0;

And the second piece of code

int i, j;
int [128] [128] data;

for (i = 0; i < 128; i++)
    for (j = 0; j < 128; j++)
        data [i] [j] = 0;

On this particular system the first section of code would result in 16k page faults, while the second would result in only 128.

My apologies if this is a silly question, but in my experiences with .NET I've always been largely unaware of memory. I just create a variable and it is 'somewhere' but I don't know where and I don't care.

My question is, how would .NET compare to these C examples in this fictional system (pages are 128 words in size, each row of the array takes one full page. In the first example we set one int on page 1, then one int on page 2, etc....while the second example sets all ints on page 1, then all ints on page 2, etc...)

Also, while I think I understand why the code produces different levels of paging, is there anything meaningful I can do with it? Isn't the size of the page dependent on the operating system? Is the take away that, as a general rule of thumb to access memory in an array as contiguously as possible?

Intradermal answered 15/9, 2011 at 11:39 Comment(5)
Page size is property of CPU architecture AND current mode (mode is set by OS). x86 usually has 4096 bytes pages and allows 2 MB or 4 MB pages in some modes. x86_64 uses 4096 at most times and allows 2MB pages and 1GB pages (not all x86_64 cpus). Some rare archs allows OS to select page size from set like 2kb, 4kb, 8kb, 16kb, 32kb, 64kb.Gilman
Is it just me, or are those 2 code examples exactly identical rather than 'nearly'?Driven
is it just me or is there no difference between the 2 pieces of code?Hals
Agreed: There is not a single difference between the two code fragmentsDorian
Sorry! Code sample should be fixed. The difference is in the ordering of the For loopsIntradermal
M
7

Fictitious operating systems can be useful to demonstrate principles, but no real operating system I know actually works this way. You could only get 16K page faults if the page immediately gets unmapped after use. While it is technically possible for this to happen, the machine would have to be under major load, being desperately out of RAM to support the set of running processes. At that point you just don't care about perf anymore, the machine will have slowed down to a crawl anyway. The technical term for that is "thrashing".

There's something much more important going on in this code that you didn't mention. The level 1 CPU cache plays a major role in how quickly you can access array elements. A cache line (typically 64 bytes) gets filled from memory on the first access. Accessing the next element, at the next memory address, is very cheap, the data is already in the cache. Accessing the array with the outer index changing first is very expensive, it requires another cache line fill which can take hundreds of cycles, worst case. With the considerable risk of evicting an existing cache line that also contains array data, the 1st level cache is not very big. Getting this right requires knowing how the runtime orders array elements in memory.

Madian answered 15/9, 2011 at 12:59 Comment(1)
+1 for mentioning the caches. Arrays are rarely big enough for page faults to matter, but the CPU caches have a significant effect and are heavily influenced by iteration order.Diffract
S
2

Your two samples are identical, but presumably what you want is:

int i, j;
int [128] [128] data;

for (i = 0; i < 128; i++)
    for (j = 0; j < 128; j++)
        data [i] [j] = 0;  // <--- i,j

And the second piece of code

int i, j;
int [128] [128] data;

for (i = 0; i < 128; i++)
    for (j = 0; j < 128; j++)
        data [j] [i] = 0;   // <--- j,i

So assuming a page size of 128 * sizeof(int):

  • in one case you are iterating sequentially over the array. In this case the worst case number of page faults is the size of the array in pages = 1.

  • in the other case you are jumping to a new page on each iteration, so in the worst case you may have a page fault on every iteration of the loop.

The situation is exactly the same in .NET.

Statolatry answered 15/9, 2011 at 11:55 Comment(0)
M
0

The number of page faults you get will depend on the number of pages you access which are not in memory.

Pages on the stack are more likely to be in memory unless this is the first time the stack has been used to this degree.

Assuming your page size is 4 KB as it is on many systems, and none of the pages are in memory, you will be accessing 128*128*4 bytes or 64 KB which is 16 pages. Note: it is possible for this structure to be across 17 pages if it doesn't start on a page boundary, in which case the first page will be in memory.

As long as you access every page, it doesn't matter how you access them or in what order.


It could be that you are talking about cache misses, and the order you access the data can make a difference. In this case it is best to access data incrementally along the last index. i.e. for a[i][j] you want the inner loop to use j However if your cache is large enough, this may not matter. It can make a big different for really large structures which don't fit into the faster caches.

Myiasis answered 15/9, 2011 at 11:54 Comment(4)
Actually, it does matter what order you access the array members from a page faulting standpoint. If you don't access the memebers of the array contiguously, you end up jumping to a different page on each iteration of the inner loop, which makes it more likely that you will fault on the same page multiple times (i.e., the system unloads the first page as you move through the other 15 pages, and then faults when you access the first page again). On today's memory-rich systems, I'm guessing that it is unlikely to happen in practice with this very simple example program, however.Tod
The above logic also applies to the C version, however, which contradicts the problem conditions, which states that when run on a machine with a 128-word page size, there will be 16384 page faults. Clearly there is another assumption that the machine has only one free page remaining.Interdict
AFAIK, a program cannot progress until a faulted page is brought into memory. This means you can only fault on a page once. I am assuming you have much more 64 KB of memory. If you have less, you would be right.Myiasis
@downvoter, can you say what I have said is so different from Hans? Most real operating system will ensure far more than 64 KB is always free and the cache is far more likely to make a difference.Myiasis

© 2022 - 2024 — McMap. All rights reserved.