how do we calculate the number of reads/misses of the cache in this code snippet?
Asked Answered
F

1

5

Given this code snippet from this textbook that I am currently studying. Randal E. Bryant, David R. O’Hallaron - Computer Systems. A Programmer’s Perspective [3rd ed.] (2016, Pearson) (global edition, so the book's exercises could be wrong.)

for (i = 31; i >= 0; i--) {
    for (j = 31; j >= 0; j--) {
        total_x += grid[i][j].x;
    }
}

for (i = 31; i >= 0; i--) {
    for (j = 31; j >= 0; j--) {
         total_y += grid[i][j].y;
    }
}

and this is the information given

The heart of the recent hit game SimAquarium is a tight loop that calculates the average position of 512 algae. You are evaluating its cache performance on a machine with a 2,048-byte direct-mapped data cache with 32-byte blocks (B = 32).

  struct algae_position {
         int x;
         int y;
  };
 
  struct algae_position grid[32][32];
  int total_x = 0, total_y = 0;
  int i, j;

You should also assume the following:

  • sizeof(int) = 4.
  • grid begins at memory address 0.
  • The cache is initially empty.
  • The only memory accesses are to the entries of the array grid.
  • Variables i, j, total_x, and total_y are stored in registers

The book gives the following questions as practice:

A. What is the total number of reads?    
Answer given : 2048  
B. What is the total number of reads that miss in the cache?
Answer given : 1024    
C. What is the miss rate?  
Answer given: 50%
  1. I'm guessing for A, the answer is derived from 32*32 *2? 32*32 for the dimensions of the matrix and 2 because there are 2 separate loops for x and y vals. Is this correct? How should the total number of reads be counted?

  2. How do we calculate the total number of misses that happen in the cache and the miss rate? I read that the miss rate is (1- hit-rate)

Farad answered 10/7, 2021 at 6:48 Comment(1)
@4386427 okay thank you! have added that info. The answers are from the book, but they could be wrong because this book is notorious for having wrong answers.Farad
T
9

Question A

You are correct about 32 x 32 x 2 reads.

Question B

The loops counts down from 31 towards 0 but that doesn't matter for this question. The answer is the same for loops going from 0 to 31. Since that is a bit easier to explain I'll assume increasing loop counters.

When you read grid[0][0], you'll get a cache miss. This will bring grid[0][0], grid[0][1], grid[0][2] and grid[0][3] into the cache. This is because each element is 2x4 = 8 bytes and the block size is 32. In other words: 32 / 8 = 4 grid elements in one block.

So the next cache miss is for grid[0][4] which again will bring the next 4 grid elements into the cache. And so on... like:

miss
hit
hit
hit
miss
hit
hit
hit
miss
hit
hit
hit
...

So in the first loop you simply have:

"Number of grid elements" divided by 4.

or

32 * 32 / 4 = 256

In general in the first loop:

Misses = NumberOfElements / (BlockSize / ElementSize)

so here:

Misses = 32*32 / (32 / 8) = 256

Since the cache size is only 2048 and the whole grid is 32 x 32 x 8 = 8192, nothing read into the cache in the first loop will generate cache hit in the second loop. In other words - both loops will have 256 misses.

So the total number of cache misses are 2 x 256 = 512.

Also notice that there seem to be a bug in the book.

Here:

The heart of the recent hit game SimAquarium is a tight loop that calculates the
average position of 512 algae.
                    ^^^
                    Hmmm... 512 elements...

Here:

for (i = 31; i >= 0; i--) {
    for (j = 31; j >= 0; j--) {
         ^^^^^^
         hmmm... 32 x 32 is 1024

So the loop access 1024 elements but the text says 512. So something is wrong in the book.

Question C

Miss rate = 512 misses / 2048 reads = 25 %

note:

Being very strict we cannot say for sure that the element size is two times the integer size. The C standard allow that structs contain padding. So in principle there could be 8 bytes padding in the struct (i.e. element size being 16) and that would give the results that the book says.

Their answered 10/7, 2021 at 7:35 Comment(6)
So can I clarify like this line "Misses = NumberOfElements / (BlockSize / ElementSize)" is it a general formula for all misses or just this one specifically?Farad
also how do you get "each element is 2x4 = 8 bytes " ? Isn't the size of the int 4? So accessing a single element of the 2D array would give us 4bytes per element, instead of 8 right?Farad
@MeganDarcy It's a struct of 2 intergers.Their
@438647 AH I missed that, I thought it would be an int only because we are only accessing the x value in this part: "grid[i][j].x ".... But I suppose it would be the whole struct loaded in and then the x value is taken from that. thanks!!! :)Farad
@MeganDarcy Exactly. The whole struct is read into the cacheTheir
@MeganDarcy: It's an array of structs, not a struct of arrays. If it was a struct of arrays (i.e. two arrays of ints), the two separate loops would each have better spatial locality. With this layout, the best option would be to do both x and y averages in one loop, so you're using all the data while it's hot in cache. (Unless your cache is big enough to hold the whole data set, then you can loop over it again and get all cache hits. Although it's still better to space out your cache misses more, assuming pipelined memory access and/or HW prefetch.)Paley

© 2022 - 2024 — McMap. All rights reserved.