Measure L1 data cache misses with perf and PAPI

What is the difference between PAPI_L1_LDM in PAPI and L1-dcache-load-misses in perf?

I used the same setup as in the post linked here.

With PAPI I get:

PAPI_L1_DCM: 515 <- L1 data cache miss (probably L1D_READ_MISSES_ALL + L1D_READ_MISSES_RETRIED?)
PAPI_L1_ICM: 300 <- L1 Instruction cache miss
PAPI_L1_LDM: 441 <- L1 Load data miss
PAPI_L1_TCM: 815 <- L1 Total cache miss

Unfortunately, PAPI_L1_DCA is not supported on this machine.
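
As a quick consistency check on the PAPI counts above, the arithmetic can be sketched as follows. The first relation (total = data + instruction misses) holds exactly for these numbers. The second line assumes that data cache misses split into load misses plus other misses (for example store misses), which is how the preset names read but is not confirmed for this machine's event mapping.

```python
# PAPI preset counts as reported in the question.
counts = {
    "PAPI_L1_DCM": 515,  # L1 data cache misses
    "PAPI_L1_ICM": 300,  # L1 instruction cache misses
    "PAPI_L1_LDM": 441,  # L1 load misses
    "PAPI_L1_TCM": 815,  # L1 total cache misses
}

# Check whether total misses equal data misses plus instruction misses.
tcm_matches = counts["PAPI_L1_TCM"] == counts["PAPI_L1_DCM"] + counts["PAPI_L1_ICM"]

# Assumption: DCM = LDM + non-load misses (e.g. stores), so the remainder
# would be the non-load share. This split is not verified for this CPU.
non_load_misses = counts["PAPI_L1_DCM"] - counts["PAPI_L1_LDM"]

print("TCM == DCM + ICM:", tcm_matches)
print("DCM - LDM:", non_load_misses)
```

If that assumption holds, the gap between PAPI_L1_DCM and PAPI_L1_LDM would simply be the misses not caused by loads.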

And for perf (user space only, since PAPI also measures only user space and not kernel space), called as:

    perf stat -B -e L1-dcache-load-misses:u,cache-misses:u ./perf

I get:

    16,539      L1-dcache-load-misses
       128      cache-misses:u  
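
To make the two perf event names more concrete, here is a minimal sketch of how they are encoded for the perf_event_open interface, to my understanding of linux/perf_event.h. The helper `hw_cache_config` and the table `PERF_EVENTS` are hypothetical names used only for illustration. The numeric constants are the kernel ABI values as I know them and are worth checking against the header on your system.

```python
# Numeric values as defined in linux/perf_event.h (perf_event_open ABI).
PERF_TYPE_HARDWARE = 0
PERF_TYPE_HW_CACHE = 3
PERF_COUNT_HW_CACHE_MISSES = 3       # generic "cache-misses" event
PERF_COUNT_HW_CACHE_L1D = 0          # cache id: L1 data cache
PERF_COUNT_HW_CACHE_OP_READ = 0      # op id: read (load) accesses
PERF_COUNT_HW_CACHE_RESULT_MISS = 1  # result id: misses


def hw_cache_config(cache_id, op_id, result_id):
    """Pack a PERF_TYPE_HW_CACHE config: id | (op << 8) | (result << 16)."""
    return cache_id | (op_id << 8) | (result_id << 16)


# Hypothetical lookup table: perf event name -> (attr.type, attr.config).
PERF_EVENTS = {
    "L1-dcache-load-misses": (
        PERF_TYPE_HW_CACHE,
        hw_cache_config(
            PERF_COUNT_HW_CACHE_L1D,
            PERF_COUNT_HW_CACHE_OP_READ,
            PERF_COUNT_HW_CACHE_RESULT_MISS,
        ),
    ),
    "cache-misses": (PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_MISSES),
}

# The ":u" modifier asks for user-space counting only (attr.exclude_kernel = 1).
for name, (attr_type, config) in PERF_EVENTS.items():
    print(f"{name}:u -> type={attr_type} config={config:#x} exclude_kernel=1")
```

The two names are different kinds of event: L1-dcache-load-misses is a cache event tied to L1D loads, while cache-misses is the generic hardware event that the perf_event_open man page describes as usually indicating last-level cache misses. On this Xeon that would presumably be the L3 rather than the L2.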

16,539 seems more reasonable for N=1000000. What is the difference between a load data miss (PAPI_L1_LDM) and a data cache miss (PAPI_L1_DCM) in PAPI, and why do the numbers differ between PAPI and perf? Is cache-misses:u in perf related to L2 cache misses?
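
One rough way to judge what is "reasonable" for N=1000000 is to count how many 64-byte cache lines a single sequential pass over N elements touches. The sketch below does that for a few element sizes. The benchmark code is not shown here, so the element sizes and the sequential access pattern are assumptions, and hardware prefetching or alignment can move the real count in either direction.

```python
CACHE_LINE_BYTES = 64  # cache line size on Haswell
N = 1_000_000          # problem size from the question


def cold_line_misses(n_elements, element_bytes, line_bytes=CACHE_LINE_BYTES):
    """Cache lines touched by one sequential pass (integer ceiling division)."""
    total_bytes = n_elements * element_bytes
    return (total_bytes + line_bytes - 1) // line_bytes


# Assumed element sizes; the real benchmark's data type is not known.
for element_bytes in (1, 4, 8):
    lines = cold_line_misses(N, element_bytes)
    print(f"{element_bytes}-byte elements: {lines} cache lines touched")
```

Under any of these assumptions, a few hundred misses as reported by PAPI would be far below the number of lines touched, which matches the impression that the perf figure is the more plausible one.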

Edit: The hardware is a Xeon E5-2600 v3 family CPU (Haswell-EP, 12 cores).

Phrenology answered 3/8, 2017 at 10:28 Comment(1)
I now use VTune, since its GUI also gives you a command-line snippet. – Phrenology
