I am trying to write a driver with a custom mmap() function for a PCIe BAR, with the goal of making this BAR cacheable in the processor cache. I am aware that this is not the best way to achieve the highest bandwidth and that the order of writes is unpredictable (neither is an issue in this case).
This is similar to what is described in "How would one prevent MMAP from caching values?".
The processor is a Sandy Bridge i7; the PCIe device is an Altera Stratix IV dev. board.
First, I tried it on CentOS 5 (2.6.18). I changed the MTRR settings to make sure the BAR is not within an uncacheable MTRR range and used io_remap_pfn_range() with the _PAGE_PCD and _PAGE_PWT bits cleared. Reads worked as expected: they returned correct values, and a second read to the same address did not necessarily go out to PCIe (I checked the read counter in the FPGA). However, writes caused the system to freeze and then reboot without any messages in the logs or on the screen.
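To make the bit manipulation concrete, here is a minimal user-space sketch of what clearing those bits means for the PAT index of a 4 KiB PTE. The SKETCH_* macros and both functions are hypothetical names of mine, not kernel API. They only mirror the x86 bit positions (PWT = bit 3, PCD = bit 4, PAT = bit 7). Clearing the PAT bit as well is an assumption I added; the attempt described above only cleared PCD and PWT.

```c
#include <stdint.h>

/* x86 page-table cache-control bits. Hypothetical names that mirror the
 * bit positions of the kernel's _PAGE_PWT / _PAGE_PCD, redefined here so
 * the sketch compiles outside the kernel. */
#define SKETCH_PAGE_PWT (UINT64_C(1) << 3)  /* page write-through */
#define SKETCH_PAGE_PCD (UINT64_C(1) << 4)  /* page cache disable */
#define SKETCH_PAGE_PAT (UINT64_C(1) << 7)  /* PAT bit of a 4 KiB PTE */

/* Clear all cache-control bits so the PTE selects PAT entry 0.
 * (Assumption: the PAT bit is cleared too, which the post does not mention.) */
static uint64_t sketch_prot_cacheable(uint64_t prot)
{
    return prot & ~(SKETCH_PAGE_PWT | SKETCH_PAGE_PCD | SKETCH_PAGE_PAT);
}

/* PAT index selected by a 4 KiB PTE, formed as PAT:PCD:PWT. */
static unsigned sketch_pat_index(uint64_t prot)
{
    unsigned pat = (unsigned)((prot >> 7) & 1);
    unsigned pcd = (unsigned)((prot >> 4) & 1);
    unsigned pwt = (unsigned)((prot >> 3) & 1);
    return (pat << 2) | (pcd << 1) | pwt;
}
```

With all three bits cleared the index is 0, which is write-back in the power-on default PAT layout. The effective memory type is still combined with the MTRR type for that range, which is why the MTRR change was needed as well.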
Second, I tried it on CentOS 6 (2.6.32), which has PAT support. The result is the same: reads work correctly, writes cause a system freeze and reboot. Interestingly, non-temporal/write-combining full cache line writes (AVX/SSE) work as expected, i.e. they always go to the FPGA, the FPGA observes full cache line writes, and reads return correct values afterwards. However, simple 64-bit writes still cause a system freeze/reboot.
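For reference, the kind of non-temporal full cache line write meant here can be sketched roughly as follows with SSE2 intrinsics. The function name and the 64-byte line size constant are hypothetical, and the memcpy fallback exists only so the sketch builds on targets without SSE2. Whether the four 16-byte stores reach the device as a single 64-byte PCIe write depends on the platform; this only shows the store pattern.

```c
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#define SKETCH_LINE 64  /* assumed cache line size in bytes */

/* Write one 64-byte line using non-temporal stores.
 * dst must be 64-byte aligned (_mm_stream_si128 needs at least 16);
 * src may be unaligned. */
static void sketch_write_line_nt(void *dst, const void *src)
{
#if defined(__SSE2__)
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    for (int i = 0; i < SKETCH_LINE / 16; i++)
        _mm_stream_si128(d + i, _mm_loadu_si128(s + i));
    _mm_sfence();  /* order the streaming stores before later stores */
#else
    memcpy(dst, src, SKETCH_LINE);  /* fallback: plain copy, not non-temporal */
#endif
}
```

In the driver scenario dst would point into the mmap()ed BAR; in a plain user-space test it can be any 64-byte-aligned buffer.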
I also tried ioremap_cache() followed by iowrite32() inside the driver code. The result is the same.
I think it is a hardware issue, but I would appreciate it if somebody could share any ideas about what's going on.
EDIT: I was able to capture an MCE message on CentOS 6: Machine Check Exception: 5 Bank 5: be2000000003110a.
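In case it helps with reading that status word, here is a small sketch that splits an IA32_MCi_STATUS value into its architectural flag bits (63 down to 57) and the two error-code fields. The struct and function names are hypothetical. The cache-hierarchy check uses the compound error code pattern 000F 0001 RRRR TTLL as I understand it from the Intel SDM, so treat that part as an assumption to verify against the manual.

```c
#include <stdint.h>

/* Decoded view of an IA32_MCi_STATUS value (hypothetical helper type). */
struct sketch_mce_status {
    unsigned val, over, uc, en, miscv, addrv, pcc; /* bits 63..57 */
    unsigned model_code;                           /* bits 31:16 */
    unsigned mca_code;                             /* bits 15:0 */
};

static struct sketch_mce_status sketch_decode_mce(uint64_t s)
{
    struct sketch_mce_status d;
    d.val   = (unsigned)((s >> 63) & 1);  /* status valid */
    d.over  = (unsigned)((s >> 62) & 1);  /* error overflow */
    d.uc    = (unsigned)((s >> 61) & 1);  /* uncorrected error */
    d.en    = (unsigned)((s >> 60) & 1);  /* error reporting enabled */
    d.miscv = (unsigned)((s >> 59) & 1);  /* MISC register valid */
    d.addrv = (unsigned)((s >> 58) & 1);  /* ADDR register valid */
    d.pcc   = (unsigned)((s >> 57) & 1);  /* processor context corrupt */
    d.model_code = (unsigned)((s >> 16) & 0xFFFF);
    d.mca_code   = (unsigned)(s & 0xFFFF);
    return d;
}

/* Assumed pattern for cache hierarchy errors: 000F 0001 RRRR TTLL.
 * Bit 12 (F) is masked out before comparing. */
static int sketch_is_cache_hierarchy(unsigned mca_code)
{
    return (mca_code & 0xEF00u) == 0x0100u;
}
```

For be2000000003110a this gives VAL, UC, EN, MISCV, ADDRV and PCC set, OVER clear, a model-specific code of 0x0003 and an MCA error code of 0x110a, which fits the cache hierarchy pattern under the assumption above.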
I also tried the same code on a 2-socket Sandy Bridge (Romley) system: read and non-temporal write behavior is the same; simple writes do not cause an MCE/crash but have no effect on system state, i.e. the value in memory does not change.
Also, I tried the same code on an older 2-socket Nehalem system: simple writes also cause an MCE, although the codes are different.