Section 2.1.3 of Intel's optimization guide lists a number of enhancements to the caches and memory subsystem in Skylake (emphasis mine):
The cache hierarchy of the Skylake microarchitecture has the following enhancements:
- Higher Cache bandwidth compared to previous generations.
- Simultaneous handling of more loads and stores enabled by enlarged buffers.
- Processor can do two page walks in parallel compared to one in Haswell microarchitecture and earlier generations.
- Page split load penalty down from 100 cycles in previous generation to 5 cycles.
- L3 write bandwidth increased from 4 cycles per line in previous generation to 2 per line.
- Support for the CLFLUSHOPT instruction to flush cache lines and manage memory ordering of flushed data using SFENCE.
- Reduced performance penalty for a software prefetch that specifies a NULL pointer.
- L2 associativity changed from 8 ways to 4 ways.
The final item caught my eye. In what way is a reduction in the number of ways an enhancement? Taken by itself, fewer ways seems strictly worse than more ways. Of course, I understand that there might be valid engineering reasons why reducing the number of ways is a tradeoff that enables other enhancements, but here it is presented, on its own, as an enhancement.
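To make the comparison concrete, here is a rough sketch of the geometry as I understand it. It assumes a 256 KiB L2 with 64-byte lines; neither figure comes from the quoted list, and `cache_sets` is just a helper name I made up for illustration. For a fixed capacity, halving the ways doubles the number of sets, but each set can then hold only half as many lines that map to the same index.

```python
def cache_sets(size_bytes: int, ways: int, line_bytes: int = 64) -> int:
    """Number of sets in a set-associative cache with this geometry."""
    if size_bytes % (ways * line_bytes) != 0:
        raise ValueError("size must be a multiple of ways * line size")
    return size_bytes // (ways * line_bytes)


# Assumed L2 capacity (not stated in the quoted list).
L2_SIZE = 256 * 1024

for ways in (8, 4):
    sets = cache_sets(L2_SIZE, ways)
    # 'ways' is also the number of same-index lines a set can hold
    # before one of them has to be evicted.
    print(f"{ways}-way: {sets} sets, up to {ways} conflicting lines per set")
```

So under these assumptions, the 4-way cache would start evicting once five lines map to the same set, where the 8-way cache could hold up to eight.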
What am I missing?