The probability of selected EFLAGS bits

We all know that when looking at source code it's a safe assumption that the direction flag will be clear. The probability of the direction flag being set is very low.

I wanted to find out about the probabilities of the other flags. That's why I wrote a test program that single steps some of my existing software, incrementing a counter for each of the first 12 EFLAGS bits.
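The counting step can be sketched roughly as follows. This is only an illustration in Python, not the actual test program: the names `FLAG_BITS`, `tally_flags`, and `flag_probabilities` are hypothetical, and the sample EFLAGS values are made up. The bit positions are the architectural ones (bits 1, 3 and 5 of the low 12 are reserved).

```python
# Bit positions of the named flags within the low 12 bits of EFLAGS
# (bits 1, 3 and 5 are reserved and carry no flag).
FLAG_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
    "TF": 8, "IF": 9, "DF": 10, "OF": 11,
}


def tally_flags(samples):
    """Count how often each flag is set across a series of EFLAGS snapshots."""
    counts = {name: 0 for name in FLAG_BITS}
    for eflags in samples:
        for name, bit in FLAG_BITS.items():
            counts[name] += (eflags >> bit) & 1
    return counts


def flag_probabilities(samples):
    """Turn the raw counts into per-flag fractions of all snapshots."""
    samples = list(samples)
    total = len(samples)
    counts = tally_flags(samples)
    return {name: (c / total if total else 0.0) for name, c in counts.items()}


# Hypothetical snapshots, one per single-stepped instruction (made-up values).
example = [0x202, 0x246, 0x203, 0x287]
print(flag_probabilities(example))
```

Note that a snapshot is taken after every instruction, so a flag that stays set across a run of instructions that don't touch it is counted once per instruction.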

[Figure: Probability of x86 processor flags]

Results confirm the assumption made about the direction flag (DF) and, not surprisingly, show that the probability of the overflow flag (OF) is very low.

But what about the other flags? The carry flag (CF), auxiliary flag (AF), zero flag (ZF), and sign flag (SF) seem to settle at 25%, but the parity flag (PF) jumps out at well over 50%.

I'd like to know why the probabilities of CF, AF, ZF, and SF are so low.

For the PF, my own two-cents explanation is this: all possible 8-bit bit patterns split 50-50 between even and odd parity, and a couple of the most frequently used numbers (0 and -1) have even parity, so a probability above 50% seems reasonable.
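The parity argument is easy to check with a few lines of Python. This is just a sanity check of the reasoning, not part of the measurement; `has_even_parity` is a hypothetical helper name. PF is set when the low 8 bits of a result contain an even number of 1 bits.

```python
def has_even_parity(byte):
    """True if the low 8 bits hold an even number of 1 bits (PF would be set)."""
    return bin(byte & 0xFF).count("1") % 2 == 0


even = sum(has_even_parity(b) for b in range(256))
print(even, 256 - even)        # even vs. odd parity over all 8-bit patterns
print(has_even_parity(0x00))   # zero
print(has_even_parity(0xFF))   # -1 as an 8-bit two's-complement value
```

The split is exactly 128 to 128, and both 0 and -1 fall on the even side, so any workload that produces those two values more often than average pushes PF above 50%.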

Whitlow answered 28/1, 2018 at 19:3 Comment(8)
Cool research, but I'm really unsure if there is an actual answer to your question. - Nightcap
It could be interesting to generate random numbers out of the flag values. - Metaplasia
There are just too many combinations and permutations that could skew your results even more. As an example, suppose DF is inverted in the RTC interrupt and then inverted again just before IRET. Then the count in the graphical IDE would be approx. 2,800,000,000. One thing I did find interesting is the inverse proportionality of CF and ZF. There are half as many ZFs as CFs in the BASIC compiler, but twice as many ZFs as CFs in the IDE. That would suggest that implementers in BASIC favor one comparison paradigm over the other, and vice versa in the graphical IDE. - Genetics
@Genetics About the RTC interrupt: mine was a simple test program. All code in interrupt handlers, hardware or software, was excluded, since the trap flag is by design automatically cleared on entering the handler. - Whitlow
Loops are an example: CF and ZF are usually set only in the last iterations, and loop bodies are filled with instructions that don't affect them (mostly moves, pushes, or pointer arithmetic). Since tracing happens at runtime, loops lower the odds for CF and ZF. When doing arithmetic, SF and CF denote "critical" conditions that one usually tries to avoid. When doing logic ops, the odds should be 50-50. Note how AF and CF have similar scores (maybe set at the same time by an OR/SBB/SUB). However, we cannot assume a uniform distribution of values or of the "affectability" of a flag, so maybe the expectations are wrong? - Holocaine
Is this per instruction, even for ones like mov, which don't affect flags? I think it would somehow be more interesting to check a flag only when the instruction actually affected it, although I have no idea why that sounds interesting or what the benefit would be. Basically, if you have CF=1 across a larger stretch of mov instructions, you count it several times, while the code is not relevant to the CF content. About PF: I agree, zero alone will probably skew it enough to be above 50% most of the time. - Nellanellda
Isn't this going to depend a lot on what your program is doing, as well as how your compiler chooses to do things? Which compiler, and what applications were you instrumenting? Extended-precision math with numbers that are actually large will set CF more. uint64_t with numbers that are not large will often clear CF (I think). Signed values crossing zero will set CF, but it can be rare to actually have negative numbers even when using signed types. - Sluiter
FP comparisons set PF on NaN. But any bias or non-uniformity in the small numbers you handle will probably dominate that. Address math using add / sub may often be aligned, leaving the low couple of bits of a reg clear, biasing PF somehow... Or maybe just some specific numbers come up a lot, and they happen to set PF. - Sluiter

The fact that certain EFLAGS bits change often merely reflects the fact that early Intel 8086 instructions (which, incidentally, are still among the most frequently used) were designed to update the flags unconditionally. That design decision did not pay off well, but it does not hurt modern x86 designs either, until something actually reads the flag values. Using a flag (as the predicate of a conditional branch) creates a dependency in the code stream, and that may affect performance.

If two (three, four…) instructions consecutively update the same flag bit but only the last value is used by a later instruction, then all of the earlier flag calculations can be omitted. Alternatively, the recalculation of EFLAGS can be delayed until something requests its actual value.
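The delayed recalculation described above can be sketched like this. It is a minimal illustration in Python, assuming 32-bit operands and only `add`/`sub`; the class `LazyFlags` and its methods are hypothetical names and are not taken from any particular simulator. Each flag-writing instruction only records its operands and result, and a flag is derived only when something reads it.

```python
MASK32 = 0xFFFFFFFF


class LazyFlags:
    """Remembers the last flag-writing operation; derives flags only on demand."""

    def __init__(self):
        self.op = None        # "add" or "sub"
        self.a = 0
        self.b = 0
        self.result = 0
        self.evaluations = 0  # how many times a flag was actually computed

    def record(self, op, a, b):
        """Called for every flag-writing instruction: no flag math happens here."""
        self.op, self.a, self.b = op, a & MASK32, b & MASK32
        raw = self.a + self.b if op == "add" else self.a - self.b
        self.result = raw & MASK32
        return self.result

    def zf(self):
        self.evaluations += 1
        return int(self.result == 0)

    def sf(self):
        self.evaluations += 1
        return (self.result >> 31) & 1

    def cf(self):
        self.evaluations += 1
        if self.op == "add":
            return int(self.a + self.b > MASK32)  # carry out of bit 31
        return int(self.a < self.b)               # borrow on subtraction


flags = LazyFlags()
flags.record("add", 1, 2)            # overwritten before anything reads the flags
flags.record("add", 0xFFFFFFFF, 1)   # also overwritten
flags.record("sub", 5, 5)            # only this state is ever inspected
print(flags.zf(), flags.cf(), flags.evaluations)
```

Three flag-writing operations were executed, but flag values were derived only twice, and only from the last one.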

Thus, the more interesting question is how often individual EFLAGS bits are actually used. And there are studies that answer it.

The following picture is taken from Yair Lifshitz et al., "Zsim: A Fast Architectural Simulator for ISA Design-Space Exploration", section 3.2.3 (PDF):

[Figure: EFLAGS use]

As you can see, nobody cares about the auxiliary flag, while the carry and zero flags affect many decisions in code. Other authors came to similar conclusions, which matter in the context of, e.g., software simulator design, because they allow an important optimization in binary translators and interpreters: lazy flag evaluation.

Disease answered 22/2, 2018 at 22:28 Comment(3)
"That design decision did not pay out well": I wouldn't say that. The really problematic instructions for high-performance implementations are instructions that conditionally update flags, like shifts, which leave them unmodified if the count is 0. This is why shl reg,cl is 3 uops on Sandybridge-family. Unconditional writes to some but not all flags are also problematic, like inc. Instructions like add are basically fine. See "INC instruction vs ADD 1: Does it matter?" for more about inc and shift. - Sluiter
Having most instructions set flags was in fact a good thing for 8086, I think, increasing code density by avoiding the need for test ax,ax or whatever before some jcc instructions. This is still an advantage for x86 code density. Out-of-order execution does have to rename the flags as well as integer registers to avoid output dependencies, but I think out-of-order implementations of ISAs that don't always set flags by default would do that, too. (E.g. ARM: only the flag-setting version of some insns has a short Thumb2 encoding, because redundant flag writes don't hurt performance.) - Sluiter
@PeterCordes I didn't know that old shifts were that bad, thanks! I do not think that it was bad for the 8086 era, but now it is an annoying legacy (renaming mostly solves it). - Disease
