learning sample of likely() and unlikely() compiler hints

Asked 29/4, 2010 at 16:3 Answered 6/7, 2020 at 9:59

Solved c gcc optimization likely-unlikely

How can I demonstrate for students the usability of likely and unlikely compiler hints (__builtin_expect)?

Can you write an sample code, which will be several times faster with these hints comparing the code without hints.

Quinze answered 29/4, 2010 at 16:3 Comment(9)

kerneltrap.org/node/4705 – Pozzy 29/4, 2010 at 16:11

@T.J. Crowder, yes. but I want a program sample in which students can FEEL the difference, not to read it in assembler. – Quinze 29/4, 2010 at 16:35

@osgx: You won't get such a sample, because there is no difference to be felt. This is the worst, ugliest kind of useless micro-optimization. – Piloting 6/2, 2011 at 8:14

@R.., but Linux kernel actively using It. Also, the programmer knows a lot about his program and can mark rare used code better than compiler can detect this. – Quinze 7/2, 2011 at 0:8

Linux kernel is actively writing all sorts of things that are horrible C. I agree the programmer can better know which branch is more likely, but that doesn't mean I support ugly premature micro-optimization all over the place. It's really rare that these ugly likely macros produce any perceivable performance benefit. – Piloting 7/2, 2011 at 0:27

@R.., If you think that likely/unlikely can't give any benefit, you cat open a new Question. But this question was opened upon request from my friend. He is a CS teacher of first-year students and he wanted to show them a likely/unlikely example. He wasn't able to produce a example, so I asked this q. I don't know is this example required by course or he adds it by himself. But the goal of this is to give basic understanding of likely to students. – Quinze 7/2, 2011 at 0:34

What kind of first-year CS instructor teaches students premature micro-optimization? He should be teaching them the real language first, and higher-level efficiency considerations (such as working with data in-place rather than allocating copies, efficient algorithms with respect to time and space, etc.). If likely and unlikely are ever taught in a CS curriculum, it should be a senior year topic. – Piloting 7/2, 2011 at 4:13

Its russia. First year of the high school. What do you mean as "senior year"? – Quinze 7/2, 2011 at 16:16

This teacher is the coach of ACM team, which passed to half-final or even to final. So, I think, he does teach a lot of high-level optimisation. Also, this unlikely/likely was not the first lesson, but it was only a part of lesson in middle of the year. – Quinze 7/2, 2011 at 16:19

Here is the one I use, a really inefficient implementation of the Fibonacci numbers:

#include <stdio.h>
#include <inttypes.h>
#include <time.h>
#include <assert.h>

#define likely(x) __builtin_expect((x),1)
#define unlikely(x) __builtin_expect((x),0)

uint64_t fib(uint64_t n)
{
    if (opt(n == 0 || n == 1)) {
        return n;
    } else {
        return fib(n - 2) + fib(n - 1);
    }
}

int main(int argc, char **argv)
{
    int i, max = 45;
    clock_t tm;

    if (argc == 2) {
        max = atoi(argv[1]);
        assert(max > 0);
    } else {
        assert(argc == 1);
    }

    tm = -clock();
    for (i = 0; i <= max; ++i)
        printf("fib(%d) = %" PRIu64 "\n", i, fib(i));
    tm += clock();

    printf("Time elapsed: %.3fs\n", (double)tm / CLOCKS_PER_SEC);
    return 0;
}

To demonstrate, using GCC:

~% gcc -O2 -Dopt= -o test-nrm test.c
~% ./test-nrm
...
fib(45) = 1134903170
Time elapsed: 34.290s

~% gcc -O2 -Dopt=unlikely -o test-opt test.c
~% ./test-opt
...
fib(45) = 1134903170
Time elapsed: 33.530s

A few hundred milliseconds less. This gain is due to the programmer-aided branch prediction.

But now, for what the programmer should really be doing instead:

~% gcc -O2 -Dopt= -fprofile-generate -o test.prof test.c
~% ./test.prof 
...
fib(45) = 1134903170
Time elapsed: 77.530s  /this run is slowed down by profile generation.

~% gcc -O2 -Dopt= -fprofile-use -o test.good test.c
~% ./test.good
fib(45) = 1134903170
Time elapsed: 17.760s

With compiler-aided runtime profiling, we managed to reduce from the original 34.290s to 17.760s. Much better than with programmer-aided branch prediction!

Googly answered 29/4, 2010 at 16:56 Comment(7)

profile use is the good option, but I need to demonstrate likely and unlikely – Quinze 29/4, 2010 at 17:14

aah, tm = -clock(); tm += clock(); is beautiful, I haven't seen it before, thank you! – Uniplanar 29/4, 2010 at 18:0

I think what this demonstrates is that likely and unlikely are not very useful. Also perhaps that this is a really bad implementation of fib()... – Piloting 7/2, 2011 at 0:29

R.., This is only a studying example, and it show that sometimes likely can help a bit. – Quinze 7/2, 2011 at 0:35

Where are likely and unlikely actually used in the above code? – Peignoir 22/2, 2011 at 8:45

@joh hanson: you pass it as a compiler define in -Dopt= – Googly 22/2, 2011 at 13:16

On my laptop: -Dopt= ==> 44.284s, -Dopt=unlikely ==> 43.833s | -Dopt= -fprofile-generate ==> 27.563s, -Dopt= -fprofile-use ==> 21.878s | -Dopt=unlikely -fprofile-generate ==> 27.245s, -Dopt=unlikely -fprofile-use ==> 21.863s (gcc version 4.8.4) I guess the benefit is there anyway, if you are writing a linux scheduler... – Lucrecialucretia 5/2, 2016 at 14:48

-2

From this blog post. I think likely and unlikely are mostly obsolete. Very cheap CPUs (ARM Cortex A20 in the example) have branch predictors and there is no penalty regardless of jump is taken / jump is not taken. When you introduce likely/unlikely the results will be either the same or worse (because compiler has generated more instructions).

Myrick answered 6/7, 2020 at 9:59 Comment(0)

Recommended topics

Hot tags