Is else slower than elsif?
Why is the sub eins (with the else) slower here than the sub zwei (with the elsif)?

#!/usr/bin/env perl
use warnings;
use 5.012;
use Benchmark qw(:all);

my $d = 0;
my $c = 2;

sub eins {
    if ( $c == 1) {
        $d = 1;
    }
    else {
        $d = 2;
    }
}

sub zwei {
    if ( $c == 1) {
        $d = 1;
    }
    elsif ( $c == 2 ) {
        $d = 2;
    }
}

sub drei {
    $d = 1;
    $d = 2 if $c == 2;
}

cmpthese( -5, {
    eins => sub{ eins() },
    zwei => sub{ zwei() },
    drei => sub{ drei() },
} );

        Rate eins drei zwei
eins 4167007/s   --  -1% -16%
drei 4207631/s   1%   -- -15%
zwei 4972740/s  19%  18%   --

        Rate eins drei zwei
eins 4074356/s   --  -8% -16%
drei 4428649/s   9%   --  -9%
zwei 4854964/s  19%  10%   --

        Rate eins drei zwei
eins 3455697/s   --  -6% -19%
drei 3672628/s   6%   -- -14%
zwei 4250826/s  23%  16%   --

        Rate eins drei zwei
eins 2832634/s   --  -8% -19%
drei 3088931/s   9%   -- -12%
zwei 3503197/s  24%  13%   --

        Rate eins zwei drei
eins 3053821/s   -- -17% -26%
zwei 3701601/s  21%   -- -10%
drei 4131128/s  35%  12%   --

        Rate eins drei zwei
eins 3033041/s   --  -2% -12%
drei 3092511/s   2%   -- -10%
zwei 3430837/s  13%  11%   --

Summary of my perl5 (revision 5 version 16 subversion 0) configuration:

Platform:
    osname=linux, osvers=3.1.10-1.9-desktop, archname=x86_64-linux
    uname='linux linux1 3.1.10-1.9-desktop #1 smp preempt thu apr 5 18:48:38 utc 2012 (4a97ec8) x86_64 x86_64 x86_64 gnulinux '
    config_args='-de'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.6.2', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib/../lib64 /usr/lib/../lib64 /lib /usr/lib /lib64 /usr/lib64 /usr/local/lib64
    libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.14.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.14.1'
Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'


Characteristics of this binary (from libperl): 
Compile-time options: HAS_TIMES PERLIO_LAYERS PERL_DONT_CREATE_GVSV
                        PERL_MALLOC_WRAP PERL_PRESERVE_IVUV USE_64_BIT_ALL
                        USE_64_BIT_INT USE_LARGE_FILES USE_LOCALE
                        USE_LOCALE_COLLATE USE_LOCALE_CTYPE
                        USE_LOCALE_NUMERIC USE_PERLIO USE_PERL_ATOF
Built under linux
Compiled at May 24 2012 20:53:15
%ENV:
    PERL_HTML_DISPLAY_COMMAND="/usr/bin/firefox -new-window %s"
@INC:
    /usr/local/lib/perl5/site_perl/5.16.0/x86_64-linux
    /usr/local/lib/perl5/site_perl/5.16.0
    /usr/local/lib/perl5/5.16.0/x86_64-linux
    /usr/local/lib/perl5/5.16.0
    .
Horseflesh answered 11/6, 2012 at 9:41 Comment(5)
I'm putting this as a comment because I don't know for sure how the guts of perl work, but I'd guess that zwei's elsif construct can be found more quickly because it's explicitly tied to $c being 2, and $c is always 2. If the interpreter is smart enough to figure out that $c never changes, that could explain it. Without that, though, you would expect zwei to be the slower one, as it would surely have to test $c twice. - Bioscopy
The interpreter might also turn the zwei conditional into a jump table like C does with switch. - Helse
@Matthew Walton, $c can change simply by fetching it, so the <strike>interpreter</strike> can't and doesn't make any such optimisation. - Boelter
@msw, It does not. See perl -MO=Concise,-exec -e'...' - Boelter
@Matthew Walton, That should say "<strike>interpreter</strike> compiler" - Boelter

[ This isn't an answer per se, but it is useful information that doesn't fit in a comment. ]

First, let's look at the compiled form side by side. If $c == 2, the execution path of "zwei" is a pure superset of that of "eins". (The ops executed are marked with "*".)

*1  <0> enter                            *1  <0> enter
*2  <;> nextstate(main 4 -e:2) v:{       *2  <;> nextstate(main 4 -e:2) v:{
*3  <#> gvsv[*c] s                       *3  <#> gvsv[*c] s
*4  <$> const[IV 1] s                    *4  <$> const[IV 1] s
*5  <2> eq sK/2                          *5  <2> eq sK/2
*6  <|> cond_expr(other->7) vK/1         *6  <|> cond_expr(other->7) vK/1
 7      <0> enter v                       7      <0> enter v
 8      <;> nextstate(main 1 -e:3) v:{    8      <;> nextstate(main 1 -e:3) v:{
 9      <$> const[IV 1] s                 9      <$> const[IV 1] s
 a      <#> gvsv[*d] s                    a      <#> gvsv[*d] s
 b      <2> sassign vKS/2                 b      <2> sassign vKS/2
 c      <@> leave vKP                     c      <@> leave vKP
            goto d                                   goto d
                                         *e  <#> gvsv[*c] s
                                         *f  <$> const[IV 2] s
                                         *g  <2> eq sK/2
                                         *h  <|> and(other->i) vK/1
*e  <0> enter v                          *i      <0> enter v
*f  <;> nextstate(main 2 -e:6) v:{       *j      <;> nextstate(main 2 -e:6) v:{
*g  <$> const[IV 2] s                    *k      <$> const[IV 2] s
*h  <#> gvsv[*d] s                       *l      <#> gvsv[*d] s
*i  <2> sassign vKS/2                    *m      <2> sassign vKS/2
*j  <@> leave vKP                        *n      <@> leave vKP
*d  <@> leave[1 ref] vKP/REFC            *d  <@> leave[1 ref] vKP/REFC
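
A listing of this kind can also be produced from inside a script with the core B::Concise module, rather than with perl -MO=Concise,-exec on the command line. The following is only a minimal sketch: the helper name ops_for is made up for illustration, and the variables are declared with our (an assumption, to get gvsv ops as in the listing above; the question's file-scoped my variables would show up as different ops). The exact output differs between perl versions and between threaded and unthreaded builds.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use B::Concise ();

# Package variables (assumption), so the fetches appear as gvsv ops.
our $c = 2;
our $d = 0;

sub eins {
    if ( $c == 1 ) { $d = 1 }
    else           { $d = 2 }
}

sub zwei {
    if    ( $c == 1 ) { $d = 1 }
    elsif ( $c == 2 ) { $d = 2 }
}

# Hypothetical helper: return the execution-order op listing of a sub
# as a string instead of printing it to STDOUT.
sub ops_for {
    my ($code) = @_;
    my $walker = B::Concise::compile( '-exec', $code );
    B::Concise::walk_output( \my $buf );    # redirect the listing into $buf
    $walker->();
    return $buf;
}

my $eins_ops = ops_for( \&eins );
my $zwei_ops = ops_for( \&zwei );

print "eins:\n$eins_ops\n";
print "zwei:\n$zwei_ops\n";
```

Comparing the two strings should show the second comparison (gvsv, const, eq, and) that zwei has and eins lacks.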

The thing is, I can reproduce your results! (v5.16.0 built for x86_64-linux-thread-multi)

           Rate drei eins zwei
drei  8974033/s   --  -3% -19%
eins  9263260/s   3%   -- -16%
zwei 11034175/s  23%  19%   --

           Rate drei eins zwei
drei  8971868/s   --  -1% -21%
eins  9031677/s   1%   -- -20%
zwei 11333871/s  26%  25%   --

This isn't a small difference (which could be the result of CPU caching), and it's reproducible across runs (so it's not another application affecting the benchmark). I'm stumped.

Per iteration, it's taking 22 ns (1/9031677 s - 1/11333871 s) more to do 4 fewer ops. I would expect it to take roughly 100 ns less.
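
The 22 ns figure follows directly from the rates in the second run above. As a small sketch of that arithmetic (the hash names are illustrative only):

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Rates (iterations per second) taken from the second benchmark run above.
my %rate = ( eins => 9_031_677, zwei => 11_333_871 );

# Time per iteration in nanoseconds is 1e9 divided by the rate.
my %ns = map { $_ => 1e9 / $rate{$_} } keys %rate;

my $diff = $ns{eins} - $ns{zwei};

printf "%s: %.1f ns per iteration\n", $_, $ns{$_} for sort keys %ns;
printf "difference: %.1f ns\n", $diff;
```

Note that each measured iteration also includes the call through the anonymous wrapper sub passed to cmpthese, so these are not the costs of the if/else bodies alone.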

Boelter answered 11/6, 2012 at 17:46 Comment(4)
Are you sure this couldn't be attributed to caches or something? Maybe you can check oprofile to see if there's any difference (although I'm a bit skeptical, I usually fail to make sense of oprofile data). - Fevre
@jpalecek, Can cache misses account for a loss equivalent to a hundred machine opcodes? - Boelter
Yes, they could. However, I'm really puzzled by that behavior: in any reasonable environment, running a small routine over and over shouldn't produce many cache misses, so maybe it's branch mispredictions? I don't know. - Fevre
BTW, the supposedly faster version couldn't reasonably have taken 100 ns less, since the slowest version only takes some 110 ns per iteration. - Fevre

You'd have to check the generated opcodes of your specific Perl version. Interestingly, when run from a debugger, the outcome is different:

 Win64, Activeperl 5.14.2, from debugger environment:
           Rate zwei eins drei
 zwei 130806/s   --  -0%  -1%
 eins 130957/s   0%   --  -0%
 drei 131612/s   1%   1%   --

Without the debugger, eins is fastest (as one would expect from the code):

 Win64, Activeperl 5.14.2 :
           Rate drei zwei eins
 drei 3402015/s   --  -5% -13%
 zwei 3585171/s   5%   --  -8%
 eins 3916856/s  15%   9%   --

Actually, I didn't find a system here where zwei is faster than eins:

 Linux x64, Perl 5.12.1:
           Rate drei zwei eins
 drei 2439279/s   -- -13% -15%
 zwei 2797316/s  15%   --  -3%
 eins 2875184/s  18%   3%   --


Addendum:

After reading ikegami's posting, I tried Strawberry Perl 5.16 on Win64, and voilà:

 Perl v5.16.0, MSWin32-x64-multi-t
           Rate eins drei zwei
 eins 3954005/s   --  -3% -10%
 drei 4084178/s   3%   --  -7%
 zwei 4406707/s  11%   8%   --

Here we go: the elsif version (zwei) is faster.

So this seems to be an issue connected to 5.16.0?

Bobbiebobbin answered 11/6, 2012 at 9:50 Comment(0)

I think the best answer may well be: "Who cares?"

This is micro-benchmarking of the most premature kind. If profiling finds that the biggest reason for real code being slow is using else instead of elsif then I have a very fine hat that I would be loath to lose, but which I would devour in a heartbeat.

Personally I'd write $d = ($c == 1) ? 1 : 2 and not care tuppence about its performance.
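
One thing worth noting about that one-liner: it behaves like eins for every value of $c, whereas zwei and drei only agree with it when $c is 1 or 2. The sketch below checks this; the sub name ternary and the starting value 0 for $d are chosen purely for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my ( $c, $d );

sub eins    { if ( $c == 1 ) { $d = 1 } else { $d = 2 } }
sub zwei    { if ( $c == 1 ) { $d = 1 } elsif ( $c == 2 ) { $d = 2 } }
sub drei    { $d = 1; $d = 2 if $c == 2 }
sub ternary { $d = ( $c == 1 ) ? 1 : 2 }

my %variant = (
    eins    => \&eins,
    zwei    => \&zwei,
    drei    => \&drei,
    ternary => \&ternary,
);

# Record what each variant leaves in $d for a few values of $c,
# resetting $d to 0 before every call.
my %result;
for my $value ( 1, 2, 3 ) {
    for my $name ( sort keys %variant ) {
        ( $c, $d ) = ( $value, 0 );
        $variant{$name}->();
        $result{$value}{$name} = $d;
        print "c=$value $name: d=$d\n";
    }
}
```

For $c == 3, eins and the ternary set $d to 2, zwei leaves $d untouched, and drei sets it to 1, so the benchmark in the question only compares equivalent code because $c is fixed at 2.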

Mitman answered 19/6, 2012 at 8:22 Comment(0)