Why is sqrt in global scope much slower than std::sqrt in MinGW?

Asked 24/10, 2014 at 11:35 Answered 24/10, 2014 at 14:9

Solved c++ performance function mingw sqrt

Consider the following code:

#include <cmath>
#include <cstdio>

const int COUNT = 100000000;

int main()
{
    double sum = 0;
    for (int i = 1; i <= COUNT; ++i)
        sum += sqrt(i);
    printf("%f\n", sum);
    return 0;
}

It takes 5.5 s to run on my computer. However, if I change sqrt to std::sqrt, it runs in only 0.7 s.

I know that if I use sqrt, I'm using the function from the C library, and if I use std::sqrt, I'm using the one in <cmath>.

But <cmath> doesn't define one for int, and if I change the type of i to double, both versions run at the same speed. So the compiler isn't optimizing for int. This seems to happen only to sqrt, and only on Windows.

So why is std::sqrt much faster than sqrt, when this is not the case for other functions? And why is there no such difference on Linux?

Clavius answered 24/10, 2014 at 11:35 Comment(15)

What is the compiler version and is this MinGW(.org) or MinGW-w64? – Nahshunn 24/10, 2014 at 11:39

C++11 adds support for integer arguments to std::sqrt: en.cppreference.com/w/cpp/numeric/math/sqrt – Roily 24/10, 2014 at 11:43

Obviously if you are doing math with doubles, it's going to take a noticeably longer time to run than with just ints. – Todd 24/10, 2014 at 11:45

@PaulR but doesn't that just cast the integer parameter to double (at least according to cppreference, or maybe I misunderstand it) which is equivalent to what would happen with ::sqrt? – Wellread 24/10, 2014 at 11:46

Have you looked at the generated machine code? – Edmundson 24/10, 2014 at 11:47

And your compiler compilation options are ... ? When you post that a particular piece of code is slow/fast, you should post the compiler and compiler options used to build your example. Otherwise it's a waste of time speculating what the issue is. – Ishtar 24/10, 2014 at 11:51

I compiled it with g++ test.cpp -o test, no optimization flag. I've tested it with the compiler from mingw.org, both 32bit and 64bit. I'm not using C++11. – Clavius 24/10, 2014 at 11:55

@yyt16384 - No optimizations == meaningless results. – Ishtar 24/10, 2014 at 11:57

@yyt16384 How about you try it with optimisations and if the difference persists show some assembly (-S for gcc)? – Conceited 24/10, 2014 at 12:1

@yyt16384: "I'm not using C++11" - that doesn't mean your library isn't. My version of GCC includes the new overload whether or not you specify C++11. – Hairsplitter 24/10, 2014 at 12:5

@rsethc: Honestly, it's only "obvious" with the right knowledge. My wife couldn't tell, for example. At the same time a lot of stuff that is "obvious" to her is Egyptian hieroglyphs to me. As such, I removed "obviously" from my vocabulary, because "obviousness" always depends on context and knowledge that is not in your genes, and it's never a reason at all. I would go so far as to say that using "obvious" often indicates a lack of knowledge, actually, or a lack of explanation skills. – Phocis 24/10, 2014 at 15:47

@phresnel: I would consider it blatantly obvious compared to "it's the way the machine handles instructions generated in a namespace versus compiled to be optimized for the global scope" which is nonsense. (Things in namespaces are not, to my knowledge, treated any differently than things not in namespaces.) – Todd 24/10, 2014 at 20:21

@rsethc: It is obvious that if this is universally obvious, then the question asked here does not exist. Btw, it's totally, utterly and blatantly obvious to me that things are handled equally, never mind how deeply they are nested w.r.t. namespaces. To you, it's obviously not. Get my point? – Phocis 24/10, 2014 at 20:41

@phresnel Alright good point that the question wouldn't be here, but no what I said was that they are treated the same. But yes I get the point, I suppose it's obvious to me but not the asker. – Todd 25/10, 2014 at 15:35

@rsethc: More specifically, you wrote that to your knowledge, they are treated the same. So obviously, it's not that obvious. Anyway, have a nice Sunday :) – Phocis 26/10, 2014 at 10:52

This is the typical situation in which the -fdump-tree-* switch can give some insight into what is going on:

g++ -fdump-tree-optimized example1.cc

and you get the example1.cc.165t.optimized file. Somewhere inside:

<bb 3>:
_5 = (double) i_2;
_6 = sqrt (_5);
sum_7 = sum_1 + _6;
i_8 = i_2 + 1;

The compiler (gcc v4.8.3) is doing the math with doubles.

If you replace sqrt with std::sqrt, you get:

<bb 3>:
_5 = std::sqrt<int> (i_2);
sum_6 = sum_1 + _5;
i_7 = i_2 + 1;

Now it uses a different sqrt overload for integers (i_2 is int and sum_6 is double).

As Mike Seymour says in a comment, GCC uses the new overload whether or not you specify C++11.
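The overload selection described above can be sketched in a few lines. This is only an illustration, assuming a standard library that ships the C++11 integral overloads of std::sqrt (as GCC's does, per the comment mentioned above). The helper names sqrt_via_std and sqrt_via_c are made up for this example, and, like the code in the question, it relies on <cmath> also declaring sqrt in the global namespace.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// With the integral overloads available, std::sqrt called with an int
// still yields a double; the argument is converted inside the library.
static_assert(std::is_same<decltype(std::sqrt(1)), double>::value,
              "std::sqrt(int) is expected to return double");

// Hypothetical helper: takes the path shown as std::sqrt<int> in the dump.
inline double sqrt_via_std(int i)
{
    assert(i >= 0);  // a negative argument would produce NaN
    return std::sqrt(i);
}

// Hypothetical helper: takes the path shown as sqrt((double) i) in the dump,
// with the int -> double conversion written out at the call site.
inline double sqrt_via_c(int i)
{
    assert(i >= 0);  // a negative argument would produce NaN
    return ::sqrt(static_cast<double>(i));
}
```

Both helpers should return the same value for the same argument; what differs is which function ends up being called, and that is what the tree dumps make visible.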

In any case, under Linux there is no appreciable performance difference between the two implementations.

Under Windows (MinGW) the situation is different, since sqrt(double) calls into msvcrt, the Microsoft C runtime.
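To check this on a given machine, the two variants can be timed side by side. The following is a rough sketch rather than a rigorous benchmark. It assumes a C++11 compiler (for <chrono>), and the names sum_with_std_sqrt, sum_with_c_sqrt and time_ms are invented for this example.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>

// Sums sqrt(i) for i in [1, count] through std::sqrt with an int argument,
// i.e. the call that appears as std::sqrt<int> in the dump above.
double sum_with_std_sqrt(int count)
{
    assert(count >= 0);
    double sum = 0;
    for (int i = 1; i <= count; ++i)
        sum += std::sqrt(i);
    return sum;
}

// Same sum through the global-namespace sqrt, with the int -> double
// conversion written out. As in the question, this relies on <cmath>
// also declaring sqrt in the global namespace.
double sum_with_c_sqrt(int count)
{
    assert(count >= 0);
    double sum = 0;
    for (int i = 1; i <= count; ++i)
        sum += ::sqrt(static_cast<double>(i));
    return sum;
}

// Hypothetical helper: runs f(count), stores its return value in result,
// and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F f, int count, double& result)
{
    auto start = std::chrono::steady_clock::now();
    result = f(count);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_ms(sum_with_std_sqrt, COUNT, a) and time_ms(sum_with_c_sqrt, COUNT, b) from main and printing both times and both sums gives a direct comparison. The sums should agree. The timings depend on the compiler, the optimization flags and the C runtime in use, so they need to be measured on the platform in question.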

Lorollas answered 24/10, 2014 at 14:9 Comment(0)
