I'm not sure if the following code can cause redundant calculations, or is it compiler-specific?
for (int i = 0; i < strlen(ss); ++i)
{
// blabla
}
Will strlen()
be calculated every time when i
increases?
Yes, strlen()
will be evaluated on each iteration. It's possible that, under ideal circumstances, the optimiser might be able to deduce that the value won't change, but I personally wouldn't rely on that.
I'd do something like
for (int i = 0, n = strlen(ss); i < n; ++i)
or possibly
for (int i = 0; ss[i]; ++i)
as long as the string isn't going to change length during the iteration. If it might, then you'll need to either call strlen()
each time, or handle it through more complicated logic.
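As a rough sketch of the two alternatives above, assuming the string does not change length while it is being scanned, the helpers below (count_char_cached and count_char_sentinel are hypothetical names, not from the original answer) count a character using each loop form:

```c
#include <stddef.h>
#include <string.h>

/* Counts occurrences of c in s, computing the length once up front.
   Hypothetical helper, for illustration only. */
size_t count_char_cached(const char *s, char c)
{
    size_t count = 0;
    for (size_t i = 0, n = strlen(s); i < n; ++i) {
        if (s[i] == c)
            ++count;
    }
    return count;
}

/* Same result, but stops at the terminating '\0' and never calls strlen. */
size_t count_char_sentinel(const char *s, char c)
{
    size_t count = 0;
    for (size_t i = 0; s[i] != '\0'; ++i) {
        if (s[i] == c)
            ++count;
    }
    return count;
}
```

Both functions should return the same count for any NUL-terminated string; which one is faster in practice would have to be measured.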
strlen anyway. – Robinia
strlen is marked as __attribute__((pure)) allowing the compiler to elide multiple calls. GCC Attributes – Solomon
strcpy (e.g. reading several bytes at once); the second might play more nicely with the cache since it isn't reading the whole string right away. You'd have to measure if it were important. – Enquire
ss or pass it to a function it won't optimize the call, even if you don't change the length. Some kinds of aliasing may break it as well. – Pedraza
strlen as an intrinsic nowadays and will optimize away multiple calls. Regardless, the latter form will perform better for reasons mentioned by @R (except for the word "coherency"). – Turbulent
strlen is simply not a pure function, and it is not marked with __attribute__((pure)) in glibc (look it up). However, it’s marked as a compiler intrinsic and will get special treatment. This special treatment needs to be emphasised here. All the answers just saying “yes” are wrong, and even this answer misstates the probability of strlen being hoisted out of the loop. – Jazminejazz
strlen() is marked __attribute__((pure)) in string.h in the latest Ubuntu libc6-dev (quantal). And gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html says "Some of common examples of pure functions are strlen or memcmp". They shouldn't be (unless gcc means something else by "pure"), but they are. – Cashandcarry
pure is a lie. – Jazminejazz
for( const char *p = ss ; p ; ++p ) ? – Hus
*p, not p). –
Enquire
Yes. The length of the string is recalculated every time the loop condition is evaluated, so write the loop like this instead:
char str[30];
for ( int i = 0; str[i] != '\0'; i++)
{
//Something;
}
In the above code str[i]
only verifies one particular character in the string at location i
each time the loop starts a cycle, thus it will take less memory and is more efficient.
See this Link for more information.
In the code below every time the loop runs strlen
will count the length of the whole string which is less efficient, takes more time and takes more memory.
char str[30];
for ( int i = 0; i < strlen(str); i++)
{
//Something;
}
strlen call, and if you're running that tight, you probably should be thinking about eliding a few other function calls as well... – Enhance
int strlen(char *s) { int len = 0; while(s[len] != '\0') len++; return len; } which is pretty much exactly what you are doing in the code in your answer. I'm not arguing that iterating over the string once rather than twice is more time-efficient, but I don't see one or the other using more or less memory. Or are you referring to the variable used to hold the string length? –
Enhance
A good compiler may not calculate it every time, but I don't think you can be sure that every compiler does this.
In addition, the compiler has to know that strlen(ss) does not change. This is only true if ss is not changed in the for loop.
For example, if you call a read-only function on ss in the for loop but don't declare its ss parameter as const, the compiler cannot know that ss is not changed in the loop and has to calculate strlen(ss) in every iteration.
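To illustrate why the compiler has to be careful here, the sketch below uses a hypothetical callee, truncate_at, that takes a pointer-to-const yet still shortens the string. This is only legal because the caller's buffer is a non-const array; the names and the scenario are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callee: declared with a pointer-to-const parameter, but it
   casts the const away and writes through it. Legal only when the object
   it points to is not const and not a string literal. */
static void truncate_at(const char *p, size_t pos)
{
    ((char *)p)[pos] = '\0';
}

/* Runs a loop whose condition calls strlen on every pass and returns
   how many times the loop body executed. */
size_t iterations_with_truncation(void)
{
    char buf[] = "hello";
    size_t iterations = 0;
    for (size_t i = 0; i < strlen(buf); ++i) {
        ++iterations;
        if (i == 1)
            truncate_at(buf, 2);   /* the string becomes "he" */
    }
    return iterations;
}
```

The body runs twice, because the third evaluation of the condition sees a length of 2. A compiler that hoisted strlen(buf) out of the loop without proving that truncate_at leaves the buffer alone would run the body five times instead.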
ss not be changed in the for loop; it must not be accessible from and changed by any function called in the loop (either because it is passed as an argument, or because it is a global variable or a file-scope variable). Const-qualification may also be a factor. – Taeniacide
restrict is for in C99. – Corvette
restrict qualifier. – Melitta
ss in the for-loop, then even if its parameter is declared const char*, the compiler still needs to recalculate the length unless either (a) it knows that ss points to a const object, as opposed to just being a pointer-to-const, or (b) it can inline the function or otherwise see that it is read-only. Taking a const char* parameter is not a promise not to modify the data pointed to, because it is valid to cast to char* and modify provided that the object modified isn't const and isn't a string literal. –
Corvette
If ss is of type const char * and you're not casting away the constness within the loop, the compiler might only call strlen once, if optimizations are turned on. But this is certainly not behavior that can be counted upon.
You should save the strlen result in a variable and use this variable in the loop. If you don't want to create an additional variable, depending on what you're doing, you may be able to get away with reversing the loop to iterate backwards.
for( auto i = strlen(s); i > 0; --i ) {
// do whatever
// remember that s[strlen(s)] is the terminating null character ('\0')
}
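One way to write the backwards loop so that it visits every character and also copes with an empty string is sketched below. The helper name reverse_copy and the caller-supplied output buffer are assumptions made for the example; the index is a size_t because strlen returns an unsigned type.

```c
#include <stddef.h>
#include <string.h>

/* Copies s reversed into out, which must have room for strlen(s) + 1 bytes.
   Hypothetical helper, for illustration only. */
void reverse_copy(const char *s, char *out)
{
    size_t k = 0;
    /* i-- > 0 tests before decrementing, so inside the body i runs from
       strlen(s) - 1 down to 0, and an empty string never enters the loop. */
    for (size_t i = strlen(s); i-- > 0; )
        out[k++] = s[i];
    out[k] = '\0';
}
```

With this form strlen is called exactly once, in the loop initializer, and no subtraction is performed on the unsigned length before the test.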
strlen at all. Just loop until you hit the end. – Selfoperating
i > 0 ? Should that not be i >= 0 here? Personally, I would also start at strlen(s) - 1 if iterating over the string backwards, then the terminating \0 needs no special consideration. – Enhance
i >= 0 works only if you initialize to strlen(s) - 1, but then if you have a string of zero length the initial value underflows – Uncovered
i > 0 expression on initial loop entry? If it doesn't, then you're right, the zero length case will definitely break the loop. If it does, you "simply" get a signed i == -1 < 0 so no loop entry if the conditional is i >= 0. – Enhance
strlen's return type is unsigned, so (strlen(s)-1) >= 0 evaluates to true for zero length strings. –
Uncovered
Formally yes, strlen() is expected to be called for every iteration.
That said, I do not want to rule out the existence of some clever compiler optimisation that eliminates every call to strlen() after the first one.
The predicate code in its entirety will be executed on every iteration of the for loop. In order to memoize the result of the strlen(ss) call, the compiler would need to know at least that:
strlen is side-effect free
ss doesn't change for the duration of the loop
The compiler doesn't know either of these things and hence can't safely memoize the result of the first call.
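A small sketch makes the "executed on every iteration" point observable. It replaces strlen with a hand-written stand-in, counting_strlen (a hypothetical name), that records how often it is called. Because the stand-in has a visible side effect, the compiler cannot elide the calls; the real strlen may be treated as a builtin and optimized differently.

```c
#include <stddef.h>

static size_t call_count = 0;

/* Hand-written stand-in for strlen that records how often it is called. */
size_t counting_strlen(const char *s)
{
    size_t len = 0;
    ++call_count;
    while (s[len] != '\0')
        ++len;
    return len;
}

/* Runs the loop from the question and returns the number of calls made. */
size_t calls_for_loop(const char *ss)
{
    call_count = 0;
    for (size_t i = 0; i < counting_strlen(ss); ++i) {
        /* loop body intentionally empty */
    }
    return call_count;
}
```

For a string of length n the condition is evaluated n + 1 times: once per iteration, plus the final evaluation that ends the loop.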
ss into a size_t or divide it up amongst several byte values. My devious thread could then just write bytes into that address and the compiler would have no way of understanding that it related to ss. – Messmate
strlen not only has to be side-effect free, it also has to be pure (returns the same value every time for the same input). But the point stands -- in general the compiler doesn't know either of those things, in specific cases it might know either or both. For example if it sees: const char ss[] = "hi";, then although in principle some devious other thread might alter the length of the string contained in ss, that would be UB and so the optimizer can assume it doesn't happen. – Corvette
int I failed to initialize that is pointing to random memory that just so happens to be the address of ss. The compiler can't infer the relationship between the write to that random memory and ss. It could only do so if it marked all writes as potentially modifying ss which would render the optimization moot. – Messmate
ss points to (or is) an array of const objects, is escape analysis. If ss is either an automatic variable or is declared restrict in C, and if no reference is taken to it that leaves code the optimizer can see, then it "hasn't escaped", and so could potentially be proven unmodified even if non-const. This works in multi-threaded environments as well as single-. – Corvette
int is UB (in C++ anyway: C might require a more sophisticated argument), so the optimizer is not required to account for the possibility. – Corvette
int a = 0; do_something(); printf("%d",a); cannot be optimized, on the basis that do_something() could do your uninitialized int thing, or could crawl back up the stack and modify a deliberately. In point of fact, gcc 4.5 does optimize it to do_something(); printf("%d",0); with -O3 – Corvette
a). While compilers do this optimization it's not 100% safe. It's a bit of a pathological case but do_something could modify the local a via stack walking (and disturbingly I've encountered situations where developers intentionally do this and even more disturbing is seeing it in managed code). Yes that behavior of the developer is completely unsafe, insane and yet they still do it. So while a sane optimization it's not a 100% safe one (which is the point I was trying to get at) – Messmate
do_something crawls up the stack into code optimized with O3 (or whatever level introduces this optimization), then it's do_something which is <100% safe, not the compiler. That's why we all laughed like drains over that elided null pointer check in the linux kernel: some l33t kernel hacker tripped over his feet by allowing his UB code to be compiled with flags that broke it. – Corvette
a, a compiler would be entitled to expect that it won't magically become initialized. The ARM version of GCC on Godbolt fails to properly handle some cases where a local variable has its address taken, however. Using memcpy to copy the contents of an uninitialized uint16_t variable to another should result in the latter one holding some number in the range 0-65535, but gcc converts the memcpy into an assignment and fails to regard that it must store a value in the 0-65535 range. –
Mohan
Yes, strlen will be calculated every time i increases.
If you don't change ss within the loop, computing the length once won't affect the logic; otherwise it will.
It is safer to use the following code.
int length = strlen(ss);
for ( int i = 0; i < length ; ++ i )
{
// blabla
}
Yes, strlen(ss)
will be calculated every time the code runs.
Yes, strlen(ss) will calculate the length at each iteration. If the loop body somehow lengthens ss while i is also increasing, the loop could run forever.
Yes, the strlen() function is called every time the loop condition is evaluated.
If you want to improve efficiency, remember to save values that don't change in local variables. It takes a little extra effort but is very useful.
You can use code like the following:
const char *str = "ss";
size_t l = strlen(str);
for ( size_t i = 0; i < l ; i++ )
{
// blablabla
}
Not common nowadays but 20 years ago on 16 bit platforms, I'd recommend this:
for ( char* p = str; *p; p++ ) { /* ... */ }
Even if your compiler isn't very smart about optimization, the above code can still result in good assembly code.
Yes. The test doesn't know that ss doesn't get changed inside the loop. If you know that it won't change then I would write:
int stringLength = strlen (ss);
for ( int i = 0; i < stringLength; ++ i )
{
// blabla
}
As of today (January 2018), with gcc 7.3 and clang 5.0, let us compile:
#include <string.h>
void bar(char c);
void foo(const char* __restrict__ ss)
{
for (int i = 0; i < strlen(ss); ++i)
{
bar(*ss);
}
}
So, we have:
ss is a pointer to const data.
ss is marked __restrict__.
The loop body cannot touch the memory pointed to by ss (well, unless it violates the __restrict__).
And still, both compilers execute strlen() on every single iteration of that loop. Amazing.
This also means the allusions/wishful thinking of @Praetorian and @JaredPar don't pan out.
strlen every time because there is no way to know if bar can modify memory pointed to by ss. Make it a static function in the same file and the compiler might change its mind – Holman
__restrict and that is sufficient. – Premiere
__restrict hint for the compiler, but there might be other reasons that prevent the optimization: strlen and/or bar could have side effects. If you put bar inline most likely output will change – Holman
__restrict is a promise those side effects don't involve accesses to the memory pointed to by ss. – Premiere
ss. There could be other side effects. Compilers could treat strlen as an intrinsic, might as well inline it, but in general it comes from somewhere else (eg call strlen in godbolt). It could be overridden at runtime (LD_PRELOAD or whatever it's called in unix world). If these issues resolved, then output most likely would be different. – Holman
restrict doesn't match your understanding :) Perhaps, even if there is a restrict on ss it still can be changed? Anyways, latest clang calls strlen once if bar is visible and doesn't take pointer to ss –
Holman
Yes, in simple words.
There is a small "no" in the rare case where the compiler, as an optimization step, finds that no changes are made to ss at all. But to be safe you should treat the answer as yes. In some situations, such as multithreaded and event-driven programs, your code may get buggy if you assume the answer is no.
Play safe, as doing so is not going to increase the program's complexity much.
Yes.
strlen() is calculated every time i increases, and the call is not optimized away.
The code below shows why the compiler should not optimize strlen() away.
for ( int i = 0; i < strlen(ss); ++i )
{
// Change ss string.
ss[i] = 'a'; // Compiler should not optimize strlen().
}
strlen
. –
Validate
We can easily test it:
char nums[] = "0123456789";
size_t end;
int i;
for( i=0, end=strlen(nums); i<strlen(nums); i++ ) {
putchar( nums[i] );
nums[--end] = 0;
}
The loop condition is evaluated after each iteration, before the loop body runs again.
Also be careful about the type you use to hold string lengths. It should be size_t, an unsigned integer type defined in <stddef.h> (and also made available by <string.h> and <stdio.h>); it is not necessarily unsigned int. Comparing it with an int, or casting it to int, might cause serious vulnerabilities.
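The pitfall with mixing int and size_t can be sketched as follows. The helper names are hypothetical, and the surprising result assumes a typical platform where size_t is at least as wide as int, so that the int operand is converted to size_t before the comparison.

```c
#include <stddef.h>
#include <string.h>

/* Compares a signed index with the string length the naive way.
   The int is converted to size_t, so -1 becomes a huge positive value. */
int naive_less_than_length(int i, const char *s)
{
    return i < strlen(s);
}

/* Handles negative values explicitly before comparing as size_t. */
int checked_less_than_length(int i, const char *s)
{
    return i < 0 || (size_t)i < strlen(s);
}
```

With i equal to -1 and the string "abc", the naive version reports that -1 is not less than 3, while the checked version gives the mathematically expected answer. Compilers can usually warn about the naive comparison (for example via GCC's -Wsign-compare).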
Well, I noticed that someone is saying that it is optimized by default by any "clever" modern compiler. For comparison, look at the results without optimization. I tried:
Minimal C code:
#include <stdio.h>
#include <string.h>
int main()
{
const char *s = "aaaa";
for (int i=0; i<strlen(s);i++)
printf ("a");
return 0;
}
My compiler: g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
Command for generation of assembly code: g++ -S -masm=intel test.cpp
Resulting assembly code:
...
.L3:
mov DWORD PTR [esp], 97
call putchar
add DWORD PTR [esp+40], 1
.L2:
# the loop condition starts here
mov ebx, DWORD PTR [esp+40]
mov eax, DWORD PTR [esp+44]
mov DWORD PTR [esp+28], -1
mov edx, eax
mov eax, 0
mov ecx, DWORD PTR [esp+28]
mov edi, edx
repnz scasb
# as you can see, the string scan is done on every iteration
mov eax, ecx
not eax
sub eax, 1
cmp ebx, eax
setb al
test al, al
jne .L3
mov eax, 0
.....
restrict-qualified. While there are some cases where such optimization would be legitimate, the effort required to reliably identify such cases in the absence of restrict would, by any reasonable measure, almost certainly exceed the benefit. If the string's address had a const restrict qualifier, however, that would be sufficient in and of itself to justify the optimization without having to look at anything else. –
Mohan
Elaborating on Prætorian's answer I recommend the following:
for( auto i = strlen(s); i > 0; --i ) { foo(s[i-1]); }
auto because you don't want to care about which type strlen returns. A C++11 compiler (e.g. gcc -std=c++0x, not completely C++11 but auto types work) will do that for you.
i = strlen(s) because you want to compare to 0 (see below).
i > 0 because comparison to 0 is (slightly) faster than comparison to any other number.
The disadvantage is that you have to use i-1 in order to access the string characters.
ss inside the loop. – Haematinic
ss is never modified, it can hoist the computation out of the loop. – Kaufmann
strlen does, and would only work if it could prove that the pointer couldn't be aliased. In practice, I'd be rather surprised to see that optimisation. – Enquire
strlen is tagged as __attribute_pure__ identifying that whatever the implementation is, the side effects of the function are only the returned value and that value depends only on the arguments (and possibly globals). By analyzing the loop the compiler can then infer that the function does not need to be called multiple times. – Solomon
strlen can depend on mutable globals then the optimizer is stuck. – Corvette
const that is pure with no dependency on globals, but the strlen is just tagged as pure -- no idea why, as it seems that const would be applicable here (i.e. I don't see any need for strlen to check globals anywhere!) – Solomon
strlen is not const is because the contents of the string is considered a global. If it were const, then it would only be able to examine the string pointer, and not the memory it points to. The const attribute is for functions like sqrt. – Rinee
strlen("Hello"), you'll often see the number 5 hard coded, with no call to strlen. This goes beyond simple inlining. – Rinee
strlen. However, what I said is still true: there do exist compilers (such as GCC and Clang) which optimize out calls to strlen based on knowledge of exactly what it does. If your function does something different, you have to pass extra flags to the compiler. See gcc.gnu.org/onlinedocs/gcc-4.7.1/gcc/Other-Builtins.html – Rinee