One big factor is that it makes loop logic harder: Imagine you want to iterate over all but the last element of an array (which does happen in the real world). So you write your function:
void fun (const std::vector<int> &vec) {
    for (std::size_t i = 0; i < vec.size() - 1; ++i)
        do_something(vec[i]);
}
Looks good, doesn't it? It even compiles cleanly with very high warning levels! So you put this in your code, all tests run smoothly, and you forget about it.
Later on, somebody comes along and passes an empty vector to your function. With a signed integer, you hopefully would have noticed the sign-compare compiler warning, introduced the appropriate cast, and not published the buggy code in the first place.
But in your implementation with the unsigned integer, vec.size() - 1 wraps around for the empty vector and the loop condition becomes i < SIZE_MAX. Disaster: the loop indexes far past the end of the vector, which is UB and most likely a crash!
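To make the wraparound concrete, here is a minimal sketch. The names loop_bound, visits_signed and visits_unsigned are hypothetical and not part of the code above, and std::ptrdiff_t is only one possible choice of signed type. The sketch evaluates the loop bound on its own and shows two ways the loop could be written so that an empty vector is handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Unsigned arithmetic wraps modulo 2^N, so 0 - 1 is the largest size_t value.
static_assert(static_cast<std::size_t>(0) - 1 == SIZE_MAX,
              "0 - 1 wraps to SIZE_MAX for std::size_t");

// Hypothetical helper: the bound of the loop above, evaluated on its own.
std::size_t loop_bound(const std::vector<int> &vec) {
    return vec.size() - 1;  // SIZE_MAX when vec is empty
}

// Signed variant (assumed form of "the appropriate cast"): convert once,
// then subtract in a signed type. Returns how many elements are visited.
std::size_t visits_signed(const std::vector<int> &vec) {
    const auto n = static_cast<std::ptrdiff_t>(vec.size());
    std::size_t visited = 0;
    for (std::ptrdiff_t i = 0; i < n - 1; ++i)
        ++visited;
    return visited;
}

// Unsigned variant: add on the left instead of subtracting on the right,
// so nothing can wrap below zero.
std::size_t visits_unsigned(const std::vector<int> &vec) {
    std::size_t visited = 0;
    for (std::size_t i = 0; i + 1 < vec.size(); ++i)
        ++visited;
    return visited;
}
```

For an empty vector, loop_bound returns SIZE_MAX, while both rewritten loops visit zero elements. For a vector of three elements, both visit two.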
I want to know how they lead to security bugs?
This is also a security problem; in particular, it is a buffer overflow. One way to exploit it would be if do_something does something that the attacker can observe. They might then be able to work out what input went into do_something, and that way data the attacker should not be able to access would be leaked from your memory. This would be a scenario similar to the Heartbleed bug. (Thanks to ratchet freak for pointing that out in a comment.)
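The out-of-bounds access can be observed without triggering undefined behaviour by swapping operator[] for the bounds-checked std::vector::at, which throws std::out_of_range. The following is only an illustrative sketch. The function name is hypothetical, and the loop keeps the original, buggy condition on purpose.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical probe: runs the original loop condition, but reads through
// at() so that an out-of-range index throws instead of reading stray memory.
// Returns true if the loop tried to read outside the vector.
bool original_loop_reads_out_of_bounds(const std::vector<int> &vec) {
    try {
        for (std::size_t i = 0; i < vec.size() - 1; ++i)
            (void)vec.at(i);  // bounds-checked access
    } catch (const std::out_of_range &) {
        return true;
    }
    return false;
}
```

With an empty vector the very first access, at index 0, is already out of range, so the probe returns true. With one or more elements it returns false.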
int is 32 bits, calculations on int that overflow must yield values which are congruent to the correct results mod 2³², but were not required to behave as values within... – Riot
int32_t x=INT32_MAX; x++; int64_t y1=x,y2=x; a compiler would not be required to assign the same value to y1 and y2, but casting y1 and y2 both to uint32_t would be required to give the same value, i.e. (INT32_MAX + 1u). I would expect explicit checked-integer semantics could allow some very useful optimizations if the compiler were allowed to hold correct calculations beyond specified precision, and only had to trap when precision was lost. Given icheck32_t w,x,y,z;, the expression w=x+y+z; ... – Riot
x+y and (x+y)+z were representable in icheck32_t, but the compiler would be free to trap or not at its leisure if x+y was not representable but x+y+z was. – Riot