In my experiments, this expression
double d = strtod("3ex", &end);
initializes d with 3.0 and places the end pointer at the 'e' character in the input string. This is exactly how I would expect it to behave. The 'e' character might look like the beginning of the exponent part, but since the actual exponent value (required by 6.4.4.2) is missing, that 'e' should be treated as a completely independent character.
However, when I do
double d;
char c;
sscanf("3ex", "%lf%c", &d, &c);
I notice that sscanf consumes both '3' and 'e' for the %lf format specifier. Variable d receives the value 3.0. Variable c ends up with 'x' in it. This looks strange to me for two reasons.
Firstly, since the language specification refers to strtod when describing the behavior of the %f format specifier, I intuitively expected %lf to treat the input the same way strtod does (i.e. choose the same position as the termination point). However, I know that historically scanf was supposed to return no more than one character back to the input stream. That limits the distance of any look-ahead scanf can perform to one character, and the example above requires at least two characters of look-ahead. So, let's say I accept the fact that %lf consumed both '3' and 'e' from the input stream.
But then we run into the second issue. Now sscanf has to convert that "3e" to type double. "3e" is not a valid representation of a floating-point constant (again, according to 6.4.4.2 the exponent value is not optional). I would expect sscanf to treat this input as erroneous: terminate during the %lf conversion, return 0 and leave d and c unchanged. However, the above sscanf completes successfully (returning 2).
This behavior is consistent between the GCC and MSVC implementations of the standard library.
So, my question is: where exactly in the C language standard does it allow sscanf to behave as described above, with respect to the two points above: consuming more than strtod does and successfully converting such sequences as "3e"?
By looking at my experiment results I can probably "reverse engineer" sscanf's behavior: consume as much as "looks right", never stepping back, and then just pass the consumed sequence to strtod. That way the 'e' gets consumed by %lf and then just ignored by strtod. But where exactly is all that in the language specification?
sscanf is in stdio and strtod is in stdlib. – Sizeable
c should attain the value 'e' and not the value 'x'. Or perhaps it should not attain any value at all, and function sscanf should return 1 instead of 2 (so it accurately emulates the behavior of strtod). – Sizeable
sscanf format requirements and behavior to be in sync with strto... format requirements and behavior. The language standard actually states that, but apparently I saw more in it than there really was. For example, I expected sscanf to stop at exactly the same point where strto... would stop. Now I kinda "see" that the standard probably does not require that and allows sscanf to consume more. – Pontone
sscanf decided to "consume more" and the result of that consumption does not match the syntax requirements. – Pontone
sscanf and strtod should exhibit similar (or equivalent) behaviour. strto.
*scanf() needs to scan left to right. But strtod() may "look ahead" and decide where to put endptr. – Lenard
f format specifier by simply referring to strtod. If there's a difference between the f specifier and strtod, the standard should describe it somewhere. My question is: where? Which specific wording? – Pontone
...scanf() is defined to take the longest possible sequence that is, or is a prefix of, a matching input, while strto...() takes the longest valid sequence. (The difference being a result of streams supporting only one character of guaranteed put-back, i.e. ...scanf() cannot step back as much as strto...() can.) – Cracy