The specification for strtol conceptually divides the input string into "initial whitespace", a "subject sequence", and a "final string", and defines the "subject sequence" as:
the longest initial subsequence of the input string, starting with the first non-white-space character that is of the expected form. The subject sequence shall contain no characters if the input string is empty or consists entirely of white-space characters, or if the first non-white-space character is other than a sign or a permissible letter or digit.
At one time I thought the "longest initial subsequence" business was akin to the way scanf works, where "0x@" would scan as "0x", a failed match, followed by "@" as the next unread character. However, after some discussion, I'm mostly convinced that strtol processes the longest initial subsequence that is of the expected form, not the longest initial string which is the initial subsequence of some possible string of the expected form.
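To make the "0x@" case concrete, here is a minimal sketch of how the second reading can be checked. The helper name consumed_by_strtol is hypothetical (it is not part of any library); it only wraps strtol and reports how far endptr advanced. Under the reading I'm now mostly convinced of, the subject sequence of "0x@" is just "0", so one character is consumed and the final string begins at "x".

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper for illustration: stores the converted value in
   *value and returns the number of characters strtol consumed, i.e. the
   offset at which the "final string" begins. */
static size_t consumed_by_strtol(const char *s, int base, long *value)
{
    char *end;
    *value = strtol(s, &end, base);
    return (size_t)(end - s);
}
```

On an implementation that follows this reading, consumed_by_strtol("0x@", 16, &v) should return 1 with v set to 0, whereas a scanf-like reading would have consumed the "0x" prefix before failing. I would not rely on this for every C library without testing it.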
What's still confusing me is this language in the specification:
If the subject sequence is empty or does not have the expected form, no conversion is performed; the value of str is stored in the object pointed to by endptr, provided that endptr is not a null pointer.
If we accept what seems to be the correct definition of "subject sequence", there is no such thing as a non-empty subject sequence that does not have the expected form, and instead (to avoid redundancy and confusion) the text should just read:
If the subject sequence is empty, no conversion is performed; the value of str is stored in the object pointed to by endptr, provided that endptr is not a null pointer.
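For comparison, the "no conversion" outcome described in that sentence can be observed directly. This is only a sketch, and no_conversion is a hypothetical helper name: it reports whether strtol stored the original str through endptr, which is the only externally visible sign that the subject sequence was treated as empty (or as not having the expected form).

```c
#include <stdlib.h>

/* Hypothetical helper for illustration: returns 1 if strtol performed no
   conversion on s, i.e. it stored s itself in the object pointed to by
   endptr, and 0 otherwise. */
static int no_conversion(const char *s, int base)
{
    char *end;
    (void)strtol(s, &end, base);
    return end == s;
}
```

Inputs such as "", "   ", and "  +@" should all yield 1 here, while "  42" should yield 0. Note that a caller cannot tell from the result alone whether an implementation regarded the subject sequence as empty or as malformed, which is part of why the extra wording looks redundant to me.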
Can anyone clarify these issues for me? Links to past discussions or to any relevant defect reports would be useful.