What's the C++ way of parsing a string (given as char *) into an int? Robust and clear error handling is a plus (instead of returning zero).
In the new C++11 there are functions for that: stoi, stol, stoll, stoul and so on.
int myNr = std::stoi(myString);
It will throw an exception on conversion error.
Even these new functions still have the same issue as noted by Dan: they will happily convert the string "11x" to integer "11".
See more: http://en.cppreference.com/w/cpp/string/basic_string/stol
If you pass the optional second argument, a pointer to a size_t (called pos), and the value it receives isn't equal to the length of the string, then it stopped early. It'll still return 11 in that case, but pos will be 2 instead of the string length 3. coliru.stacked-crooked.com/a/cabe25d64d2ffa29 – Gamp
What not to do
Here is my first piece of advice: do not use stringstream for this. While at first it may seem simple to use, you'll find that you have to do a lot of extra work if you want robustness and good error handling.
Here is an approach that intuitively seems like it should work:
bool str2int (int &i, char const *s)
{
    std::stringstream ss(s);
    ss >> i;
    if (ss.fail()) {
        // not an integer
        return false;
    }
    return true;
}
This has a major problem: str2int(i, "1337h4x0r") will happily return true and i will get the value 1337. We can work around this problem by ensuring there are no more characters in the stringstream after the conversion:
bool str2int (int &i, char const *s)
{
    char c;
    std::stringstream ss(s);
    ss >> i;
    if (ss.fail() || ss.get(c)) {
        // not an integer
        return false;
    }
    return true;
}
We fixed one problem, but there are still a couple of other problems.
What if the number in the string is not base 10? We can try to accommodate other bases by setting the stream to the correct mode (e.g. ss << std::hex) before trying the conversion. But this means the caller must know a priori what base the number is -- and how can the caller possibly know that? The caller doesn't know what the number is yet. They don't even know that it is a number! How can they be expected to know what base it is? We could just mandate that all numbers input to our programs must be base 10 and reject hexadecimal or octal input as invalid. But that is not very flexible or robust. There is no simple solution to this problem. You can't simply try the conversion once for each base, because the decimal conversion will always succeed for octal numbers (with a leading zero) and the octal conversion may succeed for some decimal numbers. So now you have to check for a leading zero. But wait! Hexadecimal numbers can start with a leading zero too (0x...). Sigh.
Even if you succeed in dealing with the above problems, there is still another bigger problem: what if the caller needs to distinguish between bad input (e.g. "123foo") and a number that is out of the range of int (e.g. "4000000000" for 32-bit int)? With stringstream, there is no way to make this distinction. We only know whether the conversion succeeded or failed. If it fails, we have no way of knowing why it failed. As you can see, stringstream leaves much to be desired if you want robustness and clear error handling.
This leads me to my second piece of advice: do not use Boost's lexical_cast for this. Consider what the lexical_cast documentation has to say:
Where a higher degree of control is required over conversions, std::stringstream and std::wstringstream offer a more appropriate path. Where non-stream-based conversions are required, lexical_cast is the wrong tool for the job and is not special-cased for such scenarios.
What?? We've already seen that stringstream has a poor level of control, and yet it says stringstream should be used instead of lexical_cast if you need "a higher level of control". Also, because lexical_cast is just a wrapper around stringstream, it suffers from the same problems that stringstream does: poor support for multiple number bases and poor error handling.
The best solution
Fortunately, somebody has already solved all of the above problems. The C standard library contains strtol and family which have none of these problems.
#include <cerrno>
#include <climits>
#include <cstdlib>
enum STR2INT_ERROR { SUCCESS, OVERFLOW, UNDERFLOW, INCONVERTIBLE };
STR2INT_ERROR str2int (int &i, char const *s, int base = 0)
{
    char *end;
    long l;
    errno = 0;
    l = strtol(s, &end, base);
    if ((errno == ERANGE && l == LONG_MAX) || l > INT_MAX) {
        return OVERFLOW;
    }
    if ((errno == ERANGE && l == LONG_MIN) || l < INT_MIN) {
        return UNDERFLOW;
    }
    if (*s == '\0' || *end != '\0') {
        return INCONVERTIBLE;
    }
    i = l;
    return SUCCESS;
}
Pretty simple for something that handles all the error cases and also supports any number base from 2 to 36. If base is zero (the default) it will try to convert from any base. Or the caller can supply the third argument and specify that the conversion should only be attempted for a particular base. It is robust and handles all errors with a minimal amount of effort.
Other reasons to prefer strtol (and family):
- It exhibits much better runtime performance
- It introduces less compile-time overhead (the others pull in nearly 20 times more SLOC from headers)
- It results in the smallest code size
There is absolutely no good reason to use any other method.
[...] strtol to be thread-safe. POSIX also requires errno to use thread-local storage. Even on non-POSIX systems, nearly all implementations of errno on multithreaded systems use thread-local storage. The latest C++ standard requires errno to be POSIX compliant. The latest C standard also requires errno to have thread-local storage. Even on Windows, which is definitely not POSIX compliant, errno is thread-safe and, by extension, so is strtol. – Bleareyed
OVERFLOW and UNDERFLOW are used for macros by gcc (and therefore g++) for compatibility with System V. To disable extensions to the standard such as this one, pass -ansi to g++ at the command-line or makefile. sourceware.org/bugzilla/show_bug.cgi?id=5407 gcc.gnu.org/onlinedocs/gcc-4.7.1/gcc/C-Dialect-Options.html – Elledge
[...] llvm-gcc 4.2) and had to define _POSIX_C_SOURCE to disable the extension. This is likely not necessary on other systems. – Elledge
[...] std::stol for this, which will appropriately throw exceptions rather than returning constants. – Skyeskyhigh
[...] std::stol was even added to the C++ language. That said, I don't think it's fair to say that this is "C coding within C++". It's silly to say that std::strtol is C coding when it is explicitly part of the C++ language. My answer applied perfectly to C++ when it was written and it still does apply even with the new std::stol. Calling functions that may throw exceptions isn't always the best for every programming situation. – Bleareyed
[...] std::stol. – Rumania
[...] strtol() wrapper is very similar to BSD's strtonum(). – Hudnall
[...] STR2INT_ERROR enumerators (i.e. enum STR2INT_ERROR { S2I_SUCCESS, S2I_OVERFLOW, S2I_UNDERFLOW, S2I_INCONVERTIBLE };)? Would this not be a simpler solution to the conflict? – Trebizond
[...] errno for error discrimination, changing code to if ((errno == ERANGE && l == LONG_MAX) || l > INT_MAX) { errno = ERANGE; return OVERFLOW; } would be a small amount of extra code to accommodate that. – Waste
[...] strtol. I'm just confused by the str2int code sample given. Is this supposed to show the implementation of strtol? If so, then why is it named str2int? – Globulin
[...] strtol will allow arbitrary amounts of leading whitespace, but no trailing whitespace. I think we should probably disallow leading whitespace (add isspace(*s) to the inconvertible case) unless we consciously see a good reason for that not being handled at a higher level, and I think if we allow leading whitespace we should also allow trailing whitespace (do for(; isspace(*end); end += 1); after the call to strtol) unless we see a good reason for that inconsistency. – Of
[...] strtol is that it treats leading zero as signalling octal when base is zero. By now humanity has enough experience with it to know that zero implicitly switching interpretation from decimal to octal causes far more bad than good. Some things are naturally octal (umask for example) but notably those things you never want to interpret as decimal. So I'd also do if(base == 0) for(; s[0] == '0' && s[1] != '\0'; s += 1); before the call to strtol. – Of
[...] int saved_errno = errno; before calling strtol and errno = saved_errno; in the success return path. This would make the behavior more consistent with the standard library functions, which might clobber a previous errno but never clear it back to zero. – Of
This is a safer C way than atoi()
const char* str = "123";
int i;
/* sscanf returns the number of items converted: 0 if no integer was found, EOF on empty input */
if (sscanf(str, "%d", &i) != 1)
{
    /* error */
}
C++ with standard library stringstream: (thanks CMS)
int str2int (const string &str) {
    stringstream ss(str);
    int num;
    if((ss >> num).fail())
    {
        //ERROR
    }
    return num;
}
With boost library: (thanks jk)
#include <boost/lexical_cast.hpp>
#include <string>
try
{
    std::string str = "123";
    int number = boost::lexical_cast< int >( str );
}
catch( const boost::bad_lexical_cast & )
{
    // Error
}
Edit: Fixed the stringstream version so that it handles errors. (thanks to CMS's and jk's comment on original post)
The good old C way still works. I recommend strtol or strtoul. Between the return status and the 'endPtr', you can give good diagnostic output. It also handles multiple bases nicely.
[...] std::strtol's case, you have no way of knowing if you've successfully parsed a 0 or if the function failed unless you manually check if the string resolves to 0, and by the time you've done that you're unnecessarily repeating work. The more modern approach (std::from_chars) not only tells you when the function fails, but why it failed as well, which helps provide feedback to the end user. – Subserve
[...] (char)0 with checking for a string that is "0". – Piliform
[...] '\0' with '0'. Saying "resolves to" was just the most succinct thing I could think to say at the time without going over the character limit. What I meant is that it's impossible to know if a 0 returned by std::strtol is indicating an invalid parse or a valid parse with a value of 0, and that to be certain you'd have to do extra work, some of which would repeat work already performed by std::strtol. By "resolves to 0" I mean "validly parses as 0" as opposed to "does not parse". – Subserve
[...] endp. If it is equal to the pointer passed in, you got zero because parsing failed. Otherwise the zero was a successful parsing result. If you want to distinguish parsing the entire string from parsing a prefix ("0zork") then check whether *endp is a NUL character. At no point do you have to look at any of the digits inside the input string. – Piliform
[...] str_end was a pointer to a pointer and thus designed to be an 'out' parameter, I thought str and str_end were intended to act more like the begin and end iterators commonly used in <algorithm> functions. (Incidentally I find it strange that it's a char * * instead of a const char * * considering str is a const char *. That suggests the function may internally be casting away the const.) At any rate, std::from_chars is still more useful because its error reporting approach is more sane. – Subserve
You can use Boost's lexical_cast, which wraps this in a more generic interface.
lexical_cast<Target>(Source) throws bad_lexical_cast on failure.
You can use a stringstream from the C++ standard library:
stringstream ss(str);
int x;
ss >> x;
if(ss) { // <-- error handling
    // use x
} else {
    // not a number
}
The stream state will be set to fail if a non-digit is encountered when trying to read an integer.
See Stream pitfalls for pitfalls of error handling and streams in C++.
From C++17 onwards you can use std::from_chars from the <charconv> header as documented here.
For example:
#include <charconv>
#include <cstring>
#include <iostream>
#include <system_error>
int main()
{
    char const * str = "42";
    int value = 0;
    // from_chars takes a [first, last) character range, not a null-terminated string
    std::from_chars_result result = std::from_chars(str, str + std::strlen(str), value);
    if(result.ec == std::errc::invalid_argument)
    {
        std::cout << "Error, invalid format";
    }
    else if(result.ec == std::errc::result_out_of_range)
    {
        std::cout << "Error, value too big for int range";
    }
    else
    {
        std::cout << "Success: " << value;
    }
}
As a bonus, it can also handle other bases, like hexadecimal.
You can use a stringstream:
int str2int (const string &str) {
stringstream ss(str);
int num;
ss >> num;
return num;
}
I think these three links sum it up:
- http://tinodidriksen.com/2010/02/07/cpp-convert-int-to-string-speed/
- http://tinodidriksen.com/2010/02/16/cpp-convert-string-to-int-speed/
- http://www.fastformat.org/performance.html
The stringstream and lexical_cast solutions perform about the same, because lexical_cast uses stringstream internally.
Some specializations of lexical_cast use a different approach; see http://www.boost.org/doc/libs/release/boost/lexical_cast.hpp for details. Integers and floats are now specialized for integer-to-string conversion.
One can specialize lexical_cast for one's own needs and make it fast. This would be the ultimate solution satisfying all parties, clean and simple.
The articles mentioned above compare different methods of converting integers <-> strings. The following approaches make sense: the old C way, spirit.karma, fastformat, and a simple naive loop.
lexical_cast is OK in some cases, e.g. for int-to-string conversion.
Converting string to int using lexical_cast is not a good idea, as it is 10-40 times slower than atoi depending on the platform/compiler used.
Boost.Spirit.Karma seems to be the fastest library for converting integer to string.
ex.: generate(ptr_char, int_, integer_number);
The basic simple loop from the article mentioned above is the fastest way to convert string to int. It is obviously not the safest one; strtol() seems like a safer solution.
int naive_char_2_int(const char *p) {
int x = 0;
bool neg = false;
if (*p == '-') {
neg = true;
++p;
}
while (*p >= '0' && *p <= '9') {
x = (x*10) + (*p - '0');
++p;
}
if (neg) {
x = -x;
}
return x;
}
The C++ String Toolkit Library (StrTk) has the following solution:
static const std::size_t digit_table_symbol_count = 256;
static const unsigned char digit_table[digit_table_symbol_count] = {
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x00 - 0x07
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x08 - 0x0F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x10 - 0x17
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x18 - 0x1F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x20 - 0x27
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x28 - 0x2F
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, // 0x30 - 0x37
0x08, 0x09, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x38 - 0x3F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x40 - 0x47
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x48 - 0x4F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x50 - 0x57
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x58 - 0x5F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x60 - 0x67
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x68 - 0x6F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x70 - 0x77
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x78 - 0x7F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x80 - 0x87
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x88 - 0x8F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x90 - 0x97
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0x98 - 0x9F
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0xA0 - 0xA7
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0xA8 - 0xAF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0xB0 - 0xB7
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0xB8 - 0xBF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0xC0 - 0xC7
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0xC8 - 0xCF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0xD0 - 0xD7
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0xD8 - 0xDF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0xE0 - 0xE7
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0xE8 - 0xEF
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // 0xF0 - 0xF7
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF // 0xF8 - 0xFF
};
template<typename InputIterator, typename T>
inline bool string_to_signed_type_converter_impl_itr(InputIterator begin, InputIterator end, T& v)
{
if (0 == std::distance(begin,end))
return false;
v = 0;
InputIterator it = begin;
bool negative = false;
if ('+' == *it)
++it;
else if ('-' == *it)
{
++it;
negative = true;
}
if (end == it)
return false;
while(end != it)
{
const T digit = static_cast<T>(digit_table[static_cast<unsigned int>(*it++)]);
if (0xFF == digit)
return false;
v = (10 * v) + digit;
}
if (negative)
v *= -1;
return true;
}
The InputIterator can be of either unsigned char*, char* or std::string iterators, and T is expected to be a signed int, such as signed int, int, or long
[...] v = (10 * v) + digit; overflows needlessly with string input with the text value of INT_MIN. Table is of questionable value vs simply digit >= '0' && digit <= '9' – Waste
If you have C++11, the appropriate solutions nowadays are the C++ integer conversion functions in <string>: stoi, stol, stoul, stoll, stoull. They throw appropriate exceptions when given incorrect input and use the fast and small strto* functions under the hood.
If you are stuck with an earlier revision of C++, it would be forward-portable of you to mimic these functions in your implementation.
I like Dan Moulding's answer, I'll just add a bit of C++ style to it:
#include <cstdlib>
#include <cerrno>
#include <climits>
#include <stdexcept>
#include <string>
int to_int(const std::string &s, int base = 0)
{
    char *end;
    errno = 0;
    long result = std::strtol(s.c_str(), &end, base);
    if (errno == ERANGE || result > INT_MAX || result < INT_MIN)
        throw std::out_of_range("toint: string is out of range");
    if (s.length() == 0 || *end != '\0')
        throw std::invalid_argument("toint: invalid string");
    return result;
}
It works for both std::string and const char* through the implicit conversion. It's also useful for base conversion, e.g. to_int("0x7b"), to_int("0173"), to_int("01111011", 2), to_int("0000007B", 16), to_int("11120", 3) and to_int("3L", 34) would all return 123.
Unlike std::stoi it works in pre-C++11. Also unlike std::stoi and a plain stringstream extraction, it throws exceptions for weird strings like "123hohoho".
NB: This function tolerates leading spaces but not trailing spaces, i.e. to_int(" 123") returns 123 while to_int("123 ") throws an exception. Make sure this is acceptable for your use case or adjust the code.
Such function could be part of STL...
I know three ways of converting a string into an int:
Use the stoi (string to int) function, use a stringstream, or convert the characters individually. The code is below:
1st Method
std::string s1 = "4533";
std::string s2 = "3.010101";
std::string s3 = "31337 with some string";
int myint1 = std::stoi(s1);
int myint2 = std::stoi(s2);
int myint3 = std::stoi(s3);
std::cout << s1 <<"=" << myint1 << '\n';
std::cout << s2 <<"=" << myint2 << '\n';
std::cout << s3 <<"=" << myint3 << '\n';
2nd Method
#include <string>
#include <sstream>
#include <iostream>
using namespace std;
int StringToInteger(string NumberAsString)
{
int NumberAsInteger;
stringstream ss;
ss << NumberAsString;
ss >> NumberAsInteger;
return NumberAsInteger;
}
int main()
{
string NumberAsString;
cin >> NumberAsString;
cout << StringToInteger(NumberAsString) << endl;
return 0;
}
3rd Method - but not for an individual conversion
std::string str4 = "453";
int i = 0, in=0; // 453 as on
for ( i = 0; i < str4.length(); i++)
{
in = str4[i];
cout <<in-48 ;
}
I like Dan's answer, especially because it avoids exceptions. For embedded systems development and other low-level system development, there may not be a proper exception framework available.
Added a check for white-space after a valid string...these three lines
while (isspace(*end)) {
end++;
}
Added a check for parsing errors too.
if ((errno != 0) || (s == end)) {
return INCONVERTIBLE;
}
Here is the complete function..
#include <cstdlib>
#include <cerrno>
#include <climits>
#include <cctype>
enum STR2INT_ERROR { SUCCESS, OVERFLOW, UNDERFLOW, INCONVERTIBLE };
STR2INT_ERROR str2long (long &l, char const *s, int base = 0)
{
    char *end = (char *)s;
    errno = 0;
    l = strtol(s, &end, base);
    if ((errno == ERANGE) && (l == LONG_MAX)) {
        return OVERFLOW;
    }
    if ((errno == ERANGE) && (l == LONG_MIN)) {
        return UNDERFLOW;
    }
    if ((errno != 0) || (s == end)) {
        return INCONVERTIBLE;
    }
    while (isspace((unsigned char)*end)) {
        end++;
    }
    if (*s == '\0' || *end != '\0') {
        return INCONVERTIBLE;
    }
    return SUCCESS;
}
[...] " ". strtol() is not specified to set errno when no conversion occurs. Better to use if (s == end) return INCONVERTIBLE; to detect no conversion. And then if (*s == '\0' || *end != '\0') can simplify to if (*end). 2) || l > LONG_MAX and || l < LONG_MIN serve no purpose - they are never true. – Waste
You could use this defined method.
#define toInt(x) {atoi(x.c_str())};
And if you were to convert from String to an Integer, you would just do the following.
int main()
{
string test = "46", test2 = "56";
int a = toInt(test);
int b = toInt(test2);
cout<<a+b<<endl;
}
The output would be 102.
[...] atoi doesn't seem like "the C++ way," in light of other answers like the accepted std::stoi(). – Rumania
I know this is an older question, but I've come across it so many times and, to date, have still not found a nicely templated solution having the following characteristics:
- Can convert any base (and detect base type)
- Will detect erroneous data (i.e. ensure the entire string, less leading/trailing whitespace, is consumed by the conversion)
- Will ensure that, regardless of the type converted to, the range of the string's value is acceptable.
So, here is mine, with a test strap. Because it uses the C functions strtoull/strtoll under the hood, it always converts first to the largest type available. Then, if you are not using the largest type, it will perform additional range checks to verify your type was not over(under)flowed. For this, it is a little less performant than if one properly chose strtol/strtoul. However, it also works for shorts/chars and, to the best of my knowledge, there exists no standard library function that does that, too.
Enjoy; hopefully someone finds it useful.
#include <cstdlib>
#include <cerrno>
#include <limits>
#include <stdexcept>
#include <sstream>
static const int DefaultBase = 10;
template<typename T>
static inline T CstrtoxllWrapper(const char *str, int base = DefaultBase)
{
while (isspace(*str)) str++; // remove leading spaces; verify there's data
if (*str == '\0') { throw std::invalid_argument("str; no data"); } // nothing to convert
// NOTE: for some reason strtoull allows a negative sign, we don't; if
// converting to an unsigned then it must always be positive!
if (!std::numeric_limits<T>::is_signed && *str == '-')
{ throw std::invalid_argument("str; negative"); }
// reset errno and call fn (either strtoll or strtoull)
errno = 0;
char *ePtr;
T tmp = std::numeric_limits<T>::is_signed ? strtoll(str, &ePtr, base)
: strtoull(str, &ePtr, base);
// check for any C errors -- note these are range errors on T, which may
// still be out of the range of the actual type we're using; the caller
// may need to perform additional range checks.
if (errno != 0)
{
if (errno == ERANGE) { throw std::range_error("str; out of range"); }
else if (errno == EINVAL) { throw std::invalid_argument("str; EINVAL"); }
else { throw std::invalid_argument("str; unknown errno"); }
}
// verify everything converted -- extraneous spaces are allowed
if (ePtr != NULL)
{
while (isspace(*ePtr)) ePtr++;
if (*ePtr != '\0') { throw std::invalid_argument("str; bad data"); }
}
return tmp;
}
template<typename T>
T StringToSigned(const char *str, int base = DefaultBase)
{
static const long long max = std::numeric_limits<T>::max();
static const long long min = std::numeric_limits<T>::min();
long long tmp = CstrtoxllWrapper<long long>(str, base); // use largest type
// final range check -- only needed if not long long type; a smart compiler
// should optimize this whole thing out
if (sizeof(T) == sizeof(tmp)) { return tmp; }
if (tmp < min || tmp > max)
{
std::ostringstream err;
err << "str; value " << tmp << " out of " << sizeof(T) * 8
<< "-bit signed range (";
if (sizeof(T) != 1) err << min << ".." << max;
else err << (int) min << ".." << (int) max; // don't print garbage chars
err << ")";
throw std::range_error(err.str());
}
return tmp;
}
template<typename T>
T StringToUnsigned(const char *str, int base = DefaultBase)
{
static const unsigned long long max = std::numeric_limits<T>::max();
unsigned long long tmp = CstrtoxllWrapper<unsigned long long>(str, base); // use largest type
// final range check -- only needed if not long long type; a smart compiler
// should optimize this whole thing out
if (sizeof(T) == sizeof(tmp)) { return tmp; }
if (tmp > max)
{
std::ostringstream err;
err << "str; value " << tmp << " out of " << sizeof(T) * 8
<< "-bit unsigned range (0..";
if (sizeof(T) != 1) err << max;
else err << (int) max; // don't print garbage chars
err << ")";
throw std::range_error(err.str());
}
return tmp;
}
template<typename T>
inline T
StringToDecimal(const char *str, int base = DefaultBase)
{
return std::numeric_limits<T>::is_signed ? StringToSigned<T>(str, base)
: StringToUnsigned<T>(str, base);
}
template<typename T>
inline T
StringToDecimal(T &out_convertedVal, const char *str, int base = DefaultBase)
{
return out_convertedVal = StringToDecimal<T>(str, base);
}
/*============================== [ Test Strap ] ==============================*/
#include <inttypes.h>
#include <iostream>
static bool _g_anyFailed = false;
template<typename T>
void TestIt(const char *tName,
const char *s, int base,
bool successExpected = false, T expectedValue = 0)
{
#define FAIL(s) { _g_anyFailed = true; std::cout << s; }
T x;
std::cout << "converting<" << tName << ">b:" << base << " [" << s << "]";
try
{
StringToDecimal<T>(x, s, base);
// get here on success only
if (!successExpected)
{
FAIL(" -- TEST FAILED; SUCCESS NOT EXPECTED!" << std::endl);
}
else
{
std::cout << " -> ";
if (sizeof(T) != 1) std::cout << x;
else std::cout << (int) x; // don't print garbage chars
if (x != expectedValue)
{
FAIL("; FAILED (expected value:" << expectedValue << ")!");
}
std::cout << std::endl;
}
}
catch (std::exception &e)
{
if (successExpected)
{
FAIL( " -- TEST FAILED; EXPECTED SUCCESS!"
<< " (got:" << e.what() << ")" << std::endl);
}
else
{
std::cout << "; expected exception encountered: [" << e.what() << "]" << std::endl;
}
}
}
#define TEST(t, s, ...) \
TestIt<t>(#t, s, __VA_ARGS__);
int main()
{
std::cout << "============ variable base tests ============" << std::endl;
TEST(int, "-0xF", 0, true, -0xF);
TEST(int, "+0xF", 0, true, 0xF);
TEST(int, "0xF", 0, true, 0xF);
TEST(int, "-010", 0, true, -010);
TEST(int, "+010", 0, true, 010);
TEST(int, "010", 0, true, 010);
TEST(int, "-10", 0, true, -10);
TEST(int, "+10", 0, true, 10);
TEST(int, "10", 0, true, 10);
std::cout << "============ base-10 tests ============" << std::endl;
TEST(int, "-010", 10, true, -10);
TEST(int, "+010", 10, true, 10);
TEST(int, "010", 10, true, 10);
TEST(int, "-10", 10, true, -10);
TEST(int, "+10", 10, true, 10);
TEST(int, "10", 10, true, 10);
TEST(int, "00010", 10, true, 10);
std::cout << "============ base-8 tests ============" << std::endl;
TEST(int, "777", 8, true, 0777);
TEST(int, "-0111 ", 8, true, -0111);
TEST(int, "+0010 ", 8, true, 010);
std::cout << "============ base-16 tests ============" << std::endl;
TEST(int, "DEAD", 16, true, 0xDEAD);
TEST(int, "-BEEF", 16, true, -0xBEEF);
TEST(int, "+C30", 16, true, 0xC30);
std::cout << "============ base-2 tests ============" << std::endl;
TEST(int, "-10011001", 2, true, -153);
TEST(int, "10011001", 2, true, 153);
std::cout << "============ irregular base tests ============" << std::endl;
TEST(int, "Z", 36, true, 35);
TEST(int, "ZZTOP", 36, true, 60457993);
TEST(int, "G", 17, true, 16);
TEST(int, "H", 17);
std::cout << "============ space delimited tests ============" << std::endl;
TEST(int, "1337 ", 10, true, 1337);
TEST(int, " FEAD", 16, true, 0xFEAD);
TEST(int, " 0711 ", 0, true, 0711);
std::cout << "============ bad data tests ============" << std::endl;
TEST(int, "FEAD", 10);
TEST(int, "1234 asdfklj", 10);
TEST(int, "-0xF", 10);
TEST(int, "+0xF", 10);
TEST(int, "0xF", 10);
TEST(int, "-F", 10);
TEST(int, "+F", 10);
TEST(int, "12.4", 10);
TEST(int, "ABG", 16);
TEST(int, "10011002", 2);
std::cout << "============ int8_t range tests ============" << std::endl;
TEST(int8_t, "7F", 16, true, std::numeric_limits<int8_t>::max());
TEST(int8_t, "80", 16);
TEST(int8_t, "-80", 16, true, std::numeric_limits<int8_t>::min());
TEST(int8_t, "-81", 16);
TEST(int8_t, "FF", 16);
TEST(int8_t, "100", 16);
std::cout << "============ uint8_t range tests ============" << std::endl;
TEST(uint8_t, "7F", 16, true, std::numeric_limits<int8_t>::max());
TEST(uint8_t, "80", 16, true, std::numeric_limits<int8_t>::max()+1);
TEST(uint8_t, "-80", 16);
TEST(uint8_t, "-81", 16);
TEST(uint8_t, "FF", 16, true, std::numeric_limits<uint8_t>::max());
TEST(uint8_t, "100", 16);
std::cout << "============ int16_t range tests ============" << std::endl;
TEST(int16_t, "7FFF", 16, true, std::numeric_limits<int16_t>::max());
TEST(int16_t, "8000", 16);
TEST(int16_t, "-8000", 16, true, std::numeric_limits<int16_t>::min());
TEST(int16_t, "-8001", 16);
TEST(int16_t, "FFFF", 16);
TEST(int16_t, "10000", 16);
std::cout << "============ uint16_t range tests ============" << std::endl;
TEST(uint16_t, "7FFF", 16, true, std::numeric_limits<int16_t>::max());
TEST(uint16_t, "8000", 16, true, std::numeric_limits<int16_t>::max()+1);
TEST(uint16_t, "-8000", 16);
TEST(uint16_t, "-8001", 16);
TEST(uint16_t, "FFFF", 16, true, std::numeric_limits<uint16_t>::max());
TEST(uint16_t, "10000", 16);
std::cout << "============ int32_t range tests ============" << std::endl;
TEST(int32_t, "7FFFFFFF", 16, true, std::numeric_limits<int32_t>::max());
TEST(int32_t, "80000000", 16);
TEST(int32_t, "-80000000", 16, true, std::numeric_limits<int32_t>::min());
TEST(int32_t, "-80000001", 16);
TEST(int32_t, "FFFFFFFF", 16);
TEST(int32_t, "100000000", 16);
std::cout << "============ uint32_t range tests ============" << std::endl;
TEST(uint32_t, "7FFFFFFF", 16, true, std::numeric_limits<int32_t>::max());
TEST(uint32_t, "80000000", 16, true, std::numeric_limits<int32_t>::max()+1);
TEST(uint32_t, "-80000000", 16);
TEST(uint32_t, "-80000001", 16);
TEST(uint32_t, "FFFFFFFF", 16, true, std::numeric_limits<uint32_t>::max());
TEST(uint32_t, "100000000", 16);
std::cout << "============ int64_t range tests ============" << std::endl;
TEST(int64_t, "7FFFFFFFFFFFFFFF", 16, true, std::numeric_limits<int64_t>::max());
TEST(int64_t, "8000000000000000", 16);
TEST(int64_t, "-8000000000000000", 16, true, std::numeric_limits<int64_t>::min());
TEST(int64_t, "-8000000000000001", 16);
TEST(int64_t, "FFFFFFFFFFFFFFFF", 16);
TEST(int64_t, "10000000000000000", 16);
std::cout << "============ uint64_t range tests ============" << std::endl;
TEST(uint64_t, "7FFFFFFFFFFFFFFF", 16, true, std::numeric_limits<int64_t>::max());
TEST(uint64_t, "8000000000000000", 16, true, std::numeric_limits<int64_t>::max()+1);
TEST(uint64_t, "-8000000000000000", 16);
TEST(uint64_t, "-8000000000000001", 16);
TEST(uint64_t, "FFFFFFFFFFFFFFFF", 16, true, std::numeric_limits<uint64_t>::max());
TEST(uint64_t, "10000000000000000", 16);
std::cout << std::endl << std::endl
<< (_g_anyFailed ? "!! SOME TESTS FAILED !!" : "ALL TESTS PASSED")
<< std::endl;
return _g_anyFailed;
}
StringToDecimal is the user-land method; it is overloaded so it can be called either like this:
int a; a = StringToDecimal<int>("100");
or this:
int a; StringToDecimal(a, "100");
I hate repeating the int type, so prefer the latter. This ensures that if the type of 'a' changes one does not get bad results. I wish the compiler could figure it out like:
int a; a = StringToDecimal("100");
...but, C++ does not deduce template return types, so that's the best I can get.
The implementation is pretty simple:
CstrtoxllWrapper wraps both strtoull and strtoll, calling whichever is necessary based on the template type's signed-ness and providing some additional guarantees (e.g. negative input is disallowed if unsigned and it ensures the entire string was converted).
CstrtoxllWrapper is used by StringToSigned and StringToUnsigned with the largest type (long long/unsigned long long) available to the compiler; this allows the maximal conversion to be performed. Then, if it is necessary, StringToSigned/StringToUnsigned performs the final range checks on the underlying type. Finally, the end-point method, StringToDecimal, decides which of the StringTo* template methods to call based on the underlying type's signed-ness.
I think most of the junk can be optimized out by the compiler; just about everything should be compile-time deterministic. Any commentary on this aspect would be interesting to me!
[...] long long instead of intmax_t? – Waste
[...] if (ePtr != str). Further, use isspace((unsigned char) *ePtr) to properly handle negative values of *ePtr. – Waste
I was looking for a function that parses a complex number from a string of the form 3+i4. Since I didn't find a solution that handles all the combinations, I decided to write a parser of my own using an overloaded >> operator on std::string.
struct complex
{
int a = 0, b = 0;
complex& operator+=(const complex& other)
{
a += other.a;
b += other.b;
return *this;
}
complex operator+(const complex& other)
{
complex cmp(other);
cmp += *this;
return cmp;
}
complex operator+(const complex& other) const
{
complex cmp(other);
cmp += *this;
return cmp;
}
friend inline std::ostream& operator<<(std::ostream& os, const complex& cmp);
friend inline std::istream& operator>>(std::istream& is, complex& cmp);
};
inline std::ostream& operator<<(std::ostream& os, const complex& cmp)
{
os << cmp.a << "+i" << cmp.b;
return os;
}
inline std::string operator>>(const std::string& stream, int& value)
{
int temp = 0;
bool b = false;
std::string::const_iterator it = stream.cbegin();
for (; it != stream.cend(); ++it)
{
if (std::isdigit(*it))
{
temp *= 10;
temp += static_cast<int>(*it - '0');
b = true;
}
else if (b)
{
++it;
break;
}
else
{
continue;
}
}
value = temp;
return stream.substr(it - stream.cbegin(), stream.cend() - it);
}
inline std::istream& operator>>(std::istream& is, complex& cmp)
{
std::string s;
std::getline(is >> std::ws, s);
s >> cmp.a >> cmp.b;
return is;
}
int main(int argc, char** argv)
{
std::cout << static_cast<int>('0') << std::endl;
complex x, y;
std::cin >> x;
std::cin >> y;
complex z = x + y;
std::cout << z << std::endl;
return 0;
}
In C, you can use int atoi (const char * str), which parses the C-string str interpreting its content as an integral number, which is returned as a value of type int.
[...] atoi in the question, I'm aware of it. The question is clearly not about C, but about C++. -1 – Rumania
© 2022 - 2024 McMap. All rights reserved.