how to convert hex string to unsigned 64bit (uint64_t) integer in a fast and safe way?

3

9

I tried

sscanf(str, "%016llX", &int64 );

but it does not seem safe. Is there a fast and safe way to do the conversion?

Thanks~

Unshod answered 9/11, 2010 at 9:49 Comment(7)
What do you mean by "not safe" ?Twinge
I don't know. I tried to use this to convert a large number of hex strings, and sometimes it would report a segmentation faultUnshod
Could you show the declaration and initialization of str?Jenifer
Either your header or your source is wrong... (see intro of my answer)Subsumption
I tried to add some custom functions to redis. I modified redis.c to add a function: pastebin.com/Dus3ip0P – Unshod
Sorry I corrected my typo in titleUnshod
See my answer. The "016" in combination with sscanf() doesn't make sense. Given that code snippet, strtol() or strtoll() (depending on platform) are the safer option. Check for success!Subsumption
L
13

Don't bother with functions in the scanf family. They're nearly impossible to use robustly. Here's a general safe use of strtoull:

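#include <errno.h>   /* errno */
#include <limits.h>  /* ULLONG_MAX */
#include <stdlib.h>  /* strtoull */

char *str, *end;            /* str must point to a NUL-terminated string */
unsigned long long result;
errno = 0;                  /* strtoull only sets errno on failure */
result = strtoull(str, &end, 16);
if (result == 0 && end == str) {
    /* str was not a number */
} else if (result == ULLONG_MAX && errno) {
    /* the value of str does not fit in unsigned long long */
} else if (*end) {
    /* str began with a number but has junk left over at the end */
}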

Note that strtoull accepts an optional 0x prefix on the string, as well as optional initial whitespace and a sign character (+ or -). If you want to reject these, you should perform a test before calling strtoull, for instance:

if (!isxdigit((unsigned char)str[0]) || (str[1] && !isxdigit((unsigned char)str[1])))

If you also wish to disallow overly long representations of numbers (leading zeros), you could check the following condition before calling strtoull:

if (str[0]=='0' && str[1])

One more thing to keep in mind is that "negative numbers" are not considered outside the range of conversion; instead, a prefix of - is treated the same as the unary negation operator in C applied to an unsigned value, so for example strtoull("-2", 0, 16) will return ULLONG_MAX-1 (without setting errno).
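
Putting the checks above together, a strict parser might look like the following. This is only a sketch: the name parse_hex_u64_strict and its bool/out-parameter interface are made up for illustration, and it assumes the input is a NUL-terminated string that must consist of hex digits only. Because unsigned long long may be wider than 64 bits, the result is compared against UINT64_MAX before it is stored in a uint64_t.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a string made only of hex digits into *out.
 * Returns true on success; on failure *out is left untouched. */
static bool parse_hex_u64_strict(const char *s, uint64_t *out)
{
    char *end;
    unsigned long long v;

    /* Reject NULL, empty input, leading whitespace and a sign:
     * the first character must itself be a hex digit. */
    if (s == NULL || !isxdigit((unsigned char)s[0]))
        return false;
    /* Reject a "0x"/"0X" prefix, which strtoull would otherwise skip. */
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
        return false;

    errno = 0;
    v = strtoull(s, &end, 16);
    if (end == s || *end != '\0')
        return false;   /* no digits converted, or trailing junk */
    if (v == ULLONG_MAX && errno == ERANGE)
        return false;   /* does not fit in unsigned long long */
    if (v > UINT64_MAX)
        return false;   /* only possible if unsigned long long is wider than 64 bits */

    *out = (uint64_t)v;
    return true;
}
```

With this sketch, "123456789abcdef0" is accepted, while "0x1f", "-2", " 1f", "1fg" and a 17-digit string are all rejected.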

Lungi answered 9/11, 2010 at 10:59 Comment(3)
Generally speaking I agree with your sentiments regarding the scanf() family. They do make sense in controlled environments, e.g. reading in files that your application has written itself.Subsumption
@DevSolar: Indeed, my comment there was intended as an answer to OP's question on safe usage, not general advice. The scanf family works fine for reading Linux /proc files, for example.Lungi
Wouldn't strtoull() return an 'unsigned long long' type, which isn't necessarily a 'uint64_t' type?Spock
A
2

Your title (at present) contradicts the code you provided. If you want to do what your title was originally (convert a string to an integer), then you can use this answer.


You could use the strtoull function, which unlike sscanf is a function specifically geared towards reading textual representations of numbers.

const char *test = "123456789abcdef0";

errno = 0;
unsigned long long result = strtoull(test, NULL, 16);

if (errno == EINVAL)
{
    // not a valid number
}
else if (errno == ERANGE)
{
    // does not fit in an unsigned long long
}
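
The C standard does not require strtoull to set errno to EINVAL when no conversion is performed, so a more portable variant checks the end pointer instead. The following is a rough sketch of that approach; the enum hex_status and the function hex_to_ull are hypothetical names, not part of any library.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical status codes for this sketch. */
enum hex_status { HEX_OK, HEX_NO_DIGITS, HEX_OUT_OF_RANGE };

/* Hypothetical helper: convert the hex number at the start of s.
 * Trailing characters after the number are not treated as an error here. */
static enum hex_status hex_to_ull(const char *s, unsigned long long *out)
{
    char *end;
    unsigned long long v;

    errno = 0;
    v = strtoull(s, &end, 16);
    if (end == s)
        return HEX_NO_DIGITS;       /* nothing was converted */
    if (v == ULLONG_MAX && errno == ERANGE)
        return HEX_OUT_OF_RANGE;    /* value too large */

    *out = v;
    return HEX_OK;
}
```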
Animated answered 9/11, 2010 at 10:0 Comment(4)
Under my understanding, he wants to convert from integer to string, not vice versa. ;)Jenifer
@Flinsch, the title changed, but his original code suggests he wants to convert from string to integer. I'm not sure you can see the question's edit history, but the title originally stated that he wanted to convert a string to an integer.Animated
Actually, scanf() et al. are defined in terms of strtol(). For halfway decent input, they are functionally identical as far as parsing numbers is concerned. I admit strtol() handles failure more gracefully.Subsumption
Since accessing errno might be mildly expensive, I prefer only checking the value of errno if result was ULLONG_MAX. And EINVAL is optional on conversion failure, so for robustness you need to use the end-pointer argument and check whether it's equal to the starting pointer you passed in to know if any conversion was performed.Lungi
S
0

At the time I wrote this answer, your title suggested you wanted to write a uint64_t into a string, while your code did the opposite (reading a hex string into a uint64_t). I answered "both ways":

The <inttypes.h> header has conversion macros to handle the ..._t types safely:

#include <stdio.h>
#include <inttypes.h>

sprintf( str, "%016" PRIx64, uint64 );

Or (if that is indeed what you're trying to do), the other way round:

#include <stdio.h>
#include <inttypes.h>

sscanf( str, "%" SCNx64, &uint64 );

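Note that you cannot enforce an exact width or zero padding with the scanf() function family: a width in a conversion specification is only a maximum field width. It parses what it gets, which can yield undesired results when the input does not adhere to the expected formatting. Also, <inttypes.h> only provides the lowercase SCNx64 macro for scanning; there is no uppercase SCNX64 (the conversion accepts both upper- and lowercase hex digits either way).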
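
If you do go the <inttypes.h> route, a round trip with some basic checking could look roughly like this. It is a sketch only: u64_to_hex and hex_to_u64_scanf are hypothetical helper names, snprintf is used in place of sprintf so the buffer size is respected, and %n is used to detect trailing junk. Note that sscanf still skips leading whitespace and accepts a sign or a 0x prefix, so this is less strict than a strtoull-based check.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format v as exactly 16 lowercase hex digits.
 * buf must have room for at least 17 bytes (16 digits plus NUL). */
static bool u64_to_hex(uint64_t v, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%016" PRIx64, v);
    return n == 16 && (size_t)n < size;
}

/* Hypothetical helper: read at most 16 hex digits from s with sscanf. */
static bool hex_to_u64_scanf(const char *s, uint64_t *out)
{
    uint64_t v;
    int consumed = 0;

    /* "%16" limits the field to 16 characters, so the value cannot
     * overflow 64 bits; "%n" records how many characters were read. */
    if (sscanf(s, "%16" SCNx64 "%n", &v, &consumed) != 1)
        return false;   /* no number found */
    if (s[consumed] != '\0')
        return false;   /* trailing characters after the number */

    *out = v;
    return true;
}
```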

Subsumption answered 9/11, 2010 at 10:2 Comment(6)
This would be a good answer, but you shouldn't use scanf or sscanf to read signed integers because there's a risk of signed integer overflow... and that's a legitimate issue.Kenogenesis
@autistic: So now you're looking through my profile to see if you can find other scanf-related answers of mine you can revenge-downvote? You're really making friends here. Note that there is no risk of signed integer overflow if you use scanf() on known-good input. As opposed to user input, which is what that other question (and comment) was about.Subsumption
Actually, as I said, that is a legitimate issue... you said it yourself, in response to a question where it's barely relevant... in this question, on the other hand, you shouldn't even use a signed integer... the fact that you mentioned signed integers here makes this answer invalid.Kenogenesis
@autistic: Note how the OP explicitly asserted that his int64 -- and thus mine as well -- is of type uint64_t, and thus unsigned.Subsumption
What is the title of the question? "how to convert hex string to unsigned 64bit (uint64_t) integer in a fast and safe way?" So why do you say "your title suggested you'd want to write an int64" and "reading ... into an int64_t"?Kenogenesis
I'm quoting your words, here... this is a problem... SCNx64 expects a uint64_t *, similar for PRIx64... That's a lowercase x, by the way. Another error to correct in this "would be good" answer.Kenogenesis
