How do I determine the number of digits of an integer in C?

21

93

for instance,

n = 3432, result 4

n = 45, result 2

n = 33215, result 5

n = -357, result 3

I guess I could just turn it into a string then get the length of the string but that seems convoluted and hack-y.

Raynard answered 1/7, 2009 at 12:21 Comment(11)

Getting the string length would fail in case of negative numbers. So get the length of the absolute value instead. ;-) – Machismo 1/7, 2009 at 12:50

char buff[100]; int r = sprintf(buff,"%s",n) - (r<0); – Barramunda 1/7, 2009 at 13:25

you mean decimal digits? decimal places are something that real numbers have, and integers don't, by definition. – Eternal 1/7, 2009 at 13:38

Uh ... Pax, is that a legal expression? Since r doesn't have a value before the assignment, the "(r < 0)" part seems scary. Or perhaps you meant that it should be done as a second step, so it's just the notation that I'm not getting (I'm reading it as if it were C). – Jahdiel 1/7, 2009 at 13:41

@Will, yeah I was going to say "return 0;" – Turnstile 1/7, 2009 at 13:42

You're right, @unwind, it should have been n, not r: char buff[100]; int r = sprintf(buff,"%s",n) - (n<0); – Barramunda 1/7, 2009 at 13:58

@Will, 'decimal digits' it is. – Raynard 1/7, 2009 at 14:2

Actually the real flaw is the %s instead of %d. – Vaca 2/7, 2009 at 2:12

Must ... remember ... to ... unit ... test! char buff[100]; int r = sprintf(buff,"%d",n) - (n<0); – Barramunda 2/7, 2009 at 4:33

+1 This has been a very fun and educational question ! – Lallans 2/7, 2009 at 15:51

Many years later, reading this question, I notice that no one asked why you need the number of digits in a number. The usual reason is to allocate space when serializing a number to characters. If that is the reason, then all the following answers are inefficient. :) – Transilient 5/8, 2020 at 18:2
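If the goal is the one raised in the last comment, sizing a buffer before serializing, standard C99 snprintf can report the required length directly. The following is a minimal sketch under that assumption; the function name int_to_string is hypothetical and not taken from the thread. Note that the reported length includes the minus sign for negative values, so the digit count alone would be len - (n < 0).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a freshly allocated decimal string for n,
   or NULL on failure. The caller owns the result and must free() it. */
char *int_to_string(int n) {
    /* With a NULL buffer and size 0, C99 snprintf writes nothing and
       returns the number of characters the output would need. */
    int len = snprintf(NULL, 0, "%d", n);
    if (len < 0) return NULL;

    char *buf = malloc((size_t)len + 1);   /* +1 for the terminating '\0' */
    if (buf == NULL) return NULL;

    snprintf(buf, (size_t)len + 1, "%d", n);
    return buf;
}
```

A caller would write something like char *s = int_to_string(-357); and later free(s);. This avoids both the fixed 100-byte buffer and any separate digit-counting step.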

119

floor (log10 (abs (x))) + 1

http://en.wikipedia.org/wiki/Logarithm

Hobble answered 1/7, 2009 at 12:24 Comment(20)

I'd say ceil(...) instead of floor(...)+1. And of course a check for zero, as said. – Cha 1/7, 2009 at 12:49

@Len: ceil won't work for powers of 10: ceil(log10 (100)) == 2 – Hobble 1/7, 2009 at 12:57

My ancient understanding of C-type math libraries was that log and similar functions were fairly expensive, so I would tend to favor simple division, a la Pax's solution. Anyone know/comment? – Consolidation 1/7, 2009 at 13:8

@John, it is substantially slower but it doesn't really make a difference until you get into the hundred million iteration range (where the difference is seven seconds for floating point and 1 second for worst case optimized if statements). See my update. – Barramunda 1/7, 2009 at 13:20

This would be needlessly slow. Don't use expensive functions such as log10() without a reason. The fast, integer function is simple enough to bother writing it. – Glazier 1/7, 2009 at 14:31

Also, it would fail on x==0 AND x==INT_MIN, as usually abs(INT_MIN)==INT_MIN which is negative :). Cast the result of abs() to unsigned int to make the code slightly slower and more correct :). – Glazier 1/7, 2009 at 15:0

Geez .. are you people still running an 8088? Who cares about few extra clock cycles. It took Paz 100,000,000 iterations to make a measurable difference, and even that was negligible! 6 seconds! Whoop-dee-do .. get on with your life. Next year it'll be 3 seconds. – Hobble 1/7, 2009 at 15:7

@eduffy: A millisecond here, a millisecond there... and suddenly, the user feels a noticeable delay after clicking a button. Seriously, those small inefficiencies add up. Why waste clock cycles, when you don't gain anything by it? – Glazier 1/7, 2009 at 15:30

@eduffy: if this were running on an embedded processor, there might not be any floating point support, let alone log functions, and the clock speed may only be in the tens of MHz - so an entirely integer-based solution would definitely be the preferred option. – Ence 1/7, 2009 at 16:12

In any case, we're mostly geeks here; surely faster is better, no matter what? ;-) – Ence 1/7, 2009 at 16:13

It turns out that although simple division is faster for small values, logarithm scales much better. If you call the division algorithms with every int from MIN_INT to MAX_INT (and repeat that the same 100m times as Paz's examples), you end up with an average of 13.337 seconds per call. Doing the same with Logarithm is an average of 8.143 seconds, the recursion takes 11.971 seconds, and the cascading If statements ends up taking an average of 0.953 seconds. So, the Daily-WTF-looking solution is an order of magnitude faster, but in the long run, this is in second place. – Traditor 1/7, 2009 at 18:55

Why not go with this if you think this is the clearest way (which in my opinion is), and if you really find it actually makes a difference you can easily switch to any of the other solutions, which are all small and nifty. – Patron 1/7, 2009 at 19:34

@Matt Poush: What? Either you don't understand something, or have a really braindead compiler. You repeat the division at most n times, where n is the number of digits in the number (10 with 32-bit int). And the division is actually implemented as a multiplication by a good compiler. That's WAY faster than doing a single FP logarithm. By the way, ALL compilers are braindead if you don't turn optimization on :). – Glazier 2/7, 2009 at 0:35

@stormsoul, are we still doing intensive calculations on a single thread? That was so last year. – Episcopate 24/7, 2009 at 21:51

@SteveMelnikoff But this isn't necessarily being run on an embedded processor. There are numerous restrictions if you go far enough into embedded systems, so let's not just assume all code is embedded; otherwise, what's the point of all of these nice CPU features that Intel and AMD are pumping out? If you assume that every problem is a nail, you naturally tend to pick the hammer to solve it! – Beatty 31/12, 2013 at 0:37

No need to compute a log or divide at all with this; see the edit in my answer. I've been using this for ages. – Giorgione 3/5, 2015 at 9:40

My two cents on a 6 year old question: when doing thousands of these operations using parallel processes, those few seconds really add up in the long term for large data sets. – Mateusz 30/11, 2015 at 14:28

@Hobble your comment is ignorant of the fact that he was asking a C question. It did not deserve 23 upvotes. C is being used today for more microprocessors than you would guess and you guessed wrong: the vast majority is MUCH MUCH slower than an 8088, which clocked up to 10 MHz. So for the sake of reason, do not comment that on a general question. Such computers might not even have format-string libraries; they likely won't even have a log() function available without writing it. Your comment would have been fine if it were about a modern PC program, but we don't know. – Hundredpercenter 29/9, 2017 at 21:37

This doesn't work for 999999999999998, 999999999999999 and longer versions of these — even if we replace abs with fabs or llabs (without this replacement it couldn't work even in principle), at least on the common systems where double is IEEE 754 binary64. – Flavory 6/4, 2020 at 19:8

My two cents on a now 12 year old question: Efficiency is important, but how important it is -- when we need to strive for speed, versus when we need to strive for convenience or obviousness or other virtues -- is endlessly subjective. Me, even on a high-powered processor, I would usually not haul out a full-blown floating-point logarithm just to compute the number of digits in an integer -- but at the same time, this is one way to do it, and it's a way that any competent programmer should know, so it was quite proper of eduffy to post it. And the upvotes prove that others agree! – Navaho 18/6, 2021 at 1:6

163

The recursive approach :-)

int numPlaces (int n) {
    if (n < 0) return numPlaces ((n == INT_MIN) ? INT_MAX: -n);
    if (n < 10) return 1;
    return 1 + numPlaces (n / 10);
}

Or iterative:

int numPlaces (int n) {
    int r = 1;
    if (n < 0) n = (n == INT_MIN) ? INT_MAX: -n;
    while (n > 9) {
        n /= 10;
        r++;
    }
    return r;
}

Or raw speed:

int numPlaces (int n) {
    if (n < 0) n = (n == INT_MIN) ? INT_MAX : -n;
    if (n < 10) return 1;
    if (n < 100) return 2;
    if (n < 1000) return 3;
    if (n < 10000) return 4;
    if (n < 100000) return 5;
    if (n < 1000000) return 6;
    if (n < 10000000) return 7;
    if (n < 100000000) return 8;
    if (n < 1000000000) return 9;
    /*      2147483647 is 2^31-1 - add more ifs as needed
       and adjust this final return as well. */
    return 10;
}

Those above have been modified to better process MININT. On any weird systems that don't follow sensible 2ⁿ two's complement rules for integers, they may need further adjustment.

The raw speed version actually outperforms the floating point version, modified below:

int numPlaces (int n) {
    if (n == 0) return 1;
    return floor (log10 (abs (n))) + 1;
}

With a hundred million iterations, I get the following results:

Raw speed with 0:            0 seconds
Raw speed with 2^31-1:       1 second
Iterative with 2^31-1:       5 seconds
Recursive with 2^31-1:       6 seconds
Floating point with 1:       6 seconds
Floating point with 2^31-1:  7 seconds

That actually surprised me a little - I thought the Intel chips had a decent FPU but I guess general FP operations still can't compete with hand-optimized integer code.

Update following stormsoul's suggestions:

Testing the multiply-iterative solution by stormsoul gives a result of 4 seconds so, while it's much faster than the divide-iterative solution, it still doesn't match the optimized if-statement solution.

Choosing the arguments from a pool of 1000 randomly generated numbers pushed the raw speed time out to 2 seconds so, while it appears there may have been some advantage to having the same argument each time, it's still the fastest approach listed.

Compiling with -O2 improved the speeds but not the relative positions (I increased the iteration count by a factor of ten to check this).

Any further analysis is going to have to get seriously into the inner workings of CPU efficiency (different types of optimization, use of caches, branch prediction, which CPU you actually have, the ambient temperature in the room and so on) which is going to get in the way of my paid work :-). It's been an interesting diversion but, at some point, the return on investment for optimization becomes too small to matter. I think we've got enough solutions to have answered the question (which was, after all, not about speed).

Further update:

This will be my final update to this answer barring glaring errors that aren't dependent on architecture. Inspired by stormsoul's valiant efforts to measure, I'm posting my test program (modified as per stormsoul's own test program) along with some sample figures for all methods shown in the answers here. Keep in mind that these figures come from one particular machine; your mileage may vary depending on where you run it (which is why I'm posting the test code).

Do with it as you wish:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <limits.h>
#include <time.h>

#define numof(a) (sizeof(a) / sizeof(a[0]))

/* Random numbers and accuracy checks. */

static int rndnum[10000];
static int rt[numof(rndnum)];

/* All digit counting functions here. */

static int count_recur (int n) {
    if (n < 0) return count_recur ((n == INT_MIN) ? INT_MAX : -n);
    if (n < 10) return 1;
    return 1 + count_recur (n / 10);
}

static int count_diviter (int n) {
    int r = 1;
    if (n < 0) n = (n == INT_MIN) ? INT_MAX : -n;
    while (n > 9) {
        n /= 10;
        r++;
    }
    return r;
}

static int count_multiter (int n) {
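    /* abs(INT_MIN) is undefined, so take the magnitude in unsigned arithmetic. */
    unsigned int num = (n < 0) ? 0u - (unsigned int)n : (unsigned int)n;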
    unsigned int x, i;
    for (x=10, i=1; ; x*=10, i++) {
        if (num < x)
            return i;
        if (x > INT_MAX/10)
            return i+1;
    }
}

static int count_ifs (int n) {
    if (n < 0) n = (n == INT_MIN) ? INT_MAX : -n;
    if (n < 10) return 1;
    if (n < 100) return 2;
    if (n < 1000) return 3;
    if (n < 10000) return 4;
    if (n < 100000) return 5;
    if (n < 1000000) return 6;
    if (n < 10000000) return 7;
    if (n < 100000000) return 8;
    if (n < 1000000000) return 9;
    /*      2147483647 is 2^31-1 - add more ifs as needed
    and adjust this final return as well. */
    return 10;
}

static int count_revifs (int n) {
    if (n < 0) n = (n == INT_MIN) ? INT_MAX : -n;
    if (n > 999999999) return 10;
    if (n > 99999999) return 9;
    if (n > 9999999) return 8;
    if (n > 999999) return 7;
    if (n > 99999) return 6;
    if (n > 9999) return 5;
    if (n > 999) return 4;
    if (n > 99) return 3;
    if (n > 9) return 2;
    return 1;
}

static int count_log10 (int n) {
    if (n < 0) n = (n == INT_MIN) ? INT_MAX : -n;
    if (n == 0) return 1;
    return floor (log10 (n)) + 1;
}

static int count_bchop (int n) {
    int r = 1;
    if (n < 0) n = (n == INT_MIN) ? INT_MAX : -n;
    if (n >= 100000000) {
        r += 8;
        n /= 100000000;
    }
    if (n >= 10000) {
        r += 4;
        n /= 10000;
    }
    if (n >= 100) {
        r += 2;
        n /= 100;
    }
    if (n >= 10)
        r++;

    return r;
}

/* Structure to control calling of functions. */

typedef struct {
    int (*fnptr)(int);
    char *desc;
} tFn;

static tFn fn[] = {
    NULL,                              NULL,
    count_recur,    "            recursive",
    count_diviter,  "     divide-iterative",
    count_multiter, "   multiply-iterative",
    count_ifs,      "        if-statements",
    count_revifs,   "reverse-if-statements",
    count_log10,    "               log-10",
    count_bchop,    "          binary chop",
};
static clock_t clk[numof (fn)];

int main (int c, char *v[]) {
    int i, j, k, r;
    int s = 1;

    /* Test code:
        printf ("%11d %d\n", INT_MIN, count_recur(INT_MIN));
        for (i = -1000000000; i != 0; i /= 10)
            printf ("%11d %d\n", i, count_recur(i));
        printf ("%11d %d\n", 0, count_recur(0));
        for (i = 1; i != 1000000000; i *= 10)
            printf ("%11d %d\n", i, count_recur(i));
        printf ("%11d %d\n", 1000000000, count_recur(1000000000));
        printf ("%11d %d\n", INT_MAX, count_recur(INT_MAX));
    /* */

    /* Randomize and create random pool of numbers. */

    srand (time (NULL));
    for (j = 0; j < numof (rndnum); j++) {
        rndnum[j] = s * rand();
        s = -s;
    }
    rndnum[0] = INT_MAX;
    rndnum[1] = INT_MIN;

    /* For testing. */
    for (k = 0; k < numof (rndnum); k++) {
        rt[k] = (fn[1].fnptr)(rndnum[k]);
    }

    /* Test each of the functions in turn. */

    clk[0] = clock();
    for (i = 1; i < numof (fn); i++) {
        for (j = 0; j < 10000; j++) {
            for (k = 0; k < numof (rndnum); k++) {
                r = (fn[i].fnptr)(rndnum[k]);
                /* Test code:
                    if (r != rt[k]) {
                        printf ("Mismatch error [%s] %d %d %d %d\n",
                            fn[i].desc, k, rndnum[k], rt[k], r);
                        return 1;
                    }
                /* */
            }
        }
        clk[i] = clock();
    }

    /* Print out results. */

    for (i = 1; i < numof (fn); i++) {
        printf ("Time for %s: %10d\n", fn[i].desc, (int)(clk[i] - clk[i-1]));
    }

    return 0;
}

Remember that you need to ensure you use the correct command line to compile it. In particular, you may need to explicitly list the math library to get log10() working. The command line I used under Debian was gcc -o testprog testprog.c -lm.

And, in terms of results, here's the leader-board for my environment:

Optimization level 0:

Time for reverse-if-statements:       1704
Time for         if-statements:       2296
Time for           binary chop:       2515
Time for    multiply-iterative:       5141
Time for      divide-iterative:       7375
Time for             recursive:      10469
Time for                log-10:      26953

Optimization level 3:

Time for         if-statements:       1047
Time for           binary chop:       1156
Time for reverse-if-statements:       1500
Time for    multiply-iterative:       2937
Time for      divide-iterative:       5391
Time for             recursive:       8875
Time for                log-10:      25438

Barramunda answered 1/7, 2009 at 12:21 Comment(28)

Recursive version seems to me like the cleanest, simplest, best self-documenting solution posted. – Consolidation 1/7, 2009 at 13:6

@sharptooth: But recursive traversal of lists is standard practice in functional languages! – Playacting 1/7, 2009 at 14:0

You can't assume you can get the abs of n if n is negative; if n is the minimum allowed value, you can't negate it. The opposite of -(2^31) is 2^31, which can't be represented as an int. – Traditor 1/7, 2009 at 14:52

@Brian: But C is not a functional language! – Glazier 1/7, 2009 at 15:3

@Matt Poush: Yes, by the ISO C standard, the result of abs(MIN_INT) is undefined. But in reality, nearly all machines will return MIN_INT, which is the correct result, if you cast it to unsigned int. – Glazier 1/7, 2009 at 15:7

@stormsoul: I wasn't trying to be that pedantic; I was assuming abs(MIN_INT) would return MIN_INT. The problem is that a number of the above functions assume that calling abs(MIN_INT) will return a positive number. For example, the iterative example negates a negative number, and then loops while the value is greater than 10. So, MIN_INT should return 10, but it returns 1, since it falls right through the while conditional. A better solution would be to create your own internal abs method that returns MAX_INT if the value is MIN_INT, or the actual abs value if the value isn't. – Traditor 1/7, 2009 at 15:22

What compiler did you use? Did you turn optimization on? I find it very hard to believe that a well-written iterative approach would be 5x slower than the unrolled version. Not with a good compiler. Also, a better measurement granularity would be much appreciated :) – Glazier 1/7, 2009 at 15:35

@Matt Poush: Matt, it is sufficient to cast the result of abs() to unsigned int. The cast will flip MIN_INT to MAX_INT+1. – Glazier 1/7, 2009 at 15:38

I have now read the code you tested. It is no wonder the iterative version you gave is slow, because it uses divisions, which are slow. Could you please time the multiplication-based iterative approach I have given? – Glazier 1/7, 2009 at 15:41

gcc on Ubuntu 9.04, no particular flags so I don't know what the default optimization was (but it was in one executable anyway so they had the same optimization). The granularity was 1 second (using time(0)) and averaged over 10 runs. The improvement was, I suspect, due to the fact that the raw speed version is doing only comparisons, not divisions. Your multiplication one may well do better (that was a nifty way around the speed issue but I can't test it now since the machine's in the bedroom and the wife's gone off to sleep). I'll give it a shot in the morning if you wish. – Barramunda 1/7, 2009 at 15:42

Re the MININT problem, you could just add detection for that up front: (1) if they're power-of-two based, use "n = (n == MININT) ? MAXINT : -n". There's no power-of-two within one of a power-of ten so that would still return the right number of digits. (2) If they're wildly different (MAXINT = 100, MININT = -7), multiplex to two different raw-speed functions, one for +ve, the other for -ve. No doubt there are other solutions as well, they're just the ones I thought of off the top of my head. That particular problem affects all of the n=-n and n=abs(n) solutions. – Barramunda 1/7, 2009 at 15:48

Sorry for comment-spamming, but I just got another idea ;). If you repeatedly time your function on THE SAME ARGUMENT, then no wonder your unrolled version will beat the asses out of every other. This is because the CPU will remember which branches are taken, and which are not, and rush through the code WITHOUT WAITING for the checks. Make an array of ~10000 random values (not more, to keep it in the L1 cache) and time the functions on that. – Glazier 1/7, 2009 at 15:51

@Pax: You do not need to test for MININT - just cast it to unsigned int, which will give you the right result for NO cost. The (newer revisions of) ISO C standard guarantees integers are power-of-two based with a representation of signed integers being one of 3 possibilities allowed by the standard. The only one of those 3 possibilities where MININT != -MAXINT would be the most common one: the two's complement, where MININT == -(MAXINT+1). So the cast to unsigned int is pretty much a sure way to get the correct result everywhere. – Glazier 1/7, 2009 at 16:6

I don't think you can cast blindly (Pax hesitates...) - in two's complement, -1 becomes 2^32-1 which is 10 digits long, not one as required. – Barramunda 1/7, 2009 at 16:14

@Pax: GCC by default gives you no optimization. Pass -O2 or even -O3 to turn it on. If you additionally pass -funroll-loops you may even get the loop unrolled automatically by the compiler :). Also, optimization gives VERY different gains on different code, so it would change the ranking considerably. Your version and the float version would probably not gain much - if anything. The loops - quite on the contrary! – Glazier 1/7, 2009 at 16:16

@Pax: Just HOW IN THE WORLD would you like abs() to return -1? The only negative value it can return is MIN_INT :). – Glazier 1/7, 2009 at 16:18

@storm, I'm not talking about abs() returning -1, I'm stating that "int i = -1; unsigned int j = (unsigned int)i;" will set j to a big honkin' positive number. Or did you mean something else by your cast without cost? – Barramunda 1/7, 2009 at 16:26

Never mind, I think I got it: you mean (unsigned int)(abs(i)) – Barramunda 1/7, 2009 at 16:28

this is too premature optimization. it assumes that the user WOULD need to run this operation a hundred million times. if i'm going to run it only a few dozen times, the time spent optimizing this would be better spent elsewhere. – Compelling 2/7, 2009 at 2:4

@moogs, you can use any of the solutions presented in this, or any other, answer here. The speed testing was really just an aside (that got out of hand). And in any case, you still have that time available to you - it's only my time that was possibly wasted here so feel free to use the fruits of my labor as you wish :-) – Barramunda 2/7, 2009 at 2:34

You DO realize that rand() never returns negative values, do you? :) – Glazier 2/7, 2009 at 10:34

A small performance tip - when you know some value is always non-negative, use unsigned types. They are slightly faster to multiply and divide. The compiler might guess for you that some variable is never negative and make this optimization automatically, but again, it might not. In more complex situations it never does. – Glazier 2/7, 2009 at 10:46

Right. Someone in IRC made some performance tests, then he used unsigned and he got a really great boost. I urge you to try the unsigned world! :) – Olsewski 2/7, 2009 at 14:5

Nice answer =) I like the raw-speed version, but I think it can be improved by branching in a binary fashion to reduce the worst-case number of comparisons to four (disregarding the negative-test), obviously at the expense of readability. In that respect, one can tune a version of it specifically to the intended data range. – Impostor 26/10, 2012 at 3:13

On my system, for that test snippet, you need to add #include <time.h> and #include <stdio.h> to the top, and the log10 one doesn't work. – Contrapuntist 12/5, 2016 at 3:37

@QPaysTaxes, it compiled fine for me without time.h but, since it's good practice, I've added it in. The stdio.h one was already there. For the non-working log10(), it's probably due to not linking with the math library so I've added a note for that as well. Thanks for the heads-up. – Barramunda 12/5, 2016 at 9:7

Huh, when I copied/pasted it, it didn't have stdio. I guess I did that wrong. Thanks for fixing it! – Contrapuntist 12/5, 2016 at 12:7

the native floating-point log10 will be very slow. It's much better to use an integer log10 like this. And the abs in the log10 case won't work for INT_MIN like in your raw speed case – Hooker 11/3, 2018 at 2:9
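The link behind "like this" in the last comment did not survive, so the following is only one possible reading of an integer log10: a lookup table of powers of ten. It is a sketch, not the commenter's code; the names pow10_table and count_digits_u32 are hypothetical, and the table assumes unsigned int is at least 32 bits wide.

```c
/* Powers of ten up to 10^9, the largest that fits in 32 bits.
   Assumption: unsigned int is at least 32 bits wide. */
static const unsigned int pow10_table[] = {
    1u, 10u, 100u, 1000u, 10000u, 100000u,
    1000000u, 10000000u, 100000000u, 1000000000u
};

/* Hypothetical helper: decimal digits of an unsigned 32-bit magnitude. */
int count_digits_u32(unsigned int u) {
    int digits = 1;
    /* pow10_table[digits] is the smallest value with digits + 1 digits;
       step up while u still reaches it. */
    while (digits < 10 && u >= pow10_table[digits])
        digits++;
    return digits;
}
```

Because the argument is already unsigned, the INT_MIN question moves to the caller, who can pass the magnitude computed in unsigned arithmetic. The loop does comparisons only, with no division and no floating point.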
