Cross Platform Support for sprintf's Format '-Flag

The Single UNIX Specification Version 2 specifies the sprintf's format '-flag behavior as:

The integer portion of the result of a decimal conversion (%i, %d, %u, %f, %g or %G) will be formatted with thousands' grouping characters[1]

I can't find the format '-flag in the C or the C++ specifications. g++ even warns:

ISO C++11 does not support the ' printf flag

Visual C doesn't recognize the flag at all, not even enough to warn about it; printf("%'d", foo) outputs:

'd

I'd like to be able to write C-standard compliant code that uses the behavior of the format '-flag. Thus the answer I'm looking for is one of the following:

  1. C-Standard specification of the format '-flag
  2. A cross platform compatible extrapolation of gcc's format '-flag
  3. Demonstration that a cross platform extrapolation is not possible
Penrose answered 13/6, 2017 at 14:3 Comment(12)
Not clear what you are asking. "I can't find the format '-flag in the c or the c++ specifications.". You have your answer already. What is your problem?Freak
Oh man, I thought it was just C++ that had the mandatory downvote on all questions. I guess C also? Or was there an actual reason for the downvote on this carefully researched question?Penrose
And that flag is not gcc, but the libc on your system.Freak
@Olaf If you're agreeing with me that it is not in the C-standard, and that I didn't just overlook the specification, I'm looking for a way to replicate libc's behavior in a cross platform manner. If I just missed it in the C-standard and this is an instance of Microsoft failing to fully implement the standard, I'd be OK using the format '-flag.Penrose
As you mention the specifications, I have to assume you mean the standards, not whatever Microsoft says. Last time I checked, MS was not the one defining the C specifications (printf etc. are not C++!). If that assumption is wrong, you have to be more specific. For the rest: we are not a coding service. If you want such a feature, the solution is obvious.Freak
Possible duplicate of How to format a number from 1123456789 to 1,123,456,789 in C?Untidy
@Olaf I'm not certain that I understand your last comment. Is there an action that I need to take here? I understand from your comments that an answer of type 1 is out. I'll need to hope that someone can either recommend a workaround for 2 or, for 3, explain why that's not possible.Penrose
@AndreKampling I looked through these answers before asking, but on second glance I noticed Jerry Coffin's solution; it looks like it may provide a cross platform solution that correctly implements locale-based numeric separation... I'm inspecting now, thanks.Penrose
There isn't a cross-platform way to use the POSIX extension to the C standard. It isn't necessarily universally implemented on POSIX systems, let alone elsewhere. If you need the functionality, you'll have to implement it. The information you need is available in the struct lconv available from standard C localeconv(). Decide which of many functions you will use to do the formatting, but portability dictates you use a custom function to do the formatting, implemented as needed.Pave
@JonathanLeffler Thanks, this gives me a great starting point. I'll look at this and see what I can cook up...Penrose
I still can't see your problem. What do you do if there is no standard function for something you need? Obviously you write it on your own. That's actually what programming is about! Sorry, if I missed something more obvious, but from your reps, I had to assume it was clear which action you have to take.Freak
@Olaf No, no you're not wrong. I wanted in 1 to make sure that writing this code was required by the standard, 3 to make sure that writing this code was possible, and 2 to ask if there was an effective workaround that I had missed. I have to admit, the downvote and closevote is frustrating. This is clearly a better question than possible dupe, which asked for a solution without even understanding the problem. But hey that was a more positive time in the life of stackoverflow.com I get it.Penrose
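For anyone following the localeconv() suggestion in the comments above, here is a minimal sketch of how the relevant struct lconv fields can be inspected. The helper name describe_numeric_locale and the output layout are made up for illustration; only localeconv() and the decimal_point, thousands_sep and grouping members come from standard C.

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: writes a one-line summary of the current locale's
   numeric formatting rules into out. Returns snprintf's result. */
static int describe_numeric_locale(char *out, size_t outsz)
{
    const struct lconv *lc = localeconv();
    char groups[64] = "";
    size_t used = 0;

    assert(out != NULL);
    /* grouping holds small integers (group sizes), not printable characters;
       CHAR_MAX means "no further grouping" */
    for (const char *g = lc->grouping; *g != '\0' && *g != CHAR_MAX; ++g) {
        int n = snprintf(groups + used, sizeof groups - used, "%d ", *g);
        if (n < 0 || (size_t)n >= sizeof groups - used)
            break;
        used += (size_t)n;
    }
    return snprintf(out, outsz,
                    "decimal_point=\"%s\" thousands_sep=\"%s\" grouping=[ %s]",
                    lc->decimal_point, lc->thousands_sep, groups);
}
```

In the default "C" locale the standard guarantees decimal_point is "." and both thousands_sep and grouping are empty, so nothing gets grouped until setlocale() selects another locale.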

Standard C doesn't provide this formatting capability directly, but it does let you retrieve a locale-specific specification of what the formatting should be, via localeconv(). So it's up to you to retrieve the locale's specification and then apply it to your data (which, even then, is somewhat non-trivial). For example, here's a version for formatting long data:

#include <stdlib.h>
#include <locale.h>
#include <string.h>
#include <limits.h>

static int next_group(char const **grouping) {
    if ((*grouping)[1] == CHAR_MAX)
        return 0;
    if ((*grouping)[1] != '\0')
        ++*grouping;
    return **grouping;
}

size_t commafmt(char   *buf,            /* Buffer for formatted string  */
                int     bufsize,        /* Size of buffer               */
                long    N)              /* Number to convert            */
{
    int len = 1;    /* characters written so far, counting the current digit */
    int posn = 1;   /* digit counter used to place separators */
    int sign = 1;
    char *ptr = buf + bufsize - 1;

    struct lconv *fmt_info = localeconv();
    char const *tsep = fmt_info->thousands_sep;
    char const *group = fmt_info->grouping;
    // char const *neg = fmt_info->negative_sign;
    size_t sep_len = strlen(tsep);
    // size_t neg_len = strlen(neg);
    int places = (int)*group;

    if (bufsize < 2)
    {
ABORT:
        *buf = '\0';
        return 0;
    }

    *ptr-- = '\0';
    --bufsize;
    if (N < 0L)
    {
        sign = -1;
        N = -N;
    }

    for ( ; len <= bufsize; ++len, ++posn)
    {
        *ptr-- = (char)((N % 10L) + '0');
        if (0L == (N /= 10L))
            break;
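        if (places && (0 == (posn % places)))
        {
            places = next_group(&group);
            posn = 0;   /* restart the count so unequal group sizes work */
            for (int i = (int)sep_len; i > 0; i--) {
                if (len >= bufsize)     /* check before writing, not after */
                    goto ABORT;
                *ptr-- = tsep[i-1];
                ++len;
            }
        }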
        if (len >= bufsize)
            goto ABORT;
    }

    if (sign < 0)
    {
        if (len >= bufsize)
            goto ABORT;
        *ptr-- = '-';
        ++len;
    }

    memmove(buf, ++ptr, len + 1);
    return (size_t)len;
}

#ifdef TEST
#include <stdio.h>

#define elements(x) (sizeof(x)/sizeof(x[0]))

void show(long i) {
    char buffer[32];

    commafmt(buffer, sizeof(buffer), i);
    printf("%s\n", buffer);
    commafmt(buffer, sizeof(buffer), -i);
    printf("%s\n", buffer);
}


int main(void) {
    long inputs[] = {1, 12, 123, 1234, 12345, 123456, 1234567, 12345678 };

    /* Adopt the user's locale once; the default "C" locale has no grouping. */
    setlocale(LC_ALL, "");

    for (size_t i = 0; i < elements(inputs); i++)
        show(inputs[i]);
    return 0;
}

#endif

This does have a bug (but one I'd consider fairly minor). On two's complement hardware, it won't convert the most-negative number correctly, because it attempts to convert a negative number to its equivalent positive number with N = -N;. In two's complement, the maximally negative number doesn't have a corresponding positive number unless you promote it to a larger type. One way around this is to convert the number to the corresponding unsigned type (though doing so is somewhat non-trivial).
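As a rough sketch of that unsigned conversion (not the code from the answer; the names magnitude and long_to_digits are made up for illustration), the idea is to take the absolute value in unsigned arithmetic, where wraparound is well defined, and never negate the signed value:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: returns |n| as an unsigned long without ever
   negating a signed value. */
static unsigned long magnitude(long n)
{
    /* Signed-to-unsigned conversion is defined modulo ULONG_MAX + 1, and
       unsigned subtraction wraps, so 0UL - (unsigned long)n equals |n|
       even when n == LONG_MIN. */
    return n < 0 ? 0UL - (unsigned long)n : (unsigned long)n;
}

/* Writes the decimal digits of n (with a leading '-' if negative) into out.
   Returns the length written, or 0 if out is too small. */
static size_t long_to_digits(long n, char *out, size_t outsz)
{
    char tmp[sizeof(long) * CHAR_BIT];  /* more than enough digits */
    size_t i = 0, len = 0;
    unsigned long u = magnitude(n);

    assert(out != NULL);
    do {                                /* digits, least significant first */
        tmp[i++] = (char)('0' + u % 10UL);
    } while ((u /= 10UL) != 0);

    if (i + (n < 0) + 1 > outsz)        /* digits + sign + terminator */
        return 0;
    if (n < 0)
        out[len++] = '-';
    while (i > 0)                       /* reverse into the output */
        out[len++] = tmp[--i];
    out[len] = '\0';
    return len;
}
```

The same substitution should carry over to commafmt: keep the sign in a flag, run the digit loop on the unsigned magnitude, and the N = -N; step goes away.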

Implementing the same for other integer types is fairly trivial. Floating point types are a bit more work. Converting floating point types correctly (even without formatting) is enough extra work that, for them, I'd at least consider using something like sprintf to do the conversion, then inserting the separators into the string it produced.
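A minimal sketch of that "convert first, then insert separators" idea might look like the following. The function names group_digits and format_grouped and the buffer sizes are assumptions made for illustration; the separator and grouping string are passed in explicitly to the inner helper so it can be exercised without depending on the installed locales.

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the digit string `digits` into out, inserting
   `sep` according to a localeconv()-style `grouping` string.
   Returns the length written, or 0 if a buffer is too small. */
static size_t group_digits(const char *digits, const char *sep,
                           const char *grouping, char *out, size_t outsz)
{
    size_t ndig = strlen(digits), seplen = strlen(sep);
    char tmp[512];
    size_t t = sizeof tmp;            /* tmp is filled from the right */
    int size = *grouping, count = 0;

    assert(out != NULL);
    tmp[--t] = '\0';
    while (ndig > 0) {
        if (size > 0 && size != CHAR_MAX && count == size) {
            if (t < seplen + 1) return 0;
            t -= seplen;
            memcpy(tmp + t, sep, seplen);
            count = 0;
            if (grouping[1] != '\0')  /* otherwise repeat the last size */
                size = *++grouping;
        }
        if (t == 0) return 0;
        tmp[--t] = digits[--ndig];
        ++count;
    }
    if (sizeof tmp - t > outsz) return 0;
    memcpy(out, tmp + t, sizeof tmp - t);
    return sizeof tmp - t - 1;
}

/* Formats val with `prec` decimals via snprintf, then groups the integer
   digits using the current locale's thousands_sep and grouping. */
static size_t format_grouped(double val, int prec, char *out, size_t outsz)
{
    const struct lconv *lc = localeconv();
    char raw[400], ipart[400];
    const char *p = raw, *rest;
    size_t n = 0, ilen;
    int rc = snprintf(raw, sizeof raw, "%.*f", prec, val);

    assert(out != NULL);
    if (rc < 0 || (size_t)rc >= sizeof raw || outsz < 2)
        return 0;
    if (*p == '-')
        out[n++] = *p++;              /* pass the sign through unchanged */
    ilen = strspn(p, "0123456789");   /* integer digits stop at the radix */
    if (ilen == 0)
        return 0;                     /* "inf" or "nan": nothing to group */
    rest = p + ilen;
    memcpy(ipart, p, ilen);
    ipart[ilen] = '\0';
    ilen = group_digits(ipart, lc->thousands_sep, lc->grouping,
                        out + n, outsz - n);
    if (ilen == 0 || n + ilen + strlen(rest) + 1 > outsz)
        return 0;
    strcpy(out + n + ilen, rest);     /* radix character and fraction */
    return n + ilen + strlen(rest);
}
```

Letting snprintf do the rounding first means a carry such as 999.999 rounding to 1000.00 is already reflected in the digits before any separator is inserted.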

Piping answered 14/6, 2017 at 14:26 Comment(6)
Another note: from lconv it appears that negative_sign is "a string used to indicate negative monetary quantity", so I think just '-' should be used for non-monetary integers.Penrose
@JonathanMee: Hm...quite right. Not sure how I missed that (or maybe I just didn't look).Piping
So I ported the concepts in this answer to C++ and linked this question. Hopefully this meets with your approval. https://mcmap.net/q/89394/-comma-separate-number-without-stringstreamPenrose
@JonathanMee: It doesn't bother me, but it seems a lot less useful in C++. C++ has required that iostreams be locale-aware for a long time, so a stream will punctuate a number according to the locale with which it's been imbued. std::cout.imbue(std::locale("")); std::cout << 1234567.89; will print (on my machine) 1,234,567.89, but (since I'm using the anonymous locale) that would vary depending on how your environment was configured. At least in my experience, all the typical compilers/libraries have supported this for a long time.Piping
Yeah I agree. The question was actually spawned by the requirement that I do this without using stringstream. Thus the need to jump through all these hoops.Penrose
@JonathanMee: Ah, I see--understandable, I guess--a stringstream does impose quite a bit of overhead if this is all you really want.Piping

Here is an improved version. It takes a double value and an int fix, and returns a newly allocated string holding the value with its integer part grouped and its fraction rounded to 'fix' decimal places. The returned string should be freed after use (unless it is NULL).

Example:

char *fmtd = commafmt(5467729.3784, 2);
printf("Result is %s\n", fmtd);
free(fmtd);

Output (after setlocale(LC_ALL, "") in a locale that groups by thousands with a comma): Result is 5,467,729.38

#include <stdio.h>    /* snprintf */
#include <stdlib.h>
#include <locale.h>
#include <string.h>   /* strdup is POSIX (and C23), not C99/C11 */
#include <limits.h>
#include <stdint.h>

static inline int next_group(char const **grouping) {
  if ((*grouping)[1] == CHAR_MAX) return 0;   /* no further grouping */
  if ((*grouping)[1] != '\0') ++*grouping;    /* else repeat the last size */
  return **grouping;
}

static size_t commafmt_inner(char *buf, int bufsize, double val, int fix) {
  struct lconv *fmt_info = localeconv();
  char const *tsep = fmt_info->thousands_sep;
  char const *group = fmt_info->grouping;
  int neg = val < 0.;
  double mag = neg ? -val : val;
  uint64_t N = (uint64_t)mag;                 /* assumes |val| < 2^64 */
  size_t sep_len = strlen(tsep);
  int places = (int)*group, len = 0, posn = 0;
  char frac[64] = "";
  char *ptr;

  if (fix > 0) {
    /* format the fraction first: "0.dd", or "1.00" if it rounds up */
    snprintf(frac, sizeof frac, "%.*f", fix, mag - (double)N);
    if (frac[0] == '1') ++N;                  /* carry into the integer part */
  }

  if (--bufsize < 1) return 0;
  ptr = buf + bufsize;
  *ptr-- = '\0';

  while (1) {
    *ptr-- = (char)((N % 10) + '0');
    if (++len >= bufsize) return 0;
    if (!(N /= 10)) break;
    if (places && !(++posn % places)) {
      places = next_group(&group);
      posn = 0;                               /* restart the count for the next group */
      for (int i = (int)sep_len; i--; *ptr-- = tsep[i])
        if (++len >= bufsize) return 0;
    }
  }

  if (neg) {
    if (++len >= bufsize) return 0;
    *ptr-- = '-';
  }

  memmove(buf, ++ptr, len + 1);               /* len characters plus the terminator */

  if (fix > 0) {
    /* append the fraction without its leading integer digit */
    size_t fraclen = strlen(frac + 1);
    if ((size_t)len + fraclen >= (size_t)bufsize) return 0;
    memcpy(buf + len, frac + 1, fraclen + 1);
  }

  return len;
}

char *commafmt(double val, int fix) {
  char buff[1024];
  size_t len = commafmt_inner(buff, (int)sizeof buff, val, fix);
  return len ? strdup(buff) : NULL;
}
Impersonalize answered 28/10, 2023 at 2:32 Comment(0)
