German number separators using format language on OSX?
Update: The answers so far suggest that this is a platform-related bug on OS X: its locale definitions do not fully support digit grouping.

Update 2: I have just opened an issue on Python's bug tracker. Let's see if there is a solution to this problem.


I want to format integer and float numbers according to the German numbering convention. This should be possible using the format language and the presentation type n, but it fails on my platform.

  • Platform: OS X 10.8.2 (Mountain Lion)
  • Python: 2.7.3 64-bit (v2.7.3:70274d53c1dd, Apr 9 2012, 20:52:43) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin

Examples:

  • 1234 => 1.234
  • 1234.56 => 1.234,56
  • 1000000 => 1.000.000

What I have tried so far:

  1. Setting the German locale

    import locale
    locale.setlocale(locale.LC_ALL, 'de_DE')
    
  2. The ',' format specification option only produces the English format.

    '{:,}'.format(1234)
    '1,234'
    
    '{:,}'.format(1234.56)
    '1,234.56'
    
    '{:,}'.format(1000000)
    '1,000,000'
    
  3. According to the Python docs, the integer and float presentation type n is supposed to do what I want but it doesn't.

     '{:n}'.format(1234)
     '1234'
    
     '{:n}'.format(1234.56)
     '1234,56'  # at least the comma was set correctly here
    
     '{:n}'.format(1000000)
     '1000000'
    
     '{:n}'.format(12345769.56)
     '1,23458e+07'  # it's doing weird things for large floats
    
  4. Some more examples and comparisons inspired by @J.F.Sebastian:

    for n in [1234, 1234.56, 1000000, 12345769.56]:
        print('{0:,} {0:n}'.format(n))
        fmt, val = "%d %f", (n, n)
        print(fmt % val)
        print(locale.format_string(fmt, val))
        print(locale.format_string(fmt, val, grouping=True))
        print('-'*60)
    

    This yields the following incorrect results on my platform:

        1,234 1234
        1234 1234.000000
        1234 1234,000000
        1234 1234,000000
        ------------------------------------------------------------
        1,234.56 1234,56
        1234 1234.560000
        1234 1234,560000
        1234 1234,560000
        ------------------------------------------------------------
        1,000,000 1000000
        1000000 1000000.000000
        1000000 1000000,000000
        1000000 1000000,000000
        ------------------------------------------------------------
        12,345,769.56 1,23458e+07
        12345769 12345769.560000
        12345769 12345769,560000
        12345769 12345769,560000
        ------------------------------------------------------------
    

    The correct results, which I'm not getting, would look like this:

        1,234 1.234
        1234 1234.000000
        1234 1234,000000
        1.234 1.234,000000
        ------------------------------------------------------------
        1,234.56 1.234,56
        1234 1234.560000
        1234 1234,560000
        1.234 1.234,560000
        ------------------------------------------------------------
        1,000,000 1.000.000
        1000000 1000000.000000
        1000000 1000000,000000
        1.000.000 1.000.000,000000
        ------------------------------------------------------------
        12,345,769.56 1,23458e+07 
        12345769 12345769.560000
        12345769 12345769,560000
        12.345.769 12.345.769,560000
        ------------------------------------------------------------
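
Regarding the exponent output for the large float in step 3: per the format specification docs, the n presentation type behaves like g, so without an explicit precision it keeps six significant digits and switches to exponent notation for larger values. A minimal sketch, assuming that requesting more significant digits is acceptable (format_n is a hypothetical helper name, and this does not add the missing group separators on OS X):

```python
def format_n(value, digits):
    """Format value with the locale-aware 'n' type and `digits` significant digits."""
    # As with 'g', the precision for 'n' counts significant digits,
    # not digits after the decimal point.
    return '{:.{p}n}'.format(value, p=digits)
```

With the de_DE locale active, format_n(12345769.56, 10) should avoid the exponent form; whether separators appear still depends on the platform's locale data.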
    

Do you have a solution for me using the format language only? Is there any way to trick the locale settings on my platform to accept grouping?

Fishwife answered 11/1, 2013 at 21:37 Comment(14)
It seems silly to put arbitrary limits on answers - remember that people can give different answers. Saying "don't solve this with regexes" may mean you miss out on the best solution, and it doesn't make it more or less likely people will give the answer you are looking for.Simultaneous
@Lattyware You misunderstand me. I know how to do it using regular expressions. I simply want to know whether this is possible using the Python format language only.Fishwife
Then what I'd do is post the question without that limitation, and then post an answer yourself with the regex solution. You might want to make a note in the question you are still looking for the formatting based answer, but that way everyone wins. Either way - this is just my thoughts, I don't think there is anything official, just might be worth doing.Simultaneous
@Lattyware How is that necessary for others to come up with an answer? I want to know whether there is a regex-free way of doing it. Nothing else matters.Fishwife
I'm just saying that SO is a resource as well as a place to get an answer to your question. People searching for how to do this might come across your question, and having the solution you know there too might be beneficial, it makes the question more general and complete.Simultaneous
@Lattyware I get your point but I just want to focus on what I'm looking for at the moment. Keeping it simple, you know. Maybe I add that later. But thanks for your thoughts. :)Fishwife
As a note, it might be worth taking a look to see how Django does it - it's a mature framework that handles localization well, so I would imagine it has mature code for this kind of thing.Simultaneous
have you tried to call locale.format_string() directly after setting desired locale?Entomo
@J.F.Sebastian Yes, doesn't work for me either.Fishwife
@PeterStahl: have you tried to set grouping=True?Entomo
@J.F.Sebastian Yes, your same code gives me incorrect results. Please look at my updated question.Fishwife
btw, It works fine on Ubuntu Python 2.6, 2.7, 3.3 except it doesn't accept 'de_DE' but only 'de_DE.UTF-8' locale. Try to reinstall your locale packages.Entomo
@J.F.Sebastian I don't think this would solve the problem because Lattyware gets the same wrong results.Fishwife
{:n} does not allow to specify the number precision.Quagmire
P
11

Super ugly, but technically answers the question:

From PEP 378:

'{:,}'.format(1234.56).replace(",", "X").replace(".", ",").replace("X", ".")
'1.234,56'
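
The same swap can probably be written without the temporary "X" placeholder. A minimal sketch, assuming Python 3 (str.maketrans is not available as a str method in Python 2); format_german and _SWAP are hypothetical names introduced here for illustration:

```python
# Translation table that swaps ',' and '.' in a single pass (Python 3).
_SWAP = str.maketrans(',.', '.,')

def format_german(value, spec=','):
    """Format with English separators, then swap them to the German style."""
    return format(value, spec).translate(_SWAP)
```

For example, format_german(1234.56) gives '1.234,56', and a spec such as ',.2f' can be passed to fix the number of decimals. Like the replace() chain, this ignores the locale entirely.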
Phrixus answered 11/1, 2013 at 21:50 Comment(4)
Wow, that's... horrible. Surprised, the format stuff is so flexible, it seems an odd limitationSimultaneous
The solution using replace() came to my mind as well but this is ugly. Is there no other way?Fishwife
If it's recommended by the PEP, it means it's the 'official' way - not that it means much, but it's a bad sign if you are looking for something better.Simultaneous
I wouldn't say it's recommended by the PEP, just that it happens to be included (as an example of what users would have to resort to if the fancier "Proposal II" wasn't accepted, which it wasn't.)Phrixus
P
6

Python's locale module's implementation unfortunately varies quite a bit across platforms. It's really just a light wrapper around the C library vendor's notion of locales.

So, on Windows 7, with Python 2.7.3 64-bit, this happens to work (note: locales have different names in Windows):

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'deu_deu')
'German_Germany.1252'
>>> '{0:n}'.format(1234.56)
'1.234,56'

Whether the thousands separator will be used can be determined by examining the "local conventions":

>>> locale.localeconv()['grouping'] # On Windows, 'deu_deu'.
[3, 0] # Insert separator every three digits.

>>> locale.localeconv()['grouping'] # On OS X, 'de_DE'.
[127] # No separator (locale.CHAR_MAX == 127).

>>> locale.localeconv()['grouping'] # Default C locale.
[] # Also no separator.
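
The check above could be wrapped in a small helper. This is only a sketch: uses_grouping is a hypothetical name, and it assumes a dict shaped like the one returned by locale.localeconv():

```python
import locale

def uses_grouping(conv):
    """Return True if a localeconv()-style dict describes digit grouping."""
    grouping = conv.get('grouping', [])
    sep = conv.get('thousands_sep', '')
    # An empty list or a leading CHAR_MAX means "no grouping"; an empty
    # separator would make grouping invisible even if sizes were given.
    return bool(grouping) and grouping[0] not in (0, locale.CHAR_MAX) and bool(sep)
```

After locale.setlocale(...), calling uses_grouping(locale.localeconv()) would tell you whether '{:n}' can be expected to insert separators on the current platform.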
Phrixus answered 11/1, 2013 at 22:31 Comment(6)
What does the number 127 stand for?Fishwife
@PeterStahl locale.CHAR_MAX. See the documentation for localeconv(), 'grouping'.Phrixus
Is there any way of modifying the grouping setting?Fishwife
Is your last locale truly English, or is it the default C locale? I find I get separators if I use the 'en_US' locale.Bisk
@FredLarson What do you mean by last locale? I'm using de_DE as the locale for all examples.Fishwife
@PeterStahl: I meant the last locale in this answer, which originally said "English locale". Jon-Eric has it right now. The default "C" locale has no separators.Bisk
B
5

This worked for me when used with the German locale:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'de_DE')
'de_DE'
>>> '{0:n}'.format(1234.56)
'1.234,56'

This is in Cygwin under Windows 7:

>>> import sys
>>> print sys.version
2.6.5 (r265:79063, Jun 12 2010, 17:07:01)
[GCC 4.3.4 20090804 (release) 1]
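
Since the locale name differs between platforms ('de_DE', 'de_DE.UTF-8' and 'deu_deu' all appear on this page), one option is to try several candidates in turn. A rough sketch; set_first_available is a hypothetical helper, and which names exist depends entirely on the system:

```python
import locale

def set_first_available(candidates, category=locale.LC_ALL):
    """Activate the first locale name the platform accepts; return it, or None."""
    for name in candidates:
        try:
            locale.setlocale(category, name)
        except locale.Error:
            # This name is unknown on the current platform; try the next one.
            continue
        return name
    return None
```

For instance, set_first_available(['de_DE.UTF-8', 'de_DE', 'deu_deu']) returns whichever name was accepted. It does not fix missing grouping data, as on OS X.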
Bisk answered 11/1, 2013 at 22:6 Comment(8)
What platform is this from?Phrixus
Running on Windows 7, Python version 2.6.5 (r265:79063, Jun 12 2010, 17:07:01) [GCC 4.3.4 20090804 (release) 1]Bisk
Using locale.setlocale(locale.LC_ALL, 'de_DE') this doesn't work for me on OSX 10.8.2 using Python 2.7.3 64-bit (v2.7.3:70274d53c1dd, Apr 9 2012, 20:52:43)[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin. What are your exact locale settings?Fishwife
Interesting, on OS X both 2.7.3 and 3.3.0 don't do this for me, I get the same as the asker.Simultaneous
@PeterStahl: I used the same locale you did.Bisk
@Lattyware This is weird. Is this really supposed to be a platform problem? Such a simple task? Oh my...Fishwife
@PeterStahl Sounds like either a bug or something we are missing going on - such a change in behaviour based on OS seems wrong. It could be a 2.6.5 vs 2.7.3+ thing, but that seems unlikely to me (as it seems like a step back).Simultaneous
@Lattyware It's not a 2.6 vs 2.7 issue. I get the same incorrect results on OSX using 2.6.7 Python 2.6.7 (r267:88850, Jun 20 2012, 16:23:38) [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin.Fishwife
D
2

Even more ugly with split, join and replace:

>>> amount = '{0:,}'.format(12345.67)
>>> amount
'12,345.67'
>>> ','.join([s.replace(',','.') for s in amount.split('.')])
'12.345,67'
Doronicum answered 13/3, 2018 at 13:40 Comment(0)
F
1

I was asked by @Lattyware to provide my own solution for including separators according to the German numbering convention without using the format language. Here is the best solution that I can come up with:

import re

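def group_num(num):
    # Only plain ints and floats are supported.
    if not isinstance(num, (int, float)):
        raise TypeError('%r is not of type int or float' % (num,))
    if isinstance(num, float):
        # Note: floats whose str() uses exponent notation (e.g. 1e+20)
        # are not handled correctly.
        head, tail = str(num).split('.')
    else:
        head, tail = str(num), ''
    # Walk the reversed integer part in chunks of up to three digits,
    # keeping a trailing '-' (the sign) attached to the last chunk.
    digit_parts = re.findall(r'\d{1,3}-?', head[::-1])
    # Restore the original order and join the groups with '.'.
    num = '.'.join(part[::-1] for part in digit_parts[::-1])
    if tail:
        num = ','.join((num, tail))
    return num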

>>> group_num(1234)
'1.234'
>>> group_num(123456.7890)
'123.456,789'
>>> group_num(-1000000000.12)
'-1.000.000.000,12'

The performance is acceptable, although it is roughly eight to nine times slower than the replace()-based solution given by @Jon-Eric:

%timeit group_num(1000000000.12)
10000 loops, best of 3: 20.6 us per loop

# For integers, it's faster since several steps are not necessary
%timeit group_num(100000000012)
100000 loops, best of 3: 18.2 us per loop

%timeit '{:,}'.format(1000000000.12).replace(",", "X").replace(".", ",").replace("X", ".")
100000 loops, best of 3: 2.63 us per loop

%timeit '{:,}'.format(100000000012).replace(",", "X").replace(".", ",").replace("X", ".")
100000 loops, best of 3: 2.01 us per loop

If you know how my solution could be optimized, please let me know.
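
One possible simplification, offered as an unbenchmarked sketch rather than a proven optimization, is to drop the regular expression and slice the digit string directly. group_digits and group_num_plain are hypothetical names, and floats that str() renders in exponent notation are still not handled:

```python
def group_digits(digits, sep='.', size=3):
    """Insert sep between groups of `size` digits, counting from the right."""
    # Length of the leftmost group: the remainder, or a full group if none.
    first = len(digits) % size or size
    groups = [digits[:first]]
    groups += [digits[i:i + size] for i in range(first, len(digits), size)]
    return sep.join(groups)

def group_num_plain(num):
    """German-style grouping for ints and floats without using re."""
    text = str(num)
    sign = '-' if text.startswith('-') else ''
    # partition() yields an empty tail when there is no decimal point.
    head, _, tail = text.lstrip('-').partition('.')
    result = sign + group_digits(head)
    return result + ',' + tail if tail else result
```

For the three examples above it gives the same strings as group_num.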

Fishwife answered 12/1, 2013 at 20:35 Comment(0)
