locale.getlocale() problems on OSX
M

6

3

I need the system locale for a number of things; ultimately I want to translate my app using gettext. I am going to distribute it on both Linux and OSX, but I ran into problems on OSX Snow Leopard. The first session below is from Linux, the second from OSX:

$ python
Python 2.5.2 (r252:60911, Jan  4 2009, 17:40:26) 
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL, '')
'sv_SE.UTF-8'
>>> locale.getlocale()
('sv_SE', 'UTF8')

$ python
Python 2.6.1 (r261:67515, Jul  7 2009, 23:51:51) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL, '')
'C'
>>> locale.getlocale()
(None, None)

Both systems are set to Swedish. On Linux, the environment variable LANG is already set to "sv_SE.UTF-8". If I pass that variable to python on OSX (LANG="sv_SE.UTF-8" python), the locale is detected nicely. But shouldn't locale.getlocale() be able to fetch whatever language the operating system has? I don't want to force users to set LANG, LC_ALL or any environment variable at all.

Output of locale command:

$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
Mescal answered 27/10, 2009 at 9:42 Comment(2)
what's your output of locale (in shell) in the same terminal window?Archaeopteryx
Added locale output to original post.Mescal
K
4

Odd. On OSX (Snow Leopard 10.6.1) I get:

$ python
Python 2.6.1 (r261:67515, Jul  7 2009, 23:51:51) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.  
>>> import locale
>>> locale.getlocale()
(None, None)
>>> locale.setlocale(locale.LC_ALL, '')
'en_GB.UTF-8'
>>> locale.getlocale()
('en_GB', 'UTF8')

Edit:

I just found this on the Apple Python mailing list.

Basically, it depends on what is set in your environment at run time (one of LANG, LANGUAGE, LC_ALL). I had LANG=en_GB.UTF-8 in my shell environment.
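
To see which variable is actually driving the result, a small check like the one below can help. It is only a sketch: first_locale_var is a made-up helper name, and the lookup order (LC_ALL, LC_CTYPE, LANG, LANGUAGE) is assumed from the default search path documented for locale.getdefaultlocale(), not from anything OSX-specific.

```python
import os

def first_locale_var(environ=None,
                     names=("LC_ALL", "LC_CTYPE", "LANG", "LANGUAGE")):
    """Return (name, value) of the first non-empty locale variable, or None.

    'environ' defaults to os.environ; pass a plain dict to experiment.
    """
    env = os.environ if environ is None else environ
    for name in names:
        value = env.get(name)
        if value:  # skip variables that are unset or set to ""
            return name, value
    return None

if __name__ == "__main__":
    print(first_locale_var())
```

If this prints None, nothing locale-related is set in the environment, which matches the 'C' / (None, None) result in the question.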

Kohl answered 27/10, 2009 at 10:29 Comment(4)
Strange. In the original post I was using iTerm, but if I use Terminal.app I get an error (ValueError: unknown locale: UTF-8). The locale looks weird: 'C/UTF-8/C/C/C/C'. Maybe my system is messed up somehow, but it's a fairly fresh install of Snow Leopard.Mescal
See my edit for why the change appears. Your system is not messed up (well, no more than all OSX python). Sorry, I should have added this when I edited.Kohl
I saw your link now, and from what I can gather, "it can't be done" since OSX doesn't make use of LANG or LC_ALL. I was intrigued by the __CF_USER_TEXT_ENCODING variable, but it seems kind of stupid to parse that. IMO getlocale() should call the appropriate APIs and parse the result for you, not rely on some environment variables.Mescal
@pojo, and you were rather right: this is a python bug, and looks like they might end up using the native locale APIs instead of the environment vars. Alas, still unfixed. bugs.python.org/issue18378Peeress
U
3

Looks like you can change locale by changing environment variable LC_ALL.

$ export LC_ALL=C
$ python
Python 2.5.1 (r251:54863, Feb  6 2009, 19:02:12) 
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getlocale()
(None, None)
>>> locale.setlocale(locale.LC_ALL, "")
'C'
>>> locale.getlocale()
(None, None)    

$ export LC_ALL=en_GB.UTF-8
$ python
Python 2.5.1 (r251:54863, Feb  6 2009, 19:02:12) 
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getlocale()
(None, None)
>>> locale.setlocale(locale.LC_ALL, "")
'en_GB.UTF-8'
>>> locale.getlocale()
('en_GB', 'UTF8')
Unparalleled answered 27/10, 2009 at 13:59 Comment(1)
But I don't see the point of having to set LC_ALL explicitly this way to get my application to detect language properly.Mescal
A
3

Admittedly a horrible hack, but I inserted this:

import platform

# ...

# XXX horrendous OS X invalid locale hack
if platform.system() == 'Darwin':
    import locale
    if locale.getlocale()[0] is None:
        locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')

at an early point in a program of mine. After that I could run my program with an unmodified shell environment on all OSes relevant to me (my program figures out the language to be used later in its processing anyway).
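
A slightly more defensive variant of the same idea is sketched below. It is an illustration, not a tested fix for every setup: init_locale is a made-up name, and the 'en_US.UTF-8' fallback is the same assumption as in the hack above. It tries the environment's locale first and only falls back when setlocale() fails or when Python cannot parse the result (the "unknown locale: UTF-8" ValueError mentioned in the comments).

```python
import locale

def init_locale(fallbacks=("en_US.UTF-8",)):
    """Try the environment's locale, then each fallback.

    Returns the detected language code (e.g. 'sv_SE'), or None if
    nothing usable was found and the plain "C" locale was kept.
    """
    for name in ("",) + tuple(fallbacks):
        try:
            locale.setlocale(locale.LC_ALL, name)
            # getlocale() raises ValueError when Python cannot parse
            # what the C library reports for LC_CTYPE.
            language, _encoding = locale.getlocale()
        except (locale.Error, ValueError):
            continue
        if language is not None:
            return language
    # Nothing worked: fall back to the always-available "C" locale.
    locale.setlocale(locale.LC_ALL, "C")
    return None

if __name__ == "__main__":
    print(init_locale())
```

Unlike the original snippet it is not limited to Darwin, so it also covers a Linux box where the requested locale is simply not installed.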

Apostolate answered 11/1, 2012 at 12:27 Comment(1)
+1. It is a horrible hack, but it is necessary given that this is a Python bug. (More details at my answer ;P)Peeress
Y
1

From here: Try adding or editing the ~/.profile or ~/.bash_profile file for it to correctly export your locale settings upon initiating a new session.

export LC_ALL=en_US.UTF-8  
export LANG=en_US.UTF-8
Ylla answered 31/7, 2013 at 9:38 Comment(1)
1) The locale vars might be set by the very terminal emulation program, so this is only hiding the problem. 2) The locale vars are already being set correctly, but python is not understanding them.Peeress
P
1

Old question, but this may help others: this is a Python bug that as of March 2016 is still unresolved in either Python 2 or 3: https://bugs.python.org/issue18378 .

The summary is that Python assumes GNU-like locales and balks on (POSIXly correct) divergences like those in BSD environments (such as OS X). And the UTF-8 locale exists in BSD, not in Linux, hence the problem.

As for solutions or debugging: the locale environment variables can be set by Terminal.app (see Preferences - Profiles - Advanced - International; similarly for iTerm or whatever you use). So one can find the locale environment variables set when in a terminal window, but find them NOT set when running a packaged application.

For some cases (like Sphinx in python 2.7 and 3.5 dying in OS X because of "ValueError: unknown locale: UTF-8"), disabling the preference checkbox to set locale environment variables is the solution.

But that can cause problems in other programs: if the locale vars are not set, bash 4.3 (from MacPorts) will complain at every prompt with "warning: setlocale: LC_CTYPE: cannot change locale (): No such file or directory" ...

So, given that the bug is in Python, the workaround should probably be done in the python program (as in @Jacob Oscarson's answer) or in the python invocation (by setting the locale vars to some adequate value).
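
For the "python invocation" route, one possible sketch is to build a cleaned-up environment and hand it to the child process. The helper name patched_locale_env and the 'en_US.UTF-8' default are assumptions for illustration; the only case it rewrites is a variable set to the bare string "UTF-8", which is the value behind the ValueError quoted above.

```python
import os

LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LANG")

def patched_locale_env(environ=None, default="en_US.UTF-8"):
    """Return a copy of the environment with usable locale variables.

    - a variable set to the bare encoding "UTF-8" is replaced by 'default'
    - if none of LC_ALL, LC_CTYPE, LANG is set, LANG is set to 'default'
    The original mapping is left untouched.
    """
    env = dict(os.environ if environ is None else environ)
    for name in LOCALE_VARS:
        if env.get(name) == "UTF-8":
            env[name] = default
    if not any(env.get(name) for name in LOCALE_VARS):
        env["LANG"] = default
    return env

if __name__ == "__main__":
    for name in LOCALE_VARS:
        print(name, patched_locale_env().get(name))
```

The returned dict can be passed as the env argument of subprocess.call() or subprocess.Popen() when launching the python program, so the user's shell configuration does not have to change.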

Peeress answered 1/3, 2016 at 12:29 Comment(0)
B
0

My setup:

$ system_profiler SPSoftwareDataType
Software:

    System Software Overview:

      System Version: macOS 11.4 (20F71)
      Kernel Version: Darwin 20.5.0

When the current language in System Preferences > Language & Region is English, then

Python 3.9.5 (v3.9.5:0a7dcbdb13, May  3 2021, 13:17:02) 
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getdefaultlocale()
(None, 'UTF-8')
>>> 

and the low-level shell command output

$ locale
LANG=""
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=

When the current language is Chinese, Simplified, then

$ py3
Python 3.9.5 (v3.9.5:0a7dcbdb13, May  3 2021, 13:17:02) 
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getdefaultlocale()
('zh_CN', 'UTF-8')
>>> 

and the low-level shell command output

$ locale
LANG="zh_CN.UTF-8"
LC_COLLATE="zh_CN.UTF-8"
LC_CTYPE="zh_CN.UTF-8"
LC_MESSAGES="zh_CN.UTF-8"
LC_MONETARY="zh_CN.UTF-8"
LC_NUMERIC="zh_CN.UTF-8"
LC_TIME="zh_CN.UTF-8"
LC_ALL=

Note that whenever we switch the system language in System Preferences, we must restart Terminal to see the differences.

Belamy answered 4/7, 2021 at 9:31 Comment(0)
