
classification
Title: locale fails if LANGUAGE has multiple locales
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: lemburg Nosy List: BreamoreBoy, ber, bernhard, lemburg, meatballhat, mixedpuppy, serhiy.storchaka, sorlov, vstinner
Priority: low Keywords: easy, patch

Created on 2005-03-07 19:11 by mixedpuppy, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
remove-support-for-LANGUAGE--in-locale.patch meatballhat, 2010-08-01 03:30 removes LANGUAGE from envvars kwarg, adds tests
Messages (13)
msg24492 - Author: mixedpuppy (mixedpuppy) Date: 2005-03-07 19:11
The locale module does not correctly handle the
LANGUAGE environment variable if it contains multiple
settings.  Example:

LANGUAGE="en_DK:en_GB:en_US:en"

Note, en_DK does not exist in locale_alias

In normalize, the colons are replaced with dots, which
is incorrect.  getdefaultlocale should separate these
first, then try each one until it finds one that works,
or fails on all.  

GLIBC documentation:
http://www.delorie.com/gnu/docs/glibc/libc_138.html

"While for the LC_xxx variables the value should
consist of exactly one specification of a locale the
LANGUAGE variable's value can consist of a colon
separated list of locale names."


Testing this is simple, just set your LANGUAGE
environment var to the above example, and use
locale.getdefaultlocale()

> export LANGUAGE="en_DK:en_GB:en_US:en"
> python
ActivePython 2.4 Build 244 (ActiveState Corp.) based on
Python 2.4 (#1, Feb  9 2005, 19:33:15)
[GCC 3.3.1 (SuSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for
more information.
>>> import locale
>>> locale.getdefaultlocale()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/opt/ActivePython-2.4/lib/python2.4/locale.py",
line 344, in getdefaultlocale
    return _parse_localename(localename)
  File "/opt/ActivePython-2.4/lib/python2.4/locale.py",
line 278, in _parse_localename
    raise ValueError, 'unknown locale: %s' % localename
ValueError: unknown locale: en_DK:en_GB:en_US:en
>>>
msg24493 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2005-03-10 15:43
The URL you gave does state that LANGUAGE can take multiple
entries separated by colons. However, I fail to see how to
choose the locale from the list of possibilities. Any ideas?
msg24494 - Author: Serge Orlov (sorlov) Date: 2005-03-10 18:48
The docs for getdefaultlocale state that it follows the GNU
gettext search path. OTOH gettext can return a result from any
of the catalogs en_DK:en_GB:en_US:en, depending on the content
of the message. So maybe getdefaultlocale should just pick
up the first value from LANGUAGE?
msg24495 - Author: mixedpuppy (mixedpuppy) Date: 2005-03-10 21:50
IMHO the proper behaviour is to split on the colon, then try
each one from start to finish until there is a success, or
all fail.  For example, if you just try en_DK, you will get
a failure since that is not in locale.locale_alias, but
en_GB or en_US would succeed.
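
A minimal sketch of the behaviour proposed here: split the LANGUAGE value on colons and return the first entry that passes a lookup. The function name first_usable_locale and the is_known callback are hypothetical, introduced only for illustration; they are not part of the locale module.

```python
def first_usable_locale(language_value, is_known):
    """Return the first colon-separated entry accepted by is_known, else None.

    Hypothetical helper: is_known is any callable that reports whether a
    locale name is recognised (for example a lookup in an alias table).
    """
    for candidate in language_value.split(":"):
        candidate = candidate.strip()
        # Skip empty entries (e.g. from "a::b") and unknown names.
        if candidate and is_known(candidate):
            return candidate
    return None


# Stand-in alias table for the example; the real table is
# locale.locale_alias and its contents vary between Python versions.
known = {"en_GB", "en_US"}
print(first_usable_locale("en_DK:en_GB:en_US:en", known.__contains__))
```

With this stand-in table, en_DK is skipped and en_GB is returned; if no entry is recognised the helper returns None instead of raising.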
msg24496 - Author: Bernhard Herzog (bernhard) Date: 2005-09-26 16:43
Another consequence of this bug is that even if
getdefaultlocale does not fail with an exception, it may
return an invalid value for the encoding.  E.g. one Thuban
user had

LANGUAGE=pt_BR:pt_PT:pt

getdefaultlocale did not raise an exception, but returned
"pt_pt" as the encoding because the normalized form of the
above value was pt_BR.pt_pt and the locale module assumes
that the part after the "." is the encoding.
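
The effect can be reproduced with a simplified model of the parsing. This is only an illustration under stated assumptions, not the code in locale.py: the helper naive_parse is hypothetical, and it skips the alias lookup and lowercasing that the real module performs.

```python
def naive_parse(value):
    """Mimic the faulty handling: colons become dots, and the text
    after the first dot is then taken to be the encoding."""
    normalized = value.replace(":", ".")
    language, _, rest = normalized.partition(".")
    # Only the segment directly after the first dot is kept.
    encoding = rest.split(".")[0] if rest else None
    return language, encoding


print(naive_parse("pt_BR:pt_PT:pt"))
```

The second locale name in the list ends up in the encoding slot, which is the same kind of result as the "pt_pt" encoding reported above.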
msg24497 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2005-09-26 18:23
The current CVS version returns this value:

>>> import locale
>>> locale.getdefaultlocale()
(None, None)

Given all the problems with the LANGUAGE environment variable
(which is a gettext()-only thing), I'm inclined to remove
support for it altogether.
msg24498 - Author: Bernhard Reiter (ber) (Python committer) Date: 2005-10-16 13:26
Hi Marc-Andre, 
 
do you mean that the current CVS version will return (None, None)
always or only for special LANGUAGE settings?
 
I do not have an overview of the other problems with the
LANGUAGE variable (from gettext), but adding support
for proper parsing of the colons, and testing the results, seems
a good thing to do from my perspective.
getdefaultlocale() will not be called often, and if additional information
can be used from the LANGUAGE variable, this will be beneficial to
applications.
 
Anyway, 
just my 0,02 Euro-Cents. 
 
Bernhard R. 
msg24499 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2005-10-17 09:30
Hi Bernhard,

sorry my last comment wasn't clear: you get this output if
you set the LANGUAGE variable to the example you gave
(LANGUAGE=pt_BR:pt_PT:pt).

The parsing order was changed, so that LANGUAGE is no longer
searched for first, but instead as last resort if the other
locale variables are not set.
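
A rough sketch of that lookup order, for illustration only. The function pick_locale_name is hypothetical and is not the code in locale.py; taking only the first colon-separated LANGUAGE entry follows the suggestion made earlier in this thread, and the "C" fallback is an assumption.

```python
def pick_locale_name(environ,
                     envvars=("LC_ALL", "LC_CTYPE", "LANG", "LANGUAGE")):
    """Return the first non-empty value found in environ, checking the
    variables in order, so LANGUAGE is consulted only as a last resort."""
    for variable in envvars:
        value = environ.get(variable)
        if value:
            if variable == "LANGUAGE":
                # Assumption: use only the first entry of the colon list.
                value = value.split(":")[0]
            return value
    return "C"  # assumed fallback when nothing is set


print(pick_locale_name({"LANG": "de_DE.UTF-8", "LANGUAGE": "pt_BR:pt_PT:pt"}))
print(pick_locale_name({"LANGUAGE": "pt_BR:pt_PT:pt"}))
```

Passing the environment as a plain mapping keeps the sketch testable without touching os.environ; os.environ itself can be passed in the same way.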
msg24500 - Author: Bernhard Reiter (ber) (Python committer) Date: 2005-10-17 12:06
Hi, 
 
using other information first seems to be a step forward to me. 
I just could not see this from the given example. 
 
But if LANGUAGE will be evaluated, will the colon be parsed correctly 
and the results tested? 
This seems to be the remainder of this bug. 
 
Bernhard R. 
msg112260 - Author: Dan Buch (meatballhat) Date: 2010-08-01 03:30
I first verified that the relevant parts of ``locale:getdefaultlocale`` have been unchanged since 2005-10-17.

I'm adding a patch to remove default support for the LANGUAGE variable, plus tests to assert that values like 'en_DK:en_GB:en_US' raise ValueError (and that getting the value from LC_ALL, LC_CTYPE, and LANG is still supported).

None of the logic for normalizing candidate env vars has been changed, so the questions about how values like 'en_DK:en_GB:en_US' are handled all still apply -- I've just operated under the assumption that such values will continue to raise ValueError.
msg125562 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-06 16:24
The initial problem (":" in the LANGUAGE variable) was fixed in an independent (?) issue (#1166938) by r39572.

If I understood correctly, locale.getdefaultlocale() is supposed to give the locale settings that will be active after the first call to locale.setlocale(locale.LC_ALL, ''). In this case, LANGUAGE should be ignored because it has no effect on the active locale. The variable is specific to the gettext library; it is not used by the locale machinery.

About remove-support-for-LANGUAGE--in-locale.patch: you should also update the documentation.
msg221816 - Author: Mark Lawrence (BreamoreBoy) * Date: 2014-06-28 21:17
The words here https://docs.python.org/3/library/locale.html#locale.getdefaultlocale read in part "envvars defaults to the search path used in GNU gettext; it must always contain the variable name 'LANG'.".  I think this means that envvars should always contain 'LANG', even if the default is not used, but the code doesn't seem to need that.  If somebody can clarify this for me I'll submit a new patch.
msg228193 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-10-02 09:38
It looks to me that this issue is already gone.

>>> import os, locale
>>> os.environ['LANGUAGE'] = 'en_DK:en_GB:en_US:en'
>>> locale.getdefaultlocale(['LANGUAGE'])
('en_DK', 'ISO8859-1')

'en_DK' was added in issue20079.
History
Date User Action Args
2022-04-11 14:56:10 admin set github: 41664
2017-03-07 19:25:11 serhiy.storchaka set status: pending -> closed
    resolution: out of date
    stage: patch review -> resolved
2014-10-02 09:38:54 serhiy.storchaka set status: open -> pending
    nosy: + serhiy.storchaka
    messages: + msg228193
2014-06-28 21:17:19 BreamoreBoy set nosy: + BreamoreBoy
    messages: + msg221816
    versions: + Python 3.4, Python 3.5, - Python 3.1, Python 3.2
2011-01-06 16:24:20 vstinner set nosy: lemburg, ber, bernhard, sorlov, mixedpuppy, vstinner, meatballhat
    messages: + msg125562
2011-01-06 15:54:59 pitrou set nosy: + vstinner
    stage: test needed -> patch review
2010-08-21 12:33:20 BreamoreBoy set versions: + Python 3.1, Python 2.7, - Python 3.3
2010-08-01 03:30:14 meatballhat set files: + remove-support-for-LANGUAGE--in-locale.patch
    versions: + Python 3.2, Python 3.3, - Python 2.6
    nosy: + meatballhat
    messages: + msg112260
    keywords: + patch
2009-04-22 14:37:09 ajaksu2 set keywords: + easy
2009-03-20 21:48:52 ajaksu2 set stage: test needed
    type: behavior
    versions: + Python 2.6, - Python 2.4
2005-03-07 19:11:05 mixedpuppy create