This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: strptime(): can't switch locales more than once
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.5
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: brett.cannon
Nosy List: brett.cannon, kovan, meonkeys
Priority: normal
Keywords:

Created on 2005-09-13 22:50 by meonkeys, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name | Uploaded | Description
switch_locales.py | meonkeys, 2005-09-13 22:57 | Demonstrates how locale switching after strptime() is called raises an exception on subsequent calls of strptime().
strptime_cache.diff | brett.cannon, 2007-03-28 18:39 | Possible patch to clear TimeRE cache when locale changes
strptime_timere_test.diff | brett.cannon, 2007-03-29 02:07 | Possible test for TimeRE instance being recreated if locale changes
Messages (14)
msg26277 - (view) Author: Adam Monsen (meonkeys) Date: 2005-09-13 22:50
After calling strptime() once, it appears that
subsequent efforts to modify the locale settings (so
that date strings in different locales can be parsed)
cause later strptime() calls to raise a ValueError.
I'm pasting everything here since spacing is irrelevant:

import locale, time
print locale.getdefaultlocale()        # ('en_US', 'utf')
print locale.getlocale(locale.LC_TIME) # (None, None)
# save old locale
old_loc = locale.getlocale(locale.LC_TIME)
locale.setlocale(locale.LC_TIME, 'nl_NL')
print locale.getlocale(locale.LC_TIME) # ('nl_NL', 'ISO8859-1')
# parse local date
date = '10 augustus 2005 om 17:26'
format = '%d %B %Y om %H:%M'
dateTuple = time.strptime(date, format)
# switch back to previous locale
locale.setlocale(locale.LC_TIME, old_loc)
print locale.getlocale(locale.LC_TIME) # (None, None)
date = '10 August 2005 at 17:26'
format = '%d %B %Y at %H:%M'
dateTuple = time.strptime(date, format)

The output I get from this script is:

('en_US', 'utf')
(None, None)
('nl_NL', 'ISO8859-1')
(None, None)
Traceback (most recent call last):
  File "switching.py", line 17, in ?
    dateTuple = time.strptime(date, format)
  File "/usr/lib/python2.4/_strptime.py", line 292, in strptime
    raise ValueError("time data did not match format:  data=%s  fmt=%s" %
ValueError: time data did not match format:  data=10 August 2005 at 17:26  fmt=%d %B %Y at %H:%M


One workaround I found is to manually clear the
regular expression cache in _strptime:

import _strptime
_strptime._cache_lock.acquire()
_strptime._TimeRE_cache = _strptime.TimeRE()
_strptime._regex_cache = {}
_strptime._cache_lock.release()

If I do all that, I can change the LC_TIME part of the
locale as many times as I choose.

If this isn't a bug, this should at least be in the
documentation for the locale module and/or strptime().
msg26278 - (view) Author: Adam Monsen (meonkeys) Date: 2005-09-13 22:57
I think there were some long lines in my code. Attaching
test case.
msg26279 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2005-09-15 02:42
OK, the problem was that the cache of locale-specific date
and time information was being invalidated and recreated,
but the regex cache was not being touched.  It has now been
fixed in rev. 1.41 for 2.5 and in rev. 1.38.2.3 for 2.4.

Thanks for reporting this, Adam.
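
To illustrate the kind of bug described in this message, here is a toy model, not the actual _strptime code. FormatCache, MONTHS, and the get_lang callback are all invented for this example; the point is only that compiled regexes embed locale-specific names, so they have to be discarded together with the locale data.

```python
import re

# Toy month tables standing in for real locale data (illustrative only).
MONTHS = {
    "en": ["January", "August"],
    "nl": ["januari", "augustus"],
}


class FormatCache:
    """Cache compiled regexes, discarding them when the language changes."""

    def __init__(self, get_lang):
        self._get_lang = get_lang  # callable returning the current language
        self._lang = get_lang()
        self._regexes = {}

    def compile(self, fmt):
        lang = self._get_lang()
        if lang != self._lang:
            # Without this clear(), a regex built from the old language's
            # month names would be reused after the language changed.
            self._lang = lang
            self._regexes.clear()
        if fmt not in self._regexes:
            months = "|".join(MONTHS[lang])
            pattern = fmt.replace("%B", "(?P<B>" + months + ")")
            self._regexes[fmt] = re.compile(pattern, re.IGNORECASE)
        return self._regexes[fmt]


current = ["nl"]  # mutable stand-in for the process-wide locale setting
cache = FormatCache(lambda: current[0])
print(bool(cache.compile("%B").fullmatch("augustus")))  # True
current[0] = "en"
print(bool(cache.compile("%B").fullmatch("August")))    # True
```

Removing the clear() call in this toy reproduces the reported symptom: the second lookup returns the Dutch regex and "August" no longer matches.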
msg26280 - (view) Author: kovan (kovan) Date: 2007-03-28 01:06
I think I'm having this issue with Python 2.5, as I can only make strptime take into account locale.setlocale() calls if I clear strptime's internal regexp cache between the calls to setlocale() and strptime().
msg26281 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2007-03-28 02:35
Can you show some code that recreates the problem?
msg26282 - (view) Author: kovan (kovan) Date: 2007-03-28 07:06
This is the code:

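import locale
from datetime import datetime
from time import strptime

def parseTime(strTime, format="%a %b %d %H:%M:%S"):  # example: Mon Aug 7 21:08:52
    locale.setlocale(locale.LC_TIME, ('en_US', 'UTF8'))
    # Prepend the current year, since the input string does not carry one.
    format = "%Y " + format
    strTime = str(datetime.now().year) + " " + strTime

    # Workaround: reset strptime's internal caches so the new locale is used.
    import _strptime
    _strptime._cache_lock.acquire()
    _strptime._TimeRE_cache = _strptime.TimeRE()
    _strptime._regex_cache = {}
    _strptime._cache_lock.release()

    parsed = strptime(strTime, format)
    return datetime(*parsed[0:6])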


If I remove the code that clears the cache and add a "print format_regex.pattern" statement to _strptime.py after "format_regex = time_re.compile(format)", I get:

(?P<Y>\d\d\d\d)\s*(?P<a>mié|sáb|lun|mar|jue|vie|dom)\s*(?P<b>ene|feb|mar|abr|may|jun|jul|ago|sep|oct|nov|dic)\s*(?P<d>3[0-1]|[1-2]\d|0[1-9]|[1-9]| [1-9])\s*(?P<H>2[0-3]|[0-1]\d|\d):(?P<M>[0-5]\d|\d):(?P<S>6[0-1]|[0-5]\d|\d)

which is in my system's locale (es), but it should be in English.
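
A pattern like this can also be inspected without editing _strptime.py. The following is a sketch in Python 3 syntax, assuming the private _strptime.TimeRE class still offers the compile() method referred to above; pattern_for is a hypothetical helper name.

```python
import _strptime  # private CPython module; not a public API


def pattern_for(fmt):
    """Return the regex source that a fresh TimeRE builds for fmt.

    A fresh TimeRE reads day and month names from the current LC_TIME
    locale, so the result reflects the locale in effect at call time
    rather than whatever strptime has cached.
    """
    return _strptime.TimeRE().compile(fmt).pattern


print(pattern_for("%Y %a"))
```

Comparing this output with what strptime actually uses is one way to tell whether a stale cached pattern is involved.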
msg26283 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2007-03-28 18:39
kovan, can you please apply the patch I have uploaded to your copy of _strptime and let me know if that fixes it?  I am on OS X, and switching locales doesn't work for me, so I don't have an easy way to test this.
File Added: strptime_cache.diff
msg26284 - (view) Author: kovan (kovan) Date: 2007-03-28 22:55
I applied the patch, and it works now :). 
Thanks bcannon for the quick responses.
msg26285 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2007-03-28 23:40
The power of procrastination in the morning.  =)  I am going to try to come up with a test case for this.  I might ask, kovan, if you can run the test case to make sure it works.
msg26286 - (view) Author: kovan (kovan) Date: 2007-03-28 23:44
I'll be glad to help in whatever I can.
msg26287 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2007-03-29 02:07
I have uploaded a patch for test_strptime that adds a test to make sure that the TimeRE instance is recreated if the locale changes (went with en_US and de_DE, but could easily be other locales if there are other ones that are more common).  Let me know if the test runs fine and works.  Even better is if it fails without the fix.
File Added: strptime_timere_test.diff
msg26288 - (view) Author: kovan (kovan) Date: 2007-03-29 16:53
I've been looking at the test case, and I noticed that it isn't actually checking anything, because locale.getlocale(locale.LC_TIME) returns (None, None), which is fine and just means that the default locale (the C locale, not the system locale) is in use.
After removing that 'if', I also replaced de_DE with es_ES to fit my system, and replaced strptime('10', '%d') with strptime('Fri', '%a') and strptime('vie', '%a'), because '10' is '10' in all Western languages, so the test would not fail when the wrong locale is in use.

Once I made these changes to the test case, it successfully failed when using the non-patched _strptime.py, AND ran ok when using the patched version.

This is the test case I ended up using:



    def test_TimeRE_recreation(self):
        # The TimeRE instance should be recreated upon changing the locale.
        locale_info = locale.getlocale(locale.LC_TIME)
        locale.setlocale(locale.LC_TIME, ('en_US', 'UTF8'))
        try:
            _strptime.strptime('Fri', '%a')
            first_time_re_id = id(_strptime._TimeRE_cache)
            locale.setlocale(locale.LC_TIME, ('es_ES', 'UTF8'))
            _strptime.strptime('vie', '%a')
            second_time_re_id = id(_strptime._TimeRE_cache)
            self.failIfEqual(first_time_re_id, second_time_re_id)
        finally:
            locale.setlocale(locale.LC_TIME, locale_info)
msg26289 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2007-03-29 20:03
The test was checking that the TimeRE instance is recreated when the locale changes.  You do have a valid point about the 'if' check; I should have put the setlocale call in a try/except block and just returned if an exception was raised.

As for the %d usage of strptime, that is just to force a call into strptime and thus trigger the new instance of TimeRE.  That is why the test checks the id of the objects; don't really care about strptime directly failing.  Did the test not fail properly even when you removed the 'if' but left everything else alone?
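
A sketch of how such a test could cope with missing locales, along the lines suggested above: attempt setlocale and skip when it fails. It is written in Python 3 syntax with unittest. The locale names en_US.UTF-8 and es_ES.UTF-8 are assumptions that may not exist on a given system, and _set_locale_or_skip and _force_strptime are hypothetical helpers. It keeps references to the TimeRE objects instead of comparing id() values, since an id can be reused once an object is freed.

```python
import locale
import time
import unittest

import _strptime  # private CPython module; not a public API


class TimeRERecreationTest(unittest.TestCase):
    def _set_locale_or_skip(self, name):
        try:
            locale.setlocale(locale.LC_TIME, name)
        except locale.Error:
            self.skipTest("locale %r is not available" % name)

    def _force_strptime(self):
        # Round-trip a weekday name produced by the current locale, so the
        # input is always valid for whichever locale is active.
        text = time.strftime("%a", time.gmtime(0))
        time.strptime(text, "%a")

    def test_TimeRE_recreation(self):
        saved = locale.setlocale(locale.LC_TIME)  # query only, no change
        try:
            self._set_locale_or_skip("en_US.UTF-8")
            self._force_strptime()
            first = _strptime._TimeRE_cache
            self._set_locale_or_skip("es_ES.UTF-8")
            self._force_strptime()
            second = _strptime._TimeRE_cache
            # The cached TimeRE should have been replaced for the new locale.
            self.assertIsNot(first, second)
        finally:
            locale.setlocale(locale.LC_TIME, saved)
```

On a system without both locales the test is reported as skipped rather than passing silently.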
msg26290 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2007-04-08 04:18
r54646 (along with a fix for the test in r54647) has the change.  This is in the trunk.  I went with my test since if that test is wrong my whole understanding of how time.strptime works is wrong in terms of caching.  =)

I will backport to the 2.5 branch once 2.5.1 is done, since I missed the deadline.  Thanks, kovan, for all the help.
History
Date | User | Action | Args
2022-04-11 14:56:13 | admin | set | github: 42370
2005-09-13 22:50:55 | meonkeys | create |