classification
Title: calendar.weekheader(n): n should mean chars not bytes
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.3

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: doerwalter, hyeshik.chang, leorochael, loewis
Priority: high
Keywords:

Created on 2004-05-04 18:38 by leorochael, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
diff.txt doerwalter, 2004-07-21 19:17
calendar.diff doerwalter, 2006-03-31 15:14
calendar2.diff doerwalter, 2006-03-31 17:36
calendar3.diff doerwalter, 2006-03-31 17:45
Messages (10)
msg20692 - Author: Leonardo Rochael Almeida (leorochael) Date: 2004-05-04 18:38
calendar.weekheader(n) is locale aware, which is good
in principle. The parameter n, however, is interpreted
as meaning bytes, not chars, which can generate broken
strings for, e.g. localized weekday names:

>>> calendar.weekheader(2)
'Mo Tu We Th Fr Sa Su'
>>> locale.setlocale(locale.LC_ALL, "pt_BR.UTF-8")
'pt_BR.UTF-8'
>>> calendar.weekheader(2)
'Se Te Qu Qu Se S\xc3 Do'

Notice how "Sábado" (Saturday) above is missing the
second utf-8 byte for the encoding of "á":

>>> u"Sá".encode("utf-8")
'S\xc3\xa1'

The implementation of weekheader (and of all of
calendar.py, it seems) is based on localized 8-bit
strings. I suppose the correct fix for this bug will
involve a round trip through unicode.
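
The truncation described above can be reproduced without any locale setup. This is a minimal sketch in Python 3 (where str is already unicode), not the calendar module's code; the variable names are only illustrative. It cuts the same weekday name once by bytes and once by characters:

```python
# Python 3 sketch: slicing bytes vs. slicing characters.
name = "Sábado"
encoded = name.encode("utf-8")

byte_cut = encoded[:2]   # first two BYTES: splits the two-byte "á"
char_cut = name[:2]      # first two CHARACTERS: keeps "á" whole

try:
    byte_cut.decode("utf-8")
    byte_cut_valid = True
except UnicodeDecodeError:
    # The lone lead byte 0xC3 is not a complete UTF-8 sequence.
    byte_cut_valid = False

print(byte_cut, char_cut, byte_cut_valid)
```

Decoding first, slicing the decoded text, and encoding again is the round trip mentioned above.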
msg20693 - Author: Hyeshik Chang (hyeshik.chang) * (Python committer) Date: 2004-05-07 23:57
I think the n in calendar.weekheader(n) should mean neither
chars nor bytes, but display width, because the function is
currently used for fixed-width representations of calendars.
Yes, the two are the same for Western alphabets. But many CJK
characters are full width, so calendar.weekheader(2) needs only
1 such character; that is the convention in real life, too.

But we don't have unicode.width() support to implement the
feature yet.
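
A rough sketch of what width-based truncation could look like, using unicodedata.east_asian_width() from today's standard library. The helper names char_width, display_width and truncate_to_width are hypothetical, and the sketch assumes that only the "W" (wide) and "F" (fullwidth) classes take two columns; it ignores combining characters and ambiguous-width characters.

```python
import unicodedata

def char_width(ch):
    """Assumed column count: 2 for wide/fullwidth characters, else 1."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def display_width(text):
    """Approximate number of terminal columns the text occupies."""
    return sum(char_width(ch) for ch in text)

def truncate_to_width(text, width):
    """Keep leading characters while they fit into `width` columns."""
    kept = []
    used = 0
    for ch in text:
        w = char_width(ch)
        if used + w > width:
            break
        kept.append(ch)
        used += w
    return "".join(kept)

print(truncate_to_width("Saturday", 2))  # two narrow characters
print(truncate_to_width("土曜日", 2))     # one full-width character
```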
msg20694 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2004-06-02 19:08
Maybe we should have a second version of calendar (named 
ucalendar?) that works with unicode strings? Could those two 
modules be rewritten to use as much common functionality as 
possible? Or we could use a module global to configure 
whether str or unicode should be returned?

Most of the localization functionality in calendar seems to 
come from datetime.datetime.strftime(), so it probably would 
help to have a method datetime.datetime.ustrftime() that 
returns the formatted string as unicode (using the locale 
encoding).

Assigning to MvL as the locale/unicode expert.
msg20695 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-06-03 04:43
Adding a ucalendar module would be reasonable, IMO.
Introducing ustrftime is not necessary - we could just apply
the "unicode in/unicode out" procedure (i.e. if the format
is a Unicode string, return a Unicode result). The tricky
part of that is to convert the strftime result to Unicode.
We could try mbstowcs, but that would fail if the locale
doesn't use Unicode for wchar_t.

Once ucalendar is written, we could document that the
calendar module has known problems if the locale's encoding
is not Latin-1.

However, I'm not going to implement that any time soon, so
unassigning.
msg20696 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2004-07-21 19:17
The following patch doesn't fix the unicode problem, but it
should enable us to have both 8-bit and unicode calendars. It
reimplements the calendar functionality as classes. This
makes it possible to reuse the date calculation logic and
extend or replace the string formatting logic. Implementing a
unicode version would be done by subclassing TextCalendar
and overriding formatweekday() and formatmonthname().

The patch adds several other features:

An HTML version of a calendar can be output. (An example
output can be found at
http://styx.livinglogic.de/~walter/calendar/calendar.html).

The calendar module can be used as a script from the 
command line. Various options are available.

It's possible to specify the number of months per row (they 
were fixed at 3 in the old version).

If this patch is accepted I can provide documentation and 
tests.
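
The subclassing approach described above can be sketched against the TextCalendar class as it exists in today's standard library. The class name PortugueseTextCalendar and the hard-coded PT_BR_DAYS list are assumptions made for illustration and are not part of the patch. Only formatweekday() is overridden, so the date logic is inherited unchanged:

```python
import calendar

# Hypothetical fixed list of names, Monday first (calendar's default order).
PT_BR_DAYS = ["Segunda", "Terça", "Quarta", "Quinta",
              "Sexta", "Sábado", "Domingo"]

class PortugueseTextCalendar(calendar.TextCalendar):
    """Illustrative subclass: only the weekday formatting is replaced."""

    def formatweekday(self, day, width):
        # Slice by characters (str in Python 3), then pad to the column width.
        return PT_BR_DAYS[day][:width].center(width)

cal = PortugueseTextCalendar()
header = cal.formatweekheader(2)
print(header)
```

formatweekheader() calls formatweekday() once per weekday, so the override is enough to change the header.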
msg20697 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2006-03-31 15:14
Here's a new version of the patch with documentation for the
Calendar classes and a new test. The script interface isn't
documented in the TeX file (python -mcalendar --help should
be enough).
msg20698 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2006-03-31 17:11
Checked in calendar.diff as r43483.
msg20699 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2006-03-31 17:36
This second patch (calendar2.diff) adds new subclasses
LocaleTextCalendar and LocaleHTMLCalendar that output
localized month and weekday names and can cope with encodings.
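
For reference, a hedged usage sketch of LocaleTextCalendar as it is available in the current standard library. The helper localized_week_header is hypothetical. Whether a locale such as pt_BR.UTF-8 is installed depends on the system, so the sketch returns None instead of failing when it is missing:

```python
import calendar
import locale

def localized_week_header(locale_name, width):
    """Return the week header for `locale_name`, or None if unavailable."""
    try:
        cal = calendar.LocaleTextCalendar(locale=locale_name)
        return cal.formatweekheader(width)
    except locale.Error:
        # The requested locale is not installed on this system.
        return None

print(localized_week_header("C", 2))
print(localized_week_header("pt_BR.UTF-8", 2))  # None if the locale is missing
```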
msg20700 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2006-03-31 17:45
This third patch (calendar3.diff) is a variant of the
second patch that uses xmlcharrefreplace error handling in
the HTML calendar.
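
What xmlcharrefreplace does can be shown in isolation. This is a minimal sketch of the codec error handler itself, not of the patch: characters that the target encoding cannot represent are replaced by numeric character references, which an HTML page can still display.

```python
name = "Sábado"

# ASCII cannot represent "á" (U+00E1, decimal 225), so the error handler
# substitutes a numeric character reference instead of raising an error.
html_bytes = name.encode("ascii", "xmlcharrefreplace")
print(html_bytes)

# An encoding that can represent the character leaves it untouched.
latin1_bytes = name.encode("latin-1", "xmlcharrefreplace")
print(latin1_bytes)
```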
msg20701 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2006-04-01 07:57
Checked in calendar3.diff (minus the test) as r43531.
History
Date User Action Args
2022-04-11 14:56:04 admin set github: 40218
2004-05-04 18:38:31 leorochael create