classification
Title: MacRoman Encoding Bug (OHM vs. OMEGA)
Type: Stage:
Components: Unicode Versions: Python 2.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: lemburg Nosy List: lemburg, seanbpalmer
Priority: normal Keywords:

Created on 2005-12-16 02:22 by seanbpalmer, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (2)
msg27087 - Author: Sean B. Palmer (seanbpalmer) Date: 2005-12-16 02:22
The file encodings/mac_roman.py in Python 2.4.1
contains the following incorrect character definition
on line 96: 

        0x00bd: 0x2126, # OHM SIGN

This should read: 

        0x00bd: 0x03A9, # GREEK CAPITAL LETTER OMEGA

Presumably this bug occurred due to a misreading, given
that OHM and OMEGA have the same glyph. Evidence that
the OMEGA interpretation is correct: 

0xBD   0x03A9   # GREEK CAPITAL LETTER OMEGA
-http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/ROMAN.TXT
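
The distinction between the two characters can also be checked against Python's own Unicode database. What follows is a minimal sketch, assuming a Python 3 interpreter (the report itself targets Python 2.4); the variable names are illustrative only and do not come from the report.

```python
import unicodedata

# The two code points involved in this report.
OHM = "\u2126"
OMEGA = "\u03a9"

# They are separate characters with separate names...
ohm_name = unicodedata.name(OHM)
omega_name = unicodedata.name(OMEGA)

# ...but OHM SIGN is canonically equivalent to GREEK CAPITAL LETTER OMEGA,
# so NFC normalization maps the former onto the latter.
normalized = unicodedata.normalize("NFC", OHM)

print(ohm_name)
print(omega_name)
print(normalized == OMEGA)
```

Because the two are canonically equivalent, normalizing text to NFC before encoding is one way to make sure an OHM SIGN ends up as the OMEGA that the Apple table maps.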

Further evidence can be found by Googling for MacRoman
tables. This bug means that, for example, the following
code gives a UnicodeEncodeError when it shouldn't do: 

>>> u'\u03a9'.encode('macroman')

For a workaround, I've been using the following code: 

>>> import codecs
>>> from encodings import mac_roman
>>> mac_roman.decoding_map[0xBD] = 0x03A9
>>> mac_roman.encoding_map = codecs.make_encoding_map(mac_roman.decoding_map)

And then, to use the example above: 

>>> u'\u03a9'.encode('macroman')
'\xbd'
>>> 

Thanks,

-- 
Sean B. Palmer
msg27088 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2005-12-16 14:47
This has been fixed in CVS and Python 2.5 will include the fix. 

A backport is not possible, because we've changed the way
charmap codecs work in 2.5.
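
On a current interpreter the corrected mapping can be confirmed directly. This is a sketch assuming Python 3, where the mac_roman module exposes a 256-character `decoding_table` string rather than the `decoding_map` dictionary used in the workaround above. It uses `codecs.charmap_build` and `codecs.charmap_encode`, which the standard charmap codec modules call internally; the variable names are hypothetical.

```python
import codecs
from encodings import mac_roman

OMEGA = "\u03a9"  # GREEK CAPITAL LETTER OMEGA
OHM = "\u2126"    # OHM SIGN

# Byte 0xBD should now decode to OMEGA, and OMEGA should encode to 0xBD.
table_entry = mac_roman.decoding_table[0xBD]
encoded = OMEGA.encode("mac_roman")
decoded = b"\xbd".decode("mac_roman")

# The encoding table is derived from the decoding table, so correcting
# the decoding entry is enough to correct both directions.
encoding_table = codecs.charmap_build(mac_roman.decoding_table)
raw, consumed = codecs.charmap_encode(OMEGA, "strict", encoding_table)

# OHM SIGN is not part of the MacRoman repertoire, so it is not encodable.
try:
    OHM.encode("mac_roman")
    ohm_encodable = True
except UnicodeEncodeError:
    ohm_encodable = False

print(encoded, decoded == OMEGA, ohm_encodable)
```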
History
Date                 User         Action  Args
2022-04-11 14:56:14  admin        set     github: 42700
2005-12-16 02:22:35  seanbpalmer  create