Issue1572832
Created on 2006-10-07 18:00 by chasonr, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
iso2022-patch.txt | chasonr, 2006-10-07 18:07 |
Messages (5) | |||
---|---|---|---|
msg51211 - (view) | Author: Ray Chason (chasonr) | Date: 2006-10-07 18:00 | |
This may relate to bug report 1005078, which was closed because it couldn't be duplicated with the information given.

Run the following program for a segmentation fault on your Python interpreter:

--CUT HERE--CUT HERE--CUT HERE--CUT HERE--CUT HERE--CUT HERE--CUT HERE--
import sys

for x in xrange(0x10000, 0x110000):
    if sys.maxunicode >= 0x10000:
        ch = unichr(x)
    else:
        ch = unichr(0xD7C0+(x>>10)) + unichr(0xDC00+(x & 0x3FF))
    try:
        # Any ISO 2022 codec will cause the segfault
        ch.encode("iso_2022_jp")
    except UnicodeEncodeError:
        pass
--CUT HERE--CUT HERE--CUT HERE--CUT HERE--CUT HERE--CUT HERE--CUT HERE--

I have verified this bug on four different Pythons:

* The current ActivePython (2.4.3 based), running on Windows XP SP2
* The stock Python 2.4.2 on Ubuntu Breezy (i386)
* The stock Python 2.4.2 on Ubuntu Breezy (AMD64)
* A home-built Python 2.5 on Ubuntu Breezy (i386); --enable-unicode=ucs4 is selected and other options are left at default

It does not just affect iso_2022_jp, but all of the ISO 2022 codecs.

If you are attempting to replicate the bug on Linux, you may get more repeatable results if you first go root and then:

echo 0 > /proc/sys/kernel/randomize_va_space

This seems related to bug report 1005078. However, bug report 1005078 claimed that a character in the BMP could cause a crash. I have not reproduced that bug using a BMP character; however, supplementary characters can in fact cause the ISO 2022 codecs to crash.

The problem is that four functions in Modules/cjkcodecs/_codecs_iso2022.c do not check that the code point is less than 0x10000 before invoking the TRYMAP_ENC macro. This causes the bounds of the encoding table to be exceeded. The four functions are:

* ksx1001_encoder
* jisx0208_encoder
* jisx0212_encoder
* gb2312_encoder

The enclosed patch adds the necessary checks, and the above program then completes without incident.

It is derived from the official 2.5 release, but also applies cleanly against the daily drop of 6 October 2006 because the file Modules/cjkcodecs/_codecs_iso2022.c is unchanged in that drop. |
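The missing range check described above can be illustrated with a toy model in Python. This is only a sketch under stated assumptions, not the C code from the patch: `ENCODE_TABLE`, `lookup_unchecked` and `lookup_checked` are hypothetical names, and the table layout (one slot per high byte of a 16-bit code point) is an assumed simplification of what TRYMAP_ENC indexes. In C an out-of-range index silently reads past the table; Python raises IndexError instead, which makes the overrun visible.

```python
# Toy model of a BMP-only encoding table; NOT the real cjkcodecs data.
# Assumption: one slot per high byte of a 16-bit code point.
TABLE_SLOTS = 0x100  # 256 slots cover U+0000..U+FFFF

# Hypothetical table: each slot maps a low byte to an encoded value.
ENCODE_TABLE = [dict() for _ in range(TABLE_SLOTS)]
ENCODE_TABLE[0x30][0x42] = 0x2422  # U+3042 HIRAGANA LETTER A in JIS X 0208


def lookup_unchecked(code_point):
    """Index the table with no range check (models the reported bug)."""
    slot = ENCODE_TABLE[code_point >> 8]  # out of range for >= 0x10000
    return slot.get(code_point & 0xFF)


def lookup_checked(code_point):
    """Reject supplementary code points before touching the table."""
    if code_point >= 0x10000:
        return None  # treated as "no mapping", like an unmapped BMP char
    return ENCODE_TABLE[code_point >> 8].get(code_point & 0xFF)
```

With the guard in place, a supplementary code point such as 0x10000 is reported as unmapped rather than indexing slot 0x100, which does not exist in this model.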
|||
msg51212 - (view) | Author: Ray Chason (chasonr) | Date: 2006-10-07 18:07 | |
Logged In: YES user_id=421946 There's no uploaded file! You have to check the checkbox labeled "Check to Upload & Attach File" when you upload a file. In addition, even if you *did* check this checkbox, a bug in SourceForge prevents attaching a file when *creating* an issue. Please try again. (This is a SourceForge annoyance that we can do nothing about. :-( ) |
|||
msg51213 - (view) | Author: Ray Chason (chasonr) | Date: 2006-10-07 18:07 | |
Logged In: YES user_id=421946 The upload seems to have quietly failed to work. Also, the indents got mashed on that test program, and we all know how important indents are to Python. Here it is again, with the test program prefixed this time. |
|||
msg51214 - (view) | Author: Neal Norwitz (nnorwitz) * | Date: 2006-10-08 00:53 | |
Logged In: YES user_id=33168 Thanks for the report. Perky, could you take a look at this patch? I don't know if it's correct or not. |
|||
msg51215 - (view) | Author: Hyeshik Chang (hyeshik.chang) * | Date: 2006-10-08 14:05 | |
Logged In: YES user_id=55188 The patch is correct. Thanks for the report! Applied in svn: r52223 for trunk, r52224 for 2.4, r52225 for 2.5. |
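For anyone re-checking the fix on a current interpreter, the reporter's loop can be ported to Python 3 roughly as follows. This is a sketch, not part of the applied patch: `probe_codec` and `ISO_2022_CODECS` are hypothetical names, and the `step` parameter is an addition for sampling the range quickly. `chr` and `range` replace `unichr` and `xrange`, and the surrogate-pair branch is dropped because Python 3 strings always hold full code points.

```python
# Python 3 port of the reporter's test loop (a sketch, not the original).
# The codec names below are the ISO 2022 codecs shipped with CPython.
ISO_2022_CODECS = (
    "iso2022_jp", "iso2022_jp_1", "iso2022_jp_2", "iso2022_jp_2004",
    "iso2022_jp_3", "iso2022_jp_ext", "iso2022_kr",
)


def probe_codec(codec, step=1):
    """Encode supplementary code points; return (encoded, rejected) counts."""
    encoded = rejected = 0
    for x in range(0x10000, 0x110000, step):
        try:
            chr(x).encode(codec)
            encoded += 1
        except UnicodeEncodeError:
            # Expected outcome for code points the codec cannot represent.
            rejected += 1
    return encoded, rejected


if __name__ == "__main__":
    # step=0x101 samples the range; use step=1 for the exhaustive run.
    for codec in ISO_2022_CODECS:
        print(codec, probe_codec(codec, step=0x101))
```

On a fixed interpreter every call returns normally; a crash instead of a UnicodeEncodeError would indicate the table overrun described in the report.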
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:20 | admin | set | github: 44097 |
2006-10-07 18:00:20 | chasonr | create |