Issue1728403
Created on 2007-05-30 15:36 by tsuraan3, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Messages (5) | |||
---|---|---|---|
msg32147 - (view) | Author: tsuraan (tsuraan3) | Date: 2007-05-30 15:36 | |
Python enters some sort of infinite loop when attempting to read data from a malformed file that is big5 encoded (using the codecs library). This behaviour can be observed under Linux and FreeBSD, using Python 2.4 and 2.5. A really simple example illustrating the bug follows:

    Python 2.4.4 (#1, May 15 2007, 13:33:55)
    [GCC 4.1.1 (Gentoo 4.1.1-r3)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import codecs
    >>> fname='out'
    >>> outfd=open(fname,'w')
    >>> outfd.write(chr(243))
    >>> outfd.close()
    >>>
    >>> infd= codecs.open(fname, encoding='big5')
    >>> infd.read(1024)

And then, it hangs forever. If I instead use the following code:

    Python 2.5 (r25:51908, Jan 8 2007, 19:09:28)
    [GCC 3.4.5 (Gentoo 3.4.5-r1, ssp-3.4.5-1.0, pie-8.7.9)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import codecs, signal
    >>> fname='out'
    >>> def handler(*args):
    ...     raise Exception("boo!")
    ...
    >>> signal.signal(signal.SIGALRM, handler)
    0
    >>> outfd=open(fname, 'w')
    >>> outfd.write(chr(243))
    >>> outfd.close()
    >>>
    >>> infd=codecs.open(fname, encoding='big5')
    >>> signal.alarm(5)
    0
    >>> infd.read(1024)

The program still hangs forever. The program can be made to crash if I don't install a signal handler at all, but that's pretty lame. It looks like the entire interpreter is being locked up by this read, so I don't think there's likely to be a pure-python workaround, but I thought it would be a good bug to have out there so a future version of python can (hopefully) fix this. |
|||
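For readers on a current interpreter, the report above can be restated as a small check. This is only a sketch in Python 3 syntax, not part of the original report: the helper name `read_truncated_big5` is hypothetical, a temporary file stands in for `out`, and `codecs.getreader("big5")` is used to reach the same big5 stream reader that `codecs.open(..., encoding='big5')` wraps. On an interpreter that contains the fix, the read is expected to fail with a decode error rather than hang.

```python
import codecs
import os
import tempfile


def read_truncated_big5():
    """Write a lone big5 lead byte to a temp file and read it back with a size hint."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(b"\xf3")  # same byte as chr(243) in the report
        with open(path, "rb") as raw:
            # Hypothetical stand-in for codecs.open(path, encoding='big5')
            reader = codecs.getreader("big5")(raw)
            return reader.read(1024)
    finally:
        os.remove(path)


if __name__ == "__main__":
    try:
        read_truncated_big5()
    except UnicodeDecodeError as exc:
        print("decode error instead of a hang:", exc.reason)
```

On an unfixed 2.4/2.5 build the equivalent read would never return, so a check like this should only be run where a hang is acceptable or guarded by a timeout.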
msg32148 - (view) | Author: Neal Norwitz (nnorwitz) * | Date: 2007-05-31 04:49 | |
Hye-Shik, could you take a look at this? There's an infinite loop in mbstreamreader_iread() in Modules/cjkcodecs/multibytecodec.c: rsize == 1 on each iteration. I don't know if there are more places that might have this problem. |
|||
msg32149 - (view) | Author: Neal Norwitz (nnorwitz) * | Date: 2007-05-31 04:51 | |
Bumping the priority since this is about as bad as a crash. |
|||
msg32150 - (view) | Author: Hyeshik Chang (hyeshik.chang) * | Date: 2007-06-05 19:31 | |
Thank you for the report, tsuraan, and thank you for the investigation, Neal. The bug is related to a logic that detects whether file reached end of file. I verified that any other part of CJKCodecs has such a logic. Fixed and committed in SVN. trunk 55770, release25-maint 55774, release24-maint 55772. |
|||
msg32151 - (view) | Author: Hyeshik Chang (hyeshik.chang) * | Date: 2007-06-05 19:34 | |
in my comment: > The bug is related to a logic that detects whether file reached end of file. I verified that any other part of CJKCodecs has such a logic. I meant "no part". sorry. :) |
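The two observations in this thread ("rsize == 1 each iteration" and "a logic that detects whether file reached end of file") can be illustrated with a toy model. This is a sketch under stated assumptions, not the C code from Modules/cjkcodecs/multibytecodec.c: it assumes the reader prepends undecoded bytes to each new read and that the faulty end-of-file test looked at the size after that concatenation. The names `iread_model`, `decode_model`, and `max_iterations` are hypothetical.

```python
import io


def decode_model(data):
    """Pretend decoder: bytes >= 0x80 need one trail byte.

    Returns (number of characters decoded, leftover undecoded bytes).
    """
    count, i = 0, 0
    while i < len(data):
        width = 2 if data[i] >= 0x80 else 1
        if i + width > len(data):
            break  # incomplete sequence at the end stays pending
        count += 1
        i += width
    return count, data[i:]


def iread_model(stream, sizehint, fixed=True, max_iterations=10):
    """Toy read loop; returns the iteration count on success.

    Raises RuntimeError if the loop makes no progress (the modelled hang,
    capped at max_iterations so the demo terminates).
    """
    pending = b""
    for iteration in range(1, max_iterations + 1):
        chunk = stream.read(sizehint)
        end_of_file = len(chunk) == 0  # measured BEFORE adding pending bytes
        data = pending + chunk
        rsize = len(data)              # measured AFTER adding pending bytes

        # Assumed faulty test: rsize never reaches 0 while a byte is pending.
        at_eof = end_of_file if fixed else rsize == 0
        if at_eof:
            if data:
                raise UnicodeDecodeError(
                    "big5", data, 0, len(data), "incomplete multibyte sequence"
                )
            return iteration

        decoded, pending = decode_model(data)
        if decoded:
            return iteration
        sizehint = 1  # nothing decoded yet: read one more byte and retry
    raise RuntimeError("no progress after %d iterations" % max_iterations)


if __name__ == "__main__":
    for fixed in (False, True):
        try:
            iread_model(io.BytesIO(b"\xf3"), 1024, fixed=fixed)
        except (RuntimeError, UnicodeDecodeError) as exc:
            print("fixed=%s ->" % fixed, type(exc).__name__)
```

In this model the lone lead byte stays pending forever, so a size check taken after concatenation always sees one byte, while a check on the freshly read chunk sees the empty read and can report the incomplete sequence.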
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:24 | admin | set | github: 45015 |
2007-05-30 15:36:03 | tsuraan3 | create |