classification
Title: buffer overrun in repr() for unicode strings
Type:
Stage:
Components: Interpreter Core
Versions: Python 2.4
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: nnorwitz
Nosy List: alexanderweb, georg.brandl, nnorwitz, sfllaw
Priority: high
Keywords: patch

Created on 2006-08-16 21:27 by sfllaw, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
python2.4-2.4.3_unicodeobject.c.diff sfllaw, 2006-08-16 21:28 Patch
unicode_escape_fix.py georg.brandl, 2006-10-07 18:41
Messages (6)
msg50919 - (view) Author: Simon Law (sfllaw) * Date: 2006-08-16 21:27
From
https://launchpad.net/distros/ubuntu/+source/python2.4/+bug/56633

Benjamin C. Wiley Sittler reports:

hi,

i discovered a bug yesterday in repr() for unicode strings. this causes an unpatched non-debug wide (UTF-32/UCS-4) build of python to abort:

python2.4 -c 'assert(repr(u"\U00010000" * 39 + u"\uffff" * 4096)) == (repr(u"\U00010000" * 39 + u"\uffff" * 4096))'

the problem is fixed by a change to unicodeobject.c. in the process of fixing it i also found and fixed another bug in repr() on UCS-4 python builds -- previously paired unicode surrogates were being repr()'ed as a single "character" even though they are not treated as such by a UCS-4 python build -- i.e. eval(repr(u'\ud800\udc00')) != u'\ud800\udc00' in an unpatched UCS-4 build.
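
The surrogate round-trip problem described above can be sketched in Python 3, whose strings (like a UCS-4 build) treat a surrogate pair as two separate code points. This is only an illustration: the escape() helper and its combine_surrogates flag are hypothetical names and are not taken from the patch or from unicodeobject.c.

```python
import ast

def escape(s, combine_surrogates):
    """Escape s as the body of a single-quoted string literal.

    combine_surrogates=True mimics the behaviour described in the report
    (a surrogate pair is written as one \\U escape); False escapes each
    code point on its own. Hypothetical helper for illustration only.
    """
    out = []
    i = 0
    while i < len(s):
        cp = ord(s[i])
        i += 1
        if (combine_surrogates and 0xD800 <= cp <= 0xDBFF
                and i < len(s) and 0xDC00 <= ord(s[i]) <= 0xDFFF):
            # Merge high + low surrogate into one supplementary code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (ord(s[i]) - 0xDC00)
            i += 1
        if cp > 0xFFFF:
            out.append("\\U%08x" % cp)
        elif cp < 0x20 or cp >= 0x7F:
            out.append("\\u%04x" % cp)
        elif chr(cp) in "'\\":
            out.append("\\" + chr(cp))
        else:
            out.append(chr(cp))
    return "".join(out)

pair = "\ud800\udc00"  # two lone surrogates, length 2
merged = ast.literal_eval("'" + escape(pair, True) + "'")
kept = ast.literal_eval("'" + escape(pair, False) + "'")

print(merged == pair)  # the merged form no longer round-trips
print(kept == pair)    # escaping each code point separately does
```

With merging enabled, the two-character input comes back as the single character U+10000, which is the eval(repr(x)) != x mismatch the report describes.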

Package: python2.4
Version: 2.4.3-7ubuntu2
Severity: important

when i run this command:

python -c "repr(u'\u24ea\u059c\u200a\U0001d77e\uff07\u202f\u0747\u202f\U0001d56b\U0001d5b9\U0001d4e9\u20052\u14bf\U0001d7f8\u200a\U0001d795\U0001d6e7Z\u2006\u2002\U0001d50a\uff27\u13c0\u2000\uff16\u0411\uff16\U0001d7e7\uff4c\u2006\u2001\ufe39\u2008\u0313]\u2008\u3014\u3015')"

python aborts with the following backtrace and memory dump:

*** glibc detected *** python: realloc(): invalid next size: 0x081521e8 ***
======= Backtrace: =========
/lib/tls/i686/cmov/libc.so.6[0xb7e8acd4]
/lib/tls/i686/cmov/libc.so.6(__libc_realloc+0xff)[0xb7e8cc5f]
python(_PyString_Resize+0x80)[0x8082b4b]
python[0x80991f7]
python(PyObject_Repr+0x58)[0x807d1fd]
python(PyEval_EvalFrame+0x4b37)[0x80b5270]
python(PyEval_EvalCodeEx+0x836)[0x80b65d6]
python(PyEval_EvalCode+0x57)[0x80b6640]
python(PyRun_SimpleStringFlags+0xa8)[0x80d8b7c]
python(Py_Main+0x685)[0x8055862]
python(main+0x22)[0x80550e2]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xd8)[0xb7e378b8]
python[0x8055041]
======= Memory map: ========
08048000-0811a000 r-xp 00000000 08:03 622736 /usr/bin/python2.4
0811a000-0813b000 rw-p 000d1000 08:03 622736 /usr/bin/python2.4
0813b000-081b5000 rw-p 0813b000 00:00 0 [heap]
b7c00000-b7c21000 rw-p b7c00000 00:00 0
b7c21000-b7d00000 ---p b7c21000 00:00 0
b7d40000-b7d4a000 r-xp 00000000 08:03 376899 /lib/libgcc_s.so.1
b7d4a000-b7d4b000 rw-p 00009000 08:03 376899 /lib/libgcc_s.so.1
b7d68000-b7d9b000 r--p 00000000 08:03 82634 /usr/lib/locale/en_US.utf8/LC_CTYPE
b7d9b000-b7d9e000 r-xp 00000000 08:03 625529 /usr/lib/python2.4/lib-dynload/_locale.so
b7d9e000-b7d9f000 rw-p 00003000 08:03 625529 /usr/lib/python2.4/lib-dynload/_locale.so
b7d9f000-b7e22000 rw-p b7d9f000 00:00 0
b7e22000-b7f51000 r-xp 00000000 08:03 66543 /lib/tls/i686/cmov/libc-2.4.so
b7f51000-b7f53000 r--p 0012e000 08:03 66543 /lib/tls/i686/cmov/libc-2.4.so
b7f53000-b7f55000 rw-p 00130000 08:03 66543 /lib/tls/i686/cmov/libc-2.4.so
b7f55000-b7f58000 rw-p b7f55000 00:00 0
b7f58000-b7f7c000 r-xp 00000000 08:03 66547 /lib/tls/i686/cmov/libm-2.4.so
b7f7c000-b7f7e000 rw-p 00023000 08:03 66547 /lib/tls/i686/cmov/libm-2.4.so
b7f7e000-b7f80000 r-xp 00000000 08:03 68161 /lib/tls/i686/cmov/libutil-2.4.so
b7f80000-b7f82000 rw-p 00001000 08:03 68161 /lib/tls/i686/cmov/libutil-2.4.so
b7f82000-b7f83000 rw-p b7f82000 00:00 0
b7f83000-b7f85000 r-xp 00000000 08:03 66546 /lib/tls/i686/cmov/libdl-2.4.so
b7f85000-b7f87000 rw-p 00001000 08:03 66546 /lib/tls/i686/cmov/libdl-2.4.so
b7f87000-b7f96000 r-xp 00000000 08:03 68156 /lib/tls/i686/cmov/libpthread-2.4.so
b7f96000-b7f98000 rw-p 0000f000 08:03 68156 /lib/tls/i686/cmov/libpthread-2.4.so
b7f98000-b7f9a000 rw-p b7f98000 00:00 0
b7fb0000-b7fb7000 r--s 00000000 08:03 2130015 /usr/lib/gconv/gconv-modules.cache
b7fb7000-b7fb9000 rw-p b7fb7000 00:00 0
b7fb9000-b7fd2000 r-xp 00000000 08:03 2737127 /lib/ld-2.4.so
b7fd2000-b7fd4000 rw-p 00018000 08:03 2737127 /lib/ld-2.4.so
bf99b000-bf9b3000 rw-p bf99b000 00:00 0 [stack]
ffffe000-fffff000 ---p 00000000 00:00 0 [vdso]
Aborted
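
A rough sketch of the size arithmetic that would explain a heap overrun like the one in the backtrace above. It rests on an assumption the report does not state: that the unpatched routine reserved 6 output bytes per character (enough for \uXXXX), while a wide build writes a 10-byte \UXXXXXXXX escape for every code point above U+FFFF. The function names are illustrative, and the code is Python 3 rather than the C in unicodeobject.c.

```python
# Assumption (not stated in the report): the buffer was sized at 6 bytes
# per character, which covers \uXXXX but not the 10-byte \UXXXXXXXX form.
ASSUMED_BYTES_PER_CHAR = 6

def reserved_bytes(s):
    """Bytes reserved for the escaped body under the assumed sizing."""
    return ASSUMED_BYTES_PER_CHAR * len(s)

def needed_bytes(s):
    """Bytes needed when every character is escaped as \\uXXXX or \\UXXXXXXXX."""
    return sum(10 if ord(ch) > 0xFFFF else 6 for ch in s)

# The first reproducer from the report: every character needs an escape,
# so the 6/10 figures are exact for this string.
s = "\U00010000" * 39 + "\uffff" * 4096
shortfall = needed_bytes(s) - reserved_bytes(s)

print(reserved_bytes(s), needed_bytes(s), shortfall)  # 24810 24966 156
```

Under that assumption, each of the 39 supplementary characters overshoots its reservation by 4 bytes, giving a 156-byte write past the end of the buffer before _PyString_Resize is called.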
msg50920 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-08-21 22:22
Committed revision 51448. (2.6)
Committed revision 51450. (2.5)

Someone should backport to 2.4, leaving open until then.
msg50921 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-08-22 08:26
Applied to 2.4 in revision 51466.
msg50922 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-10-07 17:00
Attaching a code file containing a Python version of the Unicode string escape routine for UCS-4 builds.

(It only replaces repr() when called from Python code, so not everything is fixed, especially "%r" formatting.)
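
The limitation mentioned above can be illustrated with a small Python 3 sketch. It assumes the attached script works by rebinding the built-in name repr; the file's contents are not shown here, so that is a guess, and patched_repr is a hypothetical stand-in rather than the attached routine.

```python
import builtins

def patched_repr(obj):
    # Hypothetical stand-in for a Python-level replacement routine.
    return "<patched %s>" % type(obj).__name__

original_repr = builtins.repr
builtins.repr = patched_repr
try:
    via_name = repr("x")          # name lookup finds the replacement
    via_format = "%r" % ("x",)    # formatting calls the C-level repr directly
finally:
    builtins.repr = original_repr  # always restore the real built-in

print(via_name)
print(via_format)
```

Only the call that goes through the rebound name sees the replacement; "%r" never looks up the name repr, so it still reaches the interpreter's own routine.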
msg50923 - (view) Author: Alexander Schremmer (alexanderweb) Date: 2006-10-07 22:43
The CVE issue for this bug is CVE-2006-4980, which is currently still under review.
msg50924 - (view) Author: Alexander Schremmer (alexanderweb) Date: 2006-10-15 11:56
The related security advisory is http://www.python.org/news/security/PSF-2006-001/
History
Date User Action Args
2022-04-11 14:56:19 admin set github: 43838
2006-08-16 21:27:59 sfllaw create