classification
Title: PyString_AsString() segfaults when passed a unicode string
Components: Interpreter Core
process
Status: closed
Resolution: not a bug
Nosy List: ajung, anthonybaxter, dcjim, tim.peters
Priority: high

Created on 2004-12-09 13:08 by ajung, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (5)
msg23618 - Author: Andreas Jung (ajung) Date: 2004-12-09 13:08
If you pass a PyObject representing the unicode
string u'\xc4'
to PyString_AsString(), then Python (2.3.4, 2.4.0) segfaults.

Famous last words of Python 2.4:

Exception exceptions.UnicodeEncodeError:
<exceptions.UnicodeEncodeError instance at 0xf6f75e8c>
in 'garbage collection' ignored
Fatal Python error: unexpected exception during garbage
collection

Famous last words of Python 2.4 (debug build):


XXX undetected error
Traceback (most recent call last):
  File "test.py", line 4, in ?
    print S.split(u'\xc4')
UnicodeEncodeError: 'ascii' codec can't encode
character u'\xe4' in position 0: ordinal not in range(128)
[6545 refs]

This bug was first reported on the zope-dev list.
I confirmed that the error occurs not only in Zope but also in
a Python-only environment.
msg23619 - Author: Andreas Jung (ajung) Date: 2004-12-09 13:12
This error does not happen only with u'\xc4'; it happens with
*any* string containing a character > 0x7f.

msg23620 - Author: Anthony Baxter (anthonybaxter) (Python triager) Date: 2004-12-09 14:04
Please attach a test case that shows the failure. I can't
reproduce it here with the information you've given.
msg23621 - Author: Jim Fulton (dcjim) (Python triager) Date: 2004-12-09 14:14
I can't reproduce this:

>>> ick = u'\xc4'
>>> import struct
>>> struct.pack(ick)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character
u'\xc4' in position 0: ordinal not in range(128)
>>> struct.pack(u'i', 1)
'\x01\x00\x00\x00'

The code that provoked this in Zope is buggy.
msg23622 - Author: Tim Peters (tim.peters) (Python committer) Date: 2004-12-09 15:05
For the record, this appeared to be due to an extension 
module doing

PyString_AsString(name)[0]

when name was a Unicode string containing a "high-bit" 
character.  PyString_AsString(name) legitimately returned 
NULL, and bad stuff followed.
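
The failure mode described above (indexing the result without checking it for NULL) can be sketched in plain C. This is only an illustration under stated assumptions: as_ascii_or_null() and first_byte_checked() are hypothetical names, and the stand-in works on a byte string rather than a real Unicode object, so that the sketch compiles without the Python 2 headers. In an actual extension the same check would presumably read: char *s = PyString_AsString(name); if (s == NULL) return NULL;

```c
#include <stddef.h>

/* Hypothetical stand-in for PyString_AsString(): returns its input when every
 * byte is 7-bit ASCII, and NULL otherwise. The real function also sets a
 * Python exception (here it would be UnicodeEncodeError) before returning NULL. */
static const char *
as_ascii_or_null(const char *text)
{
    const unsigned char *p;

    for (p = (const unsigned char *)text; *p != '\0'; p++) {
        if (*p > 0x7f) {
            return NULL;    /* "high-bit" character: conversion fails */
        }
    }
    return text;
}

/* Checked form of the pattern: test for NULL before indexing.
 * Returns the first byte, or -1 to report the failure to the caller. */
static int
first_byte_checked(const char *name)
{
    const char *s = as_ascii_or_null(name);

    if (s == NULL) {
        return -1;          /* in an extension: propagate the pending exception */
    }
    return (unsigned char)s[0];
}
```

The unchecked form, as_ascii_or_null(name)[0], dereferences NULL whenever the conversion fails, which is undefined behaviour in C and matches the crash reported here.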
History
Date                 User   Action  Args
2022-04-11 14:56:08  admin  set     github: 41302
2004-12-09 13:08:37  ajung  create