
Classification
Title: StringIO / cStringIO inconsistency with unicode
Components: Library (Lib)

Process
Status: closed
Resolution: not a bug
Nosy List: arigo, loewis
Priority: normal

Created on 2007-03-11 18:07 by arigo, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (3)
msg31490 - Author: Armin Rigo (arigo) (Python committer) Date: 2007-03-11 18:07
When trying to write unicode strings into a
StringIO.StringIO() or a cStringIO.StringIO()
file, different things occur.  (This causes
a failing test in the "mako" project if
cStringIO is not available.)  Compare the
following with StringIO or with cStringIO:

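    from StringIO import StringIO   # or: from cStringIO import StringIO

    f = StringIO()
    f.write("\xC0")      # byte string containing a non-ASCII byte
    f.write(u"hello")    # unicode string
    print f.getvalue()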

With cStringIO, unicode strings are
immediately encoded as ascii and the
getvalue() returns '\xC0hello'.  With
StringIO, on the other hand, the
getvalue() crashes in a ''.join()
trying to convert '\xC0' to unicode.
Normal file() objects follow the same
behavior as cStringIO, so my guess is
that StringIO.py is wrong here.
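
The two behaviours described above can be sketched in Python 3, where neither module exists any more. This is only an emulation under stated assumptions: the helper names emulate_cstringio and emulate_stringio are hypothetical, and the sketch takes the description in this message at face value (cStringIO encodes unicode to ASCII on write; StringIO stores chunks as-is and ''.join() implicitly decodes byte strings as ASCII).

```python
# Python 3 emulation of the two Python 2 behaviours described above.
# emulate_cstringio and emulate_stringio are hypothetical helper names.

def emulate_cstringio(chunks):
    """Encode each Unicode chunk to ASCII at write time; the buffer holds bytes."""
    return b"".join(c.encode("ascii") if isinstance(c, str) else c for c in chunks)


def emulate_stringio(chunks):
    """Store chunks unchanged; joining decodes byte chunks as ASCII, as ''.join() did."""
    return "".join(c.decode("ascii") if isinstance(c, bytes) else c for c in chunks)


chunks = [b"\xC0", "hello"]  # a non-ASCII byte string, then a Unicode string

# Succeeds, because "hello" is pure ASCII and the raw byte is never decoded.
cstringio_result = emulate_cstringio(chunks)

try:
    emulate_stringio(chunks)
    stringio_error = None
except UnicodeDecodeError as exc:
    # b"\xC0" is not valid ASCII, so the implicit decode fails.
    stringio_error = exc
```

The failure only appears at join time, which is why the original snippet fails in getvalue() rather than in write().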
msg31491 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2007-03-11 18:31
It's intentional that they behave differently. StringIO supports Unicode strings, cStringIO doesn't. This means that you can build up a large Unicode string with StringIO, but not with cStringIO.

What should happen when you mix them is debatable.
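
For readers on Python 3, where StringIO and cStringIO were replaced by the io module, a rough analogue of this split is io.StringIO (text only) versus io.BytesIO (bytes only). The sketch below is not the code discussed in this issue; it only illustrates that a text buffer can accumulate non-ASCII text, and that Python 3 settles the mixing question by raising TypeError.

```python
import io

# Text buffer: accepts only str, so non-ASCII text accumulates
# without any implicit encoding step.
text_buf = io.StringIO()
text_buf.write("\xC0")
text_buf.write("hello")
text_value = text_buf.getvalue()

# Byte buffer: accepts only bytes-like objects.
byte_buf = io.BytesIO()
byte_buf.write(b"\xC0")
try:
    byte_buf.write("hello")  # mixing text into a byte buffer
    mixed_error = None
except TypeError as exc:
    mixed_error = exc
```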
msg31492 - Author: Armin Rigo (arigo) (Python committer) Date: 2007-03-12 00:04
I missed the documentation, which already describes
this difference.
History
Date                 User   Action  Args
2022-04-11 14:56:23  admin  set     github: 44699
2007-03-11 18:07:00  arigo  create