
classification
Title: print to unicode stream should __unicode
Type: enhancement
Stage: test needed
Components: Unicode
Versions: Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: lemburg
Nosy List: ajaksu2, facundobatista, hthompson, lemburg, loewis, vstinner
Priority: normal
Keywords:

Created on 2002-11-12 12:24 by hthompson, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Messages (15)
msg53686 - Author: Henry S Thompson (hthompson) Date: 2002-11-12 12:24
To make __unicode__ parallel to __str__ in what seems
like the right way, print >>f,x should check for
__unicode__ if f is a unicode-enabled stream.
See
http://mail.python.org/pipermail/python-list/2002-November/129859.html
for more details.
msg53687 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2002-11-12 14:58
I'm not sure I understand: what is a "unicode stream"?

All streams in Python are considered byte streams and
(currently) have no encoding attached. Changing that
would require a lot of work, some of which is under way.

Still, the best way to deal with this is to first encode Unicode
to a string using a known stream encoding and then send
the 8-bit data from the encoding process to the stream.
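
The encode-then-write approach described above can be sketched as follows. This is only an illustration in Python 3 syntax, not code from the thread: the helper name write_encoded and the choice of UTF-8 are assumptions, and io.BytesIO stands in for a real byte stream.

```python
import io

def write_encoded(stream, text, encoding="utf-8"):
    """Encode text with a known encoding, then write the bytes to a byte stream.

    write_encoded is a hypothetical helper; it returns the number of bytes written.
    """
    data = text.encode(encoding)
    stream.write(data)
    return len(data)

raw = io.BytesIO()  # stand-in for a file opened in binary mode
written = write_encoded(raw, "caf\u00e9\n")
```

The caller, not the stream, is responsible for knowing the encoding here, which is the situation this message describes.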
msg53688 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2002-11-12 15:20
Henry is talking about the objects returned from
codecs.open, or from instantiating the classes returned by
codecs.getwriter.
msg53689 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2002-11-12 15:53
Hmm, in those cases, passing Unicode objects to .write()
should work (and thus printing too). I think he's trying
to print some user-defined instances to such a stream...
that's where __str__ is called instead of __unicode__
by PyFile_WriteObject(). 

The question then becomes: how should PyFile_WriteObject()
know whether to look for __unicode__ or not?

msg53690 - Author: Henry S Thompson (hthompson) Date: 2002-11-12 16:34
As MvL said, I'm looking at a case such as the following:

x=X()
f=codecs.getwriter('utf8')(open("/tmp/out","w"))
print >>f,x

where X has a __unicode__ method.

It seems wrong to me that __str__ gets used in this case,
not __unicode__
msg53691 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2002-11-12 16:38
So your class does have an __str__? Or is it the __repr__
that is being used?
msg53692 - Author: Henry S Thompson (hthompson) Date: 2002-11-12 16:45
Usually it does not have __str__, so __repr__ is getting used.
In other cases, the class does have a __str__, and it gets
used, but even then I think it shouldn't if there's a
__unicode__

All this presumes there's a way to diagnose whether files
are wide or narrow -- I'm not familiar enough with the
implementation details here to know if this makes sense or not.

My original message made the point as follows:

If you call str(object), and object's class has a __str__ method, the
  value is the value of the __str__ method;

If you print an object to a normal stream, and object's class has a
  __str__ method, what appears is the result of the __str__ method.

If you call unicode(object), and object's class has a __unicode__ method, the
  value is the value of the __unicode__ method;

So far so good, but read on . . .

If you print an object to a unicode stream, and object's class has a
  __unicode__ method, what appears is the result of . . .

_not_ the __unicode__ method, but the __str__ method, if there is one,
otherwise the usual default.
msg53693 - Author: Facundo Batista (facundobatista) * (Python committer) Date: 2004-12-02 00:03
Please, could you verify if this problem persists in Python 2.3.4
or 2.4?

If yes, in which version? Can you provide a test case?

If the problem is solved, from which version?

Note that if you fail to answer in one month, I'll close this bug
as "Won't fix".

Thank you! 

.    Facundo
msg53694 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2004-12-02 15:45
Please leave this open: it's a reminder to start working on
an overhaul of the printing sub-system and file.write() in
particular.

Thanks.
msg53695 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2004-12-02 15:46
Changed into a feature request.
msg81717 - Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2009-02-12 03:42
Not sure it's still important after the 3.0 release. Confirmed in trunk:

import codecs

class X:
    def __unicode__(self):
        print 'unicode'
        return u'unicode'
    def __str__(self):
        print 'str'
        return 'str'

x = X()
f = codecs.getwriter('utf8')(open("/tmp/out", "w"))

print >> f, x  # 'str' is printed: __str__ is called, not __unicode__
msg81743 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-12 11:49
ajaksu2> Not sure it's still important after 3.0 release.

Python2 has too many issues related to unicode. It's easier to switch
to Python3, which uses unicode by default for most functions (e.g.
print).

The issue can't be fixed in Python2 without breaking compatibility
(introducing a regression). I think that the issue is fixed in Python3.

If you disagree, reopen this issue and *attach your patch* ;-)
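
A minimal sketch of the Python 3 behaviour referred to above, assuming a hypothetical class X and using io.StringIO as a stand-in for a text stream. In Python 3, str is itself a Unicode type, so print() to a text stream uses __str__ and no separate __unicode__ hook is involved:

```python
import io

class X:
    def __str__(self):
        # In Python 3 this returns text (Unicode), not bytes.
        return "\u00fcnicode"

buf = io.StringIO()   # text stream, stand-in for open(name, 'w')
print(X(), file=buf)  # print() calls str(x) and writes the text
result = buf.getvalue()
```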
msg81757 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2009-02-12 12:32
On 2009-02-12 12:49, STINNER Victor wrote:
> STINNER Victor <victor.stinner@haypocalc.com> added the comment:
> 
> ajaksu2> Not sure it's still important after 3.0 release.
> 
> Python2 has too much issues related to unicode. It's easier to switch 
> to Python3 which use unicode by default for most functions (eg. 
> print).

I don't agree with that statement. Python3 has better Unicode I/O
support, but apart from that it's pretty much the same show.

> The issue can't be fixed in Python2 without breaking the compatibility 
> (introduce regression). I think that the issue is fixed in Python3.

Python3 fixes the "print" statement to be a function, which allows
much better extensibility of the concept.

You can have the same in Python2 with a little effort: just create
your own myprint() function and have it process Unicode in whatever
way you want.
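
One possible shape for such a function is sketched below. It is an assumption-laden illustration, not code from this thread: the signature of myprint(), the `end` parameter, and the rule "prefer __unicode__ when the object defines it" are all hypothetical choices. It is written in Python 3 syntax so it runs on current interpreters; in Python 2 the fallback would be unicode(obj) rather than str(obj).

```python
import io

def myprint(obj, stream, end="\n"):
    """Write obj to a text stream, preferring a __unicode__ method if present.

    Hypothetical helper: the lookup rule is an assumption for illustration.
    """
    to_unicode = getattr(obj, "__unicode__", None)
    text = to_unicode() if callable(to_unicode) else str(obj)
    stream.write(text + end)

class X:
    def __unicode__(self):
        return "unicode"
    def __str__(self):
        return "str"

out = io.StringIO()  # stand-in for a Unicode-aware stream
myprint(X(), out)    # uses __unicode__
myprint(42, out)     # no __unicode__, falls back to str()
```

Because the conversion lives in an ordinary function, the choice between __unicode__ and __str__ can be changed without touching the print machinery.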
msg81761 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-12 13:22
> Python3 fixes the "print" statement to be a function, which allows
> much better extensibility of the concept.
>
> You can have the same in Python2 with a little effort: just create
> your own myprint() function and have it process Unicode in whatever
> way you want.

About myprint(): sure, it's possible to write a custom function. But the feature
request was to use obj.__unicode__() instead of obj.__str__() for a "unicode
aware stream".

The problem with Python2 is that there is no clear separation between a "bytes
stream" ("raw stream"?) and a "unicode aware stream" (like io.open or
codecs.open). I think that fixing this issue (#637094) was one of the goals of
the new I/O library (io in Python3).

Using __unicode__() for some objects and __str__() for others sounds
strange/dangerous to me. I prefer a bytes-only stream or a unicode-only stream,
but not a bytes-and-sometimes-unicode stream.

Python2 has only bytes streams. Python3 has both: open(name, 'rb') is bytes
only, and open(name, 'r') is unicode only.
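
The Python 3 separation between the two kinds of stream can be checked with a short sketch. The temporary file path and the variable names are assumptions made for the example:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text mode: write() accepts str and rejects bytes.
with open(path, "w", encoding="utf-8", newline="") as f:
    f.write("text\n")
    try:
        f.write(b"bytes")
        text_rejects_bytes = False
    except TypeError:
        text_rejects_bytes = True

# Binary mode: read() returns bytes.
with open(path, "rb") as f:
    data = f.read()

# Binary mode: write() rejects str.
with open(path, "ab") as f:
    try:
        f.write("text")
        binary_rejects_str = False
    except TypeError:
        binary_rejects_str = True
```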

lemburg> Does your message mean that you want to reopen the issue?
msg81764 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2009-02-12 14:01
On 2009-02-12 14:22, STINNER Victor wrote:
> STINNER Victor <victor.stinner@haypocalc.com> added the comment:
> 
>> Python3 fixes the "print" statement to be a function, which allows
>> much better extensibility of the concept.
>>
>> You can have the same in Python2 with a little effort: just create
>> your own myprint() function and have it process Unicode in whatever
>> way you want.
> 
> About myprint(): sure, it's possible to write a custom issue. But the feature 
> request was to use obj.__unicode__() instead of obj.__str__() for an "unicode 
> aware stream".
> 
> The problem with Python2 is that there is not clear separation between "bytes 
> stream" ("raw stream"?) and "unicode aware stream" (like io.open or 
> codecs.open). I think that fixing this issue (#637094) was one of the goal of 
> the new I/O library (io in Python3).

True, but the point of the original request was that the stream
should decide how to print the object, ie. you pass the object to
the stream's .write() method instead of first running str() on
it and then passing this to the .write() method.

This is easy to have using a custom print function and indeed
a good way to proceed if you want to port to Python3 at some
later point.

> Using __unicode__() for some object and __str__() for some other sounds 
> strange/dangerous to me. I prefer bytes-only stream or unicode-only stream, 
> but not bytes-and-sometimes-unicode stream.

If you use a StreamWriter instance in Python2 which uses one of
the Unicode codecs, then it will accept ASCII strings or Unicode
as input for .write(), ie. the stream decides on how to process
the input.
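
A minimal sketch of a StreamWriter deciding how to process its input, assuming UTF-8 and an io.BytesIO object as the underlying byte stream. It is written for Python 3, where the writer takes text; the Python 2 behaviour of also accepting ASCII byte strings is not shown here.

```python
import codecs
import io

raw = io.BytesIO()                       # underlying byte stream
writer = codecs.getwriter("utf-8")(raw)  # StreamWriter that encodes on write()

writer.write("na\u00efve\n")             # text in, UTF-8 bytes out
encoded = raw.getvalue()
```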

> Python2 has only bytes stream. Python3 has both: open(name, 'rb') is bytes 
> only, and open(name, 'r') is unicode only.

That's not entirely correct. Python2's codecs.py module provides
streams which can implement several different type combinations
for input and output.

> lemburg> Does your message mean that you want to reopen the issue?

No, I just wanted to correct your statement :-)
History
Date                 User       Action  Args
2022-04-10 16:05:53  admin      set     github: 37457
2009-02-12 14:01:20  lemburg    set     messages: + msg81764
2009-02-12 13:22:32  vstinner   set     messages: + msg81761
2009-02-12 12:32:18  lemburg    set     messages: + msg81757
2009-02-12 11:49:24  vstinner   set     status: open -> closed
                                        nosy: + vstinner
                                        resolution: fixed
                                        messages: + msg81743
2009-02-12 03:42:02  ajaksu2    set     nosy: + ajaksu2
                                        stage: test needed
                                        messages: + msg81717
                                        versions: + Python 2.7
2002-11-12 12:24:23  hthompson  create