
classification
Title: Incorrect handling of unicode "strings" in asynchat.py
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.5
process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To: effbot
Nosy List: effbot, holle
Priority: normal
Keywords:

Created on 2005-11-16 15:28 by holle, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name | Uploaded | Description
asynchat-py-2.4.diff | holle, 2005-11-16 15:36 | Patch for the asynchat.py Python version 2.4
asynchat-py-2.3.diff | holle, 2005-11-16 15:37 | Patch for the asynchat.py Python version 2.3.3
Messages (5)
msg26889 - (view) Author: Holger Lehmann (holle) Date: 2005-11-16 15:28
While debugging my Zope installation using the
DeadlockDebugger along with the threadframe module
under Python 2.3 and 2.4, I stumbled upon weird
behaviour in the method refill_buffer of the class
async_chat in the file asynchat.py (around line 198).
  
There is a special handling for strings:  
[...]  
elif isinstance(p, str):  
  self.producer_fifo.pop()  
  self.ac_out_buffer = self.ac_out_buffer + p  
  return  
data = p.more()  
[...]  
Now, if p is an instance of str, it gets the special
handling and the function returns.
But if p is an instance of unicode, it does not get
the special handling and instead dies in the line
data = p.more()
with the error that unicode does not have an attribute
named more.

I was able to program a workaround by testing for str
or unicode. Please see the attached diff files for
details. The code now works as expected.

I guess a better way would be to import types and
change the condition to "type(p) in types.StringTypes",
but I did not dare to do this.

I have not checked the code to see if there are more
conditionals broken in the same way.
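
The failure described above can be reproduced with a simplified stand-in for the dispatch in refill_buffer. This is only a sketch in Python 3, not the actual asynchat code: bytes plays the role of the Python 2 str type, str plays the role of unicode, and the names SimpleProducer and refill_step are hypothetical.

```python
from collections import deque


class SimpleProducer:
    """Hypothetical producer: hands out its data once, then an empty chunk."""

    def __init__(self, data):
        self.data = data

    def more(self):
        data, self.data = self.data, b""
        return data


def refill_step(fifo, out_buffer):
    """Simplified stand-in for one pass of the dispatch in refill_buffer."""
    p = fifo[0]
    if isinstance(p, bytes):
        # Byte strings are appended to the output buffer directly.
        fifo.popleft()
        return out_buffer + p
    # Anything else is assumed to be a producer object with a more() method.
    data = p.more()
    if not data:
        fifo.popleft()  # exhausted producer
    return out_buffer + data


fifo = deque([b"abc", SimpleProducer(b"def")])
buf = refill_step(fifo, b"")   # byte string: special-cased
buf = refill_step(fifo, buf)   # producer: more() is called

try:
    refill_step(deque(["text"]), b"")  # text string: falls through to more()
    error = None
except AttributeError as exc:
    error = exc
```

In this sketch the text string is neither a byte string nor a producer, so it reaches the more() call and raises AttributeError, which mirrors the behaviour reported for unicode objects.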
msg26890 - (view) Author: Holger Lehmann (holle) Date: 2005-11-16 15:34
Just for the record: 
 
I am using Linux 2.6.11 and Python 2.4 (SuSE 9.3) 
as well as Linux 2.6.5 and Python 2.3.3 (SLES 9 Beta 4). 
 
Both Python installations contain a sitecustomize.py with 
the following content: 
import sys 
sys.setdefaultencoding('iso-8859-1') 
 
The environment contains a variable LC_CTYPE set to: 
de_DE.UTF-8 
 
Zope is version 2.7.3, compiled by myself on both of the
above machines.
msg26891 - (view) Author: Holger Lehmann (holle) Date: 2005-11-16 15:38
I checked the current svn repository to see whether the
"bug" is still there, and it is:
 
http://svn.python.org/view/python/trunk/Lib/asynchat.py?rev=38980&view=markup 
msg26892 - (view) Author: Holger Lehmann (holle) Date: 2005-11-16 15:41
And here is the exception (sorry for the weird format of 
the traceback, I did not do it ;-) ): 
 
ZServer uncaptured python exception, closing channel
<ZServer.HTTPServer.zhttp_channel connected 192.168.2.60:58493 at 0x45f1de6c channel#: 14145 requests:>
(exceptions.AttributeError:'unicode' object has no attribute 'more'
[/usr/lib/python2.3/asyncore.py|read|69]
[/usr/lib/python2.3/asyncore.py|handle_read_event|390]
[/usr/lib/python2.3/asynchat.py|handle_read|136]
[/srv/zope_infotip_rts_skel/lib/python/ZServer/medusa/http_server.py|found_terminator|510]
[/srv/zope_infotip_rts_render2/Products/DeadlockDebugger/dumper.py|match|88]
[/srv/zope_infotip_rts_skel/lib/python/ZServer/HTTPServer.py|push|305]
[/usr/lib/python2.3/asynchat.py|initiate_send|213]
[/usr/lib/python2.3/asynchat.py|refill_buffer|200])
msg26893 - (view) Author: Fredrik Lundh (effbot) * (Python committer) Date: 2005-11-16 23:27
You seem to be missing both what a network layer does,
and what a Unicode string is.

The network layer is designed to deal with streams of 
bytes, not arbitrary objects.  

Unicode strings contain Unicode characters, not bytes.

If you want to pass something that isn't a stream of bytes 
over a network connection, it's up to you to convert that 
something to a byte stream *before* you pass it to the 
network layer.  For a Unicode string, this means that you 
have to encode the string, using a suitable encoding for 
your application (e.g. ISO-8859-1 or UTF-8 or whatever 
your application prefers).  The same applies if you want 
to pass in floating point values, integers, lists, images, 
or any other data type: it's up to you to decide how to 
convert the objects to a stream of bytes.

Please fix your application.
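
As an illustration of the encoding step described above, here is a minimal sketch in Python 3 syntax, where str corresponds to the Python 2 unicode type and bytes to a byte string. The sample text and the variable names are assumptions chosen for the example; the two encodings are the ones mentioned above.

```python
text = "Grüße"  # a Unicode string: characters, not bytes

# Encode explicitly before handing the data to the network layer.
utf8_payload = text.encode("utf-8")
latin1_payload = text.encode("iso-8859-1")

# The same characters give different byte streams depending on the encoding,
# which is why the application has to make the choice.
assert utf8_payload != latin1_payload

# Decoding with the matching encoding recovers the original text.
assert utf8_payload.decode("utf-8") == text
assert latin1_payload.decode("iso-8859-1") == text
```

The receiving side has to know which encoding was used, so the choice belongs to the application protocol rather than to the network layer.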
History
Date | User | Action | Args
2022-04-11 14:56:14 | admin | set | github: 42600
2005-11-16 15:28:19 | holle | create |