Issue 563665: urllib2 can't cope with error response

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/36688

classification

Title:	urllib2 can't cope with error response
Type:		Stage:
Components:	None	Versions:

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	jhylton	Nosy List:	edemaine, jhylton
Priority:	normal	Keywords:

Created on 2002-06-02 22:28 by edemaine, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Files
File name	Uploaded	Description
out	edemaine, 2002-06-02 22:28	Traceback caused by the simple example

Messages (4)
msg11028 - (view)	Author: Erik Demaine (edemaine)	Date: 2002-06-02 22:28
This looks similar to SF bug 216649, but with somewhat different symptoms. Redirection seems to cause an AttributeError (attempt to access self.fp.read when self.fp is None).

Simple example:

python -c "import urllib2; urllib2.urlopen ('http://www.yahoo.com/promotions/mom_com97/supermom.html')"

Traceback from Python 2.2.1 attached. Same behavior appears with Python 2.2.
msg11029 - (view)	Author: Jeremy Hylton (jhylton)	Date: 2002-06-03 16:17
I haven't looked at 216649 yet, but this particular traceback is caused by a problem loading the redirected URL. If you load http://promotions.yahoo.com/promotions/mom_com97/supermom.html directly, you'll see the same failure without invoking any redirect machinery.

My first guess is that the yahoo server is sending an invalid response and httplib isn't being generous enough in skipping the garbage and looking for the valid response data. Here's a brief trace of httplib activity:

>>> import httplib
>>> h = httplib.HTTP('promotions.yahoo.com')
>>> h.set_debuglevel(2)
>>> h.putrequest("GET /promotions/mom_com97/supermom.html")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: putrequest() takes at least 3 arguments (2 given)
>>> h.putrequest("GET", "/promotions/mom_com97/supermom.html")
connect: (promotions.yahoo.com, 80)
send: 'GET /promotions/mom_com97/supermom.html HTTP/1.0\r\n'
>>> h.endheaders()
send: '\r\n'
>>> h.getreply()
reply: '#\x0f\x01yhh00000011\x010\x01HTTP/1.0 200 OK\n'
(-1, '#\x0f\x01yhh00000011\x010\x01HTTP/1.0 200 OK\n', None)

Not sure what the text starting with a hash is all about.

Of course, urllib2 has a bug that prevents it from reporting anything useful about this error. That needs to be fixed.
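
The -1 in the tuple returned by getreply() marks a status line that could not be parsed, not a real HTTP status code. The following is a minimal sketch, written for modern Python 3 rather than the Python 2.2 httplib discussed in this thread, of why a strict parser rejects the captured reply. parse_status_line is a hypothetical helper for illustration only and is not httplib's actual code.

```python
def parse_status_line(line):
    """Return (version, status, reason), or (None, -1, line) if malformed.

    Hypothetical helper for illustration; not httplib's real parser.
    """
    parts = line.rstrip("\r\n").split(None, 2)
    # A valid status line begins with the protocol version, e.g. "HTTP/1.0".
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        return None, -1, line
    try:
        status = int(parts[1])
    except ValueError:
        return None, -1, line
    reason = parts[2] if len(parts) == 3 else ""
    return parts[0], status, reason


# The reply captured in the trace: junk bytes precede the real status line,
# so the first whitespace-separated token does not start with "HTTP/".
bad = "#\x0f\x01yhh00000011\x010\x01HTTP/1.0 200 OK\n"
good = "HTTP/1.0 200 OK\n"

print(parse_status_line(good))
print(parse_status_line(bad)[1])
```

Because the junk is glued to the front of "HTTP/1.0" with no separating whitespace, a parser that only inspects the start of the first token never sees the valid status line that follows.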
msg11030 - (view)	Author: Jeremy Hylton (jhylton)	Date: 2002-06-03 16:55
Fixed the urllib2 part of the problem in CVS as rev 1.31 of urllib2.py. You'll now get a better error message about what went wrong.

Still not sure what httplib should do differently. I notice that Mozilla renders this page with the HTTP response in the text, including junk at the very beginning of the response. (The server is clearly broken.) It would probably be best if httplib treated this as an HTTP/0.9 response if there appears to be a valid message body. It looks like that's what Mozilla is doing.
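
The HTTP/0.9 fallback proposed above can be sketched roughly as follows. This is an illustration in modern Python 3 under stated assumptions, not the change that went into httplib: read_response is a hypothetical function, and the status of 200 reported for the fallback case is an assumption, since an HTTP/0.9 response carries no status line or headers at all.

```python
import io


def read_response(stream):
    """Return (version, status, body) read from a binary stream.

    Hypothetical sketch of a lenient reader; not httplib's real code.
    """
    first = stream.readline()
    parts = first.split(None, 2)
    if len(parts) >= 2 and parts[0].startswith(b"HTTP/") and parts[1].isdigit():
        # Well-formed status line: skip the headers up to the blank line.
        while stream.readline() not in (b"\r\n", b"\n", b""):
            pass
        return parts[0].decode("ascii"), int(parts[1]), stream.read()
    # No recognizable status line: treat everything received, including
    # the first line, as the body of an HTTP/0.9-style response.
    return "HTTP/0.9", 200, first + stream.read()


# Simulated broken reply, using the status line bytes from the trace above
# followed by a made-up body.
broken = io.BytesIO(b"#\x0f\x01yhh00000011\x010\x01HTTP/1.0 200 OK\n\n<html>hi</html>")
version, status, body = read_response(broken)
print(version, status, len(body))
```

With this approach the junk and the embedded status line end up in the body, which matches the description of Mozilla rendering the HTTP response as part of the page text.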
msg11031 - (view)	Author: Jeremy Hylton (jhylton)	Date: 2002-07-06 18:49
httplib.py 1.55 now treats the page as an HTTP/0.9 response, just like Mozilla.

History
Date	User	Action	Args
2022-04-10 16:05:23	admin	set	github: 36688
2002-06-02 22:28:57	edemaine	create