This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: AssertionError from urllib.urlretrieve / httplib
Type:                          Stage:
Components: Library (Lib)      Versions: Python 2.3

process
Status: closed                 Resolution: fixed
Dependencies:                  Superseder:
Assigned To:                   Nosy List: jhylton, jmoses, terry.reedy, zenzen
Priority: normal               Keywords:

Created on 2003-06-16 00:37 by zenzen, last changed 2022-04-10 16:09 by admin. This issue is now closed.

Messages (9)
msg16423 - (view) Author: Stuart Bishop (zenzen) Date: 2003-06-16 00:37
The following statement is occasionally generating
AssertionErrors:
    current_page = urllib.urlopen(action,data).read()

Traceback (most recent call last):
  File "/Users/zen/bin/autospamrep.py", line 161, in ?
    current_page = handle_spamcop_page(current_page)
  File "/Users/zen/bin/autospamrep.py", line 137, in handle_spamcop_page
    current_page = urllib.urlopen(action,data).read()
  File "/sw/lib/python2.3/httplib.py", line 1150, in read
    assert not self._line_consumed and self._line_left
AssertionError


Fix may be to do the following in 
LineAndFileWrapper.__init__ (last two lines are new):

def __init__(self, line, file):
    self._line = line
    self._file = file
    self._line_consumed = 0
    self._line_offset = 0
    self._line_left = len(line)
    if not self._line_left:
        self._done()
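For reference, the failing call in the traceback is a Python 2 form POST via urllib.urlopen(action, data). A minimal sketch of the same pattern in modern Python, where urllib was split into urllib.request and urllib.parse (the form fields here are placeholders, not from the report):

```python
from urllib.parse import urlencode

def build_post_body(fields):
    """Encode a dict of form fields as an application/x-www-form-urlencoded body."""
    return urlencode(fields).encode("ascii")

body = build_post_body({"track": "spam report", "page": 1})
# Passing a bytes body to urllib.request.urlopen() issues a POST,
# mirroring urllib.urlopen(action, data) in Python 2:
#   from urllib.request import urlopen
#   current_page = urlopen(action, body).read()
```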
msg16424 - (view) Author: Stuart Bishop (zenzen) Date: 2003-06-16 00:55
Logged In: YES 
user_id=46639

My suggested fix is wrong.
msg16425 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2003-06-16 19:40

Can you reproduce this problem easily?  We've seen something
like it before, but have had trouble figuring out what goes
wrong.
msg16426 - (view) Author: Stuart Bishop (zenzen) Date: 2003-06-24 12:46

I've been unable to repeat the problem through a tcpwatch.py 
proxy, so I'm guessing the trigger is connecting to a fairly loaded 
server over a 56k modem - possibly the socket is in a bad state 
and nothing noticed?

I'll try not going through tcpwatch.py for a bit and see if I can still 
trigger the problem, in case it was caused by a server-side problem 
that has since been fixed.
msg16427 - (view) Author: Jon Moses (jmoses) Date: 2003-09-22 20:38

I also experience this problem, and it's repeatable.  When
trying to talk with the CrossRef (www.crossref.com) server, I
get this same error.  I don't know why.  All the crossref
server does is spit back text.  It normally takes between 10
and 20 seconds to receive all the data.  I've successfully
viewed the results with mozilla and with wget.

I'd post the URL i'm hitting, but it's a for-pay service. 
This is the code I'm using:

...
(name, headers) = urllib.urlretrieve(url)
...

While attempting to receive this data, I tried doing a:

...
u = urllib.urlopen(url)
for line in u.readlines():
  print line
...

but program execution seemed to continue while the data was
being received, which is not cool.  I'm not sure if that's
expected behaviour or not.

Let me know if I can provide you with any more information.

-jon
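As a hedged sketch of what urlretrieve does (Python 3 spells it urllib.request.urlretrieve): it saves the URL's contents to a local file and returns a (filename, headers) pair. A file:// URL keeps the demonstration off the network; the file contents here are invented:

```python
import pathlib
import tempfile
import urllib.request

# Create a small local file to stand in for the remote resource.
src = pathlib.Path(tempfile.mkdtemp()) / "result.txt"
src.write_text("|Canadian Journal of Fisheries and Aquatic Sciences|\n")

# urlretrieve() returns the local filename holding the data
# and the response headers.
name, headers = urllib.request.urlretrieve(src.as_uri())
print(open(name).read())
```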
msg16428 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2003-09-23 03:09

jmoses: Are you seeing this problem with Python 2.3?  I
thought we had fixed the problem in the original report.

Also, I'm not sure what you mean by program execution
continuing.  Do you mean that the for loop finished and the
rest of the program continued executing, even though there
was data left to read?  

What would probably help most is a trace of the session with
httplib's set_debuglevel() enabled.  If that's got sensitive
data, you can email it to me privately.
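The trace Jeremy asks for comes from httplib's per-connection debug switch. A minimal sketch, assuming modern Python where the module is named http.client (no request is actually sent here, and the query path is a placeholder):

```python
import http.client

conn = http.client.HTTPConnection("doi.crossref.org")
conn.set_debuglevel(1)   # echoes send:/reply:/header: lines to stdout
# conn.request("GET", "/servlet/query")   # would print the full exchange
# resp = conn.getresponse()
```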
msg16429 - (view) Author: Jon Moses (jmoses) Date: 2003-09-23 11:52

Whups, my bad, I just assumed (and we know what happens
then) that this was for Python 2.2, since that's what I was
having the problem with.  My next step was to try with
Python 2.3.  I'll let you know if it works (since it sounds
like it should).

And yes, that's what I meant.  Data from the http read was
still being output to the screen, while other output from
_past_ where the read was occurring was also appearing.
I'd end up with output like this:

[data from http read]
[data from after]
[data from http read]

and the data was from the same connection.

Hopefully the switch to 2.3 makes my issues moot.  Thanks
msg16430 - (view) Author: Jon Moses (jmoses) Date: 2003-09-23 12:35

I switched from using urllib.urlretrieve / urllib.urlopen to
using httplib, since I can debug with it.  I no longer get
the error this bug is about.

The other problem I seemed to be having was related to the
data I was receiving, which was generated in part from the
data I was passing to the server.  I changed the data I was
sending (changed ' ' to '%20') and everything works fine,
even using urllib.urlopen().  Sorry for the confusion.
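The manual ' ' to '%20' replacement described here is percent-encoding, which urllib provides directly (urllib.quote in Python 2, urllib.parse.quote in Python 3). A sketch, using a query fragment shaped like the one in this report:

```python
from urllib.parse import quote

qdata = "|Canadian Journal of Fisheries and Aquatic Sciences|Adkison|52|"
# Encode spaces (and other unsafe characters) but keep the '|' separators.
print(quote(qdata, safe="|"))
```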

The data that the server was sending back to the broken
request was output like this, using
httplib.http.set_debuglevel(1):

------start
Getting: doi.crossref.org
connect: (doi.crossref.org, 80)
send: 'GET
/servlet/query?usr=<deleted>&pwd=<deleted>&qdata=|Canadian
Journal of Fisheries and Aquatic
Sciences|Adkison|52||2762||full_text|1|<snip> HTTP/1.0\r\n\r\n'
reply: '\n'

|Canadian

-----------end

I don't know if that helps, but maybe.

Thanks much.
msg16431 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2005-04-26 22:05

Closing since this appears to have been fixed in 2.3.  If I am 
mistaken, reopen.
History
Date User Action Args
2022-04-10 16:09:14  admin   set     github: 38654
2003-06-16 00:37:03  zenzen  create