classification
Title: urllib2: AbstractHTTPHandler limits flexible client implemen
Type: Stage:
Components: Library (Lib) Versions:
process
Status: closed Resolution: postponed
Dependencies: Superseder:
Assigned To: jhylton Nosy List: bbum, gvanrossum, jhylton, moshez
Priority: low Keywords:

Created on 2001-03-12 00:43 by bbum, last changed 2022-04-10 16:03 by admin. This issue is now closed.

Messages (9)
msg3827 - (view) Author: Bill Bumgarner (bbum) Date: 2001-03-12 00:43
The implementation of the do_open() method on the AbstractHTTPHandler class contains a couple of "features" that could be considered to be "bugs".  In any case, each time I have wanted to use urllib2 for relatively straightforward development of an HTTP client, I have had to effectively replace the HTTPHandler with one that reimplements do_open() (or http_open() in 2.0).  Maybe my usage is not the norm-- in any case, the more information, the better...

Specifics (all names in context of Python 2.1):

- AbstractHTTPHandler does not allow for anything but GET or POST requests.   GET is the default and POST happens anytime the request object contains data to be passed to the server.

This limitation is the only thing that stands in the way of using the AbstractHTTPHandler *directly* to implement, say, a WebDAV client or to do something like a site sucker that uses the HEAD method to determine if content has changed.

- [this is likely a bug]  the method will throw an exception if *any* response is received from the server other than 200.   However, HTTP defines that all 2XX responses should be treated as successful. 

In any case, there are *a lot* of contexts within which a non-200 response may be treated as a 'success' of some sort or another.  Regardless, it is really outside of the scope of the AbstractHTTPHandler's implementation to make the success/failure decision-- it should simply return  the same thing regardless of the response status.

- [a bug?] Whenever an exception is raised (a non-200 code is received), the status code and reason (as provided by the server) are both lost. 
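
For readers on Python 3, where urllib2 lives on as urllib.request, the raise-or-return decision sits in HTTPErrorProcessor rather than in do_open(). Below is a minimal sketch of a handler that hands every response back unchanged, assuming the caller wants to inspect the status itself. PassThroughErrorProcessor is a hypothetical name, not part of the standard library, and this is not the 2.1 code discussed in this report.

```python
import urllib.request


class PassThroughErrorProcessor(urllib.request.HTTPErrorProcessor):
    """Return every response unchanged instead of raising on non-2xx codes."""

    def http_response(self, request, response):
        # The stock processor routes non-2xx responses to the error handlers
        # here; this sketch leaves the success/failure decision to the caller.
        return response

    https_response = http_response


# build_opener() leaves out a default handler when a subclass of it is
# supplied, so this opener carries no stock HTTPErrorProcessor.
opener = urllib.request.build_opener(PassThroughErrorProcessor)
```

With such an opener, redirects and the authentication handlers no longer fire, because both are driven by the same error path.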

I see that moshez has been primarily responsible for recent changes surrounding this code.  I would be happy to contribute to the evolution of the code;  please feel free to contact me directly.
msg3828 - (view) Author: Bill Bumgarner (bbum) Date: 2001-03-12 03:59
I realized that the exception throw behaviour is more fundamental to the underlying implementation than may have been indicated in the above description.  In particular, throwing an HTTP exception when handling a 401 is key to making the various Authentication Handlers work.

I still feel that the behaviour should be normalized across all requests such that the caller is responsible for determining error conditions or, at the least, has access to the same data in a relatively similar format upon success or failure.
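
One way to get "the same data in a similar format" on success and on failure can be sketched as follows, assuming Python 3's urllib.error.HTTPError, which keeps the status code and reason phrase. status_of and its open_func parameter are hypothetical names used only for illustration.

```python
import urllib.error


def status_of(open_func, url):
    """Return (code, reason) whether open_func(url) succeeds or raises HTTPError."""
    try:
        response = open_func(url)
    except urllib.error.HTTPError as err:
        # HTTPError carries the server's status code and reason phrase.
        return err.code, err.reason
    # A successful http.client.HTTPResponse exposes the same two values.
    return response.status, response.reason
```

A caller would pass something like urllib.request.urlopen, or the open method of a custom opener, as open_func.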
msg3829 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2001-03-16 18:43
I haven't had any spare cycles to devote to urllib2 this
year.  Perhaps Moshe can be of more help in the near term. 
Following the 2.1 release, I may have more time.

I never used urllib2 in a situation that produced anything
other than vanilla responses -- 200, 401, etc.  I'm not too
surprised to hear that there are problems with 2XX cases.

Can you post some examples of the sorts of things you want
to do?  It sounds reasonable in the abstract, but some code
would help.  If not in this patch archive, perhaps on
comp.lang.python?
msg3830 - (view) Author: Moshe Zadka (moshez) (Python triager) Date: 2001-03-18 09:22
None of these can really be classified as "bugs" rather than
functionality enhancement requests, and this is something
I'm not sure I want to do this close to the second beta.

BTW, one thing I'm sure I *don't* want to change -- handling
of 20x codes. If you want to handle 201/206/whatever, then
just handle them. With some __getattr__ trickery, you can
have a class that handles all http_error_20x errors, so this
is *easy* for 3rd party urllib2 extensions to add.

Regarding explicitly determining the command: just put the
command inside the request object, and use it in your
own HTTPHandler/HTTPSHandler. This may be done in the next
version of urllib2 (for 2.2). At that time I might also add
the feature that other encodings (not just
application/x-www-form-urlencoded, but also
multipart/form-data) will be supported.
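
The "put the command inside the request object" idea can be sketched with Python 3's urllib.request, where Request accepts a method argument (since 3.3) and the stock handler asks the request for its method before sending. The URL and the MethodRequest class below are hypothetical, chosen only for illustration, and nothing is sent over the network.

```python
import urllib.request

# Hypothetical URL, used only to build request objects; nothing is sent.
URL = "http://example.invalid/resource"

# Since Python 3.3 the verb can be given directly on the request.
head_request = urllib.request.Request(URL, method="HEAD")


class MethodRequest(urllib.request.Request):
    """Request that carries its own verb, e.g. WebDAV's PROPFIND."""

    def __init__(self, url, verb, **kwargs):
        super().__init__(url, **kwargs)
        self.verb = verb

    def get_method(self):
        # The HTTP handler calls get_method() when it sends the request.
        return self.verb


propfind_request = MethodRequest(URL, "PROPFIND")
```

Without either mechanism, get_method() falls back to GET, or to POST when the request carries data, which is the behaviour described in the original report.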

msg3831 - (view) Author: Bill Bumgarner (bbum) Date: 2001-03-20 00:37
OK-- I can understand that logic (close to beta, etc).

Given the prominence of Python in the WebDAV 
community combined with the increasing use of 2xx 
(and 1xx) codes, it would be extremely useful to 
include-- at the least-- examples of handling such via 
the urllib2 modules.   

Beyond that, it would be quite helpful to the developers 
to expend some amount of engineering effort such that 
handling 2xx response codes doesn't require 
__getattr__ trickery! 

Similarly, breaking out the HTTP raw connection setup 
from the method that actually composes and sends the 
HTTP request would be helpful in that it would greatly 
reduce the amount of code that has to be duplicated 
when subclassing the handler to customize handling of 
2xx or when specifying methods other than GET/POST.

I.e. most developers will be confused to the point of 
being overwhelmed if "how do I customize responses 
such that they don't raise" or "how do I send an 
OPTIONS or HEAD request" requires figuring out how 
to deal with setting up and sending a request via the 
much-lower-level-than-urllib2 HTTP API.
msg3832 - (view) Author: Moshe Zadka (moshez) (Python triager) Date: 2001-04-09 14:02
I'm formally postponing it until the 2.1 release comes out
-- clearly none of this can be considered a bug fix.
msg3833 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2001-09-05 18:07
Unassigning from Moshe -- he doesn't seem to have time
(Moshe, if you're still interested, just change the owner
field back to you).
msg3834 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2001-10-09 18:06
It's only six or seven months since there was an active
discussion on this bug report.  Anyone still interested in
fixing it?  I think it's reasonable to try and fix these
issues for 2.2, but I don't have time to implement it all
myself.
msg3835 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2002-04-01 21:54
I still think this is a useful feature, but I don't have 
time to champion it.  Since the original poster hasn't 
followed up in the last year, I'll just close the report.
History
Date User Action Args
2022-04-10 16:03:51 admin set github: 34137
2001-03-12 00:43:19 bbum create