classification
Title: urllib2.Request's headers are case-sens.
Type: Stage:
Components: Library (Lib) Versions: Python 2.3
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: brett.cannon Nosy List: brett.cannon, jjlee, loewis
Priority: normal Keywords: patch

Created on 2002-12-06 21:26 by jjlee, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
case_diff jjlee, 2002-12-08 17:18
capitalize_patch jjlee, 2003-05-10 11:55
urllib2_header_case.diff brett.cannon, 2003-05-11 06:06 2003-05-10: by bcannon
addheaders_patch jjlee, 2003-06-17 13:06
Messages (14)
msg41893 - Author: John J Lee (jjlee) Date: 2002-12-06 21:26
urllib2.Request's headers are case-sensitive.

This is unfortunate if, for example, you add a content-type header
like so:

req = urllib2.Request("http://blah/", data,
                      headers={"Content-Type": "text/ugly"})

because, while urllib2.AbstractHTTPHandler is careful to check not to
add this header if it's already in the Request, it happens to use a
different case convention:

                if not req.headers.has_key('Content-type'):
                    h.putheader('Content-type',
                                'application/x-www-form-urlencoded')

so you get both headers:

Content-Type: text/ugly
Content-type: application/x-www-form-urlencoded

in essentially random order.  The documentation says:

"""Note that there cannot be more than one header with the same name,
and later calls will overwrite previous calls in case the key
collides.  Currently, this is no loss of functionality, since all
headers which have meaning when used more than once have a
(header-specific) way of gaining the same functionality using only one
header."""

RFC 2616 (section 4.2) says:

"""The order in which header fields with the same field-name are
received is therefore significant to the interpretation of the
combined field value, and thus a proxy MUST NOT change the order of
these field values when a message is forwarded."""

The patch fixes this by adding normalisation of header case to
urllib2.Request.  With the patch, you'd get:

Content-type: text/ugly
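
To make the failure mode concrete, here is a minimal sketch (not the attached patch). A plain dict stands in for Request.headers, and normalise_headers is a hypothetical helper name; the only real API used is str.capitalize().

```python
# Sketch only: a plain dict stands in for Request.headers.
def normalise_headers(headers):
    """Return a copy of `headers` with each name passed through str.capitalize()."""
    normalised = {}
    for name, value in headers.items():
        normalised[name.capitalize()] = value
    return normalised


# Case-sensitive lookup misses the caller's header, so the default is added too.
raw = {"Content-Type": "text/ugly"}
if "Content-type" not in raw:
    raw["Content-type"] = "application/x-www-form-urlencoded"
print(sorted(raw))  # two distinct keys differing only in case

# With normalised names the existing header is found and the default is skipped.
fixed = normalise_headers({"Content-Type": "text/ugly"})
if "Content-type" not in fixed:
    fixed["Content-type"] = "application/x-www-form-urlencoded"
print(fixed)
```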


John
msg41894 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2002-12-07 08:43
There's no uploaded file!  You have to check the
checkbox labeled "Check to Upload & Attach File"
when you upload a file.

Please try again.

(This is a SourceForge annoyance that we can do
nothing about. :-( )
msg41895 - Author: John J Lee (jjlee) Date: 2002-12-08 17:18
Here it is.

I swear I did check the box.  I clicked the button twice, though --
I guess SF doesn't like that.


John
msg41896 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2003-05-09 01:38
Do you think this would also work, John, if instead of having 
normalise_header_case you did name.title()?
msg41897 - Author: John J Lee (jjlee) Date: 2003-05-09 12:41
Ooh, look at all those string methods I'd forgotten about. 
 
Yes, good idea, but name.capitalize() would be simpler and minutely 
more conservative (the module already uses that convention), hence 
better. 
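
For reference, a small sketch of how the two string methods differ on header names. The sample names are illustrative only; both methods are standard str methods.

```python
# title() capitalises the first letter of every word (hyphens split words);
# capitalize() upper-cases only the first character and lower-cases the rest.
names = ["content-type", "Content-Type", "USER-AGENT"]

titled = [name.title() for name in names]
capitalized = [name.capitalize() for name in names]

for original, t, c in zip(names, titled, capitalized):
    print(original, "->", t, "|", c)
```

Either method maps all three spellings of a name to a single key; capitalize() gives the "Content-type" spelling that the module already uses.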
 
msg41898 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2003-05-09 23:46
OK.  If you can rewrite the patch using capitalize, I will take a look and 
decide whether to apply it or not.

Also, if this will require changes to the docs please also include a patch for 
that.
msg41899 - Author: John J Lee (jjlee) Date: 2003-05-10 11:55
Patch is attached (no doc changes required). 
 
msg41900 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2003-05-11 06:06
So the patch looks almost perfect.  There are only two things that I would 
change.  One is how you iterate over the dictionary.  It is better to use 
header.iteritems than header.items.  Second, I would not do the 
capitalization in __init__ directly.  Instead, to match the expectation of the 
docs ("headers should be a dictionary, and will be treated as if 
add_header() was called with each key and value as arguments") you should 
just call self.add_header(k, v) in the loop.  This will reduce code redundancy 
and if for some reason add_header is changed no one will have to worry 
about changing __init__ at the same time.

But otherwise the patch looks good.  I have uploaded a corrected version of 
the patch.  Have a look and let me know if it works for you.
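
A rough sketch of the shape described above, under stated assumptions: SketchRequest is a hypothetical stand-in that models only header handling, not the real urllib2.Request, and it is written for current Python, where dict.items() takes the place of iteritems().

```python
class SketchRequest:
    """Hypothetical stand-in for urllib2.Request; only headers are modelled."""

    def __init__(self, url, data=None, headers=None):
        self.url = url
        self.data = data
        self.headers = {}
        # Delegate to add_header so the normalisation rule lives in one place.
        for key, value in (headers or {}).items():
            self.add_header(key, value)

    def add_header(self, key, val):
        # Normalise the name; a later call with the same name overwrites.
        self.headers[key.capitalize()] = val


req = SketchRequest("http://blah/", headers={"Content-Type": "text/ugly"})
req.add_header("CONTENT-TYPE", "text/plain")
print(req.headers)
```

Because __init__ never touches the key itself, changing the normalisation rule later means editing add_header only.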
msg41901 - Author: John J Lee (jjlee) Date: 2003-05-11 11:19
I used items rather than iteritems because that's what 
the rest of the module does, so maybe you want to 
look at the other 5 instances of that if you use 
iteritems. 
 
Otherwise, fine. 
 
msg41902 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2003-05-11 23:27
OK, I fixed the 'items' calls in my local copy of the file.  I am going to get 
someone to double-check this patch and if they give me the all-clear I will 
apply it.
msg41903 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2003-05-12 06:56
OK, the idea of the patch was cleared.  I will apply it some time this week.
msg41904 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2003-05-12 07:30
Applied as urllib2.py 1.43.
msg41905 - Author: John J Lee (jjlee) Date: 2003-06-17 13:06
OpenerDirector.addheaders is another source of 
headers, on top of the ones provided by 
Request.headers and those hard-coded in 
AbstractHTTPHandler.do_open. 
 
These headers should be compared case-insensitively, 
just as the others are.  The patch 
I just attached does this. 
 
Since all the other headers are .capitalize()d, 
this patch also changes the default value of 
addheaders back to "User-agent" (reversing 
patch 599836). 
 
This really needs to be fixed before 2.3 final. 
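
A minimal sketch of the comparison being asked for, not the attached patch. merge_default_headers is a hypothetical helper, the header values are placeholders, and request_headers is assumed to hold names already normalised with capitalize().

```python
def merge_default_headers(request_headers, addheaders):
    """Add each (name, value) default unless the request already sets that name.

    Names are compared after capitalize(), so "User-Agent" and "User-agent"
    count as the same header.
    """
    merged = dict(request_headers)
    for name, value in addheaders:
        name = name.capitalize()
        if name not in merged:
            merged[name] = value
    return merged


# Placeholder values; only the differing case of the names matters here.
defaults = [("User-Agent", "default-agent/0.0")]
print(merge_default_headers({"User-agent": "mine/1.0"}, defaults))
print(merge_default_headers({}, defaults))
```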
msg41906 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2003-06-17 21:53
Done as revision 1.51.
History
Date User Action Args
2022-04-10 16:05:58 admin set github: 37576
2002-12-06 21:26:28 jjlee create