classification
Title: urllib.quote throws exception on Unicode URL
Type: behavior Stage: resolved
Components: Library (Lib), Unicode Versions: Python 2.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: adamnelson, ajaksu2, collinwinter, eric.araujo, ezio.melotti, mastrodomenico, mgiuca, nagle, orsenthil, pitrou, serhiy.storchaka, vak, varmaa, vstinner
Priority: normal Keywords: easy, patch

Created on 2007-05-04 06:11 by nagle, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
urllib-quote.patch mgiuca, 2010-03-14 08:46 Patch for urllib.quote
Messages (35)
msg31944 - (view) Author: John Nagle (nagle) Date: 2007-05-04 06:11
The code in urllib.quote fails on Unicode input, when
called by robotparser with a Unicode URL.

Traceback (most recent call last):
File "./sitetruth/InfoSitePage.py", line 415, in run
pagetree = self.httpfetch() # fetch page
File "./sitetruth/InfoSitePage.py", line 368, in httpfetch
if not self.owner().checkrobotaccess(self.requestedurl) : # if access disallowed by robots.txt file
File "./sitetruth/InfoSiteContent.py", line 446, in checkrobotaccess
return(self.robotcheck.can_fetch(config.kuseragent, url)) # return can fetch
File "/usr/local/lib/python2.5/robotparser.py", line 159, in can_fetch
url = urllib.quote(urlparse.urlparse(urllib.unquote(url))[2]) or "/"
File "/usr/local/lib/python2.5/urllib.py", line 1197, in quote
res = map(safe_map.__getitem__, s)
KeyError: u'\xe2'

   That bit of code needs some attention.  
- It still assumes ASCII goes up to 255, which hasn't been true in Python for a while now.
- The initialization may not be thread-safe; a table is being initialized on first use.

"robotparser" was trying to check if a URL with a Unicode character in it was allowed.  Note the "KeyError: u'\xe2'" 
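
The failure can be reproduced in miniature. The sketch below is a Python 3 simulation, not the actual Python 2.5 source: it builds a 256-entry lookup table keyed by single bytes (standing in for Python 2 `str` characters) and lets ASCII text characters match their byte equivalents, as Python 2 did. The names `build_safe_map`, `py2_style_key`, and `table_quote` are hypothetical.

```python
import string

# Characters that never need quoting (letters, digits, and "_.-").
ALWAYS_SAFE = (string.ascii_letters + string.digits + "_.-").encode("ascii")


def build_safe_map(safe=b"/"):
    """Map each of the 256 possible byte values to itself or to %XX."""
    keep = ALWAYS_SAFE + safe
    table = {}
    for i in range(256):
        table[bytes([i])] = chr(i) if i in keep else "%%%02X" % i
    return table


def py2_style_key(item):
    """Assumption: mimic Python 2, where ASCII unicode equals the same str."""
    if isinstance(item, str) and ord(item) < 128:
        return item.encode("ascii")
    return item


def table_quote(s, safe=b"/"):
    """Quote by pure table lookup, one character (or byte) at a time."""
    table = build_safe_map(safe)
    items = [s[i:i + 1] for i in range(len(s))]
    return "".join(table[py2_style_key(item)] for item in items)


print(table_quote(b"/a b"))   # byte string input works
print(table_quote("/a b"))    # ASCII-only text happens to work too
try:
    table_quote("/caf\xe2")
except KeyError as exc:
    print("KeyError:", exc)   # non-ASCII text has no entry in the table
```

Because the table only ever holds 256 byte-sized keys, any text character outside ASCII falls through to a bare KeyError rather than a meaningful error.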
msg31945 - (view) Author: Collin Winter (collinwinter) * (Python committer) Date: 2007-06-05 23:39
Could you possibly provide a patch to fix this?
msg31946 - (view) Author: John Nagle (nagle) Date: 2007-06-06 16:49
As a workaround, you can surround calls to "can_fetch" with a try-block and catch KeyError exceptions.  That's what I'm doing.  
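
A minimal sketch of that workaround, written against Python 3's `urllib.robotparser` so that it runs today (the original module was `robotparser` in Python 2, and current Python 3 does not raise the KeyError). The helper name `can_fetch_or_default` and the choice to fall back to a default answer are assumptions, not part of the report.

```python
from urllib.robotparser import RobotFileParser


def can_fetch_or_default(parser, useragent, url, default=False):
    """Call parser.can_fetch, treating a KeyError as 'no answer'."""
    try:
        return parser.can_fetch(useragent, url)
    except KeyError:
        # This is the exception the Python 2.5 code raised for
        # non-ASCII unicode URLs; fall back to the caller's default.
        return default


parser = RobotFileParser()
parser.parse(["User-agent: *", "Disallow: /private"])
print(can_fetch_or_default(parser, "examplebot", "http://example.com/private/page"))
print(can_fetch_or_default(parser, "examplebot", "http://example.com/public/page"))
```

Defaulting to `False` is the conservative choice for a crawler, since it treats an unparseable URL as disallowed.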
msg31947 - (view) Author: Atul Varma (varmaa) Date: 2007-06-13 15:36
It should be noted that the unicode aspect of this bug is actually a recognized flaw with a nontrivial solution.  See this thread from the Python-dev list, dated from July 2006:

http://mail.python.org/pipermail/python-dev/2006-July/067248.html

It was essentially agreed upon in this thread that the "obvious" solution--simply converting to UTF-8 as per rfc3986--doesn't actually cover all cases, and that passing a unicode string in to urllib.quote() indeed has ambiguous results.  For more information, see Mike Brown's comment on the aforementioned thread:

http://mail.python.org/pipermail/python-dev/2006-July/067335.html

It was generally agreed in the thread that the proper solution was to have urllib.quote() *only* deal with standard Python string data, and to raise a TypeError if a unicode string is passed in, implying that any conversion needs to be done by higher-level code, because implicit conversion within urllib.quote() is too ambiguous.
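
To make the proposal concrete, here is a rough Python 3 sketch of a quote function that accepts only byte strings and pushes the encoding decision onto the caller. `quote_bytes_only` is a hypothetical name; it wraps the real `urllib.parse.quote_from_bytes`, and Python 3 `bytes` stands in for the Python 2 `str` type discussed here.

```python
from urllib.parse import quote_from_bytes


def quote_bytes_only(s, safe="/"):
    """Percent-encode a byte string; refuse text so callers must encode it."""
    if not isinstance(s, (bytes, bytearray)):
        raise TypeError(
            "quote_bytes_only() expects bytes, got %s" % type(s).__name__
        )
    return quote_from_bytes(bytes(s), safe=safe)


# The caller picks the encoding explicitly before quoting.
encoded = "/El Niño/".encode("utf-8")
print(quote_bytes_only(encoded))  # '/El%20Ni%C3%B1o/'

try:
    quote_bytes_only("/El Niño/")
except TypeError as exc:
    print("TypeError:", exc)
```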

However, it seems the TypeError fix was never made to the Python SVN repository; perhaps this is because it may have broken legacy code that actually catches KeyErrors as John Nagle mentioned?  Or perhaps it was simply because no one ever got around to it.  Unfortunately, I'm not in a position to say for sure, but I hope my explanation helps.
msg78153 - (view) Author: Valery (vak) Date: 2008-12-21 17:20
Hi, gurus, can anyone give a hint as to what we mortals should use in 
order to form a URL with non-ASCII symbols? We so loved the idea of 
feeding our national symbols to urllib.quote as a unicode string... and now 
we are quite disoriented... Thanks in advance for any comments!
Valery
msg78155 - (view) Author: Valery (vak) Date: 2008-12-21 17:57
(self-answer to msg78153)
the working recipe is:
http://www.nabble.com/Re:-Problem:-neither-urllib2.quote-nor-urllib.quote-encode-the--unicode-strings-arguments-p19823144.html
msg81427 - (view) Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2009-02-08 23:55
IMHO, the TypeError would be a bugfix for 2.6.x. A urllib.quote_unicode
could be provided (in 2.7) to match urllib.parse.quote in 3.0 and the OP
usecase.

I can provide a simple patch, but I'm afraid the OP's remarks about
ASCII-range and thread-safety wouldn't be handled at all.
msg81828 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-12 22:14
quote() works in Python3 with any bytes string (not only ASCII) and 
any unicode string:

Python 3.1a0 (py3k:69105M, Feb  3 2009, 15:04:35)
>>> from urllib.parse import quote
>>> quote('é')
'%C3%A9'
>>> quote('\xe9')
'%C3%A9'
>>> quote('\xe9'.encode('utf-8'))
'%C3%A9'
>>> quote('\xe9'.encode('latin-1'))
'%E9'
msg86310 - (view) Author: John Nagle (nagle) Date: 2009-04-22 17:46
Note that the problem can't be solved by telling end users to call a
different "quote" function.  The problem is down inside a library
module. "robotparser" is calling "urllib.quote". One of those two
library modules needs to be fixed.
msg88367 - (view) Author: Matt Giuca (mgiuca) Date: 2009-05-26 16:14
The issue of urllib.quote was discussed at extreme length in issue 3300,
which was specific to Python 3.
http://bugs.python.org/issue3300

In the end, I rewrote the entire family of urllib.quote and unquote
functions; they're now Unicode compliant and accept additional encoding
and errors arguments to handle this.

They were never backported to the 2.x branch; maybe we should do so.
Note that the code is quite different and considerably more complex due
to all the new issues with Unicode strings.
msg101043 - (view) Author: Matt Giuca (mgiuca) Date: 2010-03-14 08:46
I've finally gotten around to a complete analysis of this code. I have a code/test/documentation patch which fixes the issue without any code breakage.

There is another bug in quote which I've found and fixed with this patch: If the 'safe' parameter is unicode, it raises a UnicodeDecodeError.

I have backported all of the 'quote' test cases from Python 3 (which I wrote) to Python 2. This exposed the reported bug as well as the above one. It's good to have a much larger set of test cases to work with. It tests things like all combinations of str/unicode, as well as non-ASCII byte string input and all manner of unicode inputs.

The bugfix itself comes from Python 3 (this has already been approved, over many months, by Guido, so I am hoping a similar change can get pushed through into Python 2 fairly easily). The solution is to add "encoding" and "errors" arguments to 'quote', and have quote encode the unicode string before anything else. 'encoding' defaults to 'utf-8'. So:

>>> quote(u'/El Niño/')
'/El%20Ni%C3%B1o/'

which is typically the desired behaviour. (Note that URI syntax does not cover Unicode strings; it merely says to encode them with some encoding, recommended but not required UTF-8, and then percent-encode those.)
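
The same "encoding" and "errors" arguments exist on Python 3's `urllib.parse.quote`, which the patch was backported from, so the behaviour can be illustrated there. This is a Python 3 sketch rather than the Python 2 patch itself.

```python
from urllib.parse import quote

path = "/El Niño/"

print(quote(path))                       # '/El%20Ni%C3%B1o/' (UTF-8 default)
print(quote(path, encoding="latin-1"))   # '/El%20Ni%F1o/'

# 'errors' controls what happens when the encoding cannot represent a character.
print(quote(path, encoding="ascii", errors="replace"))  # '/El%20Ni%3Fo/'

try:
    quote(path, encoding="ascii")        # errors defaults to 'strict'
except UnicodeEncodeError as exc:
    print("cannot encode:", exc.reason)
```

In every case the result is a plain ASCII string, which matches the point that a URI is by definition ASCII.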

With this patch, quote *always* returns a str, even on unicode input. I think that makes sense, because a URI is, by definition, an ASCII string. It could easily be made to return a unicode instead.

The other fix is for 'safe'. If 'safe' is a byte string we don't touch it. But if it is a Unicode string, we throw away all non-ASCII bytes. This means you can't make *characters* safe, only *bytes*, since URIs deal with bytes. In Python 3, we go further and throw away all non-ASCII bytes from 'safe' as well, so you can only make ASCII bytes safe. For this patch, I didn't go that far, for backwards compatibility reasons.
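
For comparison, this is how the Python 3 rule for 'safe' plays out in practice. The sketch uses Python 3's `urllib.parse.quote`, where non-ASCII entries in 'safe' are silently dropped whether 'safe' is text or bytes; it does not reproduce the more lenient Python 2 patch.

```python
from urllib.parse import quote

print(quote("a b/c"))            # 'a%20b/c'   ('/' is safe by default)
print(quote("a b/c", safe=""))   # 'a%20b%2Fc' (nothing extra is safe)

# A non-ASCII character listed in 'safe' is ignored, so it is still quoted.
print(quote("año/", safe="/ñ"))                   # 'a%C3%B1o/'
print(quote("año/", safe="/ñ".encode("utf-8")))   # 'a%C3%B1o/'
```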

Also updated documentation.

In summary, this patch makes 'quote' fully Unicode compliant. It does not change any existing behaviour which wouldn't previously have thrown an exception, so it can't possibly break any existing code (unless it's relying on the exception being thrown).

(A minor change I made was replacing the line "cachekey = (safe, always_safe)" with "cachekey = safe". This avoids unnecessary work of hashing always_safe and the tuple, since always_safe doesn't change. It doesn't affect the behaviour.)

Note: I've also backported the 'unquote' test cases from Python 3 and found a few more bugs. I'm going to report them separately, with patches.
msg107104 - (view) Author: AdamN (adamnelson) Date: 2010-06-04 21:22
Nudge.  Somebody with the authority needs to increment the stage to "patch review".
msg110587 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-07-17 17:58
I've run an eye over this and don't see any problems, particularly in the light of msg101043.  Only 2.7 is affected as the fix has been backported from py3k.  Please can we go forward with this.
msg110623 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-07-18 02:29
Incidentally, I was working on this yesterday. Some minor changes were required in the patch as quote had undergone changes. 

Fixed and committed in revision 82940.

Thanks to Matt Giuca for this.
msg110702 - (view) Author: Matt Giuca (mgiuca) Date: 2010-07-19 01:21
Thanks for doing that, Senthil.
msg110730 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-19 10:53
Senthil, have you read my comment on python-checkins?
Couldn't this have been fixed without introducing a new API in a bugfix branch?
msg110731 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-07-19 11:06
I just checked your comment in the checkins list.

I saw this as a bug fix that led to a change in the signature of the quote function, still in a backward-compatible way.
 
Should we still not do it? 

I understood only feature requests and behavior changes are disallowed in bug-fix branch.
msg110732 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-19 11:14
> I understood only feature requests and behavior changes are disallowed
> in bug-fix branch.

Well, isn't it a new feature you're adding?
msg110733 - (view) Author: Matt Giuca (mgiuca) Date: 2010-07-19 11:18
From http://mail.python.org/pipermail/python-checkins/2010-July/095350.html:
> Looking at the issue (which in itself was quite old), you could as well
> have fixed the robotparser module instead.

It isn't an issue with robotparser. The original reporter found it via robotparser, but it's nothing to do with that module. I found it independently and I would have reported it separately if it hadn't already been here.

It's definitely a bug in urllib (as shown by my extensive new test cases).
msg110734 - (view) Author: Matt Giuca (mgiuca) Date: 2010-07-19 11:21
> Well, isn't it a new feature you're adding?

You had a function which raised a confusing and unintentional KeyError when given non-ASCII Unicode input. Now it doesn't. That's the bug fix part.

What I assume you're referring to as a "new feature" is the new arguments. I'd say they're unfortunately necessary in fixing this bug, as the fix requires encoding the non-ASCII unicode characters with some encoding, and it's (arguably) necessary to give the programmer the choice of encoding, with sensible defaults.
msg110735 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-19 11:25
> It's definitely a bug in urllib

A bug in what way? Up to 2.6 (*), the docs state nothing about the type of the "string" parameter.
(*) http://docs.python.org/release/2.6.5/library/urllib.html#urllib.quote

I think everyone assumed that the parameter should be a "str" object and nothing else. Apparently some people used it accidentally with some unicode strings and it "worked" until these strings contained non-ASCII characters. But it's a side-effect of how 2.x unicode strings work, and it doesn't seem to me quote() was ever intended to accept unicode strings.

If we were following you, we would add "encoding" and "errors" arguments to any str-accepting 2.x function, so that it can also accept unicode strings. That's certainly not a reasonable solution.
msg110737 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-07-19 11:45
Well, my understanding was Type:behavior was a bug fix and Type: feature request was a new feature request, which may change some underlying behavior. I thought this issue was on the border.

The robotparser using this might be just one usage indicator, but having this capability in the quote definitely helps. And this could not have been done without changing the signature.

Ideally, this could have gone in 2.7, but I missed it.  Personally, I am still +1 in having this in 2.7.1. Is it undesirable? Does it need wider discussion?
msg110738 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-19 12:01
> Well, my understanding was Type:behavior was a bug fix and Type:
> feature request was a new feature request, which may change some
> underlying behavior. I thought this issue was on the border.

The original issue is against robotparser, and clearly states a bug
(robotparser doesn't work in some cases).
But solving a bug by adding a feature isn't appropriate for a bugfix
release.

You shouldn't look at how the issue is classified. What's important is
what the actual *patch* does.

A patch doesn't have to change existing behaviour to be considered a
feature. That's a misconception. Feature releases try to be
forward-compatible as well (if I use urllib.quote() in 2.Y, it will
still work in 2.Y+1).

Adding API parameters, or accepting additional types in an existing API,
is clearly a new feature.

> Ideally, this could have gone in 2.7, but I missed it.  Personally, I
> am still +1 in having this in 2.7.1. Is it undesirable? Does it need
> wider discussion?

We can certainly make exceptions from time to time but only when there's
a strong argument for it (e.g. a security issue). There doesn't seem to
be an urgency to make urllib.quote() work with non-ASCII unicode strings
in 2.7.1, while it didn't before anyway.

Furthermore, the core issue is the automatic coercion between unicode
and 8-bit strings in 2.x. Many APIs are affected by this, urllib.quote()
shouldn't be considered a special case.
msg110740 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-07-19 12:13
On Mon, Jul 19, 2010 at 11:25:30AM +0000, Antoine Pitrou wrote:
> If we were following you, we would add "encoding" and "errors"
> arguments to any str-accepting 2.x function, so that it can also
> accept unicode strings. That's certainly not a reasonable solution.

I don't think Matt is indicating that. He means only that the quote
function can take a unicode string as perfectly valid string input.
In the original py3k bug too, there was a big discussion on this very
same topic.
msg110744 - (view) Author: Matt Giuca (mgiuca) Date: 2010-07-19 12:53
> I think everyone assumed that the parameter should be a "str" object
> and nothing else. Apparently some people used it accidentally with
> some unicode strings and it "worked" until these strings contained
> non-ASCII characters.

I don't consider use of Unicode strings in Python 2.7 to be "accidental". In my experience with Python 2, pretty much everything already works with Unicode strings, and it's best practice to use them.

Now one of the major goals of Python 2.6/2.7 is to allow the writing of code which ports smoothly to Python 3. Unicode support is a major issue here. To quote "What's new in Python 3" (http://docs.python.org/py3k/whatsnew/3.0.html):
"To be prepared in Python 2.x, start using unicode for all unencoded text, and str for binary or encoded data only. Then the 2to3  tool will do most of the work for you."
Having functions in Python 2.7 which don't accept Unicode (or worse, raise random exceptions) runs against best practices for moving to Python 3.

> If we were following you, we would add "encoding" and "errors" arguments
> to any str-accepting 2.x function, so that it can also accept unicode
> strings. That's certainly not a reasonable solution.

No, that's certainly not necessary. You don't need an "encoding" or "errors" argument on any given function in order to support unicode. In fact, most code written to work with strings naturally works with Unicode because unicode strings support the same basic operations.

The need for an "encoding" and "errors", and in fact the need to deal with string encoding at all with urllib.quote is due to the special nature of URLs. If URLs had a syntax like %uXXXX then there would be no need for encoding Unicode strings (as in UTF-8) at all. However, because the RFC specifies that Unicode strings are to be encoded into a byte sequence *using an unspecified encoding*, it is therefore necessary, for this specific function, to ask the programmer which encoding to use.
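
A small Python 3 illustration of why the encoding has to be chosen by someone: the same character produces different percent-encoded forms under different encodings, and decoding only round-trips when both sides agree. This uses the real `urllib.parse.quote` and `urllib.parse.unquote`; the variable names are illustrative only.

```python
from urllib.parse import quote, unquote

text = "é"
as_utf8 = quote(text, encoding="utf-8")      # '%C3%A9'
as_latin1 = quote(text, encoding="latin-1")  # '%E9'
print(as_utf8, as_latin1)

# Round trip succeeds when the decoder uses the encoder's encoding.
print(unquote(as_latin1, encoding="latin-1") == text)

# With the UTF-8 default, the lone byte 0xE9 is not valid and is replaced.
print(repr(unquote(as_latin1)))
```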

Thus I assure you, this is not just one random function I have picked to add these arguments to. This is the only one (that I know of) that requires them to support Unicode.

> The original issue is against robotparser, and clearly states a bug
> (robotparser doesn't work in some cases).

I don't know why this keeps coming back to robotparser. The original bug was not against robotparser; it is called "quote throws exception on Unicode URL" and that is the bug. Robotparser was just one demonstrative piece of code which failed because of it.

Having said that, I don't expect to continue this argument. If you (the Python developers) decide that it's too late to accept this, then I won't object to reverting it.
msg110746 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-19 13:21
> Now one of the major goals of Python 2.6/2.7 is to allow the writing
> of code which ports smoothly to Python 3. Unicode support is a major
> issue here.

I understand the argument. But 2.7 is a bugfix branch and shouldn't
receive new features, even backports. If we wanted 2.x to converge
further into 3.x, we would do a 2.8, which we have decided not to do.

> I don't consider use of Unicode strings in Python 2.7 to be
> "accidental". In my experience with Python 2, pretty much everything
> already works with Unicode strings, and it's best practice to use
> them.

Not true. From the urllib module itself:

$ touch /tmp/hé
$ python -c 'import urllib; urllib.urlretrieve("file:///tmp/hé")'
$ python -c 'import urllib; urllib.urlretrieve(u"file:///tmp/hé")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib64/python2.6/urllib.py", line 93, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "/usr/lib64/python2.6/urllib.py", line 225, in retrieve
    url = unwrap(toBytes(url))
  File "/usr/lib64/python2.6/urllib.py", line 1027, in toBytes
    " contains non-ASCII characters")
UnicodeError: URL u'file:///tmp/h\xc3\xa9' contains non-ASCII characters

> Having functions in Python 2.7 which don't accept Unicode (or worse,
> raise random exceptions) runs against best practices for moving to
> Python 3.

There are lots of them, and urllib.quote() isn't an exception:

'x\x9c\xcbH\x04\x00\x013\x00\xca'
>>> zlib.compress(u"hà")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe0' in position 1: ordinal not in range(128)

pwd.struct_passwd(pw_name='root', pw_passwd='x', pw_uid=0, pw_gid=0, pw_gecos='root', pw_dir='/root', pw_shell='/bin/bash')
>>> pwd.getpwnam(u"rooté")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 4: ordinal not in range(128)

> In fact, most code written to work with strings naturally works with
> Unicode because unicode strings support the same basic operations.

What should zlib compression of a unicode string result in?

> > The original issue is against robotparser, and clearly states a bug
> > (robotparser doesn't work in some cases).
> 
> I don't know why this keeps coming back to robotparser. The original
> bug was not against robotparser; it is called "quote throws exception
> on Unicode URL" and that is the bug. Robotparser was just one
> demonstrative piece of code which failed because of it.

Well, there are two different concerns:
- robotparser fails on certain Web pages, which is a bug (unless the Web
pages are clearly malformed)
- urllib.quote() should accept any kind of unicode strings, and perform
appropriate encoding, with an ability to override default encoding
parameters: this is a feature request

The OP himself (John Nagle) said:
“The problem is down inside a library module. "robotparser" is calling
"urllib.quote". One of those two library modules needs to be fixed.”

It seems to imply that the primary concern was robotparser not working.
msg110748 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-19 13:34
Sorry, the email gateway of Roundup ate half of my snippets.
Here they are again:

>>> zlib.compress(u"ha")
'x\x9c\xcbH\x04\x00\x013\x00\xca'
>>> zlib.compress(u"hà")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe0' in position 1: ordinal not in range(128)

>>> pwd.getpwnam(u"root")
pwd.struct_passwd(pw_name='root', pw_passwd='x', pw_uid=0, pw_gid=0, pw_gecos='root', pw_dir='/root', pw_shell='/bin/bash')
>>> pwd.getpwnam(u"rooté")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 4: ordinal not in range(128)
msg110759 - (view) Author: Matt Giuca (mgiuca) Date: 2010-07-19 14:26
OK sure, there are some other things broken, but they are mostly not dealing with string data, but binary data (for example, zlib expects a sequence of bytes, not characters).

Just one quick point:

> urllib.urlretrieve("file:///tmp/hé")
> UnicodeError: URL u'file:///tmp/h\xc3\xa9' contains non-ASCII characters

That's precisely correct behaviour. URLs are not allowed to contain non-ASCII characters (that's the whole point of urllib.quote). urllib.quote should accept non-ASCII characters (for conversion into ASCII strings). Other URL processing functions should not accept non-ASCII characters, since they aren't valid URIs.
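
A short Python 3 sketch of that division of labour: the raw path is not ASCII, but after passing through quote the assembled URL is. The `file://` prefix and path are taken from the example above; the rest is illustrative.

```python
from urllib.parse import quote

raw_path = "/tmp/hé"

url = "file://" + quote(raw_path)
url.encode("ascii")          # succeeds: the quoted URL is pure ASCII
print(url)                   # 'file:///tmp/h%C3%A9'

try:
    ("file://" + raw_path).encode("ascii")
except UnicodeEncodeError:
    print("unquoted URL is not ASCII")
```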
msg110782 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-07-19 16:59
There are two points here:

First, is it a desired behavior of quote function in 2.7?

IMO, it is. In the discussions of issue3300, I think, it was decided that quote handling of unicode strings may be backported. Behaviour wise the modified version still returns a string which is correct for py2.

The forward compatibility on 2.7.1 version here is on the basis that someone in 2.7 might be relying on Exception raised for Unicode string "for the quote function".
 
But my guess is that when they quote non-ASCII characters in a path component with the quote function, they expect it to give correct output, and the (now) modified function would be of help.

Of the many cases we have on Unicode being auto-coerced to 8-bit string, this particular case of using UTF-8 as default encoding for Unicode and returning a string seems to be fine (again discussed in the earlier issue). We might not have good answers for many other cases.

The Second point, as this is leading to an API change we should not have it 2.7.1

It would be unfortunate, if we revert the patch on this account only. 

This can be classified a bug-fix producing the desirable behavior, just that it needs the API to change too. I don't know if we have never adopted this approach (of changing API in backward compatible manner) for anything other than the security bug fixes alone.
msg110786 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-19 17:27
> The forward compatibility on 2.7.1 version here is on the basis that
> someone in 2.7 might be relying on Exception raised for Unicode string
> "for the quote function".

Again, the problem isn't compatibility. It is, simply, that we shouldn't
add new features in a bugfix branch.

> The Second point, as this is leading to an API change we should not
> have it 2.7.1
> 
> It would be unfortunate, if we revert the patch on this account only. 

Let me put it differently: if this rule didn't exist, there would be no
point in having bugfix branches, since everyone would commit their
favourite new features to bugfix branches.

There are many things which were too late for 2.7, and nobody is trying
to make a case of adding them to 2.7.1.

> I don't know if we have never adopted this approach (of changing API
> in backward compatible manner) for anything other than the security
> bug fixes alone.

We have done it a couple of times in early 3.0 and even 3.1 versions,
but that was really exceptional, and 3.x allowed us to relax some of the
rules since it was so little used at the time.
msg110898 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-07-20 13:55
I agree with the points raised by Antoine. Also, yesterday on IRC, Eric Smith mentioned that if someone uses these new parameters in 2.7.1, their code may not work with 2.7 (that would obviously be undesirable behavior). So it is better to leave the exception in place; py3k has the correct behavior anyway.

I shall revert the patch from 2.7.1.
msg111145 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-07-22 02:08
Reverted the checkin in revision 83045.

For the robotparser issue, one of these two can be adopted.

1. Fix it by encoding the unicode URL using utf-8, strict.
2. Catch the KeyError exception and raise a TypeError exception from the robotparser module informing the user that Unicode URLs are not allowed, so that users can handle it at the application end and send 8-bit strings.

I prefer 2.
msg111147 - (view) Author: Matt Giuca (mgiuca) Date: 2010-07-22 02:18
If you're going the way of option 2, I would strongly advise against relying on the KeyError. The fact that a KeyError is raised by urllib.quote is not part of its specification; it's a bug/quirk in the implementation (which is now unlikely to be changed, but it's unsafe to rely on it).

Robotparser should encode the string, if and only if it is a unicode string, with ('ascii', 'strict'), catch the UnicodeEncodeError, and raise the TypeError you suggested. This will have precisely the same behaviour as your proposed option 2 (will work fine for byte strings and Unicode strings with ASCII-only characters, but raise a TypeError on Unicode strings with non-ASCII characters) without relying on the KeyError from urllib.quote.
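
The check described above could look roughly like this. It is a Python 3 sketch in which `str` plays the role of Python 2's `unicode` and `bytes` the role of Python 2's `str`; `require_ascii_url` is a hypothetical helper, not robotparser code.

```python
def require_ascii_url(url):
    """Return url as an ASCII byte string, or raise TypeError."""
    if isinstance(url, str):
        try:
            return url.encode("ascii", "strict")
        except UnicodeEncodeError:
            raise TypeError(
                "URL must contain only ASCII characters: %r" % url
            ) from None
    # Byte strings are passed through untouched.
    return url


print(require_ascii_url("http://example.com/a%20b"))
print(require_ascii_url(b"http://example.com/a%20b"))
try:
    require_ascii_url("http://example.com/h\xe9")
except TypeError as exc:
    print("TypeError:", exc)
```

The caller sees the same TypeError as in option 2, but the check no longer depends on how urllib.quote happens to fail.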
msg185513 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2013-03-29 20:07
A lot of work has already been done on this issue.  If this is likely to get into 2.7, then fine, leave it open; if not, can this be closed?
msg370463 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-05-31 14:29
Python 2.7 is no longer supported.
History
Date User Action Args
2022-04-11 14:56:24 admin set github: 44927
2020-05-31 14:29:43 serhiy.storchaka set status: open -> closed

nosy: + serhiy.storchaka
messages: + msg370463

resolution: accepted -> out of date
2014-02-03 18:38:01 BreamoreBoy set nosy: - BreamoreBoy
2013-03-29 20:07:02 BreamoreBoy set messages: + msg185513
2010-07-22 02:18:54 mgiuca set messages: + msg111147
2010-07-22 02:08:29 orsenthil set status: closed -> open
resolution: fixed -> accepted
messages: + msg111145
2010-07-20 13:55:25 orsenthil set messages: + msg110898
2010-07-19 17:27:59 pitrou set messages: + msg110786
2010-07-19 16:59:22 orsenthil set messages: + msg110782
2010-07-19 14:26:31 mgiuca set messages: + msg110759
2010-07-19 13:34:08 pitrou set messages: + msg110748
2010-07-19 13:21:59 pitrou set messages: + msg110746
2010-07-19 12:53:26 mgiuca set messages: + msg110744
2010-07-19 12:13:25 orsenthil set messages: + msg110740
2010-07-19 12:01:00 pitrou set messages: + msg110738
2010-07-19 11:45:06 orsenthil set messages: + msg110737
2010-07-19 11:25:29 pitrou set messages: + msg110735
2010-07-19 11:21:33 mgiuca set messages: + msg110734
2010-07-19 11:18:03 mgiuca set messages: + msg110733
2010-07-19 11:14:47 pitrou set messages: + msg110732
2010-07-19 11:06:22 orsenthil set messages: + msg110731
2010-07-19 10:53:50 pitrou set nosy: + pitrou
messages: + msg110730
2010-07-19 01:21:01 mgiuca set messages: + msg110702
2010-07-18 02:29:08 orsenthil set status: open -> closed
resolution: accepted -> fixed
messages: + msg110623

stage: patch review -> resolved
2010-07-17 17:58:18 BreamoreBoy set nosy: + BreamoreBoy

messages: + msg110587
versions: - Python 2.6
2010-06-04 21:58:53 ezio.melotti set stage: test needed -> patch review
2010-06-04 21:22:35 adamnelson set nosy: + adamnelson
messages: + msg107104
2010-04-26 22:27:59 mastrodomenico set nosy: + mastrodomenico
2010-03-14 12:31:32 ezio.melotti set nosy: + ezio.melotti
2010-03-14 08:46:35 mgiuca set files: + urllib-quote.patch
keywords: + patch
messages: + msg101043
2010-02-20 00:40:37 flox unlink issue2637 dependencies
2010-02-16 04:34:54 eric.araujo set nosy: + eric.araujo
2009-08-09 01:32:17 orsenthil set type: behavior
resolution: accepted
assignee: orsenthil
2009-05-26 16:14:19 mgiuca set nosy: + mgiuca
messages: + msg88367
2009-04-22 17:46:12 nagle set messages: + msg86310
2009-04-22 17:22:44 ajaksu2 set keywords: + easy
2009-02-13 02:00:52 ajaksu2 set nosy: + orsenthil
stage: test needed
2009-02-12 22:14:02 vstinner set messages: + msg81828
2009-02-12 21:58:13 vstinner set nosy: + vstinner
components: + Unicode
2009-02-12 18:59:00 ajaksu2 link issue2637 dependencies
2009-02-08 23:55:06 ajaksu2 set nosy: + ajaksu2
messages: + msg81427
versions: + Python 2.6, Python 2.7
2008-12-21 17:57:26 vak set messages: + msg78155
2008-12-21 17:20:15 vak set nosy: + vak
messages: + msg78153
2007-05-04 06:11:55 nagle create