
classification
Title: urllib/urllib2(?) timeouts
Type: Stage:
Components: Documentation Versions: Python 2.3
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: bpettersen, brett.cannon, sonderblade
Priority: normal Keywords:

Created on 2003-09-10 07:43 by bpettersen, last changed 2022-04-10 16:11 by admin. This issue is now closed.

Messages (3)
msg18124 - (view) Author: Bjorn Pettersen (bpettersen) Date: 2003-09-10 07:43
Could Skip's example from
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&selm=mailman.1051711375.3445.python-list%40python.org
be added to the documentation?

(slightly re-worded)
You can set the default timeout for sockets before 
opening the URL (use 0.0 for non-blocking sockets):

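    import socket
    import urllib2

    mytimeout = 5.0  # seconds; example value only, choose your own

    # remember the current default so it can be restored afterwards
    dto = socket.getdefaulttimeout()
    socket.setdefaulttimeout(mytimeout)
    try:
        try:
            f = urllib2.urlopen("http://www.python.org/")
            x = f.read()
        except socket.error, msg:  ### Note2
            print "timeout"
    finally:
        # restore default
        socket.setdefaulttimeout(dto)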
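
The save/restore pattern above can also be packaged so the restore step cannot be forgotten. The following is only a sketch in Python 3 syntax, not the Python 2.3 discussed in this report; the name `default_socket_timeout` is hypothetical and not part of any standard module. Note that the default timeout is process-wide, so other threads that create sockets inside the block see it too.

```python
import contextlib
import socket


@contextlib.contextmanager
def default_socket_timeout(seconds):
    """Temporarily change the process-wide default socket timeout.

    Hypothetical helper: the previous default is restored on exit,
    even if the body of the with-block raises.
    """
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        yield
    finally:
        # restore whatever default was in effect before
        socket.setdefaulttimeout(previous)
```

Code that opens URLs would then go inside `with default_socket_timeout(mytimeout):`, where `mytimeout` is the caller's own value.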

Note1: urllib, urllib2, or both? (and should there be 
a "see also: urllib2" in the urllib docs?)

Note2: Experimenting (and digressing -- it's late :-)

  urllib.urlopen, timeout=0.0: IOError(?) [1]
  urllib.urlopen, timeout=0.1: IOError(?) [2]
  urllib2.urlopen, timeout=0.0: urllib2.URLError [3]
  urllib2.urlopen, timeout=0.1: socket.timeout

I can understand the last one... sort of. How a module
reports exceptions is part of its interface, so I'd expect
to be able to write "except urllib.exception" (modulo
consistent naming) to catch all reasonable[0] exceptions
resulting from calling into the module, and
"except urllib.timeout" to catch timeouts.
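
To illustrate the kind of uniform interface being asked for, here is a rough sketch in Python 3 syntax (so `urllib.request` and `urllib.error` rather than the Python 2.3 modules in this report). The names `FetchTimeout`, `is_timeout`, and `open_url` are hypothetical, and the `opener` parameter exists only so the wrapper can be exercised without a network connection. It is one possible approach, not something the library provides.

```python
import socket
import urllib.error
import urllib.request


class FetchTimeout(Exception):
    """Hypothetical single exception type for any timeout while fetching."""


def is_timeout(exc):
    """Return True if exc is a timeout, raised directly or wrapped in URLError."""
    if isinstance(exc, socket.timeout):
        return True
    if isinstance(exc, urllib.error.URLError):
        return isinstance(exc.reason, socket.timeout)
    return False


def open_url(url, timeout, opener=urllib.request.urlopen):
    """Fetch url, turning every flavour of timeout into FetchTimeout.

    Non-timeout errors are re-raised unchanged.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except (socket.timeout, urllib.error.URLError) as exc:
        if is_timeout(exc):
            raise FetchTimeout(url) from exc
        raise
```

A caller then needs only `except FetchTimeout:` for timeouts, whichever layer detected them.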

I can see the argument that the user set the timeout in 
the socket module, so should expect them, but then 
handling timeouts is routine for web programming...

IOError is not related to socket.timeout, so I'm not sure 
why it's left free to escape, especially with that errno :-)

-- bjorn

[0] of course not urllib.MemoryError or other 
catastrophic events

[1]
  File "E:\Python23\lib\httplib.py", line 564, in send
    self.connect()
  File "E:\Python23\lib\httplib.py", line 548, in connect
    raise socket.error, msg <--### see below
IOError: [Errno socket error] (10035, 'The socket operation could not complete without blocking')
>>> x = 0
>>> try:
...   urllib.urlopen('http://www.comcast.net/memberservices/index.jsp')
... except Exception, e:
...   x = e
...
>>> x   
<exceptions.IOError instance at 0x00A63288> ###
 
[2]
  File "E:\Python23\lib\httplib.py", line 564, in send
    self.connect()
  File "E:\Python23\lib\httplib.py", line 548, in connect
    raise socket.error, msg
IOError: [Errno socket error] timed out

...and also...

  File "E:\Python23\lib\socket.py", line 323, in readline
    data = recv(1)
IOError: [Errno socket error] timed out

[3]
  File "e:\python23\lib\urllib2.py", line 849, in http_open
    return self.do_open(httplib.HTTP, req)
  File "e:\python23\lib\urllib2.py", line 834, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error (10035, 'The socket operation could not complete without blocking')>
msg18125 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2003-09-29 01:27

This cannot be added to the docs.  urllib and urllib2 do not 
officially support the socket timeout feature.  It just happens to 
work that way because of the current implementation.

Without changing the docs to explicitly support the socket timeouts 
this example is not valid.  And the docs cannot be changed without 
guaranteeing that urllib and urllib2 will always support the feature.
msg18126 - (view) Author: Björn Lindqvist (sonderblade) Date: 2006-07-05 15:52

Well then please, please, please make it so urllib and
urllib2 will always support this feature. The net is full of
hints suggesting that socket.setdefaulttimeout() should be
used, and changing it would break lots of programs, I'm sure.
Couldn't the timeout setting also be exposed at the
urllib and urllib2 level?

urllib2.urlopen("http://www.python.org/", timeout=5)
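
For reference, later Python releases did gain a per-call timeout: `urllib2.urlopen` accepts a `timeout` argument from Python 2.6 onward, and `urllib.request.urlopen` in Python 3 has it as well. A minimal sketch in Python 3 syntax follows; the wrapper name `fetch` is hypothetical, and the `data:` URL in the usage note is chosen only so the example runs without network access.

```python
import urllib.request


def fetch(url, seconds=5.0):
    """Read a URL with a per-call timeout instead of the global socket default.

    Hypothetical convenience wrapper around urllib.request.urlopen.
    """
    with urllib.request.urlopen(url, timeout=seconds) as response:
        return response.read()
```

For example, `fetch("data:text/plain,hello")` returns the bytes `b"hello"`; for an `http://` URL the timeout applies to blocking socket operations such as connecting.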

History
Date                 User        Action  Args
2022-04-10 16:11:05  admin       set     github: 39208
2003-09-10 07:43:00  bpettersen  create