An HTTP connection made by urllib, as well as by urllib2,
to some URLs sleeps forever (until a timeout happens) on
reading from the socket.
The popular Linux 'wget' utility behaves the same way.
The Mozilla browser, as well as the Internet Explorer
browser, reads this URL successfully, both over a proxy
and directly.
The example URL is:
http://nds.nokia.com/uaprof/N3510ir100.xml
The example code is:
import urllib2
u = urllib2.urlopen('http://nds.nokia.com/uaprof/N3510ir100.xml')
print u.info()
print '-------------'
for l in u:
    print l
The urllib library behaves the same way.
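For what it's worth, the hang can at least be bounded with a
process-wide socket timeout. A minimal sketch, assuming Python 2.3
or later (urlopen() itself takes no timeout argument here; the
10-second value is arbitrary):

import socket
import urllib2

# Limit all socket operations to 10 seconds (arbitrary value) so the
# blocked read raises socket.timeout instead of hanging indefinitely.
socket.setdefaulttimeout(10)

try:
    u = urllib2.urlopen('http://nds.nokia.com/uaprof/N3510ir100.xml')
    print u.read()
except socket.timeout:
    print 'read timed out'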
The response headers were (at the moment I last tried it):
Accept-Ranges: bytes
Date: Mon, 01 Nov 2004 10:29:58 GMT
Content-Length: 9710
Content-Type: text/plain
Cache-Control: no-cache
Server: Netscape-Enterprise/4.1
X-WR-FLAGS: CCHOMode=7200:0:force
Etag: "acbd4f76-6-25ee-40910c98"
Last-modified: Thu, 29 Apr 2004 14:09:28 GMT
Via: 1.1 saec-nokp02ca (NetCache NetApp/5.3.1R2)
I have no idea why this happens. Maybe the HTTP server
waits for some additional request headers? In any case, this
is not good behaviour for the library, I think.
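If the server really does wait for additional headers, a browser-like
User-Agent is the obvious thing to try, since by default urllib2
identifies itself as 'Python-urllib/x.y'. A speculative sketch (the
header value is just an example, not a known fix for this URL):

import urllib2

url = 'http://nds.nokia.com/uaprof/N3510ir100.xml'
# Supply a browser-style User-Agent; some servers treat the default
# 'Python-urllib' agent string differently.
req = urllib2.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
u = urllib2.urlopen(req)
print u.read()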