classification
Title: websucker relative-URL errors
Components: Demos and Tools
Versions: Python 2.2
process
Status: closed
Assigned To: gvanrossum
Nosy List: aleax, gvanrossum
Priority: critical

Created on 2002-10-09 10:30 by aleax, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Messages (3)
msg12669 - Author: Alex Martelli (aleax) (Python committer) Date: 2002-10-09 10:30
Easy to reproduce with, e.g.:
python websucker.py -v http://www.aleax.it

This gives a series of error messages such as:

Check http://www.aleax.it/./py2.htm
Error ('http error', 404, 'Object Not Found')
 HREF  http://www.aleax.it/./py2.htm
  from http://www.aleax.it/./Python/ (///./py2.htm)

Check http://www.aleax.it/p1.htm
Error ('http error', 404, 'Object Not Found')
 HREF  http://www.aleax.it/p1.htm
  from http://www.aleax.it/./TutWin32/index.htm (///p1.htm)

But the relevant snippets of the HTML sources are, e.g.:
in Python/index.html:
<A href="./py2.htm">
in TutWin32/index.html:
<a href="p1.htm">

i.e. both are relative URLs, so they should resolve to the URLs
of the files that ARE present, Python/py2.htm and
TutWin32/p1.htm respectively.
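
For reference, the expected resolution can be sketched with the standard library's urljoin. This is a minimal illustration on Python 3 (where urlparse became urllib.parse), not websucker's own code; the two page URLs and the helper name resolve are assumptions built from the paths quoted above.

```python
from urllib.parse import urljoin

# Assumed absolute URLs of the two pages quoted in the report.
PYTHON_PAGE = "http://www.aleax.it/Python/index.html"
TUTWIN32_PAGE = "http://www.aleax.it/TutWin32/index.html"


def resolve(page_url, href):
    """Resolve an href against the URL of the page it was found on.

    Hypothetical helper for illustration only.
    """
    return urljoin(page_url, href)


py2_url = resolve(PYTHON_PAGE, "./py2.htm")
p1_url = resolve(TUTWIN32_PAGE, "p1.htm")

print(py2_url)  # http://www.aleax.it/Python/py2.htm
print(p1_url)   # http://www.aleax.it/TutWin32/p1.htm
```

Both hrefs stay inside the directory of the page that contains them, which is what the 404s above show websucker failing to do.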

And indeed /usr/bin/wget has no problem fetching
the whole small site.

Please let me know if you want me to explore the bug further
and prepare a patch in time for the 2.2.2 release; otherwise
I think this should at least be documented as a known bug
(it makes websucker close to unusable, alas).


Alex



msg12670 - Author: Guido van Rossum (gvanrossum) (Python committer) Date: 2002-10-14 19:24

Argh! This looks like a bug in urlparse.py, introduced
somewhere in or after 2.2.

In 2.1, or in 2.2.1:

>>> import urlparse
>>> urlparse.urlunparse(urlparse.urlparse('./Python'))
'./Python'
>>> 

In 2.2.2 or 2.3:

>>> import urlparse
>>> urlparse.urlunparse(urlparse.urlparse('./Python'))
'///./Python'
>>> 
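
The same round-trip check can be written as a small regression test. This is a sketch for Python 3, where the module is urllib.parse rather than urlparse; the helper name roundtrip and the list of sample URLs are illustrative assumptions taken from the hrefs in this report.

```python
from urllib.parse import urlparse, urlunparse


def roundtrip(url):
    """Parse a URL and reassemble it; a relative URL should come back unchanged."""
    return urlunparse(urlparse(url))


# Relative URLs taken from this report.
relative_urls = ["./Python", "./py2.htm", "p1.htm"]

# Collect every URL that the parse/unparse round trip alters.
changed = [u for u in relative_urls if roundtrip(u) != u]

print(roundtrip("./Python"))  # ./Python
print(changed)                # []
```

On a version with the regression, the first print would show '///./Python' and all three URLs would land in changed.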

I'
msg12671 - Author: Guido van Rossum (gvanrossum) (Python committer) Date: 2002-10-14 20:09

OK, fixed in 2.2.2 and 2.3. Whew!
History
Date User Action Args
2022-04-10 16:05:44  admin  set  github: 37290
2002-10-09 10:30:52  aleax  create