
classification
Title: RobotFileParser.can_fetch return value
Type: Stage:
Components: Library (Lib) Versions: Python 2.3
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: Nosy List: loewis, quiver
Priority: normal Keywords: patch

Created on 2004-08-23 10:13 by quiver, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
robotparser.diff quiver, 2004-08-23 10:13 patch against Lib/robotparser.py
Messages (2)
msg46770 - Author: George Yoshida (quiver) (Python committer) Date: 2004-08-23 10:13
Currently, the RobotFileParser.can_fetch method returns values of
two different types (int and bool), which is inconsistent.

More precisely, if a web site provides a robots.txt file, it
returns 1/0. Otherwise, it returns a boolean (True/False).

My change makes the can_fetch method return bool values
in both cases, as documented for the robotparser
module.

# examples of can_fetch return values
>>> import robotparser
>>> rp = robotparser.RobotFileParser()
>>> rp.set_url('http://www.ruby-lang.org/robots.txt')
>>> rp.read()
>>> rp.can_fetch('*', 'http://www.ruby-lang.org/')
1
>>> rp.can_fetch('*', 'http://www.ruby-lang.org/doc/')
0
>>> rp.set_url('http://www.example.com/robots.txt')
>>> rp.read()
>>> rp.can_fetch('*', 'http://www.example.com/')
True
msg46771 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2004-08-23 20:44
Thanks for the patch. Committed as robotparser.py 1.20. As
this patch is a slight interface change, I think it is not
appropriate for 2.3: 2.3 applications should be able to work
around this problem easily enough, and the change may
actually break applications that really care about the type
of the return value.
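
One possible workaround on affected versions is to coerce the result with bool() at the call site. The sketch below is only an illustration and is not part of the committed patch: can_fetch_bool is a hypothetical helper, the rules are parsed from an in-memory list instead of being fetched, and the import uses the Python 3 module name urllib.robotparser (the module was called robotparser in Python 2).

```python
# Python 3 module name; in Python 2 this was `import robotparser`.
from urllib.robotparser import RobotFileParser


def can_fetch_bool(parser, useragent, url):
    """Hypothetical helper: always return a real bool from can_fetch()."""
    return bool(parser.can_fetch(useragent, url))


# Parse rules from memory so the example needs no network access.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /doc/",
])

allowed = can_fetch_bool(rp, "*", "http://www.example.com/")
blocked = can_fetch_bool(rp, "*", "http://www.example.com/doc/")
print(allowed, blocked)  # True False
```

Because bool(1) is True and bool(0) is False, the wrapper gives the same answer whether the underlying method returns ints or bools.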
History
Date User Action Args
2022-04-11 14:56:06 admin set github: 40809
2004-08-23 10:13:20 quiver create