This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: range() in for loops, again
Type: Stage:
Components: Interpreter Core Versions:
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: gvanrossum Nosy List: arigo, brett.cannon, gvanrossum, nnorwitz, rhettinger
Priority: normal Keywords: patch

Created on 2005-04-12 09:00 by arigo, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
range_iter.diff arigo, 2005-04-12 09:00 Patch #1
range_iter_2.diff arigo, 2005-04-13 10:24 Patch #2
Messages (10)
msg48205 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-04-12 09:00
Here is (yet) a(nother) patch to make range() lazy and fast in the context of for loops.

This implementation allows any built-in function or method to declare two C implementations if it wants to.  The second implementation, if present, is used when the caller can know for sure that it will only ever need an iterator over the result.  This is done with a new ml_flag_iter member of PyMethodDef.  To minimize compatibility issues a new METH_ITERABLE flag must be set when ml_flag_iter is valid.

The CALL_FUNCTION bytecode uses that if the immediately following bytecode is GET_ITER.

The patch updates range() to use this flag and build an xrange iterator object.

Of course the same technique could be applied to other built-in functions, but none spring to mind.  Maybe a simple patch that only special-cases range() would do just as well...

FWIW, I get a 5% speed-up on pystone with this patch.
msg48206 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2005-04-12 11:03
Logged In: YES 
user_id=33168

dict.keys(), dict.values(), and dict.items() could also be
possibilities.
msg48207 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-04-12 11:34
Logged In: YES 
user_id=4771

Doesn't work, because the dict can be modified during iteration.
msg48208 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2005-04-13 00:31
Logged In: YES 
user_id=357491

I vote for a patch to special-case range().  This won't be
an issue in Python 3000 when stuff like this become
generators.  Still wish there was a way to come up with a
range object that has ``type(range(1)) == list`` be true,
but check the refcount on the object so that if it is 1 it
can act as a generator instead.
msg48209 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-04-13 10:24
Logged In: YES 
user_id=4771

Ok, here is a patch special-casing range() only.  It's a bit simpler, and faster too.

Did I already mention around here that the class/type unification of Python 2.2 might have been better done, in my opinion, simply by replacing the built-in type() with lambda x: x.__class__?
msg48210 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2005-04-13 19:56
Logged In: YES 
user_id=357491

No, I don't remember you ever mentioning that, but that is a
good idea since that would allow easy faking of things.
msg48211 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2005-04-15 05:11
Logged In: YES 
user_id=80475

There are several builtins that could offer an iterator
alternative:
  range defers to xrange
  filter to itertools.ifilter
  map to itertools.imap (1)
  zip to itertools.izip

All of these offer huge space savings and a little speed
(with izip providing the best improvement because it avoids
tons of tuple allocations).

(1) an alternate imap needs to be provided since the
itertools version does not provide the None fill-in feature.
msg48212 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-04-15 08:05
Logged In: YES 
user_id=4771

All these examples would imply a slight semantic change, I think:

   for x in filter(f, y):

if y is an iterator, the above line exhausts y before entering the loop.  The same for map and zip.

I want to avoid this at all costs, otherwise we'll see strange bug reports whose "solution" is: "well, just write instead

   seq=filter(f,y); for x in seq:

"... which would be very bad, in my opinion.

I'm already slightly concerned that

   r=range(10**7); for x in r: break

behaves quite differently than

   for x in range(10**7): break

with my patch.  I believe that in this case it is an acceptable tread-off, but I would certainly not check this patch in without explicit BDFL approval.

Guido, assigning to you for said principle approval or rejection...
msg48213 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2005-04-15 21:53
Logged In: YES 
user_id=6380

I think we should leave all this alone until Python 3.0, and
there all these would always be iterators (or iterator
factories -- maybe the 3.0 range would behave like xrange
today).

For people wanting speed, there's already xrange().

For people who don't care, the subtle semantic differences
are more of a risk than a bonus.
msg48214 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-04-16 07:30
Logged In: YES 
user_id=4771

Makes sense in this context.
History
Date User Action Args
2022-04-11 14:56:10adminsetgithub: 41849
2005-04-12 09:00:51arigocreate