
classification
Title: Add a stream type and a merge type to itertoolsmodule.c
Type: Stage:
Components: Extension Modules Versions: Python 2.4
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: rhettinger, rumjuggler
Priority: normal Keywords: patch

Created on 2003-07-20 00:53 by rumjuggler, last changed 2022-04-10 16:10 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
stream_merge.patch rumjuggler, 2003-07-20 00:53 the patch
Messages (3)
msg44309 - (view) Author: Ben Wolfson (rumjuggler) Date: 2003-07-20 00:53
This patch adds a stream type to itertoolsmodule.c,
which provides a way to cache results from a generator.
 This is useful if you want to iterate over the
generator more than once.  It also lets potentially
infinite generators simulate lists/tuples;
stream(some_generator())[10] produces the first 11
values from the generator and returns the last one, but
doesn't produce any more.
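
For readers without the patch at hand, the caching behaviour described above can be sketched in pure Python. This is only an illustration, not the C code in stream_merge.patch: the class name Stream, the _fill helper, and the choice to reject negative indices are all assumptions made for the example.

```python
from itertools import count, islice


class Stream:
    """Hypothetical pure-Python stand-in for the proposed stream type."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._cache = []  # every value pulled from the source so far

    def _fill(self, n):
        # Pull items from the source until the cache holds n items
        # or the source is exhausted.
        while len(self._cache) < n:
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                break

    def __getitem__(self, index):
        if index < 0:
            raise IndexError("negative indices are not supported in this sketch")
        self._fill(index + 1)
        return self._cache[index]  # IndexError if the source ended early

    def __iter__(self):
        # Each new iteration replays the cache first, then keeps pulling.
        i = 0
        while True:
            self._fill(i + 1)
            if i >= len(self._cache):
                return
            yield self._cache[i]
            i += 1


s = Stream(count())      # count() is an infinite generator
tenth = s[10]            # pulls exactly 11 values from the source
cached = len(s._cache)
```

Indexing pulls only as many values as it needs, and iterating a second time replays the cached values instead of restarting the generator.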

The other type added is useable to merge the output of
two (sorted) iterables into one iterable.
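
As a rough pure-Python picture of that merge (again an assumption-laden sketch, not the patch's C implementation), a lazy two-way merge of ascending inputs might look like this. The function name merge and the sentinel approach are choices made for the example.

```python
def merge(left, right):
    """Lazily merge two ascending iterables into one ascending iterator."""
    left, right = iter(left), iter(right)
    sentinel = object()  # marks an exhausted input
    a = next(left, sentinel)
    b = next(right, sentinel)
    while a is not sentinel and b is not sentinel:
        if b < a:
            yield b
            b = next(right, sentinel)
        else:
            # Ties go to the left input, which keeps the merge stable.
            yield a
            a = next(left, sentinel)
    # One side is exhausted; drain whatever remains of the other.
    if a is not sentinel:
        yield a
        yield from left
    if b is not sentinel:
        yield b
        yield from right


merged = list(merge([1, 4, 9], [2, 3, 10, 11]))
```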

I assume documentation would also need to be updated if
this patch gets accepted, but since I imagine that
won't be an open-and-shut case I haven't written any yet.
msg44310 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-07-20 01:56
Originally, I looked at implementing all of itertools as a 
single object supporting various methods but rejected it after 
working through the use cases.  It may be time to take 
another look.

At first glance, this object does not fit well with other 
itertools:
* it returns an object supporting more than __iter__ and next
* it consumes memory (other itertools except for cycle do 
not require auxiliary storage)
* it does not support a functional style (to take advantage of 
the cache, a cache object needs to be created and further 
accesses go from there).  The one example you supplied is 
better accomplished with islice() -- see the documentation 
example for nth().
* it doesn't play nicely with other itertools, which would need 
to be modified to take advantage of the cache:  
s=stream(some_gen()).

I can see some need for caching behavior but would like to 
see compelling use cases that cannot easily be met with 
list() and islice().  Create a few examples like the ones in the 
itertools documentation.  These will demonstrate the use 
cases and show that the new function can play nicely with 
the other building blocks.  Try implementing window() with 
the stream tool.  

Also, I'm concerned about the len() method on a potentially 
infinite generator.

See if you can find a better name for it than stream().  
Ideally, the name should suggest caching, sequence-like 
behavior, and lazy evaluation.

Try coding a pure Python version and submitting it to the 
newsgroup to build support for the idea, see if others can 
refine it, and tease out use cases.

The merge() function is not sufficiently general purpose to 
warrant inclusion in itertools.  Also, it would be best to allow 
custom comparison so as not to lock in ascending order 
behavior.
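
For context, later Python versions ship a general merge outside itertools: heapq.merge accepts any number of sorted inputs and, from Python 3.5 on, takes key and reverse arguments. It is not part of this patch, and it uses a key function rather than a comparison function, but it illustrates the kind of ordering flexibility being asked for here.

```python
import heapq

# Default behaviour: inputs and output are in ascending order.
ascending = list(heapq.merge([1, 4, 9], [2, 3, 10]))

# reverse=True expects descending inputs and yields descending output.
descending = list(heapq.merge([9, 4, 1], [10, 3, 2], reverse=True))

# key= orders by a derived value; each input must already be sorted by it.
by_length = list(heapq.merge(["a", "ccc"], ["bb", "dddd"], key=len))
```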

Your assumption on documentation is correct.  Also you 
would need unittests, examples, and a pure-python version.

Overall, the patch looks nicely done.
msg44311 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-09-01 23:07
The use cases for iterating more than once are better served 
by iterator splitting.  See 
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/213027 .
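
One standard-library form of iterator splitting is itertools.tee, added in Python 2.4. Whether it matches the linked recipe in every detail is not something this thread establishes, so treat the following as a sketch of the general idea only; the squares generator is made up for the example.

```python
from itertools import tee


def squares(limit):
    # Hypothetical generator used only for this illustration.
    for n in range(limit):
        yield n * n


# tee() returns independent iterators over a single underlying source,
# buffering only the items that one of them has not yet consumed.
first, second = tee(squares(5))
total = sum(first)     # consumes the first copy completely
values = list(second)  # the second copy still sees every item
```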

For the given example, stream(some_generator())[10], the 
need is already met by islice(a, 10, 11).next().
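
In Python 3 spelling, where the .next() method is replaced by the next() builtin, the same idea looks like the sketch below. The nth() helper follows the recipe in the itertools documentation; count() stands in for some_generator().

```python
from itertools import count, islice


def nth(iterable, n, default=None):
    "Return the item at index n, or default if the iterable is too short."
    return next(islice(iterable, n, None), default)


item = nth(count(), 10)               # element at index 10
same = next(islice(count(), 10, 11))  # direct equivalent of the call above
missing = nth(iter([1, 2, 3]), 10)    # falls back to the default
```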

There may yet be an opportunity to develop a lazylist type 
that has an underlying iterator.  It would belong outside 
the scope of itertools and would need to have demonstrated 
its usefulness by being released into the wild for a while.
History
Date User Action Args
2022-04-10 16:10:02 admin set github: 38879
2003-07-20 00:53:22 rumjuggler create