Issue 1085744: PySequence_Tuple not as fast as PySequence_List

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/41327

classification

Title:	PySequence_Tuple not as fast as PySequence_List
Type:		Stage:
Components:	Interpreter Core	Versions:	Python 2.5

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	rhettinger	Nosy List:	ncoghlan, rhettinger
Priority:	normal	Keywords:

Created on 2004-12-15 12:15 by ncoghlan, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
abstract.diff	rhettinger, 2004-12-15 21:27	Patch to PySequence_Tuple

Messages (6)
msg23713 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2004-12-15 12:15
A little "my code is faster than your code" game on python-list showed up a pathological interaction between itertools.chain and str.join: C:\>python -m timeit -s "data = [map(str, range(x)) for x in range(1000)]; from itertools import chain" "''.join(chain(data))" 10 loops, best of 3: 1.2 sec per loop ***** OUCH!! Some extra experiments show that creating a list from the result of chain is fast, but creating a tuple is horrendously slow: C:\>python -m timeit -s "data = [map(str, range(x)) for x in range(1000)]; from itertools import chain" "''.join(list(chain(data)))" 10 loops, best of 3: 107 msec per loop C:\>python -m timeit -s "data = [map(str, range(x)) for x in range(1000)]; from itertools import chain" "''.join(tuple(chain(data)))" 10 loops, best of 3: 1.2 sec per loop Creating the tuple by means of a list is actually faster than creating the tuple directly: C:\>python -m timeit -s "data = [map(str, range(x)) for x in range(1000)]; from itertools import chain" "''.join(tuple(list(chain(data))))" 10 loops, best of 3: 146 msec per loop A check with imap suggests the problem may be a more general interaction between PySequence_Fast and iterators which don't have __len__ methods: C:\>python -m timeit -s "data = [y for x in range(1000) for y in map(str, range(x))]; from itertools import imap; val = lambda arg: arg" "''.join(list(imap(val, data)))" 10 loops, best of 3: 310 msec per loop C:\>python -m timeit -s "data = [y for x in range(1000) for y in map(str, range(x))]; from itertools import imap; val = lambda arg: arg" "''.join(imap(val, data))" 10 loops, best of 3: 1.4 sec per loop Looking at the code supports that, too - PySequence_Fast uses PySequence_Tuple, which is great when PyObject_Size gives a nice answer, but can lead to a lot of tuple resizing when it isn't (one resize per 10 items beyond 10 up to 500, then one resize per 100 items beyond 500). 2.4's optimised list extension means that this really* hurts performance wise. The other aspect is whether or not some of the utilities in itertools could benefit from a __len__ method that returned a sensible answer if their inputs defined __len__ methods, and returned -1 otherwise (this would certainly work well with PySequence_Tuple, but I realise it's a slightly odd behaviour for a __len__ method).
msg23714 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2004-12-15 12:17
Logged In: YES user_id=1038590 Kicking in your direction Raymond, since this can badly affect the performance of itertools.
msg23715 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-12-15 15:04
Logged In: YES user_id=80475 Remain calm ;-) Also, please isolate the issues instead of clumping everything together. And this is likely best handled by email rather than SF (you're asking me to explain all of your timings rather than pointing to a buggy piece of code). The idea (feature request) to make iterators length transparent has already been explored and reached a dead end. I've implemented it already it the few situations where it can work (for example itertools.repeat(obj, n) and dict.iteritems() report their length). Most other situations run into logical inconsistencies due to mutable iterables being indistinguishable from iterators over those iterables. See Lib/test/test_iterlen.py for the gory details. Writing f(it) is also its own issue -- for uses other than argument passing, Guido considers it an abuse (since unwinds its components on to ceval's stack). Likewise, there is nothing unique to itertools here. All of the timings can be shown to have similar results if generators are used instead of itertools. This issue is really how consumers handle unsized iterable inputs. You cover three different consumers, ''.join(it), list(it), and tuple(it) which all take different approaches to unsized iterable inputs. So, it is no surprise that the three have different performance characteristics. str.join() is its own little bundle of joy whose behavior is dictated by its need to make multiple passes over the input. Improving its handling of unsized iterable inputs is a thorny subject. You're welcome to post a patch. The best way to analyze what is going on is to disregard the timings and instead draw little diagrams of what is memory at any given time. Also, draw a data migration path -- you want the algorithm to move the elements as few times as possible. Be sure to handle cases like dict.iteritems() which know their length but are not a list or tuple. Also, handle cases where the inputs are created on the fly (not already held in memory as in your example): ''.join(str(i%10) for i in xrange(n)). FYI, there was an interesting bug report on this subject a year ago. IIRC, the OP saw that memory consumption was greater with an iterable input than if the iterable had been first wrapped in list(). I don't see anything out of whack for the creation of lists and tuples from unsized iterables. There are simply two different overallocation algorithms; consequently, there will always be situations where one outperforms the other. For your giant sized examples, the tuple growth strategy of fixed sized overallocations does poorly against the list strategy of increasingly large overallocations. In contrast, tuples can do better for certain small sizes. If you think there are critical use cases for tuple(it) where for a large, unsized it, then you can proposed a variable length strategy like that used for lists. If such use cases exist, I suspect they don't fit real well with Guido's intended philosophy on tuple use. Am closing this as a bug report. It is more a mixture of questions (explain this timing) and feature requests. Feel free to send me emails to explore this further. I've been through this all before and it is a substantial analysis project. If you go down this path, you may well find some room for improvements (str.join) in particular. Also, be sure to separate the issues. It is not all helpful to simultaneously cover itertools, *args, lists vs tuples, unsized iterators, str.join, list.extend, and a proposals to make terators report their length.
msg23716 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-12-15 17:14
Logged In: YES user_id=80475 P.S. In your example, you can get an immediate improvement in performance by having Py_SequenceFast use PySequence_List instead of PySequence_Tuple. If you want to work on some Py2.5 optimizations along these lines, look at replacing the somewhat arbitrary over-allocation strategy in PySequence_Tuple. If you're super-motivated, see if you can improve on the algorithm for str.join. Since these are just strategy changes, they will improve some cases at the expense of others. The goal is to find the one that does the best, most of the time; not horribly in worst situations; and makes the fewest demands on memory.
msg23717 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-12-15 21:27
Logged In: YES user_id=80475 Attaching a small patch for PySequence_Tuple. Try it out and let me know if you find it to be an improvement. It also adds a bit of error checking that should have been there anyway.
msg23718 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-12-16 10:39
Logged In: YES user_id=80475 Loaded the patch. See Objects/abstract.c 2.133

History
Date	User	Action	Args
2022-04-11 14:56:08	admin	set	github: 41327
2004-12-15 12:15:49	ncoghlan	create