This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Optimization for textwrap
Type: Stage:
Components: Library (Lib) Versions: Python 2.5
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: connelly, georg.brandl, jepler, rhettinger
Priority: normal Keywords: patch

Created on 2005-05-26 23:09 by connelly, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
textwrap-2005-05-26.patch connelly, 2005-05-26 23:09 Patch
Messages (6)
msg48382 - (view) Author: Connelly (connelly) Date: 2005-05-26 23:09
This patch speeds up the textwrap module, so that it
takes O(n) time to process n chunks rather than the
O(n^2) time that textwrap previously took.

This makes textwrap much faster for large texts (for
example, the patched version runs textwrap.fill('foo' *
100000) 80 times faster than the unpatched version).
msg48383 - (view) Author: Connelly (connelly) Date: 2005-05-27 16:37
Logged In: YES 
user_id=1039782

I assign the Copyright for this patch to the Python Software
Foundation.
msg48384 - (view) Author: Jeff Epler (jepler) Date: 2005-06-03 13:49
Logged In: YES 
user_id=2772

I didn't benchmark the speed increase, but I verified that
on my system 'test_textwrap.py' still completes
successfully.  I also looked at the patch and found the
change to be straightforward.  However, I wonder if the
speed increase would be nearly as great if 
collections.deque was used instead of the reversed list. 
Unfortunately, the textwrap module seems to intend
portability with older versions of Python, which would not
work if the collections module is used.
msg48385 - (view) Author: Connelly (connelly) Date: 2005-07-08 23:49
Logged In: YES 
user_id=1039782

Personally, I need linear running time for this module, so I 
contributed the patch in the hope that others would find it 
useful.  The practical result is that parsing a 5 MB unwrapped 
text file takes an order of seconds with the patch, and > 10 
minutes without the patch (3 GHz Pentium).

If the code is written with a deque or queue, there would be a 
similar speed incease, since deque operations are O(1).  Is 
the patch not 'clean enough' code-wide?  (I could recode it to 
use collections.deque or else a simple dummy queue if 
collections is not available).
msg48386 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2005-07-09 11:15
Logged In: YES 
user_id=1188172

I've applied the original patch, ensured the test suite ran
and measured the same speedup as Connelly.

Looks like a sound addition.
Another option is, as he said, using collections.deque or an
substitute class for compatibility.
msg48387 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2005-07-15 06:54
Logged In: YES 
user_id=80475

Applied as Lib/textwrap.py 1.38.
History
Date User Action Args
2022-04-11 14:56:11adminsetgithub: 42018
2005-05-26 23:09:51connellycreate