This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Conserve memory with list.pop()
Type: Stage:
Components: Interpreter Core Versions: Python 2.4
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: rhettinger, tim.peters
Priority: high Keywords: patch

Created on 2004-09-06 07:16 by rhettinger, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
list.diff rhettinger, 2004-09-06 07:16
Messages (3)
msg46862 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2004-09-06 07:16
The current list resizing scheme only downsizes when
more than 16 elements are removed in a single step: 
del a[100:120].

When popping elements off of a list one at a time, the
current resizing approach never shrinks.

This patch makes it shrink whenever more than half of
the  space is unused.

There may be a better approach.  This one is simple and
avoids thrashing in stack applications that hover up
and down around a nearly steady state.
msg46863 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2004-09-12 04:28
Logged In: YES 
user_id=31435

+1 on the idea, and maybe <wink> on the code.  Reading this 
routine always gave me a headache, and I finally figured out 
why:  _new_size should be name new_allocated.  It's the 
new value of self->allocated, after all, and has much less to 
do with newsize (which _new_size looks most like) or ob_size.

If I rename the code like that, I find it much easier to follow.  
Then the tail end of the patched code looks like

new_allocated = (newsize >> 3) + (self->ob_size < 8 ? 3 : 6) 
+ newsize;
if (newsize < (allocated >> 1))
	new_allocated = newsize;

Then I ask "why?" about the last "if" clause.  That is, why 
change new_allocated away from the value it was given on 
the preceding line?  IOW, if that was a good value for 
new_allocated when the list was growing, why not also when 
the list is shrinking?  If we reset new_allocated to *exactly* 
the smaller requested new size, then if they add an element 
next time we have to endure a realloc right away again.  I 
think it's good to leave some room for growth.  If it turns out 
the list isn't yo-yo'ing, but going straight down to 0, that's 
fine, we'll just shrink it at a slightly slower pace then.

For example, suppose the list has room for 500 slots, 
currently holds 250, and the last element is popped.  
Unconditionally accepting the first new_allocated computed 
would leave it with 249 + 3 + (249 >> 3) = 283 allocated 
slots instead of with 249.  The relative difference is small, but 
the former is friendlier if the next operation is an append, and 
is faster to compute (skips the "if").
msg46864 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2004-09-12 19:57
Logged In: YES 
user_id=80475

Revised and applies as Objects/listobject.c  2.223

* Changed the var name as requested.  Great idea.

* Changed the second "if" to only check for zero  (so that
lists can shrink all the way down).  As requested, this
change leaves non-zero length lists in an over-allocated state.

* Restated the overallocation formula so that it depends
only on the requested size instead of both requested and
previous size.  Doesn't change the growth schedule but does
make the algorithm easier to analyze.
History
Date User Action Args
2022-04-11 14:56:06adminsetgithub: 40873
2004-09-06 07:16:43rhettingercreate