This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: struct module doesn't use weakref for cache
Type: Stage:
Components: Library (Lib) Versions: Python 2.5
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: bob.ippolito Nosy List: bob.ippolito, josiahcarlson, markaflacy, zseil
Priority: normal Keywords:

Created on 2006-10-08 23:37 by markaflacy, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (8)
msg30184 - (view) Author: Mark Flacy (markaflacy) Date: 2006-10-08 23:37
At the moment, when you attempt to add your 101st
different Struct object to the cache, all the other 100
entries are tossed into the garbage.  That seems a
trifle odd.

The Struct cache would appear to be a perfect use for a
weakref dictionary.
msg30185 - (view) Author: Ziga Seilnacht (zseil) * (Python committer) Date: 2006-10-12 15:52
Logged In: YES 
user_id=1326842

WeakValueDictionary would be useless here; the cache is
the only thing that holds references to Struct objects.
Replacing the _cache.clear() call with _cache.popitem()
seems like a better solution.
msg30186 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2006-10-12 16:56
Weakrefs would slow it down, probably to the point where
the cache wouldn't be an improvement at all.

The re module does the same thing with the regular
expression cache. This isn't a bug until someone presents a
patch that proves another strategy is better for performance.
msg30187 - (view) Author: Ziga Seilnacht (zseil) * (Python committer) Date: 2006-10-12 17:36
I was thinking about something like this:

Index: Lib/struct.py
===================================================================
--- Lib/struct.py	(revision 52316)
+++ Lib/struct.py	(working copy)
@@ -35,7 +35,7 @@
 def _compile(fmt):
     # Internal: compile struct pattern
     if len(_cache) >= _MAXCACHE:
-        _cache.clear()
+        _cache.popitem()
     s = Struct(fmt)
     _cache[fmt] = s
     return s

(sorry, I don't have the rights to attach a file)
msg30188 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2006-10-12 17:54
Yes, that code is different. You haven't shown that it's
better though. Using popitem probably doesn't have very good
cache effects.
msg30189 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2006-10-16 22:41
Add a deque and a bit of logic, and one gets a FIFO
implementation of the cache.

from collections import deque
_toss = deque()

def _compile(fmt):
    # Internal: compile struct pattern
    if len(_cache) >= _MAXCACHE:
        del _cache[_toss.popleft()]
    s = Struct(fmt)
    _cache[fmt] = s
    _toss.append(fmt)
    return s

Race conditions from multiple threads could result in
#threads-1 more entries in the cache than _MAXCACHE, but
this could be trimmed in later calls by replacing 'if' with
'while'.
msg30190 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2006-10-16 23:57
The flush strategy is a silly thing to rattle on about.

The idea is just to pick a cache size that's big enough that
any sane application will never have to flush it. The
current cache size is plenty big enough for anything I threw
at it. If there's an application out there that needs a
bigger cache I'd like to see it, and then perhaps we could
adjust the size in some minor release to accommodate it.

A FIFO is probably a bad strategy because there are a lot of
common structs that are used a lot (the ones with just a few
characters).

If there was an obviously better solution here, surely it
would have already been proposed or implemented for the re
module.
msg30191 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2006-10-17 00:08
I agree, just pointing out that there exists a cache
replacement strategy.  Also, there is an LRU implementation
in the Python cookbook, but unless it is rewritten in C (a
serious overkill for the 2? uses in the stdlib), it would
necessitate a lock, which would be a waste.  But we've
talked enough, and I probably shouldn't have bothered
posting *this* message.  I'll let the thread die this time,
really!
History
Date User Action Args
2022-04-11 14:56:20adminsetgithub: 44100
2006-10-08 23:37:14markaflacycreate