Issue729395
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2003-04-29 08:03 by rhettinger, last changed 2022-04-10 16:08 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
dictobject.diff | rhettinger, 2003-04-29 08:03 | One line patch to dictobject.c | ||
dict2.diff | gvanrossum, 2003-04-29 10:49 | Five-line patch |
Messages (10) | |||
---|---|---|---|
msg43535 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2003-04-29 08:03 | |
The idea is to make dictionaries more sparse on average (decreasing the density between 0 and 50%). This results in fewer collisions, faster collision resolution, fewer memory accesses, and better cache performance. A small side-benefit is halving the number of expensive resize operations as the dictionary grows and reducing the cost of the resize operations. The patch helps both smaller sized dictionaries and large ones. It entails a one line change to dictobject.c resulting in a new schedule of dictionary sizes for a given number of entries: Number of Current size Proposed size Filled Entries of dictionary of dictionary -------------- ------------- ------------- 0 to 5 8 8 6 to 10 16 32 11 to 21 32 32 22 to 42 64 128 43 to 85 128 128 86 to 170 256 512 171 to 341 512 512 and so on until 100,000 entries where the old schedule takes over. The 100,000 entry limit was added because of python-dev feedback about memory consumption in the presence of many large dictionaries. At 12 bytes per entry on a 32-bit box, the switchback occurs at 1.2 Mb for a single dict. This ought to allow hundreds of large dictionaries on common memory sizes. Performance improvements vary from application to application and across various hardware setups. So far, typical results are from 0 to 5%. |
|||
msg43536 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2003-04-29 10:49 | |
Logged In: YES user_id=6380 My preferred version, with code reordering suggested by Tim and further refinement by me (dict2.diff) |
|||
msg43537 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2003-04-29 10:56 | |
Logged In: YES user_id=6380 However note that I cannot reproduce the speedup on Zope 3 startup any more. It may have been something else in my system. :-( |
|||
msg43538 - (view) | Author: Skip Montanaro (skip.montanaro) * | Date: 2003-05-05 20:21 | |
Logged In: YES user_id=44345 Using Guido's version I saw a consistent improvement in pystone numbers. I ran it five times each with and without the patch. Every "after" run was faster than every "before" run and comparing the fastest run of each showed a 5.4% improvement (13587 vs. 14326.6 pystones). I also ran the pystone benchmark using python -i and then used ps to examine the memory usage after completion. VM and RSS both increased modestly (VM by 0.2%, from 27936 to 27984 and 2.2%, RSS from 3520 to 3600). Not a big deal in my book, but I'm one of those people w/ 1GB of RAM Tim referred to. ;-) |
|||
msg43539 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2003-05-05 20:29 | |
Logged In: YES user_id=6380 I say, check it in, it can't hurt. A comment in the code should explain the considerations though. |
|||
msg43540 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2003-05-05 22:23 | |
Logged In: YES user_id=80475 Applied to Objects/dictobject.c 2.145 Closing patch. Thanks to all who contributed, timed, or encouraged. |
|||
msg43541 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2003-05-06 00:16 | |
Logged In: YES user_id=6380 Should the other calls to dictresize() also be changed? |
|||
msg43542 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2003-05-06 00:36 | |
Logged In: YES user_id=80475 Most likely, yes, but I would like a day to think about the use cases and implications before saying for certain. |
|||
msg43543 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2003-05-06 23:46 | |
Logged In: YES user_id=80475 Yes, in both places, the resize argument of mp- >ma_used*3/2 should be changed to mp->ma_used*2. |
|||
msg43544 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2003-05-07 00:39 | |
Logged In: YES user_id=6380 Ok, do it. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:08:26 | admin | set | github: 38397 |
2003-04-29 08:03:45 | rhettinger | create |