This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Constructor for counting things
Type: enhancement Stage:
Components: None Versions:
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: brett.cannon, dtorp, georg.brandl, georg.brandl, mcherm
Priority: normal Keywords:

Created on 2003-05-17 02:22 by dtorp, last changed 2022-04-10 16:08 by admin. This issue is now closed.

Messages (6)
msg53889 - (view) Author: David Albert Torpey (dtorp) Date: 2003-05-17 02:22
Counting things is very common.  May I suggest an 
alternate dictionary constructor that works like this:

class BetterDictionary(dict):
    def bag(classobject, sequence):
        "Fast way to count things"
        b = classobject()
        for k in sequence:
            b[k] = b.get(k,0) + 1
        return b
    bag = classmethod(bag)

print BetterDictionary.bag("jack and jill went up a 
hill ...")

A C implementation could do this very fast.
msg53890 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2003-08-12 06:12
Logged In: YES 
user_id=357491

I don't know if it is *that* common.  Besides, as your code shows, 
it isn't that complex.  It strikes me as a little too specific to be 
made a built-in dict constructor.
msg53891 - (view) Author: David Albert Torpey (dtorp) Date: 2003-08-26 18:07
Logged In: YES 
user_id=681258

Smalltalk has a collection class called Bag that fills the same 
role.  It is a frequently used class.

The current way of doing things is too slow.  Disassembly 
shows that the whole process involves loading a constant for 
1, loading a constant for zero, loading the local variable 'b', 
performing an attribute lookup for 'get', loading the local 
variable 'k', calling 'get', running the summation, re-loading 
both 'b' and 'k' and doing a subscript store.  Along the way, 
there are two dictionary lookups for the same key.   All of 
that is just the inner loop and is wrapped in for-loop with all 
of its overhead.  Instead, the whole thing could be done in 
one step at C speed:  b.bag(itemlist).
msg53892 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2005-07-17 14:07
Logged In: YES 
user_id=1188172

Perhaps "bag" is a good addition to the "collections" module.
msg53893 - (view) Author: Michael Chermside (mcherm) (Python triager) Date: 2005-12-28 17:19
Logged In: YES 
user_id=99874

I think I prefer birkenfeld's approach: create a well-tuned
implementation for the collections module rather than adding
yet one MORE use case to the already overburdened (yet very
powerful) dict.
msg53894 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-03-17 19:33
Logged In: YES 
user_id=849994

I think this a reasonable use case for defaultdict:

d = defaultdict(int)
for k in seq:
    d[k] += 1
History
Date User Action Args
2022-04-10 16:08:47adminsetgithub: 38516
2003-05-17 02:22:49dtorpcreate