This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: random.choice(setinstance) fails
Type: enhancement Stage:
Components: Library (Lib) Versions:
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: aisaac0, georg.brandl, rhettinger, rrenaud, tim.peters, wim.glenn
Priority: normal Keywords:

Created on 2006-09-02 18:27 by aisaac0, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (9)
msg54889 - (view) Author: Alan (aisaac0) Date: 2006-09-02 18:27
It seems perverse that random.choice cannot pick from 
a set.
(Maybe it is also perverse that it cannot pick a key 
from a dict.)
The current definition is

def choice(self, seq):
    """Choose a random element from a non-empty 
sequence."""
    return seq[int(self.random() * len(seq))]  # 
raises IndexError if seq is empty

How about something along the lines of:

def choice(self, seq):
    """Choose a random element from a non-empty 
sequence."""
    idx = int(self.random() * len(seq))
    try:
        result = seq[idx]  # raises IndexError if seq 
is empty
    except TypeError:
        result = list(seq)[idx]
    return result
 
msg54890 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-09-02 18:33
Logged In: YES 
user_id=849994

To Raymond: Wasn't there a discussion about random.choice()
some time ago?
msg54891 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2006-09-02 19:37
Logged In: YES 
user_id=31435

This is certainly a feature request rather than a bug (the
code is working as documented (a set is not a sequence) and
as designed).

Change it to a feature request, and it will probably just
sit there for years ;-)  The unwritten promise is that
choice(s) on a sequence of length N will take O(1) time and
O(1) space.  There is no known way to do this given the
current implementations of dicts and sets better than O(N)
time and O(1) space (and even that poor behavior is tricky
to achieve -- the way suggested by the OP requires O(N) time
and O(N) space).

Rather than give users surprisingly atrocious performance
for one type of argument, leaving it out of choice()
emphasizes reality:  if you want efficient random selection
from a set, you /need/ to use a fancier data structure than
a Python set.  OTOH, if you don't care about efficiency,
it's hardly a burden to write random.choice(list(myset))
instead.  Then at least the atrociousness of your algorithm
design is obvious to readers of the code <0.5 wink>.
msg54892 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2006-09-02 22:44
Logged In: YES 
user_id=80475

Sorry, even as a feature request, this is doomed.  
msg84251 - (view) Author: Rob Renaud (rrenaud) Date: 2009-03-27 05:05
I found this via google search when disappointed that random.choice
raised an exception rather than returned a random item in the set.

It's quite easy to implement random.choice for sets/dicts in O(1)
expected time from the C implementation as long as the set/dict
implementation guarantees minimal constant density.  Simply generate
random indices in the set object until one with an object is found . 
This has will work in expected O(1/density) probes.

I suppose making random.choice work for sets/dicts isn't worth a C
implementation (as happy as it would have made me a few hours ago...)?
msg84253 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2009-03-27 05:26
The CPython set/dict implementation does not guarantee "minimal constant
density", so "quite easy" doesn't apply in reality.  For example, a set
that once contained a million elements may still contain a million
/slots/ for elements after all but one of the elements has been removed.
 And in that case, the only way to find the sole occupied slot is to
search over the million allocated slots.

In short, all the comments in this item from 2006 still apply without
change.
msg188701 - (view) Author: wim glenn (wim.glenn) * Date: 2013-05-08 02:49
The implementation suggested by the OP

def choice(self, seq):
    """Choose a random element from a non-empty 
sequence."""
    idx = int(self.random() * len(seq))
    try:
        result = seq[idx]  # raises IndexError if seq 
is empty
    except TypeError:
        result = list(seq)[idx]
    return result

Is broken because input may be a dict with, for example, keys 0 1 and 7 - potentially causing the line result = seq[idx] to pass when logically it should raise.  Rather it would be needed to determine from the input whether it was a non-sequence type collection, which sounds pretty hairy...
msg188702 - (view) Author: wim glenn (wim.glenn) * Date: 2013-05-08 02:57
How about 

if isinstance(seq, collections.Sequence):
  # do it the usual way ..
else:
  return choice(list(seq))
msg188892 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-05-11 03:08
Sorry, this is still rejected for the reasons that Tim mentioned.
History
Date User Action Args
2022-04-11 14:56:19adminsetgithub: 43930
2013-05-11 03:08:25rhettingersetmessages: + msg188892
2013-05-08 02:57:19wim.glennsetmessages: + msg188702
2013-05-08 02:49:45wim.glennsetnosy: + wim.glenn
messages: + msg188701
2009-03-27 05:27:12tim.peterssetmessages: - msg84254
2009-03-27 05:26:32tim.peterssetmessages: + msg84254
2009-03-27 05:26:03tim.peterssetmessages: + msg84253
2009-03-27 05:05:10rrenaudsetnosy: + rrenaud
messages: + msg84251
2006-09-02 18:27:30aisaac0create