classification
Title: New style classes and __hash__
Type:
Stage:
Components: None
Versions:
process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To: gvanrossum
Nosy List: danielhs, gvanrossum, jhylton, theller, tim.peters
Priority: normal
Keywords:

Created on 2002-12-30 18:39 by theller, last changed 2022-04-10 16:06 by admin. This issue is now closed.

Files
File name Uploaded Description
patch.txt tim.peters, 2003-05-11 04:39 object_hash() and test_class.py patch
newpatch.txt gvanrossum, 2003-12-05 18:30 Patch to remove object_hash(); includes Tim's test_class.py patch
Messages (17)
msg13711 - (view) Author: Thomas Heller (theller) * (Python committer) Date: 2002-12-30 18:39
New-style classes inherit a __hash__ implementation from
object that returns the id. By default this allows
instances to be used as dictionary keys, but usually with
the wrong behaviour: user classes are most often mutable,
and their contained data should be used to calculate the
hash value.

IMO one possible solution would be to change 
typeobject.c:object_hash() to raise TypeError, and 
change all the immutable (core) Python objects to use 
_Py_HashPointer in their tp_hash slot.
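
To make the report concrete, here is a minimal sketch of the failure mode. It is written for Python 3, where defining __eq__ already disables the inherited hash, so the class below restores the id-based hash by hand to imitate the behaviour described above. The Point class and its attributes are hypothetical.

```python
class Point:
    """Hypothetical mutable class whose equality is based on its data."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    # Assumption: re-enable the id-based hash inherited from object, which
    # Python 3 would otherwise set to None because __eq__ is defined.
    __hash__ = object.__hash__


a = Point(1, 2)
b = Point(1, 2)
table = {a: "first"}

equal_values = (a == b)           # the two points compare equal
same_hash = (hash(a) == hash(b))  # but they hash by identity
found = (b in table)              # so the equal key misses the entry
```

With value-based equality and identity-based hashing, an equal instance cannot find the entry stored under the original one.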
msg13712 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2002-12-30 18:44
There seems to be code that tries to inherit tp_hash only
when tp_compare and tp_richcompare are also inherited, but
it seems to be failing.
msg13713 - (view) Author: Thomas Heller (theller) * (Python committer) Date: 2002-12-30 18:50
You mean at the end of the inherit_slots() function?
For my extension which I'm currently debugging, tp_compare, 
tp_richcompare, and tp_hash are inherited from base, but 
only tp_hash is != NULL there.
msg13714 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-02-11 23:01
I spent an afternoon looking into this, and I can't see an
easy solution. The idea of only inheriting __hash__ together
with certain other slots is really flawed; it may be better
if object DIDN'T define a default implementation for
__hash__, comparisons (both flavors), and other things, or
maybe the default __hash__ should raise an exception when
the comparisons are not those inherited from object, or
maybe PyType_Ready should insert a dummy __hash__ when it
sees that you redefine __eq__, or...  I really don't know.
I'm going to sleep on this some more, and lower the
priority. You can always get the right behavior by
explicitly defining __hash__.
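
As an illustration of explicitly defining __hash__, the sketch below hashes exactly the data that the comparison uses. The Version class and its _key attribute are hypothetical names chosen for the example.

```python
class Version:
    """Hypothetical value class: equality and hash use the same key."""

    def __init__(self, major, minor):
        self._key = (major, minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key == other._key

    def __hash__(self):
        # Hash exactly the data that __eq__ compares.
        return hash(self._key)


releases = {Version(2, 2): "2.2"}
label = releases[Version(2, 2)]  # an equal instance finds the entry
```

This only stays correct as long as the key data does not change while the instance is stored in a dict or set.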
msg13715 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2003-05-09 17:32
Currently, a new-style class that defines __cmp__ but not
__hash__ is usable as a dictionary key.  That seems related
to this bug.  Should I paste the example here and bump the
priority?  Or should I open a separate bug report?
msg13716 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-05-09 17:44
Yes, please paste an example here.
msg13717 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2003-05-09 17:54
>>> class C:  # classic class complains
...   __cmp__ = lambda a, b: 0
...
>>> {C(): 1}
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unhashable instance
>>> class C(object):   # new-style class does not complain
...   __cmp__ = lambda a, b: 0
...
>>> {C(): 1}
{<__main__.C object at 0x007F6970>: 1}
>>> 

That was under current CVS.  I see the same behavior in 
2.2.3, so this isn't new.

About Thomas's original report, I don't agree -- the default 
behavior is very useful.  The rule I've always lived by is that, to 
be usable as a dict key, an instance's class must either:

1. Implement none of {__cmp__, __eq__, __hash__}.

or

2. Implement __hash__ and (at least) one of {__cmp__, __eq__}.

Classic classes still work this way.  New-style classes don't 
appear to outlaw any combination here.
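
The two-case rule above can be written as a small check over a class's own definitions. This is only a sketch: the helper names defines and follows_key_rule are hypothetical, and the code runs on Python 3, where __cmp__ has no special meaning and is treated here as an ordinary attribute name.

```python
def defines(cls, name):
    """Return True if cls or a base other than object defines `name`."""
    for klass in cls.__mro__:
        if klass is not object and name in vars(klass):
            # Python 3 stores __hash__ = None to mark a class unhashable.
            return vars(klass)[name] is not None
    return False


def follows_key_rule(cls):
    """Check a class against the two cases of the rule above."""
    has_hash = defines(cls, "__hash__")
    has_comparison = defines(cls, "__cmp__") or defines(cls, "__eq__")
    if not has_hash and not has_comparison:
        return True  # case 1: none of the three is implemented
    return has_hash and has_comparison  # case 2: __hash__ plus a comparison


class Plain:
    pass


class OnlyCmp:
    __cmp__ = lambda a, b: 0


class Both:
    def __eq__(self, other):
        return isinstance(other, Both)

    def __hash__(self):
        return 0
```

Here Plain and Both satisfy the rule, while OnlyCmp, which mirrors the new-style class in the session above, does not.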
msg13718 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-05-09 18:04
A trick similar to what we do in object_new might work.
There, we raise an error if the tp_init slot is the default
function (object_init) and any arguments are passed.

I propose that object_hash checks that tp_compare and
tp_richcompare are both NULL. I'm attaching a patch -- let
me know if that works.
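
The attached patch is in C and is not reproduced here. As a rough Python-level model of the proposed check, the hypothetical function below returns the identity hash only when the type has not replaced the default comparison. Python 3 has no __cmp__, so an overridden __eq__ stands in for a non-NULL tp_compare or tp_richcompare.

```python
def guarded_default_hash(obj):
    """Identity hash that refuses types with their own equality."""
    cls = type(obj)
    if cls.__eq__ is not object.__eq__:
        raise TypeError("unhashable type: %r" % cls.__name__)
    return object.__hash__(obj)


class Plain:
    pass


class Compared:
    def __eq__(self, other):
        return True
```

Under this model guarded_default_hash(Plain()) returns the usual identity hash, while guarded_default_hash(Compared()) raises TypeError.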
msg13719 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2003-05-11 04:39
The patch seems fine to me.  I've attached a new patch, 
combining yours with new tests in test_class.py.
msg13720 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-05-11 10:16
OK, feel free to check it in.
msg13721 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-05-11 13:41
Oops, not so fast. This also makes object.__hash__() fail
when it is invoked explicitly. For example, when a class
overrides __eq__ to print a message and then call the base
class __eq__, it must do the same for __hash__, but
object.__hash__ will still fail in this case. I'll think of
a fix for that.
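
The pattern in question looks roughly like the sketch below, written for Python 3, where it works as intended. The Traced class and the calls list are hypothetical. The explicit object.__hash__(self) call is the one that, per the message above, a default hash rejecting types with custom comparisons would break.

```python
calls = []


class Traced:
    """Hypothetical class that logs, then defers to object for both methods."""

    def __eq__(self, other):
        calls.append("eq")
        return object.__eq__(self, other)

    def __hash__(self):
        calls.append("hash")
        return object.__hash__(self)


t = Traced()
registry = {t: "value"}  # hashing goes through Traced.__hash__
value = registry[t]
```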
msg13722 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-12-05 18:06
I wonder if the solution could be as simple as removing the
tp_hash slot from the object class? 
 
I just tried that and it passes the entire test suite, as
well as the tests that Tim added to the patch. 
 
The trick is that PyObject_Hash() has a fallback which does
the right thing. 
 
And when the base object class doesn't set tp_compare or
tp_richcompare, I think it should be allowed not to set
tp_hash either. 
msg13723 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-12-05 18:30
Here's the patch I am thinking of.
msg13724 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-12-22 21:04
Anybody see a reason why I shouldn't check this in? See
python-dev discussion.
msg13725 - (view) Author: Daniel (danielhs) Date: 2006-10-06 02:22
Still doesn't work as expected in 2.5.
Just wanted to bump along since I noticed this bug today,
and found this bug report (which hasn't changed in nearly 3
years).
msg13726 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2006-10-06 21:19
I don't recall what was wrong with the patch.

I do know that I fixed this in Python 3000; but the fix
there was only possible due to other unrelated fixes that
couldn't possibly be backported.

I propose to leave this broken until Py3k, so let's close
this bug.
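
The message does not describe the Python 3000 fix itself. For comparison, in Python 3 as released, a class that defines __eq__ without defining __hash__ has its __hash__ set to None, so its instances are rejected as dictionary keys. A minimal sketch, with OnlyEq as a hypothetical class name:

```python
class OnlyEq:
    """Defines __eq__ but not __hash__."""

    def __eq__(self, other):
        return isinstance(other, OnlyEq)


hash_attr = OnlyEq.__hash__  # None: the class is marked unhashable

try:
    {OnlyEq(): 1}
    usable_as_key = True
except TypeError:
    usable_as_key = False
```

After this runs, hash_attr is None and usable_as_key is False.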
msg13727 - (view) Author: Daniel (danielhs) Date: 2006-10-06 21:29
I don't know if this is the right answer or not, but the
first post below seems to indicate that the fix would break
Jython. A later post in the thread notes that, since the
latest version of Jython is only 2.1, this shouldn't be an
issue.

First post:
http://mail.python.org/pipermail/python-list/2004-December/257637.html

Second post:
http://mail.python.org/pipermail/python-list/2004-December/257690.html

Daniel
History
Date User Action Args
2022-04-10 16:06:04 admin set github: 37665
2002-12-30 18:39:37 theller create