classification
Title: Inplace set merge produces wrong results
Components: Interpreter Core
process
Status: closed
Resolution: not a bug
Nosy List: doko, rhettinger, tim.peters
Priority: high

Created on 2004-12-07 05:36 by doko, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name: bug.tar.gz
Uploaded: doko, 2004-12-07 05:36
Description: testcase and data
Messages (3)
msg23555 - Author: Matthias Klose (doko) (Python committer) Date: 2004-12-07 05:36
[forwarded from http://bugs.debian.org/284490]

The in-place set merge can produce wrong results
compared to the non-in-place (and slower) version,
a = a | b.

Please see the attached tarball: running the "test"
script shows the problem.
msg23556 - Author: Tim Peters (tim.peters) (Python committer) Date: 2004-12-07 07:46
I can pretty much guarantee this isn't a bug in Python, 
but is in some aspect of your algorithm that *relies* on 
not sharing mutable sets.

For example, if I leave Debtags1.py's

	self.items[t] |= add_elements

alone but add this right after it:

	self.items[t] = self.items[t].copy()

then Debtags1.py produces the same output as 
Debtags.py.  Same thing with Debtags2.py:  adding that 
line also makes Debtags2.py's output the same.

That proves the problem isn't in the implementations 
of "|=" or .update().  It strongly suggests that you're 
mutating a shared set that you're not expecting to 
mutate, or that you aren't expecting to be shared.

For example, your driver does

s = db.elset(sys.argv[1])
for t in sys.argv[2:]:
	s &= db.elset(t)

and that mutates the set in self.items[sys.argv[1]].  If 
you don't intend that computing output will mutate the 
sets in db, then that code is confused.  That's not the 
source of your differing output, but "something like it" 
probably is.
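
A minimal sketch of that driver effect. The `DB` class here is a hypothetical stand-in, and it assumes `elset` hands back the stored set itself rather than a copy; the real code in the tarball may differ in detail.

```python
class DB(object):
    """Hypothetical stand-in for the Debtags database, not the tarball code."""

    def __init__(self):
        self.items = {"a": set([1, 2, 3]), "b": set([2, 3, 4])}

    def elset(self, tag):
        # Assumption: returns the stored set object, not a copy.
        return self.items[tag]


db = DB()
s = db.elset("a")
s &= db.elset("b")          # in-place intersection: also changes db.items["a"]
mutated = db.items["a"]     # same object as s

db2 = DB()
s2 = db2.elset("a").copy()  # work on a private copy instead
s2 &= db2.elset("b")        # only the copy changes
untouched = db2.items["a"]  # the stored set keeps its original contents
```

Starting from a copy, or writing `s = s & db.elset(t)`, would leave the sets held in `db` alone.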

In fact, the problem is probably here:

self.items[t] = add_elements

That can assign the same add_elements as the value 
associated with *many* distinct values of t.  Then you 
try to update those later in place, but "those" is a single 
shared set.  Changing the value associated with one of 
the t's then changes the value associated with all of the 
t's that originally got assigned the same "add_elements" 
set.
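
A small sketch of that sharing, using made-up tag names and a plain dict in place of `self.items` (the loop is an assumption about the shape of the code, not a copy of Debtags1.py):

```python
items = {}
add_elements = set(["x"])

# Every key is bound to the *same* set object, not to separate sets.
for t in ["tag1", "tag2", "tag3"]:
    items[t] = add_elements

# An in-place update made through one key ...
items["tag1"] |= set(["y"])

# ... is visible through every other key, because there is only one set.
same_object = items["tag1"] is items["tag2"]
seen_via_tag2 = items["tag2"]
```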

If I go back to the original Debtags1.py, and replace

self.items[t] = add_elements

with

self.items[t] = add_elements.copy()

then the later updates-in-place do no harm, and it 
produces the output you said you expected.
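
Roughly why the copy helps, and why the non-in-place form never showed the problem. This is a sketch with made-up tag names and plain dicts standing in for `self.items`:

```python
add_elements = set(["x"])

# With .copy(), each key owns its own set, so in-place updates stay local.
copied = {}
for t in ["tag1", "tag2"]:
    copied[t] = add_elements.copy()
copied["tag1"] |= set(["y"])

# Without .copy(), the non-in-place form still gives the expected result:
# `|` builds a new set, and only the one key is rebound to it.
shared = {}
for t in ["tag1", "tag2"]:
    shared[t] = add_elements
shared["tag1"] = shared["tag1"] | set(["y"])
```

In both variants the set under "tag2" and the original `add_elements` keep only "x".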

If you don't understand this, here's a dead simple 
example:

>>> x = set([1])
>>> y = x  # y and x are the *same* set now
>>> x |= set([2])  # so mutating x in place ...
>>> x
set([1, 2])
>>> y   # ... also mutates the set bound to y
set([1, 2])
>>>
msg23557 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2004-12-07 11:35
On line 39, replace
    self.items[t] = add_elements
with
    self.items[t] = add_elements.copy()

That will fix all three.
History
Date User Action Args
2022-04-11 14:56:08 admin set github: 41288
2004-12-07 05:36:05 doko create