This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: Multiple problems with GC in 2.3.3
Type: Stage:
Components: Interpreter Core Versions:
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: kjetilja, mwh, nnorwitz, tim.peters, washirv
Priority: normal Keywords:

Created on 2004-01-01 19:00 by washirv, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
gdb.trace washirv, 2004-01-01 19:00
Messages (10)
msg19510 - (view) Author: washington irving (washirv) Date: 2004-01-01 19:00
Hi. We're running a multithreaded application that
spiders some web pages and parses them. We've had two
types of problems: one where the python process
segfaults, and another where python spins in an infinite
loop. We are running on FreeBSD 4.8-RELEASE. We have
not had this problem with 2.2. We have this problem
with both 2.3.2 and 2.3.3. This is repeatable, and
we're willing to help in every way to fix this. I've
attached the gdb stack traces for the process that
segfaulted and for the process that spins in an infinite
loop. For the latter, we attached to it in gdb and ctrl-c'ed
to check the status. There are 2 separate gdb traces in the
attached file.

Thanks
msg19511 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2004-01-01 19:14
Have you tried building python --with-pydebug?  That may
lead to an assert or some other indication of the problem. 
Do you have a test case to reproduce this problem?
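
A quick way to confirm which kind of interpreter is actually running, sketched for current CPython rather than the 2.3 releases discussed here: debug builds expose sys.gettotalrefcount, and sysconfig records the Py_DEBUG build flag. The helper name is_debug_build is made up for illustration.

```python
import sys
import sysconfig


def is_debug_build():
    """Return True if this CPython looks like a --with-pydebug build."""
    # sys.gettotalrefcount exists only in debug builds of CPython.
    if hasattr(sys, "gettotalrefcount"):
        return True
    # Fallback: the recorded build flag (may be None on some platforms).
    return bool(sysconfig.get_config_var("Py_DEBUG"))


print("debug build:", is_debug_build())
```

If this reports False, the extra assertions of a debug build are not active in the process being tested.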
msg19512 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2004-01-01 19:31
Are you using the pycurl extension module?  If so, you should 
report your problems to the pycurl project too.  I don't know 
whether the problem *is* in pycurl, but the only reports of 
this type ever seen before have come from people using 
pycurl.  If you're not using pycurl, it would be good to know 
that too.
msg19513 - (view) Author: washington irving (washirv) Date: 2004-01-01 20:30
We are using pycurl, and we have reported it there too, just in
case. The reason I'm reporting it here is that the stack trace does
not involve pycurl in any way. It seems to be python all the
way. I will build with pydebug and report back.

The only test case I have is the program we're running now.
We haven't yet managed to reduce it to a simple test case.
We're working on it. We would be open to figuring out a way
to give you access to the code itself...
msg19514 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2004-01-01 20:56
Sorry, I can't offer time to help debug this.  Maybe someone 
else can, but since the evidence so strongly points at pycurl 
(see below), it would be best to get someone from that 
project to volunteer.

FYI, *many* incorrect usages of the Python C API first show 
up when cyclic gc is running, simply because gc traverses 
every container object in existence.  gc is thus extremely 
sensitive to coding errors like uninitialized memory, wild 
stores, thread races, and incorrect usage of the C-level GC 
API.  By the time gc suffers the effects of such an error, it's 
*typical* that the code causing the error is long gone, having 
screwed up millions or billions of cycles before gc ran (gc 
doesn't run all that often).
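
To see which objects the collector looks at, here is a small sketch from the Python side. It assumes a modern CPython (gc.is_tracked does not exist in 2.3), and the Node class is made up for illustration.

```python
import gc


class Node:
    """A container object: it can hold references to other objects."""

    def __init__(self):
        self.other = None


gc.collect()  # start from a clean slate

a, b = Node(), Node()
a.other, b.other = b, a  # build a reference cycle

instance_tracked = gc.is_tracked(a)  # class instances are containers
int_tracked = gc.is_tracked(42)      # ints cannot hold references

del a, b               # the cycle is now unreachable
found = gc.collect()   # number of unreachable objects the collector found

print(instance_tracked, int_tracked, found)
```

Every tracked object has its tp_traverse slot called during a collection, which is why damage in any one container tends to surface there.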

It's also typical that such bugs are eventually traced to 
coding errors in extension modules -- the Python core is too 
heavily exercised by too many users on too many platforms 
for such fundamental bugs to survive long there.

That doesn't mean the Python core can't be at fault, but 
does mean it's unlikely to be in the core.  Add to that that 
the symptoms you report have been reported by, and only by, 
people using pycurl, and the evidence pointing to pycurl is 
simply overwhelming.

At least two earlier reports from pycurl users said Python 2.3 
died with a

    GC object already tracked

fatal error.  That's a new check in 2.3, added to try to catch 
one incorrect usage of the Python C API.  That symptom has 
also been reported only by pycurl users.

BTW, if pycurl also has some sort of debug-mode build option 
(don't know -- haven't used pycurl), it would be good to build 
with that too.
msg19515 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2004-01-01 21:16
I also don't have enough time to help.  Another way to try
to find the problem is by using valgrind, dmalloc, electric
fence and/or purify or any other memory debugger.
msg19516 - (view) Author: washington irving (washirv) Date: 2004-01-02 02:29
> That's a new check in 2.3, added to try to catch 
> one incorrect usage of the Python C API.  That symptom has 
> also been reported only by pycurl users.

I'm wondering what this incorrect usage is so I can poke
into the pycurl code myself and take a look and see what's
going on there.
Thanks
msg19517 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2004-01-02 03:15
I doubt you can guess this easily.  When an object that 
participates in cyclic gc is first created, its gc_refs member is 
set to constant _PyGC_REFS_UNTRACKED.  This tells gc to 
leave this object alone:  it's still (mostly) uninitialized memory 
at this point, so it's not safe for gc to try to do anything with 
it.

After its creator has initialized the object's memory to a sane 
state, the creator must call PyObject_GC_Track() to tell the 
memory system that the object is no longer insane, and 
specifically that it's now safe to call this object's tp_traverse 
method.  That's where the new error message can happen:  if 
the object's gc_refs member is *not* 
_PyGC_REFS_UNTRACKED when PyObject_GC_Track() is 
called, Python aborts with a "GC object already tracked" fatal 
error.

The most obvious way for this to happen is to call 
PyObject_GC_Track() more than once on the same object, 
without an intervening call to PyObject_GC_UnTrack().
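
The check described above can be modelled in a few lines of Python. This is a toy model only, not CPython's source: ToyObject, gc_track, gc_untrack and FatalError are invented names standing in for the C-level object header, PyObject_GC_Track(), PyObject_GC_UnTrack() and the fatal abort.

```python
UNTRACKED = "untracked"  # stands in for _PyGC_REFS_UNTRACKED
TRACKED = "tracked"


class FatalError(Exception):
    """Stands in for the fatal error that aborts the real interpreter."""


class ToyObject:
    def __init__(self):
        # A freshly created object starts out untracked.
        self.gc_refs = UNTRACKED


def gc_track(obj):
    # The 2.3-style check: anything other than the untracked marker is an error.
    if obj.gc_refs != UNTRACKED:
        raise FatalError("GC object already tracked")
    obj.gc_refs = TRACKED


def gc_untrack(obj):
    obj.gc_refs = UNTRACKED


obj = ToyObject()
gc_track(obj)        # fine: first call after creation
gc_untrack(obj)
gc_track(obj)        # fine: there was an intervening untrack
try:
    gc_track(obj)    # second track with no untrack in between
except FatalError as exc:
    print(exc)
```

In this model a stray write to gc_refs trips the same check, even though gc_track was only called once on that object.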

That's not the most *likely* way for this to happen, though.  
The most likely is for a wild store to overwrite the gc_refs 
member by mistake.  I've seen that happen in core Python 
during development, but never (so far) in a released Python.  
I've also seen it happen during development of the C 
extension modules used in Zope.

Random hint:  if you do

    import gc
    gc.set_threshold(1)

that will greatly increase the frequency with which gc runs.  
While gc is almost never at fault when something blows up 
while gc is running (that's just historical fact -- that code is 
solid), as I said before, the true cause of the blowup typically 
happened long ago.  Making gc run much more frequently can 
often help provoke the blowup into happening much closer to 
the time the real damage was done.  It's still unlikely to show 
up in the stack trace at the time of the blowup, though.
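
If lowering the threshold for the whole process is too slow, one possible sketch is to scope it to the suspect code and restore the old settings afterwards. The context manager name aggressive_gc is made up for illustration; gc.get_threshold and gc.set_threshold are the standard gc module calls.

```python
import gc
from contextlib import contextmanager


@contextmanager
def aggressive_gc(threshold0=1):
    """Temporarily make the collector run much more often."""
    old = gc.get_threshold()       # remember the current three thresholds
    gc.set_threshold(threshold0)   # only the first threshold is changed
    try:
        yield
    finally:
        gc.set_threshold(*old)     # always restore the original settings


with aggressive_gc():
    # Run the code under suspicion here; this loop is just a placeholder.
    data = [[i] for i in range(1000)]
```

Usage is a plain with block around the extension-module calls under suspicion. Expect a large slowdown while it is active.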
msg19518 - (view) Author: Kjetil Jacobsen (kjetilja) Date: 2004-01-13 15:07
This is a known pycurl issue.  The problem is avoided by
turning off the gc tracking code in pycurl, which is now the
default behaviour in pycurl (i.e. in the current cvs
version) until the problems related to the gc tracking have
been resolved properly.

So, as an intermediate solution, use the current cvs version
of pycurl.
msg19519 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2004-01-13 15:13
Ooh, it's Someone Else's Problem.
History
Date User Action Args
2022-04-11 14:56:01 admin set github: 39749
2004-01-01 19:00:53 washirv create