This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: GIL not released around getaddrinfo()
Type: Stage:
Components: Extension Modules Versions:
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: jvr Nosy List: jvr, loewis, mhammond
Priority: normal Keywords:

Created on 2003-05-02 23:23 by jvr, last changed 2022-04-10 16:08 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
getaddrinfo_gil.patch jvr, 2003-05-03 18:02 getaddrinfo vs. the GIL path #1
getaddrinfo_gil.patch jvr, 2003-05-06 20:51 getaddrinfo vs. the GIL path #2
Messages (13)
msg15843 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-02 23:23
I have a situation where sock.connect() blocks in getaddrinfo(), but 
since the GIL isn't released around getaddrinfo() calls, other threads 
also block.
msg15844 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-05-03 09:03
Logged In: YES 
user_id=21627

This is not a bug. getaddrinfo isn't thread-safe on some
systems, so we must protect it from multiple access.
msg15845 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-03 10:09
Logged In: YES 
user_id=92689

Isn't that an orthogonal issue? In that we should then use a lock for 
getaddrinfo(), yet do release the GIL.

I find it unacceptable that my GUI main thread can hang for several seconds 
(or more) just by doing a sock.connect() in a different thread.

Short of that, can we find out on _which_ platforms getaddrinfo() isn't 
thread-safe and work around that?
msg15846 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-05-03 12:18
Logged In: YES 
user_id=21627

See python.org/sf/604210.

Yes, using a separate lock would be possible. At a minimum,
the BSD implementation and the getaddrinfo emulation are not
thread-safe; there may be more. 

As I write in 604210: it would be fine if a separate lock is
used only on the known-bad platforms, since the RFC states
getaddrinfo is thread-safe.

Would you like to work on a patch?
msg15847 - (view) Author: Mark Hammond (mhammond) * (Python committer) Date: 2003-05-03 12:30
Logged In: YES 
user_id=14198

How about simply not releasing the GIL on these platforms? 
Seems simpler than a new lock.
msg15848 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-03 13:40
Logged In: YES 
user_id=92689

Since my platform is BSD-ish, I want a lock ;-)

Yes, I would like to work on a patch, except I will need some hand-holding, 
as I've never done anything with threads from C. Can you point me to 
examples of such a lock in the Python source code?
msg15849 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-03 13:56
Logged In: YES 
user_id=92689

(Never mind, I found plenty of examples in sockemodule.c itself.)
msg15850 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-05-03 14:00
Logged In: YES 
user_id=21627

Mark: the original complaint came from BSD users. So if we
only release the GIL on a "good" platform, the BSD users
aren't helped. Likewise, unless compiled by VC7.x, Windows
users are not helped, since Python would use the getaddrinfo
emulation. So if this as attempted, I think we need a
best-effort solution for all systems.

You could look at the Tcl lock. I recommend you look at the
2.2 sources, since the Tcl lock logic got more complicated
recently (as Tcl now also has a multi-threaded mode of
operation).

If your platform is Mac OS X, don't trust that it is too
BSDish in this respect. If both FreeBSD and NeXT had some
function, Apple apparently has always chosen the NeXT
version. getaddrinfo was one such case. But then, it appears
that the NeXT version is even worse than the FreeBSD version.
msg15851 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-03 18:02
Logged In: YES 
user_id=92689

I've attached a first cut of a patch. It implements wrappers for getaddrinfo 
and getnameinfo, to take the burden of dealing with the lock and the GIL 
away from the caller. As suggested by MvL in prvt mail, I've simply reused 
the existing gethostbyname lock, but renamed it to netdb_lock.

What needs to be reviewed carefully is when exactly USE_NETDB_LOCK 
(formerly known as USE_GETHOSTBYNAME_LOCK) is defined. It happens to 
be defined for my platform, so it "works for me".
msg15852 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-05-03 21:45
Logged In: YES 
user_id=21627

The patch is incorrect: usage of locked getaddrinfo depends
on whether gethostbyname_r is supported on the platform.
However, this is unrelated: A platform may have
gethostbyname_r, and still not provide a thread-safe
getaddrinfo, or it may not have gethostbyname_r, and yet
provide a thread-safe getaddrinfo.

I suggest you separate issues:
1. Should the lock variable be declared and initialized?
    You have two options: a) only create if needed (see below),
    or b) always create if Python is threaded
2. Should getaddrinfo use the lock?
    In general, it shouldn't, except for a few known-bad
systems.
3. Should gethostbyname use the lock? Yes, 
    unless the reentrant versions of this function are used.
    Notice that for Windows, gethostbyname itself is
thread-safe, 
    as it allocates a thread-local hostent.

I think the lock management code is also incorrect: There is
a certain moment after getaddrinfo returns where neither the
GIL nor the getaddrinfo lock is held, yet Python accesses
the getaddrinfo result afterwards. If getaddrinfo returns a
pointer to global data, a second call could modify the
result even though we have not processed it yet.

I'm not sure how to solve this. It is probably safest to
keep the getaddrinfo lock until freeaddrinfo has been called.

Also, for systems that don't lock getaddrinfo, I would
prefer if py_getaddrinfo was a define for getaddrinfo.
msg15853 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-06 20:51
Logged In: YES 
user_id=92689

Here's a new attempt. Martin and I went back and forth a few times, and 
what I now ended up with isn't all _that_ different from the initial patch, 
except it no longer uses wrapped functions.
msg15854 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-05-09 07:44
Logged In: YES 
user_id=21627

The patch looks fine, and works on Linux for me. Please
apply it.
msg15855 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2003-05-09 07:54
Logged In: YES 
user_id=92689

Thanks! Applied as socketmodule.c rev. 1.265.
History
Date User Action Args
2022-04-10 16:08:33adminsetgithub: 38435
2003-05-02 23:23:59jvrcreate