This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: random.py/os.urandom robustness
Type: Stage:
Components: Library (Lib) Versions: Python 2.4
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: georg.brandl Nosy List: astrand, georg.brandl, georg.brandl, gvanrossum, jafo, lcaamano, loewis, majid, pocket_aces, rhettinger
Priority: normal Keywords:

Created on 2005-04-06 01:03 by majid, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
os-urandomfd.diff georg.brandl, 2005-07-04 14:26
Messages (21)
msg24890 - (view) Author: Fazal Majid (majid) Date: 2005-04-06 01:03
Python 2.4.1 now uses os.urandom() to seed the random
number generator. This is mostly an improvement, but
can lead to subtle regression bugs.

os.urandom() will open /dev/urandom on demand, e.g.
when random.Random.seed() is called, and keep it alive
as os._urandomfd.

It is standard programming practice for a daemon
process to close file descriptors it has inherited from
its parent process, and if it closes the file
descriptor corresponding to os._urandomfd, the os
module is blissfully unaware and the next time
os.urandom() is called, it will try to read from a
closed file descriptor (or worse, a new one opened
since), with unpredictable results.

My recommendation would be to make os.urandom() open
/dev/urandom each time and not keep a persistent file
descripto. This will be slightly slower, but more
robust. I am not sure how I feel about a standard
library function steal a file descriptor slot forever,
specially when os.urandom() is probably going to be
called only once in the lifetime of a program, when the
random module is seeded.
msg24891 - (view) Author: Fazal Majid (majid) Date: 2005-04-06 01:06
Logged In: YES 
user_id=110477

There are many modules that have a dependency on random, for
instance os.tempnam(), and a program could well
inadvertently use it before closing file descriptors.
msg24892 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2005-04-06 03:11
Logged In: YES 
user_id=81797

Just providing some feedback:

I'm able to reproduce this.  Importing random will cause
this file descriptor to be called.  Opening urandom on every
call could lead to unacceptable syscall overhead for some. 
Perhaps there should be a "urandomcleanup" method that
closes the file descriptor, and then random could get the
bytes from urandom(), and clean up after itself?

Personally, I only clean up the file descriptors I have
allocated when I fork a new process.  On the one hand I
agree with you about sucking up a fd in the standard
library, but on the other hand I'm thinking that you just
shouldn't be closing file descriptors for stuff you'll be
needing.  That's my two cents on this bug.
msg24893 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2005-04-06 07:11
Logged In: YES 
user_id=21627

To add robustness, it would be possible to catch read errors
from _urandomfd, and try reopening it if it got somehow closed.
msg24894 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2005-04-06 07:20
Logged In: YES 
user_id=81797

Yeah, I was thinking the same thing.  It doesn't address the
consumed file handle, but it does address the "robustness"
issue.  It complicates the code, but should work.
msg24895 - (view) Author: Fazal Majid (majid) Date: 2005-04-06 07:27
Logged In: YES 
user_id=110477

Unfortunately, catching exceptions is not sufficient - the
file descriptor may have been reassigned. Fortunately in my
case, to a socket which raised ENOSYS, but if it had been a
normal file, this would have been much harder to trace
because reading from it would cause weird errors for readers
of the reassigned fd without triggering an exception in
os.urandom() itself.

As for not closing file descriptors you haven't opened
yourself, if the process is the result of a vfork/exec (in
my case Python processes started by a cluster manager, sort
of like init), the child process has no clue what file
descriptors, sockets or the like it has inherited from its
parent process, and the safest course is to close them all.
Indeed, that's what W. Richard Stevens recommends in
"Advanced Programming for the UNIX environment".

As far as I can tell, os.urandom() is used mostly to seed
the RNG in random.py, and thus is not a high-drequency
operation. It is going to be very hard to document this
adequately for coders to defend against - in my case, the
problem was being triggered by os.tempnam() from within
Webware's PSP compiler. There are so many functions that
depend on random (sometimes in non-obvious ways), you can't
flag them all so their users know they should use
urandomcleanup.

One possible solution would be for os.py to offer a
go_daemon() function that implements the fd closing, signal
masking, process group and terminal disassociation required
by true daemons. This function could take care of internal
book-keeping like calling urandomcleanup.
msg24896 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2005-04-06 09:04
Logged In: YES 
user_id=81797

The child is a copy of the parent.  Therefore, if in the
parent you open a few file descriptors, those are the ones
you should close in the child.  That is exactly what I've
done in the past when I forked a child, and it has worked
very well.

I suspect Stevens would make an exception to his guideline
in the event that closing a file descriptor results in
library routine failures.
msg24897 - (view) Author: Luis P Caamano (lcaamano) Date: 2005-04-19 16:53
Logged In: YES 
user_id=279987

We're facing this problem.  We're thinking of patching our os.py 
module to always open /dev/urandom on every call.  Does 
anybody know if this would have any bad consequences other 
than the obvious system call overhead?

BTW, here's the traceback we get.  As you probably can guess, 
something called os.urandom before we closed all file descriptors 
in the daemonizing code and it then failed when os.urandom 
tried to use the cached fd.

 Traceback (most recent call last):
   File  "/opt/race/share/sw/common/bin/dpmd", line 27, in ?
 dpmd().run()
   File  "Linux/CommandLineApp.py", line 336, in run
   File  "Linux/daemonbase.py", line 324, in main
   File  "Linux/server.py", line 61, in addServices
   File  "Linux/dpmd.py", line 293, in __init__
   File  "Linux/threadutils.py", line 44, in start
   File  "Linux/xmlrpcd.py", line 165, in createThread
   File  "Linux/threadutils.py", line 126, in __init__
   
File  "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/t
empfile.py", line 423, in NamedTemporaryFile
 dir = gettempdir()
   
File  "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/t
empfile.py", line 262, in gettempdir
 tempdir =  _get_default_tempdir()
   
File  "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/t
empfile.py", line 185, in _get_default_tempdir
 namer =  _RandomNameSequence()
   
File  "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/t
empfile.py", line 121, in __init__
 self.rng = _Random()
   
File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/r
andom.py", line 96, in __init__
 self.seed(x)
   
File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/r
andom.py", line 110, in seed

 a =  long(_hexlify(_urandom(16)), 16)

   
File  "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/
os.py", line  728, in urandom 
 bytes +=  read(_urandomfd, n - len(bytes))

 OSError : [Errno 9] Bad file  descriptor

msg24898 - (view) Author: Fazal Majid (majid) Date: 2005-04-19 17:49
Logged In: YES 
user_id=110477

Modifying the system os.py is not a good idea. A better
work-around is to skip the /dev/urandom fd when you are
closing all fds. This is the code we use:

def close_fd():
  # close all inherited file descriptors
  start_fd = 3
  # Python 2.4.1 and later use /dev/urandom to seed the
random module's RNG
  # a file descriptor is kept internally as os._urandomfd
(created on demand
  # the first time os.urandom() is called), and should not
be closed
  try:
    os.urandom(4)
    urandom_fd = getattr(os, '_urandomfd', None)
  except AttributeError:
    urandom_fd = None
  if '-close_fd' in sys.argv:
    start_fd = int(sys.argv[sys.argv.index('-close_fd') + 1])
  for fd in range(start_fd, 256):
    if fd == urandom_fd:
      continue
    try:
      os.close(fd)
    except OSError:
      pass
msg24899 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2005-04-19 20:05
Logged In: YES 
user_id=6380

I recommend to close this as invalid. The daemonization code
is clearly broken.
msg24900 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2005-04-19 22:27
Logged In: YES 
user_id=81797

Perhaps the best way to resolve this would be for the
standard library to provide code that either does the
daemonize process, or at least does the closing of the
sockets that may be done as part of the daemonize, that way
it's clear what the "right" way is to do it.  Thoughts?
msg24901 - (view) Author: Luis P Caamano (lcaamano) Date: 2005-04-20 02:31
Logged In: YES 
user_id=279987

Clearly broken?  Hardly.

Daemonization code is not the only place where it's recommend 
and standard practice to close file descriptors.

It's unreasonable to expect python programs to keep track of all 
the possible file descriptors the python library might cache to 
make sure it doesn't close them in all the daemonization 
routines ... btw, contrary to standard unix programming practices.

Are there any other file descriptors we should know about?

msg24902 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2005-04-20 02:49
Logged In: YES 
user_id=81797

Conversely, I would say that it's unreasonable to expect
other things not to break if you go through and close file
descriptors that the standard library has opened.
msg24903 - (view) Author: Luis P Caamano (lcaamano) Date: 2005-04-20 12:04
Logged In: YES 
user_id=279987

I don't think so.  One of the rules in libc, the standard C library, 
is that it cannot cache file descriptors for that same reason.  This 
is not new.


msg24904 - (view) Author: Luis P Caamano (lcaamano) Date: 2005-04-20 13:17
Logged In: YES 
user_id=279987

Here's a reference:

 http://tinyurl.com/b8mk3

The relevant post:

============================================

On 25 Feb 2001 10:48:22 GMT Casper H.S. Dik - Network 
Security Engineer <Casper....@holland.sun.com> wrote: 

| 


Solaris at various times used a cached /dev/zero fd both for 
mapping 
| thread stacks and even one for the runtime linker. 
| The runtime linker was mostly fine, but the thread library did 
have 
| problems with people closing fds.  We since added 
MAP_ANON and no 
| longer require open("/dev/zero") .  THe caaching of fds was 
gotten 
| rid of before that. 
| 
| There are valid reasons to close all fds; e.g., if you really don't 
| want to inherit and (you're a daemon and don't care). 
| 
| In most cases, though, the "close all" stuff performed by shells 
| and such at statup serves no purpose.  (Other than causing 
more bugs 


) 

So the dilemma is that closing fds can cause problems and 
leaving 
them open can cause problems, when a forked child does this.  
This 
seems to tell me that hiding fds in libraries and objects is a bad 
idea because processes need to know what is safe to close 
and/or 
what needs to be left open. 

======================================


If the python library had some module or global list of opened file 
descriptors, then it would be OK to expect programs to keep 
those open across fork/exec.  Something like:

 from os import opened_fds

And then it would be no problem to skip those when closing fds.  
Otherwise, your nice daemon code that deals with _urandom_fd 
will break later on when somebody caches another fd 
somewhere else in the standard library.

Also, the proposed os.daemonize() function that knows about its 
own fds would also work.

Still, the most robust solution is not to cache open fds in the 
library or perhaps catch the EBADF exception and reopen.

There are several solutions but closing this bug as invalid doesn't 
seem an appropriate one.


msg24905 - (view) Author: pocket_aces (pocket_aces) Date: 2005-06-08 22:10
Logged In: YES 
user_id=1293510

We ran across this problem when we upgraded to 2.4.  We use
python embedded in a multi-threaded C++ process and use
multiple subinterpreters.  When a subinterpreter shuts down
and the os module unloads, os._urandomfd is not closed
because it is not a file object but rather just an integer.
 As such, after a while, our process had hundreds of
dangling open file descriptors to /dev/urandom.

I would think, at the very least, if this were a file
object, it would be closed when the module was unloaded (the
deallocator for fileobject closes the file).  However, that
doesn't make it any easier for those who are forking
processes.  Probably the best bet is to close it after
reading the data.  If you need a "high performance, multiple
seek" urandom, just open /dev/urandom yourself.

Either way, this bug is not invalid and needs to be addressed.

My 2 cents..
 --J
msg24906 - (view) Author: Peter Åstrand (astrand) * (Python committer) Date: 2005-07-04 13:01
Logged In: YES 
user_id=344921

This bug is a major problem for us as well. This bug also
breaks the subprocess module. Try, for example:

subprocess.Popen(["ls"], close_fds=1, preexec_fn=lambda:
os.urandom(4))

I agree with lcaamano; the library should NOT cache a file
descriptor by default. Correctness and robustness is more
important than speed. 

Has anyone really been able to verify that the performance
with opening /dev/urandom each time is a problem? If it is,
we could add a new function to the os module that activates
the fd caching. If you have been calling this function, you
have indicated that you are aware of the problem and will
not close the cached fd. Legacy code will continue to function. 
msg24907 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2005-07-04 14:26
Logged In: YES 
user_id=1188172

Attaching a patch for Lib/os.py, fixing this on Unix.

On Windows, a completely different method is used for
urandom, so I think it is not relevant here.

Please review.
msg24908 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2005-07-04 14:56
Logged In: YES 
user_id=80475

I'm on Windows so cannot be of much use on the patch review.
 It looks fine to me and I agree that something ought to be
done.  MvL reviewed and posted the original patch so he may
be better able to comment on this patch.  Alternatively,
Peter A. can review/approve it.
msg24909 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2005-07-04 15:54
Logged In: YES 
user_id=21627

The patch is fine, please apply - both to the trunk and to 2.4. 
msg24910 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2005-07-04 17:17
Logged In: YES 
user_id=1188172

Committed as Lib/os.py r1.87, r1.83.2.3.
History
Date User Action Args
2022-04-11 14:56:10adminsetgithub: 41809
2005-04-06 01:03:31majidcreate