This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: file.write + closed pipe = no error
Type: behavior Stage:
Components: Interpreter Core Versions: Python 3.1, Python 3.2, Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: akuchling, edemaine, forest_atq, naufraghi, neologix, pitrou, sascha_silbe, schmir
Priority: normal Keywords:

Created on 2006-05-15 16:10 by edemaine, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
test.c edemaine, 2006-05-15 16:10 C program illustrating fwrite behavior
blah.py edemaine, 2006-07-02 12:35 Test case illustrating bug
Messages (12)
msg28534 - (view) Author: Erik Demaine (edemaine) Date: 2006-05-15 16:10
I am writing a Python script on Linux that gets called
via ssh (ssh hostname script.py) and I would like it to
know when its stdout gets closed because the ssh
connection gets killed.  I assumed that it would
suffice to write to stdout, and that I would get an
error if stdout was no longer connected to anything. 
This is not the case, however.  I believe it is because
of incorrect error checking in Objects/fileobject.c's
file_write.

Consider this example:

import time
while True:
    print 'Hello'
    time.sleep(1)

If this program is run via ssh and then the ssh
connection dies, the program continues running forever
(or at least, over 10 hours).  No exceptions are thrown.

In contrast, this example does die as soon as the ssh
connection dies (within one second):

import os, time
while True:
    os.write(1, 'Hello')
    time.sleep(1)

I claim that this is because os.write does proper error
checking, but file.write seems not to.  I was surprised
to find this intricacy in fwrite().  Consider the
attached C program, test.c.  (Warning: If you run it,
it will create a file /tmp/hello, and it will keep
running until you kill it.)  While the ssh connection
remains open, fwrite() reports a length of 6 bytes
written, ferror() reports no error, and errno remains
0.  Once the ssh connection dies, fwrite() still
reports a length of 6 bytes written (surprise!), but
ferror(stdout) reports an error, and errno changes to 5
(EIO).  So apparently one cannot tell from the return
value of fwrite() alone whether the write actually
succeeded; it seems necessary to call ferror() to
determine whether the write caused an error.
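
A minimal Python 3 sketch of the same observation, assuming a POSIX system. A local pipe whose read end has been closed stands in for the dead ssh connection, and the helper name write_then_flush is hypothetical. The unbuffered os.write fails immediately. The buffered write reports success, and the failure only surfaces when the buffer is flushed.

```python
import os


def write_then_flush():
    """Write to a pipe with no reader; report where the error shows up."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # no reader left, like a dead ssh connection

    events = []

    # Unbuffered write: goes straight to the kernel, so it fails right away.
    try:
        os.write(write_fd, b"Hello\n")
        events.append("os.write ok")
    except BrokenPipeError:
        events.append("os.write failed")

    # Buffered write: the bytes only land in a user-space buffer.
    writer = os.fdopen(write_fd, "wb")
    written = writer.write(b"Hello\n")
    events.append("buffered write reported %d bytes" % written)

    # The error appears once the buffer is pushed to the kernel.
    try:
        writer.flush()
        events.append("flush ok")
    except BrokenPipeError:
        events.append("flush failed")

    # close() flushes again, so it may raise the same error.
    try:
        writer.close()
    except BrokenPipeError:
        pass
    return events


if __name__ == "__main__":
    for event in write_then_flush():
        print(event)
```

This only illustrates the buffering effect on a local pipe; it does not reproduce the ssh setup described above.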

I think the only change necessary is on line 2443 of
file_write() in Objects/fileobject.c (in svn version
46003):

2441        n2 = fwrite(s, 1, n, f->f_fp);
2442        Py_END_ALLOW_THREADS
2443        if (n2 != n) {
2444                PyErr_SetFromErrno(PyExc_IOError);
2445                clearerr(f->f_fp);

I am not totally sure whether the "n2 != n" condition
should be changed to "n2 != n || ferror (f->f_fp)" or
simply "ferror (f->f_fp)", but I believe that the
condition should be changed to one of these
possibilities.  The current behavior is wrong.

Incidentally, you'll notice that the C code has to turn
off signal SIGPIPE (like Python does) in order to not
die right away.  However, I could not get Python to die
by re-enabling SIGPIPE.  I tried "signal.signal
(signal.SIGPIPE, signal.SIG_DFL)" and "signal.signal
(signal.SIGPIPE, lambda x, y: sys.exit ())" and neither
one caused death of the script when the ssh connection
died.  Perhaps I'm not using the signal module correctly?
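
One way to check the SIGPIPE side in isolation, as a sketch assuming a POSIX system and Python 3: a child process restores the default SIGPIPE disposition and then writes directly to a pipe with no reader. The names CHILD_SOURCE and run_child are hypothetical. Because os.write bypasses user-space buffering, the signal is delivered on the write itself. With buffered output, nothing reaches the pipe until a flush, so no signal can arrive before then.

```python
import signal
import subprocess
import sys

# Source of the child process (hypothetical, for illustration only).
CHILD_SOURCE = """
import os, signal
signal.signal(signal.SIGPIPE, signal.SIG_DFL)  # undo Python's default of ignoring SIGPIPE
read_fd, write_fd = os.pipe()
os.close(read_fd)              # nobody will ever read from the pipe
os.write(write_fd, b"Hello")   # unbuffered write: SIGPIPE is delivered here
"""


def run_child():
    """Run the child and return its exit status as reported by subprocess."""
    return subprocess.run([sys.executable, "-c", CHILD_SOURCE]).returncode


if __name__ == "__main__":
    status = run_child()
    # subprocess reports death by signal N as a return code of -N.
    print("killed by SIGPIPE:", status == -signal.SIGPIPE)
```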

I am on Linux 2.6.11 on a two-CPU Intel Pentium 4, and
I am running the latest Subversion version of Python,
but my guess is that this error transcends most if not
all versions of Python.
msg28535 - (view) Author: Erik Demaine (edemaine) Date: 2006-05-15 16:26
One more thing: fwrite() is used in a couple of other
places, and I think the same comment applies to them.  They are:

- file_writelines() in Objects/fileobject.c
- w_string() in Python/marshal.c doesn't seem to have any
error checking?  (At least no ferror() call in marshal.c...)
- string_print() in Objects/stringobject.c doesn't seem to
have any error checking (but I'm not quite sure what this
means in Python land).
- flush_data() in Modules/_hotshot.c
- array_tofile() in Modules/arraymodule.c
- write_file() in Modules/cPickle.c
- putshort(), putlong(), writeheader(), writetab() [and the
functions that call them] in Modules/rgbimgmodule.c
- svc_writefile() in Modules/svmodule.c
msg28536 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2006-06-03 20:16
I agree with your analysis, and think your suggested fixes are correct.

However, I'm unable to construct a small test case that exercises this bug.  I 
can't even replicate the problem with SSH; when I run a remote script with 
SSH and then kill SSH with Ctrl-C, the write() gets a -1.  Are you terminating 
SSH in some other way?  (I'd really like to have a test case for this problem 
before committing the fix.)
msg28537 - (view) Author: Erik Demaine (edemaine) Date: 2006-07-02 12:35
A simple test case is this Python script (fleshed out from
previous example), also attached:

import sys, time
while True:
    print 'Hello'
    sys.stdout.flush()
    time.sleep(1)

Save as blah.py on machine foo, run 'ssh foo python blah.py'
on machine bar--you will see 'Hello' every second--then, in
another shell on bar, kill the ssh process on bar.  blah.py
should still be running on foo.  ('foo' and 'bar' can
actually be the same machine.)

The example from the original bug report that uses
os.write() instead of print was an example that *does* work.
msg28538 - (view) Author: Erik Demaine (edemaine) Date: 2006-08-09 16:13
Just to clarify (as I reread your question): I'm killing the
ssh via UNIX (or Cygwin) 'kill' command, not via CTRL-C.  I
didn't try, but it may be that CTRL-C works fine.
msg59630 - (view) Author: Ralf Schmitt (schmir) Date: 2008-01-09 22:29
The C program is broken, as it does not check the return value of fflush.
The real problem is buffering.

while True:
    print 'Hello'
    time.sleep(1)

will not notice an error until the buffers are flushed.
Running "python t.py | head -n2" and killing head does not give me an
error. With PYTHONUNBUFFERED=1, or when using sys.stdout.flush(), the
program breaks with:

$ PYTHONUNBUFFERED=1 python t.py | head -n2
Hello
Hello
Traceback (most recent call last):
  File "t.py", line 5, in <module>
    print "Hello"
IOError: [Errno 32] Broken pipe
msg59631 - (view) Author: Ralf Schmitt (schmir) Date: 2008-01-09 22:34
Ahh, no: the C program does call fflush on the logfile... sorry.
msg126093 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-01-12 13:16
This is normal behaviour: stdout is line buffered (_IOLBF) only if it is connected to a tty.
When it's not connected to a tty, it's fully buffered (_IOFBF). This is done on purpose, for performance reasons. To convince yourself, run:

$ cat test.py
for i in range(1, 1000000):
    print('hello world')

$ time python test.py > /tmp/foo

With buffering off (-u option), the same command takes almost 10 times longer.
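
Much of that cost comes from the number of underlying write calls. A rough Python 3 sketch of the effect follows, using a hypothetical CountingRaw stream that discards data and only counts calls, rather than timing a real run. The counts illustrate the mechanism; they are not a measurement of the slowdown quoted above.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream that discards data and counts write calls."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def writable(self):
        return True

    def write(self, data):
        self.calls += 1
        return len(data)  # pretend everything was written


def count_buffered(lines):
    """Number of raw writes when going through an 8 KiB buffer."""
    raw = CountingRaw()
    buffered = io.BufferedWriter(raw, buffer_size=8192)
    for _ in range(lines):
        buffered.write(b"hello world\n")
    buffered.flush()
    return raw.calls


def count_unbuffered(lines):
    """Number of raw writes when every line goes straight to the raw stream."""
    raw = CountingRaw()
    for _ in range(lines):
        raw.write(b"hello world\n")
    return raw.calls


if __name__ == "__main__":
    print("buffered:", count_buffered(1000))
    print("unbuffered:", count_unbuffered(1000))
```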

If the application wants to be sure to receive a SIGPIPE when the other end of the pipe is closed, it should just flush stdout explicitly (sys.stdout.flush()).
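
A minimal sketch of that advice for Python 3, assuming a POSIX system; the helper names emit and demo are hypothetical. The helper writes a line, flushes immediately, and reports a closed pipe instead of leaving the error unnoticed in the buffer. Python ignores SIGPIPE by default, so the failure shows up as BrokenPipeError rather than as a signal.

```python
import os


def emit(stream, line):
    """Write one line and flush; return False if the reader has gone away."""
    try:
        stream.write(line + "\n")
        stream.flush()  # force the write now so a closed pipe is noticed
        return True
    except BrokenPipeError:
        return False


def demo():
    """Exercise emit() on a local pipe before and after the reader closes."""
    read_fd, write_fd = os.pipe()
    stream = os.fdopen(write_fd, "w")
    first = emit(stream, "Hello")   # reader still open
    os.close(read_fd)               # reader goes away
    second = emit(stream, "Hello")  # flush now hits the closed pipe
    try:
        stream.close()              # close() retries the flush and may raise again
    except BrokenPipeError:
        pass
    return first, second


if __name__ == "__main__":
    print(demo())
```

In a loop like the ones earlier in this thread, a condition such as "while emit(sys.stdout, 'Hello'): time.sleep(1)" would stop once the reader is gone.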

Suggesting to close.
msg126109 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-12 16:10
Agreed with Charles-François, this is normal behaviour since the bytes written on stdout are buffered (up to a certain size). If calling flush() doesn't solve the issue, please reopen the issue.
msg126116 - (view) Author: Erik Demaine (edemaine) Date: 2011-01-12 17:36
msg28537 shows a version with flush, and says that it fails.  I haven't tested since 2006, though, so I can retry, in particular to see whether the patch suggested in the original post has been applied now.
msg126118 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-12 18:09
> msg28537 shows a version with flush, and says that it fails

I cannot reproduce this with Python 2.5.2 (!), 2.7 or 3.2 on a remote Debian system. Even using "kill -9" on the local ssh process does shut down the remote Python process.
If I comment out the flush() call, it is clearly reproducible. I suspect you did something wrong when testing the flush() version.
msg126119 - (view) Author: Erik Demaine (edemaine) Date: 2011-01-12 18:30
I just tested on Python 2.5.2, 2.6.2, and 3.0.1, and I could not reproduce the error (using the code in msg28537).  It would seem that file.flush is catching the problem, even though file.write is ignoring the error, but I can't see any changes since 1.5.2 that would have changed this behavior of file.flush.  So I'm not sure what happened, but at least it seems to no longer be a bug.  Closing.
History
Date User Action Args
2022-04-11 14:56:17 admin set github: 43362
2011-01-12 18:30:09 edemaine set status: pending -> closed
nosy: akuchling, edemaine, pitrou, schmir, naufraghi, forest_atq, sascha_silbe, neologix
messages: + msg126119
2011-01-12 18:09:55 pitrou set status: open -> pending
nosy: akuchling, edemaine, pitrou, schmir, naufraghi, forest_atq, sascha_silbe, neologix
messages: + msg126118
resolution: not a bug
stage: test needed ->
2011-01-12 17:36:02 edemaine set status: closed -> open
messages: + msg126116
resolution: not a bug -> (no value)
nosy: akuchling, edemaine, pitrou, schmir, naufraghi, forest_atq, sascha_silbe, neologix
2011-01-12 16:10:12 pitrou set status: open -> closed
messages: + msg126109
resolution: not a bug
nosy: akuchling, edemaine, pitrou, schmir, naufraghi, forest_atq, sascha_silbe, neologix
2011-01-12 13:16:22 neologix set nosy: + pitrou, neologix
messages: + msg126093
2010-11-12 21:00:53 akuchling set assignee: akuchling ->
2010-08-03 22:51:14 terry.reedy set stage: test needed
versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6, Python 2.5
2009-10-24 16:10:58 naufraghi set nosy: + naufraghi
type: behavior
2009-10-07 18:23:12 forest_atq set nosy: + forest_atq
versions: + Python 2.6
2009-03-25 12:57:51 sascha_silbe set nosy: + sascha_silbe
2008-01-09 22:34:31 schmir set messages: + msg59631
2008-01-09 22:29:07 schmir set nosy: + schmir
messages: + msg59630
2006-05-15 16:10:06 edemaine create