This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: add non-blocking read and write methods to subprocess.Popen
Type: enhancement Stage: resolved
Components: asyncio, Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: Andrew.Boettcher, ajaksu2, akira, astrand, cvrebert, ericpruitt, eryksun, giampaolo.rodola, gvanrossum, janzert, josiahcarlson, martin.panter, ooooooooo, parameter, r.david.murray, rosslagerwall, sbt, techtonik, v+python, vstinner, yselivanov
Priority: normal Keywords: patch

Created on 2005-04-28 20:40 by josiahcarlson, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
sp_diff.txt josiahcarlson, 2005-06-26 19:47
subprocess.patch josiahcarlson, 2014-04-06 20:41 Add non-blocking subprocess access review
subprocess_2.patch josiahcarlson, 2014-04-09 16:40 replace select() with selectors review
subprocess_3.patch josiahcarlson, 2014-04-13 00:33 Remove looping write_nonblocking(), addresses comments review
subprocess_4.patch josiahcarlson, 2014-04-14 21:50 Fixes Windows writes, improves docs, removes universal newlines with non-blocking IO review
subprocess_5.patch josiahcarlson, 2014-04-15 00:49 Fixes a partial comment on a new test review
subprocess_6.patch josiahcarlson, 2014-04-17 07:08 Addressed comments, multi-platform cleanup review
subprocess_7.patch josiahcarlson, 2014-05-30 00:04 Removes tulip/asyncio related changes review
subprocess_8.patch josiahcarlson, 2015-03-25 18:32 review
Messages (64)
msg54505 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2005-04-28 20:40
It would be terribly nice if the Popen class in the
subprocess module would allow a programmer to easily
say "send some data right now (if I have some to send)
and receive whatever information is available right
now".  Essentially the equivalent of
asyncore.loop(count=1), only that it returns the data
received, instead of placing it in a buffer.

Why would this functionality be useful?  Currently,
when using the subprocess module with pipes, the
interaction with a pipe is limited to "send data if
desired, close the subprocess' stdin, read all output
from the subprocess' stdout, ...". 

Certainly one can pull the pipes out of the Popen
instance and perform the necessary operations on posix
systems (with select or poll), but the additional magic
on Windows is a little less obvious (though still possible).

There is a post by Paul Du Bois with an example using
win32pipe.PeekNamedPipe:
http://groups-beta.google.com/group/comp.lang.python/msg/115e9332cc1ca09d?hl=en

And a message from Neil Hodgeson stating that
PeekNamedPipe works on anonymous pipes:
http://mail.python.org/pipermail/python-dev/2000-May/004200.html


With this modification, creating Expect-like modules
for any platform, as well as embedded shells inside any
GUI with a text input widget, becomes easy.  Heck, even
Idle could use it rather than a socket for its
interactive interpreter.
msg54506 - (view) Author: Dave Schuyler (parameter) Date: 2005-04-28 23:38
More control of sub-processes would be great.  Several times
in the past few years I've done stuff with os.system,
popenN, or spawn*, and it would be useful to have even more
cross-platform consistency and support in this area.
Anyway, this is just a vote to encourage other developers.
msg54507 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2005-05-28 23:22
I've got a version of subprocess that has this functionality
with pywin32.  Making it work on *nix systems with usable
select is trivial.

About the only question is whether the functionality is
desirable, and if so, what kind of API is reasonable.

Perhaps adding an optional argument 'wait_for_completion' to
the communicate method, which defaults to True for executing
what is currently in the body of communicate.

If the 'wait_for_completion' argument is false, then it
does non-waiting asynchronous IO, returning either a 2 or
3-tuple on completion.  If input is None, it is a 2-tuple of
(stdout, stderr).  If input is a string, it is a 3-tuple of
(bytes_sent, stdout, stderr).

How does that sound?
msg54508 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2005-05-29 00:15
I suppose I should mention one side-effect of all this.  It
requires three more functions from pywin32 be included in
the _subprocess driver; win32file.ReadFile,
win32file.WriteFile, and win32pipe.PeekNamedPipe.  Read and
Peek are for reading data from stdout and stderr, and Write
is for the support for partial writes to stdin.
msg54509 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2005-06-26 19:47
How about if subprocesses have 3 new methods, send(input),
recv(maxlen), and recv_stderr(maxlen).

send(input) would perform like socket.send(), sending as
much as it currently can, returning the number of bytes sent.
recv(maxlen) and recv_stderr(maxlen) would receive up to the
provided number of bytes from the stdout or stderr pipes
respectively.

I currently have an implementation of the above on Windows
and posix.  I include the context diff against revision 1.20
of subprocess.py in current CVS.
msg54510 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2005-09-21 20:51
I've implemented this as a subclass of subprocess.Popen in
the Python cookbook, available here:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/440554

Essentially this is a request for inclusion in the standard
library for Python 2.5.
msg54512 - (view) Author: Benjamin (ooooooooo) Date: 2006-12-30 21:45
I would also like to see this feature. I'm using Josiah Carlson's recipe for the time being. I'm agnostic about whether the asynchronicity is a "mode" or just a case of using different functions. However, if the former is chosen, then one should be able to switch modes at will, because sometimes I want blocking and sometimes I don't.
msg54513 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2006-12-30 23:21
The way subprocess is currently written, if you were to use a blocking semantic, you would no longer be able to use an asynchronous semantic afterwards (it will read the result until there is nothing more to read).  Note that this is the case whether it is mode or method based.
msg54514 - (view) Author: Benjamin (ooooooooo) Date: 2006-12-31 15:19
If there were a blocking "read x bytes" type call, could you not do some non-blocking stuff afterwards? If you use the "read" method, with a byte limit, directly on the stdout member of Popen, that could return with your bytes, and then you do some non-blocking stuff afterwards. That's the external view of course, I suspect that you're saying there's some internal lower-level reason this is currently not possible.

Thanks for your interim class BTW, it is proving to be very useful.
msg54515 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2006-12-31 18:04
Unless you use PeekNamedPipe on Windows, there is no guarantee that the pipe will return *any* data until the process ends, even if you say pipe.read(1) and there is 1k of data already sent.

Note that the two async reading methods that I provide (recv() and recv_err()) take a 'maximum bytes' argument.  Just like I have written a 'send_all()' utility function, I (or you) can easily write a 'recv_exact()' utility function that receives an exact number of bytes before returning.  That functionality would be more or less required for one of the expected use-cases I specify in the recipe, "writing a multi-platform 'expect' module".

Stick with async calls (with the utility recv_exact()) until you need to use the .communicate() method.
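For illustration, a recv_exact() along the lines described might look like the sketch below. The recv callable stands in for the recipe's Popen.recv(); its return conventions (empty bytes when no data is currently available, None on disconnect) are assumptions about that recipe's API, not documented behavior:

```python
import time


def recv_exact(recv, nbytes, sleep=0.01):
    """Call a non-blocking recv(maxlen) repeatedly until exactly nbytes
    bytes have been collected.

    Assumes recv() returns b'' when no data is ready yet and None once
    the pipe has been disconnected (the recipe's assumed semantics).
    """
    chunks = []
    remaining = nbytes
    while remaining:
        data = recv(remaining)
        if data is None:
            raise EOFError('pipe closed with %d bytes outstanding' % remaining)
        if data:
            chunks.append(data)
            remaining -= len(data)
        else:
            time.sleep(sleep)  # nothing available yet; avoid busy-waiting
    return b''.join(chunks)
```

The helper is generic: it works with any non-blocking receive callable, so the same code serves recv() and recv_stderr().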
msg84692 - (view) Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2009-03-30 22:36
Josiah: can this be closed?
msg84719 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2009-03-30 23:41
I don't believe this should be closed.  The functionality is still 
desired by me and others who have posted on and off since the patch was 
created.  This patch definitely needs some love (tests mostly).
msg89309 - (view) Author: Eric Pruitt (ericpruitt) * Date: 2009-06-13 01:50
Hello, I am working on this patch and some additional features for my
Google Summer of Code project (subdev.blogspot.com) and will eventually
attempt to get the code committed to Python 3.1 or 3.2 and Python 2.7. I
will have the unit tests completed shortly and will post a patch for
Python 2.7 and / or Python 3.1rc1.
msg89433 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2009-06-16 17:13
Retargeting to 3.2 since 3.1 is now feature-frozen.
msg114495 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-08-21 14:14
PEP 3145 has been written in response to this request.
msg161326 - (view) Author: anatoly techtonik (techtonik) Date: 2012-05-22 05:44
What is the status? What is required to get this into 3.3? See also issue14872.
msg161352 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2012-05-22 14:17
Comments on Josiah's patch:

* It uses pywin32 for PeekNamedPipe -- this is now available from _winapi.

* I don't think send(), recv() and recv_exact() will work correctly if
  buffering is used -- an error should be raised in this case.

* I think send(), recv(), recv_exact() should raise errors if the
  associated stream is None instead of returning 0 or ''.


Additional comments on http://code.google.com/p/subprocdev

* In a few places the code does "raise Exception(...)" -- some other 
  exception class should be used.

* It is not clear to me why TextIOWrapper should try to implement seek()
  and tell().  (The file-like wrapper for a socket does not.)

* I don't like the hardwired timeouts for faking asynchronicity:
  Popen.asyncread(), Popen.asyncwrite() use 0.1 seconds, and
  TextIOWrapper.read() uses 1.25 seconds.

* There is no binary counterpart to TextIOWrapper.

* Using fcntl.fcntl() to switch back and forth between blocking and
  unblocking is unnecessary.  select() guarantees the read/write on a pipe
  will succeed without blocking.  (The same is not necessarily true with
  a socket.)

* __closecheck() should be renamed _closecheck().

* FileWrapper is just a wrapper for TextIOWrapper, except that it has an
  ignored newlines argument.

* I can't see any documentation.
msg161356 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2012-05-22 14:27
Personally, I would factor out the code for Popen.communicate() into a Communicator class which wraps a Popen object and has a method

    communicate(input, timeout=None) -> (bytes_written, output, error)

On Windows this would use threads, and on Unix, select.
msg161362 - (view) Author: Eric Pruitt (ericpruitt) * Date: 2012-05-22 16:07
There's documentation, but you have to switch to the Python 3k branch --
http://code.google.com/p/subprocdev/source/browse/?name=python3k#hg%2Fdoc.

As for the other criticisms, I'm sure there are plenty of things that need to
be improved upon; I was not very experienced when I started the project.
msg161400 - (view) Author: Ross Lagerwall (rosslagerwall) (Python committer) Date: 2012-05-23 11:05
> Personally, I would factor out the code for Popen.communicate() into a
> Communicator class which wraps a Popen object and has a method
>
>    communicate(input, timeout=None) -> (bytes_written, output, error)

How would this differ from the normal communicate()?

It seems like there are two different ideas for why people want an "asynchronous subprocess":

One is that they want to use communicate() but not be limited by memory issues.
I think a good API for this case is an asyncore style API or like the one from the patch in issue1260171.

Another use case is for an expect-type interface where you read and write based on a timeout or some kind of delimiter like a newline.

These should probably be addressed independently.

See also issue10482.
msg161409 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2012-05-23 13:52
> How would this differ from the normal communicate()?

It would block until one of the following occurs:

* some data has been written to stdin,
* some data has been read from stdout or stderr, or
* timeout passes (if timeout is not None).

The normal communicate() could be implemented by running this in a loop.  The amount of data returned at once could be limited by an argument of the constructor.
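One blocking step of such a Communicator could be sketched as follows on Unix, using select. This is illustrative only: the function name and the (bytes_written, output, error) tuple layout follow the suggestion above, but nothing here is part of an actual patch:

```python
import os
import select


def communicate_once(proc, input=b'', timeout=None, maxsize=4096):
    """One blocking step of the suggested Communicator.communicate():
    wait until stdin is writable, stdout/stderr is readable, or the
    timeout expires, then do at most one write and one read per ready
    pipe.  Unix-only sketch; returns (bytes_written, output, error).
    """
    wlist = [proc.stdin] if (proc.stdin and input) else []
    rlist = [p for p in (proc.stdout, proc.stderr) if p]
    rready, wready, _ = select.select(rlist, wlist, [], timeout)
    written, out, err = 0, b'', b''
    if wready:
        # select says the pipe has buffer space, so this won't block.
        written = os.write(proc.stdin.fileno(), input)
    if proc.stdout in rready:
        out = os.read(proc.stdout.fileno(), maxsize)
    if proc.stderr in rready:
        err = os.read(proc.stderr.fileno(), maxsize)
    return written, out, err
```

Running this in a loop until the child exits and all pipes hit EOF recovers the behavior of the normal communicate(), with maxsize playing the role of the constructor argument mentioned above.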
msg177139 - (view) Author: anatoly techtonik (techtonik) Date: 2012-12-08 00:12
Can anybody write a sum up on this?
msg215171 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-03-30 06:53
Due to some rumblings over on the mentors list and python-dev, this is now getting some love.

Guido has stated that something should make it into the subprocess module for 3.5 in this email: https://groups.google.com/d/msg/dev-python/I6adJLIjNHk/Usrvxe_PVJIJ

His suggested API is:
proc.write_nonblocking(data)
data = proc.read_nonblocking()
data = proc.read_stderr_nonblocking()


I've implemented the API for Windows and Linux, and below are my findings.


On the Linux side of things, everything seems to be working more or less as expected, though in writing tests I did need to add an optional argument to qcat.py in order to have it flush its stdout to be able to read from the parent process. Odd, but whatever.

Also, Richard Oudkerk's claim above about not needing to use fcntl to swap flags is not correct. It's necessary not to block on reading, even if select is used. Select just guarantees that there is at least 1 byte or a closed handle, not that your full read will complete.


On the Windows side of things, my assumptions about WriteFile() were flawed, and it seems that the only way to make it actually not block if/when the outgoing buffer is full is to create a secondary thread to handle the writing*. I've scoured the MSDN docs and there doesn't seem to be an available API for interrogating the pipe to determine whether the buffer is full, how full it is, or whether the write that you'd like to do will succeed or block.

* This applies even with the use of asyncio. Because the intent for the call is to return more or less immediately, a thread still has to be created to handle the event loop for IO completion, which means that asyncio can't prevent the use of threads and not block without basically integrating an event loop into the caller.

Considering that the Windows communicate() method already uses threads to handle reading and writing, I don't believe it is a big deal to add it for this situation too. Any major objections?
msg215178 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2014-03-30 11:28
Using asyncio and the IOCP eventloop it is not necessary to use threads.  (Windows may use worker threads for overlapped IO, but that is hidden from Python.)  See

  https://code.google.com/p/tulip/source/browse/examples/child_process.py

for vaguely "expect-like" interaction with a child python process which works on Windows.  It writes commands to stdin, and reads results/tracebacks from stdout/stderr.

Of course, it is also possible to use overlapped IO directly.
msg215411 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-04-03 00:25
Had some time to work on this today.

I was missing something in my earlier versions of the code, and have managed to get overlapped IOs to work, so at least I'm not quite so far behind the dozen or so core developers who know more about the Windows pieces than I do. Richard, thank you for the post, I wasn't looking hard enough for how to get overlapped IOs to work, and your message made me look harder.

On Linux, it is trivial to support the blocking communicate() and non-blocking additions. There's only one weirdness, and that's the fcntl bit flipping during write.
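The fcntl bit flipping in question looks roughly like this (a sketch against a raw pipe file descriptor; the helper name is made up):

```python
import fcntl
import os


def write_nonblocking_fd(fd, data):
    """Temporarily set O_NONBLOCK on fd, attempt one write, then restore
    the original flags.  Returns the number of bytes written, which may
    be 0 if the pipe buffer is full.
    """
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
    try:
        try:
            return os.write(fd, data)
        except BlockingIOError:
            return 0  # pipe buffer full; nothing was written
    finally:
        fcntl.fcntl(fd, fcntl.F_SETFL, flags)  # restore blocking mode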

On Windows, it's not all that difficult to switch to using overlapped IOs for *all* writes, and maybe even for communicate()-based reads, which would remove the need for threads. Ironically enough, non-blocking reads actually *don't* require overlapped IO thanks to PeekNamedPipe, which could even be used to cut the number of extra threads from 2 to 1 in communicate().

Now that I've got it working, I can do one of the following (from least changes to the most):
1. Add a "nonblocking" flag, which pre-flips the fcntl bit in Linux and uses overlapped IO on writes in Windows - this would be documented to remove the ability to call communicate()
2. As an alternative to #1, I can create a new class that lacks the communicate() method and adds the non-blocking methods
3. Gut/rewrite the Windows-based communicate() function to use overlapped IO on everything, removing the threads, and making it at least superficially like Linux (prepare your overlapped IO, then use WaitForMultipleObjects() to multiplex), while also adding the non-blocking methods

Unless someone brings up an important counterpoint, I'll work on #3 tonight or tomorrow evening to get an initial code/test patch, with docs coming shortly after (if not immediately).
msg215420 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2014-04-03 02:55
multiprocessing.connection uses the _winapi module for overlapped I/O and named pipes in case you're looking for examples:

http://hg.python.org/cpython/file/3.4/Lib/multiprocessing/connection.py
msg215426 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2014-04-03 07:38
I would recommended using _overlapped instead of _winapi.
I intend to move multiprocessing over in future.

Also note that you can do nonblocking reads by starting an overlapped read
then cancelling it immediately if it fails with "incomplete".  You will
need to recheck if it completes anyway because of a race.
msg215496 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-04-04 07:01
Quick update before I head to bed.

Thank you for the input, I had gotten the individual async calls working a couple days ago, and I was just working to replace the communicate() method for Windows.

Yes, I'm using asyncio._overlapped, though asyncio uses subprocess, so I needed to defer the import of _overlapped until one of a couple of crucial methods was called in order to prevent issues relating to circular imports.

I also ended up moving asyncio.windows_utils.pipe out to subprocess, as copying/pasting it for a third 'create an overlapped-io pipe' implementation for the standard library just doesn't make a lot of sense, and another circular import seemed like a bad idea. If/when subprocess goes away completely, someone else can move the function back.

With overlapped IOs able to be cancelled, it's not all that bad to make a completely re-entrant communicate() without threads, though I made a few mistakes in my first pass tonight that I need to fix tomorrow.
msg215582 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-04-05 06:52
All of the standard tests plus another few that I added all pass on Windows and Linux. I've got some cleanup and a couple more tests to add tomorrow, then I'll post a patch.

I ended up not using any overlapped IO cancellation in the Windows variant of communicate(), as there is no benefit to doing so, and the function is as re-entrant as the original.
msg215668 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-04-06 20:41
Should have uploaded this yesterday, but I got caught up with typical weekend activities. The docs probably need more, and I hear that select.select() is disliked, so that will probably need to be changed too.
msg215708 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2014-04-07 12:27
Some initial comments here: http://bugs.python.org/review/1191964/#msg1.
Also, if I think about integrating this into an IO loop it seems natural to me to think about timeouts. How complicated would it be to do this?

data = proc.read_nonblocking(timeout=0)
data = proc.read_stderr_nonblocking(timeout=0)
msg215709 - (view) Author: Akira Li (akira) * Date: 2014-04-07 14:33
> Also, Richard Oudkerk's claims above about not needing to use fcntl to swap flags is not correct. It's necessary to not block on reading, even if select is used. Select just guarantees that there is at least 1 byte or a closed handle, not that your full read will complete.

On Linux, `fcntl` is not necessary if you use `os.read(pipe, bufsize)` after `select` instead of `pipe.read(bufsize)`. `os.read` may just return less than `bufsize` bytes if they are not available.
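That approach can be sketched as follows (using a cooperative child process purely for demonstration):

```python
import os
import select
import subprocess
import sys

# A child that dribbles out two writes with a pause in between,
# so the parent sees partial data on the pipe.
proc = subprocess.Popen(
    [sys.executable, '-c',
     'import sys, time;'
     'sys.stdout.write("one"); sys.stdout.flush();'
     'time.sleep(0.2);'
     'sys.stdout.write("two"); sys.stdout.flush()'],
    stdout=subprocess.PIPE)

chunks = []
while True:
    # select guarantees at least one byte (or EOF) is available...
    select.select([proc.stdout], [], [])
    # ...and os.read returns whatever is there, up to bufsize,
    # without blocking for the full count -- no fcntl needed.
    data = os.read(proc.stdout.fileno(), 4096)
    if not data:
        break  # EOF: the child closed its end of the pipe
    chunks.append(data)
proc.wait()
```

The key point is that `os.read` on the raw descriptor is a single read(2) call, whereas `pipe.read(bufsize)` on the buffered file object may keep reading (and block) until it has the full bufsize or hits EOF.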
msg215711 - (view) Author: Akira Li (akira) * Date: 2014-04-07 16:15
Could `read_nonblocking()`, `write_nonblocking()` raise OSError(EAGAIN) (it could be named `ResourceTemporarilyUnavailableError`) if read/write would block?

It would make it possible to distinguish EOF (a permanent condition) from "read/write would block" (a temporary condition) without additional API calls.
msg215826 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-04-09 16:40
Submitting an updated patch addressing Giampaolo's comments.
msg215833 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2014-04-09 18:25
Can you explain why you write in 512 byte chunks.  Writing in one chunk should not cause a deadlock.
msg215956 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-04-12 00:03
I added the chunking for Windows because in manual testing before finishing the patch, I found that large sends on Windows without actually waiting for the result can periodically result in zero data sent, despite a child process that wants to read.

Looking a bit more, this zero result is caused by ov.cancel() followed by ov.getresult() raising an OSError, specifically:
[WinError 995] The I/O operation has been aborted because of either a thread exit or an application request

That causes us to observe on the Python side of things that zero data was sent for some writes, but when looking at what the child process actually received, we discover that some data was actually sent. How much compared to what we thought we sent? That depends. I observed in testing today that the client could receive ~3.5 megs when we thought we had sent ~3 megs.

To make a long story short-ish, using Overlapped IO with WriteFile() and Overlapped.cancel(), without pausing between attempts with either a sleep or something else results in a difference in what we think vs. reality roughly 87% of the time with 512 byte chunks (87 trials out of 100), and roughly 100% of the time with 4096 byte chunks (100 trials out of 100). Note that this is when constantly trying to write data to the pipe. (each trial is as many Popen.write_nonblocking() calls as can complete in .25 seconds)

Inducing a 1 ms sleep between each overlapped.WriteFile() attempt drops the error rate to 0/100 trials and 1/100 trials for 512 byte and 4096 byte writes, respectively. Testing for larger block sizes suggests that 2048 bytes is the largest send that we can push through and actually get correct results.


So, according to my tests, there isn't a method by which we can both cancel an overlapped IO while at the same time guaranteeing that we will account exactly for the data that was actually sent without adding an implicit or explicit delay. Which makes sense as we are basically trying to interrupt another process in their attempts to read data that we said they could read, but doing so via a kernel call that interrupts another kernel call that is doing chunk-by-chunk copies from our write buffer (maybe to some kernel memory then) to their read buffer.

Anyway, by cutting down what we attempt to send at any one time, and forcing delays between attempted sends, we can come pretty close to guaranteeing that child processes don't have any sends that we can't account for. I'll try to get a patch out this weekend that encompasses these ideas with a new test that demonstrates the issue on Windows (for those who want to verify my results).
msg215969 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2014-04-12 11:32
I added some comments.

Your problem with lost data may be caused by the fact you call ov.cancel() and expect ov.pending to tell you whether the write has/will succeed.  Instead you should use ov.getresult() and expect either success or an "aborted" error.
msg215990 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-04-13 00:33
No, the problem is that ov.cancel() will attempt to cancel the IO, but a subsequent ov.getresult(True) doesn't always return what was *actually* written to the pipe unless you explicitly wait for the result to be available. But even if you explicitly wait for ov.pending to be False, even if you alternate between checking for ov.pending to be False and for WaitForSingleObject() to return that the event was signaled (which may never actually happen, according to my tests), ov.getresult(True) could *still* raise an exception (that same WinError 995), and we *still* don't know how much data was sent.

As one of your comments on subprocess_2.patch, you said:
> Should probably add an assertion that 512 bytes was written.

That's not always the case. I found several odd byte writes, and some writes that were whatever blocksize I'd chosen minus 32 bytes (though not reported on the write side due to the aforementioned exception/counting error, but the child process read an odd number of bytes).

How about you take a look at the patch I'm uploading now. It gets rid of the loop in write_nonblocking(), adds a blocksize parameter on reading as per Giampaolo's request, and, because I was in there, adds a timeout to writing.

If in this new patch I'm doing something wrong, and you can explain via code how to guarantee that ProcessTestCase._test_multiple_passes doesn't fail, I'd really appreciate it.
msg216226 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-04-14 21:50
I ended up eliminating the overlapped IO cancel call on Windows. Better to be correct than to minimize internal state. Instead, we keep a reference to the overlapped IO object, and any attempts to write to the child stdin before the previous overlapped IO completes are kicked back as a 0 byte write. The communicate() method does the right thing with pre-existing non-blocking writes, whether input is passed or not.

I also eliminated universal_newline support for non-blocking reads and writes to prevent error conditions on edge cases:

On the write side of things, you could end up with a partial multi-byte character write, which with universal newlines, would be impossible to finish the send using the public API without first modifying state attributes on the Popen object (disabling universal newlines for a subsequent bytes write_nonblocking() call).

On the read side of things, if you get a partial read of a multi-byte character, then the subsequent decode operation would fail with a UnicodeDecodeError. Though you could pull the original bytes from the exception, that's an awful API to create.
msg216442 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-04-16 06:47
I suggest to change the title of the issue to: "subprocess: add non-blocking read and write methods" to avoid the confusion with asyncio subprocess module which runs read and write "in the background" for you.
msg216445 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-04-16 07:28
I started to review the patch 5:
http://bugs.python.org/review/1191964/#ps11598

When I read unit tests, I realized that I don't like "write_nonblocking" name. It's too generic. A process has many files (more than just stdin, stdout, stderr: see pass_fds parameter of Popen). I would like an explicit "write_stdin_nonblocking" and "read_stdout_nonblocking".
msg216665 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-04-17 07:08
Victor, I addressed the majority of your comments except for a couple stylistic pieces. Your annoyance with the short poll time for Windows made me re-read the docs from MS, which made me realize that my interpretation was wrong. It also made me confirm Richard Oudkerk's earlier note about ReadFile on overlapped IOs.

I left similar notes next to your comments.

On the method naming side of things, note that you can only write to "stdin", and you can only read from "stdout" or "stderr". This is documented behavior, so I believe the method names are already quite reasonable (and BDFL approved ;)).
msg216669 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2014-04-17 07:22
If you use the short timeouts to make the wait interruptible then you can
use WaitForMultipleObjects() (which automatically waits on an extra event
object) instead of WaitForSingleObject().
msg216692 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-04-17 14:35
Richard: short timeouts are no longer an issue. Nothing to worry about :)
msg218826 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-05-20 00:21
First off, thank you everyone who has reviewed and commented so far. I very much appreciate your input and efforts.

Does anyone have questions, comments, or concerns about the patch? If no one mentions anything in the next week or so, I'll ping the email thread.
msg218861 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-05-21 01:27
@Josiah: Modifications of the asyncio module should be done first in the Tulip project because we try to keep the same code base in Tulip, Python 3.4 and 3.5. You may duplicate the code to avoid this dependency?

For the documentation, I don't think that you need ".. note" and ".. warning", just write paragraphs of text.
msg219373 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-05-30 00:04
I submitted an issue to the tulip/asyncio bug tracker: https://code.google.com/p/tulip/issues/detail?id=170

And I am uploading a new patch that only includes non-tulip/asyncio related changes, as tulip/asyncio changes will eventually be propagated to Python.
msg219374 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2014-05-30 00:07
FWIW while the Tulip changes should indeed go into the Tulip repo first, the rest should be committed to the CPython 3.5 tree first. Then I'll commit the Tulip changes, first to the Tulip repo, then to the CPython 3.4 branch (yes!) and then merge that into the 3.5 (default) branch. (Yes, I know it's complicated, but it beats other ways of keeping the Tulip and CPython trees in sync. And I have a script in the Tulip tree that automates most of the work.)
msg219432 - (view) Author: Akira Li (akira) * Date: 2014-05-31 00:57
> Does anyone have questions, comments, or concerns about the patch?

It seems the current API doesn't distinguish between BlockingIOError (temporary error), BrokenPipeError (permanent error) and EOF (permanent
non-error condition) -- everything is treated as EOF. Is it intentional?
msg219616 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2014-06-02 18:47
First, with the use of Overlapped IO on Windows, BlockingIOError should never come up, and we don't even handle that exception.

As for BrokenPipeError vs. EOF; while we do treat them more or less the same, we aren't killing the process or reporting that the process is dead, and we tell users to check the results of Popen.poll() in the docs.

Could we bubble up BrokenPipeError? Sure. Does it make sense? In the case of Popen.communicate(), exposing the error and changing expected behavior doesn't make sense.

I have implemented and would continue to lean towards continuing to hide BrokenPipeError on the additional API endpoints. Why? If you know that a non-blocking send or receive could result in BrokenPipeError, now you need to wrap all of your communication pieces in BrokenPipeError handlers. Does it give you more control? Yes. But the majority of consumers of this API (and most APIs in general) will experience failure and could lose critical information before they realize, "Oh wait, I need to wrap all of these communication pieces in exception handlers."

I would be open to adding either a keyword-only argument to the methods and/or an instance attribute to enable/disable raising of BrokenPipeError during the non-blocking calls, if others think that this added level of control is worthwhile. I would lean towards an instance attribute that defaults to False.
msg219656 - (view) Author: Akira Li (akira) * Date: 2014-06-03 02:44
> First, with the use of Overlapped IO on Windows, BlockingIOError should never come up, and we don't even handle that exception.

Is it correct that the way to distinguish between "would block" and EOF is Popen.poll()? Is it possible to detect it in the _nonblocking() methods and raise BlockingIOError manually (or return None from read_..()) in the "would block" case?

> But the majority of consumers of this API (and most APIs in general) will experience failure and could lose critical information before they realize, "Oh wait, I need to wrap all of these communication pieces in exception handlers."

That is an argument for catching *all* OSErrors internally, which is not a good idea.

Popen.communicate() hides EPIPE in the current implementation, therefore its behaviour shouldn't change. But subprocess' code allows Popen.stdin.write() to raise "broken pipe". My questions are about the _nonblocking() methods: e.g., it is not obvious whether write_nonblocking() should mask EPIPE as EOF. If it does, then it should be documented.
msg223799 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-07-23 23:51
> I have implemented and would continue to lean towards continuing to hide BrokenPipeError on the additional API endpoints.

FYI asyncio.Process.communicate() ignores BrokenPipeError and ConnectionResetError, whereas asyncio.Process.stdin.drain() (coroutine to wait until all bytes are written) raises a BrokenPipeError or ConnectionResetError if the child process exited. I think subprocess has the same design.

(I recently modified asyncio to ignore BrokenPipeError in communicate(); it was a bug.)
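The split Victor describes can be demonstrated with the asyncio API itself. A hedged, POSIX-only sketch; "head -c 4" is just a stand-in child that exits after reading 4 bytes, so a long stream of writes eventually hits a broken pipe, which drain() surfaces:

```python
import asyncio

# Sketch: Process.stdin.write() + drain() raises BrokenPipeError or
# ConnectionResetError once the child exits, whereas communicate()
# hides those errors. POSIX-only; "head" is a stand-in child.
async def main():
    proc = await asyncio.create_subprocess_exec(
        "head", "-c", "4",
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.DEVNULL,
    )
    saw_error = False
    try:
        for _ in range(10000):
            proc.stdin.write(b"x" * 1024)
            await proc.stdin.drain()  # raises once the pipe is broken
    except (BrokenPipeError, ConnectionResetError):
        saw_error = True
    proc.stdin.close()
    await proc.wait()
    return saw_error

saw_error = asyncio.run(main())
print(saw_error)
```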
msg223827 - (view) Author: Akira Li (akira) * Date: 2014-07-24 11:28
STINNER Victor <report@bugs.python.org> writes:
>
>> I have implemented and would continue to lean towards continuing to
> hide BrokenPipeError on the additional API endpoints.
>
> FYI asyncio.Process.communicate() ignores BrokenPipeError and
> ConnectionResetError, whereas asyncio.Process.stdin.drain() (coroutine
> to wait until all bytes are written) raises a BrokenPipeError or
> ConnectionResetError if the child process exited. I think subprocess
> has the same design.

Do Popen.write_nonblocking() and Popen.read_nonblocking() methods
belong to the second category? Should they raise BrokenPipeError?
msg223828 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-07-24 11:45
> Do Popen.write_nonblocking() and Popen.read_nonblocking() methods
> belong to the second category? Should they raise BrokenPipeError?

IMO when you write directly to stdin and read from stdout/stderr of a
child process, your code should be written to handle BrokenPipeError.
You must decide how to handle them. Otherwise, you have to poll the
child process manually using proc.poll(), which is less efficient.

If you forget to poll the process, your program may enter an infinite
loop that only occurs in some cases (a bug in the child process which
makes it exit before reading the whole stdin input).
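The handle-the-exception approach described here can be sketched with plain blocking subprocess pipes. POSIX-only, and "head -c 4" is an assumed stand-in child that exits after reading 4 bytes, so a long stream of writes eventually hits a broken pipe:

```python
import subprocess

# Sketch: write directly to the child's stdin and handle
# BrokenPipeError, instead of calling Popen.poll() before every write.
proc = subprocess.Popen(
    ["head", "-c", "4"],
    stdin=subprocess.PIPE,
    stdout=subprocess.DEVNULL,
)
broken = False
try:
    for _ in range(10000):
        proc.stdin.write(b"x" * 1024)
        proc.stdin.flush()
except BrokenPipeError:
    broken = True  # the child exited before reading everything
finally:
    try:
        proc.stdin.close()
    except BrokenPipeError:
        pass
    proc.wait()
print(broken)
```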
msg238874 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2015-03-22 05:44
Okay, I'm sorry for falling asleep at the wheel on this. At this point, I don't expect this to go in for 3.5 - but if it has a chance, I'll do what I need to do to get it there. Let me know.

I agree with the .communicate() not raising on broken pipe, and read/write_nonblocking raising on broken pipe. I'll get a patch in with the comments on the existing code review, along with those changes on Monday or Tuesday.
msg238887 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-03-22 08:50
Regarding special cases on the reading side, I think there should be an easy way to distinguish them. The existing RawIOBase way is probably as good as any, unless you want to take the chance to invent a better API:

* EOF (pipe closed, no more data ever) should be indicated by returning b""
* Empty pipe (still open, future data possible) should be indicated by returning None

Beware that BlockingIOError is normally not raised by Python’s standard library, despite the BufferedIOBase documentation, and I think it should say that way (the implementation, that is). See Issue 13322.

Regarding handling a broken pipe condition, it would be nice to have a way of signalling that to the caller, whether via an exception or other means. For example, if the subprocess is a video player which gets closed early by the end user, it would be good to know to stop wasting time and bandwidth generating and piping video to the player.
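The RawIOBase convention referenced here can be illustrated with an unbuffered FileIO over a plain POSIX pipe (not the patch's Popen methods): read() returns None for "empty but still open" and b"" for EOF.

```python
import os

# Sketch of the RawIOBase convention: None = empty pipe (still open),
# b"" = EOF (pipe closed, no more data ever).
r, w = os.pipe()
os.set_blocking(r, False)
raw = open(r, "rb", buffering=0)  # unbuffered FileIO over the read end

empty = raw.read(1024)   # pipe open, no data yet -> None
os.write(w, b"data")
data = raw.read(1024)    # data available -> b"data"
os.close(w)
eof = raw.read(1024)     # writer closed, buffer drained -> b""
raw.close()
print(empty, data, eof)
```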
msg239277 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2015-03-25 18:32
I'm going to be honest; seeing None being returned from a pipe read feels *really* broken to me. When I get None returned from an IO read operation, my first instinct is "there can't be anything else coming, why else would it return None?"

After changing writing to raise BrokenPipeError on a closed pipe, I've also changed reading to do the same and added a test to verify the behavior.
msg239280 - (view) Author: Akira Li (akira) * Date: 2015-03-25 19:56
> I'm going to be honest; seeing None being returned from a pipe read feels *really* broken to me. When I get None returned from an IO read operation, my first instinct is "there can't be anything else coming, why else would it return None?"

That is how it is done in similar cases (returning `None` to indicate
"would block"). Do you know a better method that would allow distinguishing
between EOF (no future data, ever) and "would block" (future data possible)
without calling process.poll()?
msg239282 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-03-25 21:55
Returning None for non-blocking I/O is standard in Python.
msg239299 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-03-26 02:02
Josiah’s code now seems to handle the four special cases like this:

* Reading a closed pipe raises BrokenPipeError. This is novel to me, and bends the current definition of the exception. But it appears to be the way Windows works at a lower level, and does make a bit of sense to me. However it is inconsistent with how normal reading from files, pipes and sockets works, which return the special value b"" in the equivalent case. Also various communications protocols use an empty packet to signal EOF.

* Reading an empty pipe returns b"". This is consistent with partial reads, which I like. However it could be surprising for someone used to b"" signalling EOF. Alternatives are returning None like RawIOBase.read(), and raising BlockingIOError like os.write() and socket.send().

* Writing to a closed pipe raises BrokenPipeError. This is consistent with how other APIs work, but since it is an asynchronous condition, a naive program may work sometimes without handling it, and then see spurious tracebacks or worse in other situations. Maybe dropping the data and setting a flag could be an alternative.

* Writing to a full pipe returns zero. This is consistent with partial writes, which I like. Alternatives are returning None like RawIOBase.write(), and raising BlockingIOError like os.write(), socket.send(), BufferedIOBase.write().
msg239366 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2015-03-26 23:39
Non-blocking IO returning a None on "no data currently available" is new to me. It is new to me simply because I don't recall it ever happening in Python 2.x with any socket IO that I ever dealt with, and socket IO is my primary source for non-blocking IO experience.

From my experience, reading from a socket would always return a string, unless the connection was closed/broken, at which point you got a bad file descriptor exception. Writing would return 0 if the buffer was full, or a bad file descriptor exception if the connection was closed/broken. The implementation of asyncore/asynchat at least until 3.0 shows this in practice, and I doubt this has changed in the intervening years since I updated those libraries for 3.0.

My choice to implement this based on BrokenPipeError was based on reading Victor(haypo)'s comments from 2014-07-23 and 2014-07-24, in which he stated that in asyncio, when it comes to process management, .communicate() hides exceptions so that communication can finish, but direct reading and writing surface those exceptions.

I was attempting to mimic the asyncio interface simply because it made sense to me coming from the socket world, and because asyncio is where people are likely to go if the non-blocking subprocess support in subprocess is insufficient for their needs.

Note that I also initially resisted raising an exception in those methods, but Victor(haypo)'s comments changed my mind.

> Do you know a better method that would allow to distinguish between EOF (no future data, ever) and "would block" (future data possible) without calling process.poll()?

Better is subjective when it comes to API, but different? Yes, it's already implemented in patch #8. BrokenPipeError exception when no more data is sendable/receivable. A 0 or b'' when writing to a full buffer, or reading from an empty buffer. This API tries to be the same as the asyncio.subprocess.Process() behavior when accessing stdin, stdout, and stderr directly.

> Returning None for non blocking I/O is standard in Python.

In some contexts, yes. In others, no. The asyncio.StreamReader() coroutines .read(), .readline(), and .readexactly() return strings. Raw async or sync sockets don't return None on read. SSL-wrapped async or sync sockets don't return None on read. Asyncio low-level socket operations don't yield None on a (currently empty) read.


In the context of the subprocess module as it exists in 3.4 (and 3.5 as it is unpatched), the only object that returns None on read when no data is available, but where data *might* be available in the future, is an underlying posix pipe (on posix platforms) - which isn't generally used directly.

The purpose of this patch is to expose stdin, stdout, and stderr in a way that allows non-blocking reads and writes from the subprocess that also plays nicely with .communicate() as necessary. Directly exposing the pipes doesn't work due to API inconsistencies between Windows and posix, so we have to add a layer.

I would argue that this layer of non-blocking (Windows and posix) pipe abstraction has much more in common with how asyncio deals with process IO, or how sockets deal with IO, than it does with pipes in general (the vast majority of which are blocking), so I would contend that it should have a similar API (no returning None).

That said, considering that the expected use of non-blocking subprocess communication *does not* include using a multiplexer (select, poll, etc.), I have the sense that a few other higher-level methods might be warranted to ease use substantially, and which I would expect people would end up using much more often than the lower-level direct read/write operations (different names might make more sense here):
.write_all(data, timeout=None)
.read_available(bufsize=-1) (and .read_stderr_available)
.read_exactly(bufsize, timeout=None) (and .read_stderr_exactly)
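These higher-level methods were never added to subprocess, but the proposed write_all(data, timeout=None) could plausibly be built on a multiplexer internally. A hypothetical, POSIX-only sketch over a raw pipe fd; the name and signature are taken from the proposal above, everything else is an assumption:

```python
import os
import selectors
import time

# Hypothetical standalone helper, NOT the actual proposed Popen API:
# write all of `data` to a non-blocking pipe fd, or raise TimeoutError.
def write_all(fd, data, timeout=None):
    os.set_blocking(fd, False)
    deadline = None if timeout is None else time.monotonic() + timeout
    sel = selectors.DefaultSelector()
    sel.register(fd, selectors.EVENT_WRITE)
    try:
        view = memoryview(data)
        while len(view):
            remaining = None if deadline is None else deadline - time.monotonic()
            if remaining is not None and remaining <= 0:
                raise TimeoutError("write_all timed out")
            if not sel.select(remaining):
                raise TimeoutError("write_all timed out")
            try:
                n = os.write(fd, view)  # partial writes are expected
            except BlockingIOError:
                continue
            view = view[n:]
    finally:
        sel.close()

r, w = os.pipe()
write_all(w, b"hello world", timeout=5)
result = os.read(r, 64)
print(result)
```

A read_exactly() counterpart would be the mirror image: select for EVENT_READ and accumulate chunks until bufsize bytes arrive or the deadline passes.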
msg239372 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-03-27 01:31
Yes, I think the special return value of None is only standard to the “io” module, which was added in Python 2.6 and 3. And even there it is inconsistent.

For what it’s worth, I don’t have strong opinions on the special non-blocking and broken-pipe cases, as long as they can be differentiated. So the current situation is fine by me. It is just that potentially surprising differences with other APIs [e.g. read() -> b"" means empty buffer, not closed pipe] need documenting.

The asyncio.StreamReader.read() etc methods are documented as coroutines, so the problem of differentiating an empty buffer from a closed pipe does not exist there.

Just a thought: would it make sense for these methods to be decoupled from the subprocess class, and generalized to apply to any pipes, as a kind of Windows vs POSIX abstraction layer? Then you wouldn’t have to add duplicate methods like read_stderr_available() and read_stderr_exactly() all the time. You might also have a more portable way of doing non-blocking reads from sys.stdin, etc as well.
msg341245 - (view) Author: Josiah Carlson (josiahcarlson) * (Python triager) Date: 2019-05-02 04:12
Someone else can resurrect this concept and/or patch if they care about this feature. Best of luck to future readers.
History
Date User Action Args
2022-04-11 14:56:11  admin  set  github: 41922
2019-05-02 13:03:55  vstinner  set  resolution: out of date
2019-05-02 04:12:38  josiahcarlson  set  status: open -> closed; messages: + msg341245; stage: test needed -> resolved
2015-03-27 01:31:11  martin.panter  set  messages: + msg239372
2015-03-26 23:39:11  josiahcarlson  set  messages: + msg239366
2015-03-26 02:02:02  martin.panter  set  messages: + msg239299
2015-03-25 21:55:30  vstinner  set  messages: + msg239282
2015-03-25 19:56:08  akira  set  messages: + msg239280
2015-03-25 18:32:19  josiahcarlson  set  files: + subprocess_8.patch; messages: + msg239277
2015-03-22 08:50:40  martin.panter  set  messages: + msg238887
2015-03-22 05:44:37  josiahcarlson  set  messages: + msg238874
2014-07-24 11:45:05  vstinner  set  messages: + msg223828
2014-07-24 11:28:33  akira  set  messages: + msg223827
2014-07-23 23:51:33  vstinner  set  messages: + msg223799
2014-06-19 11:12:27  vstinner  set  title: asynchronous Subprocess -> add non-blocking read and write methods to subprocess.Popen
2014-06-06 11:11:35  giampaolo.rodola  set  components: + asyncio
2014-06-03 02:44:02  akira  set  messages: + msg219656
2014-06-02 18:47:09  josiahcarlson  set  messages: + msg219616
2014-05-31 00:57:57  akira  set  messages: + msg219432
2014-05-30 00:07:47  gvanrossum  set  messages: + msg219374
2014-05-30 00:04:21  josiahcarlson  set  files: + subprocess_7.patch; messages: + msg219373
2014-05-21 01:27:53  vstinner  set  nosy: + gvanrossum, yselivanov; messages: + msg218861
2014-05-20 00:21:36  josiahcarlson  set  messages: + msg218826
2014-04-17 14:35:52  josiahcarlson  set  messages: + msg216692
2014-04-17 07:22:45  sbt  set  messages: + msg216669
2014-04-17 07:08:06  josiahcarlson  set  files: + subprocess_6.patch; messages: + msg216665
2014-04-17 06:52:05  martin.panter  set  nosy: + martin.panter
2014-04-16 07:28:27  vstinner  set  messages: + msg216445
2014-04-16 06:47:56  vstinner  set  nosy: + vstinner; messages: + msg216442
2014-04-15 00:49:20  josiahcarlson  set  files: + subprocess_5.patch
2014-04-14 21:50:37  josiahcarlson  set  files: + subprocess_4.patch; messages: + msg216226
2014-04-13 00:34:06  josiahcarlson  set  files: + subprocess_3.patch; messages: + msg215990
2014-04-12 11:32:56  sbt  set  messages: + msg215969
2014-04-12 00:04:00  josiahcarlson  set  messages: + msg215956
2014-04-09 18:25:15  sbt  set  messages: + msg215833
2014-04-09 16:40:13  josiahcarlson  set  files: + subprocess_2.patch; messages: + msg215826
2014-04-07 16:15:01  akira  set  messages: + msg215711
2014-04-07 14:33:13  akira  set  nosy: + akira; messages: + msg215709
2014-04-07 12:27:24  giampaolo.rodola  set  messages: + msg215708
2014-04-06 20:41:32  josiahcarlson  set  files: + subprocess.patch; messages: + msg215668
2014-04-05 06:52:39  josiahcarlson  set  messages: + msg215582
2014-04-04 07:01:56  josiahcarlson  set  messages: + msg215496
2014-04-03 07:38:43  sbt  set  messages: + msg215426
2014-04-03 02:55:11  eryksun  set  nosy: + eryksun; messages: + msg215420
2014-04-03 00:25:20  josiahcarlson  set  messages: + msg215411
2014-03-31 07:13:17  v+python  set  nosy: + v+python
2014-03-30 13:14:15  janzert  set  nosy: + janzert
2014-03-30 11:28:34  sbt  set  messages: + msg215178
2014-03-30 06:54:10  josiahcarlson  set  versions: + Python 3.5, - Python 3.3, Python 3.4
2014-03-30 06:54:01  josiahcarlson  set  messages: + msg215171
2014-02-03 19:05:58  BreamoreBoy  set  nosy: - BreamoreBoy
2012-12-08 00:12:49  techtonik  set  messages: + msg177139; versions: + Python 3.4
2012-05-23 13:52:09  sbt  set  messages: + msg161409
2012-05-23 11:05:55  rosslagerwall  set  messages: + msg161400
2012-05-22 21:03:00  cvrebert  set  nosy: + cvrebert
2012-05-22 16:07:59  ericpruitt  set  messages: + msg161362
2012-05-22 14:27:40  sbt  set  messages: + msg161356
2012-05-22 14:17:05  sbt  set  nosy: + sbt; messages: + msg161352
2012-05-22 09:20:35  rosslagerwall  set  nosy: + rosslagerwall
2012-05-22 09:20:23  rosslagerwall  set  assignee: astrand ->
2012-05-22 05:44:27  techtonik  set  nosy: + techtonik; messages: + msg161326
2011-01-27 13:51:44  giampaolo.rodola  set  nosy: josiahcarlson, astrand, parameter, giampaolo.rodola, ajaksu2, ooooooooo, r.david.murray, ericpruitt, BreamoreBoy, Andrew.Boettcher; versions: + Python 3.3, - Python 3.2
2010-10-25 14:36:48  Andrew.Boettcher  set  nosy: + Andrew.Boettcher
2010-08-21 14:14:09  BreamoreBoy  set  nosy: + BreamoreBoy; messages: + msg114495; versions: - Python 2.7
2009-06-16 17:13:19  r.david.murray  set  nosy: + r.david.murray; messages: + msg89433; versions: + Python 3.2, - Python 3.1
2009-06-13 01:50:15  ericpruitt  set  nosy: + ericpruitt; messages: + msg89309
2009-03-30 23:41:41  josiahcarlson  set  messages: + msg84719
2009-03-30 23:12:47  giampaolo.rodola  set  nosy: + giampaolo.rodola
2009-03-30 22:36:40  ajaksu2  set  components: + Library (Lib), - None; versions: + Python 3.1, Python 2.7; keywords: + patch; nosy: + ajaksu2; messages: + msg84692; stage: test needed
2005-04-28 20:40:52  josiahcarlson  create