classification
Title: Adding a recvexactly() to socket.socket: receive exactly n bytes
Type: enhancement
Stage: test needed
Components: Extension Modules
Versions: Python 3.2
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: giampaolo.rodola, irmen, loewis, martin.panter, pfalcon, pitrou, vstinner
Priority: normal
Keywords: patch

Created on 2005-01-16 04:02 by irmen, last changed 2022-04-11 14:56 by admin.

Files
File name Uploaded Description
socketmodulepatch.txt irmen, 2010-04-05 15:07 patch for Modules/socketmodule.c
libpatch.txt irmen, 2010-04-05 15:08 patch for Lib/socket.py
docpatch.txt irmen, 2010-04-05 15:08 patch for Doc/library/socket.rst
recvall.patch vstinner, 2015-09-03 15:47
Messages (16)
msg47558 - (view) Author: Irmen de Jong (irmen) (Python triager) Date: 2005-01-16 04:02
This patch is a first take at adding a recvall method
to the socket object, to mirror the existence of the
sendall method.

If the MSG_WAITALL flag is available, the recvall
method just calls recv() with that flag.

If it is not available, it uses an internal loop
(during the loop, threads are allowed, so this improves
concurrency).

Having this method makes Python code much simpler;
before you had to test for MSG_WAITALL yourself and
write your own loop in Python if the flag is not there
(on Windows for instance).
(also, having the loop in C improves performance and
concurrency compared to the same loop in Python)

Note: the patch hasn't been tested very well yet.

(code is based on a separate extension module found
here: http://www.it-ernst.de/python/ )
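
For illustration only, the behaviour described in this message can be sketched in pure Python rather than C. This is not the attached patch: the name `recvall_sketch` is hypothetical, and the loop is kept even when MSG_WAITALL is available, on the assumption that recv() may still return fewer bytes than requested.

```python
import socket

# MSG_WAITALL is missing on some platforms; fall back to "no flag" there.
_WAITALL = getattr(socket, "MSG_WAITALL", 0)


def recvall_sketch(sock, size):
    """Hypothetical helper: receive exactly `size` bytes or raise EOFError."""
    chunks = []
    remaining = size
    while remaining > 0:
        # Ask only for what is still missing so we never read too much.
        chunk = sock.recv(remaining, _WAITALL)
        if not chunk:
            raise EOFError("connection closed with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

With a connected pair from socket.socketpair(), sending b"hello world" on one end and calling recvall_sketch(other_end, 5) returns b"hello".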
msg47559 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2005-02-24 20:55
I like the feature (but see below). The patch is incomplete,
though:
- there are no changes to Doc/lib/libsocket.tex
- there are no changes to Lib/test/test_socket.py

Furthermore, the patch is also wrong: if a later recv call
fails, all data read so far are discarded. I believe this is
different from the WAITALL flag, which I hope will preserve
the data in the socket, for a subsequent recv call. As
keeping the data in the socket seems unimplementable, the
partial data should somehow be returned to the application.

A note on coding style: please omit the spaces after the
opening paren and before the closing in

while ( (bytes_got<total_size) && (n > 0) )


msg99844 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2010-02-22 21:45
Irmen, do you want to update this patch for the current Python trunk, taking Martin's comments into account?
msg101178 - (view) Author: Irmen de Jong (irmen) (Python triager) Date: 2010-03-16 18:36
Sure, I'll give it another go.

I've not done any C development for quite a while, though, so I have to pick up the pieces and see how far I can get. Also, I don't have a compiler for Windows, so I may need someone else to validate the patch on Windows for me once I've got something together.
msg101206 - (view) Author: Irmen de Jong (irmen) (Python triager) Date: 2010-03-17 00:14
Ok I've looked at it again and think I can build an acceptable patch this time. However there are 2 things that I'm not sure of:

1) how to return the partial data to the application if the recv() loop fails before completion. Because the method will probably raise an exception on failure, as usual, it seems to me that the best place to put the partial data is inside the exception object. I can't think of another easy and safe way for the application to retrieve it otherwise. But, how is this achieved in code? I'll be using set_error() to return an error from my sock_recvall function I suppose.

2) the trunk is Python 2.7, should I make a separate patch for 3.x?
msg102374 - (view) Author: Irmen de Jong (irmen) (Python triager) Date: 2010-04-05 14:45
Ok I think I've got the code and doc changes ready. I added a recvall and a recvall_into method to the socket module. Any partially received data in case of errors is returned to the application as part of the args for a new exception, socket.partialdataerror.

Still need to work on some unit tests for these new methods.
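
A rough pure-Python sketch of the "partial data inside the exception" idea follows. It is not the code from the attached patches: `PartialDataError`, its `partial` attribute and `recvall_keep_partial` are hypothetical names chosen for this example, and deriving from OSError is an assumption.

```python
class PartialDataError(OSError):
    """Hypothetical exception: carries the bytes received before the failure."""

    def __init__(self, message, partial):
        super().__init__(message)
        self.partial = partial


def recvall_keep_partial(sock, size):
    """Receive exactly `size` bytes; on failure, attach what was read so far."""
    buf = bytearray()
    while len(buf) < size:
        try:
            chunk = sock.recv(size - len(buf))
        except OSError as exc:
            # Replace the original error, but keep it chained as __cause__.
            raise PartialDataError(str(exc), bytes(buf)) from exc
        if not chunk:
            raise PartialDataError("connection closed early", bytes(buf))
        buf += chunk
    return bytes(buf)
```

A caller can then catch PartialDataError and inspect `exc.partial` instead of losing the data that had already arrived.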
msg102382 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2010-04-05 16:01
Just a couple comments:

  * If MSG_WAITALL is defined and a signal interrupts recv, will a string shorter than requested be returned by sock_recvall?
  * Since MSG_WAITALL is already exposed to Python (when the underlying platform provides it), I wonder if this could all be implemented more simply in pure Python.  Can you elaborate on the motivation to use C?

Someone should do another review when there are unit tests.
msg102391 - (view) Author: Irmen de Jong (irmen) (Python triager) Date: 2010-04-05 17:58
Currently, if MSG_WAITALL is defined, recvall() just calls recv() internally with the extra flag. Maybe that isn't the smartest thing to do, because it duplicates recv's behavior on errors, which is to release the data and raise an error.
Would it be nicer to have recvall() release the data and raise an error, or to let it return the partial data? Either way, I think the behavior should be the same regardless of MSG_WAITALL being available. This is not yet the case.

Why C: this started out by making the (very) old patch that I once wrote for socketmodule.c up to date with the current codebase, and taking Martin's comments into account.
The old patch was small and straightforward. Unfortunately the new one turned out bigger and more complex than I thought. For instance I'm not particularly happy with the way recvall returns the partial data on fail. It uses a new exception for that but the code has some trickery to replace the socket.error exception that is initially raised. I'm not sure if my code is the right way to do this, it needs some review. I do think that putting it into the exception object is the only safe way of returning it to the application, unless the semantics on error are changed as mentioned above. Maybe it could be made simpler then.
In any case, it probably is a good idea to see if a pure python solution (perhaps just some additions to Lib/socket.py?) would be better. Will put some effort into this.
msg114395 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-08-19 18:44
@Irmen if you do proceed with this it should be against the py3k trunk.
msg234117 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-01-16 08:39
I'm frankly not sure why this is useful. If you want a guaranteed read size you should use the buffered layer - i.e. socket.makefile(). No need to complicate the raw socket implementation.
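
As a minimal sketch of the buffered-layer approach, assuming a blocking stream socket (socket.socketpair() is used here only to make the example self-contained):

```python
import socket

a, b = socket.socketpair()
try:
    a.sendall(b"headerpayload")
    # Binary, buffered file object layered on top of the socket.
    with b.makefile("rb") as reader:
        header = reader.read(6)    # blocks until 6 bytes arrive or EOF
        payload = reader.read(7)
finally:
    a.close()
    b.close()
```

Two caveats with this approach: read(n) returns a short result at EOF rather than raising, and the buffered reader may pull more bytes from the socket than were asked for, so mixing it with direct recv() calls on the same socket is risky.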
msg234126 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-01-16 11:16
The patch uses the flag MSG_WAITALL for recv() if available. Extract of the manual page:

       MSG_WAITALL (since Linux 2.2)
              This flag requests that  the  operation  block  until  the  full
              request  is  satisfied.  However, the call may still return less
              data than requested if a signal is caught, an error  or  discon-
              nect  occurs,  or the next data to be received is of a different
              type than that returned.

It looks interesting, but it doesn't guarantee that you will always get exactly the expected size. You still have to call recv() again to get more data if a signal was received.


Jean-Paul Calderone wrote:
> Since MSG_WAITALL is already exposed to Python (when the underlying platform provides it), I wonder if this could all be implemented more simply in pure Python.  Can you elaborate on the motivation to use C?

sendall() is implemented in C while it would be possible to implement it in Python. The same rationale can be used on a large part of the stdlib :-) (The io module is implemented in Python in Python 2.6!)

C gives you full control over the GIL and signal handling, and it might be faster.


Antoine Pitrou wrote:
> I'm frankly not sure why this is useful.

recvall() makes it easy to fix existing code: just replace recv() with recvall(), with no need to refactor the code to call makefile(), which has a different API (e.g. read/recv, write/send).

The addition is small and well defined.

--

About the exception: asyncio.StreamReader.readexactly() raises an IncompleteReadError which contains the read bytes and inherits from EOFError: see
https://docs.python.org/dev/library/asyncio-stream.html#asyncio.StreamReader.readexactly
and
https://docs.python.org/dev/library/asyncio-stream.html#asyncio.IncompleteReadError

The following issue discussed the design of this exception in asyncio:
https://code.google.com/p/tulip/issues/detail?id=111

http.client uses an IncompleteRead (which inherits from HTTPException):
https://docs.python.org/dev/library/http.client.html#http.client.IncompleteRead
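
The asyncio behaviour mentioned above can be observed without any network connection by feeding a StreamReader by hand. This is only a small demonstration of the existing asyncio API; the coroutine name `demo` is made up for the example.

```python
import asyncio


async def demo():
    reader = asyncio.StreamReader()
    reader.feed_data(b"abc")   # only 3 bytes are available...
    reader.feed_eof()          # ...and then the stream ends
    try:
        await reader.readexactly(5)
    except asyncio.IncompleteReadError as exc:
        # The exception is an EOFError and carries the partial data.
        assert isinstance(exc, EOFError)
        return exc.partial, exc.expected


partial, expected = asyncio.run(demo())
```

Here `partial` holds the three bytes that did arrive and `expected` is the requested count of 5.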
msg234157 - (view) Author: Irmen de Jong (irmen) (Python triager) Date: 2015-01-17 01:50
I created the patch about 5 years ago and in the meantime a few things have happened:
- I've not touched C for a very long time now
- I've learned that MSG_WAITALL may be unreliable on certain systems, so any implementation of recvall depending on MSG_WAITALL may inexplicably fail on such systems
- I've been using a python implementation of a custom recv loop in Pyro4 for years
- it is unclear whether a C implementation would provide a measurable performance benefit, because I think most of the time is spent in the network I/O anyway, and the GIL is released when doing a normal recv (I hope?)

In other words, I will never follow up on my original C-based patch from 5 years ago. I do still like the idea of having a reliable recvall in the stdlib instead of having to code a page long one in my own code.
msg239975 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-04-03 12:02
> - I've learned that MSG_WAITALL may be unreliable on certain systems, so any implementation of recvall depending on MSG_WAITALL may inexplicably fail on such systems

Something else has happened in those 5 years: PEP 475 was accepted, which makes Python more reliable when it receives signals.

If recv(MSG_WAITALL) is interrupted by a signal and returns fewer bytes, we must call PyErr_CheckSignals(). If the signal handler raises an exception, drop the data read so far and raise the exception. If the signal handler does not raise an exception, we now *must* retry recv(MSG_WAITALL) (with a shorter length, so as not to read too much data).

The IncompleteRead exception is still needed if the socket is closed before receiving the requested number of bytes.
msg249662 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-09-03 15:47
recvall.patch: implement socket.socket.recvall() in pure Python.

It raises a new socket.IncompleteReadError (copied from asyncio) exception if the connection is closed before we got the expected number of bytes.

The patch has unit tests and documents the new method and the new exception.

TODO: I don't like how the method handles the timeout. The method must fail if it takes longer than socket.gettimeout() seconds, whereas currently the timeout is reset each time we get data from the server.

If the idea of the new socket method is accepted, I will reimplement it in C. In C, it's easier to implement the timeout as I want.

In Python, the socket timeout cannot be changed temporarily, because it would impact other threads which may use the same socket.

I changed how socket.sendall() handles the timeout in Python 3.5: it is now the maximum total duration to send all data. The timeout is no longer reset each time we send a packet. Related discussion:
https://mail.python.org/pipermail/python-dev/2015-April/139001.html

See also issue #23236, which adds to the asyncio read() method a timeout that is reset each time we get data. It will be complementary to the existing "wait(read(), timeout)" timeout method; it's for a different use case.
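
One way to get a total-duration timeout in pure Python without calling settimeout() on a shared socket is to wait with select() against a fixed deadline. The sketch below is an assumption about how that could look, not the approach taken in recvall.patch; `recv_exactly_deadline` is a hypothetical name and the socket is assumed to be in blocking mode.

```python
import select
import socket
import time


def recv_exactly_deadline(sock, size, timeout):
    """Hypothetical helper: `timeout` bounds the whole call, not each recv()."""
    deadline = time.monotonic() + timeout
    buf = bytearray()
    while len(buf) < size:
        time_left = deadline - time.monotonic()
        if time_left <= 0:
            raise socket.timeout(
                "timed out after %d of %d bytes" % (len(buf), size))
        # Wait for readability without touching the socket's own timeout,
        # which other threads sharing the socket may rely on.
        readable, _, _ = select.select([sock], [], [], time_left)
        if not readable:
            continue  # the next iteration re-checks the deadline and raises
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise EOFError("connection closed early")
        buf += chunk
    return bytes(buf)
```

Because the deadline is computed once, data trickling in slowly cannot extend the total wait, which matches the sendall() semantics described above.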
msg250402 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-09-10 18:49
Oh, in fact recvall() is a bad name. The io module uses the "readall()" name to really read all content of a file, whereas "recvall(N)" here only reads N bytes.

It would be better to reuse the same name as asyncio, "readexactly(N)":
https://docs.python.org/dev/library/asyncio-stream.html#asyncio.StreamReader.readexactly

asyncio and http.client already have their IncompleteRead exceptions. Maybe it would be time to add a builtin exception?
msg254428 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-11-10 02:44
I can’t say I’ve often wanted this kind of method for socket objects. I guess this would treat a zero-length message (e.g. UDP datagram) as end-of-stream. Maybe it would be more useful as a general function or method for RawIOBase (maybe also BufferedIOBase) streams.

As for the exception, I have used the existing EOFError in the past for similar purposes. After all, an unexpected EOF error at a low level often means an incomplete or truncated data error at a higher level.
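
A generic helper along those lines might look like the sketch below. The name `read_exactly` is hypothetical, and the helper treats an empty read as end-of-stream, which is exactly the limitation noted above for zero-length datagrams.

```python
import io


def read_exactly(stream, size):
    """Hypothetical helper: read exactly `size` bytes from a binary stream."""
    buf = bytearray()
    while len(buf) < size:
        chunk = stream.read(size - len(buf))
        if chunk is None:
            # A non-blocking raw stream with no data available right now.
            raise BlockingIOError("no data available")
        if not chunk:
            raise EOFError(
                "stream ended after %d of %d bytes" % (len(buf), size))
        buf += chunk
    return bytes(buf)


# Works with any object exposing a binary read(n), e.g. an in-memory stream.
first_four = read_exactly(io.BytesIO(b"abcdef"), 4)
```

Reusing the built-in EOFError keeps the helper free of new exception types, at the cost of not handing the partial data back to the caller.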
History
Date User Action Args
2022-04-11 14:56:09 admin set github: 41448
2016-04-08 22:49:31 pfalcon set nosy: + pfalcon
2015-11-10 02:44:21 martin.panter set nosy: + martin.panter; messages: + msg254428
2015-09-18 17:51:06 giampaolo.rodola set nosy: + giampaolo.rodola
2015-09-10 18:49:19 vstinner set messages: + msg250402; title: Adding the missing socket.recvall() method -> Adding a recvexactly() to socket.socket: receive exactly n bytes
2015-09-03 15:47:43 vstinner set files: + recvall.patch; messages: + msg249662
2015-04-03 12:02:55 vstinner set messages: + msg239975
2015-01-19 16:51:43 exarkun set nosy: - exarkun
2015-01-17 01:50:10 irmen set messages: + msg234157
2015-01-16 11:16:21 vstinner set messages: + msg234126
2015-01-16 08:39:25 pitrou set nosy: + pitrou; messages: + msg234117
2015-01-06 12:01:17 vstinner set nosy: + vstinner
2014-12-31 16:24:42 akuchling set nosy: - akuchling
2014-02-03 19:49:15 BreamoreBoy set nosy: - BreamoreBoy
2010-08-19 18:44:10 BreamoreBoy set nosy: + BreamoreBoy; messages: + msg114395; versions: + Python 3.2, - Python 2.7
2010-04-05 17:58:21 irmen set messages: + msg102391
2010-04-05 16:01:48 exarkun set nosy: + exarkun; messages: + msg102382
2010-04-05 15:09:00 irmen set files: + docpatch.txt
2010-04-05 15:08:25 irmen set files: + libpatch.txt
2010-04-05 15:07:48 irmen set files: + socketmodulepatch.txt
2010-04-05 15:04:37 irmen set files: - patch.txt
2010-04-05 14:45:10 irmen set messages: + msg102374
2010-03-17 00:14:00 irmen set messages: + msg101206
2010-03-16 18:36:01 irmen set messages: + msg101178
2010-02-22 21:45:27 akuchling set nosy: + akuchling; messages: + msg99844
2009-02-14 19:09:04 ajaksu2 set stage: test needed; type: enhancement; versions: + Python 2.7, - Python 2.5
2005-01-16 04:02:31 irmen create