classification
Title: Iterator on Fileobject gives no MemoryError
Type: Stage:
Components: Library (Lib) Versions: Python 2.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: catlee, georg.brandl, zypher
Priority: high Keywords:

Created on 2005-04-06 17:55 by zypher, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (3)
msg24919 - (view) Author: Folke Lemaitre (zypher) Date: 2005-04-06 17:55
The following problem has only been tested on Linux.
Suppose that at a certain time your machine can
allocate a maximum of X megabytes of memory. Allocating
more than X should result in a Python MemoryError. Also
suppose you have a file containing one big line taking
more than X megabytes (a large binary file, for example).
In this case, if you read lines from the file through the
file object's iterator, you do NOT get the expected
MemoryError. Instead, the iteration ends silently
without yielding any lines.

To reproduce, create a file twice as big as your
machine's memory and disable swap.

If you run the following code:
#    import os.path
#
#    def test(input):
#        print "Testing %s (%sMB)" % (repr(input),
#            os.path.getsize(input) / (1024.0 * 1024.0))
#        count = 0
#        for line in open(input):
#            count = count + 1
#        print "  >> Total Number of Lines: %s" % count
#
#    if __name__ == "__main__":
#        test('test.small')
#        test('test.big')

you'll get something like:
# folke@wladimir devel $ python2.4 bug.py
# Testing 'test.small' (20.0MB)
#   >> Total Number of Lines: 1
# Testing 'test.big' (2000.0MB)
#   >> Total Number of Lines: 0
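
A possible workaround sketch, not part of the original report: counting lines with fixed-size binary reads avoids allocating a buffer for one oversized line. It is written for Python 3, and the names count_lines_chunked and chunk_size are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def count_lines_chunked(path, chunk_size=1 << 20):
    """Count lines using fixed-size reads, never holding a whole line in memory."""
    count = 0
    last_byte = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            count += chunk.count(b"\n")
            last_byte = chunk[-1:]
    # A final line without a trailing newline still counts as a line,
    # which matches what iterating over the file object would yield.
    if last_byte and last_byte != b"\n":
        count += 1
    return count


if __name__ == "__main__":
    # Small stand-in for 'test.big': one long line with no newline at all.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * 100000)
    try:
        print(count_lines_chunked(tmp.name, chunk_size=4096))
    finally:
        os.remove(tmp.name)
```

Memory use is bounded by chunk_size rather than by the length of the longest line, so a failure to allocate a huge line buffer cannot occur in this loop.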

msg24920 - (view) Author: Chris AtLee (catlee) * Date: 2006-03-31 19:53
This can be fixed by having the readahead function (used by
readahead_get_line_skip, which is used by file_iternext) raise a
MemoryError if it can't allocate enough room for the line.

Index: Objects/fileobject.c
===================================================================
--- Objects/fileobject.c        (revision 43486)
+++ Objects/fileobject.c        (working copy)
@@ -1797,7 +1797,8 @@

 /* Make sure that file has a readahead buffer with at least one byte
    (unless at EOF) and no more than bufsize.  Returns negative value on
-   error */
+   error.  Will raise a MemoryError if bufsize bytes cannot be
+   allocated. */
 static int
 readahead(PyFileObject *f, int bufsize)
 {
@@ -1810,6 +1811,7 @@
                        drop_readahead(f);
        }
        if ((f->f_buf = PyMem_Malloc(bufsize)) == NULL) {
+                PyErr_NoMemory();
                return -1;
        }
        Py_BEGIN_ALLOW_THREADS
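
The effect of the one-line change can be modeled in Python. This is a toy sketch, not CPython's implementation: ReadaheadModel, alloc_limit and report_alloc_failure are hypothetical names, and the model assumes that a failure signaled only through a return value, with no exception set, is treated by the iteration protocol as the normal end of iteration.

```python
class ReadaheadModel:
    """Toy line iterator whose buffer 'allocation' can fail (illustrative only)."""

    def __init__(self, lines, alloc_limit, report_alloc_failure):
        self._lines = list(lines)
        self._alloc_limit = alloc_limit        # largest "allocation" that succeeds
        self._report = report_alloc_failure    # True models the patched behavior

    def __iter__(self):
        return self

    def _readahead(self, size):
        # Stand-in for the branch taken when PyMem_Malloc returns NULL.
        if size > self._alloc_limit:
            if self._report:
                # Patched: the failure is recorded as an exception.
                raise MemoryError("cannot allocate %d bytes" % size)
            # Unpatched: only a failure status is returned.
            return False
        return True

    def __next__(self):
        if not self._lines:
            raise StopIteration
        if not self._readahead(len(self._lines[0])):
            # With no exception recorded, the caller cannot tell this
            # failure apart from reaching the end of the file.
            raise StopIteration
        return self._lines.pop(0)


def count_lines(iterator):
    """Count items the same way the report's loop does."""
    count = 0
    for _ in iterator:
        count = count + 1
    return count
```

With report_alloc_failure=False an oversized line yields a count of 0, mirroring the 'test.big' output in the report. With report_alloc_failure=True the same input raises MemoryError instead.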

msg24921 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-03-31 20:31
Applied your patch in revs 43506, 43507.

History
Date                 User    Action  Args
2022-04-11 14:56:10  admin   set     github: 41815
2005-04-06 17:55:44  zypher  create