Issue661796
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2003-01-03 19:22 by nnorwitz, last changed 2022-04-10 16:06 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
bz.diff | nnorwitz, 2003-01-05 20:09 | patch 2, this should be correct now | ||
bz2test.diff | nnorwitz, 2003-01-10 23:55 | update test for the fd leak |
Messages (15) | |||
---|---|---|---|
msg42252 - (view) | Author: Neal Norwitz (nnorwitz) * | Date: 2003-01-03 19:22 | |
The attached patch fixes BZ2File() leaking the fd and PyFileObject* info (name, read ahead buffer) when an object is deleted. I'm not sure the patch fixes these problems in the best way. It exposes most of the implementation from fileobject.c::file_dealloc() through a private API _PyFile_Dealloc(). BZ2File derives from PyFileObject. Make sure the file: 'empty' exists and do: from bz2 import * A simple test which demonstrates the problem is: for i in range(100000): o = BZ2File('empty') ; del o Without the patch, you quickly get: IOError: [Errno 24] Too many open files: 'empty' You can modify this to: for i in range(100000): o = BZ2File('empty') ; o.close() ; del o Now you can see the memory leaks. |
|||
msg42253 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2003-01-03 19:48 | |
Logged In: YES user_id=21627 Assigning to the author of the module for review. Gustavo, if you can't find the time for a review, feel free to unassign it. |
|||
msg42254 - (view) | Author: Neal Norwitz (nnorwitz) * | Date: 2003-01-05 20:09 | |
Logged In: YES user_id=33168 I've found a better solution: change the last line of BZ2File_dealloc() from: ((PyObject*)self)->ob_type->tp_free((PyObject *)self); to PyFile_Type.tp_dealloc((PyObject *)self); Updated patch attached. |
|||
msg42255 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2003-01-05 21:11 | |
Logged In: YES user_id=6380 Um, I don't understand why the BZ2File class inherits from FileObject. It doesn't seem to be inheriting any code, and there's no reason to inherit from FileObject just to make a file-like object. Maybe it was a misunderstanding? |
|||
msg42256 - (view) | Author: Gustavo Niemeyer (niemeyer) * | Date: 2003-01-20 20:44 | |
Logged In: YES user_id=7887 Sorry about the delay. I was on a not-so-short vacation. nnorwitz: Thanks for the patch. I'll check the problem (for my own understanding) and most certainly apply it. gvanrossum: BZ2File used to share a lot of code with FileObject. Unfortunately (for the BZ2File side), FileObject changed considerably post 2.2, and some of the code had to be moved to BZ2File itself. It still uses some code from FileObject, though (open, close, seek, part of the softspace handling, access to attributes and some common methods). |
|||
msg42257 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2003-01-21 19:24 | |
Logged In: YES user_id=6380 FileObject sharing: I see. But I still think it's a bad idea, and you're better off using stdio methods directly (as the evolution of the FileObject implementation has already shown). |
|||
msg42258 - (view) | Author: Gustavo Niemeyer (niemeyer) * | Date: 2003-01-22 15:39 | |
Logged In: YES user_id=7887 Ok. Do you want me to work on this for 2.3? |
|||
msg42259 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2003-01-22 17:29 | |
Logged In: YES user_id=6380 If you could, that would be great. |
|||
msg42260 - (view) | Author: Gustavo Niemeyer (niemeyer) * | Date: 2003-02-02 04:21 | |
Logged In: YES user_id=7887 Ok, I've started to "unparent" the bz2 code. OTOH, I've just realised what a lot of code, full of #ifdef's, I'll have to copy, mostly unchanged. The following functions file_new(), file_init(), open_the_file(), fill_file_fields(), dircheck(), close_file(), get_newlines(), _portable_fseek(), file_seek(), file_repr(), and the following arrays file_memberlist[], file_getsetlist[] My question is, do you think it's better to maintain all this code duplicated than maintaining bz2 up to date? I'll do the job, if you still think this is great. |
|||
msg42261 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2003-02-02 12:57 | |
Logged In: YES user_id=6380 I don't understand. Why would you want to copy all the implementation details of those functions? Have a look at how the zlib module emulates a file. |
|||
msg42262 - (view) | Author: Gustavo Niemeyer (niemeyer) * | Date: 2003-02-03 12:07 | |
Logged In: YES user_id=7887 zlib module doesn't emulate a file at C level. It also does not implement as many features as the BZ2File does (universal newlines, buffered IO, etc). The strange thing is, I thought it was a good thing to develop inheriting FileObject that way, since it would save code and maintenance, while being fast. I also thought it was a good model to make BZ2File behave as a subclass (I even mentioned that in the documentation). I'm surprised for being wrong. |
|||
msg42263 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2003-02-03 14:51 | |
Logged In: YES user_id=6380 IMO, being a subclass of a concrete class typically only makes sense if you can maintain the meaning of the instance variables defined by the base class. In this case I don't believe that's the case. If looks like you are using subclassing from the file class where you should really be using an instance variable that points to a regular file object. For example, operations on the file descriptor returned by fileno() have a very different effect for a BZ2file than for a regular file (since they manipulate the compressed file rather than the uncompressed byte stream represented by the BZ2file). You may also confuse other parts of the Python runtime that assume that when they see a file instance they can extract the underlying stdio FILE* from it and manipulate that. |
|||
msg42264 - (view) | Author: Gustavo Niemeyer (niemeyer) * | Date: 2003-02-03 15:52 | |
Logged In: YES user_id=7887 Ok.. I can't find a good way to workaround the presented reasons. I'll work to unparent BZ2File. |
|||
msg42265 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2003-02-03 16:09 | |
Logged In: YES user_id=6380 Thanks! I expect you'll be surprised at the gained clarity of the resulting code. |
|||
msg42266 - (view) | Author: Gustavo Niemeyer (niemeyer) * | Date: 2003-02-11 18:55 | |
Logged In: YES user_id=7887 Fixed in the following revisions: Modules/bz2module.c: 1.15 Doc/lib/libbz2.tex: 1.6 Thanks to everybody. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:06:06 | admin | set | github: 37709 |
2003-01-03 19:22:48 | nnorwitz | create |