classification
Title: small seek tweak upon reads (gzip)
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.2
process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To: nascheme
Nosy List: icode, nascheme
Priority: normal
Keywords: patch

Created on 2002-03-22 08:04 by icode, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Files
File name      Uploaded                 Description
gzip.py.patch  icode, 2002-03-22 08:04  small gzip.py patch
Messages (5)
msg39323 - (view) Author: Todd Warner (icode) Date: 2002-03-22 08:04
On an actual read of a gzipped file, there is a check
to see whether you are already at the end of the file. This
is done by saving your position, seeking to the end,
and comparing that tell() with the saved position. It is
more efficient to simply increment position + 1.

The efficiency gain is nearly insignificant, but this
patch will greatly decrease the size of my next one. :)

NOTE: all versions of gzip.py do this.
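
For readers following along, the end-of-file check described above can be sketched roughly as follows. This is a minimal illustration, not the code from gzip.py or from the attached patch; the function name `at_eof_by_seeking` is hypothetical. It assumes the file object supports both `seek()` and `tell()`.

```python
import io


def at_eof_by_seeking(fileobj):
    """Return True if fileobj is positioned at its end.

    Hypothetical helper: requires a seekable file object.
    """
    pos = fileobj.tell()          # save the current position
    fileobj.seek(0, io.SEEK_END)  # jump to the end of the file
    end = fileobj.tell()          # offset of the end
    fileobj.seek(pos)             # restore the original position
    return pos == end


buf = io.BytesIO(b"abc")
print(at_eof_by_seeking(buf))  # False: three bytes remain
buf.read()
print(at_eof_by_seeking(buf))  # True: nothing left to read
```

Note that this approach only works on seekable objects, so it cannot be used as-is on a pipe or socket.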
msg39324 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2002-03-24 22:35
This looks like a pointless change to me.  It's probably
less efficient with the patch because there is an extra
Python int add.  Why don't you just submit the real patch? :)

Rejected.
msg39325 - (view) Author: Todd Warner (icode) Date: 2002-03-25 01:21
It is more efficient for the majority of gzipped files
(if very small files are not in the majority).

The "real" patch will be (once I give it a bit more
polish/tuning; it will be used in production code soon) a
class called GzipStream, i.e. it will allow high-level access
to any arbitrary file-like "stream" (e.g. a gzipped
socket stream), which is generally not "seekable". I do
this by inheriting from GzipFile and extending it,
but I rewrite the _read method with a one-line change.

Anyway, that is my logic. Question to you: should this be
included within gzip.py or in its own module (e.g.
gzipstream)?
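
To illustrate the general idea of reading gzip data from a stream that cannot seek, here is a rough sketch. It is not the GzipStream class mentioned above (which is not shown in this thread) and does not subclass GzipFile; it swaps in `zlib.decompressobj` from the standard library instead. The names `iter_gunzip` and `ReadOnly` are hypothetical, and the sketch assumes a single gzip member.

```python
import gzip
import io
import zlib


def iter_gunzip(stream, chunk_size=8192):
    """Yield decompressed chunks from an object that only offers read()."""
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read signals end of stream
            break
        data = decomp.decompress(chunk)
        if data:
            yield data
    tail = decomp.flush()  # emit anything still buffered
    if tail:
        yield tail


class ReadOnly:
    """Hypothetical stand-in for a socket-like stream: no seek() or tell()."""

    def __init__(self, payload):
        self._buf = io.BytesIO(payload)

    def read(self, n):
        return self._buf.read(n)


payload = gzip.compress(b"hello gzip stream" * 100)
result = b"".join(iter_gunzip(ReadOnly(payload), chunk_size=64))
```

Because the end of input is detected from an empty `read()`, no seek-to-end check is needed here.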
msg39327 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2002-03-25 03:33
Why would it be more efficient?  Assuming the OS is not
implemented by a silly person, a seek just updates
an offset in the in-memory file descriptor structure.

Regarding your GzipStream, it sounds like making it part
of gzip.py would be okay.
History
Date                 User   Action  Args
2022-04-10 16:05:08  admin  set     github: 36310
2002-03-22 08:04:39  icode  create