This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Discovered typo in zlib test.
Type: Stage:
Components: None Versions:
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: Nosy List: gvanrossum, mwh, scott_daniels, tim.peters
Priority: high Keywords:

Created on 2002-11-18 18:36 by scott_daniels, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Messages (9)
msg13357 - (view) Author: Scott David Daniels (scott_daniels) * Date: 2002-11-18 18:36
In test_zlib.py (while chasing what appears to be a
documentation problem), I found a flaw in the code
that tests max_length limitted output from a 
decompression object test.

Starting at line 100, the existing code is:
...
bufs.append(deco.flush())
decomp2 = ''.join(buf)
if decomp2 != buf:
    print "max_length decompressobj failed"
else:
    print "max_length decompressobj succeeded"
...

This test will always succeed (''.join(str) == str).

What seems obviously meant is:
...
bufs.append(deco.flush())
decomp2 = ''.join(bufs)
if decomp2 != buf:
    print "max_length decompressobj failed"
else:
    print "max_length decompressobj succeeded"
...

msg13358 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2002-11-18 20:56
Logged In: YES 
user_id=31435

Good eye!  Unfortunately, when I repair that, the test fails 
here -- decomp2 is a proper prefix of buf then.  It's "too 
short".  Raised priority but left unassigned (someone who 
understands zlib better will have to jump in).
msg13359 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2002-11-19 09:50
Logged In: YES 
user_id=6656

Assuming Tim meant what he said, not what he did: raising
priority.
msg13360 - (view) Author: Scott David Daniels (scott_daniels) * Date: 2002-11-19 21:40
Logged In: YES 
user_id=493818

OK, The reason the code fails after the fix is as follows:
deco.flush() does not return any output.
The loop control says, "while we have more unexamined 
source."

However, the decompressor can consume all of the input 
before it has provided all of its output.  So, there are two 
possible fixes:

1) Minimal change:  Change the loop control to say, in 
effect, "While we have more input or are producing output.":
Around line 92:
Change:
    bufs = []
    while cb:
        max_length = 1 + len(cb)/10
To:
    bufs = []
    while cb or chunk:
        max_length = 1 + len(cb)/10


2) More reasonable(?) change:  After the insertion loop, (just 
before the flush) extract all remaining output:
Around line 99:
        cb = deco.unconsumed_tail
    bufs.append(deco.flush())
    decomp2 = ''.join(bufs)
Becomes:
        cb = deco.unconsumed_tail
    while bufs[-1]:
        bufs.append(deco.decompress('', max_length))
    bufs.append(deco.flush())
    decomp2 = ''.join(bufs)


---
Note, in any case, the
    bufs.append(deco.flush())
originally on line 100 _always_ appends a zero-length string.
I would suggest changing it to simply:
    deco.flush()

    
msg13361 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2002-11-19 22:08
Logged In: YES 
user_id=31435

I'd say it's a bug in flush(), or in the docs, then.  They say "all 
pending input is processed, and a string containing the 
remaining uncompressed output is returned", and that's not 
happening here.  That's pretty clearly what the author of this 
test *expected* to happen, too (the original bug you 
discovered looked much more like a typo than a failure to 
understand the module interface).
msg13362 - (view) Author: Scott David Daniels (scott_daniels) * Date: 2002-12-12 18:32
Logged In: YES 
user_id=493818

The following replacement for PyZlib_unflush addresses some 
of the problem, but leaves unaddressed the issues of what 
unused_data and unconsumed_tail should be at the end of the 
routine.

static PyObject *
PyZlib_unflush(compobject *self, PyObject *args)
{
    int err, length = DEFAULTALLOC;
    PyObject * retval = NULL;
    unsigned long start_total_out;

    if (!PyArg_ParseTuple(args, "|i:flush", &length))
	return NULL;
    if (!(retval = PyString_FromStringAndSize(NULL, length)))
	return NULL;


    ENTER_ZLIB

    start_total_out = self->zst.total_out;
    self->zst.avail_out = length;
    self->zst.next_out = (Byte *)PyString_AS_STRING(retval);

    Py_BEGIN_ALLOW_THREADS
    err = inflate(&(self->zst), Z_FINISH);
    Py_END_ALLOW_THREADS

    /* while Z_OK and the output buffer is full, there might
be more output,
       so extend the output buffer and try again */
    while ((err == Z_OK || err == Z_BUF_ERROR) &&
self->zst.avail_out == 0) {
	if (_PyString_Resize(&retval, length << 1) < 0)
	    goto error;
	self->zst.next_out = (Byte *)PyString_AS_STRING(retval) +
length;
	self->zst.avail_out = length;
	length = length << 1;

	Py_BEGIN_ALLOW_THREADS
	err = inflate(&(self->zst), Z_FINISH);
	Py_END_ALLOW_THREADS
    }

    /* Maybe _always_ call inflateEnd if clearing
is_initialized */
    if (err == Z_STREAM_END) {
	err = inflateEnd(&(self->zst));
	if (err != Z_OK) {
	    zlib_error(self->zst, err, "from inflateEnd()");
	    Py_DECREF(retval);
	    retval = NULL;
	    goto error;
	}
    }
    self->is_initialised = 0;
    _PyString_Resize(&retval, self->zst.total_out -
start_total_out);
    /* ??? Here fix unused_data and unconsumed_tail */

error:

    LEAVE_ZLIB

    return retval;
}

msg13363 - (view) Author: Scott David Daniels (scott_daniels) * Date: 2003-02-01 03:33
Logged In: YES 
user_id=493818

I've attached a PyUnit-style replacement for test_zlib.py that 
tests this condition to Patch #678531,  which provides a
context 
diff updating zlibmodule.c to fix this problem.  Hope this
is enough 
to get the fix reviewed.
msg13364 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2003-02-03 03:22
Logged In: YES 
user_id=31435

It looks like the problem with getting this reviewed is that 
no current developer understands the zlib code or its test.  
I bumped the priority in a sneaky attempt to attract more 
interest <wink>.  *If* nobody else bites soon, and I can 
make time, I'll assign it to myself.

Your efforts are appreciated!  Even if it doesn't seem like it.
msg13365 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-02-03 20:52
Logged In: YES 
user_id=6380

Fixed in CVS.
History
Date User Action Args
2022-04-10 16:05:54adminsetgithub: 37496
2002-11-18 18:36:27scott_danielscreate