classification
Title: Fix bug in encodings.search_function
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.3
process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To:
Nosy List: geertj, loewis
Priority: normal
Keywords: patch

Created on 2002-06-20 11:39 by geertj, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Files
File name Uploaded Description
encodings-2.2.1.diff geertj, 2002-06-20 11:39 Patch for 2.2.1
encodings-HEAD.diff geertj, 2002-06-20 11:40 Patch for HEAD
Messages (6)
msg40357 - Author: Geert Jansen (geertj) * Date: 2002-06-20 11:39
Hi,

there seems to be a bug in the default encoding search
function (search_function in encodings/__init__.py). The
function tries to load a module with the name of the
encoding, but it doesn't require that this module is in the
encodings/ directory. This leads to trouble when you try
to use an encoding that has the same name as a module in the
search path.

To demonstrate, save the following line to test.py:

print 'Just testing'.encode('test')

and run it. This results in a CodecRegistryError 
exception: "module "test" (test.pyc) failed to register"

The bug is present in 2.2.1 and in HEAD. In HEAD there 
was actually a bugfix for this but it was incomplete.

Patches for 2.2.1 and HEAD attached.

Greetings,
Geert Jansen
msg40358 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2002-07-28 11:33
Thanks for the patch; applied as __init__.py 1.9 and 1.6.12.1.
msg40359 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2002-07-29 13:31
It's actually not a bug to pass a module outside of
encodings/; the standard search function is supposed to find
other modules as well. So I have to revert this change.
msg40360 - Author: Geert Jansen (geertj) * Date: 2002-07-29 17:45
Hi Martin,

Isn't it wrong to let the module namespace "leak" into the 
encodings namespace? This leads to very unexpected 
behaviour. Why should it be forbidden to have a module with 
the same name as an encoding? This seems rather arbitrary 
and solely an implementation detail.

It is still very easy to add an encoding outside the encodings/ 
directory using the codecs.register() function. Or maybe there 
is another solution?
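
For illustration, here is a minimal sketch of the codecs.register() route in present-day Python 3 (the thread itself concerns Python 2.2/2.3). The encoding name "shout" and the functions find_custom_codec, shout_encode and shout_decode are hypothetical, invented for this example; only codecs.register() and codecs.CodecInfo are real library calls.

```python
import codecs


def shout_encode(text, errors="strict"):
    # Hypothetical codec: upper-case the text, then encode it as ASCII.
    return text.upper().encode("ascii", errors), len(text)


def shout_decode(data, errors="strict"):
    # Decode ASCII bytes back to text (the upper-casing is not reversed).
    return bytes(data).decode("ascii", errors), len(data)


def find_custom_codec(name):
    # A search function returns a CodecInfo for names it knows, else None.
    if name == "shout":
        return codecs.CodecInfo(
            name="shout", encode=shout_encode, decode=shout_decode
        )
    return None


# Register the search function; no module named "shout" has to exist.
codecs.register(find_custom_codec)

encoded = "just testing".encode("shout")
decoded = encoded.decode("shout")
```

Registered this way, the codec is found through the codec registry alone, so its name is independent of any module on the import path.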
msg40361 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2002-07-30 07:47
Not sure what you mean by "leak". It is certainly desirable
that modules carry the same name as encodings; in fact,
*every* encoding implemented so far has a module with the
same name.

People have been using u"text".encode("japanese.sjis"),
given that the JapaneseCodecs package installs itself into a
Python package "japanese". That must continue to work. In
particular, your patch broke test.test_charmapcodec; make
sure you test your patches before submitting them.

To solve the problem of .encode("test") giving a registry
error, I have now changed the search_function to ignore
modules that don't have a getregentry function.
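
The behaviour described above can be sketched roughly as follows. This is not the actual code in encodings/__init__.py; lookup_codec_module is a hypothetical helper written in present-day Python 3, and it only illustrates the idea of skipping modules that lack a getregentry function.

```python
import importlib


def lookup_codec_module(module_name):
    """Return the codec entry for module_name, or None if it is not a codec."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # No module of that name: nothing to register.
        return None
    getregentry = getattr(module, "getregentry", None)
    if getregentry is None:
        # An unrelated module that merely shares the name: ignore it
        # instead of raising a registry error.
        return None
    return getregentry()


# "string" is an ordinary stdlib module without getregentry.
not_a_codec = lookup_codec_module("string")
# "encodings.utf_8" is a real codec module and provides getregentry.
utf8_entry = lookup_codec_module("encodings.utf_8")
```

With a check like this, an unrelated module such as test.py is skipped and the lookup can fall through to an ordinary "unknown encoding" error.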

msg40362 - Author: Geert Jansen (geertj) * Date: 2002-07-30 08:16
I meant by "leak" that the module namespace and the 
encoding namespace are different namespaces and should 
therefore be isolated from each other. Symbols from one
namespace should not turn up in the other. This is all IMHO 
of course.

But thanks for fixing this problem. Next time I send in a patch 
I'll make sure I run the test suite too... Sorry for that.
History
Date User Action Args
2022-04-10 16:05:26 admin set github: 36775
2002-06-20 11:39:54 geertj create