Issue476326
Created on 2001-10-30 11:25 by pboddie, last changed 2022-04-10 16:04 by admin. This issue is now closed.
Messages (11) | |||
---|---|---|---|
msg7252 - (view) | Author: Paul Boddie (pboddie) | Date: 2001-10-30 11:25 | |
When a Unicode string is passed as the module name to imp.find_module, the function fails to import the named module even when it exists in the specified path, returning the error message "No module named ..." as a result. The problem in Python 2.0 can be traced to line 922 of Python/import.c which ensures that any strings involved in the find_module function must be standard Python strings and not Unicode strings, since it tests the type of path components against &PyString_Type explicitly. Interestingly, the __import__ built-in function seems to work with Unicode strings. Either way, it would be great if this could be documented or even fixed, but I don't know what the policy is on Unicode module names (even when they only contain ASCII-compatible characters). |
|||
msg7253 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2001-12-01 23:01 | |
Logged In: YES user_id=38388 I guess Python should not accept non-ASCII module names, so converting Unicode to ASCII should be appropriate. Would it suffice to test this only in find_module(), or do you think I need to dig deeper into the import mechanism? |
|||
msg7254 - (view) | Author: Paul Boddie (pboddie) | Date: 2001-12-03 10:59 | |
Logged In: YES user_id=226443 For my purposes, I just wrapped the module name in a 'str' function call. I had Unicode strings because I was using text from an XML document and then attempting to use such text with the import mechanism. One issue is whether Python would ever support importing from files which have non-ASCII filenames. I can imagine that certain operating systems support Unicode filenames, for example, but then the Python language probably doesn't support such filenames as the basis for module names when used with the 'import' statement and other related statements. So, there's a wider issue of text encodings in (C)Python scripts as part of the "comprehensive" solution to this problem; the easy solution is just to enforce ASCII-only module names. |
|||
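The "easy solution" discussed above (wrapping the name in `str` and enforcing ASCII-only module names) can be sketched as follows. This is only an illustration written for a current Python 3 interpreter, not code from the thread or from CPython; the helper name `ascii_module_name` is hypothetical.

```python
def ascii_module_name(name):
    """Return `name` as a plain ASCII str, or raise ValueError.

    Hypothetical helper: it mirrors the "wrap the name in str()" workaround,
    but fails loudly instead of passing a non-ASCII name to the importer.
    """
    try:
        if isinstance(name, bytes):
            # Byte strings are decoded strictly as ASCII.
            return name.decode("ascii")
        # For text, encoding is used purely as a validation step.
        name.encode("ascii")
        return str(name)
    except UnicodeError:
        raise ValueError("module name is not ASCII: %r" % (name,))


print(ascii_module_name("string"))
```

A caller would run every name taken from external text (such as the XML document mentioned above) through this helper before handing it to the import machinery.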
msg7255 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2002-01-05 08:04 | |
Logged In: YES user_id=21627 I cannot reproduce the problem in Python 2.1:

    >>> import imp
    >>> imp.find_module(u"string")
    (<open file '/usr/local/lib/python2.2/string.py', mode 'r' at 0x816e070>, '/usr/local/lib/python2.2/string.py', ('.py', 'r', 1))

I don't think __import__ should accept non-ASCII names. It may be reasonable to further restrict import to verify that the argument is a NAME, in the sense of the Python lexis, but doing so is not important. I cannot see any further problem in this report, so I suggest closing it as fixed. The test in line 922 only checks the path, not the module name. |
|||
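The stricter check suggested above, verifying that the argument is a NAME, could look roughly like this. It is a sketch under assumptions: the function name `is_importable_name` is hypothetical, the regular expression encodes the ASCII-only identifier rule that applied to Python 2, and dotted names are treated as a sequence of NAMEs. Nothing like this is claimed to exist in import.c.

```python
import keyword
import re

# ASCII-only identifier pattern, matching the Python 2 NAME rule
# (Python 3 also permits non-ASCII identifiers, which this rejects).
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def is_importable_name(dotted):
    """Return True if every dot-separated part is an ASCII NAME and not a keyword."""
    parts = dotted.split(".")
    return all(_NAME_RE.match(p) and not keyword.iskeyword(p) for p in parts)


print(is_importable_name("os.path"))
```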
msg7256 - (view) | Author: Paul Boddie (pboddie) | Date: 2002-01-07 10:43 | |
Logged In: YES user_id=226443 It must have been fixed between Python 2.0 and Python 2.1, then, but I can't find any obvious indication of this in Python/import.c. The platform probably shouldn't matter in this case, but I was using Red Hat Linux 6.1 on Intel. |
|||
msg7257 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2002-01-07 10:55 | |
Logged In: YES user_id=38388 The find_module() code doesn't seem to have changed between the releases, so it should work in Python 2.0 as well. The only parts I see in the source code which require strings are the sys.path handling APIs. The optional second argument to find_module() will also only accept strings. Perhaps that's where your problem originated?

    Python 2.0 (#1, Jan 19 2001, 17:54:27)
    [GCC 2.95.2 19991024 (release)] on linux2
    Type "copyright", "credits" or "license" for more information.
    >>> import imp
    >>> imp.find_module(u'platform')
    (<open file '/home/lemburg/bin/platform.py', mode 'r' at 0x8191a78>, '/home/lemburg/bin/platform.py', ('.py', 'r', 1))

Can you give an example which demonstrates the problem? |
|||
msg7258 - (view) | Author: Paul Boddie (pboddie) | Date: 2002-01-07 13:09 | |
Logged In: YES user_id=226443 My apologies: I should have been clearer in my description. Here's a test case for Python 2.1 on Windows which demonstrates the problem:

    import sys, imp

    ascii_dir = "D:\\Private\\Vaults"
    unicode_dir = u"D:\\Private\\Vaults"

    # First test: Unicode sys.path value.
    sys.path.append(unicode_dir)
    imp.find_module(u"VaultsSearch")  # fails
    imp.find_module("VaultsSearch")   # fails
    sys.path.remove(unicode_dir)

    # Second test: ASCII sys.path value.
    sys.path.append(ascii_dir)
    imp.find_module(u"VaultsSearch")  # succeeds
    imp.find_module("VaultsSearch")   # succeeds
    sys.path.remove(ascii_dir)
|
|||
msg7259 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2002-09-04 19:03 | |
Logged In: YES user_id=89016 import.c 2.207 should have fixed this problem, so I hope we can close this bug now. |
|||
msg7260 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2002-09-05 07:26 | |
Logged In: YES user_id=38388 No time to check; can you do this, Walter ? Thanks. |
|||
msg7261 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2002-09-05 15:45 | |
Logged In: YES user_id=89016 It seems to work under Linux; can anyone check it under Windows with non-ASCII directory names?

    > echo >/tmp/foo.py "print 'foo'"
    > ./python
    Python 2.3a0 (#12, Sep 4 2002, 22:04:22)
    [GCC 2.96 20000731 (Red Hat Linux 7.3 2.96-110)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys, imp
    >>> sys.path.append(u"/tmp")
    >>> imp.find_module("foo")
    (<open file '/tmp/foo.py', mode 'U' at 0x40078448>, '/tmp/foo.py', ('.py', 'U', 1))
|
|||
msg7262 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2002-09-05 17:22 | |
Logged In: YES user_id=89016 It works on Windows 2000 too:

    Python 2.3a0 (#29, Sep 5 2002, 18:43:40) [MSC 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys, os, imp
    [8623 refs]
    >>> os.mkdir(u"c:\\n\xfcx")
    [10435 refs]
    >>> sys.path.append(u"c:\\n\xfcx")
    [10436 refs]
    >>> open(u"c:\\n\xfcx\\hurz.py", "wb").write('print "hurz"')
    [10567 refs]
    >>> imp.find_module(u"hurz")
    (<open file 'c:\n³x\hurz.py', mode 'U' at 0x007DC488>, 'c:\\n\xfcx\\hurz.py', ('.py', 'U', 1))
    [10580 refs]
    >>>

The repr() of the file seems a little strange, but looking in the Explorer I see the correct directory name. Using a Unicode character outside the Latin-1 range instead of \xfc fails in the os.mkdir() call. This is a problem of the "mbcs" encoding, which returns ?, and ? is illegal in directory names.

On Linux it works with non-ASCII directory names too, if the appropriate locale.setlocale is called:

    >>> import os, sys, imp, locale
    >>> locale.setlocale(locale.LC_ALL, 'de_DE')
    >>> os.mkdir(u"/tmp/gürk")
    >>> open(u"/tmp/gürk/hurz.py", "wb").write("print 'hurz'")
    >>> sys.path.append(u"/tmp/gürk")
    >>> imp.find_module(u"hurz")
    (<open file '/tmp/gürk/hurz.py', mode 'U' at 0x400ce9f8>, '/tmp/g\xfcrk/hurz.py', ('.py', 'U', 1))
|
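For readers who want to repeat the non-ASCII directory check on a current interpreter, here is a rough equivalent of the sessions above. It is a sketch with several assumptions: it targets Python 3, where the `imp` module used in this thread has been removed in recent releases, so `importlib.util.find_spec` stands in for `imp.find_module`; it assumes the filesystem encoding can represent "ü"; and the function name `find_in_non_ascii_dir` is hypothetical.

```python
import importlib.util
import os
import sys
import tempfile


def find_in_non_ascii_dir(dirname="g\u00fcrk", modname="hurz"):
    """Create <tmp>/<dirname>/<modname>.py and return where the importer finds it."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, dirname)
        os.mkdir(target)
        with open(os.path.join(target, modname + ".py"), "w") as fh:
            fh.write("print('hurz')\n")
        sys.path.append(target)
        try:
            # Make sure the path finders notice the freshly created file.
            importlib.invalidate_caches()
            # find_spec locates the module without executing it.
            spec = importlib.util.find_spec(modname)
            return None if spec is None else spec.origin
        finally:
            sys.path.remove(target)
            importlib.invalidate_caches()


print(find_in_non_ascii_dir())
```

The function returns the located file path, or None if the module was not found, and removes the temporary directory and the sys.path entry afterwards.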
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:04:35 | admin | set | github: 35426 |
2001-10-30 11:25:35 | pboddie | create |