Issue1231336
Created on 2005-07-02 01:55 by nyamatongwe, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files | ||
---|---|---|
File name | Uploaded | Description |
diffs.txt | nyamatongwe, 2005-07-02 01:55 | Differences from current CVS containing code and doc |
diffu.txt | nyamatongwe, 2005-07-05 04:25 | |
diffu.txt | nyamatongwe, 2005-07-05 04:27 | Diff that also handles -c, -O, and -E. |
Messages (13) | |||
---|---|---|---|
msg48550 - (view) | Author: Neil Hodgson (nyamatongwe) | Date: 2005-07-02 01:55 | |
Most installations of Windows (2000, XP) are Unicode-native, with the narrow character APIs providing only a distorted view of the system. Python does not currently provide access to some basic features through wide character calls and so may see distorted values. This patch adds Unicode compatibility for sys.argv, os.environ, and os.system. os.system accepts a Unicode argument in the same way as described in PEP 277 for file APIs. For sys.argv and os.environ, new parallel Unicode attributes sys.argvu and os.environu are added, as it would cause too many problems to use Unicode values for the existing attributes or to use Unicode only for non-ASCII values. The features are only enabled on Unicode-native versions of Windows. A screenshot at http://www.scintilla.org/pyunicode.png demonstrates the three features. The patch contains some documentation additions for sys.argvu and os.environu. There are no test cases, as test cases involving running extra processes can be messy and fail for uninteresting reasons. |
|||
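The "distorted view" mentioned above can be illustrated with a minimal sketch. It assumes cp1252 as a stand-in for a Windows ANSI code page and uses a made-up Greek file name; real Windows conversions may substitute characters differently, so this shows only the general kind of loss, not the exact bytes a narrow API would return.

```python
# Sketch only: cp1252 stands in for a Windows ANSI code page (an assumption).
# The name below is a hypothetical Greek file name, written with escapes.
wide_name = "\u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac.txt"

# A narrow API can only hand back characters the code page contains.
# Anything else is replaced, so the original name cannot be recovered.
narrow_name = wide_name.encode("cp1252", errors="replace")
round_trip = narrow_name.decode("cp1252")

print(narrow_name)
print(round_trip == wide_name)
```

Once the name has passed through the narrow form, a program can no longer open the file by that name, which is the problem the wide-character calls avoid.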
msg48551 - (view) | Author: Neil Hodgson (nyamatongwe) | Date: 2005-07-05 04:25 | |
There are problems in sys.argvu because the current argument processing code removes some option arguments that are processed by Python itself. This can almost be fixed by storing the last argc elements into sys.argvu. However, when using [-c command], the command is removed from sys.argv, as this allows the Python code to determine whether it is running a command line command ("-c") or a named file. The attached patch fixes these problems. |
|||
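The alignment problem described above can be sketched in Python. This is an illustration under stated assumptions, not the code in the attached diff: `align_wide_argv` is a hypothetical helper, `wide_args` stands for the full command line as a wide-character API would report it, and `narrow_argv` stands for sys.argv after the interpreter has consumed its own options.

```python
def align_wide_argv(wide_args, narrow_argv):
    """Return a wide-character list parallel to the processed narrow argv.

    Hypothetical helper, for illustration only.
    """
    argc = len(narrow_argv)
    if argc == 0:
        return []
    # Options consumed by the interpreter sit at the front, so the
    # last argc elements line up with the processed list.
    argvu = list(wide_args[-argc:])
    # With -c the command text is dropped and "-c" takes slot 0, so the
    # slice would otherwise start with the command itself.
    if narrow_argv[0] == "-c":
        argvu[0] = "-c"
    return argvu


print(align_wide_argv(["python", "-O", "script.py", "x"], ["script.py", "x"]))
print(align_wide_argv(["python", "-c", "print(1)", "a", "b"], ["-c", "a", "b"]))
```

The special case for "-c" is the part that taking the last argc elements alone does not cover.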
msg48552 - (view) | Author: Neil Hodgson (nyamatongwe) | Date: 2005-07-05 04:27 | |
Added a description to diff file. |
|||
msg48553 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2005-07-11 16:42 | |
For os.environ, I think I would prefer a solution where Unicode keys result in Unicode values and string keys result in string values, with the canonical conversion through "mbcs" in place. For argv, I agree something should be done, but I'm not certain that the introduction of argvu is the best thing to do; this should be discussed on python-dev, and with all people originally involved in PEP 277. The change to system() is not mentioned at all in your message. It doesn't seem to belong in this patch, either, so please submit it as a separate patch. If system() is changed to support Unicode commands, I think spawn*() should be changed as well. These seem less debatable, as they come as natural extensions to PEP 277 (i.e. pass Unicode through to the system). |
|||
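The key-type idea proposed above can be sketched as follows. The sketch is written for current Python, so the thread's "string" corresponds to `bytes` and "Unicode" to `str`. `KeyTypedEnviron` is a hypothetical class, and the `encoding` parameter defaults to cp1252 as a stand-in for "mbcs", which exists only on Windows.

```python
from collections import UserDict


class KeyTypedEnviron(UserDict):
    """Hypothetical mapping: text keys give text values, bytes keys give bytes."""

    def __init__(self, data, encoding="cp1252"):
        # `encoding` is an assumption standing in for "mbcs".
        self.encoding = encoding
        super().__init__(data)

    def __getitem__(self, key):
        if isinstance(key, bytes):
            # Narrow lookup: decode the key, then encode the stored value.
            text_key = key.decode(self.encoding)
            return self.data[text_key].encode(self.encoding, "replace")
        # Wide lookup: values are stored as text and returned unchanged.
        return self.data[key]


env = KeyTypedEnviron({"HOME": "caf\u00e9"})
print(repr(env["HOME"]))
print(repr(env[b"HOME"]))
```

Only `__getitem__` is overridden here; a fuller version would also have to treat `get`, `in`, and iteration consistently.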
msg48554 - (view) | Author: Neil Hodgson (nyamatongwe) | Date: 2005-07-14 10:35 | |
os.environ is a dictionary, and Unicode keys cannot be discerned from string keys. For sys.argv, it appears that there is no support for the "parallel universe" approach with sys.argvu, and I expect one of the "promotion" models will be chosen. The patch should be rejected (or parked?) until consensus emerges. os.system was only included to allow testing, but I saw difficulties in writing robust unit tests for these features so didn't include any. |
|||
msg48555 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2005-07-14 18:10 | |
os.environ is not a dictionary, it is a UserDict.IterableUserDict. Discerning strings and Unicode objects would well be possible. As you are not willing to discuss the issues on python-dev, I'm rejecting the patch. |
|||
msg48556 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2005-07-14 18:33 | |
Just as a data point: using the type of a dictionary key to determine the resulting return type is a really bad design idea, just like letting functions determine their return type based on the types of their input parameters. These things should always be made explicit, e.g. os.environ.get_unicode(), sys.argv.get_unicode() etc. However, as the discussion on python-dev shows, we may not need this kind of approach at all. |
|||
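The explicit style suggested above might look like the sketch below. Only the name `get_unicode` comes from the message; the `ExplicitEnviron` class, the `get_bytes` method, and the cp1252 default encoding are assumptions made for illustration. As in any current Python code, the thread's "string" corresponds to `bytes` and "Unicode" to `str`.

```python
class ExplicitEnviron:
    """Hypothetical wrapper where the method name fixes the return type."""

    def __init__(self, data, encoding="cp1252"):
        self._data = dict(data)
        # Assumed stand-in for the platform's narrow encoding.
        self._encoding = encoding

    def get_unicode(self, name, default=None):
        # Always returns text (or the default), whatever the caller passes on.
        return self._data.get(name, default)

    def get_bytes(self, name, default=None):
        # Always returns bytes (or the default); unencodable characters
        # are replaced rather than raising.
        value = self._data.get(name)
        if value is None:
            return default
        return value.encode(self._encoding, "replace")


env = ExplicitEnviron({"HOME": "caf\u00e9"})
print(repr(env.get_unicode("HOME")))
print(repr(env.get_bytes("HOME")))
```

The return type is visible at the call site, so it does not depend on the type of the key.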
msg48557 - (view) | Author: Neil Hodgson (nyamatongwe) | Date: 2005-07-15 00:17 | |
I thought that posixmodule.c was creating os.environ but now see the code in os.py. "As you are not willing to discuss the issues on python-dev". Eh? I thought that was what I was doing in the "Adding the 'path' module" thread. You can reject the patch due to the discussion on python-dev, but I don't think the given reason is valid. |
|||
msg48558 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2005-07-15 05:17 | |
Sorry, I missed the discussion; reopening. |
|||
msg48559 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2005-08-09 15:08 | |
I think the discussion came to the following conclusion: environu should not be added; instead, os.environ should have Unicode where necessary (i.e. non-ASCII). I guess this applies both to keys and to values. Are you interested in revising the patch in this direction? |
|||
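The "Unicode where necessary" rule can be sketched as a small function. `promote` is a hypothetical name, and the sketch is written for current Python, where the thread's "string" corresponds to `bytes` and "Unicode" to `str`. The `encoding` parameter is an assumption that lets the same sketch express either a plain non-ASCII test or a test against some other narrow encoding.

```python
def promote(value, encoding="ascii"):
    """Return bytes when `value` fits the narrow encoding, else keep text.

    Hypothetical helper, for illustration only.
    """
    try:
        return value.encode(encoding)
    except UnicodeEncodeError:
        # The value cannot be represented narrowly, so it stays as text.
        return value


print(repr(promote("PATH")))
print(repr(promote("caf\u00e9")))
print(repr(promote("caf\u00e9", encoding="latin-1")))
```

Under such a rule the type of each value depends on its content and on the chosen encoding, so calling code has to be prepared to receive either type.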
msg48560 - (view) | Author: Neil Hodgson (nyamatongwe) | Date: 2005-08-09 23:23 | |
Marc-Andre Lemburg's point of view, that os.environ should use Unicode when the string is outside Python's default encoding, attracted most support. For the reasons given in the discussion, I feel this will cause problems for users. It is more difficult to code than a CP_ACP or non-ASCII test, and there would be flow-on work for other calls such as open that would need to convert from the default encoding to Unicode. Due to the size of these changes and my doubts about this being the correct design, I don't want to work on its implementation. |
|||
msg48561 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2005-08-10 07:17 | |
Ok, I think we have to reject this patch, then, and wait for somebody to write a PEP. |
|||
msg48562 - (view) | Author: Neil Hodgson (nyamatongwe) | Date: 2005-08-10 07:24 | |
Yes, the scope of the changes needed requires a PEP and a transition plan, and the design needs to make sense in moving towards the all-Unicode string future. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:12 | admin | set | github: 42154 |
2005-07-02 01:55:44 | nyamatongwe | create |