
classification
Title: Some incorrect national characters (Polish) in unicodedata
Type: Stage:
Components: Unicode Versions:
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: mhammond Nosy List: admindomeny, lemburg, loewis, mhammond
Priority: normal Keywords:

Created on 2007-06-26 18:45 by admindomeny, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
python_unicode_polish.JPG admindomeny, 2007-06-26 18:45 pythonwin screenshot
test.py admindomeny, 2007-06-27 17:25 testing Polish unicode chars in Pythonwin/IDLE
Messages (5)
msg32401 - (view) Author: admindomeny (admindomeny) Date: 2007-06-26 18:45
Hello,

This problem concerns Pythonwin (I haven't checked whether Unix/command-line Python is affected), Python 2.5.1.

Examples are in the attached screenshot.

E.g. print u'\N{LATIN SMALL LETTER A WITH CIRCUMFLEX}' prints the wrong character (a Latin small a with some kind of caret above it, it seems).

Likewise, print unicodedata.name( / latin small letter a with circumflex, typed in Windows using the Polish "programmer's keyboard" / ) produces 'SUPERSCRIPT ONE', which is obviously incorrect.

msg32402 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2007-06-27 08:28
This sounds more like a problem with entry of Unicode characters in pythonwin than the unicodedata module.

Please create a test.py file with the character using e.g. UTF-8 as source code encoding and run that through the Python interpreter directly to see if the problem persists.
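
A minimal sketch of such a test file follows. It is not the test.py attached to this issue; it uses Python 3 syntax (the report concerns Python 2.5.1), and the names `POLISH` and `describe` are hypothetical, chosen only for illustration. The idea is to store the characters in a UTF-8 encoded source file and ask unicodedata what the interpreter actually read.

```python
# -*- coding: utf-8 -*-
import unicodedata

# Polish letters typed literally; the coding header above tells
# Python how to decode the bytes of this source file.
POLISH = "ąćęłńóśźż"  # hypothetical sample string


def describe(text):
    """Return (code point, Unicode name) pairs for each character."""
    return [("U+%04X" % ord(ch), unicodedata.name(ch)) for ch in text]


for code, name in describe(POLISH):
    print(code, name)
```

If the names printed match the letters typed, unicodedata and the source decoding are behaving correctly, and the fault is more likely in how the characters are entered interactively.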
msg32403 - (view) Author: admindomeny (admindomeny) Date: 2007-06-27 17:25
You were correct: the attached test file for Polish national characters shows the correct character encodings when run in Pythonwin, provided the file was edited correctly as Unicode, with the Polish characters entered in a Unicode-aware editor.

The problem of entering characters in Pythonwin remains, however (OS: Win XP SP2, Polish edition): I have tried changing fonts to what are Unicode fonts as far as I know (Times New Roman, Arial, etc), including CE fonts as well. It doesn't work.

I made sure that the Polish Programmer's Keyboard is turned on, which gives me the correct encoding in almost all Windows applications, including Unicode editors like UniRed. Still, the Pythonwin shell in particular thinks that AltGr+a (the standard way of entering 'LATIN SMALL LETTER A WITH OGONEK') is actually 'SUPERSCRIPT ONE', for example.
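
One plausible explanation for the 'SUPERSCRIPT ONE' symptom, not confirmed anywhere in this thread, is a code page mix-up: the keystroke arrives as a single byte in the Windows Central European code page (cp1250) and is then interpreted as Latin-1. The sketch below uses Python 3 syntax and only shows that this particular mismatch reproduces the reported name; whether Pythonwin actually does this is an assumption.

```python
import unicodedata

# The character AltGr+a should produce on the Polish programmer's keyboard.
typed = "\N{LATIN SMALL LETTER A WITH OGONEK}"

# cp1250 (Windows Central European) encodes this letter as one byte.
raw = typed.encode("cp1250")

# Assumption for illustration: the byte is misread as Latin-1.
misread = raw.decode("latin-1")

print(raw.hex(), unicodedata.name(misread))
```

Under that assumption the byte is 0xB9, which Latin-1 maps to U+00B9, the character unicodedata names 'SUPERSCRIPT ONE'.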

So, to summarize:

1. IDLE edits the text in Unicode correctly, provided there's a #-*- coding: utf-8 -*- header in the first line.

2. Pythonwin executes that file correctly.

3. Pythonwin enters national characters INCORRECTLY (at least as far as Polish is concerned, but I suspect it's also the case with other languages).


File Added: test.py
msg32404 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2007-06-27 19:38
Assigning to Mark Hammond who wrote Pythonwin.
msg32405 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-06-30 17:34
Actually, we should close this here. Please report it through the PythonWin bugtracker.
History
Date User Action Args
2022-04-11 14:56:25 admin set github: 45131
2007-06-26 18:45:20 admindomeny create