classification
Title: Non-ASCII characters bugs
Type: Stage:
Components: IDLE Versions: Python 2.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: loewis Nosy List: barto, loewis, rhettinger
Priority: normal Keywords:

Created on 2003-07-20 20:23 by barto, last changed 2022-04-10 16:10 by admin. This issue is now closed.

Files
File name Uploaded Description
test_01.py barto, 2003-07-21 14:10 file with non-ASCII characters created with IDLE 0.8
test.zip barto, 2003-07-22 21:40 test_01.py and test_02.py
clean.py loewis, 2003-07-23 04:39
eol.diff loewis, 2003-08-03 18:13
Messages (16)
msg17137 - (view) Author: Bartolome Sintes Marco (barto) Date: 2003-07-20 20:23
I have downloaded and installed Python 2.3 RC1 in a
Spanish Windows 98 SE computer. IDLE 1.0 does not work
very well:

a) When I use IDLE 1.0 RC1 to open a program written
with IDLE 0.8, Spanish non-ASCII characters (such as
accented vowels) are changed to wrong characters.
Some examples:
í -> á
ó -> ó
ú -> ú

b) With IDLE 1.0 rc1 I can create a new .py file with
non-ASCII characters and save it, but if I reload the
same file and modify it, then I cannot save it
(nor save it as). If I delete the non-ASCII
characters, then I can save it (or save it as) without problems.

If you need more information, my address is
BartolomeSintes at ono.com

Thanks for your great work!
msg17138 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-07-20 21:10
Logged In: YES 
user_id=21627

Why do you think this is a bug in IDLE 1.0, when it is IDLE
0.8 which displays the data incorrectly? Can you report what
notepad.exe thinks about these files? Or better, can you
please attach one such file here?

Can you try adding a line
# -*- coding: iso-8859-1 -*-

as the first line of your file before saving it in IDLE, and
see whether this changes anything?

Did IDLE ever propose to add such a line?

Did you somehow change the default configuration of IDLE,
through Options/Configure IDLE/General? Did you edit
site.py, or sitecustomize.py, to change the default encoding?
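
For readers unfamiliar with the coding line mentioned above, here is a minimal sketch of how such a declaration can be detected, using the regular expression given in PEP 263. The names `CODING_RE` and `declared_encoding` are hypothetical, the code is written for Python 3 rather than the Python 2.3 discussed in this thread, and it is not IDLE's actual implementation.

```python
import re

# Regular expression from PEP 263 for lines such as
# "# -*- coding: iso-8859-1 -*-".
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(source_text):
    """Return the encoding named in the first two lines, or None."""
    for line in source_text.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return None


print(declared_encoding("# -*- coding: iso-8859-1 -*-\nprint('hola')\n"))
print(declared_encoding("print('hola')\n"))
```

Only the first two lines are inspected, because PEP 263 requires the declaration to appear there.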
msg17139 - (view) Author: Bartolome Sintes Marco (barto) Date: 2003-07-20 22:47
Logged In: YES 
user_id=624347

bug a)
When I began to use IDLE 0.8, I could not use non-ASCII
characters. As I read in
http://www.python.org/cgi-bin/faqw.py?query=4.102&querytype=simple&casefold=yes&req=search
I created a sitecustomize.py file in lib/site-packages, and since then I
have been able to use non-ASCII characters (in IDLE 0.8) in
my programs.
Now when I open these programs with IDLE 1.0rc1, it shows
wrong non-ASCII characters (and yes, Notepad shows the
same wrong characters). I thought it was an IDLE 1.0rc1 bug,
but surely you are right and it is an IDLE 0.8 bug. The
question is whether there is an easy way to solve this problem
(apart from find and replace) and whether it should be stated somewhere.

bug b)
You are right. IDLE 1.0rc1 shows a warning if the file is 
created with IDLE 1.0rc1. I made a mistake in my report. 
What I actually did was open with IDLE 1.0rc1 a file created
with IDLE 0.8, delete the non-ASCII characters and save
it. But if I later add some non-ASCII characters to this file
with IDLE 1.0rc1, then save does not work (even if I manually
add the # -*- coding: cp1252 -*- line) and it does not show
any warning.

I do not think I have changed the default configuration of IDLE
1.0rc1 (Options/General/Default source encoding is set to
none). I have not added the old sitecustomize.py to the
Python 2.3rc1 installation.
msg17140 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-07-20 22:51
Logged In: YES 
user_id=80475

Part b) very closely resembles the problem I was having (and
can no longer reproduce) in editing Doc/tut/tut.tex.

Something in the save procedure silently aborts the save if 
there is something it doesn't like in the file (encoding issues, 
sizes, name issues, python syntax highlighting issues, or 
somesuch).

msg17141 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-07-20 23:05
Logged In: YES 
user_id=21627

rhettinger: In IDLE 1.0, IDLE should not ever refuse to save
the file, unless the declared encoding (through the coding:
declaration) cannot support the characters in the file.

barto: I still don't understand what you did, either in case
a) or case b). Can you give precise data?

For a), please report what specific encoding you have put
into sitecustomize.py, what code page your windows
installation uses, and please attach a file that you have
created with IDLE 0.8.

For b), what do you mean by "save does not work"? Was there
some error message? If so, what did it say? If not, what
else was wrong?
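
The condition described above, a declared encoding that cannot represent every character in the file, can be checked along these lines. This is only a sketch in Python 3 with a hypothetical helper name (`unencodable_chars`). It does not reproduce IDLE's own save logic.

```python
def unencodable_chars(text, encoding):
    """Return the characters of text that the given codec cannot encode."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


source = 'print("¡hola, mundo!")\nprint("áéíóú")\n'

# An empty list means every character fits the declared encoding.
print(unencodable_chars(source, "cp1252"))
print(unencodable_chars(source, "ascii"))
```

An editor could refuse to save, or ask the user for a different encoding, whenever the returned list is non-empty.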
msg17142 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-07-20 23:28
Logged In: YES 
user_id=80475

If Bartolomé is having the same issue I had (on Py2.3b2, but it
disappeared on Py2.3c1), then the symptoms were:
   Open Doc/tut/tut.tex
   Make minor edits
   Observe:
    - the stars by the filename show a need for a save
    - the editor window shows the changes
    Save (using Ctrl-S, File Save, or File Save As)
    Observe:
    - no windows pop-up refusing to save
    - the stars remain
    - the syntax highlighting indicates that no valid python 
syntax  was found
    Exit IDLE
    - get prompted to save and say YES
    - the exit occurs
    Look at the file
    - note that the changes did not take

For a brief while, I could re-create this reliably.  No one else
has been able to reproduce it (leading me to think this was
specific to WinMe).  I could also reproduce it on small files
and files like "c:\a.tmp" (leading me to think it was
independent of o.s. pathnaming conventions).  For a while, I
could also reproduce it on a fresh build, but that stopped as
soon as the last release candidate changes went in.

Sorry for the circuitous report, but if I knew what it was, it
would be fixed already.  Also, since no one else could
reproduce it, I was beginning to suspect my own machine.

If Bartolomé is having save problems without getting a
warning window, then he is having the same issue.
msg17143 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-07-21 07:02
Logged In: YES 
user_id=80475

I'll continue to work on this one to see if I can use the 
debugger to isolate the problem.
msg17144 - (view) Author: Bartolome Sintes Marco (barto) Date: 2003-07-21 14:10
Logged In: YES 
user_id=624347

Let's start from the beginning:
1. I uninstall Python from my system (Windows 98 SE Spanish)
In my autoexec.bat I can read
  mode con codepage prepare=((850)C:\WINDOWS\COMMAND\ega.cpi)
  mode con codepage select=850
  keyb sp,,C:\WINDOWS\COMMAND\keyboard.sys
2. I install Python 2.2.3
3. I open IDLE 0.8
4. I create the following file:
  print "¡hola, mundo!"
  print "áéíóú"
  # Hello world in Spanish, with non-ASCII characters
5. If I try to save this file as test01.py, I get the
following IDLE message
Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\ARCHIVOS DE PROGRAMA\PYTHON223\lib\lib-tk\Tkinter.py", line 1316, in __call__
    return apply(self.func, args)
  File "C:\ARCHIV~1\PYTHON~1\Tools\idle\IOBinding.py", line 126, in save
    self.save_as(event)
  File "C:\ARCHIV~1\PYTHON~1\Tools\idle\IOBinding.py", line 136, in save_as
    if self.writefile(filename):
  File "C:\ARCHIV~1\PYTHON~1\Tools\idle\IOBinding.py", line 151, in writefile
    chars = str(self.text.get("1.0", "end-1c"))
UnicodeError: ASCII encoding error: ordinal not in range(128)
6. I add the following sitecustomize.py file to
lib/site-packages:
  # Set the string encoding used by the Unicode implementation.
  # The default is 'ascii'
  encoding = "ascii" # <= CHANGE THIS if you wish
  # Enable to support locale aware default string encodings.
  import locale
  loc = locale.getdefaultlocale()
  if loc[1]:
      encoding = loc[1]
  print encoding
  if encoding != "ascii":
      import sys
      sys.setdefaultencoding(encoding)
7. I close IDLE 0.8 (without saving test01.py)
8. I open again IDLE 0.8
9. I create the following file:
  print "¡hola, mundo!"
  print "áéíóú"
  # Hello world in Spanish, with non-ASCII characters
10. I save the file as test01.py (no messages, no warnings)
11. I close IDLE 0.8
12. I open IDLE 0.8. I load test01.py and run it. Everything
is OK.
13. I uninstall Python 2.2.3 and install Python 2.3rc1
14. I open IDLE 1.0rc1 and open test01.py. That is what I see
  print "¡hola, mundo!"
  print "áéíóú"
  # Hello world in Spanish, with non-ASCII characters

=> I think this behaviour can be an IDLE bug.

15. I try to save test01.py as test02.py. IDLE shows me the
standard save as dialog window. When I click the Save
button, nothing happens (the file name is still test01.py
and a * is shown before the file name)

=> I think this behaviour can be an IDLE bug, too.

16. I delete the non-ASCII characters. The program is now:
  print "hola, mundo!"
  print 
17. I save it as test02.py without problems.
18. I close test02.py and open it again. I add non ASCII
characters. The program is now:
  print "¡hola, mundo!"
  print "áéíóú"
19. I try to save it as test03.py. IDLE shows me the warning
message "Non-ASCII found, yet no encoding declared". I click
on "Edit my file". The program is now:
  # -*- coding: cp1252 -*-
  print "¡hola, mundo!"
  print "áéíóú"
I try to save it again as test03.py. IDLE shows me the
standard save as dialog window. When I click the Save
button, nothing happens (the file name is still test02.py
and a * is shown before the file name)

=> I think this behaviour can be an IDLE bug, too.

I am sending attached the test01.py file.

By the way, IDLE creates a .idlerc folder (with a
recent-files.lst file in it) in the folder where I have
created test02.py file. Is it normal?
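
As a rough illustration of the UnicodeError in step 5: the traceback shows IDLE 0.8 calling str() on the Unicode text from the editor widget, which under Python 2 encodes with the default encoding ("ascii" unless changed, as in step 6). The sketch below mimics that in Python 3 with explicit encode calls. The variable names are hypothetical and this is not the IOBinding.py code.

```python
# Unicode text as an editor widget might hand it back (hypothetical value).
text = 'print "áéíóú"'

try:
    # Roughly what str(...) did while the default encoding was "ascii".
    text.encode("ascii")
except UnicodeEncodeError as exc:
    print("save would fail:", exc.reason)

# Roughly the effect once the default encoding follows the locale (cp1252).
data = text.encode("cp1252")
print(repr(data))
```

This would explain why the save in step 10 succeeded silently: the bytes written depend on the process-wide default rather than on anything recorded in the file.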
msg17145 - (view) Author: Bartolome Sintes Marco (barto) Date: 2003-07-21 14:15
Logged In: YES 
user_id=624347

Sorry, but there is at least one mistake in my previous message.
Item 15 should be:
15. I try to save test01.py as test02.py. IDLE shows me the
standard save as dialog window. When I click the Save
button, nothing happens (the file name is still test01.py)

=> I think this behaviour can be an IDLE bug, too.
msg17146 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-07-22 20:45
Logged In: YES 
user_id=21627

I cannot reproduce the problem, with Windows ME German
edition, ActivePython 2.2.1, and Python 2.3b2. In
particular, the file opens just fine in step 15, and hence I
cannot execute step 16. Step 19 succeeds with saving the file.

Can you please attach the three files, preferably as a
single ZIP file (see Check to Upload and Attach a File below)?

Can you also report what sys.getdefaultencoding() is in your
two installations?
msg17147 - (view) Author: Bartolome Sintes Marco (barto) Date: 2003-07-22 21:40
Logged In: YES 
user_id=624347

1. I have added getdefaultencoding() to test_01.py:

from sys import *
print getdefaultencoding()
print "¡hola, mundo!"
print "áéíóú"

In Python 2.2.3 (IDLE 0.8) the output of the program is now:
cp1252
¡hola, mundo!
áéíóú
(as expected)

2. In Python 2.3rc1 (IDLE 1.0rc1) the output of test_01.py is:
ascii
¡hola, mundo!
áéíóú

But I can not save it as test_02.py (as I have explained in
my previous post).

3. I delete the non-ASCII characters. test_02.py is now:
from sys import *
print getdefaultencoding()
print "hola, mundo!"
print 

Now I can save it in IDLE 1.0rc1 and the output is:
ascii
hola, mundo!

4. In IDLE 1.0rc1 I add non-ASCII characters. The program is
now:
from sys import *
print getdefaultencoding()
print "¡hola, mundo!"
print "áéíóú"

But I can not save it as test_03.py (as I have explained in
my previous post).

5. I am sending attached test_01.py and test_02.py in a zip
file.
msg17148 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-07-22 21:53
Logged In: YES 
user_id=21627

I finally understand what is going on. The file test_01.py
is completely corrupted. The second line is encoded in
Latin-1, whereas the last line is encoded in UTF-8. There is
no way IDLE 1.0 could possibly process it in a meaningful way.

The problem with saving it still needs to be addressed.
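
To make the diagnosis above concrete, here is a small Python 3 sketch that guesses, line by line, whether the bytes are plain ASCII, valid UTF-8, or neither (in which case Latin-1 is assumed). The function name `guess_line_encoding` and the sample bytes are hypothetical. They are not the contents of test_01.py, and this is not necessarily how the file was actually inspected.

```python
def guess_line_encoding(raw_line):
    """Classify one line of bytes as 'ascii', 'utf-8', or 'latin-1 (assumed)'."""
    try:
        raw_line.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw_line.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Any byte string decodes as Latin-1, so this is only a fallback guess.
        return "latin-1 (assumed)"


# Hypothetical file contents: one line saved as Latin-1, one as UTF-8.
mixed = ('print "áéíóú"\n'.encode("latin-1")
         + 'print "áéíóú"\n'.encode("utf-8"))

for number, line in enumerate(mixed.splitlines(), start=1):
    print(number, guess_line_encoding(line))
```

A file whose lines report different encodings cannot be decoded correctly with any single codec, which is the sense in which such a file is corrupted.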

msg17149 - (view) Author: Bartolome Sintes Marco (barto) Date: 2003-07-22 22:22
Logged In: YES 
user_id=624347

What does "completely corrupted" mean? IDLE 0.8 can open
and execute it. Is there a way of "cleaning" it?
Do you need more information from me? I will be available
until Saturday the 26th of July.
Best regards.
msg17150 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-07-23 04:39
Logged In: YES 
user_id=21627

IDLE 0.8 interprets this example because of a bug; the data
are still bogus. You can use the attached clean.py to
convert the file into iso-8859-1.
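
The attached clean.py is not reproduced in this thread. As an assumption about what such a conversion could look like, the Python 3 sketch below decodes each line as UTF-8 where possible, falls back to ISO-8859-1 otherwise, and writes everything back out as ISO-8859-1. The name `clean_to_latin1` is hypothetical.

```python
def clean_to_latin1(raw):
    """Re-encode mixed UTF-8/Latin-1 bytes as ISO-8859-1, line by line."""
    cleaned = []
    for line in raw.splitlines(keepends=True):
        try:
            text = line.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8, so assume the line is already ISO-8859-1.
            text = line.decode("iso-8859-1")
        # Raises UnicodeEncodeError if a character has no Latin-1 equivalent.
        cleaned.append(text.encode("iso-8859-1"))
    return b"".join(cleaned)


# Hypothetical mixed input: first line Latin-1, second line UTF-8.
mixed = 'print "áéíóú"\n'.encode("iso-8859-1") + '# áéíóú\n'.encode("utf-8")
print(clean_to_latin1(mixed).decode("iso-8859-1"))
```

The heuristic can misjudge a Latin-1 line that happens to be valid UTF-8, so the result is worth checking by eye before overwriting the original file.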
msg17151 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-08-03 18:13
Logged In: YES 
user_id=21627

Attached is a patch that fixes the problem that IDLE is
unable to save test_01.py. I'll apply it as soon as the
maintenance branch is open.
msg17152 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-08-05 05:52
Logged In: YES 
user_id=21627

This is now fixed in IOBinding.py 1.20 and 1.19.8.1.
History
Date User Action Args
2022-04-10 16:10:03 admin set github: 38884
2003-07-20 20:23:55 barto create