
classification
Title: cannot find or replace umlauts
Type: Stage:
Components: Library (Lib) Versions: Python 2.3
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: doerwalter, tyuk
Priority: normal Keywords:

Created on 2004-04-18 11:04 by tyuk, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (2)
msg20535 - Author: Eleonora (tyuk) Date: 2004-04-18 11:04
#codecs1.py 
# -*- coding: <utf-8> -*- 
import codecs 
import string 
from string import * 
import re 
 
s = 'Cica Ûl daß Õrölt Ûz' 
print s 
s = unicode(s, "iso-8859-1")  # <-- this is needed!!
print s 
print s.lower() 
print find(s,u'daß') 
s = replace(s, u'daß', u'dass') 
print s 
----------------------- 
 
Without the unicode conversion, a normal string cannot
be searched for, or have replaced, anything other than
ASCII characters (code points below 128). This is very
bad practice. At least iso-8859-1 should be the default
codec, not ASCII.
 
Please inform me by email (eleonora46@gmx.net)
about the progress of this issue, thanks.
msg20536 - Author: Walter Dörwald (doerwalter) (Python committer) Date: 2004-04-19 11:41
If you replace s = 'Cica Ûl daß Õrölt Ûz' with s = u'Cica Ûl daß
Õrölt Ûz' (note the u prefix), you can drop the s=unicode(...)
line. Specifying an encoding header only affects how the
unicode literals in the script are interpreted; str literals
keep the raw bytes from the file. Note that ASCII was chosen
as the default encoding because it helps to detect conversion
errors.
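
The point about the ASCII default can be illustrated with a minimal sketch. It is not code from this issue: it assumes a modern Python where byte strings have a decode() method and text strings have an encode() method, and the names TEXT and decode_strict_ascii are hypothetical. Non-ASCII characters are written as escapes so the example does not depend on the encoding of the source file.

```python
# Hypothetical sample text; u'\xdf' is the sharp s from the report.
TEXT = u'Cica da\xdf'

# The same text as it would be stored on disk in two different encodings.
latin1_bytes = TEXT.encode('iso-8859-1')
utf8_bytes = TEXT.encode('utf-8')


def decode_strict_ascii(data):
    """Return data decoded as ASCII, or None if it holds non-ASCII bytes."""
    try:
        return data.decode('ascii')
    except UnicodeDecodeError:
        return None


# 1. Decode explicitly with the right codec, then search and replace on text.
decoded = latin1_bytes.decode('iso-8859-1')
position = decoded.find(u'da\xdf')
replaced = decoded.replace(u'da\xdf', u'dass')

# 2. An ASCII default refuses to guess: the non-ASCII byte is reported
#    as an error instead of being converted silently.
ascii_result = decode_strict_ascii(latin1_bytes)

# 3. A Latin-1 default accepts every byte sequence, so UTF-8 input would
#    decode without any error into the wrong characters.
mojibake = utf8_bytes.decode('iso-8859-1')

print(position)
print(replaced)
print(ascii_result is None)
print(mojibake == TEXT)
```

Step 3 is the case a Latin-1 default would hide: the decode succeeds, but the result is no longer the original text, and nothing signals the mistake.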
History
Date                 User   Action  Args
2022-04-11 14:56:03  admin  set     github: 40166
2004-04-18 11:04:27  tyuk   create