classification
Title: sub[n] not working as expected.
Type: Stage:
Components: Regular Expressions Versions: Python 2.2
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: effbot Nosy List: effbot, nicfit, nowonder
Priority: normal Keywords:

Created on 2002-08-24 22:14 by nicfit, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Messages (2)
msg12158 - Author: Travis Shirk (nicfit) Date: 2002-08-24 22:14
I'm running into what looks to be a bug in the python
2.2 re module.
These examples should demonstrate the problem.

Using Python 1.5.2:
import re
data = "\xFF\x00\xE0\xD3\xD3\xE4\x95\xFF\x00\x00\x11\xFF\x00\xF5"
data1 = re.compile(r"\xFF\x00([\xE0-\xFF])").sub(r"\xFF\1", data)
print data1
'\377\340\323\323\344\225\377\000\000\021\377\365'


This output is exactly what I expect, but now see what
happens in 2.2.1:
import re
data = "\xFF\x00\xE0\xD3\xD3\xE4\x95\xFF\x00\x00\x11\xFF\x00\xF5"
data1 = re.compile(r"\xFF\x00([\xE0-\xFF])").sub(r"\xFF\1", data)
print data1
'\\xFF\xe0\xd3\xd3\xe4\x95\xff\x00\x00\x11\\xFF\xf5'

I like the hex output over the octal in 1.5, but the
substitution is clearly wrong.  Notice each spot containing
"\\" in the last result.
msg12159 - Author: Peter Schneider-Kamp (nowonder) * (Python triager) Date: 2002-08-27 16:22
The substitution is correct. Note that the raw string
r"\xFF\1" passed to sub has length 6, not length 3. As the
uppercase FF in the output shows, \\xFF is a four-character
string (backslash, x, F, F) and is unrelated to the
single-character string \xff.

If you use .sub("\xFF\\1", data) instead you will achieve
the desired result.
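
The length argument above can be checked directly. The following is a minimal sketch, written in Python 3 syntax as an assumption (the thread itself concerns 1.5.2 and 2.2); the names raw_repl and mixed_repl are hypothetical and do not appear in the report. The raw replacement is only measured, not passed to sub, since newer Python 3 releases are stricter about unknown escapes in replacement templates.

```python
import re

# Same data as in the report, here as a Python 3 str (an assumption).
data = "\xFF\x00\xE0\xD3\xD3\xE4\x95\xFF\x00\x00\x11\xFF\x00\xF5"

# Hypothetical names for the two replacement strings being compared.
raw_repl = r"\xFF\1"    # backslash, x, F, F, backslash, 1
mixed_repl = "\xFF\\1"  # the character 0xFF, backslash, 1

print(len(raw_repl), len(mixed_repl))  # prints: 6 3

pattern = re.compile(r"\xFF\x00([\xE0-\xFF])")

# With the non-raw "\xFF", the replacement starts with the real 0xFF
# character, and "\\1" still reaches re as the group reference \1.
result = pattern.sub(mixed_repl, data)
print(ascii(result))
```

With mixed_repl the result has the same character values as the 1.5.2 output quoted in the report.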

Note that the raw string passed to re.compile() also does
not contain the character \xff itself, but as described in
the documentation, re is able to parse the \xHH-style
character escapes.
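
A small sketch of this last point, again assuming Python 3 syntax; pattern_text is a hypothetical name. The raw pattern string holds four ordinary characters, yet the compiled pattern matches the single character 0xFF because re interprets the \xHH escape itself.

```python
import re

# Hypothetical name: the raw pattern text as it reaches re.compile().
pattern_text = r"\xFF"

# Four characters: backslash, x, F, F. No 0xFF character is present.
print(len(pattern_text))       # prints: 4
print("\xFF" in pattern_text)  # prints: False

# re parses the \xHH escape, so the pattern matches the 0xFF character.
match = re.compile(pattern_text).match("\xFF")
print(match is not None)       # prints: True
```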
History
Date User Action Args
2022-04-10 16:05:37 admin set github: 37084
2002-08-24 22:14:57 nicfit create