classification
Title: re seems broken on 64-bit machines/linux
Components: Regular Expressions
Versions: Python 2.3
process
Status: closed
Resolution: fixed
Assigned To: loewis
Nosy List: loewis, misa
Priority: high

Created on 2004-04-08 17:02 by misa, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (4)
msg20443 - Author: Mihai Ibanescu (misa) Date: 2004-04-08 17:02
Hello,

Tested the following piece of code both on ia64 and amd64.

python -c 'import re; print re.compile(u"[\u002E\u3002\uFF0E\uFF61]").split("a.b.c")'

Expected result:
['a', 'b', 'c']

Actual result: varies, probably depending on the glibc version.
On glibc-2.3.2 (Red Hat Enterprise Linux 3 WS) I get back:
['a.b.c']

On glibc-2.3.3 (Fedora Core 2) I get back:
['a.b.', '']

This doesn't happen on i386 architectures.

The above string that I try to compile comes from
encodings/idna.py (and it is used to split a domain
name into components).

Let me know if you need more information on how to
reproduce this.
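
For readers on a current interpreter, the expected behaviour described above can be sketched in Python 3 as follows. This is a minimal illustration, not the code from encodings/idna.py: the names `DOTS` and `split_labels` are hypothetical, and only the character class is taken from the report.

```python
import re

# The four "dot" code points from the report:
# U+002E FULL STOP, U+3002 IDEOGRAPHIC FULL STOP,
# U+FF0E FULLWIDTH FULL STOP, U+FF61 HALFWIDTH IDEOGRAPHIC FULL STOP.
# DOTS and split_labels are illustrative names, not the library's API.
DOTS = re.compile("[\u002E\u3002\uFF0E\uFF61]")


def split_labels(name):
    """Split a domain name into labels on any of the four dot characters."""
    return DOTS.split(name)


print(split_labels("a.b.c"))                   # ['a', 'b', 'c']
print(split_labels("a\u3002b\uff0ec\uff61d"))  # ['a', 'b', 'c', 'd']
```

On an affected build, the first call is the one that would return ['a.b.c'] or ['a.b.', ''] instead of three labels.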
msg20444 - Author: Mihai Ibanescu (misa) Date: 2004-04-08 17:11
In both cases, python was built with UCS4, but it happens
with UCS2 just as well.
msg20445 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2004-05-07 07:19
This is fixed now in

sre_compile.py 1.55
test_re.py 1.49
sre.h 2.25
sre_compile.py 1.49.6.1
NEWS 1.831.4.106
sre.h 2.22.16.1

Are you sure it happens in UCS-2 mode as well? The fix I
made can only apply to UCS-4 mode, and is then independent
of the C library.
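
One way to answer the UCS-2 versus UCS-4 question for a given interpreter is to inspect `sys.maxunicode`. The sketch below is only an illustration; the helper name `unicode_build` is hypothetical. On Python 3.3 and later the value is always 0x10FFFF, so the distinction is only meaningful on older builds such as the 2.3 interpreters discussed here.

```python
import sys


def unicode_build():
    """Return 'UCS-2' for a narrow build and 'UCS-4' for a wide one.

    Hypothetical helper: narrow builds report sys.maxunicode == 0xFFFF,
    wide builds (and every Python 3.3+ interpreter) report 0x10FFFF.
    """
    return "UCS-2" if sys.maxunicode == 0xFFFF else "UCS-4"


print(unicode_build(), hex(sys.maxunicode))
```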
msg20446 - Author: Mihai Ibanescu (misa) Date: 2004-05-07 11:09
I will check how UCS2 behaves; I remember testing it, though,
and getting the same results. Thanks for the fixes, I am
testing them now.
History
Date User Action Args
2022-04-11 14:56:03 admin set github: 40128
2004-04-08 17:02:19 misa create