Issue1212411
Created on 2005-06-01 05:13 by karamana, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Messages (6) | |||
---|---|---|---|
msg25458 - (view) | Author: Vijay Kumar (karamana) | Date: 2005-06-01 05:13 | |
The regular expression "|hello|world" incorrectly gives a match, owing to the leading '|'. Below is a sample program which highlights this. The correct behavior is to return None; if the leading '|' is removed, the result is correct.

-----
import re
m = re.search("|hello|world","This is a simple sentence")
print m
m2 = re.search("hello|world","This is a simple sentence")
print m2
---- output ---
<_sre.SRE_Match object at 0x00B71F70>
None
----------
The first one is incorrect. It should have returned None.
|||
msg25459 - (view) | Author: Tim Peters (tim.peters) * | Date: 2005-06-01 05:19 | |
I expect you'll find that, e.g., Perl does the same thing: a "missing" alternative is treated as an empty string, and an empty string always matches. What basis do you have for claiming it should not match (beyond just repeating that it should not <wink>)?
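To make the point above concrete, here is a minimal sketch in Python 3 syntax (the thread itself uses Python 2 print statements). It reuses the reporter's pattern and sentence; the variable names are only illustrative. The leading '|' introduces an empty first alternative, so the search succeeds at position 0 without consuming any characters.

```python
import re

text = "This is a simple sentence"

# The leading '|' makes the first alternative the empty string,
# which matches at the very first position tried.
m = re.search("|hello|world", text)
print(m.span())          # start and end are equal: nothing was consumed
print(repr(m.group(0)))  # the matched text is the empty string

# Without the empty alternative, neither word occurs in the text.
m2 = re.search("hello|world", text)
print(m2)
```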
|||
msg25460 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2005-06-01 21:39 | |
The current behavior best matches my expectations. One other datapoint: AWK handles it the same way. Recommend closing this as Invalid.
|||
msg25461 - (view) | Author: Vijay Kumar (karamana) | Date: 2005-06-01 22:59 | |
I think what you are saying is correct in a formal sense, but it makes sense to distinguish between a useful match and an empty match. Maybe there could be an additional method isEmptyMatch() on the match object which can be used to detect this.

Also, this one does not work; it gives a compile error:

m = re.search("[]","This is a simple sentence")
print m

whereas this one returns None:

m = re.search("[|]","This is a simple sentence")
print m

So the empty match is not consistent :) (don't know if I should wink)
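The match object has no isEmptyMatch() method, but the distinction the reporter asks for can be drawn from the existing start() and end() methods. The helper below is a hypothetical sketch in Python 3; the name is_empty_match is an assumption made for illustration and is not part of the re module.

```python
import re

def is_empty_match(m):
    """Hypothetical helper: True if m is a match that consumed no characters."""
    return m is not None and m.start() == m.end()

text = "This is a simple sentence"

empty = re.search("|hello|world", text)            # succeeds, but matches ""
useful = re.search("hello|world", "hello there")   # matches "hello"
missing = re.search("hello|world", text)           # no match at all (None)

print(is_empty_match(empty))    # True
print(is_empty_match(useful))   # False
print(is_empty_match(missing))  # False
```

Since an empty string is falsy, `if m and m.group(0):` is an equivalent inline test for a non-empty match.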
|||
msg25462 - (view) | Author: Georg Brandl (georg.brandl) * | Date: 2005-06-02 06:49 | |
Your example is wrong. "[]" is an error because it is an empty character group. "[|]" is a valid character group which matches the literal "|"; it is equivalent to r"\|". Between [ and ], most characters lose their special meaning. I'm also in favour of the current behaviour; recommend closing.
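A short Python 3 sketch of the two cases discussed here. The exact error message for "[]" differs between Python versions, so the example only checks that compilation fails with re.error; the sample string "a|b" is an assumption added for illustration.

```python
import re

# "[]" does not compile: the character class is never validly closed.
try:
    re.compile("[]")
    compile_failed = False
except re.error:
    compile_failed = True
print(compile_failed)

# Inside [...], '|' is an ordinary character, so "[|]" looks for a literal pipe.
no_pipe = re.search("[|]", "This is a simple sentence")
print(no_pipe)  # None: the sentence contains no '|'

# On text that does contain a pipe, "[|]" and r"\|" find the same thing.
in_class = re.search("[|]", "a|b")
escaped = re.search(r"\|", "a|b")
print(in_class.span(), escaped.span())
```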
|||
msg25463 - (view) | Author: Tim Peters (tim.peters) * | Date: 2005-06-03 14:53 | |
I'm closing this as not-a-bug. The current behavior makes sense, matches how other regexp packages work, and can't be changed regardless without breaking existing code. Note that (mid|)night isn't the same as (mid)?night in the case where "mid" doesn't match. That's one reason the first form actually gets used (in the first form group 1 matches an empty string; in the second form group 1 doesn't match at all). As birkenfeld said, character classes are entirely different gimmicks.
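The difference between the two forms shows up in what group 1 reports. A minimal Python 3 sketch, using "night" and "midnight" as sample inputs (the inputs and variable names are assumptions for illustration):

```python
import re

# Empty alternative: group 1 always participates, possibly matching "".
alt = re.match("(mid|)night", "night")
# Optional group: group 1 is skipped entirely when "mid" is absent.
opt = re.match("(mid)?night", "night")

print(repr(alt.group(1)))  # '' (the group matched an empty string)
print(repr(opt.group(1)))  # None (the group did not match at all)

# When "mid" is present, the two forms agree.
alt_mid = re.match("(mid|)night", "midnight")
opt_mid = re.match("(mid)?night", "midnight")
print(alt_mid.group(1), opt_mid.group(1))
```

Code that concatenates or formats group 1 can therefore use the first form without a None check.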
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:11 | admin | set | github: 42039 |
2005-06-01 05:13:29 | karamana | create |