classification
Title: (?(id/name)yes|no) re implementation
Type: Stage:
Components: Library (Lib) Versions: Python 2.3
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: niemeyer Nosy List: loewis, niemeyer
Priority: normal Keywords: patch

Created on 2002-06-24 01:41 by niemeyer, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Files
File name Uploaded Description
python-2.3a0-grouprefexists.patch niemeyer, 2002-06-24 01:41
python-2.3b1-grouprefexists.patch niemeyer, 2003-06-14 05:37
Messages (11)
msg40400 - (view) Author: Gustavo Niemeyer (niemeyer) * (Python committer) Date: 2002-06-24 01:41
This patch implements a regular expression feature that allows
some interesting patterns, in the same way as implemented in Perl.
For example, (?(1)yes|no) matches "yes" if group 1 took part in the
match, and "no" if it didn't. Without this feature, the regular
expression must be duplicated to get the same results. In addition to
Perl's feature, it also accepts a Python named group as the argument.
   
Here's an example:   
   
(<)?\w+@\w+(\.\w+)+(?(1)>)   
  
This is a poor email matching regular expression, which will match   
with or without the "<>" symbols.   
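A minimal sketch of the example above, assuming the conditional syntax behaves in Python's current `re` module the way this patch describes. The sample addresses and the group name `open` are made up for illustration.

```python
import re

# Numbered form from the example: ">" is required only if group 1 matched "<".
numbered = re.compile(r"(<)?\w+@\w+(\.\w+)+(?(1)>)")

# Same pattern with a named group as the condition (the name "open" is illustrative).
named = re.compile(r"(?P<open><)?\w+@\w+(\.\w+)+(?(open)>)")

samples = [
    "<user@example.com>",
    "user@example.com",
    "<user@example.com",
    "user@example.com>",
]

# fullmatch is used so that an unbalanced bracket rejects the whole string.
results = {
    s: (bool(numbered.fullmatch(s)), bool(named.fullmatch(s))) for s in samples
}

for s, (by_number, by_name) in results.items():
    print(f"{s!r}: numbered={by_number} named={by_name}")
```

Only the two balanced forms are accepted; a lone "<" or ">" makes the full match fail.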
   
msg40401 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-04-19 08:50
If you add new opcodes, you should also change SRE_MAGIC.
msg40402 - (view) Author: Gustavo Niemeyer (niemeyer) * (Python committer) Date: 2003-04-19 21:30
That patch has been around for a long time. Should I work on it,
fix that problem, and apply it? Do you agree with including
the feature?

I remember that the main reason for implementing this is
because it is hard to achieve the same results without it.
You have to rewrite the whole match twice inside an or'ed group
(e.g. "(<... match email ...>|... match email ...)").
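As a rough illustration of the duplication described here, the sketch below compares the or'ed form with the conditional form. The `EMAIL` sub-pattern is a hypothetical stand-in for "... match email ...", not the pattern from the patch.

```python
import re

# Hypothetical stand-in for "... match email ...".
EMAIL = r"\w+@\w+(?:\.\w+)+"

# Without the feature: the email sub-pattern has to be written twice.
duplicated = re.compile(rf"(?:<{EMAIL}>|{EMAIL})")

# With the feature: written once; ">" is required only if "<" matched.
conditional = re.compile(rf"(<)?{EMAIL}(?(1)>)")

samples = ["<a@b.c>", "a@b.c", "<a@b.c", "a@b.c>", "not an email"]

# Both patterns should accept and reject exactly the same sample strings.
agree = all(
    bool(duplicated.fullmatch(s)) == bool(conditional.fullmatch(s))
    for s in samples
)
print("patterns agree on all samples:", agree)
```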
msg40403 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-04-20 06:23
I like the patch in principle, but I have a number of
additional concerns:
- there are no test cases
- the feature is declared experimental in perlre(1). Why?
- Shouldn't there be a semantic restriction that the back
  reference is only allowed if it points to a group that is
  known to precede it? I.e. is

  (X)|(?(1)Y)

  valid? If not, the restriction should at least be documented,
  but if possible, it should also be implemented.
msg40404 - (view) Author: Gustavo Niemeyer (niemeyer) * (Python committer) Date: 2003-04-20 08:12
About the test cases, they're missing indeed. I can write
some while applying the patch.

About being experimental, IIRC, it has been listed as
experimental in the Perl documentation for several years,
and will probably stay that way forever. :-) Anyway, IMO
this shouldn't affect our evaluation of the importance of
that feature for Python's sre.

About the semantic restriction, do you mean checking whether the
backreference is lower than the current group? Should be
doable. OTOH, I don't understand your example. In
"(X)|(?(1)Y)", there's no sense in using (?(1), as it will
always be false.
msg40405 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-04-20 21:57
Exactly: my example makes no sense. It will always be false,
since the reference is to an alternative that cannot
simultaneously be taken. Therefore, I think this should be
an error.
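The behaviour under discussion can be checked with a short sketch, assuming a current Python `re` module, where this pattern compiles without an error. Inside the second alternative group 1 is never set, so the empty "no" branch is always taken and "Y" is never matched.

```python
import re

pattern = re.compile(r"(X)|(?(1)Y)")

# First alternative: group 1 captures "X".
m_x = pattern.fullmatch("X")

# Second alternative: group 1 is unset, so the condition is false and the
# (empty) "no" branch is used; the literal "Y" is never consumed.
m_y = pattern.fullmatch("Y")
m_empty = pattern.fullmatch("")

print("X     ->", m_x)
print("Y     ->", m_y)
print("empty ->", m_empty)
```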
msg40406 - (view) Author: Gustavo Niemeyer (niemeyer) * (Python committer) Date: 2003-04-20 22:09
I see. I'll try to improve the patch with your suggestions
as soon as I get some time to work on it. Thanks for your
support.
msg40407 - (view) Author: Gustavo Niemeyer (niemeyer) * (Python committer) Date: 2003-06-14 03:52
Martin, I've checked your concern about making "(X)|(?(1)Y)"
an error, and unfortunately the current framework doesn't
keep enough state information to catch this. Notice
that such a check is not implemented for very similar cases either, like
"(X)|\1", which does exactly the same thing as "(X)|(?(1)X)".

I'll be applying that patch as soon as I check it against
the current HEAD, and implement some tests (and before it
completes its first year of life 8-).

Thanks!
msg40408 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-06-14 05:15
Please don't apply the patch before 2.3; this is in beta
now, so no new features are allowed (unless you get BDFL
permission, of course).
msg40409 - (view) Author: Gustavo Niemeyer (niemeyer) * (Python committer) Date: 2003-06-14 05:37
Ack!! I'm not going to ask Guido if you believe it's not
worth it for 2.3.

I'm attaching a new version of the patch, updated to the
current HEAD, and including tests.

Thanks for your attention!
msg40410 - (view) Author: Gustavo Niemeyer (niemeyer) * (Python committer) Date: 2003-10-19 14:23
Committed with patch #757624.
History
Date User Action Args
2022-04-10 16:05:26 admin set github: 36790
2002-06-24 01:41:31 niemeyer create