classification
Title: Wrong expression with \w+?
Type:
Stage:
Components: Regular Expressions
Versions: Python 2.4
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To: niemeyer
Nosy List: effbot, engel_re, niemeyer
Priority: normal
Keywords:

Created on 2005-01-18 16:49 by engel_re, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (3)
msg23975 - Author: rengel (engel_re) Date: 2005-01-18 16:49
import re

str = 'match the url www.junit.org with following regex'
regex = re.compile('(www\.\w+?\.\w+?)')
print regex.sub('<span class="url">\\1</span>', str)
# It produces
match the url <span class="url">www.junit.o</span>rg with following regex
# It should produce
match the url <span class="url">www.junit.org</span> with following regex

msg23976 - Author: Fredrik Lundh (effbot) * (Python committer) Date: 2005-01-18 17:23

No, it shouldn't.  "+?" means the shortest possible match 
that's one character or more.  If you want the longest 
possible match, get rid of the "?".

(in this case, I'd use "(www[.\w]*)")

</F>
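
The difference between the three patterns discussed above can be checked directly. This is a minimal sketch in Python 3 syntax (the report itself targets Python 2.4); the variable names `lazy`, `greedy`, `char_class`, and `template` are chosen here for illustration and do not come from the report.

```python
import re

text = 'match the url www.junit.org with following regex'
template = r'<span class="url">\1</span>'

# Pattern from the report: both quantifiers are non-greedy ("+?").
lazy = re.compile(r'(www\.\w+?\.\w+?)')
# Same pattern with the "?" removed, so each "\w+" takes as much as it can.
greedy = re.compile(r'(www\.\w+\.\w+)')
# The alternative suggested in the reply: dots and word characters after "www".
char_class = re.compile(r'(www[.\w]*)')

lazy_result = lazy.sub(template, text)
greedy_result = greedy.sub(template, text)
class_result = char_class.sub(template, text)

print(lazy_result)    # the trailing \w+? stops after a single character
print(greedy_result)  # the whole host name is wrapped
print(class_result)   # same result for this input
```

Note that `[.\w]*` would also swallow a dot that ends a sentence right after the URL, so which pattern is preferable depends on the input.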
msg23977 - Author: Gustavo Niemeyer (niemeyer) * (Python committer) Date: 2005-01-18 17:25

There's nothing wrong with this result. You asked for a non-greedy match 
(you've used '\w+?', not '\w+'), and SRE gave you the minimum possible 
match. 
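
"Minimum possible match" means the shortest match that still lets the rest of the pattern succeed. The sketch below, again in Python 3 syntax and with illustrative names (`host`, `unanchored`, `anchored`) that are not part of the report, shows a non-greedy group growing only when something after it forces it to.

```python
import re

host = 'www.junit.org'

# The first \w+? has to grow to "junit" because a literal dot must follow it.
# Nothing follows the second \w+?, so it stops after one character.
unanchored = re.match(r'www\.(\w+?)\.(\w+?)', host)
unanchored_groups = unanchored.groups()

# A trailing \b (word boundary) forces the second group to run to the end
# of the word, even though the quantifier is still non-greedy.
anchored = re.match(r'www\.(\w+?)\.(\w+?)\b', host)
anchored_groups = anchored.groups()

print(unanchored_groups)
print(anchored_groups)
```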
History
Date                 User      Action  Args
2022-04-11 14:56:09  admin     set     github: 41458
2005-01-18 16:49:01  engel_re  create