classification
Title: Can't exclude words before capture group
Components: Regular Expressions
Versions: Python 2.4

process
Status: closed
Resolution: not a bug
Assigned To: niemeyer
Nosy List: ctimmerman, georg.brandl, niemeyer
Priority: normal

Created on 2006-11-15 14:27 by ctimmerman, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (4)
msg30553 - Author: Cees Timmerman (ctimmerman) Date: 2006-11-15 14:27
Python 2.4.3 (#2, Oct  6 2006, 07:52:30)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2

Tried:

>>> import re
>>> re.findall(r'(?!def)\b(\S+)\(', "def bla(): dof blu()")

>>> re.findall(r'(?:def){0}\b(\S+)\(', "def bla(): dof blu()")

Result:

['bla', 'blu']

Expected:

['blu']


Why doesn't (?!...) work here the way it does in this case?

>>> re.findall(r'\b(\S+): (?!bad)', "bob: bad; suzy: good")
['suzy']


Wouldn't it be nice if (^...) worked?

>>> re.findall(r'\b(\S+): (^bad)', "bob: bad; suzy: good")
[]

[^(...)] does, sort of, but it also fails before a capture group:

>>> re.findall(r'\b(\S+): [^(bad)]', "bob: bad; suzy: good")
['suzy']
>>> re.findall(r'[^(def)]\b(\S+)\(', "def bla(): dof blu()")
['bla', 'blu']
>>> re.findall(r'[^(def)] (\S+)\(', "def bla(): dof blu()")
[]
>>> re.findall(r'(^def) (\S+)\(', "def bla(): dof blu()")
[('def', 'bla')]
msg30554 - Author: Georg Brandl (georg.brandl) (Python committer) Date: 2006-11-15 17:20
What you want is
>>> re.findall(r'(?<!def)\s(\S+)\(', "def bla(): dof blu()")

\b doesn't match a space. It is a zero-width assertion that matches at a word boundary (the start or end of a word) and consumes no characters. And to do a lookbehind assertion, use (?<[!]...)
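
To make the difference concrete, here is a minimal sketch that runs the reporter's lookahead pattern and the suggested lookbehind pattern against the same input. The variable names are only for illustration. A lookahead is tested against the text that follows the current position, so at the start of "bla" it sees "bla(" and passes; the lookbehind is tested against the three characters just before the whitespace.

```python
import re

text = "def bla(): dof blu()"

# Negative lookahead: evaluated where the name starts, looking forward.
# At index 4 the upcoming text is "bla(", not "def", so it passes.
lookahead = re.findall(r'(?!def)\b(\S+)\(', text)

# Negative lookbehind: evaluated at the whitespace, looking backward.
# The space at index 3 is preceded by "def", so that match is rejected.
lookbehind = re.findall(r'(?<!def)\s(\S+)\(', text)

# \b consumes nothing; it only marks positions between word and
# non-word characters (or the string edges).
boundaries = [m.start() for m in re.finditer(r'\b', "def bla")]

print(lookahead)   # ['bla', 'blu']
print(lookbehind)  # ['blu']
print(boundaries)  # [0, 3, 4, 7]
```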

Why on earth do you expect that (^...) works?
[^(bad)] is something entirely different, it's a character class excluding (, b, a, d and ).
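
A short sketch of that point, using a made-up second input ("bob: dog; ...") that is not from the report: the negated class rejects any single character from the set, so it also rejects words that merely start with b, a or d, while the lookahead only rejects the literal word "bad".

```python
import re

# [^(bad)] matches exactly ONE character that is not "(", "b", "a", "d" or ")".
# The order of characters inside the brackets is irrelevant.
sample = "bad (dab) xyz"
one_char = re.findall(r'[^(bad)]', sample)
reordered = re.findall(r'[^)dab(]', sample)

# Hypothetical input: "dog" is not "bad", but it starts with "d".
text = "bob: dog; suzy: good"
with_class = re.findall(r'\b(\S+): [^(bad)]', text)
with_lookahead = re.findall(r'\b(\S+): (?!bad)', text)

print(one_char)        # [' ', ' ', 'x', 'y', 'z']
print(with_class)      # ['suzy']
print(with_lookahead)  # ['bob', 'suzy']
```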
msg30555 - Author: Cees Timmerman (ctimmerman) Date: 2006-11-17 10:50
I tried (^ because [^ works. (^ doesn't seem to do anything. To match ^ inside () you need to use (\^), anyway.
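
For reference, a small sketch of what an unescaped ^ means in each position. The "2^10" input is an illustrative assumption, not taken from the report. Inside a group, ^ is still the start-of-string anchor, which is why (^def) matched at the beginning of the string and (^bad) could never match after ": ".

```python
import re

text = "def bla(): dof blu()"

# Inside a group, ^ still anchors to the start of the string.
anchored = re.findall(r'(^def) (\S+)\(', text)

# Here ^ would have to hold after ": " has been consumed, which is
# never the start of the string, so nothing can match.
never = re.findall(r'\b(\S+): (^bad)', "bob: bad; suzy: good")

# Escaped, ^ is a literal caret.
literal = re.findall(r'(\^)', "2^10")

# Only as the first character of a character class does ^ mean "not".
negated = re.findall(r'[^0-9]', "2^10")

print(anchored)  # [('def', 'bla')]
print(never)     # []
print(literal)   # ['^']
print(negated)   # ['^']
```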
msg30556 - Author: Cees Timmerman (ctimmerman) Date: 2006-11-17 11:35
By the way, thanks for the explanation. I think you meant (?<[!=]...)

My final pattern:
>>> re.findall(r'(?<!def)[\W]+([.\w]+)\(', "def bla(): [ryawry.aet().blu()]")
['ryawry.aet', 'blu']
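
One caveat with this pattern, shown as a hedged sketch: the lookbehind has a fixed width and only inspects the three characters immediately before the run of non-word characters, so a second space after "def" lets "bla" through. The CALLS pattern and the called_names helper below are hypothetical alternatives, not something proposed in this thread; they capture an optional "def" prefix and filter on it afterwards.

```python
import re

# The final pattern from the message above.
FINAL = re.compile(r'(?<!def)[\W]+([.\w]+)\(')

# Hypothetical alternative: capture an optional "def" prefix, then filter.
CALLS = re.compile(r'\b(def\s+)?([.\w]+)\(')

def called_names(source):
    """Return names followed by "(" that are not introduced by "def"."""
    return [name for prefix, name in CALLS.findall(source) if not prefix]

one_space = "def bla(): [ryawry.aet().blu()]"
two_spaces = "def  bla(): [ryawry.aet().blu()]"

print(FINAL.findall(one_space))    # ['ryawry.aet', 'blu']
print(FINAL.findall(two_spaces))   # ['bla', 'ryawry.aet', 'blu']
print(called_names(one_space))     # ['ryawry.aet', 'blu']
print(called_names(two_spaces))    # ['ryawry.aet', 'blu']
```

This is still only a regex heuristic; it does not understand strings, comments or other Python syntax.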

History
Date                 User        Action  Args
2022-04-11 14:56:21  admin       set     github: 44234
2006-11-15 14:27:41  ctimmerman  create