I ran into the following problem while trying to parse an MS Outlook
mailbox. Cut down to its bare essentials:
> cat tst.py
import re
mstr = (11000*' ') + 'A'
pattern = re.compile('.*?A')
pattern.search(mstr)
> python tst.py
Traceback (most recent call last):
  File "tst.py", line 5, in ?
    pattern.search(mstr)
RuntimeError: maximum recursion limit exceeded
> python
Python 2.2.1c1 (#6, Jul 20 2002, 09:40:07)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
The combination of a longish string with the non-greedy ".*?" triggers
the error. Using the greedy ".*" is ok.
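A possible workaround, as a sketch only: since the greedy ".*" works, a
greedy negated character class may sidestep the limit while still stopping
at the first 'A'. The pattern '[^A\n]*A' and the helper name first_up_to()
are my own illustration, not something from the sre sources, and I have not
checked this against every Python version.

```python
import re

mstr = (11000 * ' ') + 'A'

# Non-greedy form that triggers the error in the report above.
lazy = re.compile('.*?A')

# Assumed alternative: greedy repeat over "anything but 'A' or newline".
# The newline is excluded because '.' does not match it by default.
greedy = re.compile(r'[^A\n]*A')

def first_up_to(text, pattern=greedy):
    """Return the text matched up to and including the first 'A', or None."""
    match = pattern.search(text)
    if match is None:
        return None
    return match.group()
```

For the test string both patterns should select the same span wherever the
non-greedy one runs at all; the difference is only in how the engine gets
there.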
Could "non-greedy" matching be implemented non-recursively?
If I understand correctly, the limit being exceeded is
USE_RECURSION_LIMIT in Modules/_sre.c. This is slightly confusing,
because there is also the Python recursion limit: my first reaction
was to raise it with sys.setrecursionlimit(), but that of course
didn't help.
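To make the distinction between the two limits concrete, here is a small
check that runs the same search under different Python recursion limits.
The helper name search_with_limit() is hypothetical. On an affected
interpreter I would expect it to fail regardless of the limit passed in,
which is what points at the separate limit inside _sre.c; on an interpreter
without the problem it should simply succeed.

```python
import re
import sys

def search_with_limit(limit, length=11000):
    """Run the non-greedy search with the Python recursion limit set to
    `limit`. Return True if the search completed, False if it raised
    RuntimeError. The previous limit is always restored."""
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        re.compile('.*?A').search((length * ' ') + 'A')
        return True
    except RuntimeError:
        return False
    finally:
        sys.setrecursionlimit(old_limit)

if __name__ == '__main__':
    for limit in (sys.getrecursionlimit(), 20000):
        print(limit, search_with_limit(limit))
```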