classification
Title: non greedy match bug
Type:
Stage:
Components: Regular Expressions
Versions: Python 2.3
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: niemeyer
Nosy List: brett.cannon, niemeyer, rjroy
Priority: normal
Keywords:

Created on 2002-08-30 14:44 by rjroy, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Files
File name | Uploaded | Description
reference.pdf | rjroy, 2002-08-30 14:44 | file that will trigger error
Messages (4)
msg12215 - Author: Robert Roy (rjroy) Date: 2002-08-30 14:44
When using the following re to extract all objects from a 
PDF file, I get a maximum recursion limit exceeded error.

Attached is a pdf file that will reproduce the error.

If I do import pre as re, it works fine.

platform is Win2k, Python 2.2.1 build #34

#######
import re

GETOBJECT = re.compile(r'\d+\s+\d+\s+obj.+?endobj', 
re.I|re.S|re.M)

pdf = open('userguide.pdf', 'rb').read()
all = GETOBJECT.findall(pdf)
print len(all)
msg12216 - Author: Robert Roy (rjroy) Date: 2003-02-14 18:56
The maximum recursion limit problem in the re module is well known.
Until this limitation in the implementation is removed, see the
following for workarounds:

http://www.python.org/dev/doc/devel/lib/module-re.html
http://python.org/sf/493252
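
One way to sidestep the non-greedy `.+?` entirely is to locate the object header and the `endobj` terminator with two separate searches. The following is only a minimal sketch in Python 3 syntax, not the workaround described at the links above; the names `OBJ_HEADER`, `OBJ_END`, and `find_objects` are hypothetical and were chosen for illustration.

```python
import re

# The original GETOBJECT pattern split into two simple pieces (hypothetical names).
OBJ_HEADER = re.compile(rb'\d+\s+\d+\s+obj', re.I)
OBJ_END = re.compile(rb'endobj', re.I)


def find_objects(data):
    """Return each 'N N obj ... endobj' span found in data, left to right."""
    objects = []
    pos = 0
    while True:
        start = OBJ_HEADER.search(data, pos)
        if start is None:
            break
        # '.+?' requires at least one character between 'obj' and 'endobj',
        # so the terminator search begins one byte past the header.
        end = OBJ_END.search(data, start.end() + 1)
        if end is None:
            break
        objects.append(data[start.start():end.end()])
        pos = end.end()
    return objects


sample = b"1 0 obj\n<< /Type /Catalog >>\nendobj\n2 0 obj\n<< >>\nendobj\n"
print(len(find_objects(sample)))
```

Because no single pattern has to step through the object body one character at a time, the length of the body does not affect how deeply the matcher has to go.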
msg12217 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2003-05-21 05:54
Closing this since hitting the recursion limit is not a bug.
msg12218 - Author: Gustavo Niemeyer (niemeyer) * (Python committer) Date: 2003-05-24 16:52
As Gary Herron correctly pointed out to me, this was fixed in
2.3 with the introduction of a new opcode to handle
single-character non-greedy matching.

This won't be fixed in 2.2.3, but hopefully will be
backported to 2.2.4 together with other regular expression
fixes.
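
A quick way to check whether an interpreter is affected is to run the reported pattern against a large synthetic object body. The sketch below assumes a current Python 3 interpreter and uses made-up data rather than the attached PDF; `count_objects` is a hypothetical helper name.

```python
import re

# Same pattern as in the report, written for bytes input (Python 3).
GETOBJECT = re.compile(rb'\d+\s+\d+\s+obj.+?endobj', re.I | re.S | re.M)


def count_objects(data):
    """Count 'N N obj ... endobj' spans matched by the reported pattern."""
    return len(GETOBJECT.findall(data))


# Synthetic stand-in for a PDF: one object with a very long body,
# followed by one short object.
big_body = b'x' * 200_000
sample = b'1 0 obj\n' + big_body + b'\nendobj\n2 0 obj\n<< >>\nendobj\n'
print(count_objects(sample))
```

On an interpreter with the fix this finishes without a recursion error and reports two objects.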
History
Date | User | Action | Args
2022-04-10 16:05:38 | admin | set | github: 37117
2002-08-30 14:44:20 | rjroy | create |