classification
Title: Python crashes in pyexpat.c if malformed XML is parsed
Type:
Stage:
Components: Extension Modules
Versions: Python 2.3
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: nnorwitz
Nosy List: fdrake, nnorwitz, pdecat
Priority: high
Keywords: patch

Created on 2005-03-31 10:59 by pdecat, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name | Uploaded | Description
pyexpat-patches.txt | pdecat, 2005-03-31 10:59 | Patches for r231, r232, r233, r234, r235 (pyexpat.c-2.83) and HEAD (pyexpat.c-2.90)
Messages (6)
msg48099 - (view) Author: pdecat (pdecat) Date: 2005-03-31 10:59
If a malformed XML file (containing non-Unicode
characters) is parsed with pyexpat, Python crashes.

More details on request.
msg48100 - (view) Author: pdecat (pdecat) Date: 2005-03-31 12:18
Maybe security related, as it can lead to denial of service:
it crashes a Zope server using the ParsedXML product simply
by uploading a malformed XML file.
msg48101 - (view) Author: pdecat (pdecat) Date: 2005-03-31 13:00
STRING_CONV_FUNC returns NULL if the string contains
characters that are neither ASCII nor valid Unicode.
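
As a rough illustration (not taken from the patch), the failure described here is a string conversion that cannot produce a Unicode object. A C conversion function signals that by returning NULL; the closest Python-level analogue is a strict decode raising `UnicodeDecodeError`. The helper name `try_decode` and the sample bytes are hypothetical, and the assumption that the conversion is a strict UTF-8 decode is mine, not something stated in this report.

```python
def try_decode(raw: bytes):
    """Hypothetical helper: return the decoded text, or None on failure.

    Returning None stands in for a C conversion function returning NULL.
    """
    try:
        return raw.decode("utf-8", "strict")
    except UnicodeDecodeError:
        return None


ok = try_decode(b"caf\xc3\xa9")   # valid UTF-8
bad = try_decode(b"caf\xe9\xff")  # neither ASCII nor valid UTF-8

# The caller has to check for the failure value before using the result.
for value in (ok, bad):
    if value is None:
        print("conversion failed")
    else:
        print(len(value))
```

The point of the sketch is only the last loop: a result that may be a failure marker must be tested before it is used.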
msg48102 - (view) Author: Fred Drake (fdrake) (Python committer) Date: 2005-08-23 03:59
I realize this has sat too long; sorry.

Can you send an example XML file for which this crashes for you?

Do you let Expat determine and handle the encoding itself,
or do you override the detected encoding when you create the
parser?
msg48103 - (view) Author: pdecat (pdecat) Date: 2005-08-23 10:29
Sorry, I don't have the malformed XML file anymore. I've
tried and failed to reproduce the problem by hand.

The ParsedXML product lets expat determine the encoding
itself:
    def createParser(self):
        """Create a new parser object."""
        return expat.ParserCreate()

If you have a look at my patch, it's a really simple one-liner
(ok, two ;) that checks that the return value of
STRING_CONV_FUNC is not NULL before using it.
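
For anyone trying to reproduce this, a minimal sketch of a test harness might look like the following. It creates the parser the same way as `createParser` above (no encoding override) and feeds it bytes that are not valid UTF-8. The helper name `parse_bytes` and the input bytes are made up for illustration; they are not the original file, which is lost. On an interpreter that includes the fix, the malformed input would be expected to surface as an `expat.ExpatError` rather than a crash.

```python
from xml.parsers import expat


def parse_bytes(data: bytes):
    """Hypothetical helper: return None on success, or the error message."""
    parser = expat.ParserCreate()  # no encoding override, as in createParser
    try:
        parser.Parse(data, True)   # True marks this as the final chunk
    except expat.ExpatError as exc:
        return str(exc)
    return None


good = b"<doc>hello</doc>"
bad = b"<doc>\xff\xfe</doc>"  # bytes that are not valid UTF-8

print(parse_bytes(good))
print(parse_bytes(bad))
```

This only exercises the document body; the crash reported here may equally have involved element or attribute names passed to handlers, so a fuller reproduction would also register handlers on the parser.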
msg48104 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-01-10 07:14
A similar patch was committed to pyexpat.c in 39631 from
Patch #1309009.  Thanks.
History
Date | User | Action | Args
2022-04-11 14:56:10 | admin | set | github: 41782
2005-03-31 10:59:05 | pdecat | create |