classification
Title: Incomplete list of escape sequences
Type:
Stage:
Components: XML
Versions:
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: comcol, fdrake, loewis
Priority: normal
Keywords:

Created on 2002-03-06 13:55 by comcol, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Files
File name Uploaded Description
err.py comcol, 2002-03-06 13:55 Code which reproduces error
Messages (3)
msg9545 - Author: Mark Carter (comcol) Date: 2002-03-06 13:55
There exist some special character tags (i.e. beginning with &) that
cause exceptions in minidom, and probably in other modules too.
Example below:

=== PYTHON CODE ===
import xml.dom.minidom 

def do(text):
	print "Processing: ", text
	dom = xml.dom.minidom.parseString(text)
	print "... ok"

do("<body> this is ok </body>") #ok
do("<body> &lt; &gt; &amp; &quot; </body>") #ok
do("<body> &pound;  </body>") # error


=== STDOUT ===
Processing:  <body> this is ok </body>
... ok
Processing:  <body> &lt; &gt; &amp; &quot; </body>
... ok
Processing:  <body> &pound;  </body>

=== STDERR ===
Traceback (most recent call last):
  File "err.py", line 10, in ?
    do("<body> &pound;  </body>") # exception
  File "err.py", line 5, in do
    dom = xml.dom.minidom.parseString(text)
  File "C:\PYTHON22\lib\xml\dom\minidom.py", line 965, in parseString
    return _doparse(pulldom.parseString, args, kwargs)
  File "C:\PYTHON22\lib\xml\dom\minidom.py", line 952, in _doparse
    toktype, rootNode = events.getEvent()
  File "C:\PYTHON22\lib\xml\dom\pulldom.py", line 255, in getEvent
    self.parser.feed(buf)
  File "C:\PYTHON22\lib\xml\sax\expatreader.py", line 111, in feed
    self._err_handler.fatalError(exc)
  File "C:\PYTHON22\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: <unknown>:1:7: undefined entity

=== COMMENTS ===
It is also my observation that the translations of special character
tags (aka HTML escape sequences) are scattered "hither and thither"
throughout the modules in the XML subdirectory, and that it would be
better if they were all put in one place.

I might be persuaded to help with the maintenance work that this
would require!
msg9546 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2002-03-06 21:58
I cannot understand the problem. The parser rightly
complains about &pound; - this is not one of the predefined
entities of XML. Please refer to the XML spec; only amp, lt,
gt, apos, and quot are predefined in XML. Any other entity
must be declared in a DTD; an undeclared one should and
does produce an error.
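
To illustrate the point, here is a minimal sketch. It assumes a current Python 3, where minidom reports the failure as xml.parsers.expat.ExpatError rather than the SAXParseException seen in the Python 2.2 traceback above. The helper name parsed_text is made up for this example. It shows two ways to get a pound sign that the parser accepts: a numeric character reference, and a declaration in an internal DTD subset.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def parsed_text(xml_text):
    """Parse xml_text and return the text content of the root element."""
    doc = xml.dom.minidom.parseString(xml_text)
    return "".join(node.data for node in doc.documentElement.childNodes)


# Names predefined by the XML spec need no declaration.
predefined = parsed_text("<body>&lt; &gt; &amp; &apos; &quot;</body>")

# A numeric character reference needs no declaration either (163 is U+00A3).
numeric = parsed_text("<body>&#163;</body>")

# Declaring the entity in an internal DTD subset makes the name legal.
declared = parsed_text(
    '<!DOCTYPE body [<!ENTITY pound "&#163;">]><body>&pound;</body>'
)

# Without a declaration the parser rejects the reference.
try:
    parsed_text("<body>&pound;</body>")
    undeclared_error = None
except ExpatError as exc:
    undeclared_error = str(exc)

print(repr(predefined), repr(numeric), repr(declared), undeclared_error)
```

The numeric reference is the simplest fix when the document is generated by hand; the DOCTYPE route keeps the readable name at the cost of carrying the declaration in every document.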
msg9547 - Author: Fred Drake (fdrake) (Python committer) Date: 2002-06-17 15:25
"&pound;" is not defined in the XML spec., as Martin points out.

The comment about "special character tags" at the end of the
initial report is not clear; if you still feel that there is
a problem there, please open a new bug report and be
specific.  Examples would be helpful.
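
For input that uses HTML-style names such as &pound;, one possible workaround is to rewrite those names as numeric character references before handing the text to the XML parser. The sketch below assumes Python 3, where the HTML 4 names are collected in html.entities.name2codepoint; the function name html_names_to_numeric and the regular expression are hypothetical choices for this example, and the naive pattern would also rewrite matches inside CDATA sections or comments.

```python
import re
import xml.dom.minidom
from html.entities import name2codepoint

# Names XML itself predefines; these are left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# Naive pattern for a named reference such as &pound;
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def html_names_to_numeric(text):
    """Replace HTML named references (e.g. &pound;) with numeric ones."""

    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave it for the XML parser to judge
        return "&#%d;" % name2codepoint[name]

    return _ENTITY_RE.sub(replace, text)


converted = html_names_to_numeric("<body> &pound; &amp; &nosuch; </body>")

# The rewritten text from the original report now parses.
doc = xml.dom.minidom.parseString(html_names_to_numeric("<body> &pound; </body>"))
pound_text = doc.documentElement.firstChild.data

print(converted)
print(repr(pound_text))
```

Unknown names such as &nosuch; are deliberately passed through, so the parser still reports them as undefined instead of having them silently dropped.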
History
Date User Action Args
2022-04-10 16:05:04 admin set github: 36212
2002-03-06 13:55:21 comcol create