Issue549725
Created on 2002-04-28 08:41 by smurf, last changed 2022-04-10 16:05 by admin. This issue is now closed.
Messages (5) | |||
---|---|---|---|
msg10598 - (view) | Author: Matthias Urlichs (smurf) | Date: 2002-04-28 08:41 | |
>>> import sys
>>> from sxml.xml2py import parseFile  # this is a simple wrapper to xml.dom.minidom.parse()
>>> x=parseFile(sys.stdin)
<?xml version="1.0" ?>
<foo><![CDATA[dies ist
ein bar
]]></foo>
^D
>>> x
[<DOM Element: foo at 1076384172>]
>>> x.childNodes[0].childNodes
[<DOM Text node "dies ist">, <DOM Text node "\n">, <DOM Text node "ein bar">, <DOM Text node "\n">]
>>>

I was expecting a CDATASection node here. (In fact, my code would like to depend on it.) |
|||
msg10599 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2002-04-28 13:59 | |
What is sxml? Why is this a bug in Python? Notice that the use of CDATA in the DOM is completely optional - the DOM tree represents your document correctly. Code relying on CDATA is broken. |
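The point that the tree represents the document correctly either way can be illustrated with a minimal sketch. It reads an element's character data without caring whether the builder produced Text nodes, CDATASection nodes, or a mix of both. The helper name `text_content` is hypothetical and not part of minidom, and how the content is split into nodes depends on the Python version and builder in use.

```python
from xml.dom import Node, minidom

# Same document as in the report, supplied as a string instead of stdin.
SOURCE = '<?xml version="1.0" ?>\n<foo><![CDATA[dies ist\nein bar\n]]></foo>'


def text_content(element):
    """Concatenate the character data of all direct Text/CDATASection children.

    Hypothetical helper: it deliberately ignores how the builder split the
    content into nodes, so it does not depend on CDATA boundaries.
    """
    wanted = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    return "".join(
        child.data for child in element.childNodes if child.nodeType in wanted
    )


doc = minidom.parseString(SOURCE)
foo = doc.documentElement

# The number and type of child nodes may differ between builders.
for child in foo.childNodes:
    print(child.nodeType, repr(child.data))

content = text_content(foo)
print(repr(content))
```

Code written this way keeps working whether or not the parser reports CDATA section boundaries.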
|||
msg10600 - (view) | Author: Matthias Urlichs (smurf) | Date: 2002-04-28 18:55 | |
SXML is a project of mine. As I said, it's just a simple wrapper for minidom. Why should CDATA handling be optional? It seems that it should be _easier_ to package the string into one CDATASection element. Instead, four Text elements are used -- the first line, the first linefeed, the second line, and the second linefeed. It's additional effort, and I'd like to turn it off if I don't want it. |
|||
msg10601 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2002-04-28 19:57 | |
Support for CDATA sections is optional because the DOM spec says so; see http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/core.html#ID-E067D597

Notice that the DOM spec itself is silent on the issue of building DOM trees. In DOM Level 3, there is a feature to control whether CDATA sections are created or not, but minidom is not targeted at DOM Level 3 (and DOM Level 3 is not completed).

The DOM tree is built from the information produced by the XML parser, which happens to be Expat. This parser, in turn, does not support reporting CDATA section boundaries. You could try to use a different XML parser. Notice that the minidom builder uses the SAX API, which supports reporting of CDATA section boundaries only as an option, as well. So you'd need not only a different parser, but also a different DOM builder.

If you absolutely need this functionality, you can use 4DOM with xmlproc, from PyXML.

If you don't like several subsequent Text nodes, you can use the DOM element .normalize method to merge them. Notice that .normalize would not merge CDATA sections.

In any case, this is clearly not a bug in minidom. |
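The .normalize suggestion can be sketched as follows. This is a minimal illustration, not the reporter's code: it builds the four Text nodes by hand with the standard minidom `createTextNode` call rather than parsing, so the result does not depend on how a particular parser splits the content. The element names `foo` and `bar` are only for illustration.

```python
from xml.dom import minidom

doc = minidom.Document()
foo = doc.createElement("foo")
doc.appendChild(foo)

# Recreate the four adjacent Text nodes shown in the original report.
for piece in ["dies ist", "\n", "ein bar", "\n"]:
    foo.appendChild(doc.createTextNode(piece))

count_before = len(foo.childNodes)

# normalize() merges runs of adjacent Text nodes into a single node.
foo.normalize()
merged = [child.data for child in foo.childNodes]

# Adjacent CDATASection nodes are left alone by normalize().
bar = doc.createElement("bar")
bar.appendChild(doc.createCDATASection("dies ist\n"))
bar.appendChild(doc.createCDATASection("ein bar\n"))
bar.normalize()
cdata_count = len(bar.childNodes)

print(count_before, merged, cdata_count)
```

After normalization, `foo` holds one Text node with the full string, while `bar` still holds two separate CDATASection nodes.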
|||
msg10602 - (view) | Author: Matthias Urlichs (smurf) | Date: 2002-04-28 20:03 | |
Oh well... I'm therefore closing this bug. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:05:16 | admin | set | github: 36512 |
2002-04-28 08:41:56 | smurf | create |