This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: minidom.py writes TEXT_NODE wrong
Type: Stage:
Components: XML Versions: Python 2.3
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: bpreusing, loewis
Priority: normal Keywords:

Created on 2004-04-18 12:11 by bpreusing, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (4)
msg20537 - (view) Author: Bernd Preusing (bpreusing) Date: 2004-04-18 12:11
Hi,

if I read in a

<tag>value</tag>

it is written with indent and line feeds like this:
domdoc.writexml(of, "", "\t", "\n", self.encoding)

<tag>
     value
</tag>

This behaviour destroys the value, since white space
and line feed belong to the value after the next
reading.

I could circumvent this with strip(), but every XML
validator raises an error, if value is an enumeration
or boolean.

CDATA has a similar problem.

Thanks,
  Bernd
msg20538 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-04-20 19:53
Logged In: YES 
user_id=21627

This is not a bug. Passing white-space arguments to writexml
is specifically added for the purpose of pretty-printing.
This, by design, will add whitespace to the output document
that was not present in the input document.

If you don't want pretty-printing, just don't pass these
additional parameters.

>>> import xml.dom.minidom
>>> d=xml.dom.minidom.parseString("<tag>value</tag>")
>>> import sys
>>> d.writexml(sys.stdout)
<?xml version="1.0" ?>
<tag>value</tag>>>>
>>> d.writexml(sys.stdout);print
<?xml version="1.0" ?>
<tag>value</tag>

msg20539 - (view) Author: Bernd Preusing (bpreusing) Date: 2004-04-21 08:35
Logged In: YES 
user_id=879395

So I should put a 5 megabyte XML file into one single line?
You are kidding!

White-space between tags is only relevant between end node
tags (text or CDATA).
I have a modified myminidom.py now, which tests if the child is 
a text node. So the output looks very pretty like this:

<element>
  <tag>value</tag>
</element>

msg20540 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-04-21 21:33
Logged In: YES 
user_id=21627

If you want line breaks at certain points where pretty
printing would not normally put them, you either need to
traverse the tree yourself and put out XML in the form you
like, or you can add explicit text nodes to the tree where
you think they belong.

Alternatively, you can, of course, modify minidom and use
the modified implementation instead. 

I personally see no problem with having an XML file of 5 MB
with no line breaks. Python will parse such a file just as
efficiently as it parses a file with line breaks; most
likely, all other XML applications have no problems with
that, either.
History
Date User Action Args
2022-04-11 14:56:03adminsetgithub: 40167
2004-04-18 12:11:01bpreusingcreate