classification
Title: xml parser bug
Type: Stage:
Components: XML Versions: Python 2.3
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: dtefft, loewis
Priority: normal Keywords:

Created on 2004-04-22 19:35 by dtefft, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
xml_parse_blast.py dtefft, 2004-04-22 19:38
failed_xml.txt dtefft, 2004-04-22 19:38
Messages (4)
msg20560 - (view) Author: David Tefft (dtefft) Date: 2004-04-22 19:35
I am using minidom to parse an XML file.  When I run
the script on a Linux machine, it truncates a
string.  When I run it on a Mac running OS X, it
behaves the way I expect.  Has anyone else encountered this
problem?  My suspicion is that the difference is due to
the 32- vs. 64-bit processors.

Dave

PS I would attach the xml file but it is quite large.
msg20561 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-04-24 08:16
I ran the script on Linux, as "python xml_parse_blast.py
failed_xml.txt", and it produced no output. This is because
len(qseq) == len(midline) in all cases.

So what is the expected output of the script?
msg20562 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-04-24 08:20
As a follow-up, I notice a bug in your script. To access the
content of an element, you do .firstChild.nodeValue. This is
incorrect: The content could be split over multiple text nodes.

Older Python versions indeed did split element content over
multiple text nodes. To obtain the text content of an
element, you need to iterate over all children, find out
which of them are text nodes, and concatenate their node values.
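
The iteration described above could look roughly like the following. This is a minimal sketch, not the reporter's script: the helper name get_text and the element name "qseq" are hypothetical, and the two text nodes are appended by hand only to simulate content that a parser has split.

```python
from xml.dom import minidom, Node


def get_text(element):
    """Concatenate the values of all text-node children of element."""
    parts = []
    for child in element.childNodes:
        # Skip comments, processing instructions, child elements, etc.
        if child.nodeType == Node.TEXT_NODE:
            parts.append(child.nodeValue)
    return "".join(parts)


# Hypothetical element whose content is spread over two text nodes.
doc = minidom.parseString("<qseq/>")
elem = doc.documentElement
elem.appendChild(doc.createTextNode("ACGT"))
elem.appendChild(doc.createTextNode("TTGA"))

first_only = elem.firstChild.nodeValue  # sees only the first text node
full_text = get_text(elem)              # sees the whole content
```

With content split this way, .firstChild.nodeValue returns only the first fragment, which would look like a truncated string, while the helper returns the complete text.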

If you know that you won't have any comments, processing
instructions, or CDATA sections in the input, you can
alternatively invoke .normalize() on the document, which
will collapse subsequent text nodes into single ones.
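
As a rough illustration of the .normalize() approach, the sketch below builds a small document by hand; the element name "midline" and the sample strings are hypothetical stand-ins for the real input, and the split text nodes are created manually to mimic a parser that fragments content.

```python
from xml.dom import minidom

# Hypothetical element whose content is spread over two text nodes.
doc = minidom.parseString("<midline/>")
elem = doc.documentElement
elem.appendChild(doc.createTextNode("ACGT"))
elem.appendChild(doc.createTextNode("TTGA"))

count_before = len(elem.childNodes)

# Merge adjacent text nodes throughout the whole document.
doc.normalize()

count_after = len(elem.childNodes)
text = elem.firstChild.nodeValue  # now holds the complete content
```

After normalization the element has a single text child, so .firstChild.nodeValue is safe to use under the stated assumption that no other node types interrupt the text.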

Assuming that this is the phenomenon you are seeing, it is a
bug in your script, so I close this report as invalid.
msg20563 - (view) Author: David Tefft (dtefft) Date: 2004-04-24 12:11
There should be no output.  However, when I run the script on a Linux box,
there is output.  When I run it on Mac OS X, it behaves properly.

Thanks for your input.
History
Date User Action Args
2022-04-11 14:56:03 admin set github: 40176
2004-04-22 19:35:52 dtefft create