xml.dom.minidom was unable to parse an XML file taken from an example provided by an official organization (http://www.iptc.org/IPTC4XMP).
The original file was somewhat hairy, but I have been able to reproduce the bug with a simplified
version, attached. (It ends with .xmp because it is supposed
to be an XMP file, the XMP standard being built on
XML. Well, that's the short story.)
The offending part is the one that goes: xmpPLUS='....'
It triggers an exception, ValueError: too many values to unpack,
in _parse_ns_name. Some debugging showed an obvious mistake
in the scanning of the name argument, which goes beyond the closing
single quote (').
I dug a little further through a pdb session, but the bug seems to be located in C code.
This is the very first time I have reported a bug, so chances are I am providing too much or too little information...
To whom it may concern, here is the invoking code:
from xml.dom import minidom
...
class xmp(dict):
    def __init__(self, inStream):
        xmldoc = minidom.parse(inStream)
        ....
x = xmp('/home/pierre/devt/port/IPTCCore-Full/x.xmp')
Traceback:
/home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xmpLib.py in __init__(self, inStream)
26 def __init__(self, inStream):
27 print minidom
---> 28 xmldoc = minidom.parse(inStream)
29 xmpmeta = xmldoc.childNodes[1]
30 rdf = xmpmeta.childNodes[1]
/home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/nxml/dom/minidom.py in parse(file, parser, bufsize)
/home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in parse(file, namespaces)
922 fp = open(file, 'rb')
923 try:
--> 924 result = builder.parseFile(fp)
925 finally:
926 fp.close()
/home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in parseFile(self, file)
205 if not buffer:
206 break
--> 207 parser.Parse(buffer, 0)
208 if first_buffer and self.document.documentElement:
209 self._setup_subset(buffer)
/home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in start_element_handler(self, name, attributes)
743 def start_element_handler(self, name, attributes):
744 if ' ' in name:
--> 745 uri, localname, prefix, qname = _parse_ns_name(self, name)
746 else:
747 uri = EMPTY_NAMESPACE
/home/pierre/devt/fileInfo/svnRep/branches/xml/xmpLib/xml/dom/expatbuilder.py in _parse_ns_name(builder, name)
125 localname = intern(localname, localname)
126 else:
--> 127 uri, localname = parts
128 prefix = EMPTY_PREFIX
129 qname = localname = intern(localname, localname)
ValueError: too many values to unpack
The offending C statement:
/usr/src/packages/BUILD/Python-2.4/Modules/pyexpat.c(582)StartElement()
The returned 'name':
(Pdb) name
Out[5]: u'XMP Photographic Licensing Universal System (xmpPLUS, http://ns.adobe.com/xap/1.0/PLUS/) CreditLineReq xmpPLUS'
It's obvious the scanning went beyond the attribute.
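To make the failure easier to see, here is a minimal sketch of what the unpacking step in the traceback appears to be doing. The function parse_ns_name below is a simplified, hypothetical stand-in and not the real expatbuilder code. It assumes that the name handed over by pyexpat is a single string whose fields are separated by spaces, and that the long text in front of CreditLineReq is the namespace URI declared for the xmpPLUS prefix. Under those assumptions, any space inside the URI produces extra fields and the two-value unpacking fails.

```python
def parse_ns_name(name):
    """Simplified, hypothetical stand-in for expatbuilder._parse_ns_name."""
    # Assumption: the name is "uri localname prefix" or "uri localname",
    # joined with single spaces.
    parts = name.split(' ')
    if len(parts) == 3:
        uri, localname, prefix = parts
        qname = prefix + ':' + localname
    else:
        # Any length other than 2 raises ValueError here.
        uri, localname = parts
        prefix = None
        qname = localname
    return uri, localname, prefix, qname


# The value printed by pdb in the report above.
name = (u'XMP Photographic Licensing Universal System '
        u'(xmpPLUS, http://ns.adobe.com/xap/1.0/PLUS/) CreditLineReq xmpPLUS')

parts = name.split(' ')

try:
    parse_ns_name(name)
    error = None
except ValueError as exc:
    error = str(exc)

print(len(parts))
print(error)
```

If this reading is right, the name is not scanned past the end of the attribute. Instead, the spaces inside the namespace URI cannot be told apart from the separators between the URI, the local name and the prefix.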