I'm including a Python session that I cut and pasted from my
xterm. Essentially, the problem boils down to this: when
QUOTE_NONE is set for the csv reader and it encounters
a quote immediately after a separator, it assumes that
it is in a quoted field and keeps going until it finds
the matching end quote. Either this is a bug, or the
meaning of QUOTE_NONE is not clear. My patch checks for
QUOTE_NONE at the start of a field (immediately after the
delimiter); if it is set, the state machine skips straight
to the IN_FIELD state. The patch is against 2.3.3.
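
To make the start-of-field decision concrete, here is a toy model of it in pure Python. It is only a sketch and not the real C state machine in the csv module: the function name split_line, the patched flag, and the three state constants are made up for illustration, and escaping, doubled quotes, and skipinitialspace are left out.

```python
import csv

# Hypothetical state names for this sketch; the real reader has more states.
START_FIELD, IN_FIELD, IN_QUOTED_FIELD = range(3)


def split_line(line, delimiter="\t", quotechar="'",
               quoting=csv.QUOTE_NONE, patched=True):
    """Toy splitter that models only the start-of-field decision."""
    fields, buf, state = [], [], START_FIELD
    for ch in line.rstrip("\n"):
        if state == START_FIELD:
            if ch == delimiter:
                # Two delimiters in a row: an empty field.
                fields.append("")
            elif ch == quotechar and not (patched and quoting == csv.QUOTE_NONE):
                # Old behaviour: a leading quote always opens a quoted field.
                state = IN_QUOTED_FIELD
            else:
                # Patched behaviour under QUOTE_NONE: the quote is plain data.
                buf.append(ch)
                state = IN_FIELD
        elif state == IN_FIELD:
            if ch == delimiter:
                fields.append("".join(buf))
                buf = []
                state = START_FIELD
            else:
                buf.append(ch)
        else:  # IN_QUOTED_FIELD
            if ch == quotechar:
                state = IN_FIELD
            else:
                buf.append(ch)
    if state == IN_QUOTED_FIELD:
        # The line ended while still waiting for the closing quote.
        raise ValueError("newline inside string")
    fields.append("".join(buf))
    return fields
```

With patched=True, split_line("foo\t'bar\n") treats the leading quote as ordinary data; with patched=False the same input runs off the end of the line looking for a closing quote and raises, which is the failure mode I am reporting.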

Before patching:
134 wooster:~> python
Python 2.3.3 (#1, Dec 30 2004, 14:12:38)
[GCC 3.3.5 (Debian 1:3.3.5-5)] on linux2
Type "help", "copyright", "credits" or "license" for
more information.
>>> import csv
>>>
>>> class plain_dialect(csv.Dialect):
...     delimiter="\t"
...     escapechar="\\"
...     doublequote=False
...     skipinitialspace=True
...     lineterminator="\n"
...     quoting=csv.QUOTE_NONE
...     quotechar="'"
...
>>> csv.register_dialect("plain", plain_dialect)
>>> import StringIO
>>> s = StringIO.StringIO()
>>> w = csv.writer(s, dialect="plain")
>>> w.writerow(["foo", "'bar"])
>>> s.seek(0)
>>> s.read()
"foo\t'bar\n"
>>> s.seek(0)
>>> r = csv.reader(s, dialect="plain")
>>> r.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
_csv.Error: newline inside string
>>>
After patching:
135 wooster:~> python
Python 2.3.3 (#1, Dec 30 2004, 14:12:38)
[GCC 3.3.5 (Debian 1:3.3.5-5)] on linux2
Type "help", "copyright", "credits" or "license" for
more information.
>>> import csv
>>> class plain_dialect(csv.Dialect):
...     delimiter="\t"
...     escapechar="\\"
...     doublequote=False
...     skipinitialspace=True
...     lineterminator="\n"
...     quoting=csv.QUOTE_NONE
...     quotechar="'"
...
>>> csv.register_dialect("plain", plain_dialect)
>>> import StringIO
>>> s = StringIO.StringIO()
>>> w = csv.writer(s, dialect="plain")
>>> w.writerow(["foo", "'bar"])
>>> s.seek(0)
>>> s.read()
"foo\t'bar\n"
>>> s.seek(0)
>>> r = csv.reader(s, dialect="plain")
>>> r.next()
['foo', "'bar"]
>>>
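
For anyone who wants to check the reader side without retyping the session, the same case can be written as a short standalone script. This is a sketch for Python 3 (so io.StringIO and next() replace StringIO.StringIO and r.next()), and it feeds the reader the literal text from the session rather than going through the writer. I am assuming a current interpreter already treats the leading quote as data under QUOTE_NONE; that is not something the 2.3.3 session above shows.

```python
import csv
import io

# The exact text the writer produced in the session above.
data = "foo\t'bar\n"

# Same dialect settings as plain_dialect, passed as keyword arguments.
reader = csv.reader(
    io.StringIO(data),
    delimiter="\t",
    escapechar="\\",
    doublequote=False,
    skipinitialspace=True,
    lineterminator="\n",
    quoting=csv.QUOTE_NONE,
    quotechar="'",
)

row = next(reader)
print(row)
```

If the reader still mistook the quote for the start of a quoted field, next(reader) would raise csv.Error instead of returning a row.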