This makes cgi.parse_header() properly unquote header
parameters. E.g.,
given a header:
    content-disposition: attachment; filename="weird\\file\"name"
parse_header() should return
    ('attachment', {'filename': 'weird\\file"name'})
but the current parse_header() just strips the ""s; it
doesn't unquote the \s,
so you get too many \s in the output.
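
To make the difference concrete, here is a rough sketch of the two behaviours. It is not the patch itself: strip_only() and unquote_param() are made-up names for illustration, and the single-pass regex is just one way to undo backslash escapes, assuming every "\x" pair inside the quotes stands for the literal character x.

```python
import re

def strip_only(value):
    # Behaviour described above: drop the surrounding ""s, nothing else.
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value

def unquote_param(value):
    # Hypothetical helper: drop the ""s, then turn each \x pair into x
    # in a single left-to-right pass.
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return re.sub(r'\\(.)', r'\1', value[1:-1])
    return value

raw = r'"weird\\file\"name"'       # the parameter value as it appears in the header
print(strip_only(raw))             # backslashes are left doubled
print(unquote_param(raw))          # escapes are undone
```

Doing the unescaping in one pass avoids having to think about the order in which \\ and \" are replaced.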
This could possibly use rfc822.unquote, but
rfc822.unquote doesn't unquote \s
either! This looks like a bug, since immediately
preceding the definition of
rfc822.unquote is the comment
    # XXX Should fix unquote() and quote() to be really conformant.
The email package uses rfc822.unquote, which means its
msg.get_filename()
has the "too many \s" problem. So maybe I'll include a
patch for rfc822.py as
well, and send a patch to Barry for email/Util.py.
A quick glance through RFC 822 sheds no light on the
use of <> for quoted
*strings*, only for addr-spec. So I'm not sure what
kind of quoting goes
in them, and "none" seems a reasonable guess, so I
didn't change it.
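
For reference, a sketch of what an unquote() along these lines might look like. unquote_sketch() is a hypothetical name, not the actual rfc822 code, and it only handles the \\ and \" escapes: ""-delimited values get their backslashes undone, <>-delimited values are only stripped.

```python
def unquote_sketch(value):
    # Hypothetical stand-in for an rfc822.unquote-style helper.
    if len(value) > 1:
        if value.startswith('"') and value.endswith('"'):
            # Quoted string: drop the ""s, then undo the \\ and \" escapes.
            return value[1:-1].replace('\\\\', '\\').replace('\\"', '"')
        if value.startswith('<') and value.endswith('>'):
            # Angle brackets: strip only, no unescaping (the "none" guess).
            return value[1:-1]
    return value

print(unquote_sketch(r'"weird\\file\"name"'))
print(unquote_sketch('<someone@example.com>'))
```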
So does that make unquote() "really conformant"? Dunno...
I've got this feeling like rfc822.py is sort of
supposed to be subsumed by the
email package anyway...
Oh, and according to the RFC, the other thing not
allowed in ""s
is newline, but I wasn't sure if that was quoted with a
\. If so, that's
an easy fix. I have a feeling newlines in quoted
strings aren't a
great idea anyway since, even if legal, many simple parsers
will probably not like them.