[forwarded from http://bugs.debian.org/254757]
The _fromlinepattern regexp in mailbox.py does not support positive GMT offsets. The pattern did not change in 2.5.
Bug submitter writes:
archivemail incorrectly splits up messages in my mbox-format mail
archives.
I use Squirrelmail, which seems to create mbox lines that look like
this:
From mangled@clarke.tinyplanet.ca Mon Jan 26 12:29:24 2004 -0400
The "-0400" appears to be throwing it off. If the first message of an
mbox file has such a line on it, archivemail flat out stops, saying the
file is not mbox.
If the later messages in an mbox file are in this style, they are not
counted, and archivemail thinks that the preceding message is just kind
of long, and the decision to archive or not is broken.
I stumbled on this bug when I wanted to archive my mail on a
Sarge system. And since my TZ is positive, the regexp did not work. I
think the correct regexp for /usr/lib/python2.3/mailbox.py should be:
_fromlinepattern = r"From \s*[^\s]+\s+\w\w\w\s+\w\w\w\s+\d?\d\s+" \
r"\d?\d:\d\d(:\d\d)?(\s+[^\s]+)?\s+\d\d\d\d\s*((\+|-)\d\d\d\d)?\s*$"
This should handle positive and negative timezones in From lines. I
have tested it successfully with an email beginning with this line:
From fred@athena.olympe.fr Mon May 31 13:24:50 2004 +0200
as well as one without a TZ reference.
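For anyone wanting to reproduce this, here is a minimal sketch that checks the proposed pattern against the From lines quoted in this report. PROPOSED is copied from the suggestion above. BASELINE is an assumption: it is the same pattern with the trailing offset group removed, which should correspond to the shipped pattern but has not been checked against a particular mailbox.py release. The is_from_line helper is hypothetical and exists only for this test.

```python
import re

# Pattern proposed in this report: optional "+hhmm" / "-hhmm" after the year.
PROPOSED = (r"From \s*[^\s]+\s+\w\w\w\s+\w\w\w\s+\d?\d\s+"
            r"\d?\d:\d\d(:\d\d)?(\s+[^\s]+)?\s+\d\d\d\d\s*((\+|-)\d\d\d\d)?\s*$")

# ASSUMPTION: the proposed pattern minus the offset group, taken here as a
# stand-in for the pattern currently shipped in mailbox.py.
BASELINE = (r"From \s*[^\s]+\s+\w\w\w\s+\w\w\w\s+\d?\d\s+"
            r"\d?\d:\d\d(:\d\d)?(\s+[^\s]+)?\s+\d\d\d\d\s*$")

# From lines quoted in the report, plus one without any offset.
NEGATIVE = "From mangled@clarke.tinyplanet.ca Mon Jan 26 12:29:24 2004 -0400"
POSITIVE = "From fred@athena.olympe.fr Mon May 31 13:24:50 2004 +0200"
NO_OFFSET = "From fred@athena.olympe.fr Mon May 31 13:24:50 2004"


def is_from_line(pattern, line):
    """Hypothetical helper: True if the line matches from its first character."""
    return re.match(pattern, line) is not None


if __name__ == "__main__":
    for line in (NEGATIVE, POSITIVE, NO_OFFSET):
        print(is_from_line(BASELINE, line), is_from_line(PROPOSED, line), line)
```

Under that assumption, the baseline pattern accepts only the line without an offset, while the proposed pattern accepts all three.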