classification
Title: email package does not work with mailbox
Type: Stage:
Components: Library (Lib) Versions: Python 2.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: barry Nosy List: barry, paul.moore, tim.peters
Priority: normal Keywords:

Created on 2002-07-26 09:19 by paul.moore, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Files
File name Uploaded Description
sample.zip paul.moore, 2002-07-26 09:19 Example demonstrating the problem
Messages (6)
msg11698 - Author: Paul Moore (paul.moore) (Python committer) Date: 2002-07-26 09:19
The email.message_from_file function does not seem to
work correctly when used as the "factory" argument to
the mailbox.UnixMailbox class. The "From_"
separator line gets included in the preceding mail
message.

For a demonstration of this, see the attached zip file. 
The Python code loads the first message in the mailbox 
file and then displays the final line. This shows the 
From_ line from the subsequent message (correctly 
quoted with a ">"!!!).

This is with Python 2.2 on Windows 2000. The Python 
banner line is

Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit 
(Intel)] on win32
msg11699 - Author: Paul Moore (paul.moore) (Python committer) Date: 2002-07-26 14:06
This is a text vs binary file issue with the mailbox module, not 
with email. The "fp" parameter to the [Portable]UnixMailbox 
class must be opened in binary mode, or incorrect results are 
obtained.

This is caused by the implementation of _Subfile.read() in
mailbox.py, which assumes that the number of characters read is
the same as the difference between tell() values, an
assumption which fails for text mode files on Windows.

I would consider this behaviour a bug, and would prefer it to 
be fixed. However, if it is not to be fixed, the documentation 
should be changed to note that the "fp" parameter for the 
mailbox constructor must be opened in binary mode.

For MMDF and Babyl format mailboxes, it's arguably correct, 
as the file format is binary. For Unix mailboxes, the file is text 
format, so opening the file in text mode is not unreasonable.

Question: is the universal newline support in Python 2.3 going 
to mess this up further?
msg11700 - Author: Barry A. Warsaw (barry) (Python committer) Date: 2002-07-26 15:59
I don't have time to comment except to say that
email.message_from_file() definitely works as the factory
function for mailbox.UnixMailbox.  As proof, I have the
following code in Mailman:

import email
import mailbox

class Mailbox(mailbox.PortableUnixMailbox):
    def __init__(self, fp):
        mailbox.PortableUnixMailbox.__init__(
            self, fp, email.message_from_file)

so I know it works. :)

I'll look at the comments Paul's made when I get a chance. 
I'm assigning this bug report to me.
msg11701 - Author: Barry A. Warsaw (barry) (Python committer) Date: 2003-03-10 17:05
Since this makes no difference for *nix and I don't have a
Windows box to play with, I'm assigning to Tim.  He can
thank me later. 

Paul's explanation seems reasonable, but without a patch, I
doubt any of us have motivation to fix it.  I'd be happy
with a documentation change.
msg11702 - Author: Tim Peters (tim.peters) (Python committer) Date: 2004-02-17 00:06
Sorry for the delay.  Paul is certainly correct that the only 
thing you can reliably do with a tell() result from a file opened 
in text mode is seek() to it later.  Arithmetic on tell() results 
makes no sense for text-mode files.  That's ANSI C rules, and 
they matter on Windows.
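
The one reliable use of a text-mode tell() result, seeking back to it later, might look like the following. This is a minimal sketch in Python 3, not code from mailbox.py; reread_second_line and the sample file contents are hypothetical, and newline=None is assumed as a stand-in for Windows-style text mode.

```python
import os
import tempfile


def reread_second_line(path):
    # Text mode with newline translation, similar to text mode on Windows.
    with open(path, "r", encoding="ascii", newline=None) as fp:
        fp.readline()             # skip the first line
        bookmark = fp.tell()      # opaque position: no arithmetic on it
        second = fp.readline()
        fp.seek(bookmark)         # seeking back to a tell() result is fine
        again = fp.readline()
    return second, again


def demo():
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(b"first line\r\nsecond line\r\nthird line\r\n")
        return reread_second_line(path)
    finally:
        os.remove(path)


second, again = demo()
print(second == again)
```

The bookmark is only ever handed back to seek(). It is never subtracted from another position or used as a read length, which is the kind of arithmetic that goes wrong in text mode.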

Since we're going on two years without a patch here, sounds 
like changing the docs is better than nothing.
msg11703 - Author: Barry A. Warsaw (barry) (Python committer) Date: 2004-05-10 23:12
It doesn't look like anybody has the motivation to fix
mailbox.py so I will (finally!) add this note to
libmailbox.tex (for both Python 2.3 and 2.4):

\note{For reasons of this module's internal implementation, you will
probably want to open the \var{fp} object in binary mode.  This is
especially important on Windows.}
History
Date User Action Args
2022-04-10 16:05:31 admin set github: 36930
2002-07-26 09:19:06 paul.moore create