This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: FeedParser problem on end boundaries w/o newline
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.4

process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To: barry
Nosy List: barry, binaryfeed, tlau
Priority: normal
Keywords:

Created on 2004-11-24 17:27 by tlau, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name         Uploaded                 Description
email-newline.py  tlau, 2004-11-24 17:27   Test case (requires procmail)
1072623.py        barry, 2004-11-28 00:03
Messages (7)
msg23302 - (view) Author: Tessa Lau (tlau) Date: 2004-11-24 17:27
(Python 2.3.4, Linux Debian unstable)

The email module's as_string() method generates
messages that do not include a trailing CRLF on the
last line.  This causes problems when Python-created
messages are piped to procmail and delivered to an mbox.

The attached test script illustrates this behavior.
You must have a working procmail configured to deliver
mail to an mbox (the default configuration will work).
If you then read the resulting mailbox with /bin/mail,
it appears as if there is only one message in the
mailbox instead of two: the second is concatenated onto
the end of the first.  The mbox does not contain a
blank line between the first message and the second.
POP servers require this blank-line delimiter between
messages.

You could argue that this is procmail's responsibility,
but as far as I can tell from reading RFC 2822, each
line in an email message must terminate in CRLF, and
Python's email module is violating that spec.
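
The symptom for a non-MIME message can be reproduced with a short sketch. It assumes a current Python 3 `email` package rather than the 2.3.4 module discussed here, and the header value and body are made up for illustration.

```python
from email.message import Message

# Hypothetical message; the header value and body are illustrative only.
msg = Message()
msg["Subject"] = "test"
msg.set_payload("body without a final newline")

flat = msg.as_string()

# The payload is written as given, so the flattened text ends
# exactly where the payload ends.
print(repr(flat[-10:]))
print(flat.endswith("\n"))
```

Whether the missing final newline matters depends on the consumer; the sketch only shows that the generator does not add one.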
msg23303 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2004-11-28 00:03
Changing the summary of this issue to reflect the real
problem.   The attachment 1072623.py illustrates that if the
end boundary of a string being parsed by the FeedParser does
not have a trailing newline, the parser doesn't recognize
this as an end boundary.  It just so happens that your
example produces a message string with that property (which
isn't a bug).

The fix is about as trivial as you can get: one character.
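
A minimal sketch of the parsing case described in this comment, assuming a current Python 3 where the fix is already present. The message text and the `BOUNDARY` string are hypothetical; this is not the attached 1072623.py.

```python
from email.parser import FeedParser

# Hypothetical multipart message whose closing boundary
# is the last thing in the string, with no trailing newline.
raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="BOUNDARY"\n'
    "\n"
    "--BOUNDARY\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--BOUNDARY--"
)

parser = FeedParser()
parser.feed(raw)
msg = parser.close()

parts = msg.get_payload()
# If the end boundary is recognized, its text stays out of the part body.
boundary_leaked = "--BOUNDARY--" in parts[0].get_payload()
print(msg.is_multipart(), len(parts), boundary_leaked)
```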
msg23304 - (view) Author: Tessa Lau (tlau) Date: 2004-11-28 00:45
My original bug report had to do with email generation, not
parsing.  Python seems to be generating email that is not
compliant with the RFC spec.  In my situation, parsing is
done by third-party programs (procmail and /bin/mail), which
also fail to deal gracefully with the lack of a trailing newline.
msg23305 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2004-11-29 01:10
I must have been thinking about the message you posted to
the email sig, which uncovered the bug I commented on, and
fixed.  In the case of the original bug report, I don't
believe this is fixable.  There is, however, a simple
workaround for you.  In your sample code, set the outer
message's epilogue to the empty string, e.g.:

msg1.epilogue = ''

...

msg2.epilogue = ''

This will cause the Generator to add a newline at the end of
the outer message.  We can't make that change in the Message
class because doing so would break inner message flattening.

However, if someone were to come up with a patch that fixes
this problem yet doesn't break any of the 217 tests in the
current test suite, I'd be happy to look at it.  As it is,
nothing will be changed for Python 2.4 final.
msg23306 - (view) Author: Tessa Lau (tlau) Date: 2004-11-29 13:23
It must have been someone else on the email sig; I haven't
posted there recently.

Thanks for the workaround.  However, it only fixes the
problem for MIME messages, but not for non-MIME messages. 
The second message constructed in my test script still lacks
a trailing newline.

I can work around it after the message is generated by
checking for a final newline on the string and adding it if
it's missing, but that seems clunky.
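
The post-generation fix described above might look like the following sketch. The helper name `ensure_trailing_newline` is hypothetical and not part of the email package.

```python
def ensure_trailing_newline(text, newline="\n"):
    """Return text unchanged if it already ends with a newline, else append one."""
    if text.endswith("\n"):
        # Covers both "\n" and "\r\n" endings.
        return text
    return text + newline


fixed = ensure_trailing_newline("Subject: test\n\nbody")
already_ok = ensure_trailing_newline("Subject: test\n\nbody\n")
```

It would be applied to the result of `as_string()` just before the text is piped to procmail.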
msg23307 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2004-11-30 22:21
Yep, it was someone else's posting, sorry about that.

As for the trailing newline on non-MIME messages, you will
need to make sure that your payload is terminated with a
newline.  Generator won't do that on the basis that
maintaining idempotency (what goes in equals what goes out)
is more important.
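
One way to follow this advice is to normalize the payload before it is stored on the message. A minimal sketch, assuming a current Python 3; the helper `set_text_payload` and the sample values are hypothetical.

```python
from email.message import Message


def set_text_payload(msg, body):
    """Hypothetical helper: store body so that it always ends with a newline."""
    if not body.endswith("\n"):
        body += "\n"
    msg.set_payload(body)


msg2 = Message()
msg2["Subject"] = "test"
set_text_payload(msg2, "plain body")

flat = msg2.as_string()
print(flat.endswith("\n"))
```

Because the newline is part of the payload itself, the generator still writes out exactly what it was given.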
msg23308 - (view) Author: Jeffrey Wescott (binaryfeed) Date: 2005-05-19 21:42
Well, idempotency is completely borked.  If I do:

import email

f = open('somemessage')
m = email.message_from_file(f)
f.close()

f2 = open('newmsg', 'w')
f2.write(m.as_string())
f2.close()  # flush so newmsg can be compared with somemessage

somemessage and newmsg will be *different*.

This is hardly "what goes in equals what goes out".
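
The same comparison can be done in memory. The sketch below assumes a current Python 3; the helper `round_trips` and the sample message are hypothetical, and the sample is a deliberately simple message, so the result says nothing about the messages this comment has in mind.

```python
import email


def round_trips(raw):
    """Return True if parsing raw and flattening it again reproduces raw exactly."""
    return email.message_from_string(raw).as_string() == raw


# Deliberately simple, hypothetical message.
sample = "Subject: hi\n\nbody\n"
result = round_trips(sample)
print(result)
```

Running `round_trips` over the text of a real message such as `somemessage` would show whether that particular message survives the trip unchanged.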
History
Date                 User   Action  Args
2022-04-11 14:56:08  admin  set     github: 41225
2004-11-24 17:27:45  tlau   create