classification
Title: Fix for 1002475 (Feedparser not handling \r\n correctly)
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.4
process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To: barry
Nosy List: anadelonbrin, barry, sjoerd
Priority: normal
Keywords: patch

Created on 2004-08-05 01:27 by anadelonbrin, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
FeedParser.py.diff anadelonbrin, 2004-08-05 01:46 New patch for lib\email\FeedParser.py
test_email.py.diff anadelonbrin, 2004-08-05 02:09 Improved unittests for this patch.
Messages (5)
msg46537 - Author: Tony Meyer (anadelonbrin) Date: 2004-08-05 01:27
Python 2.4a1/Anon CVS of 5 Aug 2004.  WinXP SP1, not
that that matters.

This is a patch to fix bug:

[ 1002475 ] email message parser doesn't handle \r\n
correctly
http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=1002475

Basically, FeedParser does an rstrip() on the last
header value, but not on any of the others.  Given that
RFC 822 wants the headers to end with \r\n, it's likely
that most will, and this means that all the headers
except for the last end with \r.  This one-line patch
just does the extra rstrip.

Note that you might want to implement the fix yourself:
it doesn't fit on one line, so it's wrapped in an
ugly way.  Maybe you can think of something prettier :)
This does fix the problem, though (here, at least,
using the test case outlined in the bug report).
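
For readers without the bug report to hand, here is a toy sketch of the symptom described above. It is not the real FeedParser code: the function parse_headers and its "fixed" flag are made up for illustration, and the assumption that non-final headers lose only their trailing "\n" is inferred from the description in this message.

```python
# Toy illustration only; this is NOT the code in Lib/email/FeedParser.py.
def parse_headers(lines, fixed):
    """Split 'Name: value' lines into (name, value) pairs.

    The last header value is always rstrip()ped.  Earlier values lose only
    their final character (assumed to be "\n") unless `fixed` is true, in
    which case any remaining "\r" or "\n" is stripped as well.
    """
    headers = []
    for index, line in enumerate(lines):
        name, _, value = line.partition(":")
        value = value.lstrip(" \t")
        if index == len(lines) - 1:
            value = value.rstrip()                  # last header: already clean
        elif fixed:
            value = value[:-1].rstrip("\r\n")       # the extra strip from the patch
        else:
            value = value[:-1]                      # drops "\n", leaves "\r" behind
        headers.append((name, value))
    return headers


raw = ["From: a@example.com\r\n", "To: b@example.com\r\n", "Subject: hi\r\n"]
buggy = parse_headers(raw, fixed=False)
fixed = parse_headers(raw, fixed=True)
print(buggy)
print(fixed)
```

With fixed=False every value except the last keeps a trailing "\r", which matches the behaviour reported in the bug.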
msg46538 - Author: Tony Meyer (anadelonbrin) Date: 2004-08-05 01:46
Attaching new patch, because that one breaks one of the
tests.  The rstrip() needs to be rstrip('\r\n') or it strips
other whitespace that isn't allowed there.
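
The difference between the two calls can be seen with plain strings. The sample value below is an assumption chosen for illustration (a header whose continuation line holds a single space); it is not taken from the attached patch.

```python
# A header value whose continuation line is just a space, followed by CRLF.
value = "the next line has a space on it\n \r\n"

# rstrip() with no argument removes ALL trailing whitespace,
# including the newline and space that belong to the continuation.
stripped_all = value.rstrip()

# rstrip('\r\n') removes only trailing carriage returns and line feeds,
# so the continuation's newline-plus-space survives.
stripped_eol = value.rstrip("\r\n")

print(repr(stripped_all))
print(repr(stripped_eol))
```

Passing '\r\n' treats the argument as a set of characters to strip, so it also copes with a bare "\n" or a bare "\r" at the end of the value.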
msg46539 - Author: Tony Meyer (anadelonbrin) Date: 2004-08-05 01:48
Also attached is a patch for test_email.py.  This adds two
new tests (which fail before the FeedParser.py patch, and
pass afterwards).

The first tests that the \r\n characters are correctly removed.
The second is a copy of an existing test to check
continuations with whitespace, but checks if it works with
the continuation as the last header.
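
A minimal sketch of what the first kind of test could look like, written against the public email.message_from_string function. The class name, method name, and header names are hypothetical; this is not the content of test_email.py.diff.

```python
import email
import unittest


class CrlfHeaderTest(unittest.TestCase):
    """Hypothetical test case; names are illustrative, not from the patch."""

    def test_crlf_stripped_from_every_header(self):
        # Two headers, both terminated with CRLF, then a blank line and a body.
        raw = "Header: text\r\nNext-Header: more text\r\n\r\nBody\r\n"
        msg = email.message_from_string(raw)
        # Neither value should keep a trailing "\r" or "\n".
        self.assertEqual(msg["Header"], "text")
        self.assertEqual(msg["Next-Header"], "more text")
```

Saved in a module, this can be run with "python -m unittest". Checking a header that is not the last one matters here, since the last header was already being stripped before the patch.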
msg46540 - Author: Tony Meyer (anadelonbrin) Date: 2004-08-05 02:09
Second test was in the wrong place, so moving.
msg46541 - Author: Sjoerd Mullender (sjoerd) (Python committer) Date: 2004-08-06 06:39
I've only tried the patch, not the test, but it seems to
work well.
History
Date User Action Args
2022-04-11 14:56:06 admin set github: 40689
2004-08-05 01:27:29 anadelonbrin create