This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: Universal line ending mode duplicates all line endings
Type:
Stage:
Components: Windows
Versions: Python 2.5

process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: draghuram, georg.brandl, ggenellina, gjb1002
Priority: normal
Keywords:

Created on 2007-05-07 14:50 by gjb1002, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (11)
msg31965 - Author: Geoffrey Bache (gjb1002) Date: 2007-05-07 14:50
On Windows XP, reading a file produced by Windows XP with universal line endings produces twice as many lines!

Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> open("winlineend").read()
'Hello\r\n'
>>> open("winlineend", "rU").read()
'Hello\n\n'


I would expect the last to give "Hello\n". 
msg31966 - Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2007-05-07 17:39

I created a file "test.txt" with Notepad whose contents are "hello\r\n". Both open().read() and open("rU").read() returned 'hello\n'. I tested with both 2.5 and 2.5.1 (installed using the installers from python.org) and the result is the same on both. Can you elaborate on your test case? How was the file "winlineend" created?
msg31967 - Author: Geoffrey Bache (gjb1002) Date: 2007-05-07 20:27
I create the file as follows: 

>>> import os
>>> file = open("test.txt", "w")
>>> file.write("Hello" + os.linesep)
>>> file.close()
>>> open("test.txt").read()
'Hello\r\n'
>>> open("test.txt", "rU").read()
'Hello\n\n'
msg31968 - Author: Gabriel Genellina (ggenellina) Date: 2007-05-07 21:15
> file.write("Hello" + os.linesep)

The line separator for a file opened in *text mode* is *always* \n in Python code. It gets converted to os.linesep when you write the file; conversely, os.linesep gets translated into a single \n when you read the file.
On Windows, os.linesep is "\r\n". The argument to file.write above is "Hello\r\n". That "\n" gets translated into "\r\n" when it is written, so the actual file contents will be "Hello\r\n\n".

In short: if you open a file in text mode (the default), *don't* use os.linesep to terminate lines. Only use os.linesep when writing text to files opened in binary mode (and that's not a common case).
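A minimal sketch of that advice in present-day Python 3, not taken from the original thread. It assumes the newline="\r\n" argument of open() as a stand-in for Windows text mode so that the result is the same on any platform; the scratch file name is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

# Text mode: terminate lines with "\n" and let the file object translate.
# newline="\r\n" mimics what Windows text mode does on write.
with open(path, "w", newline="\r\n") as f:
    f.write("Hello\n")

with open(path, "rb") as f:
    text_mode_bytes = f.read()

# Binary mode: nothing is translated, so the terminator is written explicitly.
# b"\r\n" is the value os.linesep has on Windows, hard-coded here for portability.
windows_linesep = b"\r\n"
with open(path, "wb") as f:
    f.write(b"Hello" + windows_linesep)

with open(path, "rb") as f:
    binary_mode_bytes = f.read()

print(text_mode_bytes, binary_mode_bytes)
```

Both writes leave the same bytes on disk, a single "\r\n" after "Hello".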
msg31969 - Author: Geoffrey Bache (gjb1002) Date: 2007-05-07 21:36
I see. This seems like something of a gotcha to me: is it documented anywhere? Seems to make os.linesep not very useful in fact, especially if "\n" will work fine everywhere and is shorter. 

It isn't at all obvious that file.write(os.linesep) will in fact write two lines...
msg31970 - Author: Gabriel Genellina (ggenellina) Date: 2007-05-08 01:48
Yes, it's not so obvious, and the same question was posted a few days ago on comp.lang.python.
I'll review the documentation and make it more clearly stated.
msg31971 - Author: Georg Brandl (georg.brandl) (Python committer) Date: 2007-05-11 11:06
To the OP: what does open("winlineend", "rb").read() return?
msg31972 - Author: Geoffrey Bache (gjb1002) Date: 2007-05-11 18:34
It returns "Hello\r\r\n" (not "Hello\r\n\n" as was suggested earlier). I don't quite know how it gets that from simply writing os.linesep.
msg31973 - Author: Georg Brandl (georg.brandl) (Python committer) Date: 2007-05-11 18:41
Okay, that settles it.

When you write "\r\n" to a Windows text file, it writes "\r\r\n"
(since "\n" is converted to "\r\n"). When universal newline mode
sees that, it treats it as a Mac line ending ("\r") followed by a
Windows line ending ("\r\n"), so each line ending is read as two.

The doc fix has been committed, closing this one as "won't fix".
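
The double translation described above can be reproduced on any platform with a short Python 3 sketch, which is not part of the original thread. It assumes open(..., newline="\r\n") as a stand-in for Windows text mode and relies on Python 3's default newline=None for universal newline reading; the temporary file path is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file standing in for "winlineend".
path = os.path.join(tempfile.mkdtemp(), "winlineend")

# newline="\r\n" stands in for Windows text mode: each "\n" is written as "\r\n".
with open(path, "w", newline="\r\n") as f:
    # "\r\n" is the value os.linesep has on Windows.
    f.write("Hello" + "\r\n")

# Binary read shows the bytes that actually reached the disk.
with open(path, "rb") as f:
    raw = f.read()

# The default newline=None is Python 3's universal newlines mode:
# a lone "\r" and a "\r\n" pair are each translated to "\n".
with open(path, "r") as f:
    decoded = f.read()

print(repr(raw))
print(repr(decoded))
```

The raw bytes contain "\r\r\n", and the universal newline read turns them into two "\n" characters, which matches the doubled line endings in the original report.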
msg31974 - Author: Geoffrey Bache (gjb1002) Date: 2007-05-11 18:50
Which doc has been changed? Can I review it somewhere? 
msg31975 - Author: Georg Brandl (georg.brandl) (Python committer) Date: 2007-05-11 19:38
The doc for os.linesep. The change is in Subversion, and will be at docs.python.org/dev the next time the build script runs.
History
Date                 User     Action  Args
2022-04-11 14:56:24  admin    set     github: 44939
2007-05-07 14:50:39  gjb1002  create