
classification
Title: lines run together on input
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.2
process
Status: closed
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: gvanrossum, thomasbhickey
Priority: low
Keywords:

Created on 2003-02-20 21:12 by thomasbhickey, last changed 2022-04-10 16:06 by admin. This issue is now closed.

Messages (4)
msg14718 - Author: Thomas B Hickey (thomasbhickey) Date: 2003-02-20 21:12
Using 'for line in file('xxxxx'):' on a large (>4 GB) text file,
two lines are occasionally merged into one about 3/4 of the way
into the file.  It happens consistently at the same lines.
Counting lines this way always comes up short.  Opening the file
in 'rb' mode fixes the problem.  The lines are terminated by '\n'
even though we are running on a PC.
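
The comparison described above can be sketched roughly as follows. This is only an illustration in modern Python 3 syntax (the report used Python 2.2's `file()`), and the helper names `count_lines` and `line_counts_agree` are hypothetical. Python 3 does its own newline handling rather than relying on the C library's text mode, so this sketch is not expected to reproduce the merged lines; it only shows the shape of the check.

```python
def count_lines(path, mode):
    """Count lines by iterating over the file object, as in the report.

    mode is "r" (text) or "rb" (binary).
    """
    # Assumption: latin-1 is used only because it maps every byte to a
    # character, so text mode cannot fail on undecodable input.
    kwargs = {} if "b" in mode else {"encoding": "latin-1"}
    with open(path, mode, **kwargs) as f:
        return sum(1 for _ in f)


def line_counts_agree(path):
    """Return (text_count, binary_count) for the same file."""
    return count_lines(path, "r"), count_lines(path, "rb")
```

For a file whose lines all end in '\n' and that contains no stray '\r', the two counts should be equal; a shortfall in the text-mode count is the symptom reported here.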

Haven't tried the universal newline mode in 2.3a, but would 
be willing to.

Running Python 2.2.2 (#37 Oct 14, 2002) on Windows NT 2000.
Same problem seen in 2.2.1.
msg14719 - Author: Thomas B Hickey (thomasbhickey) Date: 2003-02-22 23:25
We have reproduced the same problem in C, so the bug does
not seem to be peculiar to Python.
msg14720 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-02-23 14:11
If as you suggest it's a bug in the C library, we can't do
anything about this. Are you sure it's not a problem with
your data?
msg14721 - Author: Thomas B Hickey (thomasbhickey) Date: 2003-02-24 14:30
I agree this isn't a Python problem.  It does not seem to be a
problem with the data.  We've now seen it in two text files of over
4 GB, and counting the '\n' and '\r' characters gives the expected
results.  I wouldn't have reported it if I'd seen the C problem
earlier.
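
A data check along the lines mentioned above might look like the sketch below: it reads the file in binary mode, in fixed-size chunks, and counts the '\n' and '\r' bytes. The function name `count_newline_bytes` and the 1 MiB chunk size are assumptions made for illustration, not what was actually run.

```python
def count_newline_bytes(path, chunk_size=1 << 20):
    """Return (lf_count, cr_count) for the file at path.

    The file is opened in binary mode so no newline translation occurs,
    and it is read in chunks so multi-gigabyte files fit in memory.
    """
    lf = cr = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            lf += chunk.count(b"\n")
            cr += chunk.count(b"\r")
    return lf, cr
```

If the '\n' count matches the expected number of lines and the '\r' count is zero, the data itself is unlikely to explain lines being merged.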

Thanks for the attention.  We've been processing 40 GB+ files
with Python regularly (in binary mode!) with great success.

--Th
History
Date User Action Args
2022-04-10 16:06:58 admin set github: 38018
2003-02-20 21:12:19 thomasbhickey create