
classification
Title: tarfile stops iteration with some longfiles
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.4
process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To:
Nosy List: faik, georg.brandl
Priority: normal
Keywords: patch

Created on 2006-06-21 11:44 by faik, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name | Uploaded | Description
longfiles.tar | faik, 2006-06-21 11:46 | prepared tar archive for test case
tarfile_longfilename.patch | faik, 2006-06-21 11:47 | python-2.4.3 Lib/tarfile.py longfilename bugfix
Messages (2)
msg50520 - Author: Faik Uygur (faik) Date: 2006-06-21 11:44
tarfile.py in Python 2.4.3 has a bug that makes it stop
iterating while reading members.

If a file's name is longer than 100 bytes, tarfile first reads
the ././@LongLink header for the file and then tries to read the
actual header. But if the first 100 bytes of the file's name end
with "/" and the member is a regular file (its type is a file
type), tarfile changes its type to a directory, because it assumes
the member is a directory from some old tar archive format (the
truncated name appears to end with a "/").
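
The header layout described above can be inspected with a short sketch. It assumes a current Python 3 interpreter rather than 2.4.3, and it writes a fresh GNU-format archive in memory instead of using the attached longfiles.tar; the helper name raw_archive and the byte offsets (512-byte blocks, 100-byte name field, type flag at offset 156) are assumptions of this illustration, not taken from the report or the patch.

```python
import io
import tarfile

# Directory path from the report; including the trailing "/" it is 100 bytes.
LONG_DIR = (
    "this.is.a.very.long.directory.name/"
    "this.is.another.very.long.directory.name/"
    "and.this.is.another.one/"
)


def raw_archive(member_name):
    """Hypothetical helper: bytes of a GNU-format tar with one empty regular file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        tar.addfile(tarfile.TarInfo(member_name))
    return buf.getvalue()


raw = raw_archive(LONG_DIR + "NEWS")

# Block 0: the ././@LongLink pseudo-header (type flag "L").
longlink_name = raw[0:13]
longlink_type = raw[156:157]

# Block 1 holds the full name; block 2 is the member's real header.
# Its 100-byte name field carries only the truncated name.
real_name_field = raw[1024:1124]
real_type = raw[1024 + 156:1024 + 157]

print(longlink_name, longlink_type)
print(real_name_field[-1:], real_type)
```

The last byte of the truncated name field is "/" even though the type flag marks a regular file, which is the combination that the 2.4.3 reader is reported to misinterpret.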

I created a tar archive to test this case. You can find it 
in the attachments.

My test code does this:

#!/usr/bin/python
import tarfile
import sys
tar = tarfile.open(sys.argv[1], "r")
tar.list()

If I run it with the prepared tar file, I get this output:

faik@pardus tmp $ ./tarlist.py longfiles.tar
-rwxr-xr-x faik/users          0 2006-06-21 13:03:59 this.is.a.very.long.directory.name/
-rwxr-xr-x faik/users          0 2006-06-21 13:06:17 this.is.a.very.long.directory.name/this.is.another.very.long.directory.name/
-rwxr-xr-x faik/users          0 2006-06-21 13:08:21 this.is.a.very.long.directory.name/this.is.another.very.long.directory.name/and.this.is.another.one/
-rw-r--r-- faik/users         19 2006-06-21 13:08:41 this.is.a.very.long.directory.name/this.is.another.very.long.directory.name/and.this.is.another.one/NEWS

But if I run tar(1) with the tvf options, I get this output:

faik@pardus tmp $ tar tvf longfiles.tar
drwxr-xr-x faik/users        0 2006-06-21 13:03:59 this.is.a.very.long.directory.name/
drwxr-xr-x faik/users        0 2006-06-21 13:06:17 this.is.a.very.long.directory.name/this.is.another.very.long.directory.name/
drwxr-xr-x faik/users        0 2006-06-21 13:08:21 this.is.a.very.long.directory.name/this.is.another.very.long.directory.name/and.this.is.another.one/
-rw-r--r-- faik/users       19 2006-06-21 13:08:41 this.is.a.very.long.directory.name/this.is.another.very.long.directory.name/and.this.is.another.one/NEWS
-rw-r--r-- faik/users       18 2006-06-21 13:10:10 this.is.a.very.long.directory.name/this.is.another.very.long.directory.name/and.this.is.another.one/COPYING
-rw-r--r-- faik/users       26 2006-06-21 13:09:05 this.is.a.very.long.directory.name/this.is.another.very.long.directory.name/and.this.is.another.one/README

tarfile.py ends iteration at this member:
"this.is.a.very.long.directory.name/this.is.another.very.long.directory.name/and.this.is.another.one/NEWS".
This happens because the full path of the directory that
contains the NEWS file is exactly 100 bytes long.
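
A round-trip check along these lines could confirm whether a given interpreter still stops early. This is only a sketch under stated assumptions: it targets Python 3, builds its own in-memory archive rather than reading the attached longfiles.tar, and the names build_long_archive and iterated_members are made up for the illustration.

```python
import io
import tarfile

# Directory path from the report (100 bytes with the trailing "/").
LONG_PARENT = (
    "this.is.a.very.long.directory.name/"
    "this.is.another.very.long.directory.name/"
    "and.this.is.another.one/"
)
FILE_NAMES = ("NEWS", "COPYING", "README")


def build_long_archive():
    """Hypothetical helper: GNU-format tar bytes with three files under LONG_PARENT."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name in FILE_NAMES:
            data = name.encode("ascii")  # placeholder file contents
            info = tarfile.TarInfo(LONG_PARENT + name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def iterated_members(archive_bytes):
    """Return (name, is_regular_file) for each member the reader yields."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        return [(member.name, member.isreg()) for member in tar]


members = iterated_members(build_long_archive())
for name, is_regular in members:
    print(is_regular, name)
```

If iteration is not cut short, all three files are listed and each one is still reported as a regular file rather than a directory.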

A patch that fixes the bug is also attached.
msg50521 - Author: Georg Brandl (georg.brandl) (Python committer) Date: 2006-06-21 17:47
This is fixed in SVN; you can try the 2.5 beta to verify it.
History
Date | User | Action | Args
2022-04-11 14:56:18 | admin | set | github: 43534
2006-06-21 11:44:38 | faik | create |