
classification
Title: tarfile exception on large .tar files
Type: Stage:
Components: Library (Lib) Versions: Python 2.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: akuchling Nosy List: akuchling, johahn, johanfo
Priority: normal Keywords:

Created on 2003-10-13 11:20 by johanfo, last changed 2022-04-10 16:11 by admin. This issue is now closed.

Messages (3)
msg18615 - Author: Johan Fredrik Öhman (johanfo) Date: 2003-10-13 11:20
The following exception is raised when I write a lot of
data (more than 10 GB) directly to my tape streamer. Is this
normal?

Traceback (most recent call last):
  File "/usr/local/metroField/fieldPlugins/Backup.py", line 184, in run
    self._doBackup()
  File "/usr/local/metroField/fieldPlugins/Backup.py", line 333, in _doBackup
    arc.close()
  File "/usr/local/metroField/fieldPlugins/Backup.py", line 533, in close
    self.tf.close()
  File "/usr/local/lib/python2.3/tarfile.py", line 1009, in close
    self.fileobj.close()
  File "/usr/local/lib/python2.3/tarfile.py", line 360, in close
    self.fileobj.write(struct.pack("<L", self.pos))
OverflowError: long int too large to convert
msg18616 - Author: Johan M. Hahn (johahn) Date: 2003-10-15 20:17

Hi, I think I've found the correct solution to the problem
(though I haven't actually tested it). Looking in tarfile.py...




358:   if self.type == "gz":
359:       self.fileobj.write(struct.pack("<l", self.crc))
360:       self.fileobj.write(struct.pack("<L", self.pos))




...shows that this error only occurs when writing gzip-compressed (.gz)
archives. Testing shows that it occurs when self.pos >= sys.maxint*2+2
(that is, 2**32 where sys.maxint is 2**31-1), so for files of 4 GB or
more. This is not good, since the newest tar and gzip versions can
handle files larger than that.
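
The limit described above can be checked with a small sketch. It is written for a current Python 3 interpreter, which is an assumption on my part: there, struct.pack raises struct.error rather than the OverflowError seen in the Python 2.3 traceback. The helper name fits_in_uint32 is hypothetical and is not part of tarfile.py.

```python
import struct

# 2**32 equals sys.maxint*2+2 on a build where sys.maxint is 2**31-1.
LIMIT = 2**32


def fits_in_uint32(value):
    """Return True if value can be packed as a little-endian unsigned 32-bit int."""
    try:
        struct.pack("<L", value)
    except (struct.error, OverflowError):
        # Python 3 raises struct.error; Python 2.3 raised OverflowError.
        return False
    return True


print(fits_in_uint32(LIMIT - 1))  # True: largest value that still fits
print(fits_in_uint32(LIMIT))      # False: first value that overflows
```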


According to the gzip file format spec from www.wotsit.org,
the last 4 bytes of a gzip file "contains the size of the original
(uncompressed) input data modulo 2^32". All that has to be
done is to perform this calculation prior to the call to
struct.pack. Here is my proposed fix:




358:   if self.type == "gz":
359:       self.fileobj.write(struct.pack("<l", self.crc))
360:       self.fileobj.write(struct.pack("<L", self.pos % 2**32))
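
As a rough illustration of the proposed fix, the sketch below builds the 8-byte gzip trailer (CRC-32 followed by the uncompressed size modulo 2**32) outside of tarfile.py. It assumes Python 3, where zlib.crc32 returns an unsigned value, so the CRC is packed with "<L" after masking instead of the "<l" used in the 2.3 source. The function name gzip_trailer is hypothetical.

```python
import struct
import zlib


def gzip_trailer(crc, size):
    """Return the 8-byte gzip trailer: CRC-32, then size modulo 2**32.

    Both fields are little-endian unsigned 32-bit integers.
    """
    crc_field = struct.pack("<L", crc & 0xFFFFFFFF)
    # Reducing modulo 2**32 keeps the value in range for "<L",
    # so sizes of 4 GB or more no longer overflow.
    size_field = struct.pack("<L", size % 2**32)
    return crc_field + size_field


# A size of 2**32 + 5 bytes is stored as 5 in the size field.
trailer = gzip_trailer(zlib.crc32(b"example"), 2**32 + 5)
stored_size = struct.unpack("<L", trailer[4:])[0]
print(stored_size)  # 5
```

One consequence of the format is that the stored size alone cannot distinguish a 5-byte input from one of 2**32 + 5 bytes; the field only holds the low 32 bits.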




I also noted that in Jython 2.1, struct.pack('<L', sys.maxint*2+2)
does not raise an OverflowError but wraps around and returns
'\x00\x00\x00\x00'. This results in the correct size calculation
for gzip, but silencing the overflow is probably not a good idea.




...johahn
msg18617 - Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2003-10-24 17:39

Thanks for reporting this bug; your suggested change seems to be correct.
Applied to CVS as rev. 1.9 of tarfile.py.

The Jython bug should be reported to whatever bug tracker the Jython
developers use; they won't see the bug if it's in this bug tracker.
Try looking at www.jython.org.
History
Date                 User     Action  Args
2022-04-10 16:11:42  admin    set     github: 39400
2003-10-13 11:20:53  johanfo  create