How can I recover files from a corrupted .tar.gz archive?
R

3

31

I have a large number of files in a .tar.gz archive. Checking the file type with the command

file SMS.tar.gz

gives the response

gzip compressed data - deflate method , max compression

When I try to extract the archive with gunzip, after a delay I receive the message

gunzip: SMS.tar.gz: unexpected end of file

Is there any way to recover even part of the archive?

Rubato answered 14/10, 2008 at 14:30 Comment(0)
N
21

Are you sure that it is a gzip file? I would first run 'file SMS.tar.gz' to validate that.

Then I would read the gzip Recovery Toolkit page.
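As a quick check before reaching for recovery tools, something like the following may help. It is only a sketch: is_gzip is a hypothetical helper name, and the scratch files exist purely for the demonstration. It relies on two facts: a gzip stream starts with the bytes 1f 8b, and gzip -t decodes the whole stream to verify it without writing any output.

```shell
# Hypothetical helper: succeeds if the file starts with the gzip magic bytes 1f 8b.
is_gzip() {
  magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "1f8b" ]
}

# Scratch files for the demonstration only.
tmp=$(mktemp -d)
printf 'hello\n' | gzip > "$tmp/good.gz"
printf 'hello\n' > "$tmp/plain.txt"

is_gzip "$tmp/good.gz" && echo "good.gz: gzip header present"
is_gzip "$tmp/plain.txt" || echo "plain.txt: no gzip header"

# gzip -t reads the whole stream and checks it, writing nothing to disk.
gzip -t "$tmp/good.gz" && echo "good.gz: stream intact"
```

A truncated archive like the one in the question would pass the header check but fail gzip -t, which is consistent with the "unexpected end of file" message.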

Newsworthy answered 14/10, 2008 at 14:32 Comment(1)
gzrecover does not come installed on Mac OS. However, Liudvikas Bukys's method worked fine. I had tcpdump piped into gzip and killed it with Control-C, which gave an unexpected EOF when trying to decompress the piped file. Lobate
B
40

Recovery is possible but it depends on what caused the corruption.

If the file is just truncated, getting some partial result out is not too hard; just run

gunzip < SMS.tar.gz > SMS.tar.partial

which will give some output despite the error at the end.
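To see the whole procedure end to end, here is a minimal sketch that builds a small sample archive, cuts it in half to simulate truncation, and then recovers what it can. The file names and sizes are made up for the demonstration. The redirect matters: as far as I know, gunzip run directly on a file removes its incomplete output when it hits an error, whereas writing to stdout keeps everything decoded up to that point.

```shell
# Scratch directory so nothing real is touched.
work=$(mktemp -d)
mkdir "$work/src" "$work/out"

# Build a sample archive of incompressible files, then cut it in half.
for i in 1 2 3 4 5; do
  head -c 200000 /dev/urandom > "$work/src/file$i.bin"
done
tar -czf "$work/sample.tar.gz" -C "$work" src
size=$(wc -c < "$work/sample.tar.gz")
head -c $((size / 2)) "$work/sample.tar.gz" > "$work/truncated.tar.gz"

# gunzip reports an error at the end but still writes what it decoded.
gunzip < "$work/truncated.tar.gz" > "$work/partial.tar" || true

# tar extracts the members it can read and complains about the cut-off one.
tar -xf "$work/partial.tar" -C "$work/out" || true
ls -l "$work/out/src"
```

The last file extracted is usually incomplete, so its size is worth comparing against what you expect before trusting it.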

If the compressed file has large missing blocks, it's basically hopeless after the bad block.

If the compressed file is systematically corrupted in small ways (e.g. by transferring the binary file in ASCII mode, which mangles carriage returns and newlines throughout the file), recovery is possible but requires quite a bit of custom programming. It's really only worth it if you have absolutely no other recourse (no backups) and the data is worth a lot of effort. (I have done it successfully.) I mentioned this scenario in a previous question.
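One way to test for the ASCII-mode scenario before investing any effort is to count line-feed bytes and see how many are preceded by a carriage return. This is a heuristic sketch, not a recovery tool, and count_newlines is a hypothetical helper name. In healthy compressed data, only a small fraction of LF bytes should follow a CR by chance. If every LF follows a CR, an LF-to-CRLF conversion is likely. If a large file contains no CR LF pairs at all, the opposite conversion may have happened.

```shell
# Hypothetical helper: prints "<LF count> <CRLF count>" for a file.
count_newlines() {
  od -An -v -tx1 "$1" | tr -s ' \n' '\n' |
    awk 'NF { if ($1 == "0a") { lf++; if (prev == "0d") crlf++ } prev = $1 }
         END { printf "%d %d\n", lf, crlf }'
}

# Tiny hand-made samples for the demonstration only.
tmp=$(mktemp -d)
printf 'ab\r\ncd\r\nef' > "$tmp/converted.bin"   # every LF follows a CR
printf 'ab\ncd\r\nef\n' > "$tmp/clean.bin"       # mixed, as in normal binary data

count_newlines "$tmp/converted.bin"   # prints: 2 2
count_newlines "$tmp/clean.bin"       # prints: 3 1
```

On a real archive you would run count_newlines SMS.tar.gz and compare the two numbers. Equal counts on a large file are a strong hint, not proof.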

The answers for .zip files differ somewhat, since zip archives have multiple separately compressed members, so there's more hope. (Most commercial tools are rather bogus, though: they eliminate warnings by patching CRCs, not by recovering good data.) But your question was about a .tar.gz file, which is an archive with one big member.

Barbital answered 21/10, 2008 at 18:29 Comment(1)
There will most likely be an unreadable file after this procedure. Fortunately, there is a tool to fix this and get the partial data from it too: riaschissl.bestsolution.at/2015/03/… Fisc
G
6

Here is one possible scenario that we encountered. We had a tar.gz file that would not decompress; trying to unzip it gave the error:

gzip -d A.tar.gz
gzip: A.tar.gz: invalid compressed data--format violated

I figured out that the file may have been originally uploaded over a non-binary FTP connection (we don't know for sure).

The solution was relatively simple, using the Unix dos2unix utility:

dos2unix A.tar.gz
dos2unix: converting file A.tar.gz to UNIX format ...
tar -xzvf A.tar.gz
file1.txt
file2.txt
....etc.

It worked! This is one slim possibility, and maybe worth a try - it may help somebody out there.

Gabel answered 20/9, 2013 at 11:7 Comment(0)
