How to compare the content of a tarball with a folder
D

7

39

How can I compare a tar file (already compressed) of the original folder with the original folder?

First I created the archive file using

tar -kzcvf directory_name.zip directory_name

Then I tried to compare using

tar -diff -vf directory_name.zip directory_name

But it didn't work.

Dynamotor answered 8/2, 2012 at 3:34 Comment(1)
Just try the "-d" option instead of -diff. Use "-dvf" in your case. - Katti
T
57

--compare (-d) is more handy for that.

tar --compare --file=archive-file.tar

works if you run it from the directory in which archive-file.tar was created, because tar resolves member names relative to the current directory. To compare archive-file.tar against a tree located elsewhere (e.g. if the files to check are under /some/where/), use the -C option:

tar --compare --file=archive-file.tar -C /some/where/

If you want to see tar working, use -v; without -v, only differences and errors (such as missing files or folders) are reported.

Tip: This works with compressed tar.bz / tar.gz archives, too.
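To see the -C form in action, here is a minimal sketch that assumes GNU tar and bash. The scratch directory and the names project, a.txt and b.txt are made up for the demonstration; nothing here comes from the original question.

```shell
#!/bin/bash
# Sketch only: assumes GNU tar. All paths and file names are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/src/project"
echo "hello" > "$work/src/project/a.txt"
echo "world" > "$work/src/project/b.txt"

# Create a gzip-compressed archive whose member names start with "project/".
tar -czf "$work/project.tar.gz" -C "$work/src" project

# Compare while the tree is still unchanged; stderr is captured as well.
clean_report=$(LC_ALL=C tar --compare --file="$work/project.tar.gz" -C "$work/src" 2>&1)

# Change the size of one file and delete another, then compare again.
echo "hello, modified" > "$work/src/project/a.txt"
rm "$work/src/project/b.txt"
dirty_report=$(LC_ALL=C tar --compare --file="$work/project.tar.gz" -C "$work/src" 2>&1)

printf 'Report for the unchanged tree: [%s]\n' "$clean_report"
printf 'Report after the changes:\n%s\n' "$dirty_report"
```

The first report should be empty, and the second should mention both the modified and the deleted file. LC_ALL=C is only there to keep the messages in English.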

Trappist answered 31/1, 2014 at 17:45 Comment(4)
This is the answer, and it answers a more general and important question: "How to compare the content of a tarball with a folder", which includes this question. So I think the question should be rephrased and this answer accepted. - Nanji
By the way, do you know how to get rid of the GID and UID comparison? - Nanji
I get "Warning: Cannot stat: No such file or directory" either which way I try. - Orton
Important: This does not report new files in the directory that are not present in the tar file. - Wernick
C
12

It should be --diff

Try this (without the last directory_name):

tar --diff -vf directory_name.zip

The problem is that the --diff option only looks for differences in files that exist in the tar file. So, if a new file is added to the folder, --diff does not report it.
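One way to catch files that were added after archiving is to compare the archive listing with a find listing. This is only a sketch: the scratch directory and the names old.txt and new.txt are hypothetical, and it assumes file names without newlines or other characters that tar would escape in its listing.

```shell
#!/bin/bash
# Sketch only: all paths and file names are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/directory_name"
echo one > "$work/directory_name/old.txt"
tar -czf "$work/directory_name.tar.gz" -C "$work" directory_name
echo two > "$work/directory_name/new.txt"   # added after archiving

# Member names in the archive; strip the trailing "/" of directory entries.
tar -tzf "$work/directory_name.tar.gz" | sed 's,/$,,' | LC_ALL=C sort > "$work/in_archive.txt"

# Paths on disk, relative to the same base directory.
(cd "$work" && find directory_name) | LC_ALL=C sort > "$work/on_disk.txt"

# comm -13 prints only the lines that are unique to the second file.
only_on_disk=$(LC_ALL=C comm -13 "$work/in_archive.txt" "$work/on_disk.txt")
printf 'Present on disk but not in the archive:\n%s\n' "$only_on_disk"
```

Swapping -13 for -23 would list the opposite case, members that are in the archive but missing on disk.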

Cazares answered 14/2, 2013 at 12:59 Comment(1)
Does this command report differences in file metadata, i.e. ownership, permissions, symlink targets, device node properties, etc.? I'm planning a backup of a Linux root filesystem, and I want to make sure all this metadata is correct. - Interrelation
A
7

To ignore differences in some or all of the metadata (user, time, permissions), you can pipe the result to awk:

tar --compare --file=archive-file.tar -C /some/where/ | awk '!/Mode/ && !/Uid/ && !/Gid/ && !/time/'

That should output only the true differences between the tar and the directory /some/where/.
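One caveat with the patterns above: they match anywhere in the line, so a file whose name happens to contain "time" or "Mode" is dropped as well. A slightly stricter sketch anchors the pattern to the message at the end of the line. It assumes GNU tar's English messages (hence LC_ALL=C), and the file names are invented for the demonstration.

```shell
#!/bin/bash
# Sketch only: assumes GNU tar message wording; file names are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/tree"
echo data > "$work/tree/timeline.txt"   # the name contains "time" on purpose
echo data > "$work/tree/touched.txt"
tar -cf "$work/tree.tar" -C "$work" tree

echo changed-content > "$work/tree/timeline.txt"   # real content change
touch -t 200101010000 "$work/tree/touched.txt"     # metadata-only change

# Drop only lines that END in one of the metadata messages.
filtered=$(LC_ALL=C tar --compare --file="$work/tree.tar" -C "$work" 2>&1 |
  grep -Ev ': (Mode|Uid|Gid|Mod time) differs$')
printf '%s\n' "$filtered"
```

With this filter the content change in timeline.txt should still be reported, while the file that was only touched should disappear from the output.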

Athodyd answered 5/9, 2020 at 3:51 Comment(0)
L
4

The method of pix is very slow for large compressed tar files, because it extracts each file individually. Instead, I use tar --diff to find the files whose modification time differs, then extract and diff only those. The files are extracted into a folder base.orig, where base is either the top-level folder of the tar file or the given comparison folder. As a result, the diffs include the date of the original file.

Here is the script:

#!/bin/bash
set -o nounset

# Print usage

if [ "$#" -lt 1 ] ; then
  echo 'Diff a tar (or compressed tar) file with a folder'
  echo 'difftar-folder.sh <tarfile> [<folder>] [strip]'
  echo default for folder is .
  echo default for strip is 0.
  echo 'strip must be 0 or 1.'
  exit 1
fi

# Parse parameters

tarfile=$1

if [ "$#" -ge 2 ] ; then
  folder=$2
else
  folder=.
fi

if [ "$#" -ge 3 ] ; then
  strip=$3
else
  strip=0
fi

# Get path prefix if --strip is used

if [ "$strip" -gt 0 ] ; then
  prefix=$(tar -t -f "$tarfile" | head -n 1)
else
  prefix=
fi

# Original folder

if [ "$strip" -gt 0 ] ; then
  orig=${prefix%/}.orig
elif [ "$folder" = "." ] ; then
  orig=${tarfile##*/}
  orig=./${orig%%.tar*}.orig
elif [ "$folder" = "" ] ; then
  orig=${tarfile##*/}
  orig=${orig%%.tar*}.orig
else
  orig=$folder.orig
fi
echo "$orig"
mkdir -p "$orig"


# Make sure tar uses english output (for Mod time differs)
export LC_ALL=C

# Search all files with a deviating modification time using tar --diff
tar --diff -a -f "$tarfile" --strip-components "$strip" --directory "$folder" | grep "Mod time differs" | while IFS= read -r file ; do
  # Substitute ': Mod time differs' with nothing
  file=${file/: Mod time differs/}
  # Check if file exists
  if [ -f "$folder/$file" ] ; then
    # Extract original file
    tar -x -a -f "$tarfile" --strip-components "$strip" --directory "$orig" "$prefix$file"
    # Compute diff
    diff -u "$orig/$file" "$folder/$file"
  fi
done
Leishmaniasis answered 7/10, 2016 at 9:10 Comment(0)
M
1

I recently needed a better compare than what "tar --diff" produced so I made this short script:

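#!/bin/bash
# Usage: pass the tar file as $1 and run from the directory the archive was created in.
tar tf "$1" | while IFS= read -r ; do
  # Skip directory entries (member names ending in "/")
  if [ "${REPLY%/}" = "$REPLY" ] ; then
    # Stream the archived copy to stdout and diff it against the file on disk
    tar xOf "$1" "$REPLY" | diff -u - "$REPLY"
  fi
done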
Marley answered 11/9, 2014 at 14:38 Comment(1)
@staticx $REPLY is created by the read command (in the while test). It contains the full line, so in this case it is the current filename from the tar t command. - Marley
D
0

You may use tar's --compare (--diff, -d) option. You have to take some care, because tar compares only the files specified on the command line, and only those that also exist inside the archive. For example, files newly added to the directory are not reported. Often I prefer the approach of pix, which gives more control.

However, unlike pix and Michael Soegtrop, I do not think you have to extract any file.

The following code tests tar's ability to compare files.

touch refF; setTM12 () { touch -r refF F1 F2; };

# create the files
echo a1a > F1; echo a2a > F2; echo a3a>F3; echo a4a>F4; setTM12;

tar cf tarF F1 F2 F3 F4;

# do not change times of F1 F2
# modify F1 F2 F3, change the mtime of F4
echo mod > F1; echo longer > F2; setTM12;
sleep 2; echo XXX > F3; touch F4;

tar -df tarF F1 F2 F3 F4

F1: Contents differ
F2: Size differs
F3: Mod time differs
F3: Contents differ
F4: Mod time differs

Note that Size differs implies that the contents differ as well, as for F2; tar does not additionally print Contents differ in that case.

-v is a handy option, too.

tar -vdf tarF F1 F2 F3 F4
F1
F1: Contents differ
F2
F2: Size differs          <--- Means that the Contents differ, too !
F3
F3: Mod time differs
F3: Contents differ
F4
F4: Mod time differs
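If you only need a yes/no answer in a script, the exit status can be used instead of parsing the messages. The GNU tar manual documents exit status 1 as "some files differ"; the sketch below assumes GNU tar and runs in a throwaway directory, reusing the names F1 and tarF from above.

```shell
#!/bin/bash
# Sketch only: assumes GNU tar exit codes (0 = no differences, 1 = some files differ).
work=$(mktemp -d)
cd "$work" || exit 2
echo a1a > F1
tar -cf tarF F1

# Unchanged file: no differences expected.
tar -df tarF F1 > /dev/null 2>&1
same_status=$?

# Change the size of F1, then compare again.
echo longer-content > F1
tar -df tarF F1 > /dev/null 2>&1
diff_status=$?

echo "exit status before the change: $same_status"
if [ "$diff_status" -ne 0 ]; then
  echo "F1 no longer matches the archive (tar exit status $diff_status)"
fi
```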
Dg answered 8/8, 2023 at 9:51 Comment(0)
M
-1

The easy way is to write:

  • tar df file: this compares the archive with the current working directory and tells us whether any of the files differ or have been removed.
  • tar df file -C path/folder: this compares the archive with the given folder.
Millpond answered 17/2, 2022 at 4:7 Comment(0)
