All,
There is a application which generates it's export dumps.I need to write a script that will compare the previous days dump against the latest and if there are differences among them i have to some basic manipulations of moving and deleting sort of stuff.
I have tried finding a suitable way of doing it and the method i tried was :
$var_com=diff (get-content D:\local\prodexport2 -encoding Byte) (get-content D:\local\prodexport2 -encoding Byte)
I tried the Compare-Object cmdlet as well. I notice a very high memory usage and eventually i get a message System.OutOfMemoryException
after few minutes. Has one of you done something similer ?. Some thoughts please.
There was a thread which mentioned about a has comparison which i have no idea as to how to go about.
Thanks in advance folks
Osp
Another method is to compare the MD5 hashes of the files:
$Filepath1 = 'c:\testfiles\testfile.txt'
$Filepath2 = 'c:\testfiles\testfile1.txt'
$hashes =
foreach ($Filepath in $Filepath1,$Filepath2)
{
$MD5 = [Security.Cryptography.HashAlgorithm]::Create( "MD5" )
$stream = ([IO.StreamReader]"$Filepath").BaseStream
-join ($MD5.ComputeHash($stream) |
ForEach { "{0:x2}" -f $_ })
$stream.Close()
}
if ($hashes[0] -eq $hashes[1])
{'Files Match'}
cd somewhere
and then $FilePath1 = 'testfile.txt'
) but the StreamReader doesn't pick up Powershell's change of folder and thinks it is relative to my home folder instead. The fix is to use $Filepath1 = Get-Item 'testfile.txt'
instead and then Powershell passes the correct absolute path to StreamReader. –
Rugging With PowerShell 4 you can use native commandlets to do this:
function CompareFiles {
param(
[string]$Filepath1,
[string]$Filepath2
)
if ((Get-FileHash $Filepath1).Hash -eq (Get-FileHash $Filepath2).Hash) {
Write-Host 'Files Match' -ForegroundColor Green
} else {
Write-Host 'Files do not match' -ForegroundColor Red
}
}
PS C:> CompareFiles .\20131104.csv .\20131104-copy.csv
Files Match
PS C:> CompareFiles .\20131104.csv .\20131107.csv
Files do not match
You could easily modify the above function to return a $true or $false value if you want to use this programmatically on a large scale
EDIT
After seeing this answer, I just wanted to supply larger scale version that simply returns true or false:
function CompareFiles
{
param
(
[parameter(
Mandatory = $true,
HelpMessage = "Specifies the 1st file to compare. Make sure it's an absolute path with the file name and its extension."
)]
[string]
$file1,
[parameter(
Mandatory = $true,
HelpMessage = "Specifies the 2nd file to compare. Make sure it's an absolute path with the file name and its extension."
)]
[string]
$file2
)
( Get-FileHash $file1 ).Hash -eq ( Get-FileHash $file2 ).Hash
}
You could use fc.exe. It comes with Windows. Here's how you would use it:
fc.exe /b d:\local\prodexport2 d:\local\prodexport1 > $null
if (!$?) {
"The files are different"
}
if (!$?)
and replace it with if ($LastExitCode -eq 0)
. Check out https://mcmap.net/q/16008/-lastexitcode-0-but-false-in-powershell-redirecting-stderr-to-stdout-gives-nativecommanderror and all the answers. –
Lette $null = fc.exe ...
? –
Guido Another method is to compare the MD5 hashes of the files:
$Filepath1 = 'c:\testfiles\testfile.txt'
$Filepath2 = 'c:\testfiles\testfile1.txt'
$hashes =
foreach ($Filepath in $Filepath1,$Filepath2)
{
$MD5 = [Security.Cryptography.HashAlgorithm]::Create( "MD5" )
$stream = ([IO.StreamReader]"$Filepath").BaseStream
-join ($MD5.ComputeHash($stream) |
ForEach { "{0:x2}" -f $_ })
$stream.Close()
}
if ($hashes[0] -eq $hashes[1])
{'Files Match'}
cd somewhere
and then $FilePath1 = 'testfile.txt'
) but the StreamReader doesn't pick up Powershell's change of folder and thinks it is relative to my home folder instead. The fix is to use $Filepath1 = Get-Item 'testfile.txt'
instead and then Powershell passes the correct absolute path to StreamReader. –
Rugging A while back I wrote an article on a buffered comparison routine to compare two files with PowerShell:
function FilesAreEqual {
param(
[System.IO.FileInfo] $first,
[System.IO.FileInfo] $second,
[uint32] $bufferSize = 524288)
if ($first.Length -ne $second.Length) return $false
if ( $bufferSize -eq 0 ) $bufferSize = 524288
$fs1 = $first.OpenRead()
$fs2 = $second.OpenRead()
$one = New-Object byte[] $bufferSize
$two = New-Object byte[] $bufferSize
$equal = $true
do {
$bytesRead = $fs1.Read($one, 0, $bufferSize)
$fs2.Read($two, 0, $bufferSize) | out-null
if ( -Not [System.Linq.Enumerable]::SequenceEqual($one, $two)) {
$equal = $false
}
} while ($equal -and $bytesRead -eq $bufferSize)
$fs1.Close()
$fs2.Close()
return $equal
}
You can use it by:
FilesAreEqual c:\temp\test.html c:\temp\test.html
A hash (like MD5) needs to traverse the entire file to do the hash calculation. This script returns as soon at it sees a difference in the buffer. It compares the buffer using LINQ which is faster than native PowerShell.
foreach
that contains however many files of varying sizes, is yours more optimized than the 4.0 Get-FileHash
? –
Lette $BYTES_TO_READ
to some higher value than 8. On my system reading 8 Bytes per iteration was extremely slow. I don't know what the best value is, but increasing the buffer size to 32768 (32 KB) certainly made the file compare a lot snappier. –
Keening $BYTES_TO_READ
is not enough, because inside the loop the BitConverter
calls only compare the first 8 Bytes (= one Int64
) of the buffer. After some deliberation I settled for a second, inner loop that iterates over the byte arrays and individually compares every byte. This is reasonably fast, and it's especially much faster than the ultra-slow compare-object
cmdlet. –
Keening Read
- the docs say “An implementation is free to return fewer bytes than requested even if the end of the stream has not been reached.” (see learn.microsoft.com/en-us/dotnet/api/…) so $fs1.Read(…)
and $fs2.Read(…)
could read different byte counts. It doesn’t seem to ever actually happen in practice, but it’s possible. An assert for, e.g. $bytesRead1 -eq $bytesRead2
inside the loop would at least protect against this… –
Confraternity if ( (Get-FileHash c:\testfiles\testfile1.txt).Hash -eq (Get-FileHash c:\testfiles\testfile2.txt).Hash ) {
Write-Output "Files match"
} else {
Write-Output "Files do not match"
}
© 2022 - 2024 — McMap. All rights reserved.
-Raw
parameter ofGet-Content
without any-Encoding
, comparison wents much faster and easier. – Babylonia