Methods to hex edit binary files via PowerShell

I am trying to perform a binary hex edit from the command line using only PowerShell. I have had partial success performing a hex replace with this snippet. My problem arises when 123456 occurs multiple times, since the replacement was only supposed to take place at a specific location.

Note: This snippet requires the Convert-ByteArrayToHexString and Convert-HexStringToByteArray functions shown here.

$readin = [System.IO.File]::ReadAllBytes("C:\OldFile.exe");
$hx = Convert-ByteArrayToHexString $readin -Width 40 -Delimiter "";
$hx = $hx -replace "123456","FFFFFF";
$hx = "0x" + $hx;
$writeout = Convert-HexStringToByteArray $hx;
Set-Content -Value $writeout -Encoding byte -Path "C:\NewFile.exe";

How can I specify an offset position in PowerShell instead of using this sketchy -replace command?

Dumbfound answered 5/1, 2014 at 15:28 Comment(1)
There are a lot of good answers here, but very few arrive at the door. It would have been great to see a function that takes: (1) a filename, (2) a hex-string to search for, or (3) an offset, (4) a hex-string to replace with, as input to some PowerShell function. I guess we'll have to wait... - Corder

You already have a byte array, so you could simply modify the bytes at any given offset.

$bytes  = [System.IO.File]::ReadAllBytes("C:\OldFile.exe")
$offset = 23

$bytes[$offset]   = 0xFF
$bytes[$offset+1] = 0xFF
$bytes[$offset+2] = 0xFF

[System.IO.File]::WriteAllBytes("C:\NewFile.exe", $bytes)
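
The same idea can be wrapped in a small reusable helper. The following is only a minimal sketch: the function name Set-BytesAtOffset and its parameter names are hypothetical, and it assumes the replacement is supplied as a hex string such as 'FFFFFF'. It also checks that the patch fits inside the file before writing.

```powershell
# Hypothetical helper (name and parameters are illustrative only):
# overwrites bytes at a fixed offset with the bytes given as a hex string.
function Set-BytesAtOffset {
    param(
        [Parameter(Mandatory)] [string] $Path,        # existing input file
        [Parameter(Mandatory)] [string] $Destination, # file to write
        [Parameter(Mandatory)] [int]    $Offset,      # zero-based byte offset
        [Parameter(Mandatory)] [string] $HexString    # e.g. 'FFFFFF'
    )

    if ($HexString.Length % 2 -ne 0) {
        throw 'Hex string must have an even number of digits.'
    }

    # Convert each pair of hex digits to one byte.
    $patch = [byte[]] @(
        for ($i = 0; $i -lt $HexString.Length; $i += 2) {
            [Convert]::ToByte($HexString.Substring($i, 2), 16)
        }
    )

    $bytes = [System.IO.File]::ReadAllBytes($Path)
    if ($Offset -lt 0 -or $Offset + $patch.Length -gt $bytes.Length) {
        throw 'Patch does not fit inside the file at the given offset.'
    }

    # Copy the patch over the original bytes in place, then save.
    [Array]::Copy($patch, 0, $bytes, $Offset, $patch.Length)
    [System.IO.File]::WriteAllBytes($Destination, $bytes)
}
```

It could be called as Set-BytesAtOffset -Path C:\OldFile.exe -Destination C:\NewFile.exe -Offset 23 -HexString 'FFFFFF'. Full paths are safest, because [System.IO.File] resolves relative paths against the process working directory, which may differ from PowerShell's current location.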
Symonds answered 5/1, 2014 at 15:47 Comment(0)

How can we specify an offset position into PowerShell to replace this sketchy -replace command.

Ansgar Wiechers' helpful answer addresses the offset question, and brianary's helpful answer shows a more PowerShell-idiomatic variant.

That said, it sounds like your original approach would work if you had a way to replace only the first occurrence of your search string.


First-occurrence-only string replacement:

Unfortunately, neither PowerShell's -replace operator nor .NET's String.Replace() method offers a way to limit replacement to one occurrence (or a fixed number of occurrences).

However, there is a workaround:

$hx = $hx -replace '(?s)123456(.*)', 'FFFFFF$1'
  • (?s) is an inline regex option that makes the regex metacharacter . match newlines too.

  • (.*) captures all remaining characters in capture group 1, and $1 in the replacement string references them, which effectively replaces just the first occurrence. (See this answer for more information about -replace and the syntax of the replacement operand.)

  • General caveats:

    • If your search string happens to contain regex metacharacters that you want to be taken literally, \-escape them individually or, more generally, pass the entire search term to [regex]::Escape().

    • If your replacement string happens to contain $ characters that you want to be taken literally, $-escape them or, more generally, apply -replace '\$', '$$$$' (sic) to it.
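
As an alternative to the capture-group workaround, the .NET Regex class has an instance Replace() overload that accepts a maximum match count. The following is a minimal sketch that also applies the two escaping caveats above; the sample hex string is made up for illustration.

```powershell
$hx = '00123456AA123456BB'   # sample hex string with two occurrences

# [regex]::Escape() makes the search term literal (a no-op for plain hex
# digits, shown here only for completeness).
$pattern = [regex]::Escape('123456')

# Doubling every $ makes the replacement operand literal as well.
$replacement = 'FFFFFF' -replace '\$', '$$$$'

# Regex.Replace(input, replacement, count) stops after `count` matches;
# a count of 1 limits the replacement to the first occurrence.
$result = ([regex] $pattern).Replace($hx, $replacement, 1)

$result   # -> 00FFFFFFAA123456BB
```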

However, as iRon points out, while the above generically solves the replace-only-once problem, it is not a fully robust solution, because there is no guarantee that the search string will match at a byte boundary; e.g., single-byte search string 12 would match the middle 12 in 0123, even though there is no byte 12 in the input string, composed of bytes 01 and 23.

To address this ambiguity, the input "byte string" and the search string must be constructed differently: separate the hex digits of each byte from those of the next byte with a space, as shown below.
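
The byte-boundary problem can be reproduced in a few lines. This is a small illustrative sketch using the two bytes from the example above; the variable names are arbitrary.

```powershell
$bytes = [byte[]] (0x01, 0x23)

# Without separators, the digits of adjacent bytes run together.
$packed = $bytes.ForEach('ToString', 'X2') -join ''    # '0123'
$packed -match '12'                                     # True: a false positive

# With space separators and \b anchors, only whole bytes can match.
$spaced = $bytes.ForEach('ToString', 'X2') -join ' '   # '01 23'
$spaced -match '\b12\b'                                 # False
$spaced -match '\b23\b'                                 # True
```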


Replacing byte sequences by search rather than fixed offsets:

Here's an all-PowerShell solution (PSv4+) that doesn't require third-party functionality:

Note:

  • As in your attempt, the entire file content is read at once, and it is converted to a string and back. The code uses PSv4+ syntax.

  • To construct the search and replacement strings as "byte strings" with space-separated hex. representations created from byte-array input, use the same approach as for constructing the byte string from the input as shown below, e.g.:

    • (0x12, 0x34, 0x56, 0x1).ForEach('ToString', 'X') -join ' ' -> '12 34 56 1'
      • .ForEach('ToString', 'X') is the equivalent of calling .ToString('X') on each array element and collecting the results.
    • If you prefer each byte to be consistently represented as two hex digits, even for values less than 0x10 (e.g., 01 rather than 1), use 'X2'; note, however, that this increases memory consumption.
      You'll then also have to 0-prefix single-digit byte values in the search string, e.g.:
      '12 34 56 01'

# Read the entire file content as a [byte[]] array.
# Note: Use PowerShell *Core* syntax. 
# In *Windows PowerShell*, replace `-AsByteStream` with `-Encoding Byte`
# `-Raw` ensures that the file is efficiently read as [byte[]] array at once.
$byteArray = Get-Content C:\OldFile.exe -Raw -AsByteStream

# Convert the byte array to a single-line "byte string", 
# where the whitespace-separated tokens are the hex. encoding of a single byte.
# If you want to guarantee that even byte values < 0x10 are represented as
# *pairs* of hex digits, use 'X2' instead.
$byteString = $byteArray.ForEach('ToString', 'X') -join ' '

# Perform the replacement.
# Note that since the string is guaranteed to be single-line, 
# inline option `(?s)` isn't needed.
# Also note how the hex-digit sequences representing bytes are also separated
# by spaces in the search and replacement strings.
$byteString = $byteString -replace '\b12 34 56\b(.*)', 'FF FF FF$1'

# Convert the byte string back to a [byte[]] array, and save it to the
# target file.
# Note how the array is passed as an *argument*, via parameter -Value, 
# rather than via the pipeline, because that is much faster.
# Again, in *Windows PowerShell* use `-Encoding Byte` instead of `-AsByteStream`.
[byte[]] $newByteArray = -split $byteString -replace '^', '0x'
Set-Content "C:\NewFile.exe" -AsByteStream -Value $newByteArray
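
If the round trip through a string is a concern, the search can also be done directly on the byte array. The sketch below swaps in a plain linear scan instead of the regex approach above; Find-ByteSequence is a hypothetical helper name, the sample data is made up, and the in-place overwrite assumes that the replacement has the same length as the search sequence.

```powershell
# Hypothetical helper: returns the zero-based index of the first occurrence
# of $Pattern in $Data, or -1 if there is none.
function Find-ByteSequence {
    param([byte[]] $Data, [byte[]] $Pattern)
    if ($Pattern.Length -eq 0) { return -1 }   # treat an empty pattern as "not found"
    for ($i = 0; $i -le $Data.Length - $Pattern.Length; $i++) {
        $j = 0
        while ($j -lt $Pattern.Length -and $Data[$i + $j] -eq $Pattern[$j]) { $j++ }
        if ($j -eq $Pattern.Length) { return $i }
    }
    return -1
}

$data    = [byte[]] (0x00, 0x12, 0x34, 0x56, 0xAA, 0x12, 0x34, 0x56)
$search  = [byte[]] (0x12, 0x34, 0x56)
$replace = [byte[]] (0xFF, 0xFF, 0xFF)   # same length as $search

$index = Find-ByteSequence -Data $data -Pattern $search
if ($index -ge 0) {
    # Overwrite in place; only valid for equal-length replacements.
    [Array]::Copy($replace, 0, $data, $index, $replace.Length)
}

$data.ForEach('ToString', 'X2') -join ' '   # -> 00 FF FF FF AA 12 34 56
```

Because the comparison works on whole bytes, the byte-boundary ambiguity does not arise, and only the first occurrence is touched.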
Benjaminbenji answered 3/8, 2019 at 21:45 Comment(4)
The method posted over here, closed for being a duplicate, is faster and uses less memory. #57337393 - Neighbor
@js2010: I assume you're referring to your own answer there: (a) the question is still a duplicate and (b) your answer shows how to replace a single byte value, however often it appears in the file (whereas the original question was completely open-ended). I suggest you recreate your answer here and amend it to meet this question's specific requirements, pointing out that dealing with decimal values allows for a shorter and more efficient solution. If you also add -Raw to the Get-Content call and fix the clumsy -as 'byte[]', you'll have my up-vote. - Benjaminbenji
Not my answer; it's the one given by the person who first asked the question. - Neighbor
@js2010: It is never appropriate to edit an answer into a question. If you feel that there's something noteworthy there (which isn't obvious to me at first glance, given the very specific CSV-based code in there, and the fact that the whole file is still read into memory, along with inefficient pipeline-based code, and an explicit warning about memory use), encourage the author to post an answer, preferably here. - Benjaminbenji

As far as I can tell from the question, there is no need to do any hexadecimal conversion on a byte stream to do a replacement. You can just do a replacement on a list of decimal values (the default string conversion), where the values are bounded by spaces (word boundaries), e.g.:
(I am skipping the file input/output which is already explained in the answer from @mklement0)

$bInput = [Byte[]](0x69, 0x52, 0x6f, 0x6e, 0x57, 0x61, 0x73, 0x48, 0x65, 0x72, 0x65)
$bOriginal = [Byte[]](0x57, 0x61, 0x73, 0x48)
$bSubstitute = [Byte[]](0x20, 0x77, 0x61, 0x73, 0x20, 0x68)
$bOutput = [Byte[]]("$bInput" -Replace "\b$bOriginal\b", "$bSubstitute" -Split '\s+')

In case you like to use hexadecimal strings (e.g. for the replace arguments), you can convert a hex string to a byte array as follows: [Byte[]]('123456' -Split '(..)' | ? { $_ } | % {[Convert]::toint16($_, 16)})
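
Spelled out with full cmdlet names, the conversion might look like the sketch below. It assumes [Convert]::ToByte() as a more direct substitute for ToInt16(), since the target type is a byte; the variable names are arbitrary.

```powershell
$hex = '123456'

# Split the string into two-character chunks (the capture group keeps them),
# drop the empty strings between chunks, and parse each chunk as a base-16 byte.
$bytes = [byte[]] ($hex -split '(..)' |
    Where-Object { $_ } |
    ForEach-Object { [Convert]::ToByte($_, 16) })

"$bytes"   # -> 18 52 86
```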

Note that this solution supports $bOriginal and $bSubstitute values of different lengths. If you want to start replacing from a specific offset, you can use the Select-Object cmdlet:

$Offset = 3
$bArray = $bInput | Select -Skip $Offset
$bArray = [Byte[]]("$bArray" -Replace "\b$bOriginal\b", "$bSubstitute" -Split '\s+')
$bOutput = ($bInput | Select -First $Offset) + $bArray
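
The same offset-based replacement can be sketched with array slicing instead of Select-Object, which avoids the pipeline. This is an illustrative variant, not a drop-in replacement: the $head and $tail variable names are made up, and the index ranges assume an offset between 1 and the array length minus 1.

```powershell
$bInput      = [Byte[]](0x69, 0x52, 0x6f, 0x6e, 0x57, 0x61, 0x73, 0x48, 0x65, 0x72, 0x65)
$bOriginal   = [Byte[]](0x57, 0x61, 0x73, 0x48)
$bSubstitute = [Byte[]](0x20, 0x77, 0x61, 0x73, 0x20, 0x68)
$Offset      = 3   # must be between 1 and $bInput.Length - 1 for the slices below

# Slice with index ranges instead of piping through Select-Object.
$head = $bInput[0..($Offset - 1)]
$tail = $bInput[$Offset..($bInput.Length - 1)]

# Replace within the tail only, then reassemble the full array.
$tail    = [Byte[]]("$tail" -replace "\b$bOriginal\b", "$bSubstitute" -split '\s+')
$bOutput = [Byte[]]($head + $tail)

[System.Text.Encoding]::ASCII.GetString($bOutput)   # -> iRon was here
```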
Ormandy answered 5/8, 2019 at 7:22 Comment(1)
+1 for the clever implicit (decimal) stringification; it results in a larger string, but that probably won't matter. Do note, however, that the use of -replace still bears the risk of unintentionally replacing multiple occurrences - which was the OP's initial problem (though not the only one, due to the byte-boundary issues you pointed out). Also, instead of Select-Object I'd use array slicing, because that will make a noticeable difference in performance. - Benjaminbenji

The most PowerShell-idiomatic way would probably be:

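# Windows PowerShell syntax; in PowerShell (Core) 6+, use -AsByteStream instead of -Encoding Byte.
$offset = 0x3C
[byte[]]$bytes = Get-Content C:\OldFile.exe -Encoding Byte -Raw

# Overwrite three consecutive bytes; $offset++ advances the index after each of the first two writes.
$bytes[$offset++] = 0xFF
$bytes[$offset++] = 0xFF
$bytes[$offset] = 0xFF

# The leading comma wraps the array so that it travels through the pipeline as a single object.
,$bytes | Set-Content C:\NewFile.exe -Encoding Byte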
Treadwell answered 6/6, 2018 at 16:46 Comment(0)
