This is an enhancement of wef’s answer.
We can remove the issue of the special meaning of various special characters
and strings (^, ., [, *, $, \(, \), \{, \}, \+, \?, &, \1, …, whatever,
and the / delimiter) by removing the special characters.
Specifically, we can convert everything to hex;
then we have only 0-9 and a-f to deal with.
This example demonstrates the principle:
$ echo -n '3.14' | xxd
0000000: 332e 3134                                3.14
$ echo -n 'pi' | xxd
0000000: 7069                                     pi
$ echo '3.14 is a transcendental number.  3614 is an integer.' | xxd
0000000: 332e 3134 2069 7320 6120 7472 616e 7363  3.14 is a transc
0000010: 656e 6465 6e74 616c 206e 756d 6265 722e  endental number.
0000020: 2020 3336 3134 2069 7320 616e 2069 6e74    3614 is an int
0000030: 6567 6572 2e0a                           eger..
$ echo "3.14 is a transcendental number.  3614 is an integer." | xxd -p \
                                     | sed 's/332e3134/7069/g' | xxd -p -r
pi is a transcendental number.  3614 is an integer.
whereas, of course, sed 's/3.14/pi/g' would also change 3614.
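The hex strings in that example were copied by hand from the xxd output. As a minimal sketch, they could be derived from shell variables instead. The helper name to_hex and the variables search and replacement are assumptions made for illustration, and this is still the simplified form of the idea:

```shell
# Hypothetical helper: hex-encode stdin as one unbroken line.
# xxd -p wraps its output every 30 bytes, so the newlines are removed.
to_hex() {
    xxd -p | tr -d '\n'
}

search='3.14'
replacement='pi'
search_hex=$(printf '%s' "$search" | to_hex)
replacement_hex=$(printf '%s' "$replacement" | to_hex)

# The pattern now contains only hex digits, so sed sees no metacharacters.
result=$(echo '3.14 is a transcendental number.  3614 is an integer.' \
    | to_hex | sed "s/$search_hex/$replacement_hex/g" | xxd -p -r)
printf '%s\n' "$result"
```

Using printf '%s' rather than echo keeps a trailing newline out of the search and replacement strings.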
The above is a slight oversimplification; it doesn’t account for boundaries.
Consider this (somewhat contrived) example:
$ echo -n 'E' | xxd
0000000: 45                                       E
$ echo -n 'g' | xxd
0000000: 67                                       g
$ echo '$Q Eak!' | xxd
0000000: 2451 2045 616b 210a                      $Q Eak!.
$ echo '$Q Eak!' | xxd -p | sed 's/45/67/g' | xxd -p -r
&q gak!
Because $ (24) and Q (51) combine to form 2451,
the s/45/67/g command rips it apart from the inside.
It changes 2451 to 2671, which is &q (26 + 71).
We can prevent that by separating the bytes of data in the search text,
the replacement text and the file with spaces.
Here’s a stylized solution:
encode() {
    xxd -p -- "$@" | sed 's/../& /g' | tr -d '\n'
}
decode() {
    xxd -p -r -- "$@"
}
left=$( printf '%s' "$search" | encode)
right=$(printf '%s' "$replacement" | encode)
encode file.txt | sed "s/$left/$right/g" | decode
I defined an encode function because I used that functionality three times,
and then I defined decode for symmetry.
If you don’t want to define a decode function, just change the last line to
encode file.txt | sed "s/$left/$right/g" | xxd -p -r
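To see the boundary fix in action, the pipeline can be wrapped in a function and run on the earlier $Q Eak! input. This is only a sketch: the function name literal_replace and its argument order are assumptions, not part of the answer above. It filters stdin, and it repeats the encode definition so that it runs on its own.

```shell
encode() {
    xxd -p -- "$@" | sed 's/../& /g' | tr -d '\n'
}

# Hypothetical wrapper: literal_replace SEARCH REPLACEMENT, filtering stdin.
# Both strings are treated literally; SEARCH is assumed to be non-empty.
literal_replace() {
    left=$(printf '%s' "$1" | encode)
    right=$(printf '%s' "$2" | encode)
    encode | sed "s/$left/$right/g" | xxd -p -r
}

echo '$Q Eak!' | literal_replace 'E' 'g'    # prints: $Q gak!
```

Because every encoded byte is followed by a space, the pattern "45 " can only match a whole byte, so the 2451 pair is left alone.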
Note that the encode function triples the size of the data (text)
in the file, and then sends it through sed as a single line,
without even having a newline at the end.
GNU sed seems to be able to handle this;
other versions might not be able to.
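The size claim is easy to check: each input byte becomes two hex digits plus a space. A small sketch, repeating the encode definition so that it runs on its own:

```shell
encode() {
    xxd -p -- "$@" | sed 's/../& /g' | tr -d '\n'
}

printf 'abc' | wc -c             # 3 bytes of input
printf 'abc' | encode | wc -c    # 9 bytes: "61 62 63 "
printf 'abc' | encode | wc -l    # 0: the encoded stream has no newline
```

Some wc implementations pad the count with leading spaces, so the printed numbers may be indented.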
As an added bonus, this solution handles multi-line search and replace
(in other words, search and replacement strings that contain newline(s)).
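As a sketch of the multi-line case, the search string below spans a line break. The variable values are made up for illustration, and the encode definition is repeated so that the snippet is self-contained.

```shell
encode() {
    xxd -p -- "$@" | sed 's/../& /g' | tr -d '\n'
}

# The search string contains an embedded newline (byte 0a once encoded).
search='one
two'
replacement='1+2'

left=$(printf '%s' "$search" | encode)
right=$(printf '%s' "$replacement" | encode)

result=$(printf 'zero\none\ntwo\nthree\n' \
    | encode | sed "s/$left/$right/g" | xxd -p -r)
printf '%s\n' "$result"
```

This works because the newline is just another encoded byte by the time sed sees it, and the whole input is a single line.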
sed, see this previous discussion: Is it possible to escape regex metacharacters reliably with sed – Oisin