I am trying to manipulate a text file and remove non-ASCII characters from the text. I don't want to remove the line. I only want to remove the offending characters. I am trying to get the following expression to work:
sed '/[\x80-\xFF]/d'
The suggested solutions may fail with specific versions of sed, e.g. GNU sed 4.2.1.
Using tr:
tr -cd '[:print:]' < yourfile.txt
This will remove any characters not in [\x20-\x7e].
If you want to keep e.g. line feeds, just add \n:
tr -cd '[:print:]\n' < yourfile.txt
If you really want to keep all ASCII characters (even the control codes):
tr -cd '[:print:][:cntrl:]' < yourfile.txt
This will remove any characters not in [\x00-\x7f].
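As a rough check of the three commands above, the sketch below runs them on a small sample and compares the results. The sample string and the helper name sample are made up for illustration, and LC_ALL=C is an assumption added here so that the character classes cover plain ASCII only.

```shell
#!/bin/sh
# Hypothetical sample: "café" in UTF-8 (é is the two bytes 0xC3 0xA9),
# then a tab, "ok", and a line feed. That is 9 bytes in total.
sample() { printf 'caf\303\251\tok\n'; }

all_bytes=$(sample | wc -c | tr -d ' ')

# Printable ASCII only: the tab and the line feed are dropped as well.
printable=$(sample | LC_ALL=C tr -cd '[:print:]')

# Printable ASCII plus line feeds: 6 bytes remain ("cafok" and the line feed).
keep_lf=$(sample | LC_ALL=C tr -cd '[:print:]\n' | wc -c | tr -d ' ')

# All ASCII, control codes included: only the two bytes of é are removed.
keep_ascii=$(sample | LC_ALL=C tr -cd '[:print:][:cntrl:]' | wc -c | tr -d ' ')

printf '%s %s %s %s\n' "$all_bytes" "$printable" "$keep_lf" "$keep_ascii"
# prints: 9 cafok 6 7
```

Note that the first variant also strips the tab and the line feed, so the whole file ends up on a single line.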
tr -cd '\11\12\15\40-\176'
which worked with meld (at least with my files) – Kerf
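In the octal set above, \11 is tab, \12 is line feed, \15 is carriage return, and \40-\176 is space through tilde. Since tr cannot edit a file in place, a minimal sketch of applying it writes the result to a second file. The file names and sample bytes here are hypothetical, and LC_ALL=C is again an added assumption.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, used only for this illustration.
tmpdir=$(mktemp -d)

# Sample input, 15 bytes: "café" (UTF-8), tab, "ok", CR LF, a 0x01 control byte, "end", LF.
printf 'caf\303\251\tok\r\n\001end\n' > "$tmpdir/input.txt"

# Keep tab, LF, CR and printable ASCII; delete every other byte.
LC_ALL=C tr -cd '\11\12\15\40-\176' < "$tmpdir/input.txt" > "$tmpdir/clean.txt"

before=$(wc -c < "$tmpdir/input.txt" | tr -d ' ')
after=$(wc -c < "$tmpdir/clean.txt" | tr -d ' ')
rm -r "$tmpdir"

printf '%s -> %s bytes\n' "$before" "$after"
# prints: 15 -> 12 bytes (the two bytes of é and the 0x01 byte are gone)
```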