Remove multiple BOMs from a file
V

5

12

I am using a JavaScript file that is a concatenation of other JavaScript files.

Unfortunately, the person who concatenated these JavaScript files together did not use the proper encoding when reading the file, and allowed a BOM for every single JavaScript file to get written to the concatenated JavaScript file.

Does anyone know a simple way to search through the concatenated file and remove any/all BOM markers?

A solution using PHP or a bash script for Mac OS X would be great.

Vital answered 1/2, 2012 at 17:54 Comment(2)
Have you tried using Notepad++? Under Encoding, select the encoding that should be there, then convert it back to UTF-8 without BOM. Catchword
What is a compiled JavaScript file? Surely you mean concatenated? Poilu
P
17

See also: Using awk to remove the Byte-order mark

To remove multiple BOMs from anywhere within a text file you can try something similar. Just leave out the ^ anchor:

perl -e 's/\xef\xbb\xbf//g;' -pi~ file.js

(This edits the file in place but keeps a backup as file.js~. The g flag removes every BOM on a line, not just the first one.)
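To check whether the substitution caught everything, it can help to count the BOMs before and after. The following is only a sketch: `count_boms` is a name made up for this example, and it assumes a grep that supports `-a` and `-o` (both GNU and BSD grep should).

```shell
# count_boms FILE: print the number of UTF-8 BOM sequences (bytes EF BB BF) in FILE.
# Hypothetical helper, not part of any existing tool.
count_boms() {
  # Build the three BOM bytes with octal escapes, which POSIX printf supports.
  bom=$(printf '\357\273\277')
  # -a treats the file as text, -o prints each match on its own line,
  # so counting lines counts occurrences. tr strips the padding some wc add.
  LC_ALL=C grep -a -o "$bom" "$1" | wc -l | tr -d ' '
}
```

Run `count_boms file.js` before the perl command and again afterwards; the second run should print 0 if every BOM is gone.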

Poilu answered 1/2, 2012 at 18:17 Comment(0)
D
18

I normally do it using vim:

vim -c "set nobomb" -c wq! myfile
Dissolvent answered 5/2, 2013 at 18:37 Comment(1)
This worked for me. I just couldn't get the sed command to strip them. Rebeccarebecka
S
1

Find files that start with a BOM:

grep -rIlo $'^\xEF\xBB\xBF' ./

Remove the BOM from those files:

grep -rIlo $'^\xEF\xBB\xBF' . | xargs sed --in-place -e 's/\xef\xbb\xbf//'

Exclude the .svn directory:

grep -rIlo --exclude-dir=".svn" $'^\xEF\xBB\xBF' . | xargs sed --in-place -e 's/\xef\xbb\xbf//'
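As far as I know, the `--in-place` long option and `\x` escapes are GNU sed features, so the commands above may not run with the sed that ships with Mac OS X. Here is a sketch that avoids both by generating the BOM bytes with printf and writing through a backup copy. The function name `strip_boms` and the `.bak` suffix are my own choices for illustration.

```shell
# strip_boms FILE: remove every UTF-8 BOM (bytes EF BB BF) from FILE.
# The original is kept as FILE.bak (a naming choice assumed for this sketch).
strip_boms() {
  # Octal escapes in printf are POSIX, so no $'...' or \x support is needed.
  bom=$(printf '\357\273\277')
  # Copy first, then let sed read the copy and overwrite the original.
  # LC_ALL=C makes sed treat the input as plain bytes.
  cp "$1" "$1.bak" &&
  LC_ALL=C sed "s/$bom//g" "$1.bak" > "$1"
}
```

Call it as `strip_boms file.js`. Because of the g flag it removes BOMs anywhere in the file, not only at the start, which is what the concatenated file in the question needs.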

Secretion answered 19/11, 2014 at 2:1 Comment(0)
V
0

I also figured out this solution which works entirely in PHP:

$packed = pack("CCC",0xef,0xbb,0xbf);
$contents = preg_replace('/'.$packed.'/','',$contents);
Vital answered 1/2, 2012 at 19:11 Comment(1)
It's probably easier to type "\xef\xbb\xbf", see double quoted string escapes. Thirddegree
E
0

I have written a bash script (see here) that works on Mac. I haven't tested it on other systems, but I suspect it should work there as well. The script also supports files and file paths that contain spaces.

Examples

Remove BOM from all files in current directory:

rmbom .

Print all files with a BOM in the current directory:

rmbom . -a

Only remove BOM from all files in current directory with extension txt or cs:

rmbom . -e txt -e cs

Print help:

rmbom -h
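For readers who just want the listing step without installing anything, here is a rough sketch of how files with a leading BOM could be found. This is not the rmbom script itself; `has_leading_bom` and `list_bom_files` are hypothetical names, and the approach (inspecting the first three bytes with od) is my assumption about one reasonable way to do it.

```shell
# has_leading_bom FILE: succeed if FILE starts with the bytes EF BB BF.
# Hypothetical helper for illustration only.
has_leading_bom() {
  # od -N3 dumps only the first three bytes as hex; tr removes od's spacing.
  [ "$(od -An -tx1 -N3 "$1" | tr -d ' \n')" = "efbbbf" ]
}

# list_bom_files DIR: print each regular file under DIR that starts with a BOM.
list_bom_files() {
  # Reading line by line keeps paths with spaces intact
  # (paths containing newlines are not handled).
  find "$1" -type f | while IFS= read -r f; do
    if has_leading_bom "$f"; then
      printf '%s\n' "$f"
    fi
  done
}
```

`list_bom_files .` prints the matching paths, one per line. Note that it only looks at the start of each file, so BOMs buried in the middle of a concatenated file would not be reported.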

Evslin answered 17/10, 2020 at 9:0 Comment(0)
