Validation error: "Byte-Order Mark found in UTF-8 File"

Asked 8/8, 2011 at 18:30 Answered 8/8, 2011 at 18:40

Solved html utf-8 w3c-validation byte-order-mark

I'm working on a website and, while displaying it on Firefox is fine, on Internet Explorer I've got a lot of problems. I used the W3C validator and I got a lot of strange errors.

Here's the link to the website: http://misenplacecatering.it/

The first validation error, which I think is the most relevant, is this:

Byte-Order Mark found in UTF-8 File. The Unicode Byte-Order Mark (BOM) in UTF-8 encoded files is known to cause problems for some text editors and older browsers. You may want to consider avoiding its use until it is better supported.

and

Line 1, Column 1: Non-space characters found without seeing a doctype first. Expected <!DOCTYPE html>.

<!DOCTYPE HTML>

I've read other questions about this issue, so I tried to open the file with different editors (I always use Vim, anyway), but I don't see any space or anything else before the doctype definition. I even used Notepad++ and used an option to remove the BOM, but nothing.

How can I fix it?

Redeem answered 8/8, 2011 at 18:30 Comment(5)

I wouldn't care too much about the second error as long as you haven't removed the first one. Your page indeed has three extra bytes (EF BB BF) at the beginning of the file that serve as the BOM. Remove these three bytes and try again. – Malonis 8/8, 2011 at 18:45

Never use BOMs in UTF-8. It’s Yet Another Microsoft Bug. – Ninebark 8/8, 2011 at 20:4

@Ninebark - I'd welcome seeing you expand on your viewpoint by adding an answer to the quite popular question What's different between utf-8 and utf-8 without BOM?. – Halfwit 1/10, 2014 at 21:12

@Halfwit It’s kind of hopeless. UTF-8 does not have a BOM. Microsoft has decided to lie to people the way they already do when they call "Unicode" an encoding, which it is not. In the same way, what they call UTF-8 is actually UTF-8 with an extraneous U+FEFF ZERO WIDTH NO-BREAK SPACE prefixing the real data. What they are calling "UTF-8 without BOM" is real UTF-8. So they have millions of people getting everything backwards and making files that by default contain something other than what they pretend to contain. It is a total botch. – Ninebark 2/10, 2014 at 0:29

@Ninebark - I found it interesting that none of the answers to What's different between utf-8 and utf-8 without BOM? specifically cited Microsoft as a key influence in popularizing the use of the BOM with UTF-8. However, Wikipedia's article on the BOM does indeed cite Microsoft and I used that article as the basis for writing my own answer. – Halfwit 2/10, 2014 at 20:39

If using Notepad++, use Convert to UTF-8 without BOM.

If you are using PHP, make sure that every included/required file is either ASCII or UTF-8 without a BOM. PHP does not strip a BOM: the bytes are sent as output ahead of everything else, so a BOM in any include can end up in front of your doctype (this one gave me a headache once).

You could try converting your files to ASCII, if you don't need any non-ASCII characters.

In your <meta charset> tag, try writing the attribute value within quotes.
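Since the BOM may sit in an include rather than in the page you are looking at, it can help to scan the whole project instead of opening files one by one. Here is a minimal Python sketch of that idea. The function names and the default file patterns are my own assumptions for illustration, not part of any tool mentioned above; it only checks for the UTF-8 BOM bytes EF BB BF.

```python
import codecs
from pathlib import Path


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM bytes EF BB BF."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def strip_utf8_bom(path):
    """Rewrite the file without its leading BOM; return True if one was removed."""
    data = Path(path).read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    Path(path).write_bytes(data[len(codecs.BOM_UTF8):])
    return True


def find_bom_files(root, patterns=("*.php", "*.html")):
    """Recursively list files under root that begin with a UTF-8 BOM."""
    root = Path(root)
    return sorted(
        p
        for pattern in patterns
        for p in root.rglob(pattern)
        if p.is_file() and has_utf8_bom(p)
    )
```

Calling `find_bom_files(".")` from the site root lists the offending files, and `strip_utf8_bom(path)` fixes one in place. Back up the files first, since the rewrite is not reversible.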

Pons answered 8/8, 2011 at 18:40 Comment(3)

Thanks, the problem was actually in another PHP file, one that isn't even used on the homepage. Anyway, I solved it by saving that file in Notepad++ without a BOM. – Redeem 8/8, 2011 at 19:22

Glad it helped. I once had this problem in a third-level include, and I won't forget it. – Pons 8/8, 2011 at 19:30

It helped me a lot. I had an include file that was saved with a BOM, and that was causing the problem. – Betty 8/3, 2015 at 22:38

The free text editor PSPad has a hex editing mode which is very handy for seeing exactly what you really have in your text files.
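If you don't have a hex editor at hand, a few lines of Python give roughly the same view of the start of a file. This is only a sketch, and `head_hex` is a name I made up for it. A file saved with a UTF-8 BOM starts with `EF BB BF`; a clean HTML file starts directly with `3C 21` (the `<!` of the doctype).

```python
def head_hex(path, count=16):
    """Return the first `count` bytes of a file as space-separated hex pairs."""
    with open(path, "rb") as f:
        return " ".join(f"{b:02X}" for b in f.read(count))
```

For example, `print(head_hex("index.html"))`, where `index.html` stands for whichever file you want to inspect.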

Uprush answered 8/8, 2011 at 18:32 Comment(0)
