Why does Opera parse my web page as XML?
M

10

7

I just tried viewing my website http://www.logmytime.de/ in Opera (version 10.50). It gives me an "XML parsing failed" error and refuses to display the web page.

I can choose to "Reparse the document as HTML" and then the page works fine, but that's hardly a solution to my problem.

The weird thing is that the error still occurs after setting an HTML (instead of XHTML) doctype:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
          "http://www.w3.org/TR/html4/loose.dtd">

I checked the source output from the browser to make sure I did not make any mistake with the doctype. I even viewed the same web page in Firebug, and it shows a Content-Type of text/html.

So, why does Opera still try to parse my web page as XML?

Thanks,

Adrian

Edit: Just to clarify: I am not asking what the error on my web page is. I understand why this is not valid XHTML. However, I am also using the JavaScript micro-templating engine, and its templates are never valid XML, which is why I need the browser to parse my entire web site as HTML, not XHTML. In order to demonstrate this, I just inserted an example template into the web page.

<script type="text/html" id="StopWatchTemplate" > 

<h1><a href="#" onclick="TimeEntriesList.EditTimeEntry('<#=timeEntryID#>')"><#=currentlyRunning?"Aktueller":"Letzter"#> Stoppuhr-Zeiteintrag</a></h1>
<%-- Stoppuhr - Ende--%>

</script>

When opening the page in Opera, you can see that the template now produces XML parsing errors even though the doctype for the page is still HTML.

Edit 2:: Just to make this even clearer: I am not asking why my web page is not valid XHTML. I am asking why Opera tries to parse it as XHTML despite the HTML doctype.

Edit3:: Please do not post any more answers, I have found the cause of this and documented it below.

Maemaeander answered 4/5, 2010 at 15:49 Comment(7)
Is there something wrong with your markup (ie tags not closed properly?) I'm trying to run it through the w3 validator but it's not loading for me right now.Errick
"In order to demonstrate this, I just inserted an example template into the web page. " - what exactly and where have you inserted?Arriaga
Why would you possibly want to intentionally produce invalid documents?Arriaga
@M28, but XHTML IS XML and thus any parsing error should make the browser bark.Hornpipe
@M28: No, but XHTML is always XML.Maemaeander
Your webpage isn't valid HTML, either, so even if you figure out how to get it parsed as HTML, this is still not going to fix the problem. The problem is, your webpage is broken. The solution is to fix it. It's really that simple. In fact, you could probably have fixed it ten times, just in the time you spent writing your comments.Camphor
How exactly do you propose to fix the html template from my edit above? You might want to have a look at this web page to understand what the javascript microtemplating engine is: ejohn.org/blog/javascript-micro-templatingMaemaeander
M
5

In case someone else has the same problem: As suggested by DeveloperArt it can be fixed with a simple ContentType="text/html" attribute in the page element.

Edit: The problem was in fact caused by a bug with the mobile.Browser file I am using in my web project. The workaround above works, but it is not really necessary in my case. See this answer for more details.

Maemaeander answered 4/5, 2010 at 22:47 Comment(2)
I'm working on documenting this as well. Do you remember which version of the MDBF you were using?Parotic
@Scott Hanselman: I can't really say anymore since I removed the MDBF from my project in favor of 51Degrees.mobi and switched version control to Mercurial a few months ago. However, I am pretty sure I downloaded the MDBF after August last year.Maemaeander
K
13

Your document is not a valid HTML document. So, the browser should reject it. Unfortunately, due to a historic accident, most browsers do not reject invalid documents, but rather try to fix them (usually with pretty crappy results), so that the author never even notices that his document is broken.

Thankfully, with XHTML, the browser vendors decided to fix that, and actually reject invalid documents. In your case, you are delivering your document as XHTML with the application/xhtml+xml MIME type:

# curl --head http://www.logmytime.de/
HTTP/1.1 200 OK
Cache-Control: private
Content-Length: 12529
Content-Type: application/xhtml+xml; charset=utf-8
              ^^^^^^^^^^^^^^^^^^^^^
Server: Microsoft-IIS/7.5
X-AspNetMvc-Version: 2.0
X-AspNet-Version: 2.0.50727
Set-Cookie: Referrer=None; path=/
X-Powered-By: ASP.NET
Date: Tue, 04 May 2010 16:08:40 GMT

So, the browser rejects your document (as it should). When you switch over to HTML, then it tries to fix your broken HTML.

Now, you have changed your DOCTYPE to HTML 4.01, but you are still delivering it as XHTML. All you have achieved now is that there are two reasons for the browser to reject your document: it's still invalid because you haven't fixed the actual bug and the DOCTYPE and the MIME type don't match up.
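
To make the point concrete, here is a minimal sketch of the decision a browser makes, assuming a simplified model in which only the Content-Type header is consulted. The function name `parser_for` and the set of XML types are made up for illustration and are not taken from any browser's source.

```python
# Illustrative model only: the parser is chosen from the HTTP Content-Type
# header. The DOCTYPE inside the document is never consulted here.
XML_MIME_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parser_for(content_type_header: str) -> str:
    """Return "xml", "html" or "other" for a raw Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type_header.split(";", 1)[0].strip().lower()
    if mime == "text/html":
        return "html"
    if mime in XML_MIME_TYPES or mime.endswith("+xml"):
        return "xml"
    return "other"


# The header from the curl output above selects the strict XML parser,
# while the header Firebug reported selects the lenient HTML parser.
print(parser_for("application/xhtml+xml; charset=utf-8"))
print(parser_for("text/html; "))
```

Under this model, changing the DOCTYPE alone cannot change the outcome, which matches what the asker observed.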

Instead of mucking around with DOCTYPEs and MIME types in order to get the browser to parse your broken document, the correct way to solve this problem would be to simply fix the invalid markup and remove the extraneous class attribute on line 172. [BTW: who wrote that document? The indentation and formatting is awful.]

Kakapo answered 4/5, 2010 at 16:25 Comment(10)
+1 for stealing my answer. The document looks auto-generated to me, but that is a little weird considering ASP.NET MVC doesn't generate the page code for you. He's probably using third-party controls or something that generates HTML code automatically.Hornpipe
-1: Sorry, but you are not answering my question. There is a good reason why I would want to use non-xhtml code. Please see my edit above.Maemaeander
@Adrian, he did answer your question. The web server tells the browser your page is application/xhtml+xml, the proper MIME type for XHTML, which causes it to enter XML parsing mode. But you tell it nothing in the page, apart from the doctype, which is ignored because of the MIME type. As your page's markup is TOTALLY BROKEN, the browser raises an XML parsing error.Hornpipe
That curl log shows that your server is sending the page as XHTMLOrthocephalic
@Adrian Grigore: First off, if you want to use non-XHTML, then why are you serving it as application/xhtml+xml? And secondly, it doesn't have anything to do with XHTML. Your document is invalid HTML, anyway. It doesn't matter whether you interpret it as XHTML or HTML, because it is neither.Camphor
In that case I guess my question is "Why is the web server sending the page as XHTML?"Maemaeander
Jörg: I am using the javascript micro templating engine, that's why my page can never be valid XHTML, even if I do fix the other obvious errors. See the example template in my edit1 above.Maemaeander
@Adrian Grigore: This might be relevant: #2013724Arriaga
@Adrian, your web server is serving it as XHTML because that's how IIS is configured by default for any ASP.NET web page.Hornpipe
You can read stuff about it on dev.opera.com. Starbucks had the same issue. Rogue library.Thorite
A
7

You have the "class" attribute specified two times.

From Well-formedness constraint: Unique Att Spec:

An attribute name MUST NOT appear more than once in the same start-tag or empty-element tag.
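
The constraint is easy to check. The following is a minimal sketch that uses Python's standard library as a stand-in for a browser's two parsers. The snippet and the helper names are made up for illustration: a strict XML parser rejects a start tag with a duplicated attribute, while a lenient HTML tokenizer accepts it.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical snippet with the same kind of mistake: "class" appears twice.
SNIPPET = '<div class="a" class="b"></div>'


def is_well_formed_xml(markup: str) -> bool:
    """Return True if the markup parses as XML, False on any parse error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class AttrCollector(HTMLParser):
    """Collects (tag, attributes) pairs without validating anything."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))


def html_attrs(markup: str):
    """Tokenize the markup leniently and return the start tags that were seen."""
    collector = AttrCollector()
    collector.feed(markup)
    collector.close()
    return collector.seen


print(is_well_formed_xml(SNIPPET))  # the XML parser refuses the duplicate
print(html_attrs(SNIPPET))          # the HTML tokenizer reports both attributes
```

Python's `html.parser` is only a tokenizer and does not follow the browsers' error-recovery rules, so treat the second result as an illustration of leniency rather than of what a particular browser does.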

Arriaga answered 4/5, 2010 at 15:51 Comment(3)
-1: Thanks for your reply, but you are not answering my question. Please see my edit above.Maemaeander
Regarding your question: "Why would you possibly want to intentionally produce invalid documents?": Please see my edit above.Maemaeander
@Adrian Grigore: this does answer your question. It's simple: the specification forbids browsers to display broken documents. Period. If you want your document displayed, fix it. Also, you write: "I checked the source output from the browser to make sure I did not make any mistake ". Clearly, you didn't check very carefully, since you missed this one.Camphor
E
5

You got the correct answer (HTTP content-type header mandating XML parsing) and it seems it's fixed. I'll just add a minor hint on how you can figure out what's wrong from within Opera itself. Two possible ways:

1) Info panel

This is not visible by default, but if you open the panel bar on the left (press F4 to toggle if you don't see it), then click the small plus sign at the bottom, you can enable "Info" in the menu.

The info panel shows some assorted information about the page currently open, including encoding and MIME type.

2) Opera Dragonfly

Press Ctrl-Shift-I to open developer tools (or go through menus to Tools > Advanced > Opera Dragonfly)

Go to "Network" tab, then re-load site. You will see the request and can review the headers. Comparing this with corresponding information from Firebug would have shown you the difference in Content-type headers. (Here you will also see that Opera sends an "Accept" header that contains "application/xhtml+xml". This means "Hi server, if you happen to have this file in real XHTML format I would understand that just fine.". Perhaps your server-side framework saw that header and wrongly responded with the XHTML content-type even though the content was invalid?)
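
A minimal sketch of the kind of naive negotiation speculated about above, assuming a server that answers with XHTML whenever the Accept header lists it. The function `negotiate_content_type` is hypothetical and does not describe ASP.NET's actual behaviour; the Accept values are illustrative.

```python
def negotiate_content_type(accept_header: str) -> str:
    """Hypothetical rule: prefer XHTML whenever the client says it accepts it."""
    # Split "type/subtype;q=0.9, other/type" into bare media types.
    offered = [
        part.split(";", 1)[0].strip().lower()
        for part in accept_header.split(",")
    ]
    if "application/xhtml+xml" in offered:
        return "application/xhtml+xml"
    return "text/html"


# Illustrative Accept value that includes application/xhtml+xml.
accept = "text/html, application/xml;q=0.9, application/xhtml+xml, */*;q=0.1"
print(negotiate_content_type(accept))
```

A rule like this ignores whether the generated markup is actually well-formed XML, which is exactly the failure mode described in this question. As the asker reports elsewhere on this page, the cause in this case was a browser definition file rather than Accept-based negotiation.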

Eager answered 7/5, 2010 at 12:25 Comment(1)
Thanks for the information on the developer tools and info panel. It should come in handy for further debugging.Maemaeander
S
1

It seems like the server is serving different MIME types to different user agents. Firefox is getting text/html, but Opera (and curl, according to Jörg W Mittag) is getting application/xhtml+xml. Do you have any content-negotiation code for your site?
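
One way to confirm this from a script is to request the page with different User-Agent strings and compare the Content-Type that comes back. This is a minimal sketch using only Python's standard library; the helper name `served_content_type` is made up, and any User-Agent strings you pass in are your own choice.

```python
import urllib.request


def served_content_type(url: str, user_agent: str) -> str:
    """Send a HEAD request with the given User-Agent and return Content-Type."""
    request = urllib.request.Request(
        url,
        headers={"User-Agent": user_agent},
        method="HEAD",  # headers only, like `curl --head`
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Type", "")


# Example use (not executed here, since it needs network access):
#   served_content_type("http://www.logmytime.de/", "<an Opera User-Agent string>")
#   served_content_type("http://www.logmytime.de/", "<a Firefox User-Agent string>")
```

If the two calls return different values for the same URL, the server (or something in front of it) is varying the response by user agent.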

Sympathetic answered 4/5, 2010 at 16:32 Comment(2)
I have no code on my web page that looks for the browser type and returns different content-types to different browser versions. Or do you mean something else?Maemaeander
#2013724Arriaga
C
0

This mostly occurs with ASP.NET, because it sets the content type for Opera to application/xhtml+xml. To overcome this issue, you need to set the content type to text/html. The best way to fix this is to add the following to the .browser config file for Opera in the App_Browsers folder.

<capability name="preferredRenderingMime" value="text/html" />
<capability name="preferredRenderingType" value="html32" />
<capability name="SupportsXhtmlRendering" value="false" />

Confucius answered 4/5, 2010 at 15:50 Comment(0)
A
0

Try from another PC to make sure that you're not hitting a cache issue.

Alfredalfreda answered 4/5, 2010 at 15:51 Comment(1)
I tried it from another computer, but the problem still persists.Maemaeander
C
0

The page code is cached in your browser, which is why you are continuing to see the error. You originally saw the error, because your code is likely not valid.

Chichihaerh answered 4/5, 2010 at 15:52 Comment(1)
I tried it from another computer, but the problem still persists.Maemaeander
E
0

It is because you've kind of told it to...

<html xmlns="http://www.w3.org/1999/xhtml">
Emporium answered 7/5, 2010 at 12:30 Comment(1)
I have considered this too, but it is not really relevant in this case. The problem is indeed that the server is sending a content type of application/xhtml+xml; to opera browsers by default and text/html to almost all other browsers. I'm still not sure why this is (I did not code anything browser-dependent on the server side), but at least I now know how to override it.Maemaeander
T
0

application/xhtml+xml

If the server sends the page as application/xhtml+xml, the browser parses it as XML as required by specification. When parsing as XML, the first XML well-formedness mistake will stop the parsing and the client (browser) usually displays an error message.

text/html

The parsers for text/html are more tolerant (due to the history of html development).

Changing the mime type

To change the content type sent by the server, you have to override the value of the Content-Type HTTP header. This can be done through a scripting language on the server side, or sometimes in the server configuration (Apache, for example). I do not know how to specify this per URI in Microsoft-IIS/7.5.

Content-Type: application/xhtml+xml; charset=utf-8 or Content-Type: text/html; charset=utf-8
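
As an illustration of overriding the header from server-side code, here is a minimal sketch of a Python WSGI application that always declares text/html. It is an analogue only, not ASP.NET code; on the asker's stack the equivalent fix was the ContentType="text/html" attribute mentioned in the accepted answer. The name `app` and the page body are made up for illustration.

```python
def app(environ, start_response):
    """Minimal WSGI app that explicitly serves its page as text/html."""
    body = b"<!DOCTYPE html><title>demo</title><p>Hello</p>"
    headers = [
        # Explicit HTML content type, so browsers use the HTML parser.
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

It can be served locally with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` from the standard library.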

Thorite answered 12/11, 2010 at 2:51 Comment(0)
