Is &gt; ever necessary?

5

14

I have been developing websites and XML interfaces for 7 years, and I have never come across a situation where it was really necessary to use &gt; for a >. So far, all disambiguation could be handled by escaping <, &, " and ' alone.

Has anyone ever been in a situation (related to, e.g., SGML processing, browser issues, XSLT, ...) where you found it indispensable to escape the greater-than sign with &gt;?

Update: I just checked with the XML spec, where it says, for example, about character data in section 2.4:

Character Data

[14]      CharData       ::=      [^<&]* - ([^<&]* ']]>' [^<&]*)

So even there, the > isn't mentioned as anything special, except as part of the sequence that ends a CDATA section.

The one single case where the > is of any significance would be the end of a CDATA section, ]]>, but then again, if you escaped it there, the escaped form (i.e., the literal string ]]&gt;) would land literally in the output (since it's CDATA).
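
To make the points above concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree (backed by expat). The helper name parses is made up for illustration, and the behavior shown is that of this one parser, not a statement about every XML toolchain.

```python
import xml.etree.ElementTree as ET


def parses(doc: str) -> bool:
    """Return True if the expat-backed parser accepts doc as well-formed."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# A lone > in character data is accepted as-is.
lone_gt = ET.fromstring("<a>1 > 0</a>").text

# An unescaped < in character data is rejected.
lone_lt_ok = parses("<a>1 < 0</a>")

# Inside a CDATA section nothing is unescaped, so ]]&gt; stays literal text.
cdata_text = ET.fromstring("<a><![CDATA[x ]]&gt; y]]></a>").text

print(repr(lone_gt), lone_lt_ok, repr(cdata_text))
```

In this sketch the lone > needs no escaping, while escaping inside CDATA simply leaves the characters &gt; in the text.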

Pleader answered 25/8, 2010 at 14:41 Comment(6)
Maybe I don't understand, but it helps prevent injection of HTML/JS. Also, it's required if your resulting HTML is to be XHTML compliant.Jointed
I think it is more for symmetry with '&lt;' than anything else.Loyal
You never needed to because browsers aren't like compilers; they are far too permissive/forgiving, hence the ignorance of the standards across the web. Aren't you escaping "'" in a JavaScript string? (var test = 'I\'ll tell';) This is the same thing.Silly
@Jointed Boss: If you always escape the < correctly, can you name use cases, where the quoting of > is necessary to prevent HTML injection? About XHTML compliance: I checked the XML spec, and they don't say a word about > being any more special than any letter or so.Pleader
@Mike Gleason jr Couturier: In my question I don't specifically concentrate on browsers, but the whole SGML/XML toolchain. Actually, I'm more interested in issues with well-formed XML than in any browser quirks. And no, I don't think, that escaping quotes in JS has anything to do with the question.Pleader
@Jonathan Leffler: That's my impression, too. I just wanted to check if I missed something.Pleader
7

You don't absolutely need to, because almost any XML parser will understand what you mean. But if you leave it unescaped, you are still using a special character without any protection.

XML is all about semantics, and this is not really semantically compliant.

About your update, you forgot this part:

The right angle bracket (>) may be represented using the string " &gt; ", and must, for compatibility, be escaped using either " &gt; " or a character reference when it appears in the string " ]]> " in content, when that string is not marking the end of a CDATA section.

The use case given in the documentation is more about something like this:

<xmlmarkup>
]]>
</xmlmarkup>

Here the ]]> part could be a problem with old SGML parsers, so it must be escaped as ]]&gt; for compatibility reasons.
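
As a rough check of this case, the sketch below feeds both forms to Python's xml.etree.ElementTree (expat-based). It only shows how this particular parser and serializer behave; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

raw = "<xmlmarkup>]]></xmlmarkup>"         # ]]> left unescaped in content
escaped = "<xmlmarkup>]]&gt;</xmlmarkup>"  # the form the spec asks for

# Try the unescaped form first and record whether the parser accepts it.
try:
    ET.fromstring(raw)
    raw_accepted = True
except ET.ParseError:
    raw_accepted = False

# The escaped form parses, and the text comes back as the plain string ]]>.
element = ET.fromstring(escaped)
escaped_text = element.text

# When serializing, ElementTree writes > in text as &gt; again.
serialized = ET.tostring(element, encoding="unicode")

print(raw_accepted, repr(escaped_text), serialized)
```

So with this toolchain the escaping is not optional for ]]>, and the serializer applies it automatically on output.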

Runway answered 25/8, 2010 at 14:45 Comment(6)
What about the almost part? Are there any that get a hiccup from an unquoted >?Pleader
Well, if anyone wrote an XML parser that only respects the XML standards, it might happen. I don't know of a parser that behaves this way, but it wouldn't be its fault nor a problem.Runway
An XML parser that respects the XML standards MUST accept unquoted >. I think the paragraph you quoted refers to compatibility with non-compliant parsers or maybe with an older (draft) version of the spec.Technic
@Technic from this spec : w3.org/TR/xml, and yes it's about compatibility. The must part means for me that I don't really have a choice here.Runway
OK, again: The only place in the spec, where I found the > to have any relevance is the ]]> CDATA closing. And that's kind of a special case, because there ]]&gt; won't bring you anything. (That's by the way the only section over all, that contains the string &gt;)Pleader
@Colin HEBERT: I wrote the last comment while you changed the answer (for the other readers: I, too, merged updates 1 and 2 in the question).Pleader
3

Not so much as an author of (X)HTML documents, but more as a user of sloppily written comment fields on websites that "offer" to let you insert HTML.

I mean if you do your site the right way, you wouldn't hardcode your content anyway, right? So your call to htmlentities or whatever (long time no see, php) would take care of replacing special characters for you. So sure, you wouldn't manually type &gt; but I hope you take measures so > is automatically replaced.
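
For illustration, here is a small sketch of that kind of automatic escaping. It uses Python's html.escape as a stand-in for PHP's htmlentities (the two are not identical), and render_comment is a hypothetical helper, not part of any framework.

```python
import html


def render_comment(user_text: str) -> str:
    """Wrap untrusted text in a paragraph, escaping it first (hypothetical helper)."""
    # html.escape replaces &, < and >; quote=True also covers " and '.
    return "<p>" + html.escape(user_text, quote=True) + "</p>"


rendered = render_comment('1 > 0 & <b>"bold"</b>')
print(rendered)
# <p>1 &gt; 0 &amp; &lt;b&gt;&quot;bold&quot;&lt;/b&gt;</p>
```

The > is replaced along with the other special characters, without anyone typing &gt; by hand.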

Penile answered 25/8, 2010 at 14:50 Comment(0)
3

I used one not 19 hours ago to pass a strict xml validator. Another case is when you use them actually in html/xml content text (rather than attributes), like this: <.

Sure, a lax parser will accept most anything you throw at it, but if you're ever worried about XSS, &lt; is your friend.

Update: Here's an example where you need to escape > in Firefox:

<?xml version="1.0" encoding="utf-8" ?>
<test>
    ]]>
</test>

Granted, it still isn't an example of having to escape a lone >.

Dakota answered 25/8, 2010 at 14:55 Comment(3)
Actually, would your referenced example have worked, too, if you only escaped the <? That exactly is my case. And if not, is the parser wrong, or have I missed the spot in the XML spec? w3.org/TR/xml/#NT-AttValuePleader
OK, now I see the point you're trying to get at. Updated the post with an example which gives parsing errors in Firefox, but would parse if you never needed to escape >.Dakota
Yep, now we're thinking the same. Colin and I found that one, too. It seems that, at least in the "XML part of SGML" world, this is the only relevant example where &gt; makes sense.Pleader
0

I just thought of another example where you need to escape > in HTML5 (not XHTML5) documents: when you need it in an attribute value without quotes (whether one should write attributes that way is debatable, of course).

<img src=arrow.png alt=&gt;>

should be equivalent to XHTML

<img src="arrow.png" alt=">" />
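
A quick way to try this is Python's html.parser, sketched below. Note the assumptions: html.parser is a lenient tokenizer and not a full implementation of the HTML5 parsing algorithm, and AttrCollector and alt_of are names invented for this example.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Remember the attributes of the last <img> start tag seen."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.found = dict(attrs)


def alt_of(markup: str):
    """Return the alt attribute of the <img> tag in markup, if any."""
    collector = AttrCollector()
    collector.feed(markup)
    collector.close()
    return collector.found.get("alt")


# Escaped: the unquoted value &gt; is decoded to a literal >.
escaped_alt = alt_of("<img src=arrow.png alt=&gt;>")

# Unescaped: the first > already closes the tag, so alt is not ">".
raw_alt = alt_of("<img src=arrow.png alt=>>")

print(repr(escaped_alt), repr(raw_alt))
```

With this parser, only the escaped form yields an alt value of >.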

But then again, (?<!X)HTML is not SGML.

Pleader answered 18/7, 2011 at 8:6 Comment(0)
0

Imagine that you have the text "this is a not a ]]> nice day" and you decide to wrap it in a CDATA section: <![CDATA[this is a not a ]]> nice day]]>. The section would then end at the first ]]>, leaving " nice day]]>" outside of it.

In order to avoid that (and for allowing parsing of SGML fragments with unterminated marked sections), clause 10.4 of ISO 8879:1986 declares that the occurrence of ]]> outside a marked section is an error.
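
The XML side of this problem can be sketched with Python's xml.etree.ElementTree. The splitting trick in wrap_split is a commonly used workaround and is not taken from this answer; both function names are made up for the example.

```python
import xml.etree.ElementTree as ET

text = "this is a not a ]]> nice day"


def wrap_naive(s: str) -> str:
    """Wrap s in a single CDATA section without looking at its content."""
    return "<![CDATA[" + s + "]]>"


def wrap_split(s: str) -> str:
    """Split every ]]> across two CDATA sections so it never appears whole."""
    return "<![CDATA[" + s.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# Naive wrapping: the section ends early and a stray ]]> is left in content.
try:
    ET.fromstring("<a>" + wrap_naive(text) + "</a>")
    naive_ok = True
except ET.ParseError:
    naive_ok = False

# Split wrapping: the two sections are read back as the original text.
round_trip = ET.fromstring("<a>" + wrap_split(text) + "</a>").text

print(naive_ok, repr(round_trip))
```

The naive form is rejected by this parser precisely because of the leftover ]]> outside the section.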

Also, in the days of SGML, marked sections were very popular, as they were used not only for CDATA (as in XML), but also for RCDATA (only entities and character references allowed) and for IGNORE and INCLUDE (which allowed for recognition of markup inside them).

For instance, in SGML one could write:

 <!ENTITY % WHATTODO "INCLUDE">
 <![%WHATTODO;[<b>]]&gt;</b>]]>

Which is equivalent to:

 <b>]]&gt;</b>
Thurnau answered 21/12, 2013 at 0:19 Comment(0)
