Special Characters in XML
M

9

12

I am creating a left navigation system using XML and XSL. Everything had been going great until I tried to use a special character in my XML document. I am using » and I get the following error.

reason: Reference to undefined entity 'raquo'.
error code: -1072898046

How do I make this work?

Matteroffact answered 16/10, 2008 at 15:58 Comment(0)
C
22

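You are trying to use an HTML entity in a document that is neither HTML nor XHTML. Named entities such as this one are declared in the HTML and XHTML Document Type Definitions (DTDs), so an XML parser that has not been given such a declaration does not recognize them.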

You should use the numeric Unicode character reference instead of the named entity. For example, in place of &raquo; you should use &#187;

Alternatively, you can define them in your XML document's DTD:

<!ENTITY entity-name "entity-value">
<!ENTITY raquo "&#187;">

Otherwise, if your document is UTF-8, I believe you can just use the actual character directly in your XML document.

»
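To make the options above concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree parser. This is an assumption for illustration only: the question does not use Python, and the element name nav and the helper parse_text are hypothetical. Other conforming XML parsers should behave similarly, but I have only reasoned about this one.

```python
import xml.etree.ElementTree as ET


def parse_text(xml_source):
    """Return the root element's text, or None if the document is not well-formed."""
    try:
        return ET.fromstring(xml_source).text
    except ET.ParseError:
        return None


# Named HTML entity with no declaration anywhere: the parser rejects it.
undeclared = parse_text("<nav>&raquo;</nav>")

# Numeric character reference: needs no declaration.
numeric = parse_text("<nav>&#187;</nav>")

# Entity declared in an internal DTD subset, as in the <!ENTITY ...> line above.
declared = parse_text('<!DOCTYPE nav [<!ENTITY raquo "&#187;">]><nav>&raquo;</nav>')

# The literal character itself (U+00BB).
literal = parse_text("<nav>\u00bb</nav>")

print(undeclared, numeric, declared, literal)
```

The last three calls all yield the same single character; only the undeclared named entity fails to parse.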
Cockleshell answered 16/10, 2008 at 16:19 Comment(2)
Definitely use Unicode characters or Unicode entity references if you can. Named character references should be avoided in XML. – Crannog
It's quite possible that the OP doesn't have a DTD for his XML. Even then, your answer could be used inside an internal subset if the user wanted. However, you are right that the simple answer is UTF-8 and just use the character. – Treble
F
6

Did you specify a doctype for your file?

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

I think you might get such errors if you forget to specify it.

Also sometimes the entities work if you specify them by number instead of name.

&#187; &#171; instead of &raquo; and &laquo;
Faroff answered 16/10, 2008 at 16:14 Comment(1)
No, that's an XHTML Doctype. XHTML is an XML application and it defines &raquo;. – Chipman
A
3

You don't need to declare an entity in your DTD, or even use a DTD. You probably don't need to use the Unicode representation of the character. You certainly don't need to use a CDATA section.

What you need to do is use a DOM to build your XML instead of trying to build it with string manipulation. The DOM will fix this problem for you.

In C#, this code:

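 // Needs: using System.Diagnostics; using System.Xml;
 XmlDocument d = new XmlDocument();
 d.LoadXml("<foo/>");
 char c = (char)187;  // U+00BB, the » character
 // InnerText takes raw text; the DOM escapes it when serializing.
 d.DocumentElement.InnerText = "Here's that character: " + c;
 Debug.WriteLine(d.OuterXml);
 d.DocumentElement.InnerText = "Here it is as an HTML entity: &raquo;";
 Debug.WriteLine(d.OuterXml);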

produces this output:

<foo>Here's that character: »</foo>
<foo>Here it is as an HTML entity: &amp;raquo;</foo>

As you can see from the first example, the » character is perfectly legal in XML text. But I don't think you're trying to represent that character.

I think you're trying to do what's in the second example, based on the error message that you reported. You're trying to represent the string of characters &raquo;. The proper way to represent that string of characters in XML text is by escaping the ampersand; thus: &amp;raquo;.

So if you must use string manipulation to build your XML, just make sure that you escape any ampersands in your source data. Not to belabor the point, but if you were using a DOM, this would have been done for you automatically.
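For the string-manipulation case, a rough Python sketch of the same idea follows. It assumes Python's standard library (xml.sax.saxutils.escape and xml.etree.ElementTree), which is not what the question uses; the variable raw_label is hypothetical sample data.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical source data containing bare ampersands.
raw_label = "Next &raquo; & more"

# Plain concatenation without escaping gives a document that will not parse.
unescaped_xml = "<foo>" + raw_label + "</foo>"

# escape() replaces &, < and > with &amp;, &lt; and &gt;.
escaped_xml = "<foo>" + escape(raw_label) + "</foo>"

# A tree API does the same escaping for you when it serializes.
element = ET.Element("foo")
element.text = raw_label
serialized = ET.tostring(element, encoding="unicode")

print(escaped_xml)
print(serialized)
```

Parsing escaped_xml gives back the original string, ampersands included, which is the round trip you want when the text really is the characters &raquo; rather than the » character.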

One other thing. It's quite likely that in your original question, which now reads "I am using »", what you actually typed is "I am using &raquo;". The actual post doesn't look like that, though. If you need to represent text literally in markdown, enclose it in backticks; otherwise, HTML entities will get converted to their character representation when the post is rendered.

Androgyne answered 16/10, 2008 at 19:56 Comment(1)
I wonder why it has been downvoted. It is a perfectly correct answer. – Incommodity
S
1

This is an issue because not all HTML entities are XML entities. You can import the DTD of HTML into your document as Pat suggested, or do one of the following:

Replace all occurrences of the special character with the numeric character reference:

&raquo; becomes &#187;

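Wrap all occurrences of the special characters in a CDATA section (note that this keeps the literal text &raquo; in the document rather than producing the » character):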

<![CDATA[&raquo;]]>

Define the entities at the top of your document:

<!DOCTYPE ROOT_XML_ELEMENT [ <!ENTITY raquo "&#187;"> ]>
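The numeric reference and the CDATA section are not equivalent, which a short Python check can show. This is only a sketch using the standard xml.etree.ElementTree parser (my assumption, not the asker's toolchain), and the element name item is made up.

```python
import xml.etree.ElementTree as ET

# A numeric character reference is resolved to the character U+00BB.
numeric_text = ET.fromstring("<item>&#187;</item>").text

# Inside CDATA nothing is resolved: the parser hands back the seven
# literal characters & r a q u o ; exactly as written.
cdata_text = ET.fromstring("<item><![CDATA[&raquo;]]></item>").text

print(repr(numeric_text))
print(repr(cdata_text))
```

So CDATA makes the document parse, but a stylesheet reading that node sees the text &raquo;, not the » symbol.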
Serpasil answered 16/10, 2008 at 16:21 Comment(0)
B
0

Are you using the » symbol directly or are you defining it as &raquo; ? If you're using the escaped symbol, did you forget the semicolon?

Bathometer answered 16/10, 2008 at 16:2 Comment(1)
I am defining it as &raquo;. I double-checked, and no, I didn't forget the semicolon; I just missed it when I pasted it here. – Matteroffact
M
0

Joe

When I use the Unicode version, it shows a square.

Putting the entity declaration into the XML doc produces a "Cannot have a DTD declaration outside of a DTD." error. I suppose this is expected.

When I use '' to include the DTD externally, it doesn't seem to have any effect.

I am wondering if this is maybe a server issue. I am developing this locally and using Baby Web Server.

Matteroffact answered 16/10, 2008 at 17:35 Comment(1)
If you get a square, then either the encoding is not declared properly or the file is saved in a different encoding from the one declared. Make sure you always use UTF-8 and, if possible, send the HTTP header Content-Type: application/xml;charset=UTF-8. If that's not possible, add <?xml version="1.0" encoding="UTF-8"?> to the document. – Hue
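A small Python sketch of why the declared encoding has to match the bytes actually in the file. It assumes the standard xml.etree.ElementTree parser, which is stricter than a browser: where a browser might show a square or a wrong glyph, this parser simply refuses the mismatched document. The helper try_parse and the element name nav are hypothetical.

```python
import xml.etree.ElementTree as ET

char = "\u00bb"  # the » character

utf8_bytes = char.encode("utf-8")      # two bytes: C2 BB
latin1_bytes = char.encode("latin-1")  # one byte: BB

# Declaration and bytes agree: both documents parse to the same character.
utf8_doc = b'<?xml version="1.0" encoding="UTF-8"?><nav>' + utf8_bytes + b"</nav>"
latin1_doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><nav>' + latin1_bytes + b"</nav>"

# Declaration says UTF-8 but the file was saved as Latin-1.
mismatched_doc = b'<?xml version="1.0" encoding="UTF-8"?><nav>' + latin1_bytes + b"</nav>"


def try_parse(data):
    """Return the root text, or a short message if parsing fails."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError as exc:
        return "parse error: %s" % exc


for doc in (utf8_doc, latin1_doc, mismatched_doc):
    print(try_parse(doc))

# The reverse mistake (UTF-8 bytes read as Latin-1) yields two wrong characters.
print(utf8_bytes.decode("latin-1"))
```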
H
0

Simply replace your HTML entity &raquo; with the numeric reference &#187;, which is valid in any XML or HTML document.
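If there are many such entities, the replacement can be scripted. Below is a minimal sketch in Python, assuming the standard html.entities.name2codepoint table as the lookup; the function name named_to_numeric and the sample string are hypothetical. XML's five predefined entities are left alone because they are already valid.

```python
import re
from html.entities import name2codepoint

# Entities that every XML parser already understands; leave these untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def named_to_numeric(text):
    """Rewrite named HTML entities such as &raquo; as numeric references."""

    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # keep predefined or unknown names as-is
        return "&#%d;" % name2codepoint[name]

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, text)


converted = named_to_numeric("Home &raquo; Docs &amp; FAQ &laquo;")
print(converted)
```

Note that this is a plain text substitution, so it would also rewrite entity names that appear inside CDATA sections or comments.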

Heideheidegger answered 17/10, 2008 at 14:56 Comment(0)
S
0

I found myself googling for such info a lot, so decided to post a matrix on my own site for the simple purpose of quickly being able to do a lookup:

http://martinkool.com/characters

Use the &#...; form indeed.

Singleness answered 27/10, 2008 at 21:37 Comment(0)
S
0

If you want the output document to contain the named HTML entity &raquo; rather than the numeric reference, add the following top-level elements to your stylesheet (XSLT 2.0 only):

<xsl:output use-character-maps="raquo.ent"/>
<xsl:character-map name="raquo.ent">
    <xsl:output-character character="&#187;" string="&amp;raquo;"/>
</xsl:character-map>
Solidarity answered 5/10, 2010 at 13:17 Comment(0)
