Set HTML5 doctype with XSLT
L

12

144

How would I cleanly set the doctype of a file to HTML5 <!DOCTYPE html> via XSLT (in this case with collective.xdv)?

The following is the best my Google-fu has been able to find:

<xsl:output
    method="html"
    doctype-public="XSLT-compat"
    omit-xml-declaration="yes"
    encoding="UTF-8"
    indent="yes" />

produces:

<!DOCTYPE html PUBLIC "XSLT-compat" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Lundberg answered 2/8, 2010 at 11:8 Comment(3)
Incidentally, using PUBLIC "XSLT-compat" is out of date. The XSLT compatible HTML5 doctype is now <!DOCTYPE HTML SYSTEM "about:legacy-compat">. See dev.w3.org/html5/spec/syntax.html#doctype-legacy-stringEyestalk
From the last Editor WD, it looks like almost any doctype is allowed: short <!DOCTYPE html>, legacy <!DOCTYPE HTML SYSTEM "about:legacy-compat"> and obsoleted ("should not") HTML 4, HTML 4.01, XHTML 1.0 and XHTML 1.1 (all strict DTD when there is SYSTEM).Autolysis
Please update one of the answers, since HTML5 is (nowadays) a W3C Recommendation.Yarak
I
158

I think this is currently only supported by writing the doctype out as text:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="utf-8" indent="yes" />

  <xsl:template match="/">
    <xsl:text disable-output-escaping='yes'>&lt;!DOCTYPE html&gt;</xsl:text>
    <html>
    </html>
  </xsl:template>

</xsl:stylesheet>

This will produce the following output:

<!DOCTYPE html>
<html>
</html>
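
A slightly fuller sketch of the same technique, assuming the processor honours disable-output-escaping (it is an optional feature in XSLT 1.0) and that the result is serialised to text. The trailing &#xa; follows the suggestion in the comments on this answer to keep the doctype on its own line. The title and paragraph text are placeholders, not part of the original answer.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- No doctype-public/doctype-system here, so the serializer adds no doctype of its own -->
  <xsl:output method="html" encoding="utf-8" indent="yes" />

  <xsl:template match="/">
    <!-- Written as raw text; &#xa; is a line feed so <html> starts on the next line -->
    <xsl:text disable-output-escaping="yes">&lt;!DOCTYPE html&gt;&#xa;</xsl:text>
    <html>
      <head>
        <title>Placeholder title</title>
      </head>
      <body>
        <p>Placeholder content</p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```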
Inerrable answered 2/8, 2010 at 12:5 Comment(7)
This is the only standard way. But with MSXSL there is a non-standard way: use empty xsl:output/@doctype-public and xsl:output/@doctype-system.Autolysis
disable-output-escaping was meant by CaseyDiablerie
This worked great once I removed both internal and public doc type attributes from the output method tag. Thanks!Mortgagee
This will work most of the time, but it is a hack, and it is unlikely to (i.e. won't) work as expected if you are not serialising your result back to a text file on disk (e.g. if the result of the transform is being passed on to another process without serialisation).Jandy
If the doctype and opening html tag end up on the same line, then you can simply add a newline <xsl:text disable-output-escaping='yes'>&lt;!DOCTYPE html&gt;\n</xsl:text> (at least in Java's JAX, some 9 years later)Handhold
If the doctype and opening html tag end up on the same line and your JAX version does not support \n, you can use &#xa;: <xsl:text disable-output-escaping='yes'>&lt;!DOCTYPE html&gt;&#xa;</xsl:text>Excelsior
For me, this script produces: "<result>&lt;!DOCTYPE html&gt;<html></html></result>". If I replace &lt; with <, it produces an error...Deferent
P
66

To use the simple HTML doctype <!DOCTYPE html>, you have to use the disable-output-escaping feature: <xsl:text disable-output-escaping="yes">&lt;!DOCTYPE html&gt;</xsl:text>. However, disable-output-escaping is an optional feature in XSLT, so your XSLT engine or serialization pipeline might not support it.

For this reason, HTML5 provides an alternative doctype for compatibility with HTML5-unaware XSLT versions (i.e. all the currently existing versions of XSLT) and other systems that have the same problem. The alternative doctype is <!DOCTYPE html SYSTEM "about:legacy-compat">. To output this doctype, use the attribute doctype-system="about:legacy-compat" on the xsl:output element without using a doctype-public attribute at all.

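<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <!-- doctype-system alone (no doctype-public) yields the legacy-compat doctype -->
   <xsl:output method="html" doctype-system="about:legacy-compat"/>
   ...
   <!-- literal result elements must sit inside a template, not directly under xsl:stylesheet -->
   <xsl:template match="/">
      <html>
      </html>
   </xsl:template>
</xsl:stylesheet>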
Pilocarpine answered 4/8, 2010 at 11:9 Comment(3)
I appreciate this is probably the correct, standards driven way to accomplish what I want (I've upvoted it as such). But the former isn't supported (my processor falls over) and the latter still results in "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" in my doctype. As @Jirka Kosek suggested, I think my XSLT processor might be broken.Lundberg
Deliverance (the XSLT processor I am using) mailing list discussion regarding this problem is here: coactivate.org/projects/deliverance/lists/…Lundberg
The w3c validator service issues a warning when the document starts with <!DOCTYPE html SYSTEM "about:legacy-compat">Hoes
F
30
<xsl:output
     method="html"
     doctype-system="about:legacy-compat"
     encoding="UTF-8"
     indent="yes" />

This outputs:

<!DOCTYPE html SYSTEM "about:legacy-compat">

This is adapted, with my fix, from http://ukchill.com/technology/generating-html5-using-xslt/

Foetor answered 6/5, 2012 at 7:24 Comment(2)
The w3c validator service issues a warning when the document starts with <!DOCTYPE html SYSTEM "about:legacy-compat">Hoes
@AdrianW The warning is "Documents should not use about:legacy-compat, except if generated by legacy systems that can't output the standard <!DOCTYPE html> doctype.", which is exactly what is happening here with xslt. This system is a legacy system that must emit a System ID. The HTML spec makes it very clear that <!DOCTYPE html SYSTEM "about:legacy-compat"> is the correct html5 doctype.Dense
F
21

With Saxon 9.4 you can use:

<xsl:output method="html" version="5.0" encoding="UTF-8" indent="yes" />

This generates:

<!DOCTYPE HTML>
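
For context, a minimal complete stylesheet around that declaration might look like the sketch below. It assumes Saxon 9.4 or later, as stated above. The version="2.0" on xsl:stylesheet and the placeholder page content are my assumptions, not part of the original answer.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- version="5.0" asks the html output method for HTML5 serialization -->
  <xsl:output method="html" version="5.0" encoding="UTF-8" indent="yes" />

  <xsl:template match="/">
    <html>
      <head><title>Placeholder title</title></head>
      <body><p>Placeholder content</p></body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```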
Forestry answered 15/10, 2013 at 10:54 Comment(2)
Unfortunately, it's specific to Saxon. On the other hand, it is simply the most concise answer to the question. I wonder if this works with the other XSLT 2.0 processors?Offhand
This is now no longer specific just to Saxon but is also supported in the libxslt/xsltproc sources. See the details at the end of https://mcmap.net/q/158746/-set-html5-doctype-with-xslt/…Shortchange
M
10

Use doctype-system instead of doctype-public

Michamichael answered 2/8, 2010 at 12:1 Comment(4)
That still leaves "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" in the doctype.Lundberg
If <xsl:output doctype-system="about:legacy-compat" method="html"/> produces what you say, then there is definitely a bug in the XSLT processor you use.Michamichael
Where is this behavior specified? This definitely doesn't work in JAXP XSLT.Renner
xml.apache.org/xalan-j this one gives nowhere near what you're expecting - maybe just age.Factitive
A
10

You must use XHTML 1.0 Strict as the doctype if you want XHTML output consistent with HTML5. libxml2's XML serializer has a special output mode, triggered by the XHTML 1.0 doctypes, that ensures the output is XHTML compatible (e.g. <br /> rather than <br/>, <div></div> rather than <div/>). doctype-system="about:legacy-compat" does not trigger this compatibility mode.

If you are happy with html output, then setting <xsl:output method="html"> should do the right thing. You can then set the doctype with <xsl:text disable-output-escaping="yes">&lt;!DOCTYPE html&gt;</xsl:text>, though this will need plumbing in at the appropriate place as XDV does not support this yet.

In fact it seems <xsl:output method="html"/> does not really help either - this will result in <br/> being output as <br></br>.

Allodium answered 4/8, 2010 at 14:9 Comment(0)
L
6

This variation of Jirka Kosek's advice, via Advanced XDV theming on Plone.org, seems to work for me in collective.xdv.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output
      doctype-public="HTML"
      doctype-system=""/>
</xsl:stylesheet>
Lundberg answered 2/8, 2010 at 13:42 Comment(1)
Yes, but as I've commented on 0xA3's answer, an empty @doctype-system or @doctype-public is not standard (also, it's against the spec!)Autolysis
A
5

This is a comment, but I do not have enough karma points to put it in the correct place. Sigh.

Lundberg wrote: I appreciate this is probably the correct, standards driven way to accomplish what I want (I've upvoted it as such). But the former isn't supported (my processor falls over) and the latter still results in "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" in my doctype. As @Jirka Kosek suggested, I think my XSLT processor might be broken.

No, your XSLT processor is not broken, it's just that XDV adds:

<xsl:output method="xml" indent="no" omit-xml-declaration="yes" media-type="text/html" encoding="utf-8" doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>

by default, so when you add a second <xsl:output doctype-system="about:legacy-compat"/> the previous doctype-public is not overwritten.

Note that XHTML 1.0 strict is listed as an obsolete permitted doctype string, so it is perfectly acceptable to use this doctype and still call it HTML5.
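
To illustrate the merging, here is a sketch of what the compiled stylesheet effectively contains once both declarations are present. XSLT 1.0 merges all xsl:output elements into one effective declaration, attribute by attribute. The order in which XDV emits the two declarations is an assumption here, and the page content is a placeholder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Default declaration added by XDV (copied from above) -->
  <xsl:output method="xml" indent="no" omit-xml-declaration="yes"
      media-type="text/html" encoding="utf-8"
      doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>

  <!-- Declaration added by the theme author. It does not mention
       doctype-public, so the value above stays in effect. doctype-system
       is now set twice at the same import precedence, which XSLT 1.0
       treats as an error; a processor that recovers uses the value
       that occurs last in the stylesheet. -->
  <xsl:output doctype-system="about:legacy-compat"/>

  <xsl:template match="/">
    <html><body><p>Placeholder content</p></body></html>
  </xsl:template>
</xsl:stylesheet>
```

In XSLT 1.0 there is no attribute value that unsets an inherited doctype-public, which is why adding a second declaration alone cannot yield a doctype without a public identifier.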

Allodium answered 4/8, 2010 at 14:18 Comment(2)
If your XSLT processor adds elements to your stylesheets or has some non-standards attribute default values, that would mean it's broken.Autolysis
@Alejandro: XDV (now renamed diazo) is not an XSLT processor, it is a theme -> XSLT compiler. It is XDV which is adding the default values into the compiled XSLT. I know this because I wrote it ;)Allodium
K
3

Sorry to only provide links. This was discussed within the WHATWG group, but it's been many months since I've dealt with it. Here Ian Hickson and some XML experts discuss this:
http://lists.w3.org/Archives/Public/public-html/2009Jan/0640.html
http://markmail.org/message/64aykbbsfzlbidzl
and here is the actual issue number:
http://www.w3.org/html/wg/tracker/issues/54
and here is another discussion:
http://www.contentwithstyle.co.uk/content/xslt-and-html-5-problems

Kassia answered 2/8, 2010 at 12:15 Comment(0)
G
2

Use this tag

<xsl:output method="xml" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" doctype-public="XSLT-compat" indent="yes"/>
Gavrah answered 12/2, 2018 at 9:40 Comment(0)
S
1

The following code will work as a standalone template if saved as html5.xml:

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="html5.xml"?>
<xsl:stylesheet version="1.0"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml"
            >
<xsl:output method="xml" encoding="utf-8" version="" indent="yes" standalone="no" media-type="text/html" omit-xml-declaration="no" doctype-system="about:legacy-compat" />

<xsl:template match="xsl:stylesheet">
  <xsl:apply-templates/>
</xsl:template>

<xsl:template match="/">
  <html>
    <head>
      <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
    </head>
    <body>
      <xsl:text>hi</xsl:text>
    </body>
  </html>
</xsl:template>

</xsl:stylesheet>

Sterculiaceous answered 5/1, 2012 at 23:9 Comment(0)
T
1

That's what I use to generate a compatible HTML5 doctype (getting Saxon's HTML5 output where available, otherwise falling back to the legacy doctype):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/TR/REC-html40">

    <xsl:output
        method="html"
        version="5.0"
        doctype-system="about:legacy-compat"
        encoding="UTF-8"
        indent="yes" />

    <!-- templates go here -->
</xsl:stylesheet>
Tempera answered 16/10, 2016 at 0:46 Comment(0)
