What is difference between XML Schema and DTD?
Asked Answered
M

11

206

I have googled this question, but I do not understand clearly what is an XML schema and DTD (document type definition), and why the XML schema is more powerful compared to DTD.

Any guidance would be highly appreciated.

Marriott answered 9/10, 2009 at 14:39 Comment(0)
K
156

From the Differences Between DTDs and Schema section of the Converting a DTD into a Schema article:

The critical difference between DTDs and XML Schema is that XML Schema utilize an XML-based syntax, whereas DTDs have a unique syntax held over from SGML DTDs. Although DTDs are often criticized because of this need to learn a new syntax, the syntax itself is quite terse. The opposite is true for XML Schema, which are verbose, but also make use of tags and XML so that authors of XML should find the syntax of XML Schema less intimidating.

The goal of DTDs was to retain a level of compatibility with SGML for applications that might want to convert SGML DTDs into XML DTDs. However, in keeping with one of the goals of XML, "terseness in XML markup is of minimal importance," there is no real concern with keeping the syntax brief.

[...]

So what are some of the other differences which might be especially important when we are converting a DTD? Let's take a look.

Typing

The most significant difference between DTDs and XML Schema is the capability to create and use datatypes in Schema in conjunction with element and attribute declarations. In fact, it's such an important difference that one half of the XML Schema Recommendation is devoted to datatyping and XML Schema. We cover datatypes in detail in Part III of this book, "XML Schema Datatypes."

[...]

Occurrence Constraints

Another area where DTDs and Schema differ significantly is with occurrence constraints. If you recall from our previous examples in Chapter 2, "Schema Structure" (or your own work with DTDs), there are three symbols that you can use to limit the number of occurrences of an element: *, + and ?.

[...]

Enumerations

So, let's say we had a element, and we wanted to be able to define a size attribute for the shirt, which allowed users to choose a size: small, medium, or large. Our DTD would look like this:

<!ELEMENT item (shirt)>
<!ELEMENT shirt (#PCDATA)>
<!ATTLIST shirt
    size_value (small | medium | large)>

[...]

But what if we wanted size to be an element? We can't do that with a DTD. DTDs do not provide for enumerations in an element's text content. However, because of datatypes with Schema, when we declared the enumeration in the preceding example, we actually created a simpleType called size_values which we can now use with an element:

<xs:element name="size" type="size_value">

[...]

Krutz answered 9/10, 2009 at 15:1 Comment(3)
just a note, W3C seems to think that DTD is a type of XML schema language: "There are several different schema languages in widespread use, but the main ones are Document Type Definitions (DTDs), Relax-NG, Schematron and W3C XSD (XML Schema Definitions). " w3.org/standards/xml/schemaGeodetic
@Geodetic I guess they are specifying DTD as a schema language not a XML schema.Interjoin
On "But what if we wanted size to be an element? ": <size name='medium'/> Now size is an element ;-)Soybean
R
108

Differences between an XML Schema Definition (XSD) and Document Type Definition (DTD) include:

  • XML schemas are written in XML while DTD are derived from SGML syntax.
  • XML schemas define datatypes for elements and attributes while DTD doesn't support datatypes.
  • XML schemas allow support for namespaces while DTD does not.
  • XML schemas define number and order of child elements, while DTD does not.
  • XML schemas can be manipulated on your own with XML DOM but it is not possible in case of DTD.
  • using XML schema user need not to learn a new language but working with DTD is difficult for a user.
  • XML schema provides secure data communication i.e sender can describe the data in a way that receiver will understand, but in case of DTD data can be misunderstood by the receiver.
  • XML schemas are extensible while DTD is not extensible.

Not all these bullet points are 100% accurate, but you get the gist.

On the other hand:

  • DTD lets you define new ENTITY values for use in your XML file.
  • DTD lets you extend it local to an individual XML file.
Ramin answered 15/5, 2012 at 20:30 Comment(1)
On "using XML schema user need not to learn a new language but working with DTD is difficult for a user.": I actually think that DTDs are more readable to humans.Soybean
K
35

As many people have mentioned before, XML Schema utilize an XML-based syntax and DTDs have a unique syntax. DTD doesn't support datatypes, which does matter.

Lets see a very simple example in which university has multiple students and each student has two elements "name" and "year". Please note that I have uses "// --> " in my code just for comments.

enter image description here

Now I will write this example both in DTD and in XSD.

DTD

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE university[              // --> university as root element 
<!ELEMENT university (student*)>   // --> university has  * = Multiple students
<!ELEMENT student (name,year)>     // --> Student has elements name and year
<!ELEMENT name (#PCDATA)>          // --> name as Parsed character data
<!ELEMENT year (#PCDATA)>          // --> year as Parsed character data
]>

<university>
    <student>
        <name>
            John Niel             //---> I can also use an Integer,not good
        </name>
        <year>
            2000                 //---> I can also use a string,not good
        </year>
    </student>
</university>

XML Schema Definition (XSD)

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

<xsd:complexType name ="uniType">                    //--> complex datatype uniType
 <xsd:sequence>
  <xsd:element ref="student" maxOccurs="unbounded"/> //--> has unbounded no.of students
 </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="stuType">                     //--> complex datatype stuType
 <xsd:sequence>
  <xsd:element ref="name"/>                          //--> has element name
  <xsd:element ref="year"/>                          //--> has element year
 </xsd:sequence>
</xsd:complexType>

<xsd:element name="university" type="uniType"/>       //--> university of type UniType 
<xsd:element name="student" type="stuType"/>          //--> student of type stuType
<xsd:element name="name" type="xsd:string"/>          //--> name of datatype string
<xsd:element name="year" type="xsd:integer"/>         //--> year of datatype integer
</xsd:schema>



<?xml version="1.0" encoding="UTF-8"?>
<university>
    <student>
        <name>
            John Niel          
        </name>
        <year>
            2000                      //--> only an Integer value is allowed
        </year>
    </student>
</university>
Koslo answered 17/8, 2017 at 18:23 Comment(1)
You should explain where "datatype" actually matters for your example.Soybean
H
20

DTD predates XML and is therefore not valid XML itself. That's probably the biggest reason for XSD's invention.

Herd answered 9/10, 2009 at 14:40 Comment(3)
exactly - XSD / XML Schema is XML itself - which is a really good thing!Mayweed
hmm, XSD adds more thing than just XML syntax; for instance, datatypesMessieurs
Maybe elaborate why it is desirable that the DTD is XML.Soybean
K
10

Similarities between XSD and DTD

both specify elements, attributes, nesting, ordering, #occurences

Differences between XSD and DTD

XSD also has data types, (typed) pointers, namespaces, keys and more.... unlike DTD 

Moreover though XSD is little verbose its syntax is extension of XML, making it convenient to learn fast.

Kettering answered 13/11, 2013 at 14:38 Comment(1)
DTD is more limited than XSD as far as #occurences with only the choices of 1, 0 or 1, 0 or more, while XSD can specify the minimum and maximum number.Bartram
F
9

One difference is that in a DTD the content model of an element is completely determined by its name, independently of where it appears in the document:

Assuming you want to have

  • a person element
  • with a child element called name
  • an name itself has child elements first and last.

Like this

   <person>
       <name>
            <first></first>
            <last></last>
       </name>
   </person>

If a city element in the same document also needs to have a child element 'name' the DTD requires that this 'name' element must have child elements first and last as well. Despite the fact that city.name does not require first and last as children.

In contrast, XML Schema allows you to declare child element types locally; you could declare the name child elements for both person and city separately. Thus giving them their proper content models in those contexts.

The other major difference is support for namespaces. Since DTDs are part of the original XML specification (and inherited from SGML), they are not namespace-aware at all because XML namespaces were specified later. You can use DTDs in combination with namespaces, but it requires some contortions, like being forced to define the prefixes in the DTD and using only those prefixes, instead of being able to use arbitrary prefixes.

To me, other differences are mostly superficial. Datatype support could easily be added to DTDs, and syntax is just syntax. (I, for one, find the XML Schema syntax horrible and would never want to hand-maintain an XML Schema, which I wouldn't say about DTDs or RELAX NG schemas; if I need an XML Schema for some reason, I usually write a RELAX NG one and convert it with trang.)

Flats answered 10/10, 2009 at 18:35 Comment(1)
Using the same name name for two different things (types) is never a good idea.Soybean
R
8

Similarities:

DTDs and Schemas both perform the same basic functions:

  • First, they both declare a list of elements and attributes.
  • Second, both describe how those elements are grouped, nested or used within the XML. In other words, they declare the rules by which you are allowing someone to create an XML file within your workflow, and
  • Third, both DTDs and schemas provide methods for restricting, or forcing, the type or format of an element. For example, within the DTD or Schema you can force a date field to be written as 01/05/06 or 1/5/2006.

Differences:

  • DTDs are better for text-intensive applications, while schemas have several advantages for data-intensive workflows.
  • Schemas are written in XML and thus follow the same rules, while DTDs are written in a completely different language.

Examples:

DTD:

<?xml version="1.0" encoding="UTF-8"?>
    <!ELEMENT employees (Efirstname, Elastname, Etitle, Ephone, Eemail)>
         <!ELEMENT Efirstname (#PCDATA)>
         <!ELEMENT Elastname (#PCDATA)>
         <!ELEMENT Etitle (#PCDATA)>
         <!ELEMENT Ephone (#PCDATA)>
         <!ELEMENT Eemail (#PCDATA)>

XSD:

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:od="urn:schemas-microsoft-com:officedata">
<xsd:element name="dataroot">
     <xsd:complexType>
          <xsd:sequence>
               <xsd:element ref="employees" minOccurs="0" maxOccurs="unbounded"/>
          </xsd:sequence>
          <xsd:attribute name="generated" type="xsd:dateTime"/>
      </xsd:complexType>
</xsd:element>
<xsd:element name="employees">
      <xsd:annotation>
           <xsd:appinfo>
               <od:index index-name="PrimaryKey" index-key="Employeeid " primary="yes"
                unique="yes" clustered="no"/>
          <od:index index-name="Employeeid" index-key="Employeeid " primary="no" unique="no"
           clustered="no"/>
     </xsd:appinfo>
</xsd:annotation>
     <xsd:complexType>
          <xsd:sequence>
               <xsd:element name="Elastname" minOccurs="0" od:jetType="text"
                od:sqlSType="nvarchar">
                    <xsd:simpleType>
                         <xsd:restriction base="xsd:string">
                              <xsd:maxLength value="50"/>
                         </xsd:restriction>
                    </xsd:simpleType>
               </xsd:element>
               <xsd:element name="Etitle" minOccurs="0" od:jetType="text" od:sqlSType="nvarchar">
                    <xsd:simpleType>
                         <xsd:restriction base="xsd:string">
                              <xsd:maxLength value="50"/>
                         </xsd:restriction>
                    </xsd:simpleType>
               </xsd:element>
               <xsd:element name="Ephone" minOccurs="0" od:jetType="text"
                od:sqlSType="nvarchar">
                    <xsd:simpleType>
                         <xsd:restriction base="xsd:string">
                              <xsd:maxLength value="50"/>
                         </xsd:restriction>
                    </xsd:simpleType>
               </xsd:element>
               <xsd:element name="Eemail" minOccurs="0" od:jetType="text"
               od:sqlSType="nvarchar">
                    <xsd:simpleType>
                         <xsd:restriction base="xsd:string">
                              <xsd:maxLength value="50"/>
                         </xsd:restriction>
                    </xsd:simpleType>
               </xsd:element>
               <xsd:element name="Ephoto" minOccurs="0" od:jetType="text"
                od:sqlSType="nvarchar">
                    <xsd:simpleType>
                         <xsd:restriction base="xsd:string">
                              <xsd:maxLength value="50"/>
                         </xsd:restriction>
                    </xsd:simpleType>
               </xsd:element>
          </xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
Robbynrobe answered 10/10, 2015 at 9:11 Comment(0)
H
7

XML DTD

The purpose of a DTD is to define the structure of an XML document. It defines the structure with a list of legal elements:

<!ATTLIST contact type CDATA #IMPLIED>
<!ELEMENT address1 ( #PCDATA)>
<!ELEMENT city ( #PCDATA)>
<!ELEMENT state ( #PCDATA)>
<!ELEMENT zip ( #PCDATA)>

XML Schema

XML Schema enables schema authors to specify that element quantity’s data must be numeric or, even more specifically, an integer. In the following example I used string:

<xs:element name="note">
<xs:complexType>
  <xs:sequence>
    <xs:element name="address1" type="xs:string"/>
    <xs:element name="city" type="xs:string"/>
    <xs:element name="state" type="xs:string"/>
    <xs:element name="zip" type="xs:string"/>
  </xs:sequence>
</xs:complexType>

Herrah answered 11/11, 2013 at 17:19 Comment(1)
I like your emphasis on "structure"; it's like with syntax diagrams (for programming languages): Not every syntactically correct program is semantically correct, and you cannot provide a syntax diagram to allow only semantically correct programs (it seems that is what people expect from XSD).Soybean
G
4

DTD can have only two types of data, the CDATA and the PCDATA. But in a schema you can use all the primitive data type that you use in the programming language and you have the flexibility of defining your own custom data types.

The developer building a schema can create custom data types based on the core data types and by using different operators and modifiers.

Gadmann answered 12/4, 2012 at 18:8 Comment(2)
DTD can also have the subset of CDATA called enumeration values.Bartram
See my comment on https://mcmap.net/q/126682/-what-is-difference-between-xml-schema-and-dtd, too.Soybean
C
4

When XML first came out, we were told it would solve all our problems: XML will be user-friendly, infinitely extensible, avoid strong-typing, and not require any programming skills. I learnt about DTD's and wrote my own XML parser. 15+ years later, I see that most XML is not user-friendly, and not very extensible (depending on its usage). As soon as some clever clogs hooked up XML to a database I knew that data types were all but inevitable. And, you should see the XSLT (transformation file) I had to work on the other day. If that isn't programming, I don't know what is! Nowadays it's not unusual to see all kinds of problems relating to XML data or interfaces gone bad. I love XML but, it has strayed far from its original altruistic starting point.

The short answer? DTD's have been deprecated in favor of XSD's because an XSD lets you define an XML structure with more precision.

Checkerwork answered 8/9, 2015 at 12:31 Comment(1)
Well, I guess more than 90% use XML just to represent nested data structures with a standard syntax, not caring about DTDs at all. Maybe because it's so easy to create XML (e.g. from a Java object) with current tools.Soybean
P
2

DTD is pretty much deprecated because it is limited in its usefulness as a schema language, doesn't support namespace, and does not support data type. In addition, DTD's syntax is quite complicated, making it difficult to understand and maintain..

Pauwles answered 9/10, 2009 at 18:19 Comment(1)
Deprecated? No. [XDR is deprecated] Going out of fashion? Maybe. More limited than XSD? Yes. Functionality subset of XSD functionality? No. Syntax too complex? Hardly, just different (IMHO). Personally I find DTD easier to read than XSD precisely because it isn't XML.Bartram

© 2022 - 2024 — McMap. All rights reserved.