ThePlace

Dave's Information Technology Resource


--- XML and DTD ---

XML and DTDs

What is a DTD?

- DTD = Document Type Definition.
- The purpose of a DTD is to define the legal building blocks of an XML document.
- It defines the document structure with a list of legal elements.
- A DTD can be declared inline in your XML document, or as an external reference.

 

Why do we have DTDs?

- Some XML parsers, such as Internet Explorer, merely check XML for "well-formedness".
- DTDs are used to ensure that the XML data meets criteria for specific structure and layout.
- DTDs are very useful when data is to be exchanged between different data systems.
- Tools that act as validating parsers include:
  - XML Parser for Java
  - Chilkat XML Parser
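
The difference between "well-formed" and "valid" can be sketched with a non-validating parser. The example below is a minimal illustration that assumes Python's standard xml.etree.ElementTree module (not a tool named in this tutorial); the helper name is_well_formed and the two sample documents are made up for the demonstration. The parser accepts a document that breaks its own DTD, and only rejects the one with a syntax error.

```python
import xml.etree.ElementTree as ET

# The DTD requires a <body> element, but this document leaves it out.
INVALID_BUT_WELL_FORMED = """<?xml version="1.0"?>
<!DOCTYPE letter [
  <!ELEMENT letter (to,from,subject,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<letter><to>John</to><from>Bob</from><subject>Hi</subject></letter>"""

# The closing tag does not match the opening tag.
NOT_WELL_FORMED = "<letter><to>John</from></letter>"


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(INVALID_BUT_WELL_FORMED))  # the DTD violation goes unnoticed
print(is_well_formed(NOT_WELL_FORMED))          # the syntax error is caught
```

Catching the missing body element would take a validating parser.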

 

Internal DTDs

This is an XML document with an internal Document Type Definition:

<?xml version="1.0"?>
<!DOCTYPE letter [
  <!ELEMENT letter    (to,from,subject,body)>
  <!ELEMENT to      (#PCDATA)>
  <!ELEMENT from    (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT body    (#PCDATA)>
]>
<letter>
<to>John</to>
<from>Bob</from>
<subject>Something important</subject>
<body>See me in the morning for the next assignment...</body>
</letter> 

The DTD is interpreted like this:

!ELEMENT letter (in line 2) defines the "letter" element as having four child elements, in this order: "to, from, subject, body".
!ELEMENT to (in line 3) defines the "to" element to be of the type "#PCDATA".
!ELEMENT from (in line 4) defines the "from" element to be of the type "#PCDATA",
and so on.

External DTD

This is an XML document with an external DTD:

<?xml version="1.0"?>
<!DOCTYPE letter SYSTEM "note.dtd">
<letter>
<to>John</to>
<from>Bob</from>
<subject>Something important</subject>
<body>See me in the morning for the next assignment...</body>
</letter> 

This is what you find in "note.dtd" containing the Document Type Definition:

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT letter    (to,from,subject,body)>
<!ELEMENT to      (#PCDATA)>
<!ELEMENT from    (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT body    (#PCDATA)>
 

Why use a DTD?

- XML provides an application-independent way of sharing data.
- Different groups of people can agree on a common DTD for interchanging data.
- Your application can use a standard DTD to verify that data you receive from the outside world is valid.
- You can also use a DTD to verify your own data.

 

 

DTD - XML building blocks

XML documents (and HTML documents) are made up of the following building blocks:

- Elements
- Tags
- Attributes
- Entities
- PCDATA
- CDATA

Elements

- Elements are the main building blocks of both XML and HTML documents.
- Examples of HTML elements are "body" and "table".
- Examples of XML elements could be "note" and "message".
- Elements can contain text, other elements, or be empty.
- Examples of empty HTML elements are "hr", "br" and "img".

Tags

- Tags are used to mark up elements.
- A starting tag like <element_name> marks the beginning of an element, and an ending tag like </element_name> marks the end of an element.
- Examples:
  A body element: <body>body text in between</body>
  A message element: <message>some message in between</message>

Attributes

- Attributes provide extra information about elements.
- Attributes are placed inside the start tag of an element.
- Attributes come in name/value pairs.
- The following "img" element carries additional information about a source file:
<img src="computer.gif" />
- The name of the element is "img".
- The name of the attribute is "src".
- The value of the attribute is "computer.gif".
- Since the element itself is empty, it is closed by a " /".

Entities

- Entities are variables used to define common text.
- Entity references are references to entities.
- Consider the HTML entity reference "&nbsp;", which inserts a non-breaking space in an HTML document.
- Entities are expanded when a document is parsed by an XML parser.
- The following entities are predefined in XML:

Entity reference   Character
&lt;               <
&gt;               >
&amp;              &
&quot;             "
&apos;             '
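
As a small illustration of the table above, the sketch below assumes Python's standard library (xml.etree.ElementTree for parsing and xml.sax.saxutils.escape for writing); the sample strings are made up. It shows the five predefined references being expanded by a parser, and the reverse step of escaping text before placing it in a document.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Parsing: the five predefined entity references are expanded to characters.
element = ET.fromstring(
    "<message>&lt;b&gt; &amp; &quot;text&quot; &apos;here&apos;</message>"
)
expanded = element.text
print(expanded)

# Writing: escape() replaces &, < and > so the text is safe as element content.
escaped = escape("1 < 2 & 3 > 2")
print(escaped)
```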

PCDATA

- PCDATA means parsed character data.
- Think of character data as the text found between the start tag and the end tag of an XML element.
- PCDATA is text that will be parsed by a parser.
- Tags inside the text will be treated as markup and entities will be expanded.

CDATA

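- CDATA also means character data.
- CDATA is text that will NOT be parsed by a parser.
- Tags inside the text will NOT be treated as markup and entities will not be expanded.
- Inside a document, such text is written as a CDATA section: <![CDATA[ ... ]]>. In a DTD, CDATA is also the name of an attribute type.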
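
The contrast between parsed and unparsed character data can be sketched as follows. This is a minimal example assuming Python's standard xml.etree.ElementTree module; the note element and its text are made up. The same characters are read once as ordinary element content and once inside a CDATA section.

```python
import xml.etree.ElementTree as ET

# Parsed character data: &amp; is expanded and <b> becomes a child element.
parsed = ET.fromstring("<note>Fish &amp; <b>chips</b></note>")

# CDATA section: the same characters are kept exactly as written.
unparsed = ET.fromstring("<note><![CDATA[Fish &amp; <b>chips</b>]]></note>")

print(repr(parsed.text), [child.tag for child in parsed])
print(repr(unparsed.text), [child.tag for child in unparsed])
```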
 

 

DTD - Elements

Declaring an Element

- In the DTD, XML elements are declared with an element declaration.
- An element declaration has the following syntax:
<!ELEMENT element-name (element-content)>

Empty elements

- Empty elements are declared with the keyword EMPTY, written without parentheses:
<!ELEMENT element-name EMPTY>

example:
<!ELEMENT img EMPTY>


Elements with data

- Elements with data are declared with the data type:
<!ELEMENT element-name (#PCDATA)>
or
<!ELEMENT element-name ANY>
example:
<!ELEMENT letter (#PCDATA)>
- #PCDATA means that the element contains character data that IS going to be parsed by a parser.
- The keyword ANY (written without parentheses) declares an element with any content.
- There is no #CDATA keyword for element content. CDATA is an attribute type, and unparsed text inside a document is written as a CDATA section.
- An element declared as (#PCDATA) cannot contain child elements. To allow text mixed with elements, declare mixed content, and declare each of those elements as well.

Elements with children (sequences)

- Elements with one or more children are declared with the names of the child elements inside the parentheses:
<!ELEMENT element-name (child-element-name)>
or
<!ELEMENT element-name (child-element-name,child-element-name,...)>
example:
<!ELEMENT letter (to,from,subject,body)>
- When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document.
- In a full declaration, the children must also be declared, and the children can also have children.
- The full declaration of the letter document will be:
<!ELEMENT letter    (to,from,subject,body)>
<!ELEMENT to      (#PCDATA)>
<!ELEMENT from    (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT body    (#PCDATA)>

Wrapping

- If the DTD is to be included in your XML source file, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element [element-declarations]>
example:
<?xml version="1.0"?>
<!DOCTYPE letter [
  <!ELEMENT letter    (to,from,subject,body)>
  <!ELEMENT to      (#PCDATA)>
  <!ELEMENT from    (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT body    (#PCDATA)>
]>
<letter>
<to>John</to>
<from>Bob</from>
<subject>Something important</subject>
<body>See me in the morning for the next assignment...</body>
</letter> 

Declaring only one occurrence of the same element 

<!ELEMENT element-name (child-name)>
example
<!ELEMENT letter (message)>
- The example declaration above declares that the child element message must occur exactly once inside the letter element.

Declaring minimum one occurrence of the same element

<!ELEMENT element-name (child-name+)>
example
<!ELEMENT letter (message+)>
- The + sign in the example above declares that the child element message must occur one or more times inside the letter element.

Declaring zero or more occurrences of the same element 

<!ELEMENT element-name (child-name*)>
example
<!ELEMENT letter (message*)>
- The * sign in the example above declares that the child element message can occur zero or more times inside the letter element.

Declaring zero or one occurrences of the same element 

<!ELEMENT element-name (child-name?)>
example
<!ELEMENT letter (message?)>
- The ? sign in the example above declares that the child element message can occur zero times or once inside the letter element.
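
The occurrence indicators behave much like the same symbols in regular expressions. The sketch below is only an illustration of that analogy, not how a real validating parser is implemented: it assumes Python's standard library, and the helper names model_to_regex and children_match are hypothetical. It handles content models built from element names, commas, |, ?, * and +, and ignores #PCDATA, EMPTY and ANY.

```python
import re
import xml.etree.ElementTree as ET

# Simplified pattern for an element name inside a content model.
NAME = re.compile(r"[A-Za-z_][\w.-]*")


def model_to_regex(model: str) -> re.Pattern:
    """Translate a children content model such as "(to+,from,message*)" to a regex."""
    compact = re.sub(r"\s+", "", model)
    # Each element name becomes a group that matches "name;".
    pattern = NAME.sub(lambda m: "(?:" + re.escape(m.group(0)) + ";)", compact)
    # A comma means "followed by", which is plain concatenation in a regex.
    return re.compile(pattern.replace(",", ""))


def children_match(model: str, xml_text: str) -> bool:
    """Check the direct children of the root element against the content model."""
    root = ET.fromstring(xml_text)
    sequence = "".join(child.tag + ";" for child in root)
    return model_to_regex(model).fullmatch(sequence) is not None


print(children_match("(message+)", "<letter><message/><message/></letter>"))
print(children_match("(message+)", "<letter/>"))
print(children_match("(message?)", "<letter><message/><message/></letter>"))
print(children_match("(to+,from,message*)", "<letter><to/><to/><from/></letter>"))
```

The four calls check one or more, one or more against an empty element, zero or one against two occurrences, and a sequence that combines several indicators.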

Declaring mixed content

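example
<!ELEMENT letter (#PCDATA|to|from|subject|message)*>
- The example above declares that the element letter can contain parsed character data mixed with any number of to, from, subject and message child elements, in any order.
- In a mixed content declaration, #PCDATA must come first, the alternatives are separated by |, and the group must end with *. Occurrence indicators on individual children (such as to+) cannot be combined with #PCDATA.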

 

 

DTD - Attributes

Declaring Attributes

- In the DTD, XML element attributes are declared with an ATTLIST declaration. An attribute declaration has the following syntax:
<!ATTLIST element-name attribute-name attribute-type default-value>
- As you can see from the syntax above, the ATTLIST declaration defines the element which can have the attribute, the name of the attribute, the type of the attribute, and the default attribute value.
- The attribute-type can have the following values:

 

Value            Explanation
CDATA            The value is character data
(eval|eval|..)   The value must be one of an enumerated list
ID               The value is a unique id
IDREF            The value is the id of another element
IDREFS           The value is a list of other ids
NMTOKEN          The value is a valid XML name token
NMTOKENS         The value is a list of valid XML name tokens
ENTITY           The value is the name of an entity
ENTITIES         The value is a list of entity names
NOTATION         The value is the name of a notation
xml:             The value is predefined

 

- The default-value can have the following values:

Value            Explanation
"value"          The attribute has a default value
#REQUIRED        The attribute value must be included in the element
#IMPLIED         The attribute does not have to be included
#FIXED "value"   The attribute value is fixed


Attribute declaration example

DTD example:
<!ELEMENT square EMPTY>
<!ATTLIST square width CDATA "0">

XML example:
<square width="100" />
- In the above example the element square is defined to be an empty element with the attribute width of type CDATA. The width attribute has a default value of 0.

Default attribute value

Syntax:
<!ATTLIST element-name attribute-name CDATA "default-value">

DTD example:
<!ATTLIST payment type CDATA "check">

XML example:
<payment type="check" />
- Specifying a default value for an attribute ensures that the attribute will get a value even if the author of the XML document didn't include it.
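
How a default value reaches the application can be sketched like this. The example assumes Python's standard xml.etree.ElementTree module, whose underlying parser reads ATTLIST declarations from an internal DTD; the EMPTY declaration for payment is added here only to make the sample document complete.

```python
import xml.etree.ElementTree as ET

# Internal DTD with the default value from the example above.
DTD = """<!DOCTYPE payment [
  <!ELEMENT payment EMPTY>
  <!ATTLIST payment type CDATA "check">
]>
"""

# The author left the attribute out: the parser supplies the default.
omitted = ET.fromstring(DTD + "<payment/>")

# The author wrote the attribute: the written value wins.
explicit = ET.fromstring(DTD + '<payment type="cash"/>')

print(omitted.get("type"), explicit.get("type"))
```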

Implied attribute

Syntax:
<!ATTLIST element-name attribute-name attribute-type #IMPLIED>
DTD example:
<!ATTLIST contact fax CDATA #IMPLIED>

XML example:
<contact fax="555-667788" />
- Use an implied attribute if you don't want to force the author to include an attribute and you don't have an option for a default value either.

Required attribute

Syntax:
<!ATTLIST element-name attribute-name attribute-type #REQUIRED>
DTD example:
<!ATTLIST person number CDATA #REQUIRED>

XML example:
<person number="5677" />
- Use a required attribute if you don't have an option for a default value, but still want to force the attribute to be present.

Fixed attribute value

Syntax:
<!ATTLIST element-name attribute-name attribute-type #FIXED "value">
DTD example:
<!ATTLIST sender company CDATA #FIXED "Microsoft">


XML example:
<sender company="Microsoft" />
- Use a fixed attribute value when you want an attribute to have a fixed value without allowing the author to change it. If an author includes another value, a validating XML parser will return an error.

Enumerated attribute values

Syntax:
<!ATTLIST element-name attribute-name (eval|eval|..) default-value>
DTD example:
<!ATTLIST payment type (check|cash) "cash">

XML example:
<payment type="check" />
or
<payment type="cash" />
- Use enumerated attribute values when you want the attribute value to be one of a fixed set of legal values.

 

 

DTD - Entities

Entities

- Entities are variables used to define shortcuts to common text.
- Entity references are references to entities.
- Entities can be declared internally.
- Entities can be declared externally.

Internal Entity Declaration

Syntax: 
<!ENTITY entity-name "entity-value">

DTD Example:
<!ENTITY writer "Jan Egil Refsnes.">
<!ENTITY copyright "Copyright XML101.">
XML example:
<author>&writer;&copyright;</author>
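
The expansion of the two internal entities above can be sketched as follows, assuming Python's standard xml.etree.ElementTree module. The declarations are wrapped in a DOCTYPE so that the sample is a complete document, and the ELEMENT declaration for author is added only for that reason.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<!DOCTYPE author [
  <!ELEMENT author (#PCDATA)>
  <!ENTITY writer "Jan Egil Refsnes.">
  <!ENTITY copyright "Copyright XML101.">
]>
<author>&writer;&copyright;</author>"""

# The parser replaces each entity reference with its declared value.
author = ET.fromstring(DOCUMENT)
print(author.text)
```

Because the two references are adjacent, the expanded text runs together without a space between the sentences.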

External Entity Declaration

Syntax: 
<!ENTITY entity-name SYSTEM "URI/URL">

DTD Example:
<!ENTITY writer    SYSTEM "http://www.xml101.com/entities/entities.xml">
<!ENTITY copyright SYSTEM "http://www.xml101.com/entities/entities.dtd">
XML example:
<author>&writer;&copyright;</author>
 

 

DTD Validation

Validating with the XML Parser

- When the XML parser loads an XML document, it might generate an error.
- By accessing the parseError object, the exact error code, the error text, and even the line that caused the error can be retrieved:
// ActiveXObject is available in Internet Explorer
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async = false;            // load synchronously, so parseError is ready after load()
xmlDoc.validateOnParse = true;   // check the document against its DTD
xmlDoc.load("note_dtd_error.xml");

// report what the parser found
document.write("<br>Error Code: ");
document.write(xmlDoc.parseError.errorCode);
document.write("<br>Error Reason: ");
document.write(xmlDoc.parseError.reason);
document.write("<br>Error Line: ");
document.write(xmlDoc.parseError.line);
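
The same idea of asking the parser for an error code, a reason and a line can be sketched outside the browser. The example below assumes Python's standard xml.etree.ElementTree and xml.parsers.expat modules rather than the Microsoft parser, and the BROKEN document is made up. Note the difference: this parser does not validate against a DTD, so it reports well-formedness errors only.

```python
import xml.etree.ElementTree as ET
from xml.parsers.expat import ErrorString

# The <from> element is closed with the wrong tag on line 3.
BROKEN = """<letter>
<to>John</to>
<from>Bob</to>
</letter>"""

try:
    ET.fromstring(BROKEN)
    error = None
except ET.ParseError as exc:
    error = exc

if error is not None:
    line, column = error.position      # position is a (line, column) tuple
    reason = ErrorString(error.code)   # human-readable text for the error code
    print("Error Code:", error.code)
    print("Error Reason:", reason)
    print("Error Line:", line)
```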

Turning Validation off

- Validation can be turned off by setting the XML parser's validateOnParse property to false:
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async = false;            // load synchronously, so parseError is ready after load()
xmlDoc.validateOnParse = false;  // only check that the document is well-formed
xmlDoc.load("note_dtd_error.xml");

// report what the parser found
document.write("<br>Error Code: ");
document.write(xmlDoc.parseError.errorCode);
document.write("<br>Error Reason: ");
document.write(xmlDoc.parseError.reason);
document.write("<br>Error Line: ");
document.write(xmlDoc.parseError.line);

 

 

DTD - Examples

TV Schedule DTD

By David Moisan. Copied from: http://www1.shore.net/~dmoisan/ 

<!DOCTYPE TVSCHEDULE [ 

<!ELEMENT TVSCHEDULE (CHANNEL+)>
<!ELEMENT CHANNEL (BANNER, DAY+)>
<!ELEMENT BANNER (#PCDATA)>
<!ELEMENT DAY ((DATE, HOLIDAY) | (DATE, PROGRAMSLOT+))+>
<!ELEMENT HOLIDAY (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT PROGRAMSLOT (TIME, TITLE, DESCRIPTION?)>
<!ELEMENT TIME (#PCDATA)>
<!ELEMENT TITLE (#PCDATA)> 
<!ELEMENT DESCRIPTION (#PCDATA)>

<!ATTLIST TVSCHEDULE NAME CDATA #REQUIRED>
<!ATTLIST CHANNEL CHAN CDATA #REQUIRED>
<!ATTLIST PROGRAMSLOT VTR CDATA #IMPLIED>
<!ATTLIST TITLE RATING CDATA #IMPLIED>
<!ATTLIST TITLE LANGUAGE CDATA #IMPLIED>

]>

 

A Report DTD

By Richard Erlander. Copied from: http://pdbeam.uwaterloo.ca/~rlander/ 

<!DOCTYPE REPORT [ 

<!ELEMENT REPORT (TITLE,(SECTION|SHORTSECT)+)>
<!ELEMENT SECTION (TITLE,%BODY;,SUBSECTION*)>
<!ELEMENT SUBSECTION (TITLE,%BODY;,SUBSECTION*)>
<!ELEMENT SHORTSECT (TITLE,%BODY;)>
<!ELEMENT TITLE %TEXT;>
<!ELEMENT PARA %TEXT;>
<!ELEMENT LIST (ITEM)+>
<!ELEMENT ITEM (%BLOCK;)>
<!ELEMENT CODE (#PCDATA)>
<!ELEMENT KEYWORD (#PCDATA)>
<!ELEMENT EXAMPLE (TITLE?,%BLOCK;)>
<!ELEMENT GRAPHIC EMPTY>

<!ATTLIST REPORT security (high | medium | low ) "low">
<!ATTLIST CODE type CDATA #IMPLIED>
<!ATTLIST GRAPHIC file ENTITY #REQUIRED>

<!ENTITY xml "Extensible Markup Language">
<!ENTITY sgml "Standard Generalized Markup Language">
<!ENTITY pxa "Professional XML Authoring">
<!ENTITY % TEXT "(#PCDATA|CODE|KEYWORD|QUOTATION)*">
<!ENTITY % BLOCK "(PARA|LIST)+">
<!ENTITY % BODY "(%BLOCK;|EXAMPLE|NOTE)+">

<!NOTATION GIF SYSTEM "">
<!NOTATION JPG SYSTEM "">
<!NOTATION BMP SYSTEM "">

]>

 

Newspaper Article DTD

Copied from http://www.vervet.com/ 

<!DOCTYPE NEWSPAPER [ 

<!ELEMENT NEWSPAPER (ARTICLE+)>
<!ELEMENT ARTICLE (HEADLINE, BYLINE, LEAD, BODY, NOTES)>
<!ELEMENT HEADLINE (#PCDATA)>
<!ELEMENT BYLINE (#PCDATA)>
<!ELEMENT LEAD (#PCDATA)>
<!ELEMENT BODY (#PCDATA)>
<!ELEMENT NOTES (#PCDATA)> 

<!ATTLIST ARTICLE AUTHOR CDATA #REQUIRED>
<!ATTLIST ARTICLE EDITOR CDATA #IMPLIED>
<!ATTLIST ARTICLE DATE CDATA #IMPLIED>
<!ATTLIST ARTICLE EDITION CDATA #IMPLIED>

<!ENTITY NEWSPAPER "Vervet Logic Times">
<!ENTITY PUBLISHER "Vervet Logic Press">
<!ENTITY COPYRIGHT "Copyright 1998 Vervet Logic Press">

]>

 

Product Catalog DTD

Copied from http://www.vervet.com/ 

<!DOCTYPE CATALOG [ 

<!ELEMENT CATALOG (PRODUCT+)>
<!ELEMENT PRODUCT (SPECIFICATIONS+, OPTIONS?, PRICE+, NOTES?)>
<!ELEMENT SPECIFICATIONS (#PCDATA)>
<!ELEMENT OPTIONS (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ELEMENT NOTES (#PCDATA)>

<!ATTLIST PRODUCT NAME CDATA #IMPLIED>
<!ATTLIST PRODUCT CATEGORY (HandTool | Table | Shop-Professional) "HandTool">
<!ATTLIST PRODUCT PARTNUM CDATA #IMPLIED>
<!ATTLIST PRODUCT PLANT (Pittsburgh | Milwaukee | Chicago) "Chicago">
<!ATTLIST PRODUCT INVENTORY (InStock | Backordered | Discontinued) "InStock">
<!ATTLIST SPECIFICATIONS WEIGHT CDATA #IMPLIED>
<!ATTLIST SPECIFICATIONS POWER CDATA #IMPLIED>
<!ATTLIST OPTIONS FINISH (Metal | Polished | Matte) "Matte">
<!ATTLIST OPTIONS ADAPTER (Included | Optional | NotApplicable) "Included">
<!ATTLIST OPTIONS CASE (HardShell | Soft | NotApplicable) "HardShell">
<!ATTLIST PRICE MSRP CDATA #IMPLIED>
<!ATTLIST PRICE WHOLESALE CDATA #IMPLIED>
<!ATTLIST PRICE STREET CDATA #IMPLIED>
<!ATTLIST PRICE SHIPPING CDATA #IMPLIED>

<!ENTITY AUTHOR "John Doe">
<!ENTITY COMPANY "JD Power Tools, Inc.">
<!ENTITY EMAIL "jd@jd-tools.com">

]>

 

 

 

 


Copyright © 1999 - 2005 
ThePlace - Written and Sponsored by Dave Hillman