ThePlace

Dave's Information Technology Resource


--- XML Syntax ---

Well-Formedness

- The W3C requires XML documents to be "well-formed".
- This differs from HTML, where rules are often stretched or broken (e.g., missing closing tags or misspelled tag names) and the browser still displays as much of the document as it can.
- A well-formed XML document must have:
  - A single root element
  - Tag names written with consistent case (XML is case-sensitive)
  - A closing tag for every opening tag
  - Properly nested elements
  - Quoted attribute values
  - Closed empty elements

 

Consider the following example...

<?xml version="1.0"?>
<letter>
<to>John</to>
<from>Bob</from>
<subject>Something important</subject>
<body>See me in the morning for the next assignment...</body>
</letter> 
The first line in the document is the XML declaration, which states the XML version.
This document conforms to the 1.0 specification of XML...

<?xml version="1.0"?>

The next line opens the first element of the document, called the root element...

<letter>

The next lines define four child elements of the root...

<to>John</to>
<from>Bob</from>
<subject>Something important</subject>
<body>See me in the morning for the next assignment...</body>

The last line defines the end of the root element.

</letter>
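
The walkthrough above can also be checked with a parser. The following is a minimal sketch, assuming Python's standard-library `xml.etree.ElementTree` module (the original page does not name any particular parser); it reads the letter document and lists the root element and its children.

```python
import xml.etree.ElementTree as ET

# The letter document from the example above.
LETTER = """<?xml version="1.0"?>
<letter>
<to>John</to>
<from>Bob</from>
<subject>Something important</subject>
<body>See me in the morning for the next assignment...</body>
</letter>"""

# fromstring() returns the root element of a well-formed document.
root = ET.fromstring(LETTER)

# Iterating over an element yields its direct children in document order.
child_tags = [child.tag for child in root]

print(root.tag)     # letter
print(child_tags)   # ['to', 'from', 'subject', 'body']
```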

The Structure of XML...

All XML documents must have a root tag

- All XML documents must contain a single tag pair that defines the root element.
- All other elements must be nested within the root element.
- Any element can have sub-elements (children).
- Sub-elements must have matching opening and closing tags and be correctly nested within their parent element:

<root>
  <child>
    <subchild>
    </subchild>
  </child>
</root> 

 

All XML elements must have a closing tag

- In HTML some elements do not have to have a closing tag.
- Therefore, the following code is legal in HTML:

<p>This is a paragraph
<p>This is another paragraph

- In XML, every element must have a closing tag, like this:

<p>This is a paragraph</p>
<p>This is another paragraph</p> 
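
A parser makes the difference visible. This is a small sketch, assuming Python's standard-library `xml.etree.ElementTree`; the `is_well_formed` helper and the `<doc>` wrapper element are hypothetical names added for illustration (the wrapper is needed because an XML document must have one root element).

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# HTML-style paragraphs without closing tags, wrapped in a hypothetical <doc> root.
unclosed = "<doc><p>This is a paragraph<p>This is another paragraph</doc>"

# The same paragraphs with every tag closed.
closed = "<doc><p>This is a paragraph</p><p>This is another paragraph</p></doc>"

print(is_well_formed(unclosed))  # False
print(is_well_formed(closed))    # True
```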

XML tags are case sensitive

- XML tag names are case-sensitive.
- For example, the tag <Letter> is different from the tag <letter>.
- Opening and closing tags must therefore be written with the same case:

<Message>This is incorrect</message>
<message>This is correct</message>
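
As a rough illustration, assuming Python's standard-library `xml.etree.ElementTree`, the sketch below shows both consequences of case sensitivity: a mismatched pair is rejected, and `<Letter>` and `<letter>` are treated as two unrelated elements. The `<mail>` wrapper element is a hypothetical name used only for this example.

```python
import xml.etree.ElementTree as ET

# A closing tag that differs only in case does not match its opening tag.
try:
    ET.fromstring("<Message>This is incorrect</message>")
    mismatched_parses = True
except ET.ParseError:
    mismatched_parses = False

# <Letter> and <letter> are different element names and can coexist.
root = ET.fromstring("<mail><Letter>first</Letter><letter>second</letter></mail>")
upper = root.findtext("Letter")
lower = root.findtext("letter")

print(mismatched_parses)  # False
print(upper, lower)       # first second
```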

 

Predefined Entities in XML

- There are five predefined character entities in XML:

  Character           Entity
  <                   &lt;
  >                   &gt;
  & (ampersand)       &amp;
  ' (apostrophe)      &apos;
  " (double quote)    &quot;

- The < and & characters must always be written as entities in ordinary text and attribute values. The apostrophe and double quote may appear literally in element text; they only need to be written as entities inside an attribute value that is delimited by the same quote character.
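
The sketch below, which assumes Python's standard library (`xml.etree.ElementTree` for parsing and `xml.sax.saxutils.escape` for writing), shows the entities being decoded by a parser and produced when text is escaped. The sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The parser replaces each predefined entity with the character it stands for.
element = ET.fromstring(
    "<note>5 &lt; 7 &amp; 7 &gt; 5, said &quot;Bob&quot;, it&apos;s true</note>"
)
decoded = element.text
print(decoded)  # 5 < 7 & 7 > 5, said "Bob", it's true

# Going the other way: escape() rewrites &, < and > so the text is safe
# to place inside an element.
escaped = escape("Tom & Jerry <tv>")
print(escaped)  # Tom &amp; Jerry &lt;tv&gt;
```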

 

Other numeric character references

- Any character can be written as a numeric reference to its Unicode character code, which includes the ASCII character codes (e.g., "A" is ASCII code 65, written &#65;).
- A numeric reference must end with a semicolon.
- This also covers characters not on the keyboard:
  - © - &#169;
  - £ - &#163;
  - ® - &#174;
- A Unicode character chart lists the numeric code for every character.
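
A quick way to confirm a numeric reference is to let a parser decode it. This is a minimal sketch assuming Python's standard-library `xml.etree.ElementTree`; the `<chars>` element name is hypothetical. It also shows the hexadecimal form (`&#xA9;`), which refers to the same character as the decimal `&#169;`.

```python
import xml.etree.ElementTree as ET

# Decimal references for A, ©, £ and ®, plus a hexadecimal reference for ©.
element = ET.fromstring("<chars>&#65; &#169; &#163; &#174; &#xA9;</chars>")
decoded = element.text
print(decoded)  # A © £ ® ©

# ord() gives the code number that goes between "&#" and ";".
codes = [ord(ch) for ch in "A©£®"]
print(codes)    # [65, 169, 163, 174]
```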

 

All XML elements must be properly nested

- In HTML some elements can be improperly nested within each other like this:

<b><i>This text is bold and italic</b></i>

- In XML all elements must be properly nested within each other like this:

<b><i>This text is bold and italic</i></b>
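
When nesting is wrong, a parser can usually say where it gave up. The sketch below assumes Python's standard-library `xml.etree.ElementTree`, whose `ParseError` carries a `position` attribute holding a (line, column) pair; other parsers report errors differently.

```python
import xml.etree.ElementTree as ET

# Overlapping tags: </b> arrives while <i> is still open.
try:
    ET.fromstring("<b><i>This text is bold and italic</b></i>")
    error_position = None
except ET.ParseError as err:
    error_position = err.position  # (line, column) reported by the parser

# Properly nested: <i> is closed before its parent <b>.
nested = ET.fromstring("<b><i>This text is bold and italic</i></b>")
inner_text = nested.findtext("i")

print(error_position is not None)  # True
print(inner_text)                  # This text is bold and italic
```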

 

Comments in an XML document

XML uses the same comment syntax as HTML...

<!--  comment goes here  -->

or, across multiple lines:

<!--
a comment can go here
or here
-->

 

Attribute values must always be quoted

- XML elements can have attributes in name/value pairs just like in HTML.
- The attribute value must always be quoted.
- Consider the two XML documents below.

 

This one is incorrect:

<?xml version="1.0"?>
<letter date=3/27/01>
<to>John</to>
<from>Bob</from>
<subject>Something important</subject>
<body>See me in the morning for the next assignment...</body>
</letter> 

 

This one is correct:

<?xml version="1.0"?>
<letter date="3/27/01">
<to>John</to>
<from>Bob</from>
<subject>Something important</subject>
<body>See me in the morning for the next assignment...</body>
</letter> 
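
The two documents can be compared with a parser. This is a sketch under the assumption that Python's standard-library `xml.etree.ElementTree` is used; the documents are shortened to the root element and one child to keep the example brief.

```python
import xml.etree.ElementTree as ET

# Shortened versions of the two letters above.
unquoted = "<letter date=3/27/01><to>John</to></letter>"
quoted = '<letter date="3/27/01"><to>John</to></letter>'

# The unquoted attribute value makes the document ill-formed.
try:
    ET.fromstring(unquoted)
    unquoted_error = None
except ET.ParseError as err:
    unquoted_error = str(err)

# With quotes, the value is available through get().
date_value = ET.fromstring(quoted).get("date")

print(unquoted_error is not None)  # True
print(date_value)                  # 3/27/01
```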

 

 

Non-Standard Text

- Sometimes a document must carry text that the parser should not interpret as markup, such as a fragment of HTML.
- Such text can be placed in a CDATA section, which the parser passes through as literal character data:

<?xml version="1.0"?>
<letter>
<to>John</to>
<from>Bob</from>
<subject>Something important</subject>
<![CDATA[
<head><title>Title for this Document Goes Here</title></head>
<body bgcolor="blue">
<h1>This is a Header 1 in HTML</h1>
<p>this is text in the document
</body>
]]>

</letter>
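
To see that the contents of a CDATA section are not parsed, the following sketch (assuming Python's standard-library `xml.etree.ElementTree`) wraps a shortened HTML fragment in CDATA inside a `<body>` element. Placing the section inside `<body>` is a choice made for this illustration so the text is easy to retrieve.

```python
import xml.etree.ElementTree as ET

doc = """<letter>
<to>John</to>
<body><![CDATA[<h1>This is a Header 1 in HTML</h1> & <p>unclosed paragraph]]></body>
</letter>"""

root = ET.fromstring(doc)
body = root.find("body")

# The markup-like characters arrive as plain text, not as child elements.
body_text = body.text
body_children = len(body)

print(body_text)      # <h1>This is a Header 1 in HTML</h1> & <p>unclosed paragraph
print(body_children)  # 0
```

Without the CDATA section, the bare `&` and the unclosed `<p>` would make this document ill-formed.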

 

Namespaces

- Namespaces are used to minimize confusion between common element names (e.g., <name>).
- A namespace is identified by a unique Uniform Resource Identifier (URI). The URI does not have to point to a real document; it is sometimes "made up".
- The point is to create a unique and separating reference.
- The namespace prefix must start with a letter or underscore and may contain only letters, underscores, digits, hyphens, and periods.
- The syntax for declaring a namespace follows:

<namespace:elementname xmlns:namespace="globallyUniqueURI">
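
The sketch below shows two `<name>` elements kept apart by namespaces. It assumes Python's standard-library `xml.etree.ElementTree`; the prefixes `lib` and `auth` and the `http://example.com/...` URIs are invented for the example and do not point to real documents.

```python
import xml.etree.ElementTree as ET

doc = """<lib:book xmlns:lib="http://example.com/library"
          xmlns:auth="http://example.com/authors">
  <lib:name>Learning XML</lib:name>
  <auth:name>Anna Smith</auth:name>
</lib:book>"""

root = ET.fromstring(doc)

# Map prefixes to URIs so search paths can use the short prefix form.
ns = {
    "lib": "http://example.com/library",
    "auth": "http://example.com/authors",
}

book_title = root.findtext("lib:name", namespaces=ns)
author = root.findtext("auth:name", namespaces=ns)

# ElementTree stores the full URI in the tag, not the prefix.
print(root.tag)    # {http://example.com/library}book
print(book_title)  # Learning XML
print(author)      # Anna Smith
```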
 

 

XML Attributes

XML attributes are used to describe XML elements, or to provide additional information about elements. 

- Consider the following HTML: <IMG SRC="davepic.gif">.
- SRC is an attribute of the IMG element.
- The SRC attribute provides additional information about the element.

Attributes are always contained within the start tag of an element. 

Use of Elements vs. Attributes

Take a look at these examples:

Using an Attribute for sex:

<person sex="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>

Using an Element for sex:

<person>
  <sex>female</sex>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>

- In the first example sex is an attribute. In the second example sex is an element. Both examples provide the same information to the reader.
- Consider:
  - There are no fixed rules about when to use attributes to describe data, and when to use elements.
  - A good general rule: in XML, avoid attributes when the same information can be expressed using elements.
- Here is another example, demonstrating how elements can be used instead of attributes. The following three XML documents contain exactly the same information:

 

A date attribute is used...

<?xml version="1.0"?>
<letter date="3/27/01">
<to>John</to>
<from>Bob</from>
<subject>Something important</subject>
<body>See me in the morning...</body>
</letter> 

A date element is used...

<?xml version="1.0"?>
<letter>
<date>3/27/01</date>
<to>John</to>
<from>Bob</from>
<subject>Something important</subject>
<body>See me in the morning...</body>
</letter> 

An expanded date element is used...

<?xml version="1.0"?>
<letter>
<date>
  <day>27</day>
  <month>3</month>
  <year>01</year>
</date>

<to>John</to>
<from>Bob</from>
<subject>Something important</subject>
<body>See me in the morning...</body>
</letter> 
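
From a program's point of view, the three forms differ mainly in how the date is read back. This is a sketch assuming Python's standard-library `xml.etree.ElementTree`, with the letters shortened to the date and one child element; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Shortened versions of the three letters: attribute, element, expanded element.
with_attribute = ET.fromstring('<letter date="3/27/01"><to>John</to></letter>')
with_element = ET.fromstring("<letter><date>3/27/01</date><to>John</to></letter>")
expanded = ET.fromstring(
    "<letter><date><month>3</month><day>27</day><year>01</year></date>"
    "<to>John</to></letter>"
)

# An attribute is read with get(); an element's text with findtext().
date_from_attribute = with_attribute.get("date")
date_from_element = with_element.findtext("date")

# The expanded form exposes each part separately, so no string splitting is needed.
date_from_parts = "/".join(
    expanded.findtext(f"date/{part}") for part in ("month", "day", "year")
)

print(date_from_attribute, date_from_element, date_from_parts)
```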

Reasons to avoid attributes:

- Attributes cannot contain multiple values (elements can)
- Attributes are not expandable (for future changes)
- Attributes cannot describe structures (as child elements can)
- Attributes are more difficult to manipulate with program code
- Attribute values are not easy to test against a DTD

Like everything, there are exceptions to the attribute rule...

- You may want to assign ID references to elements in your XML documents.
- ID references can be used to access XML elements in much the same way as the NAME or ID attributes in HTML.
- The following two documents demonstrate this:

<?xml version="1.0"?>
<letter ref="222">
<to>John</to>
<from>Bob</from>
<subject>Something important</subject>
<body>See me in the morning for the next assignment...</body>
</letter>

<?xml version="1.0"?>
<letter ref="223">
<to>George</to>
<from>Bob</from>
<subject>Something else important</subject>
<body>Yadda, yadda...</body>
</letter> 

- The "ref" attribute in these examples is just a counter, or unique identifier, that distinguishes one letter from another.

Some Examples

- Note: the following must be viewed with Internet Explorer 5/6 or Netscape 5+
- library.xml
- note.xml
- plant_catalog.xml

 


Copyright © 1999 - 2005 
ThePlace - Written and Sponsored by Dave Hillman