Validating XML with XML Schema

As the Internet moves forward, Extensible Markup Language (XML) is poised to become the method for interchanging information among all sorts of devices. For instance, a hand-held Global Positioning System device might be Internet-enabled to receive weather reports encoded in XML. Such a device doesn't have much spare memory for the error-checking and “forgiving” that a browser does with your HTML, so servers must ensure that the data is “good to go” before sending it to the device. XML Schema is a new method from the World Wide Web Consortium (W3C) for making sure your data is valid.

Before we can describe XML Schema, we have to discuss what we mean when we say “valid” and how documents are currently validated. Let's look at a sample weather report, written in the just-invented EWEML (Eisenberg's Weather Example Markup Language).

   <report>
      <datestamp>2000-09-01</datestamp>
      <station fullname="San Jose" abbrev="KSJC">
         <latitude>37.3618619</latitude>
         <longitude>-121.9290089</longitude>
      </station>
      <temperature>
         <min>20</min>
         <max>29</max>
         <forecast-low>21</forecast-low>
         <forecast-high>30</forecast-high>
      </temperature>
      <wind>
         <speed>5</speed>
         <direction>NNW</direction>
      </wind>
   </report>

The first step in quality control is making sure that the document follows the basic rules of XML. Among these rules:

  1. opening tags must have closing tags
  2. tags must be nested properly
  3. attribute values must be enclosed in quote marks
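
Any XML parser enforces the rules listed above. As a rough illustration (not part of EWEML or of XML Schema), here is a minimal Python sketch that uses the standard library's xml.etree.ElementTree to test a few fragments. The helper name is_well_formed and the sample fragments are invented for this example; the exact wording of the error messages depends on the underlying parser, so none is shown here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return (True, None) if text parses as XML, else (False, error message).

    Hypothetical helper for illustration; it checks only the basic
    XML rules, not whether the document makes sense as a weather report.
    """
    try:
        ET.fromstring(text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)


# One good fragment, then one fragment breaking each of the three rules.
samples = {
    "good": '<station abbrev="KSJC"><latitude>37.36</latitude></station>',
    "unclosed tag": '<station abbrev="KSJC"><latitude>37.36</station>',
    "bad nesting": '<wind><speed>5<direction></speed>NNW</direction></wind>',
    "unquoted attribute": '<station abbrev=KSJC></station>',
}

for label, text in samples.items():
    ok, message = is_well_formed(text)
    status = "well-formed" if ok else "not well-formed: " + message
    print(label, "->", status)
```

Only the first fragment is accepted; the parser rejects the other three without knowing anything about what a weather report should contain.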

A document that follows these “punctuation rules” is called well-formed. You don't need any information other than the document itself to tell if all the tags are closed or nested correctly. There are, however, some questions you can't answer just by looking at the document.
