
Assignment 1
Well-Formedness

This document was updated on 3 July 2002. Updates are marked with a red bar at the side, as this paragraph is marked.

The objects of this assignment are:

Part One

The following HTML file is not well-formed. You must convert it to well-formed XHTML. You may not do so by eliminating all the tags and keeping only the text! The resulting file must look the same in a web browser as the original file.

Copy and paste the text below into a file. Name the file lastname_initial_1a.html. For example, if your name is “Frank Smith,” the file would have a name like smith_f_1a.html

<html>
<head>
<title> Exercise 1a </title>
</head>

<body bgcolor=white>
<div align="center">
<h2>Exercise 1a</h2>
<img src=xml_logo.png border=1
    width=100 height=40 alt="XML Logo">
</div>
<p>
The goal of this exercise is to convert a non-well-formed HTML
file to a file that will be <b><i>well-formed</b></i> with respect to the
rules of XML.
</p>
<hr width="50%" noshade>
<p>
Quick quiz: XML is...
</p>
<ol type="A">
<li> A format for structuring data.
<li> A three-letter acronym.
<li> A really cool technology.
<li> All of the above.
</OL>
<!-- The correct answer is D -- as if you didn't know that already. -->

</body>
</html>

Click this link to view the image that the HTML page uses. Then right-click the image; a pop-up menu will let you save the image on your disk. (You don't have to include that file in the results that you email to me.)

If you haven't already created the Windows batch file for the wellformed checker (or the Unix shell file), go to the batch files page (or shell files page) and follow the instructions to create the wellformed.bat or wellformed.sh file. You should put this file in the same directory as your HTML file.

Continuing the example, Frank Smith, who is using Windows, could test his file by typing this at the MS-DOS command prompt:

wellformed smith_f_1a.html

You may object, “But that's an HTML file. Don't I have to name it smith_f_1a.xml before I can check it?” No, you don't. The wellformed checker will parse any file that you give it, no matter what its name. Unlike most other applications, the XML tools that we are using don't care about the file name. They care only about what is inside the file.
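To illustrate the point that a parser looks only at the contents of a file, here is a minimal sketch in Python. It is not the course's wellformed checker; it uses the parser from Python's standard library, and the function name first_error and the file names smith_f_test.html and smith_f_test.xml are made up for this illustration. The same well-formed text is saved under both names, and both files parse without complaint.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def first_error(path):
    """Return None if the file is well-formed XML, else the parser's message."""
    try:
        ET.parse(path)
    except ET.ParseError as err:
        return str(err)
    return None


# The same well-formed content, saved under two different extensions.
# (Hypothetical content and file names, used only for this illustration.)
content = "<html><head><title>Test</title></head><body><p>Hi</p></body></html>"

results = {}
with tempfile.TemporaryDirectory() as folder:
    for name in ("smith_f_test.html", "smith_f_test.xml"):
        path = os.path.join(folder, name)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(content)
        results[name] = first_error(path)

# Both entries are None: the extension made no difference to the parser.
print(results)
```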

Part Two

The following XML file is not well-formed. You must convert it to well-formed XML. Since you've never seen this markup language before (it's a custom one that was designed for this particular exercise), you may be wondering how to be sure that you've made the corrections the “right way.” I want you to have that unsettled feeling; it leads into the next topic that we'll take up.

For this exercise, modify the tags in any way that you feel is reasonable so that the end result is a well-formed document.

Copy and paste the text below into a file. Name the file lastname_initial_1b.xml. For example, if your name is “Frank Smith,” the file would have a name like smith_f_1b.xml

<?xml version="1.0"?>
<catalog>
    <company>Office Magick</company>
    <department name="Office Supplies" code=235>
        <item>
            <name>Stapler</name>
			<manufacturer>Bostich</manufacturer>
            <color-list>
                <color sku="S367-B" hex=#000000>black
                <color sku="S367-PY"hex=#ffffcc>pastel yellow
            </color-list>
            <price amt="8.95">
            <summary>Heavy-duty office stapler</summary>
            <description>
            This stapler has a 30-sheet capacity and a lever
            action for accurate placement.
            </description>
        </item>
        <item>
            <!-- On backorder -- cannot restock. -->
            <name>Notebook</name>
            <color-list>
                <color sku="NB1783-G"hex="#00ff00">Green</color>
                <color sku="NB1783-Y" hex="#ffff00">Yellow</color>
                <color hex="#ff0000" sku="NB1783-R">Red</color>
            </color-list>
        </item>
    <department code="240">
    Computer Peripherals
        <item>
            <name>Mouse</name>
            <color-list>
                <color sku="M-0115-LG" hex="#cccccc">Light Gray
                <color hex="#ffffff">White
            </color>
            <price units="USD" />10.95</price>
        </item>
    </department>
</catalog>

Checking Your Work

Windows users may use the wellformed.bat file to check whether their files are well-formed. Run this batch file from an MS-DOS prompt.

Unix/Linux users may use the wellformed.sh shell script. Run it from a shell prompt.

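Note: XML parsers stop at the first well-formedness error that they encounter, so you will need to fix that error and run the checker again to find the next one. A parser reads as far as it can and generates an error only when it is impossible to proceed with the parse. This means that the location given in the error message may be well past the place where the mistake was actually made.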
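The fix-one-error-then-recheck cycle can be sketched in Python. This is only an illustration under stated assumptions: it uses the parser from Python's standard library rather than the wellformed checker, and the function first_error_line and the small <list> document are invented for this sketch. The document contains two mistakes, but the parser reports only one at a time.

```python
import xml.etree.ElementTree as ET


def first_error_line(text):
    """Return the line number of the first well-formedness error, or None."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, _column = err.position
        return line
    return None


# Hypothetical document with two mistakes: an unquoted attribute value on
# line 2, and an <item> opened on line 3 that is never closed.
broken = "\n".join([
    "<list>",
    "  <item id=1>one</item>",
    "  <item>two",
    "</list>",
])

# The same document after fixing only the first mistake.
half_fixed = broken.replace("id=1", 'id="1"')

# The same document after also closing the second <item>.
fixed = half_fixed.replace("<item>two", "<item>two</item>")

print(first_error_line(broken))      # only the first mistake is reported
print(first_error_line(half_fixed))  # now the second mistake shows up
print(first_error_line(fixed))       # nothing left to report
```

With this parser, the first call reports line 2 and says nothing about the unclosed <item>. Once the attribute is quoted, the second call reports line 4, which is where the parser meets </list> and can no longer proceed, not line 3 where the <item> was opened.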

In the following example, because it's possible to nest tags, the parser can't know that you haven't closed the first <ul> element until it hits the final </div> closing tag. Consider this XHTML, shown with line numbers for reference:

 1 <div align="left">
 2 <ul>
 3  <li>List 1, item 1</li>
 4  <li>List 1, item 2</li>
 5
 6 <ul>
 7   <li>List 2, item 1</li>
 8   <li>List 2, item 2</li>
 9 </ul>
10 </div>

It generates this error (the 10:6 means line ten, character number six):

[Fatal Error] testfile.html:10:6: The element type "ul" must be terminated
by the matching end-tag "</ul>".
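The same behavior can be reproduced with a minimal Python sketch. It assumes Python's standard-library parser, which is not the wellformed checker, so the wording of the message and the character position it reports may differ from the output shown above. The XHTML is the example from this section with the reference line numbers removed.

```python
import xml.etree.ElementTree as ET

# The XHTML example from this section, one list entry per line,
# without the reference line numbers.
lines = [
    '<div align="left">',
    '<ul>',
    ' <li>List 1, item 1</li>',
    ' <li>List 1, item 2</li>',
    '',
    '<ul>',
    '  <li>List 2, item 1</li>',
    '  <li>List 2, item 2</li>',
    '</ul>',
    '</div>',
]

error_line = None
try:
    ET.fromstring("\n".join(lines))
except ET.ParseError as err:
    # err.position is a (line, column) pair describing where parsing stopped.
    error_line, error_column = err.position
    print("Parser gave up on line", error_line)
    print("Parser message:", err)
```

The first <ul> is opened on line 2, but the parser gives up on line 10, at the </div> tag. When you read an error message, start at the reported line and work backward to find the tag that was left open.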

When You Finish

You will have two files. The first one, lastname_initial_1a.html, will contain well-formed XHTML. The second one, lastname_initial_1b.xml, will contain well-formed XML. Send them as attachments to my email address. If you wish, you can create a .ZIP file and send that as an attachment.