Kaye and Geoff's web page documentation 


If you have read and digested our previous page on basic HTML then you now understand what tags and their attributes are, the general structure of a web page, and how to create one and view it in your browser. Now we can introduce some more complex HTML, starting with lists. These are an example of HTML structures which need more than one type of tag to achieve the desired effect. The HTML for a simple list might look like this:

<ul>
<li>first item in list
<li>second item in list
<li>last item in list
</ul>
and this code is displayed as follows:
  • first item in list
  • second item in list
  • last item in list
Note that <ul>...</ul> (unordered list) tags enclose the entire list, with an unterminated <li> (list item) tag at the start of each item. The browser indents the list items from the left page margin and each item has a solid circular "bullet" in front of it. You can change the bullet by setting the "type" attribute in the <ul> tag; for example, <ul type="square"> produces:
  ▪ first item in list
  ▪ second item in list
  ▪ last item in list
There are other types of lists. Here is an ordered list:
  1. first item in list
  2. last item in list
which has an <ol> (ordered list) tag in place of the <ul> tag. The appearance of ordered lists can also be changed by setting the <li> tag's type attribute, for example with a type of "a" (<li type="a">) we get alphabetic labels:
  a. first item in list
  b. last item in list
whereas with a type of "i" we get roman numerals:
  i. first item in list
  ii. last item in list
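For comparison with the unordered list markup shown earlier, here is a sketch of how the ordered lists above might be written. It sets the type attribute on the <ol> tag, which applies the label style to the whole list; this is an alternative to setting it on individual <li> tags, and the item text is only placeholder:

```html
<!-- default ordered list: items are numbered 1, 2 -->
<ol>
<li>first item in list
<li>last item in list
</ol>

<!-- type="a" gives lower-case alphabetic labels: a, b -->
<ol type="a">
<li>first item in list
<li>last item in list
</ol>

<!-- type="i" gives lower-case roman numerals: i, ii -->
<ol type="i">
<li>first item in list
<li>last item in list
</ol>
```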
There are also definition lists with the following structure:
<dl>
<dt>first term
<dd>definition of first term
<dt>second term
<dd>definition of second term
</dl>
The browser displays this list as follows:
first term
    definition of first term
second term
    definition of second term


Pictures are widely used in web pages to convey information, to break up blocks of text, for decoration and other uses. They are held in separate files from the HTML which defines a web page, so the HTML only needs to specify how to get the required file, and how the browser should position it within the page. There is a lot to appreciate about using pictures in web pages; so much so that we have separate pages devoted to images and graphics, so here we will just introduce the topic.

Images are specified with an <img> tag, which must always include a src (source) attribute to say where to find the file containing the image. The image file may be stored locally (on the same server as the HTML file) but need not be - it can be anywhere on the internet. A terminating image tag is not allowed. So to specify a local file (which in this example is located in an "images" directory) you would use:

<img src="images/picture.gif">
but if the image was on another server you have to specify a full URL for the file:
<img src="http://www.domain1.com/images/picture.jpg">
There are several optional attributes which can be used with the <img> tag. Here are a few of the more straightforward ones:
  • border="x" - x can be "0" (no border) or an integer which defines the width of the border (in pixels) to place around the image
  • hspace="x" - (horizontal space) x is the number of pixels of space to use to pad the left and right edges of the image
  • vspace="x" - (vertical space) x is the number of pixels of space to use to pad the top and bottom edges of the image
  • alt="text" - (alternative) - text to be used if a picture is not displayed - for example in speaking browsers designed for blind people
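
As an illustration, the attributes listed above can be combined in a single tag. The file name, pixel values and alt text below are made up for the example:

```html
<!-- no border, 10 pixels of padding left and right, 5 pixels top and bottom -->
<img src="images/picture.gif" border="0" hspace="10" vspace="5" alt="A sample picture">
```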

Anchors (links)

Links between documents are the defining feature of the world wide web - that is why it is called a "web". Links can exist within a single page, between pages on the same site, between pages on different servers, and even through email. HTML allows links to be defined with <a>...</a> (anchor) tags. An href attribute is normally required to specify the URL to link to:
<a href="http://www.domainx.com.au/index.html">link to somewhere</a>
The text between the opening and closing anchor tags is the "active" text - when the mouse clicks on it the page defined in the href attribute is loaded into the browser window. Browsers usually indicate that this text is active by making it look different from other text - most commonly (by default) colouring it blue and underlining it. Instead of text, you can specify an image which becomes the active area:
<a href="http://www.domainx.com.au/index.html"><img src="images/button.gif"></a>
Links within a page need a target, also defined with an anchor tag, which the link goes to. This is a case where an href attribute is not required, but the closing tag must be supplied even though there may not be any active text defined. The "name" attribute defines an address to link to; its value can be any string made up of alphanumeric characters. For example:
<a name="target1"></a>
The href attribute value in the link "source" indicates that it is an internal (within the same page) link by preceding the name with a hash (#) character, so to link to the target defined above, you can use:
<a href="#target1">click here to take internal link</a>
The "top" link at the bottom of this page is a working example of this type of anchor.

These two approaches can be combined to link to a labelled position in a different page:

<a href="http://www.domain3.gov/index.html#target2">click here</a>

Except in special circumstances all links within your own site should be relative addresses, for example:

href="myfile.html"
href="images/mypic.gif"
href="../myhome.html"
where each file is located relative to the linking page, rather than by an absolute address (a complete URL) as in the previous examples. This allows the entire directory structure to be moved to another location with no changes to your link references.
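
A sketch of how relative addresses like these might look in complete anchor tags. The file names are the ones used above; the link text is invented for the example:

```html
<!-- a page in the same directory as the linking page -->
<a href="myfile.html">my file</a>

<!-- a file in the "images" directory below the linking page -->
<a href="images/mypic.gif">my picture</a>

<!-- a page one directory above the linking page -->
<a href="../myhome.html">my home page</a>
```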

A link need not be to a web page; it can also invoke an email program. In this case the href attribute provides an email address preceded by mailto: rather than the address of a hypertext (http) document:

<a href="mailto:freddy@domainq.org.au">email me</a>
Be aware that this type of link is the most common source of email addresses collected from the internet by spammers.


Throughout our HTML documentation pages (such as this one) we need to show tags with their angle brackets, which means we need to have some way of forcing the browser to display them rather than act on them. Entities (also called character codes) allow us to define a character using a short code. Not just angle brackets, but a wide range of "special" characters can be specified in this way, including many which do not exist on a standard computer keyboard, for example the degree sign, diacritical marks such as accents and cedillas, Greek letters, common symbols and lots more.

Entities are of the form (including the semicolon at the end):

&name;
where name is a short code defining the type of entity. A few examples are:

&lt;   left angle bracket (<)
&gt;   right angle bracket (>)
&deg;   degree sign (°)
&copy;   copyright symbol (©)
&nbsp;   fixed space ( )

The fixed space can be very useful. It defines a space character which is not treated as white space, so you can use (say) three of them in sequence to specify a space three characters wide, whereas HTML rules say that three regular space characters in a row will be collapsed to just one space.
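
As a rough illustration, here is how the entities in the table might be used within ordinary page text. The sentences themselves are invented for the example:

```html
<!-- displays the text <ul> instead of starting a list -->
<p>Use the &lt;ul&gt; tag to start an unordered list.</p>

<!-- displays: Water boils at 100°C -->
<p>Water boils at 100&deg;C</p>

<!-- three fixed spaces keep a gap three characters wide between the words -->
<p>wide&nbsp;&nbsp;&nbsp;gap</p>
```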

We have a separate page with more entity codes, based on the full official W3C set.

The complete HTML

Since our pages are only an introduction to HTML we have not tried to be rigorous in covering every possible format for each tag discussed, and have not included every possible attribute. There are a number of ways to get more complete information. We have a separate HTML reference page which includes most of the commonly used tags and their attributes. It can be helpful to consult this reference as you read about new tags and HTML structures, to get an idea of the range of options which are available.

HTML is an evolving language, so there is no single definitive description of it. In fact there is a series of backward-compatible standard versions of HTML, with updated specifications published every few years. The international body which determines these standards is the World Wide Web Consortium (usually shortened to "W3C"). Their HTML specifications are the ultimate word on what HTML is and how it works, and are available on the web (naturally). The approach used in these documents is very formal and turgid, but it should not be too difficult to follow once you have a general understanding of how HTML is structured. If you are just starting out writing HTML, you will probably find that earlier specifications (for example 3.0) can be a lot easier to make sense of than the most recent ones.

There may be occasions when you see interesting (or maybe awful, or just peculiar) features on a web page and you wonder about the HTML which was responsible. All browsers allow you to see the source code for the web page that they are displaying, through the "view source" (or "page source" or something similar) option from the "view" menu item. This means that a page's HTML is open for all to see (and "borrow"), conforming to the spirit of openness and cooperation which (in our fantasies at least) characterises the web.

Conforming to HTML standards

If you were to look at the HTML for this and other pages on our site, you would notice that sometimes tags have been used with a mix of upper and lower case, attribute values are usually but not always enclosed in double quotes and <img> tags may or may not have an "alt" attribute. We have been making web pages for quite a few years, and have not always developed good HTML coding habits (we are trying to improve with more recent sites we are working on - promise!). Current browsers will handle the variations referred to above, but standard modern HTML should have tags in lower case, attribute values should all be enclosed in double quotes, and all image tags should include an "alt" attribute.
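
To make the difference concrete, here is a sketch of the same image tag written both ways. The file name and alt text are made up for the example:

```html
<!-- tolerated by current browsers, but not good practice:
     upper case tag, unquoted attribute value, no alt attribute -->
<IMG SRC=images/picture.gif>

<!-- preferred: lower case tag, value in double quotes, alt text supplied -->
<img src="images/picture.gif" alt="A sample picture">
```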

Most importantly, non-standard tags and attributes should be avoided. Unfortunately it is all too easy to find web pages which are full of HTML which will only render correctly in one brand of browser. This not only alienates those using different browsers but in the future may not work with any browser as standards are more widely adhered to. So it is a good idea to develop good habits at the start of your HTML writing career, and thereby avoid any possible problems at a later date. If you want to know if a page conforms to the official international standard, you can submit it to the W3C HTML validation service.

Some of the tags which you can learn about here (generally all those which determine how the text is to be displayed, for example <h1> and <font>) have been superseded, at least in theory, by the use of cascading style sheets (CSS). However, due to the poor design of CSS and a flagrant disregard for CSS standards in most older browsers, we always use CSS with care and in limited amounts. There is no sign that formatting tags will disappear from the HTML standard any time soon, so we are happy to keep using them. Strangely, we have noticed that even pages which use CSS extensively still seem to use formatting tags as well, possibly because of the limitations of CSS in practice.

You cannot win - the <xmp> tag

Despite all this sage advice, sometimes you have to make a compromise between good intentions and practicality. Throughout these pages we use the <xmp>...</xmp> tag, which displays its enclosed text exactly as it is typed, including white space characters and anything in angle brackets, and does so in a different font from that used for normal page contents. So it is ideal for illustrating HTML code examples, which is why we use it.

Unfortunately, while <xmp> was part of the early standards, later versions designated it as "deprecated" in favour of the <pre> tag; this means that it should not be used because it might be removed from a future standard. In the meantime, browsers are expected to continue supporting it. So why do we not use the <pre> tag instead? Because it behaves entirely differently; it honours white space characters but interprets tags in the normal way. This would make a big mess of our examples. HTML also provides <code> and <samp> tags, but these also interpret anything in angle brackets as a tag rather than displaying it! While there are other ways to force text within angle brackets to be displayed, they make the HTML difficult to read and to write and change, so we do what we have to do and hope that <xmp> will remain part of HTML for as long as it is needed.
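
To show the trade-off, here is a sketch of the same short example written both ways. The second version uses <pre> with entities, which is one of the "other ways" mentioned above; the list content is placeholder:

```html
<!-- with xmp: the tags inside are displayed exactly as typed -->
<xmp>
<ul>
<li>first item in list
</ul>
</xmp>

<!-- with pre and entities: displays the same text,
     but the source is harder to read and to edit -->
<pre>
&lt;ul&gt;
&lt;li&gt;first item in list
&lt;/ul&gt;
</pre>
```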