Kaye and Geoff's web page documentation
If you have read and digested our previous page on basic HTML then you now understand what tags and their attributes are, the general structure of a web page, and how to create one and view it in your browser. Now we can introduce some more complex HTML, starting with lists. These are an example of HTML structures which need more than one type of tag to achieve the desired effect. The HTML for a simple list might look like this:
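A minimal sketch of such a list, assuming an unordered (bulleted) list; the item text is placeholder content:

```html
<!-- <ul> opens an unordered list; each <li>...</li> pair is one list item -->
<ul>
  <li>First item</li>
  <li>Second item</li>
  <li>Third item</li>
</ul>
```

Replacing <ul> and </ul> with <ol> and </ol> would give a numbered list instead.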
Images
Pictures are widely used in web pages to convey information, to break up blocks of text, for decoration and other uses. They are held in separate files from the HTML which defines a web page, so the HTML only needs to specify how to get the required file, and how the browser should position it within the page. There is a lot to appreciate about using pictures in web pages; so much so that we have separate pages devoted to images and graphics, so here we will just introduce the topic.
Images are specified with an <img> tag, which must always include a src (source) attribute to say where to find the file containing the image. The image file may be stored locally (on the same server as the HTML file) but need not be - it can be anywhere on the internet. A closing </img> tag is not allowed. So to specify a local file (which in this example is located in an "images" directory) you would use:
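A minimal sketch of both cases; the file name logo.gif, the alt text and the example.com address are hypothetical:

```html
<!-- Local file: src is a path relative to the HTML file; note there is no closing tag -->
<img src="images/logo.gif" alt="Site logo">

<!-- Remote file: src is a full URL on another server -->
<img src="http://www.example.com/images/logo.gif" alt="Site logo">
```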
Anchors (links)
Links between documents are the defining feature of the world wide web - that is why it is called a "web". Links can exist within a single page, between pages on the same site, between pages on different servers, and even through email. HTML allows links to be defined with <a>...</a> (anchor) tags. An href attribute is normally required to specify the URL to link to:
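As a minimal sketch, with a hypothetical address and link text:

```html
<!-- The text between <a> and </a> is what the reader clicks on -->
<a href="http://www.example.com/index.html">Visit the example site</a>
```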
These two approaches can be combined to link to a labelled position in a different page:
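A sketch of the combination, assuming the two approaches are a link to a page's URL and a link to a labelled position introduced by a # character. The page name page2.html and the label section2 are hypothetical, and the label is set here with an id attribute (older pages often use <a name="section2"> for the same purpose):

```html
<!-- In the target page (page2.html): label the position to jump to -->
<h2 id="section2">Section two</h2>

<!-- In the linking page: append # and the label to the target page's URL -->
<a href="page2.html#section2">Go to section two of the second page</a>
```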
Except in special circumstances all links within your own site should be relative addresses, for example:
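A sketch of relative links, using hypothetical file and directory names:

```html
<!-- A page in the same directory as the current page -->
<a href="contact.html">Contact us</a>

<!-- A page in a subdirectory of the current page's directory -->
<a href="docs/lists.html">Lists</a>

<!-- A page in the parent directory -->
<a href="../index.html">Home</a>
```

Because none of these addresses names the server, the links should keep working if the whole site is moved to a different host or directory.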
A link need not be to a web page; it can also invoke an email program. In this case the href attribute provides an email address preceded by mailto: rather than a hypertext (http) document:
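A minimal sketch, with a hypothetical email address:

```html
<!-- Clicking this link opens the reader's email program with the address filled in -->
<a href="mailto:someone@example.com">Email us</a>
```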
Entities
Throughout our HTML documentation pages (such as this one) we need to show tags with their angle brackets, which means we need to have some way of forcing the browser to display them rather than act on them. Entities (also called character codes) allow us to define a character using a short code. Not just angle brackets, but a wide range of "special" characters can be specified in this way, including many which do not exist on a standard computer keyboard, for example the degree sign, diacritical marks such as accents and cedillas, Greek letters, common symbols and lots more.
Entities are of the form (including the ampersand at the start and the semicolon at the end):
&name;
where name is a short code defining the type of entity. A few examples are:
&lt; left angle bracket (<)
&gt; right angle bracket (>)
&deg; degree sign (°)
&copy; copyright symbol (©)
&nbsp; fixed space ( )
The fixed space can be very useful. It defines a space character which is not treated as white space, so you can use (say) three of them in sequence to specify a space three characters wide, whereas HTML rules say that three regular space characters in a row will be collapsed to just one space.
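A short sketch of entities in use, with placeholder text:

```html
<!-- Regular spaces: the browser collapses the run to a single space -->
<p>wide   gap</p>

<!-- Fixed spaces: all three space characters are displayed -->
<p>wide&nbsp;&nbsp;&nbsp;gap</p>

<!-- Angle brackets and a degree sign displayed as literal characters -->
<p>The &lt;img&gt; tag; water boils at 100&deg;C.</p>
```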
We have a separate page with more entity codes, based on the full official W3C set.
The complete HTML
Since our pages are only an introduction to HTML we have not tried to be rigorous in covering every possible format for each tag discussed, and have not included every possible attribute. There are a number of ways to get more complete information. We have a separate HTML reference page which includes most of the commonly used tags and their attributes. It can be helpful to consult this reference as you read about new tags and HTML structures, to get an idea of the range of options which are available.
HTML is an evolving language, so there is no single definitive description of it. Instead there is a series of backward-compatible standard versions of HTML, with updated specifications published every few years. The international body which determines these standards is the World Wide Web Consortium (usually shortened to "W3C"). Their HTML specifications are the ultimate word on what HTML is and how it works, and are available on the web (naturally). The approach used in these documents is very formal and turgid, but it should not be too difficult to follow once you have a general understanding of how HTML is structured. If you are just starting out writing HTML, you will probably find that earlier specifications (for example 3.0) can be a lot easier to make sense of than the most recent ones.
There may be occasions when you see interesting (or maybe awful, or just peculiar) features on a web page and you wonder about the HTML which was responsible. All browsers allow you to see the source code for the web page that they are displaying, through the "view source" (or "page source" or something similar) option from the "view" menu item. So this means that a page's HTML is open for all to see (and "borrow"), conforming to the spirit of openness and cooperation which (in our fantasies at least) characterises the web.
Conforming to HTML standards
If you were to look at the HTML for this and other pages on our site, you would notice that sometimes tags have been used with a mix of upper and lower case, attribute values are usually but not always enclosed in double quotes and <img> tags may or may not have an "alt" attribute. We have been making web pages for quite a few years, and have not always developed good HTML coding habits (we are trying to improve with more recent sites we are working on - promise!). Current browsers will handle the variations referred to above, but standard modern HTML should have tags in lower case, attribute values should all be enclosed in double quotes, and all image tags should include an "alt" attribute.
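The difference can be sketched with a single image tag; the file name and alt text are hypothetical:

```html
<!-- Tolerated by current browsers, but not recommended:
     upper-case tag, unquoted attribute values, no alt attribute -->
<IMG SRC=images/photo.jpg WIDTH=200>

<!-- Preferred: lower-case tag, double-quoted values, alt attribute included -->
<img src="images/photo.jpg" width="200" alt="Photo of the harbour">
```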
Most importantly, non-standard tags and attributes should be avoided. Unfortunately it is all too easy to find web pages which are full of HTML which will only render correctly in one brand of browser. This not only alienates those using different browsers but in the future may not work with any browser as standards are more widely adhered to. So it is a good idea to develop good habits at the start of your HTML writing career, and thereby avoid any possible problems at a later date. If you want to know if a page conforms to the official international standard, you can submit it to the W3C HTML validation service.
Some of the tags which you can learn about here (generally all those which determine how the text is to be displayed, for example <h1> and <font>) have been superseded, at least in theory, by the use of cascading style sheets (CSS). However, because of what we see as the poor design of CSS and a flagrant disregard for CSS standards in most older browsers, we always use CSS with care and in limited amounts. There is no sign that formatting tags will disappear from the HTML standard any time soon, so we are happy to keep using them. Strangely, we have noticed that even pages which use CSS extensively still seem to use formatting tags as well, possibly because of the limitations of CSS in practice.
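To illustrate the two approaches side by side, here is a rough sketch; the text and the chosen colour and size are placeholders, and the CSS version is only approximately equivalent:

```html
<!-- Formatting-tag approach -->
<font color="red" size="4">Important notice</font>

<!-- Roughly equivalent CSS, applied inline through the style attribute -->
<span style="color: red; font-size: large;">Important notice</span>
```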
You cannot win - the <xmp> tag
Despite all this sage advice, sometimes you have to make a compromise between good intentions and practicality. Throughout these pages we use the <xmp>...</xmp> tag, which displays its enclosed text exactly as it is typed, including white space characters and anything in angle brackets, and does so in a different font from that used for normal page contents. So it is ideal for illustrating HTML code examples, which is why we use it.
Unfortunately, while <xmp> was part of the early standards, later versions designated it as "deprecated" in favour of the <pre> tag; this means that it should not be used because it might be removed from a future standard. In the meantime, browsers are expected to continue supporting it. So why do we not use the <pre> tag instead? Because it behaves entirely differently; it honours white space characters but interprets tags in the normal way. This would make a big mess of our examples. HTML also provides <code> and <samp> tags, but these also discard anything with angle brackets! While there are other ways to force text within angle brackets to be displayed, they make the HTML difficult to read and to write and change, so we do what we have to do and hope that <xmp> will remain part of HTML for as long as it is needed.
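The "other ways" are not spelled out above. One common approach, which we assume is among those meant, is to write every angle bracket as an entity inside <pre> tags. A sketch with a short placeholder list:

```html
<!-- Each < is written as &lt; and each > as &gt; so the browser displays the tags -->
<pre>
&lt;ul&gt;
  &lt;li&gt;First item&lt;/li&gt;
&lt;/ul&gt;
</pre>
```

Even for this three-line example the source is noticeably harder to read and edit than the plain tags it displays.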