Kaye and Geoff's web page documentation 

Introduction

The contents of the head of an HTML document are placed inside <head>...</head> tags. There must be only one head in the document, and it must precede the body (or the frameset). As we have seen, the head contains a title in <title>...</title> tags which is used to identify the web page - it is displayed in the title bar at the top of the browser window. Interestingly, a legal HTML document must have a title, although it need not have <head>...</head> or even <html>...</html> tags.
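
To show how these pieces fit together, here is a minimal sketch of a complete page. The title and body text are placeholders we made up for illustration:

```html
<html>
<head>
  <!-- exactly one head, placed before the body -->
  <title>Bill Bloggs' geometry page</title>
</head>
<body>
  <!-- document content goes in the body, not the head -->
  <p>Document content goes here.</p>
</body>
</html>
```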

Meta data

The head can also contain other information about the document (rather than document content, which goes in the body). HTML provides a general way to do this with the <meta> tag and its name and content attributes. Here is an example:

<meta name="author" content="Bill Bloggs">

In general meta data do not affect how the page is rendered and are ignored by the browser, but they are available to and may be used by the web server or other programs which request the page over the internet. So indexing information (key words), content classification, author details, a complete page description (often used in the listings of search engine results) and much more can be provided in this way. Here are some more commonly used examples:

<meta name="description" content="Bill Bloggs' geometry page: all you want to know about Pythagoras">
<meta name="keywords" content="Bill Bloggs,geometry,pythagoras,triangles,square,hypotenuse">
<meta name="Language" content="English">
<meta name="Generator" content="Dreamweaver">
<meta name="Copyright" content="Copyright held by Geoff and Kaye">
<meta name="Robots" content="index,follow">

The last example requires a bit of explanation. It is aimed at search engines and other web crawlers and gives them the OK to index the page, and to follow its links to find other pages. To ask them not to index the page and not to follow the links, you can set content to "noindex,nofollow". Of course there is no way that you can enforce this or any other meta data directive, but at least it provides a way of making your preferences known.
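
As a sketch of the opposite request, the head of a page you would rather keep out of search listings might look like this. The title is a made-up placeholder, and remember that crawlers are free to ignore the request:

```html
<head>
  <title>Draft notes</title>
  <!-- ask (but cannot force) crawlers to skip this page and its links -->
  <meta name="Robots" content="noindex,nofollow">
</head>
```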

The name attribute can be replaced with http-equiv; the content is then treated as the equivalent of a header in the Hypertext Transfer Protocol (HTTP) used in communication between the server and the browser. We cannot attempt a general discussion of HTTP here, but one particular http-equiv is very useful to know. It allows one page to cause the loading of a different page, a process called redirection. The format is:

<meta http-equiv="refresh" content="5;URL=http://www.domain1.com">

You can see that the value of the content attribute has two semicolon-separated components. The first is the number of seconds delay before performing the redirection and the second is the URL to load. During the delay the contents of the page containing the redirect are displayed; by convention with a message saying that redirection is going to occur and an indication of when, and often also an alternate link in case the redirect fails for some reason.

It is legal to specify a delay of zero, and you might prefer the seamless redirection which this produces. But consider what happens when someone tries to use the browser's back button on the resultant page. As soon as the browser starts loading the previous page it redirects again - the outcome is that the back button is effectively disabled. This can be very confusing and annoying. A reasonable delay before redirection provides an opportunity for this cycle to be interrupted with a second use of the back button.
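
Putting these conventions together, a complete redirect page might be sketched as follows. The wording of the message is our own invention; the URL and the five second delay are taken from the example above:

```html
<html>
<head>
  <title>This page has moved</title>
  <!-- wait 5 seconds, then load the new address -->
  <meta http-equiv="refresh" content="5;URL=http://www.domain1.com">
</head>
<body>
  <!-- shown during the delay -->
  <p>This page has moved. You will be redirected in 5 seconds.</p>
  <!-- alternate link in case the redirect fails -->
  <p>If nothing happens, please follow
     <a href="http://www.domain1.com">this link</a>.</p>
</body>
</html>
```

The delay is long enough to read the message and to let a visitor press the back button a second time.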

CSS and scripts

Cascading style sheets (CSS) can affect the appearance of the web page, but should not contain actual content, so they are placed in the head. Style rules are placed within <style>...</style> tags, and external files containing CSS are linked with the <link> tag; a page can use either or both.
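
Here is a minimal sketch of a head which uses both an external and an embedded style sheet. It assumes the external file is referenced with a <link> tag; the file name main.css and the style rules themselves are hypothetical:

```html
<head>
  <title>Styled page</title>
  <!-- external style sheet; main.css is a hypothetical file name -->
  <link rel="stylesheet" type="text/css" href="main.css">
  <!-- style rules embedded directly in the head -->
  <style type="text/css">
    h1 { color: navy; }
    p  { margin-left: 2em; }
  </style>
</head>
```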

JavaScript code can also be placed in the head, as can links to external files containing JavaScript (with the src attribute of the <script> tag). But JavaScript code can also be placed in the body and even inside tags (as the value of event attributes such as onclick) - scripting does not fit well into the content versus document information dichotomy.
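
The following sketch shows each of these placements in one page. The file name helpers.js and the function greet are hypothetical names chosen for illustration:

```html
<html>
<head>
  <title>Script placement</title>
  <!-- external script; helpers.js is a hypothetical file name -->
  <script type="text/javascript" src="helpers.js"></script>
  <!-- script embedded in the head -->
  <script type="text/javascript">
    function greet() {
      alert("Hello from the head");
    }
  </script>
</head>
<body>
  <!-- script inside a tag, as an event attribute -->
  <p><a href="#" onclick="greet(); return false;">Say hello</a></p>
  <!-- script embedded in the body -->
  <script type="text/javascript">
    document.write("Written while the body loads");
  </script>
</body>
</html>
```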

Doctypes

A doctype is a directive to the browser telling it how to interpret the HTML in a file. It is not, therefore, logically a part of the HTML, and the browser must have this information before it starts rendering the page, so it is placed before the opening HTML tag. Because we use CSS sparingly and try to be compatible with old browsers, we normally specify that our HTML be viewed as conforming to version 4.01 of the standard but also to allow any tags found in older standards. This can be achieved with the first of these example declarations:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

The second example tells the browser to interpret the HTML strictly according to the 4.01 standard. You can see that the syntax of doctype declarations can be a bit obscure, so unless you want to investigate them in detail we suggest that you just find one which matches the way you design your HTML, and use it as provided.
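
To show where the declaration sits in a file, here is a minimal sketch using the transitional doctype. The title and paragraph text are placeholders:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
  <title>Doctype placement</title>
</head>
<body>
  <p>The doctype comes first, before the opening html tag.</p>
</body>
</html>
```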

XHTML

HTML belongs to a family of markup languages descended from SGML (Standard Generalized Markup Language). XML (Extensible Markup Language) is a simpler and stricter member of that family. HTML relaxes some of the rules which XML enforces, but more recently the standards police have defined a version of HTML which is a true application of XML - it is called XHTML. The third doctype example above informs the browser that the code conforms strictly to XHTML rules.

What are these rules, and what does XHTML look like? In fact it does not look too different from the HTML we have used throughout these pages. It does require that tag and attribute names are in lower case and that attribute values are enclosed in quotes, as we have already encouraged you to do and as we have done in all our examples. It also insists that all tags are terminated, which is sometimes forbidden in HTML, so we have had to choose one way or the other, and have chosen to go with the HTML approach. There are some tags in HTML which need not be terminated but logically can be, for example...

<p>This is a paragraph</p>

However, it is not immediately clear how you terminate, for example, an <hr> tag - there is nothing to "enclose" in tags. But XHTML is very insistent - and solves this dilemma by effectively combining the opening and terminating tag into one, like this:

above the line <hr /> below the line
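
A minimal sketch of a complete XHTML page follows, applying the same self-closing form to other tags which enclose nothing (<meta> and <br>). The xmlns attribute on the html tag does not appear in our earlier examples but is required by XHTML 1.0; the title and text are placeholders:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>XHTML sketch</title>
  <!-- a tag with no content is opened and closed in one -->
  <meta name="author" content="Bill Bloggs" />
</head>
<body>
  <!-- every paragraph is explicitly terminated -->
  <p>above the line</p>
  <hr />
  <p>first line<br />second line</p>
</body>
</html>
```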

Presumably at some stage in the future XHTML will supersede HTML not just in the standards documents but in practice as well, with all web pages conforming to its rules and all browsers insisting on it, so web page developers need to keep this in mind. To encourage this trend, there are programs on the web which will convert HTML into XHTML.
