Kaye and Geoff's web page documentation
This is the page which collects together bits of HTML and a few miscellaneous subjects which have not been mentioned or covered in enough detail elsewhere. Many of the HTML tags described here are used to achieve formatting effects and so may be officially frowned on in favour of the use of style sheets, but we have some reservations about CSS. Not only are style sheets often implemented inconsistently and incompletely, but in many cases in-line tags are just easier to use and interpret.
This motley crew begins with a tag which has been mentioned in passing but not discussed in any detail...
The <font> tag
Text enclosed within <font>...</font> tags is modified as specified by the tag's attributes. These may be any combination of size, color and face.
The <font> tag must be used with care. Setting the text size to 1 can make it so small that some fonts on some computers "break up" because not enough pixels are allocated to draw every character completely. Using the <small>...</small> tag is much safer. When setting text colours make sure that the background does not "hide" the text because it is a similar shade or brightness. And avoid red text on green backgrounds or vice versa - roughly one man in twelve has some form of red-green colour blindness (and the combination usually looks awful anyway). Remember that you cannot know what fonts are available on any computer displaying your page, so you must not rely on the "face" attribute for any important effect. Browsers typically allow the viewer to override "face" and other font attributes, and if the browser cannot find any of the specified fonts then it uses its default.
Here are some examples showing the effect of the <font> tag with your current browser settings:
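The fragment below is a minimal sketch of the kind of markup such examples use. The particular sizes, colours, font names and sample text are chosen here purely for illustration.

```html
<!-- Absolute size: 1 (smallest) to 7 (largest); 3 is the usual default -->
<font size="5">Larger text</font><br>

<!-- Relative size: one step larger than the base size -->
<font size="+1">Slightly larger text</font><br>

<!-- Colour given as a name or as a hexadecimal RGB value -->
<font color="blue">Blue text</font><br>
<font color="#990000">Dark red text</font><br>

<!-- A list of preferred fonts; the browser uses the first one it can find -->
<font face="Verdana, Arial, sans-serif">Sans-serif text</font><br>

<!-- Attributes can be combined in a single tag -->
<font size="2" color="green" face="Courier New, monospace">Small green monospaced text</font>
```

How each line looks will depend on the fonts installed and on any overrides set in the viewer's browser.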
There is also a <basefont> tag, which is used without a terminating tag. It sets the default font size of the text on the page, and this is also the size from which relative font sizes (using the size="+n" or size="-n" attribute in a <font> tag) are calculated. For example:
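A minimal sketch, assuming a base size of 4 and a browser that still honours the tag. The sample text is invented for illustration.

```html
<!-- Set the base size for the page to 4 (the usual default is 3) -->
<basefont size="4">

<p>This text is displayed at size 4.</p>
<p><font size="+2">This text is displayed at size 6 (4 + 2).</font></p>
<p><font size="-1">This text is displayed at size 3 (4 - 1).</font></p>
```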
This tag should not be used. It is preferable to leave viewers to fix the basic size of text they prefer in their browser settings; your page design should be flexible enough to accommodate the resultant variations in text size. For this reason <basefont> is not in modern versions of standard HTML and some browsers ignore it (which provides an even better reason for not using it).
More text characteristics
There is a series of tags which (like bold and underline, which we have already discussed) change the appearance of their enclosed text in a defined way. They do not need any discussion; the effect can be appreciated from the following examples:
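As a sketch of what such examples might contain, the fragment below uses a selection of standard text-level tags. The choice of tags and the sample text are assumptions made here, not necessarily the set originally shown.

```html
<em>emphasised text</em> (usually shown in italics)<br>
<strong>strongly emphasised text</strong> (usually shown in bold)<br>
<tt>teletype text</tt> (a fixed-width font)<br>
<code>computer code</code> (also usually fixed-width)<br>
<big>bigger text</big> and <small>smaller text</small><br>
H<sub>2</sub>O uses a subscript<br>
E = mc<sup>2</sup> uses a superscript<br>
<strike>struck-through text</strike>
```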
<address>...</address> and <cite>...</cite> are designed to display addresses and citations, respectively. For example:
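A minimal sketch, using an invented address and an invented book title. Most browsers display the contents of both tags in italics by default.

```html
<!-- The name and address below are hypothetical -->
<address>
A. N. Author<br>
1 Example Street<br>
Exampletown
</address>

<!-- The title below is hypothetical -->
<p>This point is discussed further in <cite>An Example Guide to HTML</cite>.</p>
```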
You may feel that you can format addresses and citations more appropriately than relying on the default behaviour of whichever browsers are used to display your web pages.
The <br> and <p> tags revisited
The break and paragraph tags appear to be quite similar; the only difference in their behaviour might seem to be that the <p> inserts a blank line, and the <br> does not. They are so simple that they are among the first tags to be described in our documentation. But there is a subtle difference in how they behave. The <br> tag always forces a line break, so that if you place a sequence of them in your HTML then you will get a sequence of blank lines (just as you would expect).
However, if you enter an unbroken sequence of <p> tags, you do not get a sequence of blank lines; you get just one normal paragraph break. In other words, the browser treats a series of <p> tags as though there is only one, echoing the way that white space is handled. The same rule is applied wherever the browser acts as though a paragraph break exists (for example immediately before and after <blockquote>...</blockquote> and <form>...</form> tags - although not all browsers treat these "block-level" elements in an identical way). So...
is displayed exactly the same as...
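The comparison can be sketched as two fragments, with sample text invented for illustration. The first, containing a run of <p> tags, should be displayed the same as the second, which contains only one.

```html
<!-- Fragment 1: an unbroken run of <p> tags between two pieces of text -->
First paragraph.
<p>
<p>
<p>
Second paragraph.

<!-- Fragment 2: a single <p> tag -->
First paragraph.
<p>
Second paragraph.
```

Replacing the <p> tags in the first fragment with <br> tags would instead produce visible blank lines.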
The <br> tag normally applies just to the text which precedes and follows it, but it can be extended with the clear attribute to take account of adjacent images. This attribute can have a value of "left", "right" or "all", for example:
jumped over the lazy dog
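Written out with its tags in place, a fragment of this kind might look like the sketch below. The image file name fox.gif and its left alignment are assumptions made for illustration.

```html
<!-- fox.gif is a hypothetical image, placed at the left with text flowing beside it -->
<img src="fox.gif" align="left" alt="A fox">
The quick brown fox
<!-- Break the line and resume only once the left margin is clear of the image -->
<br clear="left">
jumped over the lazy dog
```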
The effect is for the text after the break ("jumped over the lazy dog") to start below whichever is the lowest of the "The quick brown fox" text or the image.
Block definition tags
Block of text
We have already seen that images are held in separate files which are included in web pages with the <img> tag. Most browsers, usually with the help of plug-ins, can run programs written in the Java language. These programs are also held in external files, and can be invoked with the <applet>...</applet> tag. Music or other sounds can also be included in web pages, although different browsers use different ways to achieve this. Even other HTML documents can be included with yet another specialist tag - the <iframe>...</iframe> tag specifies an in-line frame to hold a web page in a window which is inserted into the current page in a very similar way to images.
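A minimal sketch of an in-line frame, assuming a hypothetical page called other.html and an arbitrary window size. The text between the tags is shown only by browsers which do not support the tag.

```html
<!-- other.html is a hypothetical page; width and height are in pixels -->
<iframe src="other.html" width="400" height="200">
  Your browser does not support in-line frames.
  <a href="other.html">View the page directly</a>.
</iframe>
```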
It became obvious to those responsible for setting HTML standards that in the future web pages might be required to handle even more forms of multimedia (maybe some not even invented yet), and that a uniform way of handling all inclusions from external files would be a Very Good Idea. So they came up with the generic <object>...</object> tag. Universal support for this tag for all multimedia has been slow in coming, but will presumably eventually be a reality. The use of the <object> tag can be illustrated by showing how it can be used as an alternative to the <img> tag:
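A minimal sketch of the two forms side by side, assuming a hypothetical GIF file called smiley.gif:

```html
<!-- smiley.gif is a hypothetical file name -->
<object data="smiley.gif" type="image/gif">
  A green smiley
</object>

<!-- The roughly equivalent <img> tag -->
<img src="smiley.gif" alt="A green smiley">
```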
Notice that a combination of a data and a type attribute informs the browser where to find the external file and what type of file to expect, and thereby what to do with it. Quite a few "types" are already defined, all with the two-words-separated-by-a-slash format as shown in this example, and to handle a new multimedia format in the future we just need to give it a new unique type. These types, also called "MIME types" or "content types", are also used in other contexts within HTML, giving this approach even more universality. The text between the tags is a description of the object - the equivalent of the alt value in an <img> tag.
See if your browser can successfully deal with an image defined in this way. Do you see a green smiley?
Controlling robots - the robots.txt file
The name 'robots' here refers to 'spiders' or 'crawlers' or other similar programs which automatically trawl the web by downloading pages and following the links they find on them. The most obvious examples are search engines looking for pages to add to their indexes, but crawlers can also be used to collect email addresses to be used by spammers. Sometimes you would prefer that some of your pages were not indexed; they might be under development or temporary or not aimed at a general audience. You can ask web trawling robots to ignore these files by placing an appropriate entry in a robots.txt file.
This is a simple text file which must be placed in the top-level directory of your web site (normally the same directory as your home page's index.html or equivalent file), because robots look for it only at that location. If the files you did not want robots to see were all in a sub-directory called 'private' (it helps to collect them all into one or a limited number of directories) then the contents of the robots.txt file would look like this:
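A minimal sketch of such a file, assuming the 'private' directory sits directly beneath the top level of the site:

```text
User-agent: *
Disallow: /private/
```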
A user-agent of '*' means all robots and the second line is clear enough - it says that I do not allow you into the directory called 'private'. Of course, just like the related robots metatag (see The Head), you cannot force robots to obey this directive, but most legitimate search engines probably do (and spammers don't). If you want serious control and security then you need (on Apache servers, which are common on Unix hosts) to investigate the .htaccess and .htpasswd files, but we will not deal with them here.
Robots.txt files do not allow a great deal more sophistication than illustrated in the example above, but to get the complete syntax you can try the following sites:
Dynamic HTML (DHTML)
This area presents a challenge. It provides the opportunity to create interesting and dynamic web pages but sometimes at the expense of universal accessibility. The effect of many DHTML techniques differs from machine to machine, depending on the browser and host combination used. Some techniques are yet to be accepted by the official standards authorities, and others are restricted to one browser or platform. If you decide to use dynamic HTML techniques on your pages, you need to make sure that your target audience can take advantage of the enhancements. As time progresses and standards evolve DHTML should become more mainstream. In general using dynamic HTML involves programming and so is considerably more difficult to implement than straightforward HTML.
Here are some concepts and technologies which contribute to DHTML:
JSP (JavaServer Pages) is a technology in which a web page contains Java code that has to be executed by the server delivering the page before the result is sent to the browser. In other words, the page contains Java, but that code is used by the server, not the browser. This is potentially safer than allowing the code to be run by the browser.
ASP (Active Server Pages) is Microsoft's counterpart to JSP, where VBScript code (i.e. Visual Basic rather than Java) is embedded in the web page. The code is interpreted by Microsoft's script interpreter on the server before the result is delivered to the browser. This makes the pages dependent on having a Microsoft-compatible server, but despite this ASP is quite widely used. More recent (ASP.NET) implementations allow the code to be separated from the HTML, which means it can be compiled and so will run much faster. This reflects the way that CGIs are set up on Unix servers.