This primer assumes that you have:
HTML documents are in plain (also known as ASCII) text format and can be created using any text editor (e.g., Emacs or vi on UNIX machines). A couple of Web browsers (tkWWW for X Window System machines and CERN's Web browser for NeXT computers) include rudimentary HTML editors in a WYSIWYG environment. There are also some WYSIWYG editors available now (e.g., HotMetal for Sun SPARCstations, HTML Edit for Macintoshes). You may wish to try one of them first before delving into the details of HTML.
You can preview a document in progress with NCSA Mosaic (and some other Web browsers). Open it with the Open Local command under the File menu. After you edit the source HTML file, save the changes. Return to NCSA Mosaic and Reload the document. The changes are reflected in the on-screen display.
<TITLE>The simplest HTML example</TITLE>
<H1>This is a level-one heading</H1>
Welcome to the world of HTML.
This is one paragraph.<P>
And this is a second.<P>
Click here to see the formatted version of the example.
HTML uses markup tags to tell the Web browser how to display the text. The above example uses:
HTML tags consist of a left angle bracket (<, a ``less than'' symbol to mathematicians), followed by the name of the tag, and closed by a right angle bracket (>). Tags are usually paired, e.g. <H1> and </H1>. The ending tag looks just like the starting tag except that a slash (/) precedes the text within the brackets. In the example, <H1> tells the Web browser to start formatting a level-one heading; </H1> tells the browser that the heading is complete.
The primary exception to the pairing rule is the <P> tag. There is no such thing as </P>.
NOTE: HTML is not case sensitive. <title> is equivalent to <TITLE> or <TiTlE>.
Not all tags are supported by all World Wide Web browsers. If a browser does not support a tag, it just ignores it.
In the X Window System and Microsoft Windows versions of NCSA Mosaic, the Document Title field is at the top of the screen just below the pulldown menus. In NCSA Mosaic for Macintosh, text tagged as <TITLE> appears as the window title.
<Hy>Text of heading</Hy>
where y is a number between 1 and 6 specifying the level of the heading.
For example, the coding for the ``Headings'' section heading above is
<H3>Headings</H3>
Welcome to HTML.
This is the first paragraph. <P>
In the source file, there is a line break between the sentences. A Web browser ignores this line break and starts a new paragraph only when it reaches a <P> tag.
Important: You must separate paragraphs with <P>. The browser ignores any indentations or blank lines in the source text. HTML relies almost entirely on the tags for formatting instructions, and without the <P> tags, the document becomes one large paragraph. (The exception is text tagged as ``preformatted,'' which is explained below.) For instance, the following would produce the same output as the first bare-bones HTML example:
<TITLE>The simplest HTML example</TITLE><H1>This is a level
one heading</H1>Welcome to the world of HTML. This is one
paragraph.<P>And this is a second.<P>
However, to preserve readability in HTML files, headings should be on separate lines, and paragraphs should be separated by blank lines (in addition to the <P> tags).
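As a sketch of this layout advice, the same bare-bones example can be written with the heading on its own line and a blank line between paragraphs. The browser output should not change, because the extra line breaks and blank lines in the source are ignored:

```html
<TITLE>The simplest HTML example</TITLE>

<H1>This is a level-one heading</H1>

Welcome to the world of HTML.
This is one paragraph.<P>

And this is a second.<P>
```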
NCSA Mosaic handles <P> by ending the current paragraph and inserting a blank line.
In HTML+, a successor to HTML currently in development, <P> becomes a ``container'' of text, just as the text of a level-one heading is ``contained'' within <H1> ... </H1>:
<P>
This is a paragraph in HTML+.
</P>
The difference is that the </P> closing tag can always be omitted. (That is, if a browser sees a <P>, it knows that there must be an implied </P> to end the previous paragraph.) In other words, in HTML+, <P> is a beginning-of-paragraph marker.
The advantage of this change is that you will be able to specify formatting options for a paragraph. For example, in HTML+, you will be able to center a paragraph by coding
<P ALIGN=CENTER>
This is a centered paragraph. This is HTML+, so you can't do it yet.
This change won't affect any documents you write now, and they will continue to look just the same with HTML+ browsers.
HTML's single hypertext-related tag is <A>, which stands for anchor. To include an anchor in your document:
Here is a sample hypertext reference:
<A HREF="MaineStats.html">Maine</A>
This entry makes the word ``Maine'' the hyperlink to the document MaineStats.html, which is in the same directory as the first document. You can link to documents in other directories by specifying the relative path from the current document to the linked document. For example, a link to a file NJStats.html located in the subdirectory AtlanticStates would be:
<A HREF="AtlanticStates/NJStats.html">New Jersey</A>
These are called relative links. You can also use the absolute pathname of the file if you wish. Pathnames use the standard UNIX syntax.
However, use absolute pathnames when linking to documents that are not directly related. For example, consider a group of documents that make up a user manual. Links within this group should be relative links. Links to other documents (perhaps a reference to related software) should use full path names. This way, if you move the user manual to a different directory, none of the links would have to be updated.
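A minimal sketch of this advice, using made-up file names (Chapter2.html and /software/Tools/Overview.html are placeholders, not real documents): a chapter of the user manual links to a neighboring chapter with a relative link and to an unrelated document with a full pathname.

```html
<!-- Inside the user manual: a relative link to a file in the same directory -->
See <A HREF="Chapter2.html">Chapter 2</A> for installation steps. <P>

<!-- Outside the manual: a full pathname to an unrelated document -->
The <A HREF="/software/Tools/Overview.html">tools overview</A>
describes related software. <P>
```

If the manual's directory is moved as a unit, the first link still resolves, and the second is unaffected because it never depended on the manual's location.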
scheme://host.domain[:port]/path/filename
where scheme is one of
The port number can generally be omitted. (That means unless someone tells you otherwise, leave it out.)
For example, to include a link to this primer in your document, you would use
<A HREF="http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimer.html">
NCSA's Beginner's Guide to HTML</A>
This would make the text ``NCSA's Beginner's Guide to HTML'' a hyperlink to this document.
For more information on URLs, look at
Here's <A NAME="Jabberwocky">some text</A>
Now when you create the link in document A, include not only the filename, but also the named anchor, separated by a hash mark (#).
This is my <A HREF="documentB.html#Jabberwocky">link</A> to document B.
Now clicking on the word ``link'' in document A sends the reader directly to the words ``some text'' in document B.
For example, to link to the Jabberwocky anchor from within the same file (Document B), use
This is <A HREF="#Jabberwocky">Jabberwocky link</A> from within Document B.
The preceding is sufficient to produce simple HTML documents. For more complex documents, HTML has tags for several types of lists, preformatted sections, extended quotations, character formatting, and other items.
Below is an example of a two-item list:
<UL>
<LI> apples
<LI> bananas
</UL>
The output is:
The <LI> items can contain multiple paragraphs. Just separate the paragraphs with the <P> paragraph tags.
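For instance, a list item with two paragraphs might be coded as in the following sketch (the second sentence about apples is invented filler text):

```html
<UL>
<LI> apples <P>
     Apples keep well in a cool, dry place.
<LI> bananas
</UL>
```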
<OL>
<LI> oranges
<LI> peaches
<LI> grapes
</OL>
produces this formatted output:
The following is an example of a definition list:
<DL>
<DT> NCSA
<DD> NCSA, the National Center for Supercomputing Applications,
is located on the campus of the University of Illinois
at Urbana-Champaign. NCSA is one of the participants in the
National MetaCenter for Computational Science and Engineering.
<DT> Cornell Theory Center
<DD> CTC is located on the campus of Cornell University in Ithaca,
New York. CTC is another participant in the National MetaCenter
for Computational Science and Engineering.
</DL>
The output looks like:
The <DT> and <DD> entries can contain multiple paragraphs (separated by <P> paragraph tags), lists, or other definition information.
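As an illustrative sketch, reusing the entries above, a <DD> entry can hold two paragraphs or an embedded list. The term ``Participants mentioned above'' is made up for the example:

```html
<DL>
<DT> NCSA
<DD> NCSA is located on the campus of the University of Illinois
     at Urbana-Champaign. <P>
     It is one of the participants in the National MetaCenter
     for Computational Science and Engineering.
<DT> Participants mentioned above
<DD> <UL>
     <LI> NCSA
     <LI> Cornell Theory Center
     </UL>
</DL>
```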
An example nested list:
<UL>
<LI> A few New England states:
<UL>
<LI> Vermont
<LI> New Hampshire
</UL>
<LI> One Midwestern state:
<UL>
<LI> Michigan
</UL>
</UL>
The nested list is displayed as
<PRE>
#!/bin/csh
cd $SCR
cfs get mysrc.f:mycfsdir/mysrc.f
cfs get myinfile:mycfsdir/myinfile
fc -02 -o mya.out mysrc.f
mya.out
cfs save myoutfile:mycfsdir/myoutfile
rm *
</PRE>
display as
#!/bin/csh
cd $SCR
cfs get mysrc.f:mycfsdir/mysrc.f
cfs get myinfile:mycfsdir/myinfile
fc -02 -o mya.out mysrc.f
mya.out
cfs save myoutfile:mycfsdir/myoutfile
rm *
Hyperlinks can be used within <PRE> sections. You should avoid using other HTML tags within <PRE> sections, however.
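A short sketch of a hyperlink inside a preformatted section, based on the script above. The target file mysrc.html is a placeholder assumed for illustration:

```html
<PRE>
#!/bin/csh
cd $SCR
cfs get <A HREF="mysrc.html">mysrc.f</A>:mycfsdir/mysrc.f
</PRE>
```

The anchor tags take up no space in the display, so the line keeps the same width as in the unlinked version.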
Note that because <, >, and & have special meaning in HTML, you have to use their escape sequences (&lt;, &gt;, and &amp;, respectively) to enter these characters. See the section Special Characters for more information.
An example:
<BLOCKQUOTE>
I still have a dream. It is a dream deeply rooted in the
American dream. <P>
I have a dream that one day this nation will rise up and
live out the true meaning of its creed. We hold these truths
to be self-evident that all men are created equal. <P>
</BLOCKQUOTE>
The result is:
I still have a dream. It is a dream deeply rooted in the American dream.
I have a dream that one day this nation will rise up and live out the true meaning of its creed. We hold these truths to be self-evident that all men are created equal.
For example, the last line of the online version of this guide is
<ADDRESS>
A Beginner's Guide to HTML / NCSA / [email protected]
</ADDRESS>
The result is
A Beginner's Guide to HTML / NCSA / [email protected]
NOTE: <ADDRESS> is not used for postal addresses. See the discussion of forced line breaks (<BR>) below to see how to format postal addresses.
You can code individual words or sentences with special styles. There are two types of styles: logical and physical. Logical styles tag text according to its meaning, while physical styles specify the appearance of a section. For example, in the preceding sentence, the words ``logical styles'' were tagged as a ``definition.'' The same effect (formatting those words in italics) could have been achieved via a different tag that specifies merely ``put these words in italics.''
In the ideal SGML universe, content is divorced from presentation. Thus, SGML tags a level-one heading as a level-one heading, but does not specify that the level-one heading should be, for instance, 24-point bold Times centered on the top of a page. The advantage of this approach (it's similar in concept to style sheets in many word processors) is that if you decide to change level-one headings to be 20-point left-justified Helvetica, all you have to do is change the definition of the level-one heading in the presentation device (i.e., your World Wide Web browser).
The other advantage of logical tags is that they help enforce consistency in your documents. It's easier to tag something as <H1> than to remember that level-one headings are 24-point bold Times or whatever. The same is true for character styles. For example, consider the <STRONG> tag. Most browsers render it in bold text. However, it is possible that a reader would prefer that these sections be displayed in red instead. Logical styles offer this flexibility.
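The contrast can be sketched with two pairs of tags. The sample sentences are invented; <STRONG> and <DFN> are logical styles, while <B> and <I> are their physical counterparts:

```html
<!-- Logical styles: say what the text is -->
Back up your files <STRONG>before</STRONG> you begin. <P>
A <DFN>logical style</DFN> tags text according to its meaning. <P>

<!-- Physical styles: say how the text should look -->
Back up your files <B>before</B> you begin. <P>
A <I>physical style</I> specifies appearance only. <P>
```

On many browsers both versions look the same, but only the logical version leaves the browser free to choose a different rendering.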
To use one of these characters in an HTML document, you must enter its escape sequence instead:
Additional escape sequences support accented characters. For example:
NOTE: Unlike the rest of HTML, the escape sequences are case sensitive. You cannot, for instance, use &LT; instead of &lt;.
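As a brief sketch, the first source line below uses the escape sequences &lt;, &amp;, and &gt; so that the browser displays the literal characters rather than treating them as markup. The remaining lines use &ouml; and &Ouml;, which are different characters (a lowercase and an uppercase o with an umlaut), which is one reason case matters here. The sample sentences are invented:

```html
If a &lt; b &amp; b &lt; c, then c &gt; a. <P>

The city of K&ouml;ln is in Germany. <P>
&Ouml;sterreich is the German name for Austria. <P>
```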
One use of <BR> is in formatting addresses:
National Center for Supercomputing Applications<BR>
605 East Springfield Avenue<BR>
Champaign, Illinois 61820-5518<BR>
Most Web browsers can display in-line images (that is, images next to text) that are in X Bitmap (XBM) or GIF format. Each image takes time to process and slows down the initial display of the document, so generally you should not include too many or overly large images.
To include an in-line image, use
<IMG SRC=image_URL>
where image_URL is the URL of the image file. The syntax for IMG SRC URLs is identical to that used in an anchor HREF. If the image file is a GIF file, then the filename part of image_URL must end with .gif. Filenames of X Bitmap images must end with .xbm.
By default the bottom of an image is aligned with the text as shown in this paragraph.
Add the ALIGN=TOP option if you want the browser to align adjacent text with the top of the image as shown in this paragraph. The full in-line image tag with the top alignment is:
<IMG ALIGN=top SRC=image_URL>
ALIGN=MIDDLE aligns the text with the center of the image.
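By analogy with the top-aligned tag, the middle-aligned form would be written as follows (image_URL is the same placeholder used above):

```html
<IMG ALIGN=middle SRC=image_URL>
```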
<IMG SRC="UpArrow.gif" ALT="Up">
where UpArrow.gif is the picture of an upward pointing arrow. With NCSA Mosaic and other graphics-capable viewers, the user sees the up arrow graphic. With a VT100 browser, such as lynx, the user sees the word ``Up.''
You may want to have an image open as a separate document when a user activates a link on either a word or a smaller, in-line version of the image included in your document. This is considered an external image and is useful if you do not wish to slow down the loading of the main document with large in-line images.
To include a reference to an external image, use
<A HREF=image_URL>link anchor</A>
Use the same syntax for links to external animations and sounds. The only difference is the file extension of the linked file. For example,
<A HREF="QuickTimeMovie.mov">link anchor</A>
specifies a link to a QuickTime movie. Some common file types and their extensions are:
Make sure your intended audience has the necessary viewers. Most UNIX workstations, for instance, cannot view QuickTime movies.
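Returning to external images: when the link anchor is a smaller, in-line version of the image, the in-line image tag is simply placed inside the anchor. In this sketch the file names BigPicture.gif and SmallPicture.gif are placeholders:

```html
<A HREF="BigPicture.gif"><IMG SRC="SmallPicture.gif" ALT="Picture"></A>
```

Only the small image is loaded with the main document; the large one is fetched when the reader activates the link.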
<B>This is an example of <DFN>overlapping</B> HTML tags.</DFN>
The word ``overlapping'' is contained within both the <B> and <DFN> tags. How does the browser format it? You won't know until you look, and different browsers will likely react differently. In general, avoid overlapping tags.
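One way to avoid the problem is to close the inner tag before the outer one, so that one element sits entirely inside the other. A sketch of a properly nested version:

```html
<B>This is an example of <DFN>nested</DFN> HTML tags.</B>
```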
<H1><A HREF="Destination.html">My heading</A></H1>
Do not embed a heading or another HTML element within an anchor:
<A HREF="Destination.html">
<H1>My heading</H1>
</A>
Although most browsers currently handle this example, it is forbidden by the official HTML and HTML+ specifications, and will not work with future browsers.
Character tags modify the appearance of other tags:
<UL>
<LI><B>A bold list item</B>
<UL>
<LI><I>An italic list item</I>
</UL>
</UL>
However, avoid embedding other types of HTML element tags. For example, it is tempting to embed a heading within a list, in order to make the font size larger:
<UL>
<LI><H1>A large heading</H1>
<UL>
<LI><H2>Something slightly smaller</H2>
</UL>
</UL>
Although some browsers, such as NCSA Mosaic for the X Window System, format this construct quite nicely, it is unpredictable (because it is undefined) what other browsers will do. For compatibility with all browsers, avoid these kinds of constructs.
What's the difference between embedding a <B> within an <LI> tag as opposed to embedding an <H1> within an <LI>? This is again a question of SGML. The semantic meaning of <H1> is that it's the main heading of a document and that it should be followed by the content of the document. Thus it doesn't make sense to find an <H1> within a list.
Character formatting tags also are generally not additive. You might expect that
<B><I>some text</I></B>
would produce bold-italic text. On some browsers it does; other browsers interpret only the innermost tag (here, the italics).
Here is a longer example of an HTML document:
<HEAD>
<TITLE>A Longer Example</TITLE>
</HEAD>
<BODY>
<H1>A Longer Example</H1>
This is a simple HTML document. This is the first
paragraph. <P>
This is the second paragraph, which shows special effects. This is a
word in <I>italics</I>. This is a word in <B>bold</B>.
Here is an in-lined GIF image: <IMG SRC="myimage.gif">.
<P>
This is the third paragraph, which demonstrates links. Here is
a hypertext link from the word <A HREF="subdir/myfile.html">foo</A>
to a document called "subdir/myfile.html". (If you
try to follow this link, you will get an error screen.) <P>
<H2>A second-level header</H2>
Here is a section of text that should display as a
fixed-width font: <P>
<PRE>
On the stiff twig up there
Hunches a wet black rook
Arranging and rearranging its feathers in the rain ...
</PRE>
This is an unordered list with two items: <P>
<UL>
<LI> cranberries
<LI> blueberries
</UL>
This is the end of my example document. <P>
<ADDRESS>Me ([email protected])</ADDRESS>
</BODY>
Click here to see the formatted version.
In addition to tags already discussed, this example also uses the <HEAD> ... </HEAD> and <BODY> ... </BODY> tags, which separate the document into introductory information about the document and the main text of the document. These tags don't change the appearance of the formatted document at all, but are useful for several purposes (NCSA Mosaic for Macintosh 2.0, for example, allows you to browse just the header portion of a document before deciding whether to download the rest), and it is recommended that you use these tags.
This guide is only an introduction to HTML and not a comprehensive reference. Below are additional sources of information.