A Tour of HTML Forms and CGI Scripts
This is a quick introduction to HTML Forms and CGI
Scripts. It reviews some of the common form elements and then
describes how a simple CGI script interacts with a web form. As a
prerequisite, you should already be acquainted with HTML, since I
use it without explanation here. To create your own CGI scripts,
you'll also need to know some programming language. I use Perl here
since it's a fairly readable language.
There
are risks associated with CGI scripts. As you'll see, you are
essentially allowing anyone on the Internet to execute a program on
your system, as often as they like. If you write a script with
flaws, it can pose a serious security risk to your account, your
files or the entire system, and can also be a massive drain on
system resources. This document is a simplified introduction to the
elements of web forms and CGI scripts; it's not complete and is not
guaranteed to be accurate.
 Overview
 It's important first to
understand what HTML forms and CGI scripts are. They are very
different, but work closely together. A form is simply a web page
with some additional markup tags to instruct a web browser how to
display the various form elements, such as checkboxes, selection
lists, buttons and user-editable text areas. However, the web page
itself does not process the data, nor does the web server, which
doesn't know what you'd like to do with the user's answers. A
separate program or script must process that data, in whatever way
you wish.
HTML forms are just markup tags on a web
page. CGI (Common Gateway Interface) is the protocol that the web
server uses to pass the data from the form on to your script. When
the user submits her answers on a form, the browser
bundles them up and sends them to the web server, which passes them
on to your script for processing. A CGI script is any program which
knows how to read that bundle of data. Some important points are:
- The web page itself does not process the data entered on the
form. Neither does the web server. There must be a separate script
which the web page tells the server to send the data to, and which
knows how to speak the language (CGI) that the server will use to
send the data. You need both the web page and the script.
- For security reasons, most web servers will not execute a file
(even a script or program with the right permissions) unless it is
in a designated directory, or sometimes has a designated filename
extension. Even if you can put a web page on your system, you may
not have write permissions in that directory. You'll have to ask
your webmaster where that directory is and how to write to
it. You can't write forms without this, unless you use a
script that's already installed.
- Your script processes the data however you want, and then
almost always returns an acknowledgement page. So the script must
build up and return the HTML source for a web page. You
occasionally see C programs doing this because they are faster to
execute, but shell and Perl scripts are easier for this kind of
text manipulation and are more commonly used for CGI scripts.
In what follows, we'll first describe the
various form tags you can use in your web page and give the HTML
source for a sample page using most of them. The page works so you
can try it
out. We'll then go through the Perl CGI script which does the
processing for the page.
 Forms on a web page
 It's
pretty easy to place things like radio buttons, selection and check
boxes, and interactive text areas onto your web page. It's a little
harder to do anything with them, but we'll get to that later. Right
now, we'll talk about how to incorporate form elements into a web
page.
Forms on a web page usually are included
inside a single set of FORM markup tags, like this:

<FORM ACTION="http://www.your_site.com/your_script" METHOD="POST">
...
</FORM>

All the form tags
described below should be inside this FORM region. The opening form
tag specifies an ACTION attribute, which gives the URL of the CGI
script you want to process the user's form data when she submits it.
This is where you supply the link between your form and your script.
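The form tag also specifies a method for sending the
data, which can be either GET or POST. GET appends the data to the
URL. POST sends it separately, in the body of the request, which
copes better with long submissions and keeps the data out of the
address bar and most server logs. POST is recommended.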
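The difference between the two methods is easiest to see in what the browser actually sends. Here is a rough sketch in Perl; the helper name sketch_request, the path and the form data are all made up for illustration, and a real request carries more headers than shown.

```perl
use strict;
use warnings;

# Hypothetical helper: show roughly what the browser sends for each METHOD.
# $path and $data are made-up examples; real requests carry more headers.
sub sketch_request {
    my ($method, $path, $data) = @_;
    if ($method eq 'GET') {
        # GET: the form data rides along in the URL, after a question mark.
        return "GET $path?$data HTTP/1.0\r\n\r\n";
    }
    # POST: the URL stays clean and the data follows the headers.
    return "POST $path HTTP/1.0\r\n"
         . "Content-type: application/x-www-form-urlencoded\r\n"
         . "Content-length: " . length($data) . "\r\n\r\n"
         . $data;
}

print sketch_request('GET',  '/your_script', 'choice=Ha&radbut=oop'), "\n";
print sketch_request('POST', '/your_script', 'choice=Ha&radbut=oop'), "\n";
```

With GET the answers become part of the address itself, which is why they show up in bookmarks and logs. With POST only the script's URL is visible.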
Inside
a FORM region, you can have any HTML elements you wish, including
text, images or links. You can also have various form elements.
Here's a list of the major form elements. Each has an example of
what it looks like on a page, a template for the corresponding
markup tag that's used in the source for your page, and a brief
comment or description.
For most of these markup tags,
you'll specify attributes, such as type, display characteristics,
and especially a name and a value. The name is essentially a
variable name, which is passed to your script so it can refer to the
information (the value) the user entered for that variable. The name
can be anything as long as it's different for each kind of
information you want the user to supply.
 A Sample Web Form
 Here's the HTML source for a simple web page with many
of these form elements. You can try out the page in action.

<html><head><title>Your Title</title></head>
<body><h1>Your Heading</h1>
<form
action="http://www.speakeasy.org/~cgires/perl_form.cgi" method=post>
Type something here:
<input type="text" name="some_text"
size=30 maxlength=50><p>
Here's a checkbox:
<input type="checkbox" name="box"> <p>
Select one of these:
<select name="choice">
<option selected> Ha
<option> He
<option> Hi
<option> Ho
</select> <p>
Now some radio buttons:
<input type="radio" name="radbut"
value="oop" checked> Oop
<input type="radio" name="radbut"
value="eep"> Eep
<input type="radio" name="radbut"
value="urp"> Urp <p>
<hr>
Finally, you need to submit it:
<input type="submit" value="Send it">
or
<input type="reset" value="Erase all"> <p>
</form>
</body></html>
 A CGI Script in Perl
 When the user submits her form data, the browser
bundles all her answers up in a package and sends it to the web
server, which hands it to the script
whose URL was specified in the ACTION attribute. CGI is the
protocol the server uses to hand over this bundle, and a script that
knows how to unpack the bundle is a CGI script.
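To make the "bundle" concrete: browsers normally send form data as name=value pairs joined by &, with spaces turned into + and other special characters into %XX hex codes. Here is a minimal sketch of unpacking one such string by hand. The helper name unpack_bundle is hypothetical and the submission is a made-up example based on the sample form.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "a=1&b=two+words" into a hash of names and values.
sub unpack_bundle {
    my ($bundle) = @_;
    my %data;
    foreach my $pair (split /&/, $bundle) {
        next unless length $pair;                     # skip empty pieces
        my ($name, $value) = split /=/, $pair, 2;     # split on the first = only
        $value = '' unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                                  # plus signs stand for spaces
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;      # %XX is a hex-encoded character
        }
        $data{$name} = $value;
    }
    return %data;
}

# A made-up example of what the sample form might send.
my %form = unpack_bundle('some_text=hello+world%21&box=on&choice=Ha&radbut=oop');
print "$_ = $form{$_}\n" foreach sort keys %form;
```

This sketch keeps only the last value when a name is repeated, so treat it as an illustration of the encoding and not a replacement for a tested library.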
You
needn't be concerned about the details of CGI, since in Perl (as in
shell and C) there are packages which can read this bundle for you
and return each item of the user's form data in special variables which
you can manipulate or process in any way you wish. The best known
packages for Perl are cgi-lib.pl, written by
Steven E. Brenner, and CGI.pm,
by Lincoln Stein. These packages contain a number of functions which
are very useful for CGI scripts.
I'm
not going to describe all those functions, or even show how to
include them in your script. Including a package is a somewhat
advanced feature of Perl. Also, these packages are popular and there
are a lot of versions around, many of them older versions, which may
not work as I shall describe. Instead, I'll give you the most useful
of those functions, and show you how to copy and use it in your
script.
So here's a line by line account of a simple
Perl CGI script. Remember, you can try out the page in action.
 The Preamble
 A Perl CGI script should always
begin with the following lines:

#!/usr/bin/perl
# perl_form - a simple illustration of forms and Perl CGI

The first line is
mandatory for all Perl scripts. You may need to change the path to
Perl for your site. The second line is a comment, giving the name of
the script and what it does. Put lots of comments in all your
scripts and programs. Everything to the right of # is ignored by
Perl, as are blank lines.

print "Content-type: text/html\n\n";

This line begins some
work. The web server knows it is executing a script, but has no idea
what to expect in return. So the script must first tell the server
what is coming, usually a web page of some sort. The print command
simply writes back to whoever executed it, the web server in this
case. It sends the magic words indicating a web page is about to
follow. After the server sees this, it will pass the contents of
further print statements back to the browser. This is how a script
can return a web page.
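The important detail is the blank line after the header: the two \n characters end the header line and then leave one empty line before the page itself. Here is a small sketch that builds a whole response as one string. The helper name response_page and the page text are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build a complete response as one string.
sub response_page {
    my ($title, $body) = @_;
    return "Content-type: text/html\n\n"            # header, then one blank line
         . "<html><head><title>$title</title></head>\n"
         . "<body><h1>$title</h1>\n$body\n</body></html>\n";
}

print response_page('Thanks', '<p>Your answers were received.</p>');
```

If the blank line is missing, the server typically cannot tell where the header ends and reports an error instead of showing your page.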
 Reading the CGI Data
 Reading
the user's form data into your script is as simple as this:

&ReadParse;

This is the really useful
function from the cgi-lib.pl package. It reads all the
form data from the user and puts them into a Perl variable called
%in. The Perl variable called %ENV has
some good data in it as well. I'll talk about how to use these in a
moment.
In order to use a function, you must define
it somewhere. Perl has special syntax for the use and definition of
functions. To use a Perl function, preface it with an
ampersand (&), as we did above. To define a function,
use the special keyword sub, then the name of the
function, then the block of code which defines the function,
enclosed in braces. Something like this:

sub ReadParse {
    ... a lot of code ...
}

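For a complete, runnable illustration of the same syntax, here is a tiny made-up function (Greeting is not part of any package) that is defined with sub and then called both with and without the ampersand.

```perl
use strict;
use warnings;

# A made-up function, defined with sub and a block in braces.
sub Greeting {
    my ($name) = @_;              # arguments arrive in the special array @_
    return "Hello, $name!";
}

print &Greeting('world'), "\n";   # older style, with the ampersand
print Greeting('world'), "\n";    # the ampersand is optional when you use parentheses
```

Both calls print the same greeting. The ampersand form is the one used in this tutorial's script.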
Programmers will often
place the definitions of their functions at the end of the script,
and we will too, in the Summary
section below. I'm not going to explain how this function works here
(though more advanced or bold readers can view a separate tutorial
devoted to reading
CGI form data). It's fairly elegant code, but you really need to
know a few of the gory details of Perl to understand it. Just copy
and paste it onto the end of your script and use it happily
somewhere near the beginning.
Use it how? So the form data is in
something called %in. What's that?
 Perl Variables
 There are three kinds of variables in Perl,
distinguished by the character preceding the variable's name:
- $ a scalar, like $in: a variable which can contain a string, an
integer or a real number;
- @ an array, like @in: a simple list of any kind of scalar data.
E.g., the first one, the second one, ...;
- % an associative array, like %in: a list, but instead of being
indexed by numerical order, you refer to the items in the list by
any set of keywords of your choosing. E.g., the red one, the green
one, the blue one, ....
So %in is
an associative array which contains the data the user submitted on
the form. The keywords used to access the elements of this array are
just the variable names you specified on the web form page. If you
look at the source for the
Sample Web Form above, you'll recall that we used variable names
of: some_text, box, choice, and radbut. Consequently, the values
that the user submitted for each of the form elements are stored in
$in{some_text}, $in{box}, $in{choice}, $in{radbut}.
You'll notice that to access a particular
value in an associative array, you put the keyword inside braces
following the name of the array. You also use a $ in front instead
of a %. Many people find this confusing, thinking you should use a %
instead, but it has a certain logic when you note that the
particular value you want is in fact a scalar, even though it's
coming from an array. You still use % when you want to refer to the
array as a whole, as we'll see.
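Here is a short sketch showing all three kinds of variable side by side. The data is made up; in a real script %in would be filled in by ReadParse.

```perl
use strict;
use warnings;

my $color  = 'red';                               # scalar: one string or number
my @colors = ('red', 'green', 'blue');            # array: an ordered list
my %in     = (choice => 'Ha', radbut => 'oop');   # associative array (made-up form data)

print "scalar: $color\n";
print "second array element: $colors[1]\n";       # arrays count from 0, so this is 'green'
print "value for 'choice': $in{choice}\n";        # one value is a scalar, hence the $
print "all keywords: ", join(', ', sort keys %in), "\n";   # % names the whole array
```

Note that $colors[1] and $in{choice} both start with $, because each picks out a single scalar value. The square brackets or braces tell Perl which kind of array it comes from.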
 Building a Response Page
 We can now start processing the
form data in the script. In this case, we'll simply return a page
reporting what the data was. To do this, we first need to start
building the HTML source for a web page, to return to the web server
that called the script. Recall that Perl's print function
does this:

print "<title>The Response</title><h1>The Response</h1><hr>";
print "Here is the form data:<ul>";

We want an unordered list
of the variable names and corresponding values that the user
submitted. This is just the keyword and its corresponding value in
%in. Rather than making you code each keyword by hand, Perl has
some built-in functions and control loops that make this easy to do
and also mean this script will work with any web form. So
you can specify this script as your target ACTION in
the form tag when you try building your own web form.
(That's an important point -- read it again. You'll find this simple
script is quite useful for debugging a web form.)
Perl's
built-in function, keys, returns a list of all keywords
in an associative array. Then, Perl's foreach loop
will cycle through every element in a list, and execute a block of
code once each time. Here it is:

foreach $key (keys %in) {
    print "<li>$key: $in{$key}";
}
print "</ul>";

This says, set the scalar
variable, $key, to be successively, each of the
keywords in the associative array, %in. Then each time,
print a <li> tag followed by the keyword, a colon
and a space, then the item in the associative array corresponding to
that keyword. This works, no matter how many keywords (named
variables in your form) there are, or what they are called.
Finally at the end, print one </ul> to close the
unordered list.
Note carefully the variable
substitution that occurs in the print statement.
You don't literally print the characters "$key" since it's a scalar
variable. Perl finds the value of that variable and prints
that instead. If you actually wanted to print out "$", you would
need to "escape" it by using "\$" inside the print statement, so
Perl knows you don't want to do variable substitution. The same is
true of "@", which marks an array. A "%" is not substituted inside a
quoted string and needs no escape. On the other hand, Perl does print
literally any characters it doesn't recognize as a variable, such as
<li>, and the colon and space. Perl makes printing very easy.
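A few print statements make the substitution rules easy to check for yourself. The variable names and values here are made up; each comment shows what the line prints.

```perl
use strict;
use warnings;

my $key   = 'choice';
my @items = ('Ha', 'He');

print "Substituted: $key\n";        # prints: Substituted: choice
print "Escaped: \$key\n";           # prints: Escaped: $key
print "Array: @items\n";            # prints: Array: Ha He
print "Escaped: \@items\n";         # prints: Escaped: @items
print 'Single quotes: $key', "\n";  # prints: Single quotes: $key
print "Percent: 100%\n";            # prints: Percent: 100%
```

Substitution happens only inside double quotes. Inside single quotes Perl prints the characters exactly as written.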
I mentioned above that another associative
array, called %ENV, has interesting information as
well. As you might guess, these are a set of environment variables.
The web server sets them for your script, many of them from
information that browsers send when they request a page. In
fact, this information is sent for every web page, not just
pages with forms, but you need a CGI script to read it. Are you
curious about what your web browser is saying about you behind your
back? Let's find out:

print "and here are all the environment variables:<ul>";
foreach $key (keys %ENV) {
    print "<li>$key: $ENV{$key}";
}
print "</ul>";

And that concludes our
CGI script. If you haven't tried out the page in action
yet, you should now.
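One caution about the script as written: it prints whatever the user typed straight into the response page, so a value containing < or & can break the page or slip extra markup into it. Here is a minimal sketch of escaping values before printing them. The helper escape_html is hypothetical (it is not part of cgi-lib.pl) and the sample data is made up. Sorting the keys is optional, but it makes the order of the report predictable.

```perl
use strict;
use warnings;

# Hypothetical helper: make a string safe to print inside an HTML page.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;     # must come first, or it would re-escape the others
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

my %in = (some_text => '<b>Tom & Jerry</b>', choice => 'Ha');   # made-up data

print "<ul>\n";
foreach my $key (sort keys %in) {
    print "<li>", escape_html($key), ": ", escape_html($in{$key}), "\n";
}
print "</ul>\n";
```

The same treatment suits the values in %ENV, since several of them come from the browser too.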
 Summary
 To bring everything
together in one place, here is the script again, including the
definition of the ReadParse function.

#!/usr/bin/perl
# perl_form - a simple illustration of forms and Perl CGI

print "Content-type: text/html\n\n";
&ReadParse;

print "<title>The Response</title><h1>The Response</h1><hr>";
print "Here is the form data:<ul>";
foreach $key (keys %in) {
    print "<li>$key: $in{$key}";
}
print "</ul>";

print "and here are all the environment variables:<ul>";
foreach $key (keys %ENV) {
    print "<li>$key: $ENV{$key}";
}
print "</ul>";

# Adapted from cgi-lib.pl by [email protected]
# Copyright 1994 Steven E. Brenner
sub ReadParse {
    local (*in) = @_ if @_;
    local ($i, $key, $val);

    if ( $ENV{'REQUEST_METHOD'} eq "GET" ) {
        $in = $ENV{'QUERY_STRING'};
    } elsif ($ENV{'REQUEST_METHOD'} eq "POST") {
        read(STDIN,$in,$ENV{'CONTENT_LENGTH'});
    } else {
        # Added for command line debugging
        # Supply name/value form data as a command line argument
        # Format: name1=value1\&name2=value2\&...
        # (need to escape & for shell)
        # Find the first argument that's not a switch (-)
        $in = ( grep( !/^-/, @ARGV )) [0];
        $in =~ s/\\&/&/g;
    }

    @in = split(/&/,$in);

    foreach $i (0 .. $#in) {
        # Convert plus's to spaces
        $in[$i] =~ s/\+/ /g;

        # Split into key and value.
        ($key, $val) = split(/=/,$in[$i],2); # splits on the first =.

        # Convert %XX from hex numbers to alphanumeric
        $key =~ s/%(..)/pack("c",hex($1))/ge;
        $val =~ s/%(..)/pack("c",hex($1))/ge;

        # Associate key and value. \0 is the multiple separator
        $in{$key} .= "\0" if (defined($in{$key}));
        $in{$key} .= $val;
    }

    return length($in);
}

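One detail in ReadParse is worth a second look: when several form elements share a name, their values are stored in a single entry of %in, joined by the "\0" character. Here is a sketch of splitting them apart again. The field name topping and its values are made up for illustration.

```perl
use strict;
use warnings;

# Made-up example: what ReadParse would store if a form had several
# checkboxes all named "topping" and the user ticked two of them.
my %in = (topping => "cheese\0olives", choice => 'Ha');

foreach my $key (sort keys %in) {
    my @values = split /\0/, $in{$key};    # one element per submitted value
    print "$key: ", join(', ', @values), "\n";
}
```

For a name that was submitted only once, the split simply yields a one-element list, so the same loop handles both cases.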
 Installing a CGI Script
 Unfortunately, I can't tell you
precisely how to install a CGI script on your web server, or even
whether it's possible. Each server is configured a little
differently. You must ask your system administrator if CGI is
enabled on your server (or read the documentation yourself) and if
user-installed CGI scripts are permitted (some systems
permit only the administrator to install CGI scripts). Then you'll
typically need to find out: what's the path to Perl (needed for the
first line of a Perl script), where to place the script, what to
call it, what permissions to set for the script, and if your script
needs to write to some files, how to set those files' permissions.
I can briefly illustrate how Apache, a free
and popular web server for Unix (but only one of many), might be
configured. Apache has
three configuration files, httpd.conf, srm.conf and
access.conf. On my system, all are located in
/etc/httpd/conf/, though they could be anywhere, often
somewhere under /usr/ or /usr/local/.
The
second of these files is for server resource management, and
specifies how the server should handle requests from browsers. If a
browser requests a web page, the server returns the HTML source for
that page, that is, the contents of the file. But if a browser
requests a CGI script, the server must know it's not supposed to
return the contents of the file containing the script, but instead
should run that script as a program, and return to the browser the
results of the program. The server must be told the difference
between an HTML file and a CGI script.
There
are a couple of different ways to do this in srm.conf, as
the following two server directives in my configuration
show:

ScriptAlias /cgi-bin/ /home/httpd/cgi-bin/
AddHandler cgi-script .cgi

The first line tells the
server that any file it finds in the directory
/home/httpd/cgi-bin/ is a CGI script to be run as a program
when requested by a browser. The second line tells the server that
any file (anywhere under the server's document root
directory--specified by another server directive) whose name ends in
.cgi is a script to be executed. The ScriptAlias
directory and this .cgi file name extension could be
anything. If your server isn't configured with these directives (or
they have been commented out), then you can't run CGI scripts on
your server. If it has only the first directive, but not the second,
and if you don't have write permission for the cgi-bin directory,
then your system administrator can install CGI scripts but you
can't.
 Exercise 1: Your Form, My Script
 You can use this script, even if you're not on my site.
(Web browsers usually don't care where they send a request, and my
server will accept yours.) So try it with your own form. Compose a
web form on your server and specify

http://www.speakeasy.org/~cgires/perl_form.cgi

as the action attribute
in your opening <form> tag. No matter what or how
many form elements you use in your own web page, or what variable
names you use for them, the script will report the form and
environment data, just as it did above. Of course, that's all it
will do. For anything more interesting, you'll need to write your
own script.
 Exercise 2: Your Form, Your Script
 If you have write access to your web server's cgi-bin
directory, you can copy this script to your server and use it there.
You'll need to make your script world-readable and world-executable:
chmod 755 </path/script_name> at the unix prompt
should do that. You'll also need to ask your webmaster for the
directory or naming conventions and the URL of your script, which
will typically be different from the path to the file name. Then use
that URL in the action attribute of your web page's
<form> tag.
To make things more interesting,
specialize the script so it only reports form data specifically
mentioned in your web form. For example, if your form tags have NAME
attributes of my_first_tag and
my_second_tag, you could use a print statement in your
Perl script like this:

print "<ul>";
print "<li>my_first_tag: $in{'my_first_tag'}";
print "<li>my_second_tag: $in{'my_second_tag'}";
print "</ul>";

Try reformatting them,
putting them in tables, adding images. Add a link which points to
the contents of the $ENV{'HTTP_REFERER'} variable.
Remember also to copy and paste the definition of the
ReadParse function from above into your
script.
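As a starting point for the link suggested above, here is one possible sketch. The helper name back_link and its wording are made up. Browsers do not always send a referring URL, so the sketch falls back to plain text when $ENV{'HTTP_REFERER'} is empty, and it escapes the URL before placing it in the page.

```perl
use strict;
use warnings;

# Hypothetical helper: build a "go back" link from the referring page's URL.
# Browsers may omit the referring URL, so fall back to plain text.
sub back_link {
    my ($referer) = @_;
    return "Use your browser's Back button to return to the form."
        unless defined $referer && length $referer;
    $referer =~ s/&/&amp;/g;      # escape characters that are special in HTML
    $referer =~ s/"/&quot;/g;
    $referer =~ s/</&lt;/g;
    $referer =~ s/>/&gt;/g;
    return "<a href=\"$referer\">Return to the form</a>";
}

print back_link($ENV{'HTTP_REFERER'}), "\n";
```

In your own script, the print line would go wherever you want the link to appear in the response page.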
 Other Readings
 This
list is probably far out of date.
For
the most authoritative information on CGI, see the collection of references
assembled by the folks at the World Wide Web consortium. I know of a
few other CGI tutorials on the Web: Learn to Write
CGI-Forms and a CGI and
Perl Tutorial. I like Building-blocks
for CGI Scripts in Perl; it has site-specific material but is
quite good. A sample
chapter from a book on CGI scripts in shell and Perl is
available on the Web. There is also Carlos'
Forms Tutorial, which discusses forms but not CGI. Yet Another HTCYOHP
Home Page discusses CGI scripts written in C. A good FAQ on CGI
is A
CGI Programmer's Reference.
CGI Resources are copyright 1995-98, Sanford Morton.
Last modified: Sun Aug 16 17:12:47 PDT 1998