From: Nick Kew
Newsgroups: comp.infosystems.www.authoring.cgi,comp.answers,news.answers
Subject: FAQ: Frequently Asked Questions about CGI Programming
Supersedes:
Followup-To: comp.infosystems.www.authoring.cgi
Date: 7 Dec 1996 18:02:01 GMT
Organization: A dangerous illusion
Lines: 1237
Approved: news-answers-request@MIT.EDU
Expires: 27 Dec 1996 17:18:53 GMT
Message-ID:
NNTP-Posting-Host: gordon.esrin.esa.it
Summary: The Common Gateway Interface - Programming for the WWWeb:
    Basics (what is CGI; when to use CGI vs other web programming techniques)
    HTTP and NPH scripts: technical info and references
    Programming tips: "How do I do this..."
    Applications: where to find existing programs and information
    Troubleshooting: How to tackle your problems
    Further Reading: related FAQs and reference material
Keywords: CGI,FAQ,HTTP,WWW
User-Agent: libwww-perl/4.98

Archive-name: www/cgi-faq
Posting-frequency: Twice monthly

Frequently Asked Questions on CGI programming

------------------------------

Subject: Table of Contents
==========================

0. Preamble
    0.1. Changes
    0.2. Notice and Disclaimer
    0.3. Where to get this document
    0.4. How to contribute to this document?
    0.5. Can I email the author my questions?
    0.6. What's up with posting to comp.infosystems.www.authoring.cgi?
    0.7. Credits
1. Basic Questions
    1.1. What is CGI?
    1.2. Is it a script or a program?
    1.3. When do I need to use CGI?
    1.4. Should I use CGI or JAVA?
    1.5. Should I use CGI or SSI?
    1.6. Should I use CGI or an API?
    1.7. What do I absolutely need to know?
    1.8. Does CGI create new security risks?
    1.9. Do I need to be on Unix?
    1.10. Do I have to use Perl?
    1.11. Do I have to put it in cgi-bin?
    1.12. Do I have to call it *.cgi? *.pl?
    1.13. What is CGIWrap, and how does it affect my program?
2. HTTP Headers and NPH Scripts
    2.1. What is HTTP (HyperText Transfer Protocol)?
    2.2. What HTTP request headers can I use?
    2.3. What Environment variables are available to my application?
    2.4.
         What HTTP response headers do I need to know about?
    2.5. What is NPH?
    2.6. Must/should/can I write nph scripts?
    2.7. Do I have to call it nph-*
    2.8. What is the difference between GET and POST?
3. Techniques: "How do I..."
    3.1. Can I get information about who is visiting?
    3.2. Can I get the email of visitors?
    3.3. "But I saw some.kool.site display my email address..."
    3.4. Can I get browser details and return different pages?
    3.5. Can I trace where a user has come from/is going to?
    3.6. Can I launch a long process and return a page before it's finished?
    3.7. Can I launch a long process which the user interacts with?
    3.8. Can I password-protect my pages?
    3.9. Can I do HTTP authentication using CGI?
    3.10. Can I identify users/sessions without password protection?
    3.11. Can I redirect users to another page?
    3.12. Can I run a CGI script without returning a new page to the browser?
    3.13. Can I write output to a different Netscape frame?
    3.14. Can I write output to several frames at once?
    3.15. Can I use a CGI script to generate both text and inline images?
    3.16. How can I use Caches to make CGI scripts faster and more
          Net-friendly?
4. Applications: Is there an existing script to ...
    4.1. Where to look for free scripts for my application?
    4.2. Discussion group/bulletin board
    4.3. CSCW/Groupware
    4.4. Database
5. Troubleshooting a CGI application
    5.1. Are there some interactive debugging tools and services available?
    5.2. I'm having trouble with my headers. What can I do?
6. Further Reading
    6.1. Other FAQs/collections (including online book)
    6.2. Reference Pages
INDEX

-------------------------------------------------------------

Subject: SECTION 0 - PREAMBLE

NOTE: the Reply-to address in this FAQ is an autoresponder.
If you want to write to me, you'll have to set the "To:" line by hand:
    mailto:nick.kew@pobox.com

NOTE: the numbering in this document is automatically generated by my
posting software, and will change between postings if new questions are
added (as _may_ happen when I see a FAQ I've previously overlooked :-)

------------------------------

Subject: 0.1 Changes

Last Modified: December 5th 1996:
  * Added GET vs POST question
  * Added new Web Authoring FAQ to further reading

------------------------------

Subject: 0.2 Notice and Disclaimer

Copyright 1996 Nick Kew.

You are free to copy or distribute this document in whole or in part
for any purpose and on any medium you choose, provided:
    You DON'T do so for profit.
    You DO include this notice and disclaimer in full.

Disclaimer: This information is offered in good faith and in the hope
that it may be of use, but is not guaranteed to be correct, up to date
or suitable for any particular purpose. The author accepts no liability
in respect of this information or its use.

------------------------------

Subject: 0.3 Where to get this document

The home of this document on the Web is now the WebThing WebCentre, at
    http://pobox.com/%7Ewebthing/
This is an interactive site, using CGI software that permits readers to
comment on, and contribute to, the FAQ itself. See next question.

NOTE - If you want to mirror the FAQ on your WWW site, the best document
to use is the HTML version from my autoresponder (see below). If you're
putting it on a publicly-visible server, please make sure you keep it
up-to-date (if you let me know you have it, I can automate the updates).
Other known sources are:

(1) USENET: posted to newsgroups (TEXT)
        news:comp.infosystems.www.authoring.cgi
        news:comp.answers
        news:news.answers

(2) RTFM and mirror sites (TEXT)
        ftp://rtfm.mit.edu/pub/usenet/news.answers/www/cgi-faq

(3) RTFM WWW mirror sites, including (Partial HTML)
        Europe  - http://www.cs.ruu.nl/cgi-bin/faqwais
        America - http://www.cis.ohio-state.edu/hypertext/faq/usenet/

(4) By EMAIL from my autoresponder (HTML or TEXT)
    Send blank email for info: currently it will respond to subject lines:
        send cgifaq.txt
    or
        send cgifaq.html
    but these may have changed if you're reading a saved copy.
        mailto:satfaq@pobox.com

(5) By EMAIL from the FAQserver at RTFM (TEXT)
    Send email to mailto:mail-server@rtfm.mit.edu with
        send usenet/news.answers/www/cgi-faq
    in the body of your message

------------------------------

Subject: 0.4 How to contribute to this document?

The WebThing software permits collaborative authoring using your web
browser. When you are reading any entry in this InterFAQ, you can add a
new entry which will then appear as another "more on" subject.
    http://pobox.com/%7Ewebthing/

In order to maintain the quality of the FAQ, and avoid inappropriate
'commercial' entries, write permission is limited using an Access
Control List. If you have a contribution to make, send me an email
including your WebThing userid (i.e. what you entered in the
registration form) and I'll add you to the list.

InterFAQ readers - If your browser isn't showing a "new entry" button,
then either you aren't logged in or you're not on the access control
list.

Note that this InterFAQ is limited to questions-and-answers appropriate
to periodic Usenet posting. Other types of contribution can be added
elsewhere in the WebCentre. For example

  * If you have a relevant website and want to link to it, enter it in
    the appropriate collection (e.g. "scripts" or "misc"). You can then
    also include a description of your site, and have it indexed.
  * If you want to post a question or comment on something in this
    document, you can post it as a followup to the "flat" version of
    the FAQ (library document in the "FAQS" collection).

If you don't want to use the InterFAQ you can always mail me
( mailto:nick.kew@pobox.com )

------------------------------

Subject: 0.5 Can I email the author my questions?

I already get more email than I can possibly answer personally, so in
general the answer is no - I'm NOT a free advice centre.

The possible exception is when something already in the FAQ needs
clarifying: don't expect a personal reply, but I *might* add something
to the answer in question, so check the next posting (or three).

The newsgroup is the appropriate place for free advice. But remember:
bad questions usually get bad answers, so think carefully before
posting.

------------------------------

Subject: 0.6 What's up with posting to comp.infosystems.www.authoring.cgi?

This is now a moderated newsgroup. The moderator is a bot run by
Thomas Boutell ( mailto:boutell@boutell.com ).

The charter for moderation is as follows:

    This newsgroup is self-moderated. Your first posting will not
    appear until you have read and responded to an automatic welcome
    mailing, at which point your posting will appear with no further
    delay. Provision will also be made to automatically approve first
    postings that contain a header requesting this. Subsequent postings
    are approved automatically.

If posting normally doesn't work - as could be the case if your
newsfeed has trouble with moderated groups - you can post articles by
emailing them to:
    mailto:authoring-cgi@boutell.com
Provided the return address in your mail is correct, you will then
receive precise instructions for having your post(s) automatically
approved.

Alternative means of posting are detailed in the WWW FAQ, posted
regularly by Thomas Boutell.
------------------------------

Subject: 0.7 Credits

This FAQ was written by Nick Kew, and has been considerably improved
with the help of comments and criticisms, newsgroup posts and
miscellaneous suggestions from Nathan Neulinger, Maurice L. Marvin,
Matthew Healy and Alan J. Flavell.

-------------------------------------------------------------

Subject: SECTION 1 - BASIC QUESTIONS

This section aims to deal with basic questions, addressing the role and
nature of CGI, and its place in Web programming. Questions/answers
which just don't appear to 'fit' under any other section may also be
included here.

------------------------------

Subject: 1.1 What is CGI?

[ from the CGI reference http://hoohoo.ncsa.uiuc.edu/cgi/overview.html ]

The Common Gateway Interface, or CGI, is a standard for external
gateway programs to interface with information servers such as HTTP
servers. A plain HTML document that the Web daemon retrieves is static,
which means it exists in a constant state: a text file that doesn't
change. A CGI program, on the other hand, is executed in real-time, so
that it can output dynamic information.

------------------------------

Subject: 1.2 Is it a script or a program?

The distinction is semantic. Traditionally, compiled executables
(binaries) are called programs, and interpreted programs are usually
called scripts. In the context of CGI, the distinction has become even
more blurred than before. The words are often used interchangeably
(including in this document). Current usage favours the word "scripts"
for CGI programs.

------------------------------

Subject: 1.3 When do I need to use CGI?

There are innumerable caveats to this answer, but basically any Webpage
containing a form will require a CGI script or program to process the
form inputs.

------------------------------

Subject: 1.4 Should I use CGI or JAVA?

[This answer to a non-question is an attempt to reduce the noise level
of the recurrent "CGI vs JAVA" threads.]
CGI and JAVA are fundamentally different, and for most applications are
NOT interchangeable. Nor are the two directly comparable: you could in
principle write a CGI program in JAVA, although it is hard to think of
an instance where this would be the best choice.

CGI is a mechanism for running programs on a WWW server. Typical
applications include accessing a database, submitting an order, or
posting messages to a bulletin board.

JAVA enables programs to run on the Client machine, and is suited to
such tasks as detailed manipulation of an image. Alternatives to JAVA
may include the X windows client/server protocol, use of browser
plugins and helper applications, and other clientside languages such as
SafeTCL and perl/penguin.

In certain instances the two may be combined in a single application:
for example a JAVA applet to define a region of interest from a
geographical map, together with a CGI script to process a query for the
area defined.

------------------------------

Subject: 1.5 Should I use CGI or SSI?

CGI and SSI (Server-Side Includes) are often interchangeable, and it
may be no more than a matter of personal preference. Here are a few
guidelines:

1) CGI is a common standard agreed and supported by all major HTTPDs.
   SSI is NOT a common standard, but an innovation of NCSA's HTTPD
   which has been widely adopted in later servers. CGI has the greatest
   portability, if this is an issue.

2) If your requirement is sufficiently simple that it can be done by
   SSI without invoking an exec, then SSI will probably be more
   efficient. A typical application would be to include sitewide 'house
   styles', such as toolbars, netscapeised tags or embedded CSS
   stylesheets.

3) For more complex applications - like processing a form - where you
   need to exec (run) a program in any case, CGI is usually the best
   choice.

------------------------------

Subject: 1.6 Should I use CGI or an API?

APIs are proprietary programming interfaces supported by particular
platforms.
By using an API, you lose all portability. If you know your application
will only ever run on one platform (OS and HTTPD), and it has a
suitable API, go ahead and use it. Otherwise stick to CGI.

------------------------------

Subject: 1.7 What do I absolutely need to know?

If you're already a programmer, CGI is extremely straightforward, and
just three resources should get you up to speed in the time it takes to
read them:

1) Installation notes for your HTTPD. Is it configured to run CGI
   scripts, and if so how does it identify that a URL should be
   executed? (Check your manuals, READMEs, ISP webpages/FAQS, and if
   you still can't find it ask your server administrator).

2) The CGI specification at NCSA tells you all you need to know to get
   your programs running as CGI applications.
       http://hoohoo.ncsa.uiuc.edu/cgi/interface.html

3) WWW Security FAQ. This is not required to 'get it working', but is
   essential reading if you want to KEEP it working!
       http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html

If you're NOT already a programmer, you'll have to learn. If you would
find it hard to write, say, a 'grep' or 'cat' utility to run from the
commandline, then you will probably have a hard time with CGI. Make
sure your programs work from the commandline BEFORE trying them with
CGI, so that at least one possible source of errors has been dealt
with.

------------------------------

Subject: 1.8 Does CGI create new security risks?

Yes. Period. There is a lot you can do to minimise these. The most
important thing to do is read and understand Lincoln Stein's excellent
WWW security FAQ, at
    http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html .

------------------------------

Subject: 1.9 Do I need to be on Unix?

No, but it helps. The Web, along with the Internet itself, C, Perl, and
almost every other Good Thing in the last 20 years of computing,
originated in Unix. At the time of writing, this is still the most
mature and best-supported platform for Web applications.
------------------------------

Subject: 1.10 Do I have to use Perl?

No - you can use any programming language you please. Perl is simply
today's most popular choice for CGI applications. Some other
widely-used languages are C, TCL, BASIC and - for simple tasks - even
shell scripts.

Reasons for choosing Perl include its powerful text manipulation
capabilities (in particular regular expressions) and the fantastic WWW
support modules available.

------------------------------

Subject: 1.11 Do I have to put it in cgi-bin?

see next question

------------------------------

Subject: 1.12 Do I have to call it *.cgi? *.pl?

Maybe. It depends on your server installation.

These types of filenames are commonly used conventions - no more. It is
up to the server administrator whether or not CGI scripts are enabled,
and (if so) what conventions tell the server to run or to print them.

If you are running your own server, read the manual. If you're on an
ISP's or other rented webspace, check their webpages for information or
FAQs. As a last resort, ask the server administrator.

------------------------------

Subject: 1.13 What is CGIWrap, and how does it affect my program?

[ quoted from http://www.umr.edu/~cgiwrap/intro.html ]

> CGIWrap is a gateway program that allows general users to use CGI scripts
> and HTML forms without compromising the security of the http server.
> Scripts are run with the permissions of the user who owns the script. In
> addition, several security checks are performed on the script, which will not
> be executed if any checks fail.
>
> CGIWrap is used via a URL in an HTML document. As distributed, cgiwrap
> is configured to run user scripts which are located in the
> ~/public_html/cgi-bin/ directory.

See http://www.umr.edu/~cgiwrap/

-------------------------------------------------------------

Subject: SECTION 2 - HTTP HEADERS AND NPH SCRIPTS

This is a fairly technical section dealing with HTTP, the protocol of
the Web.
It also includes NPH, the mechanism by which CGI programs can return
HTTP header information directly to the Client.

------------------------------

Subject: 2.1 What is HTTP (HyperText Transfer Protocol)?

HTTP is the protocol of the Web, by which Servers and Clients
(typically browsers) communicate. An HTTP transaction comprises a
Request sent by the Client to the Server, and a Response returned from
the Server to the Client.

Every HTTP request and response includes a message header, describing
the message. These are processed by the HTTPD, and can often be largely
ignored by CGI applications (but see below).

A message body may also be included:

1) A HEAD or GET request sends only a header. Any form data is encoded
   in the query string part of the requested URL, which is available to
   the CGI program as the environment variable QUERY_STRING.

2) A POST request sends both header and body. The body typically
   comprises data entered by a user in a form.

3) A HEAD request does not expect a body in the response.

4) A GET or POST request will accept a response with or without a body,
   according to the header. The body of a response is typically an HTML
   document.

------------------------------

Subject: 2.2 What HTTP request headers can I use?

Most HTTP request headers are passed to the CGI script as environment
variables. Some are guaranteed by the CGI spec. Others are server,
browser and/or application dependent.

To see what _your_ browser and server are telling each other, just use
a trivial little CGI script to print out the environment. In Unix:

    #!/bin/sh
    echo "Content-type: text/plain"
    echo
    set

(Just call it "env.cgi" or something, and put it where your server will
execute it. Then point your browser at
http://your.server/path/to/env.cgi ).

This enables you to see at-a-glance what useful server variables are
set. Note that dumping the environment like this within a more complex
script can be a useful debugging technique.
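
For those not using the shell, the same idea can be sketched in Python.
This is only an illustration, assuming Python 3 is installed on the
server; the function name render_environment is invented for the
example and is not part of any CGI library.

```python
#!/usr/bin/env python3
"""Dump the CGI environment as plain text (illustrative sketch only)."""
import os
import sys


def render_environment(environ):
    """Return a complete CGI response listing every variable, sorted by name."""
    # One header line, then the blank line that ends the header block.
    lines = ["Content-type: text/plain", ""]
    for name in sorted(environ):
        lines.append(f"{name}={environ[name]}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Under a web server, os.environ holds the CGI variables for this request.
    sys.stdout.write(render_environment(os.environ))
```

Keeping the formatting in a function means it can also be called from
inside a larger script when debugging, which is the technique mentioned
above.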
For details, see the CGI Environment Variables specification at
    http://hoohoo.ncsa.uiuc.edu/cgi/env.html
(which also includes a version of the above script - somewhat more
nicely formatted - online).

------------------------------

Subject: 2.3 What Environment variables are available to my application?

See previous question. Those you can rely on are documented in NCSA's
pages; those associated with your particular server and browser can be
determined using the above script.

------------------------------

Subject: 2.4 What HTTP response headers do I need to know about?

Unless you are using NPH, the HTTPD will insert necessary response
headers on your behalf, provided it is configured to do so.

However, it is conventional for servers to insert the Content-type
header based on a page's filename, and for CGI scripts it will often be
absent or wrong. Hence the usual advice is to print an explicit
Content-type header.

Some other headers you may wish to use explicitly are:

    Status      (to set HTTP return code explicitly. Caveats:
                (1) Behaviour is undefined if it conflicts with another
                    header.
                (2) This is NOT an HTTP header.)
    Location    (to redirect the user to another URI, which may or may
                not be on your own server)
    Set-cookie  (Netscape/Nonstandard) Set a cookie
    Refresh     (Netscape/Nonstandard) Clientpull

You can also use general MIME headers: eg "Keywords" for the benefit of
indexers (although in this instance some major search robots have
regrettably introduced a new protocol to do the same thing).

The 'official' list of HTTP response headers is at
    http://www.w3.org/pub/WWW/Protocols/HTTP/Object_Headers.html

------------------------------

Subject: 2.5 What is NPH?

NPH = No Parsed Headers. The script undertakes to print the entire HTTP
response including all necessary header fields. The HTTPD is thereby
instructed not to parse the headers (as it would normally do) nor add
any which are missing.
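
To make the difference concrete, here is a minimal sketch of an NPH
script in Python (an assumption; any language will do). It writes the
status line and every header itself, which is what the HTTPD would
otherwise supply. The function name nph_response and the particular
headers chosen are illustrative only; a real script should follow the
HTTP spec for whatever it returns.

```python
#!/usr/bin/env python3
"""Sketch of an NPH script: the script writes the whole HTTP response."""
import sys


def nph_response(body, status="200 OK", content_type="text/plain"):
    """Build a complete HTTP/1.0 response as bytes, status line included."""
    payload = body.encode("utf-8")
    head = [
        f"HTTP/1.0 {status}",               # status line: normally added by the HTTPD
        f"Content-type: {content_type}",
        f"Content-length: {len(payload)}",
        "",                                 # these two empty items make the join
        "",                                 # end with the blank line after the headers
    ]
    return "\r\n".join(head).encode("ascii") + payload


if __name__ == "__main__":
    # Write bytes directly so nothing rewrites the line endings.
    sys.stdout.buffer.write(nph_response("Hello from an NPH script\n"))
    sys.stdout.buffer.flush()
```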
------------------------------

Subject: 2.6 Must/should/can I write nph scripts?

Generally, no. It is usually better to save yourself hassle by letting
the HTTPD produce the headers for you.

If you are going to use NPH, be sure to read and understand the HTTP
spec at
    http://www.w3.org/pub/WWW/Protocols/
Your headers should be complete and accurate, because you're
instructing the HTTPD not to correct them or insert what's missing.

Possible circumstances where the use of NPH is appropriate are:

  * When your headers are sufficiently unusual that they might be
    differently parsed by different HTTPDs (eg combining "Location:"
    with a "Status:" other than 302).

  * When returning output over a period of time (eg displaying
    unbuffered results of a slow operation in 'real' time).

See http://www.w3.org/pub/WWW/Protocols/HTTP/HTRESP.html

------------------------------

Subject: 2.7 Do I have to call it nph-*

According to NCSA's reference pages, this is the standard for telling
the server that your script is NPH, so this should be a fully portable
convention.

------------------------------

Subject: 2.8 What is the difference between GET and POST?

Firstly, the HTTP protocol specifies differing usages for the two
methods. GET requests should always be idempotent on the server. This
means that whereas one GET request might (rarely) change some state on
the Server, two or more identical requests will have no further effect.

This is a theoretical point which is also good advice in practice. If a
user hits "reload" on his/her browser, an identical request will be
sent to the server, potentially resulting in two identical database or
guestbook entries, counter increments, etc. Browsers may reload a GET
URL automatically, particularly if caching is disabled (as is usually
the case with CGI output), but will typically prompt the user before
re-submitting a POST request. This means you're far less likely to get
inadvertently-repeated entries from POST.
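
On the mechanics: under CGI, GET parameters reach the script in the
environment variable QUERY_STRING, while POST data arrives on standard
input with its length in CONTENT_LENGTH. A minimal sketch of a script
that accepts either, assuming Python 3 and form data in the usual
application/x-www-form-urlencoded encoding; the function name read_form
is invented for the example.

```python
#!/usr/bin/env python3
"""Read form fields from a GET or a POST request (illustrative sketch)."""
import os
import sys
from urllib.parse import parse_qs


def read_form(environ, stdin):
    """Return form fields as a dict of lists, whichever method was used."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        # Read exactly CONTENT_LENGTH bytes; the server may not close stdin.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii", "replace")
    else:
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)


if __name__ == "__main__":
    fields = read_form(os.environ, sys.stdin.buffer)
    print("Content-type: text/plain")
    print()
    for name in sorted(fields):
        print(f"{name}: {', '.join(fields[name])}")
```

Passing the environment and input stream as arguments is a design
choice made here so the function can be tried from the commandline
without a server.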
GET is (in theory) the preferred method for idempotent operations, such
as querying a database, though it matters little if you're using a
form. There is a further practical constraint that many systems have
builtin limits to the length of a GET request they can handle: when the
total size of a request (URL+params) approaches or exceeds 1KB, you are
well-advised to use POST in any case.

In terms of mechanics, they differ in how parameters are passed to the
CGI script. In the case of a POST request, form data is passed on
STDIN, so the script should read from there (the number of bytes to be
read is given by the Content-length header, which the server passes in
the environment variable CONTENT_LENGTH). In the case of GET, the data
is passed in the environment variable QUERY_STRING. The content-type
(application/x-www-form-urlencoded) is identical for GET and POST
requests.

-------------------------------------------------------------

Subject: SECTION 3 - TECHNIQUES: "HOW DO I..."

This section comprises programming hints and tips for a number of
popular tasks. Also included are a number of common questions to which
the answer is "you can't", with the reasons why.

------------------------------

Subject: 3.1 Can I get information about who is visiting?

You can get some limited information from the environment variables
passed to you by the browser. Relatively few of these are guaranteed to
be available, and some may be misleading. For particular types of
information, see below. For full details, see NCSA's reference pages.

------------------------------

Subject: 3.2 Can I get the email of visitors?

Why do you want to do this?

The best information available is the REMOTE_ADDR and REMOTE_HOST,
which tell you nothing about the user. Techniques such as
"finger @REMOTE_HOST" are not reliable, are widely disliked, and
generally serve only to introduce long delays in your CGI.

Better - as well as more polite - just to ask your users to fill in a
form.

------------------------------

Subject: 3.3 "But I saw some.kool.site display my email address..."
Some sites will play party tricks, which can get *some* users' email
addresses. Possible tell-tale signs of this are inordinate delays
loading a page (fingering @REMOTE_HOST - doesn't often work but
probably can't be detected from the webpage), or a submit button that
appears to do nothing at all (a mailto: link - works quite well but
trivially detectable).

As a "snoop" party trick that's fine, but if you find someone abusing
these facilities (eg they send you junkmail), alert their service
provider!

------------------------------

Subject: 3.4 Can I get browser details and return different pages?

Why do you want to do this?

Well-written HTML will display correctly in any browser, so the correct
answer to this question is to design a template for your output in good
HTML, and make sure your output is correct.

If you insist on a different answer, you can use the HTTP_USER_AGENT
environment variable. This requires care, and can lead to unexpected
results. For example, checking for "Mozilla" and serving a frameset to
it ensures that you *also* serve the frameset to early (Non-Frame)
Netscapes, me-too browsers (notably Microsoft) and others who have
chosen to lie to you about their browser.

Note also that not every User Agent is a browser. Your page may be read
by a user agent you've never heard of, and then displayed by 100
different browsers. Or retrieved by different browsers from a cache.
Another reason to write good HTML, and not try to devise a clever or
koool substitute.

------------------------------

Subject: 3.5 Can I trace where a user has come from/is going to?

HTTP_REFERER might or might not tell you anything. By all means use it
to collect partial statistics if you participate in (say) an
advertising banner scheme. But it is not always set, and may be
meaningless (eg if a user has accessed your page from a bookmark, and
the browser is too dumb to cope with this).

You cannot trace outgoing links at all.
If you really must try, point all the external links to your HTTPD and
use its redirection facility (which gives you generally-reliable logs).
This is much more efficient than using a CGI script.

BTW: don't even think about asking Javascript to send you information
on some event: it's a violation of privacy which Netscape fixed as soon
as complaints about its abuse started coming in. If it works with
*your* browser, you should upgrade!

------------------------------

Subject: 3.6 Can I launch a long process and return a page before it's finished?

[UNIX]

You have to fork/spawn the long-running process. The important thing to
remember is to close all its file descriptors; otherwise nothing will
be returned to the browser until it's finished. The standard trick to
accomplish this is redirection to/from /dev/null:

    # system, not exec: in Perl and C, exec replaces the script and
    # never returns, so the page would never be printed
    system ("long_process < /dev/null > /dev/null 2>&1 &") ;
    print HTML page as usual

------------------------------

Subject: 3.7 Can I launch a long process which the user interacts with?

This does not fit well with the basic mechanics of the Web, in which
each transaction comprises a single request and response.

If your processing can be done on the Client machine, you can use a
clientside application; for example a Java applet.

For processing on the server, one trick that works well for Clients
running an X server (and far, far more efficient than a JAVA solution)
is:

    if ( fork() ) {
        print HTML page explaining what's going on and advising about xhost
    } else {
        exec ("xterm -display THEIR_DISPLAY -title MY_APP -e MY_PROG ARGS < /dev/null > /dev/null 2>&1 &") ;
    }

NOTE: THEIR_DISPLAY is not necessarily the same as REMOTE_HOST or
REMOTE_ADDR. You have to ask users to supply their display (set
REMOTE_HOST as default).

------------------------------

Subject: 3.8 Can I password-protect my pages?

Yes. Use your HTTPD's authentication, just as you would a basic HTML
page. Now you'll have the identity of every visitor in REMOTE_USER.
------------------------------

Subject: 3.9 Can I do HTTP authentication using CGI?

It depends on which version of the question you asked.

Yes, you can use CGI to trigger the browser's standard
Username/Password dialogue. Send a response code 401, together with a
"WWW-authenticate" header including details of the authentication
scheme and realm: e.g. (in a non-NPH script)

    Status: 401 Unauthorized to access the document
    WWW-authenticate: Basic realm="foobar"
    Content-type: text/plain

    Unauthorised to access this document

The use you can make of this is server-dependent, and harder, since
most servers expect to deal with authentication before ever reaching
the CGI (eg through .www_acl or .htaccess). Thus it cannot usefully
replace the standard login sequence, although it can be applied to
other situations, such as re-validating a user - e.g. after a certain
timeout period or if the same person may need to login under more than
one userid.

What you can never get in CGI is the credentials returned by the user.
The HTTPD takes care of this, and simply sets REMOTE_USER to the
username if the correct password was entered.

------------------------------

Subject: 3.10 Can I identify users/sessions without password protection?

The most usual (but browser-dependent) way to do this is to set a
cookie. If you do this, you are accepting that not all users will have
a 'session'.

An alternative is to pass a session ID in every GET URL, and in hidden
fields of POST requests. This can be a big overhead unless _every_ page
requires CGI in any case.

Another alternative is the Hyper-G solution of encoding a session-id in
the URLs of pages returned:
    http://hyper-g.server/session_id/real/path/to/page
This has the drawback of making the URLs very confusing, and causes any
bookmarked pages to generate old session_ids.

Note that a session ID based solely on REMOTE_HOST (or REMOTE_ADDR)
will NOT work, as multiple users may access your pages concurrently
from the same machine.
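
A minimal sketch of the cookie approach, assuming Python 3 and a
browser that returns cookies (the server then exposes them to the
script as HTTP_COOKIE). The cookie name session_id and both function
names are hypothetical, and nothing here stores the session on the
server: that part is left to the application.

```python
#!/usr/bin/env python3
"""Identify a session with a cookie (illustrative sketch only)."""
import os
import secrets
from http.cookies import SimpleCookie

COOKIE_NAME = "session_id"  # hypothetical name, chosen for this example


def get_or_create_session(environ):
    """Return (session_id, is_new), reusing the ID the browser sent if any."""
    cookie = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    if COOKIE_NAME in cookie:
        return cookie[COOKIE_NAME].value, False
    # A random ID, deliberately not derived from REMOTE_HOST or REMOTE_ADDR.
    return secrets.token_hex(16), True


def response_headers(session_id, is_new):
    """Header lines for the reply; the cookie is only set for a new session."""
    headers = ["Content-type: text/plain"]
    if is_new:
        headers.append(f"Set-Cookie: {COOKIE_NAME}={session_id}; Path=/")
    return headers


if __name__ == "__main__":
    sid, is_new = get_or_create_session(os.environ)
    print("\n".join(response_headers(sid, is_new)))
    print()
    print(f"Session {sid} ({'new' if is_new else 'returning'})")
```

A browser that refuses the cookie will simply be given a fresh ID on
every request, which is the "not all users will have a session" caveat
above.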
------------------------------

Subject: 3.11 Can I redirect users to another page?

For permanent and simple redirection, use the HTTPD configuration file:
it's much more efficient than doing it yourself. Some servers enable
you to do this using a file in your own directory (eg Apache) whereas
others use a single configuration file (eg CERN).

For more complicated cases (eg process form inputs and conditionally
redirect the user), use the "Location:" response header. If the
redirection is itself a CGI script, it is easy to URLencode parameters
to it in a GET request, but don't forget to escape the URL!

------------------------------

Subject: 3.12 Can I run a CGI script without returning a new page to the browser?

Yes, but think carefully first: How are your readers going to know that
their "submit" has succeeded? They may hit 'submit' many times!

The correct solution according to the HTTP specification is to return
HTTP status code 204. As an NPH script, this would be:

    #!/bin/sh
    # do processing (or launch it as background job)
    echo "HTTP/1.0 204 No Change"
    echo

Alan J Flavell has pointed out that this will fail with certain popular
browsers, and suggests a workaround to accommodate them:

> 1. Send status 204, Content-type of text/html, and a short body content
> that (for those few browsers that display it) will tell the reader that
> their browser does not handle this response correctly, and invites them
> to use their browser's Back function (hey, if someone tells me to put
> a back button on the HTML page itself, I think I shall scream...).

His survey is at
    http://ppewww.ph.gla.ac.uk/%7Eflavell/status204/results.html

------------------------------

Subject: 3.13 Can I write output to a different Netscape frame?

Yep. The fact you're using CGI makes no difference: use "target=" in
your links as usual. Alternatively, the script can print a
"Window-target:" header.
Read Netscape's pages for detail: these answer all the questions about
things like "getting rid of" or "breaking out of" frames, too.

------------------------------

Subject: 3.14 Can I write output to several frames at once?

A single CGI script can only ever print to one frame. However, this
limitation may be overcome by using more than one script.

The first script (the URL the "submit" button sends the form to) prints
a frameset, typically to a "_parent" or "_top" target. The sources for
one or more of the frames thus generated may also be CGI scripts, to
which you can easily pass parameters (eg encoded in URLs with method
GET).

This hack is definitely not recommended. If you find yourself wanting
to update several frames from a single user event, it probably means
you should review the design of your application at a higher level.

Warnings:
1. Don't forget to escape your URLs.
2. This technique results in your server being hit by multiple
   concurrent CGI requests. You'll need LOTS of memory, especially
   if you use a memory-hog like Perl. It can be a good recipe for
   bringing a server to its knees.

Javascript is often a valid alternative here, but note just how silly
it can (and often does) look in a different browser.

------------------------------

Subject: 3.15 Can I use a CGI script to generate both text and inline images?

Not directly. One script generates one response to one request.

If you want to generate a dynamic page including dynamic images (say,
a report including graphs, all of which depend on user input) then your
primary script will print the usual HTML, with an <img> tag for each
image whose src points at the image-generating program. Just as in the
multiple frames case, you can pass data to that program encoded in a
GET URL. Of course, the same caveats apply: see above.

------------------------------

Subject: 3.16 How can I use Caches to make CGI scripts faster and more Net-friendly?

This is currently beyond the scope of this FAQ (whose author urgently
needs to improve his own applications in this regard).
However, there is an excellent introduction to net-friendly webpages,
including CGI pages, at
        http://vancouver-webpages.com/CacheNow/

A sample caching perl/cgi script by Andrew Daviel is available at
        http://vancouver-webpages.com/proxy/log-tail.pl

-------------------------------------------------------------

Subject: SECTION 4 - APPLICATIONS: IS THERE AN EXISTING SCRIPT TO ...

There are a lot of applications available. For all the tasks listed
here, there are free systems you can download and install yourself
(at least if you're on UNIX). Many are excellent.

Before ever *buying* software, do a Net search on what you want and
check what freeware is available. Does the commercial system you had
in mind *really* have any advantages? If you can't follow the jargon
they use to explain the merits of their system, insist on some
clarification (hey, that's not just for Web software :-)

Most questions under this heading are probably best answered by
reference to appropriate review sites on the Web (in many cases,
Thomas Boutell's WWW FAQ). In cases where I know of one or more
good sites, I've referenced them.

------------------------------

Subject: 4.1 Where to look for free scripts for my application?

Some popular places to look for a wide range of free CGI applications
are:

Selena Sol's Public Domain CGI Scripts
        http://www2.eff.org/~erict/Scripts/scripts.html
Matt Wright's Script Archive
        http://www.worldwidemart.com/scripts/

Dale Bewley has a much longer list of script archives (along with his
own scripts) at
        http://www.engr.iupui.edu/~dbewley/perl/

------------------------------

Subject: 4.2 Discussion group/bulletin board

David R Woolley maintains a list of currently around 100 systems at
        http://freenet.msp.mn.us/~drwool/webconf.html
("Conferencing on the Web").

------------------------------

Subject: 4.3 CSCW/Groupware

There are several overview sites for this.
A few are:

The CSCW Yellow Pages, at
        http://www11.informatik.tu-muenchen.de/cscw/yp/YP-index-type.html
NCSA Web Collaboration pages, at
        http://union.ncsa.uiuc.edu/HyperNews/get/www/collaboration.html

------------------------------

Subject: 4.4 Database

This subject deserves its own FAQ. When someone recently asked about
one, Matthew.Healy@yale.edu (Matthew D. Healy) posted this answer
(slightly chopped)

> : Is there a CGI and Database FAQ available?
> : If so, could someone tell me where can I get it?
>
> Dunno about a FAQ on that. I can recommend a couple of published
> works, however:
>
> 1. I wrote a chapter about CGI/Database work for the book
>    {Special Edition Using CGI}. Fulltext is online at the
>    publisher's WWW site:
>
>    http://www.mcp.com/que/et/se_cgi/             The book
>    http://www.mcp.com/que/et/se_cgi/Cgi13fi.htm  My chapter on WWW/DBMS
>
> 2. Jeff Rowe wrote an excellent book, {Building Internet Database
>    Servers With CGI}. URL for more info:
>
>    http://cscsun1.larc.nasa.gov/~beowulf/db/existing_products.html
>
> Jeff's WWW site has scads of useful information on WWW/DBMS programming,
> and pointers to lots more sites.

Matthew's CGI links page at
        http://ycmi.med.yale.edu/~healy/cgilinks.html
expands the list, and includes links to popular packages including
Bo Frese Rasmussen's WDB at
        http://venus.dtv.dk/~bfr/wdb/

-------------------------------------------------------------

Subject: SECTION 5 - TROUBLESHOOTING A CGI APPLICATION

Since this subject is quite well covered by other documents, this FAQ
has relatively little to say.

Tom Christiansen's "Idiot's guide to solving Perl/CGI problems" is a
slightly tongue-in-cheek list of common problems, and how to track them
down. Much of what Tom covers is not specifically Perl, but applies
equally to CGI programming in other languages.

Marc Hedlund's CGI FAQ and Thomas Boutell's WWW FAQ also deal with this
subject. See "Further Reading" below (if you don't already know where
to find these documents).
------------------------------

Subject: 5.1 Are there some interactive debugging tools and services available?

If you're using Perl, get Lincoln Stein's CGI.pm module. I cannot
recommend this more highly: in addition to making some quite advanced
perl/CGI programming as easy as HelloWorld, it offers an interactive
debugging mode.
        http://www-genome.wi.mit.edu/ftp/pub/software/WWW/cgi_docs.html

Nathan Neulinger's cgiwrap is another package with debugging aids.
        http://www.umr.edu/~cgiwrap/

See also the next question.

------------------------------

Subject: 5.2 I'm having trouble with my headers. What can I do?

For simple cases, examining your response headers "by hand" may suffice:

(1) telnet to the host and port where the server is running - e.g.
        telnet www.myhost.com 80
(2) Enter HTTP request. The most useful for this purpose is usually
    HEAD; eg
        HEAD /index.html HTTP/1.0
        (optionally other headers)
        (followed by a blank line)

Now you'll get a full HTTP response header back.

For complex cases, such as sending a request with several headers (as a
browser does) or POSTing a form, there is a free diagnosis service at
the WebThing WebCentre. This will take a request from your browser (eg
form inputs) and forward the identical request to your server, printing
a full report of your request (request headers and form data) and the
response from your server (response headers and data).
        http://pobox.com/~webthing/

-------------------------------------------------------------

Subject: SECTION 6 - FURTHER READING

------------------------------

Subject: 6.1 Other FAQs/collections (including online book)

        **** Lincoln Stein's FAQ is probably the most    ****
        **** important WWW document you will ever read.  ****

Special Edition Using CGI (full book text available online)
        http://www.mcp.com/que/et/se_cgi/

The Web Authoring FAQ by 'Galactus' Engelfriet and John Pozadzides
        http://htmlhelp.com/links/wdgfaq.htm
(although at the time of writing the online version appears to be
a little behind the updated drafts posted).

For general WWW issues, the World Wide Web FAQ by Thomas Boutell
        http://www.boutell.com/faq/

Another CGI FAQ, by Marc Hedlund
        http://www.best.com/~hedlund/cgi-faq/

Perl/CGI programming FAQ, by Shishir Gundavaram and Tom Christiansen
        http://www.perl.com/perl/faq/perl-cgi-faq.html

The Idiot's Guide to solving Perl/CGI problems by Tom Christiansen
        http://www.perl.com/perl/faq/idiots-guide.html

The WWW Security FAQ by Lincoln Stein
        http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html

The WWW Virtual Library
        http://WWW.Stars.com/Vlib/

------------------------------

Subject: 6.2 Reference Pages

The Common Gateway Interface (CGI)
        http://www.ast.cam.ac.uk/%7Edrtr/cgi-spec.html
        http://hoohoo.ncsa.uiuc.edu/cgi/interface.html

HyperText Transfer Protocol (HTTP)
        http://www.w3.org/pub/WWW/Protocols/HTTP/

HyperText Markup Language (HTML)
        http://www.w3.org/pub/WWW/MarkUp/

------------------------------

Subject: INDEX

The index is generated from an arbitrary list of keywords. If I've
missed anything obvious that should be here, please let me know.
APACHE          3.11
AUTHENTICATION  3.8, 3.9
BASIC           1, 1.10, 3.7, 3.8, 3.9
BROWSER         0.4, 1.4, 2.2, 2.3, 2.8, 3.1, 3.4, 3.5, 3.6, 3.9, 3.10, 3.12, 3.14, 5.2
C               1.9, 1.10
CACHE           3.4
CERN            3.11
CGI             0.3, 0.6, 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.10, 1.12, 1.13, 2.1, 2.2, 2.4, 2.8, 3.2, 3.5, 3.9, 3.10, 3.11, 3.13, 3.14, 3.16, 4.1, 4.4, 5.1, 6.1, 6.2
CGIWRAP         1.13, 5.1
CONFERENCING    4.2
COOKIE          2.4, 3.10
CREDENTIALS     3.9
CSCW            4.3
DATABASE        1.4, 2.8, 4.4
EMAIL           0.3, 0.4, 0.5, 3.3
ENVIRONMENT     2.1, 2.2, 2.8, 3.1, 3.4
FAQ             0, 0.1, 0.3, 0.4, 0.5, 0.6, 0.7, 1.7, 1.8, 3.16, 4.4, 6.1
FRAMES          3.13, 3.14, 3.15
GET             0.1, 0.5, 1.7, 2.1, 2.8, 3.1, 3.3, 3.9, 3.10, 3.11, 3.14, 3.15, 4.3, 4.4, 5.1, 5.2
HEAD            2.1, 5.2
HEADER          0.6, 2, 2.1, 2.4, 2.5, 2.8, 3.9, 3.11, 3.13, 5.2
HTML            0.3, 1.1, 1.7, 1.8, 1.13, 2.1, 2.2, 2.4, 2.6, 3.4, 3.6, 3.7, 3.8, 3.12, 4.1, 4.2, 4.3, 4.4, 5.1, 5.2, 6.1, 6.2
HTTP            0.3, 0.4, 1.1, 1.7, 1.8, 1.13, 2, 2.1, 2.2, 2.4, 2.5, 2.6, 2.8, 3.10, 3.12, 3.16, 4.1, 4.2, 4.3, 4.4, 5.1, 5.2, 6.1, 6.2
HTTPD           1.5, 1.6, 1.7, 2.1, 2.4, 2.5, 2.6, 3.5, 3.8, 3.9, 3.11
IMAGE           1.4, 3.15
JAVA            1.4, 3.7
JAVASCRIPT      3.5, 3.14
LOCATION        2.4, 2.6, 3.11
MICROSOFT       3.4
MOZILLA         3.4
NCSA            1.1, 1.5, 1.7, 2.2, 2.3, 2.7, 3.1, 4.3, 6.2
NETSCAPE        2.4, 3.5, 3.13
NPH             2, 2.4, 2.5, 2.6, 2.7, 3.9, 3.12
PERL            1.4, 1.9, 1.10, 3.14, 3.16, 4.1, 5, 5.1, 6.1
POST            0.1, 0.4, 0.6, 2.1, 2.8, 3.10
REDIRECT        2.4, 3.11
REFRESH         2.4
REQUEST         2.1, 2.2, 2.8, 3.7, 3.11, 3.15, 5.2
RESPONSE        2.1, 2.4, 2.5, 3.7, 3.9, 3.11, 3.15, 5.2
SECURITY        1.7, 1.8, 1.13, 6.1
SERVER          0.3, 1.4, 1.5, 1.7, 1.12, 1.13, 2.1, 2.2, 2.3, 2.4, 2.7, 2.8, 3.7, 3.9, 3.10, 3.14, 5.2
SSI             1.5
STATUS          2.4, 2.6, 3.9, 3.12
TCL             1.10
UNIX            1.9, 2.2, 3.6, 4
URL             1.7, 1.13, 2.8, 3.10, 3.11, 3.14, 3.15, 4.4
WWW             0.3, 0.6, 1.4, 1.7, 1.8, 1.10, 1.13, 2.4, 2.6, 2.8, 3.9, 4.1, 4.3, 4.4, 5, 5.1, 5.2, 6.1, 6.2