Adventures with the World Wide Web:

Creating a Hypertext Library Information System

By James Powell
Library Automation, University Libraries
Virginia Polytechnic Institute and State University

About the World Wide Web

There have been many flavors of the week in the Internet electronic information tool realm, from the relevance feedback Wide Area Information Server (WAIS) and multithreaded Usenet news readers to menu-based Gopher information servers. Each new development has sought to integrate the core functionality of the previous crop. While the Gopher system is probably at the peak of its popularity, the World Wide Web (WWW or W3) is now gaining in popularity as a system for publishing electronic information.

World Wide Web is a client/server information system in the tradition of other TCP/IP information systems such as Gopher. Client/server systems consist of two separate programs that communicate with each other using a protocol (a set of rules that are used to communicate requests and replies between two programs over a network). The WWW protocol is called HyperText Transfer Protocol (HTTP). An information provider uses a WWW server to make available HTML (HyperText Markup Language) documents, which may contain links to other documents and resources on the network, creating a web of interconnected information spanning the Internet. Nondocument resources are supported with virtual documents. Items such as Gopher menus and WAIS indexes are converted to HTML and presented to the client application as hypertext documents on the fly. Clients are used by people browsing this information web to connect transparently to various WWW and non-WWW resources as the user selects links within documents.

The WWW supports more sources of networked and local electronic information than any other networked information retrieval tool. Supported resource types include Gopher, Wide Area Information Server (WAIS), World Wide Web, Network News Transfer Protocol (NNTP), Techinfo, FTP, local file systems, archie, finger, Hyper-G, Hytelnet, TeXinfo, telnet and tn3270 accessible systems. Some of these resources, such as NNTP and the Gopher protocol, are supported directly by the WWW protocol. Others are supported by gateways that convert HTTP queries to the appropriate format for resources such as WAIS. WWW was developed at CERN, the European Particle Physics Laboratory in Geneva, Switzerland, to link collections of documents on high energy physics located around the Internet into one huge resource. It relies on many powerful standards to provide this information [1,2]. One such standard is SGML (Standard Generalized Markup Language). WWW documents are marked up using an SGML tag format called HyperText Markup Language (HTML). As with other SGML formats, this language defines the structures of the document. It also provides a standard method for linking various documents together using an anchor tag, which might point to another place in the document, another document, another type of electronic resource, an image or even an audio file. These resources can be located on the same computer or on another continent. Documents can be completely marked up to take advantage of the maximum benefits of hypertext or simply encapsulated as ASCII documents using the <PRE> preformatted tag along with a brief header [3].
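The simpler of these two approaches, encapsulating an ASCII document with <PRE> and a brief header, can be sketched in a few lines of Python. This is only an illustration: the function name wrap_preformatted and the exact header layout (a title plus a first-level heading) are assumptions made for the example, not part of any WWW software.

```python
import html


def wrap_preformatted(title: str, text: str) -> str:
    """Wrap plain ASCII text in a minimal HTML document using <PRE>.

    The header layout is an assumption chosen for illustration.
    """
    # Escape &, < and > so the original text cannot be mistaken for markup.
    safe_title = html.escape(title, quote=False)
    safe_text = html.escape(text, quote=False)
    return (
        "<HTML>\n"
        "<HEAD><TITLE>" + safe_title + "</TITLE></HEAD>\n"
        "<BODY>\n"
        "<H1>" + safe_title + "</H1>\n"
        "<PRE>\n" + safe_text + "\n</PRE>\n"
        "</BODY>\n"
        "</HTML>\n"
    )


if __name__ == "__main__":
    # Hypothetical help text; any screen-formatted ASCII document would do.
    print(wrap_preformatted("OPAC Help", "Type A=<author> to search by author."))
```

Escaping the text before wrapping it matters because characters such as < and & would otherwise be read as markup by the client.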

Another powerful standard utilized by the WWW is the Uniform Resource Locator (URL). URLs allow the web to have dynamic content and span the globe. The URL for WWW documents consists of four parts: the protocol, the Internet name and port, the document path, and the filename (optionally followed by an anchor name). For example, http://borg.lib.vt.edu:80/z-borg/www/scholar.html is the URL for an HTTP-accessible document called scholar.html located on borg.lib.vt.edu in the directory /z-borg/www. As previously mentioned, WWW links need not point to documents; they can point to indexes into databases, Gopher menu systems, directories, Usenet news groups, WAIS sources, or telnetable library catalogs using URLs:

FTP: ftp://wuarchive.wustl.edu/pub
telnet: telnet://vtls.vt.edu
WWW: http://hoohoo.ncsa.uiuc.edu:80/
Gopher: gopher://gopher.umn.edu/
NNTP: news:alt.hypertext
WAIS: wais://quake.think.com

WWW takes care of presenting such dynamic data in a hypertext view. The user is able to use all these various resources without learning the intricacies of each new system [4].
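To make the parts of a URL concrete, here is a minimal Python sketch that splits a URL into the pieces described above using the standard urllib.parse module. The function name describe_url and the dictionary keys are assumptions chosen for this illustration.

```python
import posixpath
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into protocol, host, port, directory, filename and anchor."""
    parts = urlsplit(url)
    # Separate the document path into its directory and its filename.
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,   # None when the URL names no host
        "port": parts.port,       # None when no port is given
        "directory": directory,
        "filename": filename,
        "anchor": parts.fragment,
    }


if __name__ == "__main__":
    for key, value in describe_url(
        "http://borg.lib.vt.edu:80/z-borg/www/scholar.html"
    ).items():
        print(key, "=", value)
```

The same function accepts URLs for other resource types, such as news:alt.hypertext, although those carry no host or port.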

Fortuitous Hypertext Experience

This article resulted from an attempt to accomplish a task not directly related to creating a library information system: creating a hypertext version of an electronic journal issue. Converting a document you did not write to hypertext is no easy task. While the document structure may be fairly obvious, putting hypertext links in places that would be useful to a reader is much more difficult. I discovered this while trying to convert an issue of an electronic journal that the Scholarly Communications Project of Virginia Tech publishes, The Journal of the International Academy of Hospitality Research, to a hypertext document using a simple SGML (Standard Generalized Markup Language) tag set, the HyperText Markup Language (HTML). I found it very difficult to mark up the journal without delving deeply into its content. At the same time, I was learning HTML, and making little progress toward either goal. I had placed our collection of electronic journals, which includes the Community Services Catalyst, the Journal of Technology Education, and Modal Analysis, into the World Wide Web (WWW) by creating an entry document (Figure 1) and placing links from the document titles to our Gopher server. But the full benefit of HTML and the WWW is apparent only when a document is marked up in some content-significant, structured fashion. While refining this entry page it occurred to me that a hypertext library system might be a viable alternative to the more common but less flexible Gopher menu system that libraries such as ours have put in place. From my experience with our library Gopher system, I knew that it would be easier to create a prototype hypertext library system than to continue struggling with marking up an electronic journal. Indeed, it turned out to be a tremendously useful learning experience, and resulted in the development of a fully functional library information system.

Developing a Library System

I decided that this basic library information system would include the following capabilities and resources:

(1) the ability to access documents stored locally,
(2) the ability to access documents and resources located on other computer systems provided in a variety of formats by a variety of information servers,
(3) the ability to accept information from the patron for information requests such as ILL or renewal requests,
(4) the ability to access the local library OPAC, other library OPACs, and other information services available via telnet on the Internet, and
(5) a basic document classifying some diverse information resources by general subject classifications.

I selected these capabilities based on the experience I gained helping to set up our current Gopher-based information system and my experience as a user of other information systems on the Internet. WWW provides a concise, standard approach to accessing all of these resources.

I started with an entry page, a place where most users would begin (Figure 1). This page described the library and its branches, containing links to more information about the library and a link to a page listing electronic resources users could access (Figures 2-3). I marked up the information documents in HTML using an HTML editor on a NeXT workstation. A few of the latest extensions to HTML are not supported by the NeXT editor, such as the <IMG SRC> tag used by some clients to retrieve and display images alongside text, so I added these sections using the UNIX vi text editor. I added other documents that were originally created for inclusion in the help sections of our local OPAC. These documents were already formatted for the typical screen display, so I merely placed HTML headers on the documents and enclosed the text in the <PRE> (preformatted) tag. Providing access to remote documents was simple. Using the <A> (anchor) tag, I created appropriately descriptive links with the document specified using its URL, like this link to a list of information services on the Internet:

<A HREF="http://borg.lib.vt.edu:80/z-borg/www/library/information_services.html">Internet Information Services</A>.

Much of the resource page was constructed in this way, including links to Gopher systems, to other WWW servers, to OPACs and to forms for patrons [5].

Tackling forms was probably the most difficult part of the project. Support for forms is promised in HTML+, but is not yet available. Fortunately, WWW is easily integrated with other information servers, such as Gopher, so there were other options available. I created forms using the Perl-based goforms package developed to create telnet-accessible forms for Gopher systems. Each form is actually called by a Perl script when a user telnets to a specific port on the server. The user enters a telnet session transparently and is prompted for his input line-by-line until the form is complete. The forms I put in place include requests to renew checked out items, interlibrary loan requests (Figure 4), and requests for documents stored off-site. I also used telnet to connect to our library OPAC from a link:

<A HREF="telnet://vtls.vt.edu">Access VTLS</A>

I found that the document listing resources by subject demonstrates one of the areas where librarians could make the greatest contribution to the Internet. Here I listed a few subjects with links to various resources, such as Gopher servers, other WWW hypertext documents, or Usenet newsgroups related to each subject (Figure 5). I was able to point links to specific parts of remote Gopher trees and to specific newsgroups within the Usenet hierarchy. In this way, seemingly unclassifiable collections of information such as Usenet news and Gopher could be presented in an organized manner.
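A subject document of this kind is regular enough that it could be generated from a simple table of subjects and links. The following Python sketch shows one way this might be done; the function names anchor and subject_page, the page layout, and the sample subject heading are assumptions for illustration and do not reproduce the page I actually built by hand.

```python
import html


def anchor(url: str, label: str) -> str:
    """Return an HTML anchor tag linking the label to the given URL."""
    return '<A HREF="{}">{}</A>'.format(
        html.escape(url, quote=True), html.escape(label, quote=False)
    )


def subject_page(title: str, subjects: dict) -> str:
    """Build a page listing links grouped under subject headings.

    subjects maps a subject name to a list of (url, label) pairs.
    """
    safe_title = html.escape(title, quote=False)
    lines = ["<TITLE>" + safe_title + "</TITLE>", "<H1>" + safe_title + "</H1>"]
    for subject, links in subjects.items():
        lines.append("<H2>" + html.escape(subject, quote=False) + "</H2>")
        lines.append("<UL>")
        for url, label in links:
            lines.append("<LI>" + anchor(url, label))
        lines.append("</UL>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample entries for illustration; the URLs are ones cited in this article.
    resources = {
        "Hypertext": [
            ("news:alt.hypertext", "alt.hypertext newsgroup"),
            ("http://info.cern.ch/hypertext/WWW/TheProject.html",
             "World Wide Web project at CERN"),
        ],
    }
    print(subject_page("Resources by Subject", resources))
```

Keeping the subject list as data separates the librarian's classification work from the markup, so a new resource only requires a new (URL, label) pair.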

Because of the power of HTML, HTTP and the WWW client/server architecture, I was able to create a highly configurable, fully functional electronic library in a very short period of time. Depending upon the client used, the system also supports a range of multimedia data, as well as virtually every resource type available on the Internet. Best of all, a user of this system would never need to be aware of the variety of resources he was actually using.

With a Library Single Menu System based on Gopher already in place, this system duplicated some of the capabilities already provided elsewhere. However, the presentation of information appeared much more natural to me. The user is given considerably more information upon which to base his selection of a link. This is in sharp contrast to the 70 characters Gopher provides for a menu item, and is very important when sifting through data located around the world. And, thanks to an advanced client application created by the National Center for Supercomputing Applications (NCSA) called X Mosaic, the final product we can provide to patrons in the library and around campus is much more powerful and easy to use than any Gopher system.

X Mosaic is an X Windows application capable of serving as a client program to World Wide Web servers. The user selects links, which appear as underlined text segments, with a mouse. It supports some multimedia data, such as embedded graphics, on its own and other formats, like PostScript and audio, by calling other X applications. The list of formats this client can support is open-ended, as they are user-defined by utilizing the .Xdefaults file to specify what applications will be responsible for presenting the data. While X Windows is the only platform supported currently, NCSA plans to port X Mosaic to Microsoft Windows and the Macintosh environment. Other WWW clients are available, including a VT100 client, a NeXT client, a Macintosh client, a Windows client called Cello, and another X Windows client.

By relying on X Mosaic as a primary client for in-house workstations, the possibilities for a hypertext electronic library are boundless. Special Collections could provide access to fragile material as scanned pages. Information terminals could be set up in the library lobby with floor maps and hypertext descriptions of the materials and resources located on each floor. Terminals in different parts of the library could present an introductory page appropriate for the type of materials located on that floor. Patrons using the library OPAC could view help information with X Mosaic while accessing the OPAC in a separate window (Figure 6). Labs could be set up in the media center to provide access to audio resources, such as Internet talk radio, graphics and motion video segments, and custom hypertext applications for courses taught on campus.

Why Not Use Free Software?

Libraries are understandably hesitant to rely on public domain solutions. But while some vendors attempt to keep up with all the latest electronic formats in an effort to continue to make sales to libraries, free technology that provides many of the same features is often overlooked because it must be sought out, installed and tested. These free systems often bring together many current capabilities into one easy-to-use system, such as Gopher menus. With rapid advances in technology, such as the CD Interactive (CD-I) format now becoming popular in academia with the release of titles covering everything from Jane Goodall's years of research into chimpanzee behavior to archives of antique scientific tools, it is highly unlikely that new advances such as these will be integrated into any system a library might purchase. Facing an uncertain future where proprietary information systems will continue to be released, it is in the best interest of libraries to try out freely available information technology before jumping into the void of commercial systems. Powerful new systems produced by the application of new developments in computer science abound on the Internet. Locally or nationally funded development groups provide these tools and documentation free, and with virtually no licensing restrictions. They are responsive to users and are anxious to see the tools implemented in real world situations. A library could not ask for more from a commercial vendor!

The World Wide Web system is also attractive due to the potential for the appearance of value-added products encoded in HTML. SGML is an attractive publishing format due to the flexibility of the end product, and provides new opportunities for publishers. SGML text collections in the humanities are already gaining popularity. Many of these products, such as the Oxford English Dictionary and the Chadwyck-Healey Poetry database, are created using the TEI (Text Encoding Initiative) tag set, a collection of over 350 tags used to mark up a variety of literary texts and aid linguistic analysis. HTML could become a low-end standard for texts such as electronic journals and nonfiction works published on CD-ROM or other media.

Why Libraries Will Outgrow Gopher

The Gopher system created and supported by the University of Minnesota provides a menu-based approach to information resources. It is extremely easy to use and not difficult to maintain. It is supported on a variety of platforms and is currently very popular on the Internet. So why shouldn't all libraries be scrambling to put up Gopher menu systems for patrons? There are several reasons. The Gopher system is very poorly documented from an information provider or user point of view. While some client software, such as PC Gopher III, includes well-written manuals, the UNIX client/server system includes only brief manual pages. These short documents barely touch on many of the features available in Gopher, leaving the systems administrator to read the source code or send out S.O.S. notes to Usenet discussion groups.

Another problem is that the Gopher menu system is very restrictive in the way it presents information. Adding help or introductory information must be done by adding a menu item for that purpose, since each menu item may be only one line of 70 characters and must either be a branch or an end point. This can result in some rather unnatural presentations. Imagine trying to provide abstracts to articles stored in a Gopher. The user cannot select an abstract and then move from the abstract to the article, but must move back up a branch to select the article. Furthermore, the user has probably not seen enough information to decide whether or not to even read the abstract, as the menu item, at only 70 characters, may or may not have been long enough to display the title. Creating meaningful menus is very challenging work and there is much debate on how to present information. Often, Gopher menus either spill over to multiple pages or are so short as to appear deceptively barren. And finally, Gopher currently lacks support for any standard network resource locator equivalent to the URL supported by WWW. It is for these reasons that I believe Gopher will eventually become merely another information resource accessible by the World Wide Web.
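The effect of the 70-character limit on a long title can be shown with a small Python sketch. It treats the limit as a hard cut-off, which is a simplifying assumption; the names GOPHER_MENU_WIDTH, menu_item and is_truncated, and the sample article title, are hypothetical and exist only for this example.

```python
GOPHER_MENU_WIDTH = 70  # limit cited in this article; assumed to be a hard cut-off


def menu_item(title: str, width: int = GOPHER_MENU_WIDTH) -> str:
    """Return the part of a title that fits on a one-line menu item."""
    return title[:width]


def is_truncated(title: str, width: int = GOPHER_MENU_WIDTH) -> bool:
    """Report whether a title is too long to be displayed in full."""
    return len(title) > width


if __name__ == "__main__":
    # Hypothetical article title, invented for this example.
    title = ("A Comparative Study of Menu-Based and Hypertext Approaches "
             "to Organizing Networked Library Resources")
    print(menu_item(title))
    print("truncated:", is_truncated(title))
```

A hypertext entry has no such limit, so the full title and even a sentence of description can sit beside the link.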

Go With A Hypertext Model Now

A hypertext information system requires more effort by the information provider. It is very important that authoring guidelines are established early, so that informational documents are consistent for the user. But the benefits of hypertext, such as full text and links across documents, are very useful to end users. Most software vendors now provide some type of proprietary hypertext documentation system for their software. The result is that user environments, such as Microsoft Windows, allow application users to get started with a new software package without referring to the printed manual.

Libraries have the opportunity to utilize a nonproprietary hypertext system to interconnect electronic information resources locally and provide links to a growing global network of information. This web of information would benefit enormously from the organizational and information management skills librarians could bring to it. Indeed, it is unlikely that any of these local collections of information will become reliable resources until librarians step in and sort out what resources are available where. One has only to attempt to find remote resources through the Gopher system to experience the frustration of dealing with a system with no standard of organization. Gopher is established, and is unlikely to change. World Wide Web, while supporting access to all of the existing electronic resource types, is a new frontier. Libraries may present information in any way they like. They may choose to encode portions of their bibliographic information with links to the actual document or resource. They may create special documents as entry points for various subject areas. This system is so versatile and configurable that the only boundary is the author's imagination. Librarians working with college departments in their subject areas could create powerful educational tools for students that can significantly enhance their access to course-related materials [6].

References

[1] World Wide Web. URL=http://info.cern.ch/hypertext/WWW/TheProject.html. Geneva: C.E.R.N.

[2] Berners-Lee, et al. World-Wide Web: An Information Infrastructure for High-Energy Physics. Geneva: C.E.R.N., 1992.

[3] Nickerson, Gordon. "WorldWideWeb: Hypertext from CERN." Computers in Libraries 12, No. 11 (December 1992): pp. 75-77.

[4] Berners-Lee, Tim; Jean-François Groff; & Robert Cailliau. Universal Document Identifiers on the Network. Geneva: C.E.R.N., 1992.

[5] Berners-Lee, Tim. Style Guide. URL=http://info.cern.ch/hypertext/WWW/Provider/Style/Overview.html. Geneva: C.E.R.N., 1993.

[6] Arms, Caroline. "Other Projects and Progress." In: Campus Strategies for Libraries and Electronic Information. (EDUCOM Strategies Series on Information Technology.) Bedford, MA: Digital Press, 1990.

Reference items listed with URLs are only available through the World Wide Web. They may be accessed directly using WWW client software or located by browsing the C.E.R.N. resources using their public VT100 client (telnet to info.cern.ch).

Readers interested in trying out the hypertext library system described in this article may do so by pointing a World Wide Web client, such as X Mosaic or Cello, at http://borg.lib.vt.edu:80/z-borg/www/library/library.html.

A publicly accessible VT100 client is also available. Telnet to borg.lib.vt.edu, log in as library, and use the password library.

Communications to the author should be addressed to James Powell, Library Automation, University Libraries, P.O. Box 90001, Blacksburg, VA 24062-9001; 703/231-4986; Internet - [email protected].
