HelpView
Documentation in Current GNU/Linux Systems
On today's GNU/Linux systems, one of the most important unresolved
issues is documentation. There is no standard way to store (everybody
wants to store their documentation in a different place), query (currently,
only man and info pages can be searched) or display (some display HTML,
some display text, some store DocBook and display HTML) documentation.
Before making our proposal let's have a look at various help systems
on the GNU/Linux platform and discuss what's wrong with them. There is
something wrong with each; this is the reason there are so many.
- yelp: The GNOME help viewer
yelp displays HTML. However, it has no searching capability. In its
current form, yelp is no more than a pretty interface to Gecko.
- khelpcenter: The KDE help viewer
This is not much different than yelp: it just embeds an HTML display.
No search, no documented standard.
- info: GNU's choice
info is a good help program, including a top level directory, hyperlinking
and search features.
info displays text formatted for 80 columns. Although the TeX source
for info files enables the author to publish the documentation in multiple
formats, the need to compile the documents makes info unsuitable for
general use: you cannot resize an info window and expect the text to
adapt (only horizontal resizing matters here). info cannot
display graphics either, which is sometimes necessary.
- man: The Old School
man is the simplest and most widely used help system. It provides
keyword searching but that's almost all it provides. It does not
even provide hyperlinking.
What Should a Good Help Viewer Include
A good help viewer should:
- Support keyword searches: Without this, a documentation system
is not very different from a row of books on the shelf: you have to
dig through the material yourself.
- Make it easy to read related docs:
In addition to linking to documents in the help system, it should
also be easy to link to web resources. Many documents contain links
to project home pages, updated docs etc. It should be easy to follow
them.
- Display graphics: Graphics not only make documents prettier, they are
sometimes essential for showing graphs, flow charts and the like.
- Display content without any presumption of the display device:
Readers should be able to view documents in a window of any size they choose.
- Provide a means to store bookmarks: It should be easy to return
to a document without remembering its locator. Also, bookmarks should
be portable across different systems so that they can be communicated.
This requires naming a document independently of its location.
- Make it easy to install/update documents.
- Start up fast: Waiting for documentation to display is annoying,
especially when you just want to look something up.
- Be interactive when needed: A document could also be a tutorial which
can contain tests or quizzes.
- Enable scripting the documents: Sometimes documents contain
calculated tables/graphs, steps showing the execution of an algorithm
and similar. Being able to automate these tasks with minimum effort is
important to authors.
- Enable a document to contain necessary files to recreate an example:
MSDN library CDs are an example of this. When you click on an example
link, a program runs and installs the selected example on your disk.
This is a very good thing for tutorials.
- Enable the user to take notes on a subject, then read/transport them.
The following items could also be implemented:
- Users should be able to participate in the creation of the documents.
There are a number of ways a user could participate:
- By requesting a translation of a document. For many languages,
there isn't much documentation. Translators could be motivated if
many users requested a translation of a particular document.
The help system should give them a tool to do so.
- By submitting a bug fix for a document.
Proposal
File Format and the Viewer
HTML satisfies all of the above conditions that apply to the document
format. Most document writers are aware of the above necessities so there is
a lot of documentation in HTML format. Therefore we should use HTML. There
really is not much motivation to support more than one output format unless
you are a book publishing company. That can also be supported: a publisher
may do its updates on the sources used to generate the HTML and provide
the HTML to its users.
When you want to view HTML on a GNU/Linux system, you really have
only two choices: mozilla or links. Mozilla is a very capable
browser but it suffers from many bugs, a very slow startup time and
the complexity of embedding it. links, however, is a very light-weight,
simple implementation which also works much faster than mozilla. So, I
will use links as the 'host'. As a matter of fact, I have already learned
how to use links for the job in about 2 hours thanks to the modular
structure links was written in.
For ease of transporting and installing documents, a document should just
be a single file. With multiple files and nested directories, there are
some issues:
- The filesystem containing the document may or may not treat names
as case sensitive, or may not support long names. By hiding document names
from the OS, we can ensure that everything is consistent. Although
this is not a big deal if a document will be viewed only on a GNU/Linux
system, it can be a problem when viewers for the same document format
are written for other OSs.
- Sending a single file over to a friend is much easier than sending
a directory structure. Viewing such a file is also easier since you
don't have to find a place to unpack to, unpack the archive and then
point your help viewer to the unpacked document.
For these reasons, I will implement a document as a zip archive.
zip archives have the advantages of being easy to create, keeping
most of the content readable with ordinary tools when there is no help
viewer on a system and, of course, compression. See the openoffice.org
documentation for good reasons to choose .zip as a document format.
Search System
Each subdocument in a document should contain keyword information. These
keyword-subdocument pairs are collected into a file inside the zip archive.
This index is added to the system index when the document is installed.
Each keyword should also have a 'type'. For example, the keyword printf
can exist in two documents, one inside the manual for the printf function
of libc with type cfunc and another inside the manual for the
printf program with type prg.
It should also be possible to search for words in the bodies of the
documents as a last resort (i.e. when the user has no clue where
to start).
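The typed keyword index could look roughly like the Python sketch below.
The class name KeywordIndex, its methods, and the archive and page names
(libc.zip, coreutils.zip and so on) are hypothetical; only the printf
keyword and the cfunc and prg types come from the text above.

```python
from collections import defaultdict


class KeywordIndex:
    """System-wide index mapping a keyword to typed subdocument entries."""

    def __init__(self):
        # keyword -> list of (type, document, subdocument)
        self._entries = defaultdict(list)

    def add_document(self, document, pairs):
        """Merge one document's keyword file into the system index."""
        for keyword, kw_type, subdocument in pairs:
            self._entries[keyword].append((kw_type, document, subdocument))

    def lookup(self, keyword, kw_type=None):
        """Return all entries for a keyword, optionally filtered by type."""
        hits = self._entries.get(keyword, [])
        return [hit for hit in hits if kw_type is None or hit[0] == kw_type]


index = KeywordIndex()
# Hypothetical documents, each contributing its own keyword pairs on install.
index.add_document("libc.zip", [("printf", "cfunc", "stdio/printf.html")])
index.add_document("coreutils.zip", [("printf", "prg", "printf.html")])

c_function_hits = index.lookup("printf", "cfunc")
all_hits = index.lookup("printf")
```

Searching without a type returns both printf entries, so the viewer can
let the user pick; supplying the type narrows the result to one manual.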
Browsing
Each document should have a unique identifier that is independent of its
location in the installation. This can be easily accomplished by using
hierarchical names which also make sense to the user. For example, the
document for the Konsole program could be named kde.console, or
the one for gimp could be named gnome.image.editor etc.
These identifiers are for the system only. They should not be the names
of the programs, since those can change. Normally, a user would
find a document using a query, not by trying to find its document name.
All documents should contain an index.html which will be used
in case the user opens a document using its filename. The filename alone
does not give any indication about which subdocument should be displayed.
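One way to picture location-independent names is a small registry that
maps an identifier to wherever the archive happens to be installed. The
Python sketch below is only an assumption about how this might work: the
DocumentRegistry class, the "identifier/page" locator syntax and the
installation path are all invented for illustration.

```python
class DocumentRegistry:
    """Maps location-independent identifiers to installed archives."""

    def __init__(self):
        self._paths = {}

    def install(self, doc_id, archive_path):
        """Record where the archive for doc_id lives on this system."""
        self._paths[doc_id] = archive_path

    def resolve(self, locator):
        """Split 'doc.id/sub/page.html' into (archive path, member name).

        A locator with no page part falls back to index.html.
        """
        doc_id, _, member = locator.partition("/")
        return self._paths[doc_id], member or "index.html"


registry = DocumentRegistry()
# Hypothetical installation path; another system may use a different one.
registry.install("kde.console", "/usr/share/help/konsole.zip")

front = registry.resolve("kde.console")
page = registry.resolve("kde.console/tabs.html")
```

A locator such as kde.console/tabs.html stays valid on any system that
has the document installed, whatever path the archive is stored under.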
Bookmark System
As said above, the bookmark system will enable the user to mark places in
documents in a consistent manner across all installations. A bookmark
system already exists in links. Just recording the URLs should be enough.
Scripting System
I think Lua is the best language through which to provide scripting
services. Lua is designed specifically for embedding.
Each scripted document should have a corresponding file under /scripts.
Suppose /users/add_user.html is scripted. In this case, /users/add_user.html
is fed as an input to /scripts/users/add_user.html.
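The path convention above can be sketched in Python as follows. This is
only an illustration of the lookup, not the Lua embedding itself: the
names script_for and render are assumptions, and fake_interpreter is a
stand-in for whatever would actually run the script.

```python
import posixpath

SCRIPT_ROOT = "/scripts"


def script_for(page):
    """Return the path of the script that corresponds to a page."""
    return posixpath.join(SCRIPT_ROOT, page.lstrip("/"))


def render(page, members, run_script):
    """Return the page, passed through its script when one exists.

    members maps archive member paths to their text; run_script is a
    callable (script_source, page_source) -> str standing in for the
    embedded interpreter.
    """
    source = members[page]
    script_path = script_for(page)
    if script_path in members:
        return run_script(members[script_path], source)
    return source


def fake_interpreter(script_source, page_source):
    # Stand-in only: substitutes the script text for a placeholder.
    return page_source.replace("@RESULT@", script_source)


# Hypothetical archive contents.
members = {
    "/users/add_user.html": "<p>@RESULT@</p>",
    "/scripts/users/add_user.html": "42",
    "/intro.html": "<p>hi</p>",
}
scripted = render("/users/add_user.html", members, fake_interpreter)
plain = render("/intro.html", members, fake_interpreter)
```

Pages without a matching entry under /scripts are returned unchanged, so
unscripted documents pay no cost for the scripting system.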