
 = A Summary of some scripts

INTRODUCTION
  
  This file summarises the purpose of various 'transformation' scripts
  which have been used in developing this documentation. The scripts are generally based
  upon the principle that plain text is a simple and flexible format. The text files which
  the following scripts are designed to transform generally have some kind of 'structure',
  which is also sometimes called 'markup', but this structure is allowed to be
  flexible (as opposed to something like XML). The reason for this is that the sort of
  information which the text files contain is not in itself rigidly structured, and so
  it is somewhat artificial to attempt to force the 'data' or information into
  some kind of rigid system.

  Nevertheless, the power of XML based technical documentation systems is recognised by
  the author of these scripts (m.bishop) and it is envisioned that scripts will be 
  written to allow the documentation text files to be transformed into something like
  DocBook XML. This may be compared to the approach adopted by the maintainers of the 
  'TexInfo' system. While they retain their own markup system, they ensure that their
  markup system is transformable to XML.

  The alternative to this current system would be to use a fully fledged DocBook documentation
  system. The only real disadvantage to this is that all maintainers (namely me) need to
  have the software installed on every computer at which they will be writing or 
  maintaining documentation. This is not a huge hurdle. Another possible argument for retaining
  a 'plain plain' text data source (as opposed to a structured plain text system) is that
  there already exists a large set of tools for working with plain text. This tool-set
  may be said to diminish slightly when one has to deal with structured plain text.

  One other argument is that of 'universal contribution'. For example, as it currently stands,
  in order for somebody to make any kind of contribution to the various Linux documentation
  projects it is necessary for that person to have installed and to understand a large set of
  software, including CVS, an XML parser, an XML editor and possibly an XML transformer, as
  well as the necessary DTDs and style sheets required to deal with the various
  XML formats which the different distributions of Linux use. Also the web-visitor needs
  to obtain a password to the main document archive. If it is considered that a
  normal web visitor may simply spot a spelling mistake or some other trivial error
  (of which there are many in the TLDP), it begins to seem ridiculous that there are so
  many barriers to people actually contributing.

  If a plain-plain text information and documentation datasource is retained then these
  barriers can be significantly reduced. This concept owes something to the ideas
  elaborated by the 'wiki' web contribution systems, although my plain text is actually
  less structured than wiki markup. For those unfamiliar with these 'web-collaboration' ideas,
  a wiki uses an extremely simple markup language in order to allow any visitor to a
  web-site to edit the pages of that web-site. Because the markup language is so simple and
  can be learnt in a matter of minutes, it is possible for any web-user to contribute, rather
  than only those who understand HTML or XML.

  If an XML system is used, then either the web-visitor needs to understand XML, or there needs
  to be some kind of XML editing web interface (and the user will have to learn how to use this,
  which would not be trivial), or the user needs to have XML editing software installed on her
  local computer as well as some kind of 'versioning system' to allow her to access the
  document archive. However, with a plain-plain text system, the user's contributions can be
  incorporated into the existing document much more easily, simply because what the contributor
  will be typing will be much closer in structure (i.e. none) to the structure of the existing
  document. Of course this system loses much of the power of an XML system, since the
  transformation scripts essentially need to 'guess' at what types of data the plain text
  documents contain. For example, without any markup, the scripts need to guess what sort of
  string constitutes a URL (in order to hyperlink it in the HTML output, for example).
  In some cases this 'guessing' is pretty easy (if the text begins with 'http://' for example)
  but in other cases, the guessing is much harder (in the case of 'relative' URLs which
  refer to the local file system). However, the philosophy behind the scripts listed below is that
  the data is more important than the presentation of that data, or in other words, it is
  better to have a badly formatted document than none at all.
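
  As a rough illustration of the 'easy' case of this guessing, the following shell
  sketch hyperlinks anything that begins with 'http://'. It is not taken from any of
  the scripts listed in this file: the function name link_urls is hypothetical, and
  the rule that a URL ends at the next whitespace, '<' or '"' is an assumption made
  only for this example.

```shell
#!/bin/sh
# Hypothetical helper, not one of the scripts described in this file.
# Reads plain text on stdin and writes an HTML fragment on stdout.
link_urls() {
  # First escape the HTML special characters, then wrap every run of text
  # that starts with 'http://' (up to the next whitespace, '<' or '"')
  # in an anchor tag. In the last expression '&' stands for the matched URL.
  sed -e 's/&/\&amp;/g' \
      -e 's/</\&lt;/g' \
      -e 's/>/\&gt;/g' \
      -e 's|http://[^[:space:]<"]*|<a href="&">&</a>|g'
}

# Example use:
#   link_urls < notes.txt > notes.html
```

  A full stop typed directly after a URL would be swallowed into the link, and
  relative URLs are not detected at all. This is the kind of imperfect guess that
  the approach described above accepts.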

  An additional problem with the current approach is that the contributor or documentor
  cannot have very exact influence over how the document is finally presented or displayed.
  For example if the contributor does not want a URL to be hyperlinked then she is in
  trouble with the current system. The idea is simple: 'Is it really that important whether
  your URL is hyperlinked or not?' My answer is no.
  
SEE ALSO

  http://www.ella-associates.org/alexis-info/docs/about-web-pages.html
    This URL contains a description of how the documentation web pages at
    ella-associates.org were created and how they can be maintained
  
THE TRANSFORMATION SCRIPTS
  
   diary2html.sh
     Turns a 'diary' style text file into HTML
   linkdoc2html.sh
     Turns a text file which has a list of URL links and descriptions into HTML
   linkdoc2html-index.sh
     As above but also adds an HTML 'table of contents' for possible 'section headings'
   linkdoc2html-forum.sh
     Turns a text file with a URL list into an HTML file which has the capability
     to be contributed to by a web-visitor (using cgi-scripts)
   plaintext2pdf.sh
     Turns a text file into a PDF file with an optional table of contents
   plaintext2html-simple.sh
     As plaintext2html.sh (below), but doesn't use certain 'bash' tricks
   plaintext2html-forum.sh
     Turns a text file with possible section headings and urls into an HTML
     file which can be contributed to via the 'add-comment' cgi script
   plaintext2html.sh
     Turns a text file with possible section headings and urls into an HTML file
   resume2html.sh
     Turns a text document which is essentially a personal resume or
     'curriculum vitae' into an HTML file
   resume2html-forum.sh
     As above, but allows the web-visitor to contribute comments to the resume
     via the add-comment cgi script.
   glossary2xml.sh
     Turns a text file which is a sort of 'glossary' into a dodgy XML file
   alphabetize-glossary.sh
     Re-arranges a text file which contains a series of definitions of 'items' or 'terms'
     so that the items are ordered alphabetically.
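
   Several of the scripts above act on 'possible section headings'. How they
   recognise a heading is not described here, so the following shell sketch only
   shows one plausible guess: it assumes that a heading is a line of upper-case
   letters and spaces starting in the first column (the convention this file itself
   uses). The function name toc_items is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not one of the scripts described in this file.
# Reads plain text on stdin and prints one HTML list item per guessed heading.
toc_items() {
  # -n suppresses normal output; the trailing 'p' prints only the lines
  # where the substitution matched, i.e. the lines guessed to be headings.
  sed -n 's|^\([[:upper:]][[:upper:] ]*\)$|<li>\1</li>|p'
}

# Example use:
#   toc_items < summary.txt
```

   Indented lines, and headings which contain digits or punctuation, are not
   matched by this guess.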


SOME CGI SCRIPTS

   add-comment
     This is the script which works in conjunction with the '-forum' transformation scripts.
     It allows the user to add comments to documents. This should become a Servlet or 
     something similar.

   bashmail
     A very dodgy way of sending an email through SMTP

   check-hotmail
     This script determines whether there are invalid Hotmail email addresses
     in amongst a list. It does this without sending 'test emails' to the
     addresses.
