1. INTRODUCTION [TOC]

This file contains a summary of the purpose of the various 'transformation' scripts which have been used in developing this documentation. The scripts are generally based upon the principle that plain text is a simple and flexible format. The text files which the following scripts are designed to transform generally have some kind of 'structure', which is also sometimes called 'markup', but this structure is allowed to be flexible (as opposed to something like XML). The reason for this is that the sort of information which the text files contain is not in itself rigidly structured, so it is somewhat artificial to attempt to force the 'data' or information into some kind of rigid system.

Nevertheless, the power of XML-based technical documentation systems is recognised by the author of these scripts (m.bishop) and it is envisioned that scripts will be written to allow the documentation text files to be transformed into something like DocBook XML. This may be compared to the approach adopted by the maintainers of the 'Texinfo' system: while they retain their own markup system, they ensure that it is transformable to XML.

The alternative to the current system would be to use a fully fledged DocBook documentation system. The only real disadvantage to this is that all maintainers (namely me) need to have the software installed on every computer at which they will be writing or maintaining documentation. This is not a huge hurdle.

Another possible argument for retaining a 'plain-plain' text data source (as opposed to a structured plain text system) is that there already exists a large set of tools for working with plain text. This tool-set may be said to diminish slightly when one has to deal with structured plain text.

One other argument is that of 'universal contribution'.
For example, as it currently stands, in order for somebody to make any kind of contribution to the various Linux documentation projects, that person needs to have installed, and to understand, a large set of software, including CVS, an XML parser, an XML editor and possibly an XML transformer, as well as the DTDs and style sheets required to deal with the various XML formats which the different distributions of Linux use. The web-visitor also needs to obtain a password to the main document archive. If it is considered that a normal web visitor may simply spot a spelling mistake or some other trivial error (of which there are many in the TLDP), it begins to seem ridiculous that there are so many barriers to people actually contributing. If a plain-plain text information and documentation data source is retained then these barriers can be significantly reduced.

This concept owes something to the ideas elaborated by the 'wiki' web contribution systems, although my plain text is actually less structured than wiki markup. For the uninitiated in these 'web-collaboration' ideas, a wiki uses an extremely simple markup language in order to allow any visitor to a web-site to edit the pages of that web-site. Because the markup language is so simple and can be learnt in a matter of minutes, it is possible for any web-user to contribute, rather than only those who understand HTML or XML.

If an XML system is used, then one of three things is needed: the web-visitor needs to understand XML, or there needs to be some kind of XML editing web interface (and the user will have to learn how to use this, which would not be trivial), or the user needs to have XML editing software installed on her local computer as well as some kind of 'versioning system' to allow her to access the document archive.
However, with a plain-plain text system, the user's contributions can be incorporated into the existing document much more easily, simply because what the contributor will be typing will be much closer in structure (i.e. none) to the structure of the existing document.

Of course this system loses much of the power of an XML system, since the transformation scripts essentially need to 'guess' at what types of data the plain text documents contain. For example, without any markup, the scripts need to guess what sort of string constitutes a URL (in order to hyperlink the thing in the generated HTML, for example). In some cases this 'guessing' is pretty easy (if the text begins with 'http://' for example) but in other cases the guessing is much harder (in the case of 'relative' URLs which refer to the local file system). However, the philosophy behind the scripts listed below is that the data is more important than the presentation of that data, or in other words, it is better to have a badly formatted document than none at all.

An additional problem with the current approach is that the contributor or documenter cannot have very exact influence over how the document is finally presented or displayed. For example, if the contributor does not want a URL to be hyperlinked then she is in trouble with the current system. The idea is simple: 'Is it really that important whether your URL is hyperlinked or not?' My answer is no.

2. THE TRANSFORMATION SCRIPTS [TOC]

diary2html.sh
    Turns a 'diary' style text file into HTML.

linkdoc2html.sh
    Turns a text file which has a list of URL links and descriptions into HTML.

linkdoc2html-index.sh
    As above, but also adds an HTML 'table of contents' for possible 'section headings'.

linkdoc2html-forum.sh
    Turns a text file with a URL list into an HTML file which can be contributed to by a web-visitor (using cgi-scripts).

plaintext2pdf.sh
    Turns a text file into a pdf file with an optional table of contents.

plaintext2html-simple.sh
    As below, but doesn't use certain 'bash' tricks.

plaintext2html.sh
    Turns a text file with possible section headings and URLs into an HTML file.

glossary2xml.sh
    Turns a text file which is a sort of 'glossary' into a dodgy xml file.

alphabetize-glossary.sh
    Re-arranges a text file which contains a series of definitions of 'items' or 'terms' so that the items are ordered alphabetically.

3. SOME CGI SCRIPTS [TOC]

add-comment
    This is the script which works in conjunction with the '-forum' transformation scripts. It allows the user to add comments to documents. This should become a Servlet or something similar.

bashmail
    A very dodgy way of sending an email through SMTP.
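
The URL 'guessing' described in the introduction can be illustrated with a small sketch. This is not taken from any of the scripts listed above: the function name 'linkify' and the exact pattern are assumptions made purely for the example. It handles only the easy case (text beginning with 'http://' or 'https://'), leaves 'relative' URLs untouched, and does not escape other HTML special characters.

```shell
#!/bin/sh
# Hypothetical helper, not one of the scripts listed in this file.
# Reads plain text on standard input and wraps every absolute
# http:// or https:// URL in an HTML anchor.
linkify() {
  # The URL is 'guessed' as: the scheme, then any run of characters that
  # are not whitespace, angle brackets or double quotes, ending on a
  # character that is not sentence punctuation (so that a full stop
  # after a URL stays outside the link).
  sed 's|https\{0,1\}://[^[:space:]<>"]*[^[:space:]<>".,;:!?)]|<a href="&">&</a>|g'
}

# Example run: the absolute URL is linked, the relative path is not.
printf '%s\n' 'See http://example.org/docs and ../notes/index.txt.' | linkify
```

The relative path in the example is passed through unchanged, which is exactly the harder case mentioned in the introduction: nothing in the text itself marks it as a link, so a script would need some further rule (or some markup) to recognise it.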