General concept guide
_________________________________________________

    1) What is the purpose of this thing?
    _________________________________________________

    Well, the general purpose is quite simply to have a program that eases the task of automating tests that have previously
    been done by hand. Of course this doesn't mean that just because we have a testsuite, no work needs to be put into this anymore
    at all. Testscripts still need to be programmed, debugged, and maintained. What the testsuite is supposed to be good at is really
    executing these testcases en masse, logging the results and pointing out any caught exceptions and errors to the user. Once such an
    error has been found, it is up to the user (or some other developer) to find out where exactly it occurred, what might be causing it,
    and ultimately to remove the bug from the tested program.
    Notice that at the time of this writing, the testsuite doesn't really excel at what it does. This is mostly because I've had no
    previous experience with test automation (and even if I had, I still think this is quite a big undertaking) and I had to read up on most things
    as I was dealing with them. Nevertheless the work done so far represents a stepping stone for further expansion and modification, and
    I think much of the groundwork has been laid, so hopefully you'll be able to get it right and make this thing really good at its mission.

    2) How is it supposed to achieve that objective?
    _________________________________________________

    The broader concept of test automation can be broken down into these individual points:

    a) The ability to execute an array of testcases (which may or may not be logically related to one another) on one or several
       computers simultaneously. Notice that while the testsuite should be run on several hosts simultaneously, individual testcases
       on any one host should run sequentially. Execution may be triggered from one central host (which I normally refer to as the 
       'issuing' host, because it issues a request to all the other hosts, asking them to run the testsuite with the specified parameters).

    b) Logging any output to text files. Note that there are several kinds of output to be dealt with. First off, there's whatever the 
       program that's actually being tested spits out. Secondly, there's the output generated by the testcase script written specifically
       to test that program. Finally, there's the output generated by the testsuite program itself.
       Currently the testsuite is only capable of capturing any output that is directed toward STDOUT or STDERR (that's what you normally 
       see in your DOS box or shell console if you're on a UNIX rig). This is done by redirecting the aforementioned output file descriptors
       to a text file of your own.
       If the program being tested writes output to a logfile itself, I currently have no feasible way of dealing with that ...
       unless you want to hardwire the path of the logfile into your testscript. 

    c) At a later stage it might be nice to enable the testsuite to do some basic analyzing of the aforementioned log files. Maybe point out
       at precisely which line of the log things started going wrong or something like that. As of now this is still wishful thinking.

    d) Catching any errors at runtime without sending the whole testsuite down in flames. Even if the test script contains nothing but garbage,
       the testsuite itself should not be affected by this. It should catch the error, terminate the failed testcase, go on to the next
       testcase and afterwards inform the user about what happened. Most of that is already implemented but could probably use some
       refinement.

    e) Providing a fairly tolerable user interface through which the testsuite program and the user can communicate with each other. By
       'fairly tolerable' I mean it shouldn't be overly cryptic or hard to understand. After all people using the testsuite will not 
       necessarily be privy to the internals of the testsuite, so the interface ought to be set up in a way that not only the programmer
       of the testsuite can understand. Right now, the user interface is probably one of the most lacking components of the project. 
       The only way the user can currently communicate with the testsuite is via a shell console/DOS box, meaning any parameters are
       specified on the command line. If you have the time, you __may__ want to look into Perl/Tk, which is basically a Perl extension
       that allows you to program dialog boxes, windows and all those funky gadgets usually seen in a graphical user interface.

    f) Keeping the way the testsuite program interacts with individual testcase scripts simple and rigid. It would be particularly nice
       if this interface could be programmed in such a way that developers of testcase scripts have to worry as little as possible about
       any restrictions and requirements imposed by the testsuite program as a whole. Naturally there have to be some conventions as to
       how a testcase script must be programmed, such as "exit with status 0 if successful and non-zero otherwise", but such dependencies
       should be kept to a bare minimum (although this may become increasingly hard as the testsuite becomes more powerful and
       configurable).

    3) What we currently have
    _________________________________________________

    Conceptually, the current testsuite can be divided into 2 major components and a bunch of minor ones. The major components are:
    * the remote host startup mechanism and ...
    * the actual testcase execution mechanism.

    The remote host startup mechanism consists of 2 perl scripts, namely main.pl and start.pl. Its purpose is to cause an instance of 
    the testsuite program to be spawned and started on a remote host reachable through the network. Here's how we do that.
    The main.pl script takes any parameters passed to it on the command line and preprocesses them (where applicable). It then figures
    out on which hosts the user wants the testsuite program to be run. Next it loops through all the specified hosts and establishes
    an ssh (secure shell - the more modern, and safer version of rsh [remote shell]) connection to each of them, one at a time. 
    In doing so, the main.pl program basically achieves a login on that particular remote host. It uses that login to start another perl
    script, the start.pl script, which in turn creates a daemon process (more on that later) and then exits immediately. Now back to
    the main.pl script. As soon as the start.pl script has finished, the main.pl script closes the currently open ssh-connection, thereby 
    removing its login on the remote host. At that point the remote host startup mechanism is done with that particular host and can now
    turn to the next one. The daemonized start.pl program then starts the actual testsuite program, represented by the testsuite.pl
    script, which (since it was started by a daemon process) is also a daemon.
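
    The loop over the target hosts could be sketched roughly as follows. This is only an illustration, not the code from main.pl:
    the function names 'buildStartCommand' and 'startOnHosts', the remote path of start.pl and the argument keys are all made up
    for the example, and ssh is assumed to be set up so that no password prompt appears.

```perl
use strict;
use warnings;

# Build the ssh command line that starts start.pl on one remote host.
# The arguments are passed as a whitespace-separated list of key-value pairs
# (hypothetical keys; they are sorted only to make the result predictable).
sub buildStartCommand {
    my ($host, $startScript, %args) = @_;
    return ('ssh', $host, 'perl', $startScript,
            map { ($_, $args{$_}) } sort keys %args);
}

# Visit the hosts one at a time. Each ssh call returns as soon as the remote
# start.pl has exited, i.e. right after it has spawned its daemon.
sub startOnHosts {
    my ($hosts, $startScript, %args) = @_;
    my %failed;
    for my $host (@$hosts) {
        my @cmd = buildStartCommand($host, $startScript, %args);
        $failed{$host} = $? if system(@cmd) != 0;   # remember hosts that failed
    }
    return %failed;
}
```

    A call might look like startOnHosts(['memel', 'athen'], '/opt/testsuite/start.pl', testcaseRoot => '/opt/testsuite/UNIX').
    Keep in mind that ssh hands the remote command to a shell, so values containing blanks or shell metacharacters would need
    extra quoting.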
    
    Now let's get back to that daemon process I've mentioned. First off, if you don't know what a daemon process is
    I'll try and explain it to you real quick (if that doesn't do the trick, I guess there's always Google ;-)). A daemon is essentially
    a process which is not associated with a controlling terminal and which leads its own session and process group. Since it doesn't have
    a controlling terminal, it is said to run 'in the background' (this is roughly comparable to Microsoft's Windows services). Daemons normally run
    silently, meaning they don't write stuff to the standard output channels (STDOUT and STDERR). Usually STDOUT and STDERR are simply
    redirected to a logfile. Notice that since file descriptors are inherited from the parent process to its children, any programs
    started from within a daemon whose output has been redirected to a logfile will also print to that logfile. This has the added
    bonus that the testcase scripts can be programmed to print all their output as if it would go straight to the console/DOS box,
    but in reality it will end up getting dumped into a logfile.
    Anyway, no controlling terminal means that the ssh-connection that was used to start the program can be discarded even though 
    the actual program keeps on running. If we hadn't used a daemon process we'd be unable to close the ssh-connection and it'd have 
    to remain open for the entire duration of the testsuite program instance on that host. Of course we don't want that, so that's 
    why we're using daemons.
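
    For reference, the classic way of turning a Perl process into a daemon looks roughly like the sketch below. It is not a copy
    of what start.pl does; the function name 'daemonize' and the logfile parameter are assumptions made for the example.

```perl
use strict;
use warnings;
use POSIX qw(setsid);

# Detach the current process from its terminal and send all further output
# to $logfile (should be an absolute path, because we chdir to '/').
sub daemonize {
    my ($logfile) = @_;

    defined(my $pid = fork()) or die "Cannot fork: $!";
    exit 0 if $pid;                       # parent exits, so the ssh login can end

    (setsid() != -1) or die "Cannot start a new session: $!";
    chdir '/' or die "Cannot chdir to /: $!";

    # Children started later inherit these redirected file descriptors.
    open STDIN,  '<',  '/dev/null' or die "Cannot read /dev/null: $!";
    open STDOUT, '>>', $logfile    or die "Cannot write to $logfile: $!";
    open STDERR, '>&', \*STDOUT    or die "Cannot redirect STDERR: $!";
    $| = 1;                               # do not buffer our own STDOUT
    return;
}
```

    After a call such as daemonize('/tmp/testsuite.log'), anything the process or its children print ends up in that logfile,
    which is exactly the effect described above.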

    In my opinion the startup mechanism can be optimized somewhat by removing the script start.pl and shifting its code to either
    the testsuite.pl or the main.pl script, thus saving one invocation of the perl interpreter to run the script. This is not 
    too much of an issue, though, so if you feel the mechanism is fine the way it is, just leave it that way.

    That's that, next is the testcase execution mechanism. It pretty much just iterates through the list of testcases to be run
    and runs them strictly one after another, meaning the next testcase will only be started once the previous has finished. The testsuite
    program checks the exit status of the testcase program scripts to determine whether that particular test was successful or not. In
    usual UNIX manner, an exit status of zero indicates success while anything else points to an error.
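
    In code, that boils down to something like the following sketch. The function name 'runTestcases' is made up and the real
    testsuite.pl does more (logging, config handling), so take this as an illustration of the exit status convention only.

```perl
use strict;
use warnings;

# Run the given testcase scripts one after another and report how each ended.
# Every script runs in its own process, so a crashing script cannot take the
# testsuite down with it.
sub runTestcases {
    my @scripts = @_;
    my @results;
    for my $script (@scripts) {
        my $rc = system($^X, $script);    # blocks until the script has finished
        my $verdict;
        if    ($rc == -1) { $verdict = "could not be started: $!" }
        elsif ($rc & 127) { $verdict = 'killed by signal ' . ($rc & 127) }
        elsif ($rc >> 8)  { $verdict = 'failed with exit status ' . ($rc >> 8) }
        else              { $verdict = 'ok' }
        push @results, [$script, $verdict];
    }
    return @results;
}
```

    Each element of the returned list is a pair of script name and verdict, in the order in which the scripts were run.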

    So that's the two major components of the current testsuite. Of course there are also a multitude of smaller components. Among 
    these minor components are:
    * host selection mechanism (run on which hosts?)
    * testcase selection mechanism (run which testcases?)
    * parameter preprocessing (parse command line, canonicalize relative pathnames and things like that)
    * testcase config file parser (extract config info from the XML config file for an individual testcase)

    Let's go into detail about each one of those.

    A) The host selection mechanism
    _________________________________________________

    In the testsuite program's main directory, take a look in the subdirectory './Hosts/KnownHosts'. In there you should find a file
    'AllKnownHosts.xml'. This is an XML-document (XML is sort of like HTML, just different ;) ... not much, though) listing all
    known reachable UNIX hosts in the net, along with any options. Currently there really aren't any options, but if in the future
    the need arises to provide parameters for each host in the list, you can simply add them there. The first host in the list (athen)
    has a pair of unused parameters, just to demonstrate how you'd specify stuff should you ever need to.
    Initially I used my own hand-rolled parsers in the testsuite program, but I later decided to use XML for all but the most trivial
    tasks for two reasons. Firstly, XML is a mature and well-tested technology. There are many parsers
    available for C++, Perl, Python and just about any other programming language you can think of. So instead of writing your
    own potentially bug-infested parser and spending hours debugging it each time you add more options to your config files, why not
    just use something that's already there?
    Secondly, XML is standardized. It's the same everywhere. Anyone who understands XML will have little or no trouble understanding
    your config file. The same might not be true if you were to come up with a file format of your own. Currently we're using the
    Perl module XML::Simple to do our parsing (you can download this module from CPAN). As the name implies, this module is extremely
    simple and easy to use. But there's a drawback: it's not quite as powerful as I'd like it to be. Initially I wanted to use
    another module called XML::Checker::Parser. The reason is that XML::Checker::Parser is a validating parser (meaning it uses a
    document type definition, or DTD, to check the parsed XML document for logical soundness) whereas XML::Simple is not.
    Unfortunately I couldn't get XML::Checker::Parser to work on the system because a lot of the modules required to make the parser
    work are not preinstalled on the system, and its dependencies are quite numerous. Every time I tried to install a module that
    was needed for XML::Checker::Parser to run, the installation quit and complained that this new module in turn also needed some
    other module to be installed on the system. Before long I gave up and went with XML::Simple instead. But I still believe that
    eventually a switch to XML::Checker::Parser would be a good idea.

    Anyway, I'm getting carried away, back to the actual host selection mechanism. Basically the main.pl script reads the 
    'AllKnownHosts.xml' document and stores the info in a nested hash, where the hostname is the key. The value is currently of no 
    concern to the program since we don't currently have any host-specific options that we could put in the config file. The extracting
    of information from the 'AllKnownHosts.xml' file is currently done by the function 'getListOfKnownHosts', defined in the main.pl
    script.
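
    To give an idea of how XML::Simple produces such a nested hash, here is a small sketch. The XML layout shown (a 'hosts'
    root element with 'host' elements carrying a 'name' attribute) is purely an assumption for the example; check the real
    'AllKnownHosts.xml' and 'getListOfKnownHosts' for the actual structure.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Hypothetical layout of AllKnownHosts.xml; the real file may look different.
my $xml = <<'END_XML';
<hosts>
  <host name="athen" param1="unused" param2="unused" />
  <host name="memel" />
  <host name="kiew" />
</hosts>
END_XML

# KeyAttr folds the list of <host> elements into a hash keyed by the 'name'
# attribute; ForceArray keeps that working even if only one host is listed.
my $doc = XMLin($xml, KeyAttr => { host => 'name' }, ForceArray => ['host']);
my %knownHosts = %{ $doc->{host} };

# The keys are the hostnames, the values hold any per-host options.
print join(', ', sort keys %knownHosts), "\n";
```

    XMLin also accepts a filename instead of an XML string, which is what you would use for the real config file.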

    Okay, so now we have a list of all hosts on which we could potentially run the testsuite program. But we don't necessarily want to
    run the program on all of them. Perhaps the user would like to run the testsuite only on the machine named 'memel', or he might want
    to run it on all machines except those named 'kiew' and 'bonn'. This must be specified in a file called 'RunOnHosts.list' which
    is a simple text file with one element of information per line. By default the program looks for this file in a directory called 
    '.dTestsuiteConfigDir' which in turn is looked for in the respective user's home directory. Alternatively the path of the directory
    can be specified on the command line in the following form: userConfigDir = /home/yourname/somedirectory/myconfigdir.
    The format of the 'RunOnHosts.list' is as follows. There are three possible tags that can be used, and you must only use one of them
    and not a combination of several. These tags are: [Explicit], [Exclude] and [All]. The tag should be the first line in the file and
    all hosts should be listed below the tag. Any hosts listed above a tag are simply ignored. Saying [Explicit] and then listing a 
    bunch of hosts under that tag means "Run the testsuite __only__ on these hosts, not anywhere else!". Saying [Exclude] means
    "Run the testsuite on all hosts __except__ the ones I've listed here!". Lastly, saying [All] means (you guessed it) "Run the 
    testsuite on __all__ known hosts.".
    The function that reads the 'RunOnHosts.list' file and does the filtering is called 'determineTargetHosts', also defined in the 
    main.pl script.
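
    The filtering rules can be sketched like this. The function name 'selectHosts' is made up, and the sketch silently drops
    explicitly listed hosts that are not in the list of known hosts, which is an assumption on my part and not necessarily
    what 'determineTargetHosts' does.

```perl
use strict;
use warnings;

# $known: array ref with all known hostnames
# $lines: array ref with the lines of a RunOnHosts.list file
sub selectHosts {
    my ($known, $lines) = @_;
    my ($mode, %listed);
    for my $line (@$lines) {
        (my $entry = $line) =~ s/^\s+|\s+$//g;      # trim a copy of the line
        next if $entry eq '';
        if ($entry =~ /^\[(Explicit|Exclude|All)\]$/) {
            die "Only one tag may be used\n" if defined $mode;
            $mode = $1;
            next;
        }
        $listed{$entry} = 1 if defined $mode;       # hosts above the tag are ignored
    }
    die "No [Explicit], [Exclude] or [All] tag found\n" unless defined $mode;

    return @$known                         if $mode eq 'All';
    return grep {  $listed{$_} } @$known   if $mode eq 'Explicit';
    return grep { !$listed{$_} } @$known;           # Exclude
}
```

    For example, with the known hosts 'athen', 'memel', 'kiew' and 'bonn', a file consisting of the lines '[Exclude]', 'kiew'
    and 'bonn' leaves 'athen' and 'memel' as target hosts.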

    B) The testcase selection mechanism
    _________________________________________________

    The testcase selection mechanism is the mechanism by which the user can choose which testcases to run and which not to run.
    In this sense a 'testcase' is really a directory which contains the testcase script(s) and the testcase config file in XML format.
    Testcases can be grouped if they are logically intertwined. For example, you could have a directory called 'UNIX' which contains
    three subdirectories called 'KernelPatch', 'HostId' and 'Toolkit'. The latter three directories would be testcase directories
    containing the scripts and config files for these three testcases. The UNIX directory wouldn't have any specific purpose other
    than accommodating our three UNIX-related testcase directories. If you view this structure as a tree then the 'UNIX' directory
    represents the root of the tree. And in fact trees are what the testcase selection mechanism is based on. By simply specifying
    a root directory you can make the testsuite.pl program traverse the entire tree which has its root in that root dir and store
    all actual testcase directories of the tree in a vector. The function doing that is called 'getTestcaseDirs', defined in the
    testsuite.pl script. It relies heavily on a function called 'traverseDirectoryTree', defined in the perl module 
    'DirsAndFolders.pm'.
    Note that currently the testsuite treats any end-nodes it finds (and __only__ end-nodes) as potential testcase directories. For
    a directory to qualify as an end-node it must not contain any subdirectories (or even symlinks to directories). To further
    qualify as an actual testcase directory, it also must contain at least a Make.pl script and a Testcasecfg.xml file. Optionally
    it may also contain a Clean.pl file, if there's stuff to be 'cleaned up' after a testcase has been run. To be honest, I'm not
    even sure if it really makes sense to have a 'clean' action at all, but maybe there are some scenarios that need it, though
    I can't think of any off the top of my head. I'm still going to leave it in for now.
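
    The rules for what counts as a testcase directory can be sketched as a small recursive function. This is not the code of
    'getTestcaseDirs' or 'traverseDirectoryTree'; the name 'findTestcaseDirs' is made up and the sketch simply refuses to
    descend through symbolic links.

```perl
use strict;
use warnings;
use File::Spec;

# Return all end-node directories below $dir that contain both a Make.pl
# script and a Testcasecfg.xml file.
sub findTestcaseDirs {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot read directory $dir: $!";
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    @entries = sort @entries;

    # -d follows symbolic links, so a link to a directory counts as a subdirectory.
    my @subdirs = grep { -d $_ } map { File::Spec->catdir($dir, $_) } @entries;
    if (@subdirs) {
        # Inner node: descend, but never through a symbolic link (avoids loops).
        return map { findTestcaseDirs($_) } grep { !-l $_ } @subdirs;
    }

    # End-node: it only qualifies if the two mandatory files are present.
    my %has = map { $_ => 1 } @entries;
    return ($has{'Make.pl'} && $has{'Testcasecfg.xml'}) ? ($dir) : ();
}
```

    Calling findTestcaseDirs('UNIX') on the example tree above would return the 'KernelPatch', 'HostId' and 'Toolkit'
    directories, provided each of them contains the two mandatory files and no subdirectories.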
    
    By the way, I placed all perl modules (currently 'Utilities.pm' and 'DirsAndFolders.pm') in a directory called 'lib'. This 
    directory must be located in the main directory of the testsuite project. 

    Anyhow, it is possible to specify more than one testcase group (i.e. root directory) at the same time. If so, the function
    'getTestcaseDirs' checks to make sure that none of the specified trees are subtrees of one another. This is important because
    if we allowed overlapping trees, some testcases might end up in our testcase vector more than once. Needless to say, we
    don't want that. Before I forget: as of now the function that is used to check whether a tree is a subtree of another tree
    doesn't work for soft (symbolic) links. Type 'man ln' in a UNIX shell if you need to read up on what symbolic links are. To make
    a long story short, a symbolic link is really a file that points to another file or directory. It's kind of like a Windows
    shortcut really. Anyway, the point is symbolic links can do nasty things to our testcase selection mechanism if we aren't very
    careful. Imagine what would happen if there was a symbolic link in our 'UNIX/Toolkit' directory that points back at the 'UNIX'
    directory. If you were unaware of soft links and wanted to descend into every node of the tree, you would probably run into an
    endless loop, because every time you reached the 'Toolkit' folder you would try to step into the directory referenced by our
    soft link, only to be taken directly back to the 'UNIX' folder, thereby reentering the tree. The good news is that
    'traverseDirectoryTree' is smart enough to avoid such endless loops. However, the 'getTestcaseDirs' function is currently
    not able to recognize trees that overlap by way of a symbolic link, and you may have to look into that some time later.
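
    If you do look into it, one possible approach (just a sketch, not something the testsuite currently does) is to resolve
    all symbolic links with Cwd::realpath before comparing the paths. The function names 'isSubtreeOf' and
    'findOverlappingRoots' are made up for the example.

```perl
use strict;
use warnings;
use Cwd qw(realpath);

# True if directory $inner is $outer itself or lies somewhere below it.
# realpath() resolves symbolic links first, so aliases are compared by
# their real location in the file system.
sub isSubtreeOf {
    my ($inner, $outer) = @_;
    my $in  = realpath($inner);
    my $out = realpath($outer);
    die "Cannot resolve '$inner' or '$outer'\n" unless defined $in && defined $out;
    return 1 if $in eq $out;
    $out .= '/' unless $out =~ m{/\z};   # so /a/bc is not taken for a child of /a/b
    return index($in, $out) == 0 ? 1 : 0;
}

# Return the first pair (inner, outer) of overlapping roots, or an empty list.
sub findOverlappingRoots {
    my @roots = @_;
    for my $i (0 .. $#roots) {
        for my $j (0 .. $#roots) {
            next if $i == $j;
            return ($roots[$i], $roots[$j]) if isSubtreeOf($roots[$i], $roots[$j]);
        }
    }
    return;
}
```

    With this, a root that is merely a symbolic link into another specified tree is reported as overlapping, because both
    paths resolve to locations inside the same real directory.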

    C) Parameter preprocessing
    _________________________________________________

    Since the testsuite may, in the future, take a lot more parameters than it does now, I've tried to keep the process of parameter
    passing as simple and robust as possible, so that it can easily be extended when necessary. Rather than passing arguments in a way
    that makes their position the determining factor for what they mean (i.e. 1st parameter means this, second parameter means that,
    third parameter means blahblahblah) I went for a system of key-value pairs. So instead of just passing arguments as a concatenation
    of individual values you can now write "argument1 = value1 argument2 = value2 ....". This is longer but definitely more robust. 
    Note that it doesn't matter whether the assignment operator ( '=' ) is glued directly to either the key or the value or how many 
    whitespaces there are before or after the assignment operator. The following argument assignments:
        argument1=value1
        argument1 = value1
        argument1= value1
        argument1 =value1
        argument1        =    value1

    are all the same thing really. The function that does the argument parsing in the main.pl script is called 'getScriptArgs'. It draws
    on the function 'parseCommandLineArgs', defined in the module Utilities.pm.
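
    A parser with this behaviour can be sketched in a few lines. This is not the code of 'parseCommandLineArgs'; the name
    'parseKeyValueArgs' is made up, and the sketch assumes that neither keys nor values contain whitespace or a '='.

```perl
use strict;
use warnings;

# Turn "key = value" style arguments into a hash. The tokens are joined first,
# so it makes no difference how the shell split them around the '='.
sub parseKeyValueArgs {
    my $line = join ' ', @_;
    my %args;
    while ($line =~ /\G\s*([^\s=]+)\s*=\s*([^\s=]+)/gc) {
        $args{$1} = $2;
    }
    if ($line =~ /\G\s*(\S.*)/gc) {       # anything left over is malformed
        die "Cannot parse arguments starting at '$1'\n";
    }
    return %args;
}
```

    In a script you would call it as my %args = parseKeyValueArgs(@ARGV); and all five spellings listed above then yield the
    same single pair.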

    I figured that putting the assignment operator between the key and the value makes the whole thing more readable to the human eye.
    After all 'argument1 = value1' looks better than just 'argument1 value1' without the '=' in between. However, the scripts start.pl
    and testsuite.pl use the latter form of argument passing (i.e. just a whitespace-separated list of key-value pairs without the '=')
    because they receive their arguments from the scripts main.pl and start.pl respectively rather than from a human user, which means
    increased readability is not much of an issue here. 

    D) The testcase configuration file parser
    _________________________________________________

    This component also makes use of an XML parser to extract the information stored in the configuration files and put them in a 
    nested hash. There's not much to say here. Have a look at the function 'readConf' in the testsuite.pl script to learn more about
    what exactly is done to extract the information from the XML document. By the way, a testcase's configuration file must reside
    in the directory of that particular testcase and it must be named 'Testcasecfg.xml'.


   

    

 
    
    
    
    
    
    



