
WORDPERMS.exe    
=============


EXAMPLE OF HOW TO RUN:

  [Open a Windows "Command Prompt" and change the working directory to the location of your input file]

    wordperms myinputfile.txt
 
  Or to capture the output into a file:
   
    wordperms myinputfile.txt > results.txt
  

RECORD (LINE) DELIMITERS:

  Line breaks are translated to generic word-delimiters.
  The longest permissible single line is 100,000 chars.


WORD DELIMITER CHARACTERS:
 
   The following chars are all considered to be word-delimiters:
   space  ' ' 
   comma  ','
   period '.'
   minus  '-' 
   equals '=' 
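
   As an illustration only, the splitting rule above could be sketched in Python as follows.
   This is not the program's own source: the name split_words is hypothetical, and the
   sketch assumes that a run of consecutive delimiters produces no empty words, which
   this README does not state.

```python
# Delimiters listed in this README, plus line breaks, which the README says
# are translated to generic word-delimiters.
DELIMITERS = " ,.-=\r\n"

# Map every delimiter character to a plain space.
_TO_SPACE = str.maketrans({c: " " for c in DELIMITERS})


def split_words(text):
    """Split text into words on the listed delimiters.

    Assumption: empty strings between adjacent delimiters are dropped.
    Tabs are NOT treated as delimiters, since the README does not list them.
    """
    return [w for w in text.translate(_TO_SPACE).split(" ") if w]


if __name__ == "__main__":
    print(split_words("qokedy.shedy,daiin-ol=ar\nchol  dy"))
```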


INPUT CLEANING THAT THIS PROGRAM PERFORMS PRIOR TO PROCESSING:

 * strings such as "{plant}" and "{figure}" are stripped out if found in the input file as they are not part of the text proper.

 * Page/Line Identifiers such as "<f26r.P.1;H>" are removed, if they are found at the start of the input line.

 * leading whitespace chars on each line (spaces or tabs) are stripped out.

 * Any doubled space characters are translated to single spaces.

 * Uppercase 'A' to 'Z' are translated to lowercase. 
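
 The cleaning steps above could look roughly like this in Python. It is a hedged sketch, not
 the program's actual code: the function name clean_line, the order of the steps, and the
 exact patterns (any "{...}" string, any "<...>" at the very start of a line, runs of spaces
 collapsed to one) are assumptions made for illustration.

```python
import re
import string

# Only ASCII 'A'..'Z' are lowercased, as the README describes.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def clean_line(line):
    """Apply the README's cleaning rules to one input line (assumed order)."""
    # Remove a page/line identifier such as "<f26r.P.1;H>" at the start only.
    line = re.sub(r"^<[^>]*>", "", line)
    # Strip brace-enclosed strings such as "{plant}" or "{figure}".
    line = re.sub(r"\{[^}]*\}", "", line)
    # Strip leading spaces and tabs.
    line = line.lstrip(" \t")
    # Collapse doubled (or longer) runs of spaces into a single space.
    line = re.sub(r" {2,}", " ", line)
    # Translate uppercase A-Z to lowercase.
    return line.translate(_ASCII_LOWER)


if __name__ == "__main__":
    print(clean_line("<f26r.P.1;H>  FACHYS.ykal{plant}  ar"))
```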


LIMITS AND ASSUMPTIONS:

 * There are no pipe symbols ('|') in the input file 
 * Any wordpairs longer than 63 characters are ignored.


SPEED OF OPERATION:

  This program has not been thoroughly optimized for speed, and it gets quite slow for bigger samples.
  An input file of 200 KB should take about 20 seconds to process on an average PC for all the fixed distances from 1 to 40.
  However, the running time is proportional to the size of the average wordpair vocabulary multiplied by the size of the input file.
  For example, a 2 MB English file could well take over an hour!
 

KEY TO OUTPUT VALUES:

  dist        : the fixed distance (in words) between the two words of each wordpair; 1 = natural, adjacent words
  numpairs    : the number of unique wordpairs formed
  reversible  : the number of the unique wordpairs for which the reverse-order pair also exists
  asd         : the average squared difference in frequency between the forward wordpair and the reverse wordpair
  asf         : the average squared frequency of the wordpairs
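
  To make these definitions concrete, here is one possible reading of them in Python. It is
  a sketch under stated assumptions, not the program's implementation: pair_stats is a
  hypothetical name, both averages are taken over the unique wordpairs, a missing reverse
  pair counts as frequency 0, and a pair of identical words counts as its own reverse. The
  63-character wordpair limit is not modelled.

```python
from collections import Counter


def pair_stats(words, dist):
    """Compute the output values for one fixed distance (assumed definitions)."""
    # Count every (word, word-at-distance) pair in the word sequence.
    pairs = Counter(zip(words, words[dist:]))
    numpairs = len(pairs)
    if numpairs == 0:
        return {"dist": dist, "numpairs": 0, "reversible": 0, "asd": 0.0, "asf": 0.0}
    # Unique pairs whose reverse-order pair also occurs.
    reversible = sum(1 for (a, b) in pairs if (b, a) in pairs)
    # Assumption: a missing reverse pair has frequency 0; average over unique pairs.
    asd = sum((n - pairs.get((b, a), 0)) ** 2 for (a, b), n in pairs.items()) / numpairs
    # Average of the squared frequencies over unique pairs.
    asf = sum(n * n for n in pairs.values()) / numpairs
    return {"dist": dist, "numpairs": numpairs, "reversible": reversible,
            "asd": asd, "asf": asf}


if __name__ == "__main__":
    print(pair_stats("a b a b c".split(), 1))
```

  For the word sequence "a b a b c" at dist 1, the pairs are (a,b) twice, (b,a) once and
  (b,c) once, so under these assumptions numpairs is 3 and reversible is 2.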


AUTHOR: 

  This program was written and is maintained by Marke Fincher ("marke@capscan.com" or "marke@sphere.me.uk").
  
  Please use it however you like, but if you publish any results generated from it outside of the "Voynich Mailing List", please make some reference to its use and its authorship.

 




















