wordseq.pl - Word sequence analysis
perl wordseq.pl [filespec]
The input must consist of words, one word per line. The program creates a simple word sequence statistic: each word is listed together with the words that follow it and their frequencies. It was written for the study of the Voynich manuscript.
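The statistic described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the Perl source of wordseq.pl; the function name `follower_counts` and the sample words are hypothetical.

```python
from collections import Counter, defaultdict

def follower_counts(words):
    """Map each word to a Counter of the words that directly follow it."""
    followers = defaultdict(Counter)
    # Pair every word with its successor; the last word has no successor.
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

# Hypothetical sample input, one word per list element
# (the real tool reads one word per line).
words = "daiin ol daiin chedy daiin ol".split()
stats = follower_counts(words)
for word, counter in stats.items():
    # most_common() lists followers by decreasing frequency.
    print(word, counter.most_common())
```

To feed such a sketch from a file with one word per line, the list could be built with `[line.strip() for line in open(path) if line.strip()]`.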
--nosingles or -s
For each word in the input a line of output will be generated.
The word is the first token and is followed by a colon.
After the colon there is an entry for each following word, sorted by decreasing frequency of occurrence. The entry contains
The line is closed by an equals sign.
After the equals sign follows the number of displayed words and, if you use the -s option, the number of suppressed words in curly brackets.
The last character of the line is the newline character.
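As a rough illustration of the line layout described above, here is a sketch in Python. It is not the tool's actual code, and several details are assumptions that this document does not confirm: the entry layout `word{count}`, the spacing between tokens, and the reading of -s as "suppress followers that occur only once". The function name `format_line` is hypothetical.

```python
from collections import Counter

def format_line(word, counter, nosingles=False):
    """Build one output line for a word and its follower counts."""
    # most_common() sorts by decreasing frequency.
    entries = counter.most_common()
    # ASSUMPTION: -s drops followers that were seen only once.
    shown = [(w, n) for w, n in entries if not (nosingles and n == 1)]
    suppressed = len(entries) - len(shown)
    # ASSUMPTION: each entry is the follower plus its count in curly brackets.
    body = " ".join(f"{w}{{{n}}}" for w, n in shown)
    line = f"{word}: {body} = {len(shown)}"
    if nosingles:
        # Number of suppressed words in curly brackets.
        line += f" {{{suppressed}}}"
    return line + "\n"

counts = Counter({"ol": 2, "chedy": 1})
plain = format_line("daiin", counts)
filtered = format_line("daiin", counts, nosingles=True)
print(plain, end="")
print(filtered, end="")
```

The real output of wordseq.pl may differ in every assumed detail; the sketch only mirrors the order of the parts: word, colon, entries, equals sign, counts, newline.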
The output is formatted so that you can pipe it to the showeva.pl tool and read it in the EVA font. For this purpose the frequencies are formatted as EVA inline comments.
perl viat.pl -t H | perl vword.pl | perl wordseq.pl | perl showeva.pl -
Creates a word sequence statistic for the VMS transcription by T. Takahashi and shows it in the showeva.pl tool.
The short source is far from being an example of good software engineering. Do not read it, use it...
wordseq.pl was written by Michael Winkelmann, michael@weltretter.de