CSSS Coding Competition -- October 2002


Problem 3

Assume you are given an n by n array of letters and a list of words (a sequence of letters). Your program should determine if any word from the list can be found in the array, and if so, what is the position of its first and last letters. A word can appear in the array horizontally, vertically, and diagonally, each in both directions (there is no wrap-around). A word can appear in the array more than once or not appear in it at all. The format of the input will contain:

  1. on the first line: the size of the array n followed by the n2 letters in the array;
  2. on the second line: a series of words each preceded by the number of characters in the word. The last word is immediately followed by a zero.
Here is an example input file:
5abcedgratoitboyeargotrace
3bra3cab3eat3ate3boy3rae5trace4race3rat6simple5great2go0
This file represents the array:
        a b c d e
        g r a t o
        i t b o y
        e a r g o
        t r a c e
and the words to find are: "bra", "cab", ..., "go". The output of your program should contain a line for each occurrence of each word in the array. Each line should contain the word followed by the position of its first letter and the position of its last letter. For example, with the above file, your output should be:
        bra 3 3 5 3 
        bra 3 3 1 1 
        cab 1 3 3 3 
        eat 1 4 3 2 
        ate 2 3 4 1 
        boy 3 3 3 5 
        rae 4 3 4 1 
        trace 5 1 5 5 
        race 5 2 5 5 
        rat 2 2 2 4 
        rat 5 2 3 2 
        go 4 4 3 4 
        go 4 4 4 5 

Test Data

First:

5abcedgratoitboyeargotrace
3bra3cab3eat3ate3boy3rae5trace4race3rat6simple5great2go0

Expected output:

bra 3 3 5 3 
bra 3 3 1 1 
cab 1 3 3 3 
eat 1 4 3 2 
ate 2 3 4 1 
boy 3 3 3 5 
rae 4 3 4 1 
trace 5 1 5 5 
race 5 2 5 5 
rat 2 2 2 4 
rat 5 2 3 2 
go 4 4 3 4 
go 4 4 4 5 

Second:

21trihsyffupjsizanpuoscjracyllemsoxcdtkqkhprsulhecrlnvnhoomtnveeadinjakqtkhvacesriabnzyarixtvxvnozktltqprfykglbojvcwuiznjveacrfjnonaerbeetgiarsclnkmokmnishmcvthpyefdaczyeyscahgtirctsangromnadoomjmtignnsedndndvxuakchttrobhtciaeeckllbvpuddyyenwdapyrbweqxrozorbukwttprpabazmkkolkjccmcgpeuqodnnbrafragvbqgimltpopaiisrnntnehwjbvfitdjbyeacrtwsiooizdvwxcdtuatrnnxsekrsbbzfhvukxmdsbsnneaxgqigafcbffenaqvperzmbegaknirhswmyekardehteelainebenesrsxkramerm
7maestro12yadayadayada8soupnazi5cosmo11elainebenes8costanza6kramer12aboutnothing5frank6george6newman14crazyjoedavola8uncleleo9smellycar10juniormint8thedrake10puffyshirt7thebris12poppiespizza6mickey9shrinkage8bigsalad5bania13jonvoightscar6mrpitt5puddy12steinbrenner9jpeterman0

Expected output:

maestro 19 20 13 20 
yadayadayada 8 13 19 13 
soupnazi 1 20 1 13 
cosmo 11 2 7 2 
elainebenes 21 1 21 11 
costanza 2 13 9 20 
kramer 21 15 21 20 
aboutnothing 17 13 6 2 
frank 6 20 2 16 
george 15 1 20 1 
newman 10 14 15 19 
crazyjoedavola 1 21 14 21 
uncleleo 10 20 3 13 
smellycar 2 10 2 2 
juniormint 2 1 11 10 
thedrake 20 20 20 13 
puffyshirt 1 10 1 1 
thebris 9 7 3 1 
poppiespizza 15 12 4 12 
mickey 15 7 20 12 
shrinkage 20 9 20 1 
bigsalad 11 8 4 1 
bania 12 15 16 15 
jonvoightscar 1 11 13 11 
mrpitt 7 20 2 15 
puddy 12 1 12 5 
steinbrenner 18 14 7 14 
jpeterman 16 11 8 3 

Discussion

The first snag with this problem was the typo: the example input file begins 5abced but the first row of the array is a b c d e. I don't remember whether this was Leila's mistake or I introduced it myself while preparing the questions. Anyway, someone noticed the mistake and the judges said that it was a simple typo.

Then we noticed that solutions included eat 1 4 3 2 although "eat" does not appear in the array and we wondered whether programs were just writing the expected output for the given sample, because "eat" does not occur in the array as given in the problem statement, but it does occur in the expected output. Fortunately, we realized quite quickly that "eat" does appear in the actual array derived from the input data.

The second snag was the realization that my intepretation of "strings of reasonable length" and the teams' intepretation might be different. Also, I told the invigilators to make the announcement that the words to be found had fewer than ten letters -- which is not true for the second data set! Consequently, we could not fairly use the second test data set and we were stuck with the first data set -- which is exactly the same as that given in the problem! We therefore took the risk that the programs we were testing did not consist simply of a print statement to output the given results.

The problem is straightforward but tedious. The essential difficulty seemed to me to be minimizing the code required to search in eight different directions. I solved this by writing a function look() that matches a word without actually knowing which direction it is searching in -- this information is supplied by dr and dc. look() is called eight times from each position in the letter array.

A smaller difficulty is reading the embedded numbers in the strings. I wrote another function, getnum(), to handle this.


Code

#include <iostream.h>
#include <fstream.h>
#include <strstrea.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

const char EOS = '\0';

ifstream in;
ofstream out;

int getnum()
{
   int n = 0;
   char c;
   while (1)
   {
      in >> c;
      if (!in || !isdigit(c))
      {
         in.putback(c);
         return n;
      }
      n = 10 * n + c - '0';
   }
}

char a[100][100];
int size;

void look(char *w, int r, int c, int dr, int dc)
{
   int nr = r;
   int nc = c;
   char *nw = w;
   while (true)
   {
      nw++;
      if (*nw == EOS)
         break;
      nr += dr;
      nc += dc;
      if (nr < 0 || nr == size || nc < 0 || nc == size || *nw != a[nr][nc])
         return;
   }
   out << w << ' ' << r << ' ' << c << ' ' << nr << ' ' << nc << endl;
}

void main(int argc, char *argv[])
{
   in.open(argv[1]);
   out.open(argv[2]);

   size = getnum();
   int r, c;
   for (r = 0; r < size; r++)
      for (c = 0; c < size; c++)
         in >> a[r][c];

   char wds[100][50];
   int nw = 0;
   while(true)
   {
      int len = getnum();
      if (len == 0)
         break;
      for (int i = 0; i < len; i++)
      {
         char c;
         in >> c;
         wds[nw][i] = c;
      }
      wds[nw][len] = EOS;
      nw++;
   }
   in.close();
   
   for (int w = 0; w < nw; w++)
   {
      for (r = 0; r < size; r++)
         for (c = 0; c < size; c++)
            if (wds[w][0] == a[r][c])
            {
               look(wds[w], r, c, -1, -1);
               look(wds[w], r, c, -1,  0);
               look(wds[w], r, c, -1,  1);
               look(wds[w], r, c,  0, -1);
               look(wds[w], r, c,  0,  1);
               look(wds[w], r, c,  1, -1);
               look(wds[w], r, c,  1,  0);
               look(wds[w], r, c,  1,  1);
            }
   }
   out.close();
}
Hosted by www.Geocities.ws

1