Shortest Path Algorithms

Hi, my name is Lee Kin Seng, from Singapore. Please email me if you have any suggestions on my web-site. I still strive to figure out how to derive the equations. But I really busy at my work, so I may not have time to do it. Thank you.

Although human eye can understand the graphs above or might even roughly guess out the shortest routes, but on the other hand computer can't. Therefore we need to convert all the locations from the graphs to lists such as below so computer can recognize.

Before goto the main program, few points need to be emphasized here. Route maps sketched by hand are unreliable, in the sense that from the same routes list, we can product millions of different shape route map graphes. Therefore, doing the path searching using the graphes will mislead you. (except Country road maps)

Another point is Numbering is vital to the programs. Numbering is very objective, depend totally how people sketch their route map graphes. Therefore, different people will produce their own Numbering. However bad Numbering will dramatically increase the workload, and proper Numbering will minimize the job. Graphs with numbering not only acted as labels, but also at the same time ranking the nodes, define nodes' priorities. Hence, we can construct a priority queue. Here, smaller numbering indicate higher priority.

Always put Numbering direction x align with the direction pointing from Starting Point to Destiny. Labeling for axis y can be either clockwise or anti-clockwise. Henece, we can gradually search from breadth circular to depth circular (imagine from inner core to outer layers). In some cases, to simplify problems, we can treate them as individual layers. We can do paths search either from source to destiny or vice verse. That's mean the other way, start search from destiny to source, just that in priority queue, bigger numbers will have higher priority (depth to breadth).Orientation will be opposited in this case. Overall, the method remain unchanged. Therefore, there will be no different in term of computation efforts.

As mentioned, labellings are very objective. If you think circular labellings are ineffective, you can just label the nodes that you think are appropriate. Thus, alter the priority queue sequence at your will. However, on the other hand, in the event of we tackling with the very big graph (e.g. world flight routes), simple visual checking & labelling no longer work properly. Detail on how to label nodes will be explained later.

All the edges are vectors. Therefore weights can only be an absolute value. If not, the search program will be trap inside the loop.

The worst case scenario will be the shortest path always oppose axis xand y direction, which will significantly increase the workload. Same nodes will be path searches again and again (repeated). In an ideal worst case condition, running time will be = n + (n-1) + (n-2) + ... + 1 = S_i=1ⁿ i, where n = total routes. However in the real situation, worst runtime will be much lesser than the above. Because there are no way it repeated linearly from start till the end.

Therefore, proper Numbering is important.

In real world, most of the path problems are oriented. e.g. traffic roads have single way or two way, broadband connections have faster download speed and slower upload speed. Therefore, it is practical to define orientations to the arcs. Where hereby we call it oriented graph or directed graph. There is a difference between the above 2 cases, traffice roads problem is a single edge problem, where vertices n to n' or the opposition share the same weights. Broadband connection is a double edges problem, vertices n to n' and the opposition don't share the same weights. However, these don't matter. We treate all the problems as the all-pairs shortest path problem.

The ideal case will be the search function only execute n times, where n = total routes. Make sure to check whether the routes are single way or two ways. Two ways that's mean there are 2 routes, and the search function will has to execute from the both nodes.

The subpath of the shortest path itself is a shortest path. Hence, any routes changes in the subpath wouldn't affect the shortest path itself. However, we still need to re-calculate all the 'Total Shortest Distance' (Since the routes changed). That's mean, we need to do the search from that point again (by pushing nodes back to the queue).

#include <stdio.h>

#include <iostream>

#include <limits>

#include <malloc.h>

using namespace std;

2) Include librabries to construct priority queue

#include <queue>

#include <vector>

#include <deque>

#include <list>

3) Library <limits> included so that we can defined our Symbolic Constant integer, INFINITY = 2147483647

#defineINFINITYnumeric_limits<int>::max( )

4) Library <malloc.h> for creating dynamic memory allocation. Our Structure Definitions:

struct Graph {

int num_connections; // Number of node's edges or nodes' connection

int shortestDistance; // Total shortest distance from source to respective nodes
int prevPath;

struct {

int weight; // Weights of node's edge

int dest; // The node where edge heading to

} edges[5];

};

typedef struct Graph graph;

typedef graph *graphPtr;

Hereby, presumed maximum number of edges' is upto 5. Therefore we can create our graph using our Structure Definitions:

graphPtr myGraph; // create 'myGraph' pointer to store graph data

myGraph = (graphPtr)calloc(nodecount, sizeof(graph)); //create and allocates dynamic arrays

5) Find out how many nodes in the graph

int nodecount = 30

6) Initially, we define source's shortestDistance to be 0. And all the other nodes' shortestDistance to be infinity, representing the fact that we do not know any path leading to those nodes. When the algorithm finishes, shortestDistance will be the cost of the shortest path from s to v or infinity, if no such path exists.
By the same time, we set all the nodes' previous path as infinity/no defined yet. Therefore we set them '-1' to show that nodes are not connecting to anything other nodes.

if(myGraph != NULL) {
        for(i=0; i<nodecount; i++) {
                if( i == 0 )
                        myGraph[i].shortestDistance = 0;
                else
                        myGraph[i].shortestDistance= INFINITY;

                 myGraph[i].prevPath = -1;
        }
        cout << "Graph initialize established.";
}
else
        cout << "Graph not established. No memory available.\n\n";

7) For conveniences, store the Routes list in a file (.txt, .db). Retrieve it:

FILE *cfPtr;

if( (cfPtr = fopen("Fig2.txt", "r")) == NULL)

cout << "File couldn't be opened\n" << endl;

else {

//build the graph

for(i=0; i<nodecount-1; i++)

fscanf(cfPtr, "%d%d%d%d%d%d%d%d%d%d%d", &myGraph[i].num_connections,

&myGraph[i].edges[0].dest, &myGraph[i].edges[0].weight,

&myGraph[i].edges[1].dest, &myGraph[i].edges[1].weight,

&myGraph[i].edges[2].dest, &myGraph[i].edges[2].weight,

&myGraph[i].edges[3].dest, &myGraph[i].edges[3].weight,

&myGraph[i].edges[4].dest, &myGraph[i].edges[4].weight);

fclose(cfPtr);

cout << "Graph establish.\n";
}

8) Using 'prevPath' is more efficient than stack, although C++ does come along with stack class. But nodes' shortest routes often repeated (As there are only one 'shortest path'). Therefore better to create a custom 'stack' rather than using C++ stack. By implementing 'prevPath', we only create 'nodecount' number of nodes.

Fig. 3: Custom 'stack' inside using 'prevPath' for Fig.1

7) Nodes' ranking is the core of our algorithm. Bad nodes' ranking will severely affect the algorithm performance.Therefore we need a queue that sort nodes according to its ranking, so that some nodes will have higher privilege over the rests. Those nodes that are more possible to be the shortest path. We call it 'Priority Queue'. Higher priority nodes will be extract first. It doesn't necesary to be minimum, can be either maximum numbering, depend whether is source-to-destiny search or destiny-to-source search.

C++ does come along with 'priority_queue' library. However we want a queue with sorting ability but not repeat. Therefore, we need to add additioanl code to the 'priority_queue'.

There are two ways to initialize a priority_queue. One is to push all the nodes into the queue, the other one push only source node '0' into the queue. The latter one is better as to reduce repeated nodes.

priority_queue <int, vector<int>, greater<int> > q; // Here we define our priority_queue as minimum extract first

q.push( 0 );

9) Here is the main operation of the algorithm

int top = -1, sum;

while( !q.empty( ) ) {
        if( top == q.top( ) )
                q.pop( );
        else {
                top = q.top( );

< div >                 q.pop();
                for(i = 0; i<myGraph[top].num_connections; i++) {
                        sum = myGraph[top].shortestDistance + myGraph[top].edges[i].weight;
                        if(myGraph[myGraph[top].edges[i].dest].shortestDistance > sum) {
                                myGraph[myGraph[top].edges[i].dest].shortestDistance = sum;
                                myGraph[myGraph[top].edges[i].dest].prevPath = top;

                                q.push(myGraph[top].edges[i].dest);
                        }
                }
        }
}

The basic operation of the algorithm is edge relaxation: if there is an edge from u to v, then the shortest known path from s to u can be extended to a path from s to v by adding edge (u,v) at the end. This path will have length d[u]+w(u,v). If this is less than d[v], we can replace the current value of d[v ] with the new value.

Edge relaxation is applied until all values d[v] represent the cost of the shortest path from s to v. The algorithm is organized so that each edge (u,v) is relaxed only once, when d[u] has reached its final value.

Always search from the top of the 'priority_queue', where the node has the highest priority than the rest. Keep-on doing untill 'priority_queue' is empty. When the nodes are removed from the top of 'priority_queue', nodes are relaxed.

When d[w] != INFINITY and d[w] > d[u]+w(u,v), push u back to the 'priority_queue'. Because hereby, the shortest path is opposing axis xand y direction. (An example of bad labelling)

There are many ways to interpret the method. This is one of the ways which I put everything into one file.

Firstly, Visual C++ 'priority_queue' nodes are repeatable. Therefore additional codes are added to eliminate repeated computations. New variable 'top' was added to record-down previous top queue. If "top" equal to the current top queue, then the program skip the computation. Everytime, "top " will record-down the current top queue for the next computation.

10) Print the shortest path out:

for(i=0; i<nodecount; i++) {
        if(myGraph[i].prevPath == -1)
                cout << "Shortest path from node \'0\' to node \'" << i << "\' is not yet being found\n";
        else {
                cout << "Shortest path from node \'0\' to node \'" << i << "\':\n"
                        << i << " --> ";

                int preRoute = myGraph[i].prevPath;
                while(preRoute != -1) {
                        cout << preRoute << " --> ";
                        preRoute = myGraph[preRoute].prevPath;
                }
                cout << "End\n";

}

cout << "Shortest distance is " << myGraph[i].shortestDistance << "\n\n";

}

The disadvantage of this method is we need to calculate all the node's shortest path from source in order to get to our destiny, since pc can't predict which way is the shortest path. And we never know which way is the shortest routes.

Following will illustrate how to avoid such problem, so that we can 'stop' half-way through instead of search all the nodes. The trick is label the nodes probably. This method wouldn't be so obvious in small graph. But will be good in big graph.

Sample code:
Version 1: my13.cpp
Version 2: my11.cpp
Version 3: my14.cpp
Version 4: my16.cpp priority_queue1.cpp priority_queue1.h record_routes.cpp record_routes.h Fig2.txt
Version 3: my17.cpp

Self Generate Graph Code

For graph without discrete or distinct nodes, it is rational to make the graph in matrix form.

Below is the program to generate a matrix graph:

1) Create the Structure Definitions for our nodess

struct Node {
   int x, y;
    int num_connections;
    int shortestDistance;
    int prevPath;
    struct {
        int weight;
        int dest;
    } edges[4];
};
typedef struct Node node;
typedef node *nodePtr;

node *myGraph = new node[nodecount];

2) Initialize the graph,

if(myGraph != NULL) {
    for(i=0; i<nodecount; i++) {
       if(i == 0)
           myGraph[i].shortestDistance = 0;
        else
            myGraph[i].shortestDistance = INFINITY;

        myGraph[i].prevPath = -1;
        myGraph[i].num_connections = 4;
    }

cout << "Graph initialize established." << endl;
}
else
cout << "Graph not established. No memory available.\n\n";

2) Below is the program to auto generate the graph we wants

i = 0;
myGraph[0].x = myGraph[0].y = 0;
myGraph[0].edges[0].dest = 1;
myGraph[0].edges[1].dest = 3;
myGraph[0].edges[2].dest = 5;
myGraph[0].edges[3].dest = 7;

myGraph[1].x = 1;
myGraph[1].y = 0;
myGraph[1].edges[0].dest = 10;
myGraph[1].edges[1].dest = 2;
myGraph[1].edges[2].dest = -1;
myGraph[1].edges[3].dest = 8;

int myGraphLeft = nodecount-1;
int t=1, layer=1;
i= 2;

rightFrame:
myGraph[i].x = layer;
myGraph[i].y = layer - 1 - t;

myGraphLeft--;

if(myGraphLeft <= 0)
goto stop;< BR >

     else if( myGraph[i].y == -layer) {
             myGraph[i].edges[0].dest = (2*(layer+1)-1)*(2*(layer+1)-1) + 2*(layer+1)-1 - 1;
             myGraph[i].edges[1].dest = (2*(layer+1)-1)*(2*(layer+1)-1) + 2*(layer+1)-1 + 1;
             myGraph[i].edges[2].dest = i + 1;
             myGraph[i].edges[3].dest = i - 1;
             i++;
             t = 1;
             goto bottomFrame;
  }
    else {
             myGraph[i].edges[0].dest = (2*(layer+1)-1)*(2*(layer+1)-1) + t + 1;
             myGraph[i].edges[1].dest = i+1;
             myGraph[i].edges[2].dest = (2*(layer-1)-1)*(2*(layer-1)-1) - 1 + t;
             myGraph[i].edges[3].dest = i-1;

             i++;
             t++;
             goto rightFrame;
  }

bottomFrame:
myGraph[i].x = layer - t;
myGraph[i].y = -layer;

myGraphLeft--;

     if(myGraphLeft <= 0)
        goto stop;
     else if( myGraph[i].x == -layer) {
             myGraph[i].edges[0].dest = i - 1;
             myGraph[i].edges[1].dest = 4*(layer+1)*(layer+1) - 1;
             myGraph[i].edges[2].dest = 4*(layer+1)*(layer+1) + 1;
             myGraph[i].edges[3].dest = i + 1;

             i++;
             t = 1;
             goto leftFrame;
    }
    else {
             myGraph[i].edges[0].dest = i - 1;
             myGraph[i].edges[1].dest = (2*(layer+1)-1)*(2*(layer+1)-1) + 2*(layer+1)-1 + t + 1;
             myGraph[i].edges[2].dest = i + 1;
             myGraph[i].edges[3].dest = (2*(layer-1)-1)*(2*(layer-1)-1) + 2*(layer-1)-1 + t - 1;

             i++;
             t++;
             goto bottomFrame;

}

leftFrame:
myGraph[i].x = -layer;
      myGraph[i].y = -layer + t;

    myGraphLeft--;

      if(myGraphLeft <= 0)
             goto stop;
      else if( myGraph[i].y == layer) {
             myGraph[i].edges[0].dest = i + 1;
             myGraph[i].edges[1].dest = i - 1;
             myGraph[i].edges[2].dest = 4*(layer+1)*(layer+1) + 2*(layer+1) - 1;
             myGraph[i].edges[3].dest = 4*(layer+1)*(layer+1) + 2*(layer+1) + 1;

             t = 1;
             i++;
             goto topFrame;
      }
      else {
             myGraph[i].edges[0].dest = 4*(layer-1)*(layer-1) + t - 1;
             myGraph[i].edges[1].dest = i - 1;
             myGraph[i].edges[2].dest = 4*(layer+1)*(layer+1) + t + 1;
             myGraph[i].edges[3].dest = i + 1;

i++;
t++;

goto leftFrame;
}

topFrame:
myGraph[i].x = -layer + t;
myGraph[i].y = layer;

myGraphLeft--;

     if(myGraphLeft <= 0)
            goto stop;
     else if( myGraph[i].x == layer+1) {
             myGraph[i].edges[0].dest = (2*(layer+2)-1)*(2*(layer+2)-1) + 1;
             myGraph[i].edges[1].dest = i + 1;
             myGraph[i].edges[2].dest = i - 1;
             myGraph[i].edges[3].dest = (2*(layer+2)-1)*(2*(layer+2)-1) - 1;

      t = 1;
      i++;
      layer++;
      goto rightFrame;
      }
      else {
             myGraph[i].edges[0].dest = i + 1;
             myGraph[i].edges[1].dest = 4*(layer-1)*(layer-1)+2*(layer-1) + t - 1;
             myGraph[i].edges[2].dest = i - 1;
             myGraph[i].edges[3].dest = 4*(layer+1)*(layer+1)+2*(layer+1) + t + 1;

             i++;
             t++;
             goto topFrame;
      }

stop:
      for(i=0; i<nodecount; i++) {
           for(j=0; j<4; j++) {
                    if(myGraph[i].edges[j].dest > (nodecount-1) ) {
                           myGraph[i].edges[j].dest = -1;
                           myGraph[i].edges[j].weight = INFINITY;
                           --myGraph[i].num_connections;
                    }
                    else
                           myGraph[i].edges[j].weight = 1 + rand()%100;

}

myGraph[1].edges[2].dest = -1;
mysGraph[3].edges[3].dest = -1;
myGraph[5].edges[0].dest = -1;
myGraph[7].edges[1].dest = -1;

< /SPAN > < /div>

The Relationship between Labelling & Statistic, Possibility, Tendancy

Labeling in path search is so important that virtually supersede pure search path itself. The reasons are very simple. Good labelling will significantly reduce the search effects (to emphasize, very dramatically). On the contrary, bad labelling will increase the workloads, and in turn suck-out all the computing resource on doing meaningless, repeatedly, hopeless searching.

As mention before, the worst case scenario will be the shortest path always oppose our labelling direction (axis xand y ). It's doesn't work well when we dealing with small graphs. However, when we dealing with big graph, the advantage of systematic labelling are very obvious.

Labelling is virtually equal to defining nodes' priority. Therefore, the important of labelling is undeniable. The following topics will discuss about how to systematically do labelling.

Method 1: Virtual landscape rendering
Treate the graph as a map. Draw the contours in a high contrast colour or draw Landscape Heightfield line. Then use your eye to do visual labelling.

landscape1

But obviously, this mothod is difficult to implement. Edges' weights in most cases are not consistent (not like geographical landscape), as result the virtual landscape we render might be quite confusing. However, there still might have chances this method works (e.g. country road map).

Method 2: Statistic
When we dealing with big graph (a lot of nodes), we can use statistic to do the labelling. As we mention before, a graph drawing should not be confused with the graph itself (the abstract, non-graphical structure) as there are several ways to structure the graph drawing. All that matters is which vertices are connected to which others by how many edges and not the exact layout. In practise it is often difficult to decide if two drawings represent the same graph. Depending on the problem domain some layouts may be better suited and easier to understand than others.

Should we take note that, estimation is just a prediction. It may not not always true and always able to label appropriately. Estimation is not equal to shortest path itself.

There are 2 ways to represent a graph - List Structures & Matrix Structurres.

Although most of the times matrix structures only symbolically representing there are connections between nodes in a graph. However to solve our estimated ranking problem, we will be implementing matrix structures.

In matrix structures, there are few parameters will be used: radius, angle. To simplify the problem, split nodes to layer-by-layer. In the most inner layers, path are alway uncertain. In most outer layers, path will be pointing to the destination. This is what called the tendancy.

Use the radius length to represent layers, use angle to represent where the nodes located. Therefore, geometric will be heavily utilized.

Inner layers are ussually higher priority over the outer layers.

When all the edges' weight are close to the sample mean and sample standard deviation, s is small
For this case, we can say all the nodes are almost alike.

This is the most simple form of graph we gonna work with. It is an exponenial decay problem.

Here are some basic rules before we try to build our formula:

Inner layer nodes always have higher priority over outer layer
Inner layers path tendency are mostly uncertain compare to outer layer
In outer layer, path is clearly prone to our destiny
Trace the path tendency by their layers & angles of the nodes' position

Path's tendancy in inner layers are uncertain.Therefore we will rank all the inner layers nodes highest priority, 0⁰~360⁰. In the outer the layers, path's tendancy becoming more obvious. Hence we can narrow down our angle.

General Equation:
ąAˇexp(-b/cn)

Currently I try to find the these parameters.

The web-site still under construction.

Author: Lee Kin Seng
E-mail: [email protected]

Counter

Node	#1	#2	#3	#4	#5	#6	#7	#8	#9	#10
Connections & Weight	#2 - 4 #3 - 8	#3 - 3 #4 - 7 #5 - 4	#2 - 3 #6 - 1 #7 - 3	#2 - 7 #5 - 4 #8 - 9	#2 - 4 #4 - 4 #6 - 3 #8 - 2	#3 - 1 #5 - 3 #7 - 7 #9 - 4	#3 - 3 #6 - 7 #9 - 3	#4 - 9 #5 - 2 #9 - 8 #10 - 5	#6 - 4 #7 - 3 #8 - 8 #10 - 6	NA