Chapter 5

Chapter 5

Data-structures

Of stacks and queues

And whether queues

Can be larger than a tree !

There is no need now to define what a data-structure is or why it is there as you are already familiar with it. There are four types of data-structures in C & C++. They are 1) Stacks 2) Queues 3) Linked list 4) Binary trees. They are used for various tasks like storing values, to provide dynamism in storage, for controlled data access and sorting respectively. Although we know what a data-structure is, do we know these types of data-structures? We do not. So, let us start by defining them. But, before we proceed, we must make note of this point that, even if we do succeed in defining them with precision we will need to implement these data structures programmatically to understand them fully.

Since data-structures are about data access only, let us consider the data operations microscopically by applying some set standards of data access in programming with C & C++. Data operations are at their optimum when all the data are simultaneously present in memory. And data structures is all about memory as storage space only. Sorting data demands or rather, is allotted, dynamic memory from the heap as stored data may and must often be accessed/shared by different functions within the sort. Dynamic memory is memory allotted at run-time. Static memory is memory allotted at compile-time. That is, the compiler can sometimes place a stack before a heap or vice-versa. As mentioned before, the CS, DS, SS or ES registers, although, are indicative of the address referred to, compilers do not necessarily address the heap through the ES or always address static data (eg. int a) through DS. Since binary trees are most often used for sorting, the heap may seem to be the most obvious storage space to know about, when tackling binary trees and the stack may seem the obvious storage space while tackling the stack data structure, but, as you must have understood by the above statement, it is not as simple as that.

It will be easier to understand which segment of the memory is used for which data-structure and why if we understand how memory is accessed when our program is first loaded into memory. Let us first consider the registers in each of the segment in memory.

‘X’ group

DS		AH	AL
DS		BH	BL
DS		CH	CL
DS		DH	DL
SS	BASE POINTER			BP
CS	INSTRUCTION POINTER			IP
SS	STACK POINTER			SP

Pointer group

16 bits or 28 bits

INDEX group Segmentation

Group

DS	DESTINATION INDEX	D1
DS	SOURCE INDEX	S1

CODE SEGMENT	CS
DATA SEGMENT	DS
EXTRA SEGMENT	ES
STACK SEGMENT	SS

FLAGS

Carry

Parity

Auxiliary Carry

Zero

Sign

Trap

Interrupted

Direction

Overflow

Carry: is true after an arithmetic operation results in a carry out of the high-order bit, or a borrow in subtraction .The carry flag also participates in certain shift and rotate instructions.

Parity: is true if the sum of the low-order 8 bits in an arithmetic operation is even.

Auxiliary carry: is identical to the carry flag except that it reflects a carry /barrow out of bit 3 in 8-bit operations.

Zero: is true whenever an arithmetic operation results in zero.

Sign: is true if an arithmetic operation results in a value with high order bit true. Thus the sign flag is true for negative results.

Trap: When True, this flag puts the register into a special "debugging" mode that returns program control to a predetermined location after each instruction.

Interrupt enable/disable: is the system master interrupt enable. When this flag is True, the register acknowledges interrupts after completion of the current instructions; when false, maskable interrupts are ignored.

Direction: String primitive instructions automatically increment or decrement their pointer operands after every iteration. A TRUE direction flag increments the operands, FALSE decrements them.

Overflow: This is the XOR of carries/borrows into out of the high-order bit following arithmetic operations and is used to detect magnitude overflow in signed arithmetic. Specifically, it is TRUE if there was a carry into or out of the most significant bit (MSB). It is FALSE if there was a carry into and out of the MSB.

You may wonder as to why we have branched off to registers & flags and segments. Chiefly because, these 3 elements compare data co-operation using memory and, understanding these basic elements will make our task of understanding the data structures much more easy.

Segment Registers and typical c storage allocation

High memory Rom Video buffer etc.
Unused RAM
Stack (grows down)
Free Memory Pool	SS
Heap (grows up)	HS
Static data	DS
Code
Low memory DOS ,interrupt vectors etc	CS

Code Storage (CS): This area of memory contains all the C instructions. The size of this area is constant during execution of your program.

Static Data Storage (DS): Variable declares static or variables declared out side a function are stored here. The size of the area is constant.

Allocation storage - the heap: The free memory not committed to the storage of static data and not reserved for the stack is available for allocation to the program during execution. A C program manages this space dynamically during execution by allocating the de-allocating blocks of storage, usually via the alloc family of library functions. Memory allocated from this free pool is referred to as the heap. The size of the heap therefore changes the program’s executions.

Automatic storage: the stack

Unless data inside a function is declared Static, it is placed in transitory, or automatic, storage on the stack. Stack storage differs from heap storage in one important way - the heap allocates data to the programmer while the stack allocates memory to the program. That is, data in stored on the heap as a result of the program’s design, but is stored on the stack as part of the compiler’s function process .The stack is used not only for the storage of data variables, but also to pass variables between functions.

In the context of our study of data structures, we must note a very important point about the stacks. The stack pointer (SP) is decremented after every push or a call to a function. As the amount of storage on the stack grows, the stack pointer will point to a memory location progressively lower in memory. So, a potentially dangerous situation can arise due to this dynamic growth of both the stack as well as the heap. To counter this dangerous situation, we use a stack overflow detection which means that the compiler will limit the size to which the stack will be permitted to grow. Before a function allocates automatic storage on the stack, it first tests whether the allocation will cause the stack to grow beyond its specified limits or "overflow". If granting the request will result in an "overflow", the program either issues a warning messages, terminates, or both. The memory break technique is used to restrain a memory overflow on the heap. This is a function called sbrk (a low-level, may be deprecated function by now) which would extend the boundary of the initially allocated free-pool memory but which can subsequently be retraced through another function called rbrk. Just like malloc() and free().

The stack data structure is a data structure modeled on the stack and although it uses the stack, it is not the stack as discussed above which is a portion of the many memory segments of the computer .These are 3 main operations of a stack data structure (to be referred as stack from hereon)

1.Push 2. Pop 3.Print

The push function pushes values into the stack .The pop function retrieves values and the print function prints the retrieved values. The stack, programmatically, is a structure, which has two members, a value which is the item pushed into the stack and can be of any basic data type according to the value to be pushed in and an integer variable which denotes the top of the stack.

Since there are three different function of a stack each to be represented by a computer function, the structure variable needs to be a pointer, as structures when passed through functions, can only be passed as pointers. Thus a static structure and its variables will be declared as follow.

struct stack{

int top;

char str [100];}

struct stack *st; //static variable

The push function will be as follows

void push(struct stack*s){

*s.top++;

*s.str[top]=str;// where str may be a global variable .

}

The pop function

void pop(){

struct stack *ss;

str=*ss.str[top];

*ss.top--;

print();

}

and the print function

void print(){

printf("%s",str;}

Since the last item pushed is the first item to be popped out (Last In First Out - LIFO) a stack is typically used for reversing a string or inverting an array. And since arrays form the basis of a stack, a static is also known to be a static data structure in that is cannot grow dynamically.

A linked list, on the other hand, is a dynamic data structure in that it can grow or shrink dynamically. A linked list is made up of nodes with each node pointing to the next. Since these nodes are added at run time, it is called a dynamic data structure. Let us see a linked list looks through a simple diagram.

Node #1 Node #2 Node#3 Node#4

As you can see, each node has two sections in it. An "Info" part and a "Next" part. The "info" part contains the values that needs to be stored and the "Next" part contains the address of the subsequent node except for the first node whose "Next" part will contain "Null". This is because as we traverse back we need to traverse back on a condition which is the "null" value in the first node which will denote that their beginning of the list has been reached. Let us now see how the linked list is implemented through a computer program. Again, a structure needs to be used as the data structure needs to have more than one element i.e the info part and the next part.

struct list{

char info;

list *next;

};

Here we have an unusual pointer. A pointer of type structure within the structure. It is not very complex to understand if we following simple reasoning. A box in the following diagram denotes the memory allocated to the structure with a hypothetical address.

Ox8000fh

This box is drawn by us as representing the memory allocated for the structure (since we must use a pointer to the structure and pointers need to be allocated memory; failing which your program may/will crash). A structure which is going to be passed through functions can only be passes as pointers, and hence, the memory allocated for it is the best place to start.

The address of the memory allotted is 0x8000fh. Let us consider this box as the first node of our linked list. As already noted, the first node’s "next " part needs to contains "null". So,

struct list* l1;

l1 = new list; // c++ syntax

l1= malloc(size of (list)); //c syntax

We have created a node through the above statements and we have also assigned Null to the "next" part.

0x8000fh

Info

l1--> next =Null;

The next node needs to be similarly built.

l1=new list;

Another box appears, whose address is 0x9000th and whose "next" part needs to point to the first node but as you have been, we have not stored the address of the first node; so, how do we assign the first node’s address to the ‘next" part of the second ? So that there is a link between the two nodes and our linked list can be built. Let us see through this complexity through the following simple program which swaps values between a integer variables.

void main(){

    int a, b;

    scanf("%d",&a);

    // Suppose 5 is entered

    scanf("%d",&b);

    // Suppose 7 is entered

}

Now we want that a should contains 7 and5 should be assigned to do. How do we accomplish this seemingly simple task ? If we say

a=b;

then the value of "5" in a gets overwritten by the value of b and hence we cannot say,

b=a;

because now a contains the value "7". So we need to use another temporary variable called c; The above program now needs to be modified a little bit.

void main(){

int a,b,c;

scanf(&a);

scanf(&b);

c=a;

a=b;

b=c;

}

Now a is 7 and b is c which contains "5"(c=a). Similarly, we will need to modify our linked list program, too.

struct list*l1, *start;

Now we have two pointers pointing to the same structure .Why do we need *start ? Simply for the purpose of swapping the addresses of each node. Let us again create a node and for the swap node.

l1 = new list; and

start = new start;

Node #1 Start

0x8000fh 0x9000fh

Instead of directly assigning Null to our node is next part, we will do the assignments through the start node’s next part, we will do the assignment through the start node’s next part as follows.

start-> next=null;

l1-> next =start next;

start-> next=l1;

This statement stores the address of the first node in the "next" part of the start node. So now, we are ready to go about the swapping. And when we create our next node in the linked list, we can link it to the first node as we have retained its address in our start node just like we did a’s value in the previous example.

Node #2

Ox1000fh

l1 -> next =start-> next;

start->>next=l1;

Now the second node’s address is in the "next" part of the "start" node. As already mentioned, the storage of null at the beginning of the list is chiefly so that we can have a condition for backward traversal through the list which means that we cannot traverse forwards. This would seem like a major limitation as logic or logical tools cannot be restricted; otherwise, computer programming will be greatly faulty. But we have identified the problem so we can simply assign another element (a "previous" part) to each node of the list which will point to the previous node. Let us see such a linked list diagrammatically

This is called a circular linked list.The structure list in this case, will look like below

struct list{

char info;

list *prev;

list *next;

}

The first node’s "previous" part will contain NULL as in the linear linked list but how will we now require two start two nodes for swapping addresses? Let us see. Let us create the first node and the start node.

list *li , *start;

l1 =new list;

start=new list;

and assign Null to the previous part as the first node.

start ->prev = null;

l1->prev=start->prev;

Now, to assign the next node’s address to this node is "next" part.

start =l1;

stores the address of the first node temporarily in the start node. Now ,when we create the second node,

l1 = new list;

We will store the first node’s address in the second node’s "previous" part.

l1->prev=start;

and "start " will be assigned the address of the seconds node But what of our first node’s "next" part? It still does not contain any value. If the "previous" part of each node points to another node and since our first node is the beginning of the linked of the linked list, its "next" part too may contain NULL as we do not need to seek any other node once we have reached the beginning of list . But we need the second node’s address once we have reached the beginning of the list so as to transverse forwards which is the aim behind creating a circular linked list so how do we assign the second node’s address to the first node? By creating a phantom linked list? Seems the most logical means given the above problem or by going to the first node by its address in the "start" node and storing the recently obtained second node’s address and then returning to the second node through the first node’s "next" part and then proceeding like wise for each node then on.

Binary trees

A binary tree is, again, a complex data-structure which can be used to implement sorting numbers or quite simply, any data that is sortable.

Logic, to be applied, needs to have a pattern. If there is no ordered pattern, logic cannot be applied and if applied, regardless, will have an impurity in form. Similarly, data to be ordered needs to be in a sortable form. If there is no scope, for even parameters in the data, the data cannot be sorted at all or it can be sorted only to a certain extent. Understanding such essentials are vital in understanding how and where to implement the data structures that we are studying. Creating a data structure is easy, but to implement it for good is the real goal behind our study of the data-structures. Data structures are one of the most powerful tools available to a programmer and if utilized to its optimum power, can yield some fantastic solutions to seemingly impossible problems.

A binary tree can best be understood in the form of a pictorial representation of its structure.

A binary tree

If the number is less than the current node then assign it to the left node of the tree and if it is greater than the current node assign it to the right node of the tree and so on.

The structure of the binary tree is as follows

struct binarytree{

                                        int num;

                                        binarytree *rightnode;

                                        binarytree *leftnode;

};

If the next number is 46 which is less than 57 but greater than 32 and another number is greater than 52 but less than 61 then the tree further expands as follows

Hosted by www.Geocities.ws