TOPIC 4.2: DATA STRUCTURES
INTRODUCTION TO DATA STRUCTURES
- All data processing on a computer involves the
manipulation of data.
- This data can be organised in the computer's
memory in different ways according to how it is to be processed, and the
different methods of organising data are known as data structures.
- Computer languages such as Pascal have built-in
elementary data types (such as integer, real,
Boolean and char) and some built-in
structured or composite data types (data structures) such as record,
array and string. These composite data types are made up of a
number of elements of a specified type such as integer or real.
ALGORITHMS AND PSEUDOCODE
- An algorithm is a sequence of instructions to
solve a given problem.
- There are many different methods for writing
down algorithms inducing flowcharts, structure diagrams and pseudocode.
- Pseudocode is an intermediate stage between
plain English and the programming language in which the solution will
eventually be coded - it enables the writer to concentrate on the steps in the
solution without worrying about the syntax rules of a particular language.
PSEUDOCODE ALGORITHMS
- Typical pseudocode constructs include the
following:
IF..THEN..ELSE..ENDIF
CASE..OF..ENDCASE
FOR..TO..ENDFOR
REPEAT..UNTIL..
WHILE..DO..ENDWHILE
PROCEDURE..ENDPROC
FUNCTION..ENDFUN
DATA TYPES
- Data that is to be manipulated by a computer may
be held and organised in several different ways, depending on how it is to be
used.
- Programming languages such as PASCAL allow
variables to be declared as integer, real, char (character) or Boolean (true
or false). These are called as primitive, simple, unstructured or elementary
data types.
- A language is said to be strongly typed if the
type of every object that stores data has to be declared before it can be
used.
- Consequently the compiler can detect and report
illegal use of data types.
- For example, if variables A and B are declared
as type real, and C is declared as type string, the compiler of a strongly
typed langue should detect a type error in a statement such as
A := B/C; since C is not a real or integer variable
- Checking done by a compiler is said to be
static, while checking done when the target program runs is termed
dynamic.
STATIC AND DYNAMIC DATA
STRUCTURES
- Using primitive data types such as integers and
characters, composite data types or data structures such as arrays and records
may be built up by the programmer using language-supplied constructors, e.g.
Array [..] Of ...., Record.....End.
- Data structures such as arrays are called 'static
data structures' because they are defined within the program as being
of a certain size, and the space is reserved in memory to hold the array.
- Other data structures such as stacks, queues,
lists, linked lists and binary trees are said to be 'dynamic data
structures' because they can get bigger or smaller during the
execution of the program.
ARRAYS
- An array is defined as a finite ordered set of
homogeneous elements.
- Finite means that there is a specific number of
elements in the array.
- An array is a set of variables of the same
type, such as integer or real.
- The number may be large or small but it is fixed
and it must be at least one.
- Thus an array is a static data structure.
- Ordered implies that there is a first, second,
third etc element of the array.
- An array x consisting of fifty integers may be
declared in Pascal as follows:
| var |
| x : array[1..50]
of integer; |
- Arrays may have several dimensions. In Pascal, a
one-dimensional array to hold monthly sales values could be declared as:
| Sales : array[1..12]
of Real; |
- The third element of this array would be
refereed to as Sales[3].
- An equivalent pseudocode declaration would be
written something like 'Sales is an array of 12 real variables'.
MULTI-DIMENSIONAL ARRAYS
- The concept of a one-dimensional array can be
extended to two or more dimensions.
- A two dimensional array may be declared in
Pascal as, for example,
| x : array[0..10]
of array [1..6] of integer; |
- The declaration can be abbreviated to:
x : array[0..10, 1..6]
of integer;
- A two dimensional array to hold 12 values for
each of 5 regions could be declared as
Regional_Sales : array[1..12, 1..5]
of Real;
- The sales for the third month and fourth region
would be referred to as Sales[3,4].
RECORD
- A record is the most common data structure in
computerised data processing.
- A record is a more complex data structure
than an array in that the individual elements may be of different types; a
personnel record on a disk file.
- For example, may have fields as follows:
| Name: |
String; |
| Sex: |
Char; |
| Salary: |
Real; |
| Department: |
Integer; |
- In order to read a record into memory, space
must be set aside to receive it, and the Record data structure is used.
- If several records need to be held in memory
simultaneously, perhaps to be sorted into a particular sequence, then an
Array of Records (a table) can be defined.
- Many languages such as COBOL and PASCAL (but not
BASIC) have a built-in composite data type record.
- Records are somewhat similar to arrays in that
they consist of a number of related data items, but they differ in that the
data items or fields may have different types.
- A student record might have the following
fields:
| name |
address |
course |
date_of_birth |
| Ben Ramsey |
17 Elmer Street, Ipswich |
PD7015 |
170873 |
- The field identifiers here are name,
address, and so on.
- We also need to be able to identify the entire
record, may be the name student.
- We can then refer to a particular field of this
record using the 'dot' notation, i.e.
| student.name, student.address, etc |
FIXED AND VARIABLE LENGTH RECORDS
- The records discussed so far have all been
fixed length records.
- If all the records belong to a file, every
record in the file will have the same fixed number of fields and characters.
- With variable length records, all the records in
the file ill not be the same size.
- This could be for one or both of the following
reasons.
- The size of some of the fields coould vary - e.g. a field containing a
name or an address.
- The number of fields could vary. For example, a customer's invoice
record could have a filed )or several fields) for each item that he has
purchased.
- There is no easy way in Pascal of implementing
variable length records, since records are static data structures and
the space is allocated in the computer's memory at compilation time, not at
run time.
- The advantage of using fixed length
records, therefore, is that they are the simplest for the programmer, as the
fixed-length record data structure is a built-in feature of the language.
- The disadvantage of fixed length records
is that they can be very wasteful of space if individual records have fields
which vary in length, or which have varying numbers of fields.
LINKED LIST
- A linked list is a dynamic data structure
consisting of a number of nodes.
- An external pointer points to the first node,
and each node holds data plus a pointer to the next node in the sequence.
- A linked list may be implemented in some
languages such as PASCAL and C using dynamic storage allocation, whereby
whenever a new node is required, it is created dynamically on the spot during
program execution, without having to previously reserve a certain amount of
space for the list.
- It can also implemented using records in a file
held on disk, or in memory using a table (array of records).
- This implementation is static rather than
dynamic, but being relatively straightforward, it will be used to demonstrate
the general principles of adding, deleting and printing elements in a linked
list.
- Fig.12.1
USES OF LINKED LIST
- A linked list is a useful data structure for
keeping data items in a particular sequence as they are added, without having
to sort the items.
- For example in a customer accounts application,
all purchases made by a particular customer during a month could be held as
application, all purchases made by a particular customer during a month could
be held as a linked list in date sequence, ready for printing on the monthly
statement.
- A file of Start pointers giving the first record
for each customer could be separately maintained.
- A linked list may also be used to implement
another data structure called a queue and used in queuing applications.
STACKS
- A stack is a dynamic data structure which has
the special property that it can only be accessed at one end, like a stack of
plates in a cafeteria.
- It is known as a last-in first-out (LIFO) data
structures.
| |
|
| |
| |
| Mark |
←
Top of Stack |
| Daniel |
|
| Roxanne |
USES OF STACKS
Stacks are used in many different situations in
computing, for example:
- to store return addresses, parameters and
register contents when subroutines are called. When the subroutine ends, the
address at the top of the stack is popped and the computer continues execution
from that address;
- in evaluating mathematical expressions held in
reverse Polish notation, i.e. a form of notation used by compilers as an
intermediate step in translating expressions such as A := (B*C) + D/E.
QUEUE
- A queue is a first-in first-out (FIFO) data
structure; new elements are added tot he rear of the queue, and elements leave
from the front of the queue.
- A queue can be implemented as an array with a
pointer to the front of the queue and a pointer to the rear of the queue.
- An integer holding the size of the array (the
maximum size of the queue) is needed, and it is useful to have an extra
variable giving the number of items currently in the queue.
MaxSize = 6;
After 2 people have left the queue and 3 more have
joined, the queue will look like this:
Only 4 people are in the queue but the end of the
array has been reached. To overcome this problem, the queue may be implemented
as a circular queue, so that when the next person joins she enters at the
front of the array:
| Bryony |
|
Davin |
Jon |
Ian |
Georgina |
USES OF QUEUE
Queues are used in a variety of applications such
as:
- holding jobs waiting to be run by the computer;
- a keyboard buffer, to allow a whole line to be
typed and edited while the processor is busy doing something else;
- spooling output on to a disk to await printing.
RESOURCES:
1) P M Heatcote & K R Bond, [A Level
Computing], Letts Educational Ltd, 1997.
2) P M Heatcote, [A Level Computing, 3rd
Edition], Letts Educational Ltd, 1998.