




An Introduction to Computers


Contents
Computer Hardware
Computer Development
Computer Reliability
Data Archiving & Backups
Application Programs
Programming & Languages


The main components of a modern computer are:
[Diagram: the components of a computer system]


General Overview of Computer Operation

When a computer is operating it takes in information, from a keyboard, an existing list stored on a disk, or some other external source. It then processes this information by following a set of instructions which are held in the temporary storage, having been transferred there from permanent storage. The temporary store is also used to keep intermediate results. Finally the results are sent to an output device such as a television type screen or printer, or they may be put back into the permanent store. The physical parts of the computer are known as the hardware and the list of instructions as the software.

Return to Contents List

Input Devices

Despite the advances in computer electronics, the main way of interacting with a computer is still a keyboard, using a key layout inherited from the early mechanical typewriters. Once information has been typed in it can of course be stored for future use, in which case a disk drive acts as an input device.

Each character typed on the keyboard is assigned a number between 0 and 255, since all a computer can understand is numbers. When results are displayed on e.g. a printer the computer sends these numbers to the printer which then converts them to the actual characters to be printed. The 256 different possible numbers permit all the letters, digits and punctuation marks to be stored as well as a range of foreign letters and mathematical symbols and various 'control codes' such as marking the end of a line.
The standard system of converting characters to numbers starts with space=32, followed by the digits 0-9 and punctuation symbols, then A=65, B=66... a=97, b=98, c=99 etc., and is known as ASCII (American Standard Code for Information Interchange). Codes below 32 are the control codes, and codes from 128 to 255 are the foreign letters and graphics symbols; these may vary from one computer to another. When computers are asked to sort words into alphabetical order they frequently sort on the ASCII code, with the unexpected result that, for example, Zygote comes before aardvark (because capitals come before lower case in ASCII sequence).
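
For example, in the Python language (used here purely for illustration) the built-in ord() and chr() functions convert between characters and their code numbers, and sorting a list of words shows the 'Zygote before aardvark' effect:

  print(ord(' '), ord('A'), ord('a'))    # 32 65 97
  print(chr(66), chr(98))                # B b
  print(sorted(['aardvark', 'Zygote']))  # ['Zygote', 'aardvark']
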
A different coding scheme is now coming into use, known as Unicode, which has 65536 different codes giving plenty of space for all the foreign characters such as Chinese pictograms.

For more specialised applications computers may also take information from electrical sensors such as thermocouples or electronic balances but the electrical signals must still be converted to numbers before they are fed to the computer.

Return to Contents List

Digression: Binary Numbers

We are accustomed to using the decimal system of numbering which needs ten symbols, 0 to 9. Computers have to handle digits in the form of electrical voltages and it has not proved practical to build a computer which can recognise ten different voltage levels because electrical resistance in the circuits will lower the voltage, causing digits to change. It is much simpler to use just two voltages, typically zero and around five volts, there then being little chance of the two voltages being mistaken. This means that computers have just two digits in their numbering system, 0 and 1, i.e. it is a binary system.

Binary numbers work in the same way as decimal except that instead of digit values increasing by a factor of ten when moving one place to the left they increase by a factor of only two. This means that whereas decimal numbers are made up of units, tens, hundreds, thousands etc., binary numbers consist of units, twos, fours, eights etc.

For example, 14 = 8+4+2 and in binary would be written as 01110 representing
0     1     1    1     0
16s   8s    4s   2s   units

Binary numbers are longer than their decimal equivalents but the arithmetic is actually simpler (the multiplication table stops at 1x1=1).

e.g. 6+3 in binary is

  110 (6)
 +011 (3)
  ----
 1001 (9)

Any carries in binary addition are of two rather than ten. (There are never any carries when multiplying two single digits.)
The arithmetic in a digital computer is carried out with logic gates. Multiplying two digits together is equivalent to the AND operation since the result is only 1 if the first number and the second number are 1. Addition can be carried out using an XOR (eXclusive OR) gate together with an AND. The result of adding two digits is 1 if the first digit is 1 or the second is 1 but not both. The carry is the result of ANDing the digits.
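
As a rough sketch of this (in Python, for illustration only), the half adder described above can be written as a pair of logic operations and then used to add two small binary numbers digit by digit, reproducing the 6 + 3 example above:

  def half_adder(a, b):
      # the sum bit is the XOR of the two digits, the carry is the AND
      return a ^ b, a & b

  def add_binary(x, y):
      # add two numbers one binary digit at a time, carrying twos rather than tens
      result, carry = 0, 0
      for position in range(8):              # eight bit positions is enough here
          a = (x >> position) & 1
          b = (y >> position) & 1
          s1, c1 = half_adder(a, b)          # add the two digits
          s2, c2 = half_adder(s1, carry)     # then add the carry from the right
          result |= s2 << position
          carry = c1 | c2                    # a carry from either stage
      return result

  print(bin(add_binary(0b110, 0b011)))       # 0b1001, i.e. 6 + 3 = 9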

The information stored in a computer, whatever it may represent in the real world, is held in the form of binary numbers.
'Binary digit' is often abbreviated to 'bit'. Computers usually store and process information in groups of eight binary digits, known as a byte. The 256 different numbers used to represent characters just fit in eight bits (because 2 to the eighth power = 256). Binary fractions are also used, in the same way as decimal fractions, so that e.g. 0.11 in binary means ½ + ¼ = ¾.

When decimal numbers are to be stored there are two ways to do it. The decimal digits can be stored in the form of their ASCII codes or the numbers can be stored in pure binary form. The binary form is more compact and numbers must be converted to this form if they are to be used in calculations. If numbers are stored in binary they will need to be converted back to decimal when they are displayed and there is no guarantee they will be displayed in the format the user expects. (e.g. the fraction ¾ could be written as 0.75, .75, 0.750, 7.5E-1 etc. )

A common problem which often puzzles newcomers arises when comparing numbers which should be equal, such as 5000 x 0.1 and 1000/2. A computer will often report that they differ by a tiny amount. This is because computers only use a fixed number of digits for their arithmetic and just as there are fractions such as 1/3 which cannot be written exactly in decimal, there are fractions which cannot be written exactly in binary, so that there is a slight error in the stored form of the number. Unfortunately the common fractions 1/10 and 1/100 are among these. This produces small errors in any calculations using these fractions, just as in decimal 0.33333 x 6 = 1.99998 instead of 2.
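
The effect is easy to demonstrate; e.g. in Python (any language using ordinary binary arithmetic behaves the same way), since a tenth has no exact binary form:

  print(0.1 + 0.2 == 0.3)   # False
  print(0.1 + 0.2)          # 0.30000000000000004
  print(sum([0.1] * 10))    # 0.9999999999999999 rather than 1.0
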
Pocket calculators avoid this problem by using binary coded decimal which simulates calculating in decimal. In binary coded decimal (BCD) each group of four bits represents a decimal digit from 0 to 9, the other six possible values not being used. The arithmetic is carried out as if pure binary were being used but after each calculation a correction step is necessary to bring the digits back into the range 0 to 9.

e.g. 26 + 47 = 73
   0010  0110 BCD                  2  6 decimal
 + 0100  0111 BCD                  4  7 decimal
 = 0110  1101 uncorrected BCD      6  13 uncorrected decimal ('sixty thirteen')
 = 0111  0011 corrected BCD        7  3 corrected decimal

Binary coded decimal is useful if results must not contain any errors, e.g. financial calculations, but is slower than binary because of the corrections.
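
As a sketch of the correction step (in Python, for illustration only; each decimal digit is held as a small integer standing in for a four-bit group):

  def bcd_add(a, b):
      # add two numbers held as lists of decimal digits, most significant first,
      # applying the +6 correction whenever a group goes outside 0 to 9
      result, carry = [], 0
      for da, db in zip(reversed(a), reversed(b)):
          s = da + db + carry          # plain binary addition of the two groups
          if s > 9:                    # out of decimal range
              s = (s + 6) & 0b1111     # correct by adding six, keep the low four bits
              carry = 1
          else:
              carry = 0
          result.append(s)
      if carry:
          result.append(carry)
      return list(reversed(result))

  print(bcd_add([2, 6], [4, 7]))       # [7, 3], i.e. 26 + 47 = 73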

Return to Contents List

Temporary Storage

This is used to store the information that the computer is currently working on and the list of instructions that it is currently following. The key features are that the processor can get at the information very quickly, in less than a millionth of a second with current designs, and that the information can just as easily be changed by the processor. It is often called Random Access Memory (RAM) because any part of it can be reached as easily as any other. In modern computers it consists of either transistor type circuits which act as electronic switches which can be either on or off (i.e. binary) or of capacitors which can be charged or discharged (known as Static RAM and Dynamic RAM respectively), packed at one million or more per silicon chip.

Computers also usually contain a small amount of memory whose contents are fixed, known as Read Only Memory (ROM). This is used to start the computer when it is first switched on and contains instructions (often called the BIOS or Basic Input Output System) tailored to the particular hardware in that computer. ROM is necessary because RAM loses its memory if the power is disconnected for even a fraction of a second. (There may also be a very small amount of RAM which is powered by an internal battery to store hardware settings such as the screen resolution - CMOS RAM.)

A further type of memory which is usually present in newer machines is cache RAM. Since the first microprocessors appeared in the early 1970s they have become several thousand times faster. This means that they need to access memory very quickly, but there is a limit to how fast the memory can be read, known as its access time in nanoseconds. The access time of conventional RAM has only decreased from around 300ns to at best 60ns despite the vast increase in speed of processors over the same period. Even 60ns memory cannot be read from quickly enough to keep up with modern processors which means the processor would need to pause frequently to wait for the memory.
However, much faster memory is available, but at too high a price to use for the whole of the main memory. The compromise is to have a small amount of fast 'cache' memory ('cache' meaning something hidden from view) in addition to the main RAM. Then when data is read from main memory it is also stored in the cache, so that if it is needed again it can be accessed more quickly. Computer programs tend to work on relatively small amounts of data or short sections of instructions at a time, which results in a high probability that the required data is already available in the cache.
The details of tracking what data is in the cache and ensuring that changes get written back to main memory make cache memory quite complex but with a fast processor it can increase running speed several times. Often there is a small and very fast cache built into the microprocessor itself and known as level 1 cache plus a larger but slower external cache known as level 2.
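
A rough sketch of the idea (in Python, for illustration; a real cache works on blocks of bytes in hardware rather than single values) is a small table sitting between the processor and main memory:

  class Cache:
      # a toy direct-mapped cache: each slot remembers one address and its value
      def __init__(self, memory, slots=8):
          self.memory = memory
          self.slots = [None] * slots
          self.hits = self.misses = 0

      def read(self, address):
          slot = address % len(self.slots)     # which slot this address maps to
          entry = self.slots[slot]
          if entry is not None and entry[0] == address:
              self.hits += 1                   # already in the cache - fast
              return entry[1]
          self.misses += 1                     # not there - go to main memory
          value = self.memory[address]
          self.slots[slot] = (address, value)  # keep a copy for next time
          return value

  memory = list(range(1000))
  cache = Cache(memory)
  for _ in range(100):             # a 'program' that keeps working on the same
      for address in range(4):     # small area of memory
          cache.read(address)
  print(cache.hits, cache.misses)  # 396 4 - nearly every access is a hit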

Other types of memory which may be encountered are:-

  • PROM - Programmable Read Only Memory which is initially blank but can be set to fixed values by a pulse of electrical current.
  • EPROM - Erasable Programmable Read Only Memory which is similar to PROM but can have its contents erased by exposure to ultraviolet light and can then be reused.
  • EEPROM - Electrically Erasable Programmable Read Only Memory which can be erased simply by sending the correct electrical signals into it.
  • Flash RAM - This is a newer version of EEPROM which is easier to re-use and is now widely encountered as USB-connected (see later) 'pen drives'. These compact storage devices have a large capacity, no moving parts and are quite cheap. They are often used for transferring information between computers instead of floppy disks or CDs. Flash RAM can typically be erased and re-recorded about 100,000 times before it wears out.

The memory is usually divided into blocks of eight binary digits or the multiples of eight - 16, 32, and 64 - which are handled as a unit. There is no special significance to eight except that it is a power of two and is therefore convenient to use in a binary system. In general the more bytes of memory a computer has the better it performs. In the early 1980s RAM was expensive and computers had only a few thousand (kilobyte or K) to a few tens of thousands of bytes. However since 1980 the price of memory has dropped by a factor of several thousand and a desktop computer normally has at least several tens of millions of bytes (megabyte or M). The next unit up is the gigabyte, which is a thousand megabytes. Each byte of memory can store one character of text or about two and a half digits worth of a decimal number.

Return to Contents List

Permanent Storage

Since the contents of random access memory are lost when the computer is switched off it is necessary to have some way of storing information in machine readable form so that it can easily be put back into RAM the next time the computer is used. Early computers used punched paper tape, the presence or absence of holes representing the binary digits, but modern computers normally use some form of magnetic recording. One of the first was magnetic tape but this has the disadvantage that to reach something near the end of the tape it is necessary to fast forward through all the preceding tape - a slow process and hence tapes are not random access. However tapes are still sometimes used to make back-up copies because although they are slow they are also reliable.

Return to Contents List

Disk Drives (sometimes spelled disc)

Disk drives store information on a disk coated with the same type of magnetic particles as magnetic tape uses. The disk rotates at high speed and a read/write head moves radially across it. The advantage of a disk is that the recording head can reach any part of it quickly.
Floppy disks are made of flexible polyester in a protective plastic case and are now almost invariably the 3½ inch size, the older 8 inch and 5¼ inch types being obsolete. They are usually double sided, meaning that there is a read/write head above and below and both sides have a magnetic coating. The information is stored along concentric circles called tracks which are each divided into sectors. Disk drives are available in several densities, referring to the number of tracks (usually 40 for low density and 80 for high density) and the number of sectors each is divided into (typically 9 or 18). Since each sector stores the same amount of information (usually 512 bytes), high density disks can store more but need to be of a higher quality, with a finer magnetic coating.
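
Multiplying these figures together gives the familiar capacity of a high density disk; e.g. as a quick Python calculation:

  sides, tracks, sectors, bytes_per_sector = 2, 80, 18, 512
  capacity = sides * tracks * sectors * bytes_per_sector
  print(capacity)          # 1474560 bytes
  print(capacity / 1024)   # 1440.0 kilobytes - marketed as a '1.44 MB' disk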

Externally the only difference between a low density and a high density 3½ inch disk is an extra hole in the high density one, through which an actuator on a switch passes when it is in the drive, so that the drive can tell which it is. In the early days the high density disks were quite expensive and some people actually went to the trouble of drilling holes in low density ones to fool the drive into using them as high density. This was not a good idea because although the disk often worked initially the recording tended to fade with time and become unreadable. (Due, for example, to the lower coercivity of the coating on low density disks.) The process of setting out the pattern of tracks and sectors on a blank disk is called formatting.

Hard disks operate in the same way but use a rigid aluminium or glass (which are non-magnetic) disk with a magnetic coating. (Hence hard disk). In practice they normally have a stack of disks on a single spindle, with read/write heads in between. Floppy disks can be removed from the computer and different ones inserted whereas hard disks are a complete unit with the drive mechanism and are therefore sometimes called fixed disks. However hard disks can store much more information (typically several hundred megabytes to over 100 gigabytes) than a floppy disk which is limited to around 1 megabyte.
Hard disks also spin continuously while the computer is switched on, while a floppy disk has to be spun up to speed before it can be read. Hard disks are much faster in use than floppies.
In order to pack the extra capacity onto hard disks the read/write head has to be positioned very close to the disk surface, which is rotating at high speed. The heads are aerodynamically shaped to 'fly' just over the surface of the disk. If the computer is given a sudden jolt the head can touch the surface and cut a hole in the magnetic coating, losing whatever information is stored there. Hence a computer should not be moved too much while the disk is running.

A 1973 IBM design had two 30Mb disks and was known as the '30-30'. It then became known as a 'Winchester' disk after a Winchester rifle where one 30 is the calibre and the other is the grain weight, and hard disks are still occasionally called Winchester drives.

A more recent development in mass storage is the optical disk or CDROM which is similar to an audio CD and can store six hundred megabytes. They can be used in multi media devices to include sound and television playback with computing facilities. Some optical disks are read only but writable disks are also available and have largely replaced floppy disks for transferring data about, mainly because of the much greater capacity of CD ROMs. At present though compact discs are slower to access and read than hard disks, and have lower capacity, so are not likely to oust magnetic hard disks in the near future. Data is sometimes copied from the CD to the magnetic disk before use to speed up access.
DVDROMs work in a similar way to CD ROMs but have an increased capacity, of several gigabytes, obtained by using a shorter wavelength of light to read the disc so that data can be packed closer together, plus they can be multi-layer and double sided.
None of these forms of 'permanent' storage is totally reliable and one computer adage says "if it's not on a box of stripey green paper it's not backed up".

When a computer is operating it is often working on more information than will fit into its RAM and it may then transfer blocks of memory to and from disk as needed. The use of disk storage in this way is called virtual memory. A hard disk is needed here because of its extra speed, although this is still slower than if everything fitted into RAM.
A RAM disk is an area of the computer's memory which is set aside to act like a disk drive. Files can be copied to it from a real disk drive and the RAM disk provides temporary storage for program output. Since the data is held in RAM it can be accessed much more quickly than even a hard disk, which is useful for programs which require a lot of swapping between files on disk. Remember that all the information on a RAM disk will be lost when the computer is switched off unless it is explicitly copied to a genuine disk.

Return to Contents List

Output Devices

A television type display (VDU - Visual Display Unit) is now almost universally used for interactive purposes. True computer monitors give a better image than domestic televisions. Until recently most monitors used cathode ray tube (CRT) technology and were heavy and bulky. Portable computers and most new desktop computers use a liquid crystal type of display.
Characters or drawings on the screen are made up of dots or pixels (picture elements), which corresponds to the way television pictures are produced by scanning an electron beam across the screen in horizontal lines and varying the intensity as it moves. Computer displays are usually bit-mapped, meaning that there is a block of RAM in the computer set aside for the display and each pixel on the screen is represented by a group of bits in this memory. The computer displays something on the screen by manipulating the bits that correspond to where it is to appear on the screen. (Some displays are text only and only the ASCII code of the characters is stored rather than the pattern of dots needed to make them up. This is more economical in memory use but cannot display pictures. Now that RAM is cheap text-only displays are seldom seen.)

The number of bits used to store each pixel determines how many colours or shades of grey can be produced. If only one bit is used then only two colours are possible (the bit can be 0 or 1) and the display is monochrome. Images intended to be as realistic as photographs can use 24 bits, in 3 groups of 8, to allow 256 intensities for each of the primary colours red, green and blue which gives a total of over 16 million different colours. However this uses a lot of memory and processing time and many systems compromise by allowing a smaller number (e.g. 256 or 32,000) of different colours on the screen at once, and sometimes the smaller number of colours can be chosen from a larger palette of several thousand.
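
The amount of display memory needed follows directly from the resolution and the number of bits per pixel; e.g. (a quick Python calculation, with typical figures chosen for illustration):

  for width, height, bits in [(640, 480, 1), (800, 600, 8), (1024, 768, 24)]:
      bytes_needed = width * height * bits // 8   # eight bits to the byte
      print(width, 'x', height, 'at', bits, 'bits per pixel:', bytes_needed, 'bytes')
  # 640 x 480 at 1 bits per pixel: 38400 bytes
  # 800 x 600 at 8 bits per pixel: 480000 bytes
  # 1024 x 768 at 24 bits per pixel: 2359296 bytes
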
The refresh rate of a display is sometimes quoted. Since the picture is built up line by line, there is a pause after the whole screen has been scanned before each line is redrawn, and during this time the image on a cathode ray tube fades slightly. This results in a slight flickering of the image, which is most noticeable when seen out of the corner of the eye.
The first home computers, using a normal television as the display, were limited to only redrawing the picture 50 times a second (a refresh rate of 50 Hz) which can produce quite bad flicker. A purpose designed computer monitor normally has a higher rate of 60 or 75 Hz and may also have long delay phosphors, resulting in a much steadier picture.

Displays are also referred to as interlaced or non interlaced. A television picture is made up of 625 lines but is produced by first scanning all the odd numbered lines and then 1/50th of a second later scanning all the even numbered ones. Hence the picture is made up of two interlaced half pictures, repeated 25 times a second. (This is done to reduce the amount of information which needs to be broadcast whilst maintaining good picture resolution.) One result of this is that any one scan line is only refreshed on every alternate scan, i.e. every 1/25th of a second. This can be a problem with graphic displays if the vertical resolution is one television line since it will flicker at 25 times a second - producing a shimmering image.
For this reason early computers which were designed to use a TV normally limited their vertical resolution to less than 300 lines so that each computer line consists of two TV lines, each of which is scanned at 50 Hz. Modern computer monitors however are normally non interlaced so that each scan line is refreshed on every scan and a steady picture is produced even at high resolutions.

Return to Contents List

Printers

Printers come in several varieties depending on exactly how the ink is deposited on the paper. A dot matrix printer has a print head containing fine pins (9 or 24) which press onto the paper through an inked ribbon to form characters as a pattern of dots in the same way as a VDU. (Strictly speaking nearly all printers print characters in the form of a grid of dots but the term 'dot matrix printer' normally only refers to the type just described.) The printing tends to be of a poorer quality than that produced by other designs since the dots are visible but these printers are cheap, fairly fast and are capable of printing in a variety of typefaces and of printing simple graphics. They are also the only common printers which can be used with carbon paper. Colour dot matrix printers were made, which used a banded ribbon, but they could only really cope with simple illustrations and did not give good results when printing photographic pictures.

The hammering of the pins onto the paper is noisy and the more popular, now almost universally used, alternative is the ink jet (or bubble jet) which squirts droplets of ink directly onto the paper and is almost silent. Ink-jet printers produce clearer type than dot-matrix printers and do not suffer from steadily lightening print as the ribbon is used up. With suitable paper they can produce coloured prints almost as good as photographs but the downside is that the ink cartridges tend to be phenomenally expensive for the volume of ink they contain, and can dry up if not used regularly.

The now obsolete daisywheel printers had pre-formed letters as on a typewriter, arranged like petals around a hub, and gave the same quality of output as an electric typewriter. They could not print pictures but it was possible to change the typeface by swapping the print wheel.

Laser printers deposit ink powder (toner) electrostatically on a coated metal drum and then roll it on to the paper in a similar manner to photocopiers. They can print quickly since they print a full page at a time, give good quality results and are fairly cheap to run. Colour laser printers are now available and the price is falling so that they are a good alternative to ink-jet types for anyone who does a large amount of printing.

The computer either tells the printer which characters to print by sending their ASCII code numbers, which the printer then converts into dot patterns, or else it sends the dot patterns directly. The former method restricts the typefaces (fonts) to those which the printer knows about, whereas the latter allows any style of text or pictures to be printed, but at a slower speed because much more information needs to be processed and sent to the printer.
The two older means of communicating the codes (which are binary numbers in 8-bit bytes) were to send the bits one at a time down one piece of wire, known as a serial or RS-232C interface, or to send all eight at once down eight wires, known as a parallel or Centronics interface. In practice both systems need several extra wires to control the flow of information. Printers were normally connected via parallel interfaces because the information could be sent more quickly. Serial ones were used when a longer range was required, as parallel interfaces are unreliable if the connecting cables are more than a few metres long. There are standards for the wiring of plugs and sockets and for what each signal means but unfortunately manufacturers do not always keep to the standards, and getting a computer and printer to work correctly could sometimes be a matter of trial and error, particularly with serial cables.
Most modern printers are connected via the Universal Serial Bus (USB) which is a high speed serial system using four wires, although only computers made since about 1999 are likely to have suitable sockets and operating system support. USB has largely replaced RS-232C as a means of connecting peripherals, though it is limited to cable lengths of a few metres unless special repeaters are used. RS-232C can cope with cables up to 50 or even 100 metres long.

The baud rate which is quoted for RS-232C interfaces is a measure of how fast information can be sent along it, being roughly equivalent to bits per second. (The term baud has nothing to do with binary digits but is derived from a name, Baudot.) Since the computer can send characters quicker than the printer can print them the printer often contains its own memory (a buffer) to store them until they can be printed, freeing the computer to do something else. (Storing items to be printed in an intermediate memory is also called print spooling, either from the days when the memory would have been a spool of magnetic tape or standing for Simultaneous Peripheral Operation OnLine.)

Return to Contents List

The Processor

The processor or Central Processing Unit (CPU) is the 'intelligence' of the computer. It performs arithmetic, makes decisions on what to do next based on the result, and generally controls the flow of information to and from the memory, disk drives etc. When the computer is operating the CPU follows a list of instructions stored in RAM - the program (not programme). The general order of events for each instruction is:-

  1. Fetch an instruction from memory
  2. Decide what type of instruction it is
  3. Fetch any more information needed from memory
  4. Carry out the instruction
  5. Store the result (in RAM or special memory within the CPU)
  6. Move on to the next instruction

The instructions themselves are quite simple, such as adding together two binary numbers, transferring bytes to and from memory and jumping to a different part of the program depending on the result of a calculation. One difference between large mainframe computers and desktop models is in the complexity of the processor. Mainframe CPUs can usually handle large numbers with fractional parts (floating point numbers) whereas the simpler microprocessors in early desktops could only handle small whole numbers. (To do calculations on larger numbers a microprocessor has to perform several instructions and build up the answer in stages, similar to long multiplication by hand.)
The clock speed of a processor (e.g. 50MHz) is the number of the above steps that it carries out per second. It is not in general the number of instructions carried out per second, since each instruction may require several clock cycles. (e.g. a 50MHz processor may do 50 million steps per second but only around 10 to 20 million instructions, depending on their type.) Early microcomputers generally managed a few hundred thousand instructions per second, whereas mainframes use a technique called pipelining, where the next instruction is started before the current one is finished, and also have higher clock speeds, allowing several thousand MIPS (millions of instructions per second) under the right circumstances. It has to be said however that recent microprocessor designs are blurring the distinction between mainframes and desktops. Many current microprocessors have built-in floating point units and use pipelining, so that the latest desktop microcomputers have the same computing power as mainframes of ten to fifteen years ago.
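
The fetch-decode-execute cycle can be mimicked in a few lines; e.g. the Python sketch below runs a made-up processor with just five instructions (the instruction set and program are invented purely for illustration):

  LOAD, ADD, STORE, JUMPNZ, HALT = range(5)   # a made-up five-instruction set

  def run(memory):
      pc = 0       # program counter - where the next instruction is
      acc = 0      # accumulator - the processor's working register
      while True:
          op = memory[pc]                 # 1. fetch an instruction
          if op == HALT:                  # 2. decide what type it is
              return acc
          operand = memory[pc + 1]        # 3. fetch any more information needed
          if op == LOAD:                  # 4./5. carry it out and store the result
              acc = memory[operand]
          elif op == ADD:
              acc = acc + memory[operand]
          elif op == STORE:
              memory[operand] = acc
          elif op == JUMPNZ and acc != 0:
              pc = operand
              continue
          pc += 2                         # 6. move on to the next instruction

  # program: add the numbers stored at addresses 8 and 9 and stop
  program = [LOAD, 8, ADD, 9, HALT, 0, 0, 0, 30, 12]
  print(run(program))                     # 42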

The power of a processor comes from the speed and accuracy with which it operates. The number of 'bits' of a processor is the number of binary digits it can handle as one unit. More bits means the computer does more work with each instruction and operates faster. Sixteen or 32 bits is typical for a desktop computer at present, while mainframes are sixty four bit or more. Microprocessors are usually known by numbers such as 6502, Z80, 8086, 80286, 68000, ARM9 etc. Many computers have an optional or built in maths coprocessor. This is specifically designed for doing arithmetic and works alongside the main processor. It can speed up calculations, but not unless the program that the computer is using has been written to use a coprocessor.

Return to Contents List

The History of Computer Development

A computer is essentially a general purpose, programmable machine, though usually specialised for arithmetic. The earliest programmable machine was the Jacquard Loom, using punched cards to control the movement of needles to weave patterns automatically. The first machine specifically designed for complex calculations was Charles Babbage's Analytical Engine, begun in 1836. It used a mechanical system of gear wheels and levers to follow a program on punched cards. However the precision manufacturing technology of the time was insufficient and the machine was never completed, though a fully working version has recently been built to the original design.
In the 1880s it became clear to the American Census Bureau that some improved system was needed to collate and tabulate the census returns of a growing population. Herman Hollerith devised a method of storing statistics as holes punched in cards (again) which could be read automatically by an electro-mechanical device which detected the holes and recorded totals. This system was improved upon and eventually Hollerith founded a company making mechanical calculators, which became International Business Machines, IBM. (There is a myth that the wayward computer in 2001 A Space Odyssey was named HAL in order to stay one step ahead of IBM, although the author denies it.)

Several electro-mechanical numerical computers, using relays as switches, were built but they suffered from being slow and unreliable. Completely electronic computers, using valves, could operate more quickly and the best known example was the Electronic Numerical Integrator And Computer, ENIAC, built for the American military during the second world war to calculate artillery shell trajectories. ENIAC however had two serious shortcomings. Firstly it was 'hard wired', meaning that to change the calculations it performed it was necessary to change the wiring with a soldering iron. Secondly, because it used vacuum tubes it was very big (30 tonnes) and the large number of tubes, 19,000, made it unreliable - the average time between breakdowns was seven minutes.

Later computers stored the instructions in an easily changed memory which allowed them to perform any type of calculation just by changing these instructions. (This was a system first suggested by John von Neumann. Alan Turing had proved that any problem which could be solved by following a finite list of instructions could be solved by such a stored program machine.)

The memory at this time was often a magnetic system using tape or magnetically coated drums for bulk storage and a smaller but faster type for working storage. One design used a mercury delay line which stored binary digits as pulses of sound recirculating along a tube of mercury. This was a serial memory since it was necessary to wait for the information to circulate round to the detecting point before it could be read, in contrast to modern random access memory. Another design of memory around this time used a modified cathode ray tube (known as a storage tube) which had a metal plate over the front surface. Data was written to the memory as a pattern of illumination on the tube and could be read back by detecting the change in capacitance between the tube and the metal plate as the beam was scanned back over it. A later type of memory had a grid of criss-crossed wires with a tiny ferrite ring or 'core' over each intersection. Each one stored one bit of data by way of the direction of magnetisation of the ferrite. Main memory in a computer is still sometimes called 'core' memory for this reason.

Arguably the first true stored program computer was built at Manchester University in 1948 (unimaginatively called the Manchester Mk1) and Manchester remained among the leaders in computer innovation for some time. During the 1950s and 60s computers became more complex and faster because of the use of transistors in place of valves. The circuitry was built on small boards which connected together by being plugged into a framework, hence they became known as mainframes.
Nearly all the computers in this period were one-off designs, among the first to be 'mass produced' being the Univac, made by Remington Rand; IBM eventually became the largest manufacturer of mainframes. (The demand for computers was grossly underestimated - a British government committee set up to look after Britain's computing needs in the 1960s concluded that three central computers would be sufficient for the whole country.)

Because computers were being virtually hand built the cost was high and to make full use of them it was necessary for several people, each using a separate terminal, to share one computer. This was possible because it generally takes far longer for a user to type commands into the computer than it takes for the computer to process them. There are also delays with slower devices like the magnetic tape stores used at this time. The computer was set to switch its attention from one person to another rapidly so that each appeared to be using it continuously. This is an example of a multi user system. It was also multi tasking since it could spend a fraction of a second running each of several programs in turn and thus appeared to be running all of them at the same time, albeit more slowly.

In practice few people used computers directly. The usual system was to write out the program and any fixed data, which was typed into a machine that converted it into punched cards. These were then fed into a card reader so that they could be entered into the computer's memory rapidly when processing time was available. Direct human input was kept to a minimum to avoid slowing down the computer and to make best use of it as an expensive resource. Using pre-prepared programs and data was known as batch processing.
When remote terminals were used at this time they were generally teletypes which operated like an electric typewriter that could also be controlled by the computer and printed everything on a roll of paper. This made it difficult to use the computer interactively since it was not possible to move around the display or back to previous entries, as is possible with a VDU. (Some software writers appear to have been slow to notice the change to VDUs. The DOS as used on IBM type PCs until the early 1990s is reminiscent of teletype systems.)

Return to Contents List

The PC

Things began to change more rapidly in the early 1970s when integrated circuits became available to replace individual transistors. Mainframe computers continued to be built and have become steadily faster with more memory. Minicomputers, scaled-down and cheaper versions of mainframes but still multi-user, also appeared. The big growth area though has been in small 'personal computers', PCs, designed to be used by only one person at a time. One of the earliest was the Altair of 1975 but this had to be programmed by setting switches on the front panel and gave its results by illuminating a row of lights.
One of the first practical personal computers was the PET (Personal Electronic Transactor) made by Commodore Business Machines. This contained a keyboard, video display, cassette tape mechanism for program storage and the electronics all in one case and could be used fairly easily without any knowledge of computers. Later versions had up to 32 kilobytes of memory (a lot by the standards of the time) and a disk drive. The PET was very popular in businesses, due to its ease of use, reliability and low price (less than £1000), from 1977 until the early 1980s. (Although Commodore did have trouble with the name PET when they tried to sell it in France.)

From 1980 the 'home' computer market took off, in the UK largely thanks to the low cost Sinclair ZX80 and ZX81. Numerous other manufacturers joined the market so that by 1984 there was a bewildering number of different home computers, e.g. Sinclair ZX Spectrum, Commodore 64 and VIC 20 (versions of the PET with colour displays), Apple II, Acorn Electron and BBC model B, Tandy TRS-80, Video Genie, Tangerine, Dragon 32, Atari 800, Oric, Sharp MZ 80 and many others.
Common features were that they used 8 bit processors, stored their programs on cassette tapes with disks an expensive option and were normally limited to 64K or less of RAM. They were usually supplied with the BASIC programming language built in on ROM and whilst in the early days the only way to use them was to write your own programs, by 1983 the software industry had got started and there was a wide range of ready written programs covering most areas, although with a preponderance of games.

Return to Contents List

Business Computers

None of the home computers was really successful outside home use. They tended to be of poor mechanical construction and had limited storage capacity which could not easily be increased. The biggest problem though was incompatibility. Every make of computer, and often even different models from the same manufacturer, differed in its internal design and general operation. This meant that a program written for one model had to be almost entirely rewritten to work on another, and it was difficult to even swap information on disk. (Compare this to, for example, audio cassette tapes which can be played on any machine.) The Japanese manufacturers tried to introduce standardisation into the home computer world with the MSX range but they entered the market too late, when other computers had already sold in very large numbers and created their own pseudo standards, and for once they were unsuccessful.

Around 1982 though, a new range of computers began to appear based on 16 bit processors which gave faster operation and much more memory capacity. They also had built in disk drives and tended to be of better quality construction.
An early contender was the Apricot range made by ACT but this eventually lost ground to IBM machines and IBM compatibles. Somewhat later came the innovative but quirky One Per Desk made by ICL, based in Manchester, but it was not a huge success. (ICL's fortunes went into further decline when a large British government contract to supply computer equipment for government departments was awarded to an American manufacturer. ICL is now owned by the Japanese electronics company Fujitsu.)
The first IBM PC, launched in 1981, was not much more powerful than the best of the home computers but it had the potential in its design to be 'expanded'. Extra memory could be added and extra circuit boards could be plugged in for things such as printer ports, high resolution colour graphics and networks.

The IBM was well made and reliable and largely due to the size and marketing power of IBM it established itself as the nearest thing the computer industry had to a standard.
Other manufacturers began to make clones of the IBM which worked in the same way electronically and used a compatible operating system so that a program written for any of these computers would work on any other (at least in theory...). This meant that there was soon a vast range of programs available for these models since programmers had a large market to aim at and did not have to divide their efforts over a number of differing computers. Later versions of the IBM type have a faster processor, more RAM, better display capabilities etc. whilst maintaining compatibility with older models so that businesses are encouraged to 'upgrade' their computers every few years.

Return to Contents List

Problems

Although the IBM PC looked like introducing standardisation things were not so simple. (As Samuel Johnson might have said, there are standards, broad standards and computer standards.) There have been numerous versions of the microprocessor used in the PC (Intel 8088, 8086, 80286, 80386, 80486, Pentium, K6, Pentium Pro, Celeron...) and a program written to use the extra features of the newer models may not work on the older ones. They have different clock speeds (8, 16, 25, 33, 50, 133MHz ... up to over 2GHz) which means that programs operate at different speeds, occasionally causing problems.

The floppy disk drives can have various capacities (180K, 360K, 720K, 1.2M & 1.44M) and came in two sizes (3½" & 5¼"). The display can be monochrome, Video Graphics Array (VGA), Enhanced Graphics Adapter (EGA), Super Video Graphics Array (SVGA) or others, each with different resolutions and different colour capabilities. Programs sometimes have to be told which display adaptor and other hardware is in use before they will work properly (and it is not always easy to find out on an unknown PC).

There were six versions of the disk operating system (DOS), each with several sub-versions. There were subtle differences between them (some even contained serious mistakes) and some programs refused to work with early versions. In use the operating system required a large number of obscure commands which were very fussy about the way they were typed (a missed space would give an error message) and was entirely command line based, making no use of the computer's visual abilities.
(Contrary to common belief the original version 1 of DOS was not written by Microsoft but was bought almost complete by Microsoft from Seattle Computer Products for $50,000. It was written by Tim Paterson and was quite closely based, perhaps illegally closely, on the CP/M operating system written by Gary Kildall of Digital Research. DOS was originally called QDOS which stood for Quick and Dirty Operating System.)

However most programs manage to adapt themselves to the specific hardware and there are utilities to hide the complexity so that PCs work well if properly set up. Hence despite its shortcomings the IBM "standard" remained popular due to the vast range of software available and the open design of the hardware which allows extra features to be grafted on. 'Clones' of the IBM type, made by various other companies, are now more numerous and usually cheaper than the original.
(Question: Why did Intel call later versions of the microprocessor Pentium rather than 80586? Answer: Because other chip manufacturers were making their own versions of the 80x86 series and calling them by the same number as the Intel version, taking business away from Intel. In American law it is not possible to trademark a bare number but a name can be trademarked. Hence other manufacturers making a version of the Pentium are unable to call it Pentium.)

Return to Contents List

Rivals to IBM

Another computer which has some popularity in business and the home is the Apple Macintosh. This originally used what was usually considered to be a better processor than that in the IBM (a Motorola 68000 series) but was therefore incompatible since the two processors use a different machine code language. (In fact the differences are not too great and there are programs called emulators which will translate one microprocessor's instructions into the other's so that programs can be run. Emulators are slower than the genuine machine and there are such differences in the rest of the hardware that they are seldom really successful.) Later Apple, IBM and Motorola cooperated to design a new processor (the PowerPC) which was faster than either the Intel or Motorola types but was capable of running their programs through a form of emulator.

From the start the main selling point of the Macintosh was its use of a graphical operating system (otherwise called a Graphical User Interface or GUI. Nearly everything in the computer world has a TLA, Three Letter Abbreviation.) where commands are issued by pointing to an on-screen menu or object and pressing a button. The normal pointing device is a mouse, a small box which is moved over the surface of a desk and moves a corresponding pointer on the screen. With this type of operating system the screen is set out as a picture of a real 'desk top' with drawings (Icons) representing the disk drives and programs available. There is a menu bar across the top of the screen and when an item is selected it may produce a 'drop down' menu below it of further choices. When a disk drive is selected it opens a 'window' on top of the desktop showing the contents of the disk. The window can be changed in size and moved around the screen if desired and can be closed to remove it from view. This type of operating system removes the need to remember the exact commands since they are in the menu or in many cases are not needed because of the ability to point to and move objects using the mouse.

Programs written for the Macintosh also used this system which introduced a uniformity in the operation of programs and made it much easier to learn to use new ones. Such a user interface is called a WIMP (Windows, Icons, Menus and Pointers) and the idea was copied by several other manufacturers (e.g. the Atari ST range used GEM, or Graphical Environment Manager, and the Acorn range used RISC OS). There tends to be better compatibility between different versions of computers using WIMPs because commands to access the screen and other hardware normally go through a Virtual Device Interface (yes, VDI) which converts the command into a form suitable for the particular computer in use. There is a theory that by using pictures rather than words the operating system is understood by the right-hand side of the brain rather than the left and is more intuitive as a result.

IBM though was sluggish in bringing out a graphical operating system to compete with Apple's. IBM is a large bureaucratic company which responds too slowly to changes in the market and as a result sometimes gets left behind. In 1992 IBM achieved the distinction of making the biggest corporate loss in history, $3.3billion, although their finances have since improved.

Return to Contents List

Windows ©®TM

In the early 1990s Microsoft produced a window-based GUI for IBM compatibles called Windows (much thought obviously went into that name) which thanks to marketing hype rather than merit became the most commonly used operating system. It is not quite as simple or intuitive to use as the Apple system, partly because Apple threatened to sue Microsoft if Windows copied any of the Apple's patented features too closely. There have been numerous versions of Windows. Version 1 was no more than a vaguely graphical menu system for starting programs, version 2 introduced a limited form of windowing environment, version 3 was the first which was actually usable and version 4, Windows 95 (which was originally intended for launch in 1994 and almost ended up as Windows 96), rivalled the Apple system for features. It was followed by Windows 98, Windows Millennium, Windows XP and Vista, together with several versions of Windows NT and Windows 2000 which were intended for business rather than home users.
Windows requires a fairly powerful PC with a fast processor, a lot of RAM and a large hard disc to run. (Cynics have suggested that the reason Windows uses so much processing power, while some other GUIs don't, is that existing PCs were already powerful enough for most programs that did not use Windows and users would not have bothered buying the newer models.)

Return to Contents List

Portables

Much of the growth in the conventional PC market recently has been in portable, 'laptop', computers. These can do almost anything a full size one can but run on batteries and have a built in flat screen. The main disadvantages of laptops are that the screen can be difficult to see and battery life is very short, usually only a few hours. They can be used to enter information 'in the field' which is then transferred to a full size PC back at base, or can be run off the mains as a replacement for a full-size desktop model (though since portables are more expensive to buy and maintain than non-portables this seems pointless.)

Return to Contents List

The Future

There has been little qualitative change in computer hardware since the mid 1980s. Processor speeds have increased, memories are bigger and graphics have improved but the basic design has remained the same. Two new designs which have appeared are:-

  1. The RISC processor (Reduced Instruction Set Computer)
    A conventional microprocessor (known as a CISC or Complex Instruction Set Computer) spends a large part of its time deciding what the next instruction does, since there are up to 1000 different instructions in the instruction set of most processors. A RISC processor does away with all but the most frequently used and is therefore able to spend less time on 'decoding' the instruction and more on actually carrying it out. It hence manages more instructions per second but this is partly offset by the fact that a RISC processor sometimes has to carry out several simple instructions to obtain the same effect as a single one in a CISC processor. RISC processors also generally need fewer components on the chip and are cheaper to make. One RISC computer was the Acorn RiscPC, which was very fast but because of its incompatibility with 'standard' PCs and lack of marketing had only a limited amount of software available and only a small share of the total market. (Acorn were one of the first British manufacturers of home computers but were bought out by Olivetti, and in the late 1990s closed their desktop computer business.)
    It is worth pointing out that several other new designs of RISC microprocessors have appeared from manufacturers such as Apple, Hewlett-Packard and Sun Microsystems whereas the basic design of the two most commonly used CISC processor families, the Motorola 680X0 and Intel 80X86 series, dates back to the early 1980s and late 1970s respectively. In fact almost all new microprocessor designs since the 1980s have been of the RISC type. Even recent Pentium class processors, though behaving as CISC processors, are based on RISC architecture internally.
  2. The Transputer
    A transputer chip is a complete and fast microprocessor of the RISC type in itself but the key feature is that each transputer can easily be connected to up to four others. This allows large networks of transputers to be built up and given the right software it is possible for all the processors to work at the same time on different parts of a problem, greatly increasing the speed of the computer. This is an example of parallel processing.
    For some programs it is necessary to know the result of a previous calculation before the next one can be started and in this case a transputer is no advantage. One area where transputers have been successful is in image processing for television special effects. A small area of the screen is assigned to each transputer in an array and they can then each manipulate their own section of the picture simultaneously. Complex fades, slides and distortions of the image can then be applied to 'live' pictures at 50 frames per second. The use of transputers for more mundane tasks is limited by the difficulty of dividing problems into independent sections. A special programming language called Occam has been developed to assist with this. (The transputer was developed by the British company Inmos when it was state-owned. Just as the transputer reached the market Inmos was sold off to the French electronics company Thomson.) Little has been heard of the Transputer in the last few years.

It seems almost certain that in the short term computers will continue to evolve along their current lines. Processor speeds will increase and hardware prices will drop. It is possible that cheap and very high capacity permanent electronic memory will be developed, which does not lose its contents when the power is switched off, like flash RAM but with an unlimited number of rewrite cycles. This could replace hard disk drives. Further ahead, much has been made of the possibilities of optical computing, using beams of laser light instead of electric currents and opaque/transparent windows as switches. Such systems are theoretically capable of operating at very high speeds but are still a long way from even the prototype stage.
The next qualitative change in storage could be to molecular memory. This uses the property of isomerism of some organic molecules whereby they can exist in two stable forms which can represent the binary digits. One type can be switched between the two states using a beam of light, which would fit in with optical computing. Since a single molecule could store one bit of information, such a memory could achieve a very high storage capacity in a small volume.

A possible total change in the design of computers is the use of networks which consist of nodes or neurons, each having the ability to both store and process information rather than these two functions being separated. Each node on its own would have very limited capability but a system of billions of them all interlinked would form a very powerful processing machine. This of course is exactly how the brain works. It is largely a philosophical question whether such a machine could think and be considered intelligent or whether intelligence is specific to chemical brains. The main difference between human (or animal) and machine intelligence is probably that current computers are digital whereas brains are analogue devices, rather like the difference between a calculator and a slide rule.

Further ahead, it may prove possible to build quantum computers. Whereas in a conventional computer a bit can be either 0 or 1 and a byte just holds a single number, in a quantum computer a 'qubit' can be both a 0 and a 1 at the same time, and a 'qubyte' could hold all possible values at once. Theoretically it is possible for a quantum computer to carry out a huge number of calculations simultaneously and at the end select the one which is needed. One application is in decrypting encoded messages. Some encryption systems would take thousands of years to break with the fastest conventional computers, but could be decoded almost instantly with a quantum computer.

Return to Contents List

Computer Reliability

Considering that they are performing millions of operations per second, modern computers are very reliable. However there are three main causes of errors.

  1. Wrong Input
    If the information fed into a computer is wrong, any results derived from it will be wrong. ("Garbage in, garbage out"). Computers have no intelligence of their own to spot unlikely input; it is up to the programmer to check data for nonsensical or impossible values. (E.g. checking that the month in a date is between 1 and 12.) The program cannot detect values which are reasonable but wrong (e.g. the wrong month). This kind of error is obviously common to any kind of machine.

  2. Hardware Faults
    Hardware reliability has been improved greatly by the use of integrated circuits and faults are rare. There can be faults in the basic design of components which will be common to all computers of that model but they are usually soon spotted and either have to be avoided by users or are corrected in the Mark II version. (The first version of any completely new computer or piece of software is usually best avoided.)
    Mechanical devices such as disk drives and printers are more likely to go wrong but these usually produce an error message and the computer will not complete the operation. It is generally safer for the computer to report an error than to give a wrong answer. In the case of floppy disks the magnetic disk itself wears out with time, so that it is best to copy old disks onto fresh ones periodically. Most computers will make several attempts at reading a disk before they give up and a noticeable slowness in reading a disk is an advance warning of problems.
    Most faults with the main components of the computer will stop it working entirely but memory faults can be insidious. It is possible for a very small part of the memory, even just a single bit, to become faulty. This only causes problems if something vital is stored there, when the results can be unpredictable. The error may not even be readily reproducible. Many computer systems can perform a memory test which involves setting (to 1) and resetting (to 0) every bit in memory and checking that they all work. Some systems use parity checking where every byte in memory has a ninth bit which is set or reset such that there is an even number of total bits set. (Even parity.) If a single bit changes then this will no longer be true and the hardware will detect an error when the byte is used. It has been suggested that as memory chips increase in capacity and more components are packed onto a tiny area of silicon, a point is reached where random thermal 'noise' can affect the stored values, so that occasional random errors are produced without there being any permanent fault. As computer systems expand this could become a problem but at the moment memory faults are very rare.
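    As a rough illustration of the principle (a Python sketch, not how any real memory controller is built), the parity bit for a byte can be computed and checked like this:

          def parity_bit(byte):
              # 1 if the byte has an odd number of set bits, so that the
              # byte plus its parity bit together contain an even number
              return bin(byte).count("1") % 2

          stored = 0b01100001               # the byte written to memory
          check = parity_bit(stored)        # the ninth bit stored alongside it

          corrupted = stored ^ 0b00010000   # a single bit flips in memory
          if parity_bit(corrupted) != check:
              print("Parity error detected when the byte is read")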
    Many hardware faults are due to loose connections on the circuit board. An unofficial repair procedure for mysterious faults is to lift the computer a few inches off the desk and drop it down again, reseating all the integrated circuits in their sockets.

  3. Software Faults
    Software is the weak spot of modern computers. Over the years the hardware side of computers has advanced greatly, with mass production reducing the cost and improving reliability more than for any other technological device. The software side however has changed little since the first high level languages appeared in the late 1950s and early 1960s. Programs are still written by hand which makes them expensive and leaves plenty of room for error. There have been some improvements though - modular programming and higher level languages among them.
    Early programming languages were 'unstructured' - a mistake in one part of a program could produce strange results in a completely different part. Structured programming attempts to work from a general description of what the program must do, gradually adding more detail about how it is to be done until eventually the actual instructions are reached. This is known as top down programming.
    Part of this process involves breaking the problem down into smaller parts which are as near as possible independent of the rest of the program. The whole program may well be too complex for any one person to fully understand but the separate parts can be written individually, even by different programmers, and then slotted together to make the entire program. These parts are sometimes called modules. If this approach is carried out to the full (and it often isn't) then a mistake in one of the modules should not affect any of the others, though obviously it still causes a problem when that particular module is used.

    High level languages attempt to allow the programmer to concentrate more on what the purpose of the program is and how to state it accurately rather than worrying about precisely how it is to be done. They allow programs to be written in a form which is closer to the way people think than the way computers behave. This of course only works if the language is then able to convert the program into a form the computer understands without losing any of the meaning.
    The basic problem with programming is that a program of any useful size contains so many paths through it and can have such a variety of input conditions that it is impossible to test for every eventuality. Despite considerable effort it is still not possible to prove that a computer program meets a written specification of what it should do. (And even if it did there is the question of whether or not the specification contains any omissions.)
    Programs tend to do what is expected of them under 'normal' conditions but even programs which have been around for some time and gone through several revisions to correct mistakes that are discovered still go wrong occasionally. Some studies have found that after a while the rate of finding new mistakes tends to a constant - revisions add as many new errors as they correct old ones. Programs to control critical operations, such as nuclear power stations or aircraft 'fly by wire' controls are often claimed to be totally reliable. This is not possible with current programming methods. The best that can be managed is very nearly totally reliable.

    Formal Methods is the name of a technique which aims to make programs more reliable by a mathematical analysis. The principle is that both the written specification of the program and the list of instructions that the final program consists of are converted to mathematical expressions. These are then manipulated to determine whether the two sets of expressions are mathematically equivalent, in which case the program meets its specification, or if they are not equivalent then the program contains mistakes.
    This process can certainly catch many programming errors and thus make programs more reliable, but it does not prove that they will operate correctly, for the following reasons:
    1. The specification may itself contain errors or be incomplete.
    2. The job of translating the specification and program code into mathematical form largely has to be done manually. It is difficult and tedious and mistakes can occur.
    3. The program will normally be written in a high level language, but no microprocessor can execute this directly. The program must first be converted into the microprocessor's own 'machine code', and it is not easy to be sure that the machine code version is equivalent to the high level language version.
    4. Occasionally microprocessors contain design faults and do not carry out instructions exactly as the documentation says they should.
    5. Microprocessors are physical devices (pieces of silicon) and it is not possible even in principle to prove mathematically that a particular one has been manufactured correctly, or has not developed any faults since it was fabricated. Although they are tested by the manufacturer before sale the testing cannot be exhaustive because there are an almost infinite number of different combinations of bit patterns the microprocessor could operate on.

    /// Digression \\\
    An example of software fallibility (and users' lack of appreciation of this) came to light soon after this section was first written. The full story is reported in the IEEE journal Computer, July 1993, Vol 26 No 7. A summary follows. The device in question was the Therac 25 medical accelerator. This is a device which shoots high energy electrons at a metal target to produce X-rays for the treatment of cancer tumours. It was also possible to select use of the electron beam directly. Computer control was used to convert the required dose level into an operating energy and time and monitor the output of sensors measuring the actual dose received. Since radiation can kill healthy cells as well as cancerous ones it is obviously vital to use the lowest effective dose. Between July 1985 and January 1987 six known accidents occurred in which patients were given huge overdoses of radiation. Some of the patients died.
    Therac 25 was based on earlier machines which were largely manually controlled and had hardware interlocks and protection (e.g. fuses) to prevent overdosing, in addition to checks by the software. However, since software tends to be cheaper than hardware to experiment with in the development stage, the Therac 25 dispensed with hardware protection and relied entirely on the software. The assumption that the software on the early models was reliable without the hardware interlocks was the first mistake. It had been noticed that when students were training on an earlier Therac 20 the fuses would often blow. As the students became more accustomed to the computer interface the fuses did not blow so often. Obviously the mechanical protection was preventing errors in the software and mistakes by the students being converted into incorrect doses.
    The Therac 25 used much of the same software but not the hardware protection. It was made completely software controlled, using an executive program specially written by a single programmer in PDP-11 assembly language. The program contained numerous errors. One was to do with synchronisation - the program did not properly check that the various parts of the machine were ready to use and could try to use the same piece of equipment for more than one job at the same time. This caused unpredictable errors.
    The user interface for setting the dose was an on-screen form into which the operator typed values which could then be verified against any manual settings. When the cursor came to the bottom of the screen this signalled that data entry was complete and the values were read in and used to set up the machine. However the operators had complained that they had to re-enter all the data for each patient even if most of it were the same as for the previous one and so the programmer allowed data in each box of the form to be reused by just pressing Return. This was a major bug. The new values were still only read in if the cursor were moved to the bottom of the form. If the operator only changed some of the data at the top of the form but did not move down to the bottom then, for example, the beam intensity could be altered and the new values would appear on the screen but would not be read in by the program and the settings of the machine would not be altered. The cause of the blown fuses on the Therac 20 seems to have been related to this. The Therac 25 did have ion chambers to measure the dose given and these should have warned the operator that something was wrong. Unfortunately at very high doses they saturated and only registered a low dose. If all this were not enough there was one other serious bug.
    A variable called Class3 was used to indicate that the machine was set up and ready. A zero value meant everything was working and non-zero indicated a fault. When the machine was first used on a patient a series of checks was gone through to ensure everything was in place and ready. If a fault was detected then the value of Class3 was incremented (i.e. 1 was added to it) to make it non zero and stop the machine operating until the fault had been rectified. The correct thing to do would have been to set it to a fixed value because it was possible for the increment routine to be called hundreds of times during a set up and Class3 was only a one byte variable. This meant that when it reached 255 it would reset to zero and if the operator tried to use the machine at this time then all hardware checks were switched off because the zero value indicated that everything was already checked and fine. Thus it was possible for example to operate the beam without safety shielding in place or with the beam stationary instead of scanning across the tumour.
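    The arithmetic behind this failure is easy to reproduce. The following Python sketch is only an analogy (the real code was in PDP-11 assembly language), but it shows how repeatedly incrementing a one byte variable brings it back to zero:

          class3 = 0
          for attempt in range(256):          # the set-up checks ran many times
              class3 = (class3 + 1) & 0xFF    # a one byte variable wraps round after 255
          print(class3)                       # 0 again - 'everything checked and fine'
          # Assigning a fixed non-zero value instead of incrementing would have avoided the wrap-around.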
    There were other problems. The error messages which the system produced were cryptic and undocumented. The possible malfunctions of the machine were not documented. The operators were so used to the machine stopping for no apparent reason that they tended to automatically tell it to continue. Despite all this the operators were apparently convinced that the machine was designed so that it could not give an overdose and although patients complained of being burned the manufacturers continued to claim that it was infallible.
    The root cause of the problem seems to be that people with no experience of programming tend to believe that computers can never be wrong, even with wrong input (garbage in, gospel out). In a complex mix of hardware and software such as the Therac 25 any software bugs could easily be blamed on transient hardware faults. The question which arises from all this is how were a programmer and management team who knew so little about software engineering allowed to work on a life-critical project?
    \\\ End ///

An example of a further problem which can creep in even if the program is totally correct is that most microprocessors are not designed from scratch to perform all their functions, such as division and other operations, because designing everything from simple logic gates would be extremely time consuming and error prone. Instead a very basic processor is designed which can only do the simplest of functions such as setting and resetting bits and logical operations. The more complex functions are then built up from these using an internal programming language called microcode, built into the processor. If there are any mistakes in the microcode they may only manifest themselves rarely and cause inexplicable faults in any program running on that microprocessor. (There was an instruction for the Motorola M6800 microprocessor, included for test purposes, which caused it to toggle some of its outputs between 0 and 1 as fast as it could. This dissipated heat and in some designs could damage the circuitry - the legendary Halt and Catch Fire instruction.)
By coincidence, almost a year after first writing this section, news emerged that early versions of Intel's flagship Pentium processor occasionally gave inaccurate results in floating point division calculations. Most users would probably never notice the errors but if the calculation were concerned with the design of an aircraft or with finance, for example, the effects could be serious. The fault has been corrected in current versions but Intel set aside $300 million to cover the cost of replacing the old chips, if their owners demanded it.
Surprisingly, it is at present not possible to prove mathematically from the design that the floating point arithmetic section of any processor gives the correct results since floating point numbers cannot in general be stored exactly. As a further postscript, in June 1997 it was found that Intel's latest Pentium Pro processor had a bug in converting floating point numbers to integers.

Back to ContentsReturn to Contents List

Data Reliability, Archiving and Backups

A modern desktop PC will contain many hundreds of megabytes or even gigabytes of the user's own information on its hard disk, in addition to installed commercial software. This information will include such things as word processor documents, personal photographs in digital form, music files, etc, which cannot be easily replaced. A computer used in business may well have data which is essential to the running of the business. In either case it would be at best extremely inconvenient and at worst disastrous if this information were to be lost.

Unfortunately hard disks do go wrong and can become unusable. Their average life expectancy is equivalent to several years of continuous use but that is no guarantee that yours will not fail after a few months. It is therefore essential to have back up copies of all important information, together with the original installation disks for the operating system and application programs, so that when a new hard disk is fitted it can be loaded with the contents of the old disk. (There is often some warning that the disk is failing as it may give occasional error messages or begin making a lot of noise. It is then possible to temporarily fit a new disk as well as the original and copy data directly between them.)

There are several options for making back-ups. A second hard disk drive can be fitted, either internally or externally, and the contents of the main drive regularly copied to the spare one. This is quite quick and easy so long as you remember to do it, but there is always the risk that both drives will be lost at the same time, e.g. if the PC were stolen.

Nearly all new desktop and laptop computers come with a DVD writer drive. This allows data to be copied from the hard disk onto compact disk (CD) or digital versatile disk (DVD). DVDs are particularly useful because of their high capacity, over 4 gigabytes, which means there is less need to split backups over several disks than would be the case if using CDs.
The disks are available in either write-once form (DVD-R and DVD+R) or rewritable form (DVD-RW and DVD+RW) which can be erased and re-used many times.

If the data on the PC is changing rapidly, such as stock and sales records in a business environment, then backups need to be done frequently, ideally daily. If you only use the computer to write the occasional letter then backups can be less frequent, monthly say. The rewritable disks are best for information which is changing whereas write-once disks are better for 'archive' data such as the previous year's transactions, which are not going to change again.

The backing-up process can be a little time consuming, though specialised software can partly automate it, and it should be noted that writable disks of any kind seem to have a high failure rate while they are being written to (though not once this is completed). For this reason you should never make a new backup onto your only copy of the previous backup or you may lose it. Instead, if using rewritable disks then have at least two and use them in a cycle, so that the last backup is still intact if the current one fails.
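A crude sketch of such a two-disk cycle, written in Python with invented folder names (dedicated backup software does the same job more robustly), might look like this:

import os
import shutil

SOURCE = r"C:\Users\me\Documents"           # hypothetical folder to protect
TARGETS = [r"E:\backup_a", r"E:\backup_b"]  # two destinations used in rotation
MARKER = r"E:\last_backup.txt"              # records which copy was written last

# Overwrite whichever copy is older, so the most recent good backup
# is never touched while the new one is being written.
last = open(MARKER).read().strip() if os.path.exists(MARKER) else TARGETS[1]
dest = TARGETS[0] if last != TARGETS[0] else TARGETS[1]

if os.path.exists(dest):
    shutil.rmtree(dest)
shutil.copytree(SOURCE, dest)
with open(MARKER, "w") as f:
    f.write(dest)
print("Backup written to", dest)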

With the move to high-speed internet connections over the past few years an alternative form of backup has become feasible. This is to copy your files over the internet onto a server operated by the internet service provider (ISP). Some ISPs offer a certain amount of storage space as part of a broadband package.
Backing up over the internet can be very convenient and should cost nothing more than the normal broadband rental. However some points to consider are:

  • What happens if the ISP's storage develops a fault? They may well keep more than one copy of the data, on separate drives, but if they are all at the same location they could all be damaged by floods or fires.
  • What happens if the ISP goes out of business or simply decides to withdraw the backup service?
  • Unless you encrypt each file before uploading it, any private files, which might include bank account details and passwords, could conceivably be read by anyone with access to the ISP's servers. If you do encrypt files, make sure you can decrypt them without your current PC, for instance by keeping a copy of the decryption key. (A minimal sketch of encrypting a file before upload is given below.)
It would be risky to rely on internet backups as the sole archive of important data.
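As mentioned in the last point above, files can be encrypted before they are uploaded. A minimal sketch using Python and the third-party 'cryptography' package (an assumption for illustration; any reputable encryption tool would do equally well):

from cryptography.fernet import Fernet

key = Fernet.generate_key()                 # keep copies of this key somewhere other than the PC
with open("backup.key", "wb") as f:
    f.write(key)

cipher = Fernet(key)
with open("accounts.ods", "rb") as f:       # a hypothetical private file
    encrypted = cipher.encrypt(f.read())
with open("accounts.ods.enc", "wb") as f:   # upload this version, not the original
    f.write(encrypted)

# Recovering the file later needs only the key file and the same package:
# Fernet(open("backup.key", "rb").read()).decrypt(open("accounts.ods.enc", "rb").read())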


Long-Term Storage of Data

There may be some information on your computer which you want to keep for a long time, such as digital photographs. Conventional photographic prints from several decades ago are still perfectly viewable and can be of considerable historic interest, especially if they are of your own ancestors. Will digital photographs last so long? This is the field of data archiving and the answer depends on many factors, not just the medium used to store the pictures.

As previously described you should not rely on a magnetic hard disk for storage of more than a few years because it will eventually wear out, and even if it is not used the magnetic patterns will fade over time. The data needs to be copied to some other medium, but which is best?

It is not just the reliability of the storage medium itself which matters. The hardware needed to read it would also have to be available in the future, in working order and not just in a museum. For example, until CD-writers became common around the year 2000, magnetic tape cartridges (similar to a large audio cassette) were often the backup medium of choice. Just a few years later they are seldom seen, and to make matters worse there were several incompatible designs so that it would be tricky now to read a cartridge from just ten years ago. You might think of keeping the reading hardware (e.g. tape drive, 5¼ inch floppy drive) as well as the medium, but there is no guarantee that future computers will be able to interface to obsolete drives. For example, until recently hard disk drives mostly used a parallel ATA (PATA) interface, but as of 2008 many are switching to serial ATA (SATA) connections, with a different design of plug and fewer wires. In five years' time it may be impossible to attach a PATA drive to the latest PCs.

Secondly you need to consider whether any specialised software needed to make sense of the stored data will still be available. Twenty years ago the most popular spreadsheet was Lotus 123, but this is no longer produced and whilst current spreadsheets will often import 123 files, this will not be the case indefinitely. An 'open' data format such as Adobe's PDF (the format read by Acrobat), details of which are freely published, is more likely to remain readable in the future than a 'closed' proprietary format like Microsoft Word. If the documentation on how the information in a file is laid out is not available then it may be very difficult to write new software to decode and display it, even if the file itself can still be read. For this reason a very simple format such as plain ASCII text is preferable to the highly complex formats used by many word processors.


Lifetimes of Storage Media

The lifetime of most media decreases in a fairly regular way with increased storage temperature. This means that by storing a sample at a high temperature for a short time, its life expectancy at normal temperatures can be estimated, a technique known as accelerated ageing.
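As a purely illustrative calculation in Python, assuming the common rule of thumb that lifetime roughly halves for every 10 degree (Celsius) rise in temperature (real accelerated-ageing tests measure this relationship rather than assume it):

def estimated_life(years_at_test, test_temp_c, storage_temp_c):
    # lifetime assumed to halve for each 10 degree rise in temperature
    return years_at_test * 2 ** ((test_temp_c - storage_temp_c) / 10.0)

# A sample disc surviving 1 year in an oven at 65 C would, on this assumption,
# be expected to last about 16 years at a room temperature of 25 C.
print(estimated_life(1, 65, 25))   # 16.0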

  • Until ten years ago floppy disks were routinely used for backing up small amounts of data. However the disks may only have a lifetime of around 10 years and not all current PCs are fitted with even a 3½ inch disk drive. 5¼ inch drives were withdrawn years ago.
  • For magnetic tape the lifetime is typically 5 to 30 years. Certainly I have been able to read computer data on cassette tape after 20 years.
  • CD-ROMs and DVD-ROMs are estimated to last for 30 to 200 years for both pre-recorded and CD-R/DVD-R types. The lifetime of rewritable disks is less certain but is probably around 30 years.
  • Flash memory, as used in digital cameras, stores data as tiny electrical charges which slowly leak away, so that the memory may become unreadable after 10 to 20 years.
    Flash memory is not likely to be used for storing large amounts of data due to its relatively high cost but its shortish lifetime is important for another reason. The 'BIOS' in modern PCs is held in a form of flash memory and if this is no longer readable the computer will not start up. This is a barrier to preserving existing hardware as a way to ensure future readability of stored data. (Until the 1990s home computers normally stored the equivalent of the BIOS in ROM memory which keeps its contents almost indefinitely. There is thus a better chance of a 1982 Sinclair Spectrum still working in thirty years' time than of a 2008 PC.)


Conclusions

At the moment the best option for archiving data seems to be recordable DVDs, but make sure that no special software is needed to read them. For this reason it is best not to use compression programs to squeeze more onto each DVD since the decompression program may not work on future hardware.
Ideally archives which are still wanted should be copied to new media as they become available and before the previous media becomes completely obsolete. Hence you would have gone from 5¼ inch floppies to 3½ inch, to CD-ROMs, to DVD-ROMs, to ... whatever comes next. It may also be necessary to transfer data to new file formats, e.g. the latest version of a word processor, to ensure the archive can still be interpreted as well as read. Clearly though this would involve a considerable amount of work in loading and saving each individual file and it is all too easy to forget about upgrading your backups until you realise they are no longer readable.

Despite technological advances, it may well be that the safest backups for crucial data in the long term are paper printouts, since they do not require any specialised hardware or software to read. Similarly if you have a collection of photographs taken with a digital camera it is worth having the best ones commercially made into prints and keeping them in a good quality album (with acid-free paper), rather than relying on versions which are only readable by a machine.
It is of course possible to print them out on a home ink-jet printer, but it is known that some of the inks fade in a relatively short time, though the manufacturers are addressing this.

Once you have your archived data, in whatever physical form, it ought to be stored at a low and constant temperature and relative humidity. Also keep it away from direct sunlight, especially optical disks which are inevitably light-sensitive. Magnetic disks need to be kept away from strong magnetic fields such as are found near speakers and cathode ray tube type televisions.

For ultimate security it is worth keeping multiple copies in different locations. This applies for example to business users. There is no point keeping the only backup copy in the office next to the computer because in the event of a fire they would both be destroyed. Keep a spare copy at home as well.


BBC Domesday

A good example of the perils of computer data storage is provided by the BBC Domesday Project. The original Domesday Book, containing a survey of England, was written in the year 1086 on sheepskin parchment. Several copies still exist in museums and can be read, with care.
In 1983 the British Broadcasting Corporation decided to put together an updated Domesday Book, to coincide with the 900th anniversary of the original. Contributions of local stories, statistics, photographs and short video clips were obtained from 14,000 schools plus other community groups and these were compiled into an interactive, multimedia format so that text and pictures could be displayed on a computer screen for any selected locality. This was quite an ambitious application of computers for that time.

The project was just about completed and all the data was stored on laserdisks. These had been invented by Philips as a means of distributing films to watch at home. Laserdisks resembled a large CD and were read by laser, but the data was stored as an 'analogue' (continuously varying) signal rather than as a digital pattern of 0s and 1s. In the event of course video cassettes became the preferred medium for watching films and few laserdisk players were sold for home use.

For the BBC Domesday a specially modified laserdisk player was connected to an Acorn BBC Master home computer via a new interface (actually an early form of SCSI) so that textual information could be read by the computer using custom-written software while pictures were fed directly to a monitor. A video mixer allowed the pictures to be overlaid by the computer's text display. The system worked and contained a vast amount of survey data, but the complete package of laserdisk player, interface, computer and software cost £4000 and only a small number of systems were sold, to large libraries or well-off educational establishments.

However, the laserdisks can deteriorate over the years, the players are mechanical devices and wear out, and the BBC Master has long been obsolete (though there are probably plenty still in the backs of cupboards).
A few years ago it was realised that there were only one or two functioning Domesday Project systems still in existence, and soon there might be none at all. Hence a new project, called CAMiLEON (Creative Archiving at Michigan and Leeds Emulating the Old on the New), was launched to transfer all the information to a modern computer system. With a considerable amount of work this was eventually achieved but some of the problems encountered were:

  • Laserdisk players had not been made for twenty years and it was not easy to find a working one.
  • The reflective coating on the laserdisks tarnishes and they become unreadable. The fact that so few had been produced in the first place made finding readable ones even more difficult.
  • There seemed to be no documentation on the special interface needed to convert the laserdisk signals into a form the computer could understand.
  • The controlling software was written for the BBC Master and will not run directly on a modern PC, so an emulator was used.

It is quite likely that had another ten years elapsed before starting project CAMiLEON the task would have proved impossible. Therefore the 'hi-tech' Domesday Book would have lasted no more than 30 years whereas the original, written by hand on the dried skin of sheep, should easily pass the 1000 year mark.
(As a final twist, it seems that the BBC neglected to have the copyright to the text and images transferred to them, and thus it still resides with the original contributors. This means it would be illegal to distribute copies of the modernized Domesday Project unless all the contributors could be contacted and asked for permission first.)

Back to ContentsReturn to Contents List

Application Programs

The first home computers sold mostly to the hobbyist market where the interest was in persuading the computer to actually do something, irrespective of whether it was anything useful. Commercial software was virtually non-existent and so programs, mostly simple games or mathematically based, were written by the owners using the built-in BASIC language. However for the machines to sell to a wider market and to people who were not interested in computers per se programs had to be produced which were genuinely of use and which could be used without any expert knowledge. Such programs are generally referred to as applications. There is a relatively small number of classes of application but usually a very wide choice of competing versions, although recently the trend has been for a few programs to take most of the market. Some of the different classes will now be considered.

Back to ContentsReturn to Contents List

Spreadsheets

A spreadsheet is essentially a large table of boxes which can be filled with numbers or with equations; an equation can refer to the contents of other boxes, which may themselves be calculated from still more boxes. Spreadsheets are based on a manual system used for financial planning, in which the boxes were drawn on a blackboard and filled in by hand. The problem with the manual version was that, because of the links between boxes, a change to one of the input values could mean many other values having to be recalculated, which was obviously error prone.
A computerised version could perform the recalculations much more quickly and reliably. The first spreadsheet was VISICALC written for the Apple II microcomputer in 1979 and the availability of this program was a large factor in the early success of Apple computers in the general business market. The first spreadsheets just displayed a window onto the grid of numbers but as the more powerful PCs became available new features were added such as the ability to produce graphs and charts from the data and the ability to act as a simple database (q.v.). Well known spreadsheets include Lotus 123 (now obsolete) and Microsoft Excel.
Spreadsheets have the ability to perform complex series of calculations which makes them useful for analysing trends and testing the effect of changes but it can be time consuming to set them up and of course it is necessary to know the equations in the first place.
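The idea of boxes (cells) which recalculate from one another can be sketched in a few lines of Python; this is an illustration of the principle only, not how any real spreadsheet is implemented:

def get(sheet, ref):
    # a cell holds either a plain number or a formula (a function of the sheet)
    value = sheet[ref]
    return value(sheet) if callable(value) else value

cells = {
    "A1": 120.0,                                   # sales (hypothetical figures)
    "A2": 0.175,                                   # tax rate
    "A3": lambda s: get(s, "A1") * get(s, "A2"),   # tax due
    "A4": lambda s: get(s, "A1") + get(s, "A3"),   # total
}

print(get(cells, "A4"))   # 141.0
cells["A1"] = 200.0       # change one input value...
print(get(cells, "A4"))   # ...and every dependent box updates: 235.0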

Back to ContentsReturn to Contents List

Word Processors

Word processors are probably the most common application. They are in essence a typewriter which allows typing errors to be corrected and text moved around before it is printed. A word processing system obviously needs a good printer if the output is not to look 'computerish'. The first word processors displayed a blank screen on which text could be typed, edited and moved around and it was also possible to include special codes to control the printer to produce effects such as underlining or bold. There was however little attempt to make the screen display resemble the final printed version, resulting in paper being wasted on test prints to check the appearance before going back to modify the text.

When PCs began to appear in business there was much talk about the 'paperless office', the idea being that records would be stored on computer disc rather than in printed form. It is ironic that the ease with which (maybe multiple copies of) pages can be printed from a PC has in fact resulted in far more paper being used. (It is a strange fact that it is much easier to read and check a document printed on paper than when it is displayed on the computer screen. Even Bill Gates has admitted that if he has a lot of text to read he prints it out.)

Whereas the first popular word processor, Wordstar, worked on a computer with only 64K bytes of memory, the modern versions tend to take up many megabytes of disk space.
Some of the features of current word processors include:-

  • WYSIWYG display (What You See Is What You Get). This means that the text is displayed on the screen in as close as possible a form to how it will be printed, including such things as underlining, different fonts and text sizes and general page layout.
  • The ability to include automatic headings and footers on each page, for example to print a chapter title at the top and a page number at the bottom.
  • The inclusion of pictures produced by other programs, displayed on the screen along with the text.
  • Predefined layout styles e.g. for memoranda and faxes, with blanks to fill in for your message. (These seldom seem to match what you actually want and it is often better to create your own blank documents and store them on disk.)
  • Macros which can be as simple as automatically inserting a commonly used paragraph by pressing some key combination or can be virtually a complete programming language. (This latter form has been used to produce a computer virus which unlike most will run on any computer running that particular program, even if it uses a different processor.)
  • Spelling checkers to correct spelling mistakes and grammar checkers to find grammatical errors. The latter are practically useless since they have no understanding of the meaning of a sentence and their suggested corrections can leave the text garbled.

The latest word processors now have so many functions that few people will ever use or even understand all of them. Lotus AmiPro came with a 600 page printed manual - who ever needed a manual to use a typewriter?
(Microsoft Word now has 80% of the word processor market. Wordstar has almost disappeared due to management shortsightedness. The original Wordstar was written for 8-bit machines using the CP/M operating system and was a good program given the limitations of the hardware it was running on. It was written entirely in assembly language by a single programmer on a short term contract. Once the program began to sell well and an improved version had been produced the programmer was made redundant.
Shortly afterwards the first of the 16-bit PCs appeared with the ability to access far more memory, allowing other software houses to write better word processors. The owners of Wordstar wanted to convert their program to operate on these new computers but found that they had no documentation explaining how the program worked (and the original programmer had by now found another job). They were reduced to hiring a team of programmers who laboriously translated each instruction in 8080 assembly language into its nearest equivalent in 8086 language and managed to get the program to work, but it was no better than the original CP/M version. By the time a completely new Wordstar had been written specifically for the PC it had lost its lead.)

Back to ContentsReturn to Contents List

Desktop Publishing

A desktop publishing program is similar to a word processor but gives fuller control over the layout of the text, for example printing in columns and around pictures. Such programs make publishing small circulation magazines and booklets much cheaper since it is not necessary to send the copy to a publishing bureau to be laid out and typeset. The program can produce an output which is ready to be printed directly on a Linotype printer by a commercial printer, including the pictures. The Apple Macintosh is the preferred computer for this sort of work, mainly because it had a head start over the PC with its graphical display and ability to use different fonts which meant it could show pages on the screen exactly as they would be printed.

Back to ContentsReturn to Contents List

Databases

A database is used to store information in a form such that it can easily be retrieved by the computer. This requires that some order be enforced on the information and also usually limits what can be stored. Most databases use the concept of records each of which is made up of fields of data. For example an address database may have fields for name, address and telephone number and each group of a name, an address and a telephone number constitutes one record. It is usually necessary to assign a type to each field, that specifies what sort of information the field is going to hold, whether numbers, text or dates for example. The database restricts what can be entered into each field to data of the correct type. There is also often a limit on the size of each field, i.e. the maximum number of characters. Newer databases may also allow pictures or sounds to be stored in addition to text.
Databases were used on mainframes long before PCs appeared but PC versions tend to be easier to use.

The biggest advantage of storing information on a computer rather than in a card file is that it can be searched much more easily. Databases of any kind use a key which is the field on which the data is sorted into order, for example the names in an address file. This makes it easy to look up a telephone number given the name but it is very difficult to find a name given only the number. A computer is able to do the equivalent of searching through all the cards very quickly so that this reverse searching becomes feasible. It is also possible to link databases together so that information looked up in one can be used as a cross reference to another. Many commercial systems rely on these links. The information in a large commercial database can be very valuable since it would be time consuming to reenter it and often the original source may no longer be available. It is therefore vital to make regular backups and not rely on a single copy on a hard disk.
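A toy illustration in Python of records, fields, a key index and a reverse search (the names and numbers are invented):

records = [
    {"name": "Adams, J", "address": "1 High St",   "phone": "0161 496 0001"},
    {"name": "Baker, P", "address": "7 Mill Lane", "phone": "0161 496 0002"},
]

# An index on the key field (the name) makes forward look-ups immediate.
by_name = {record["name"]: record for record in records}
print(by_name["Baker, P"]["phone"])

# Reverse searching (number to name) means examining every record, which a
# computer can do far faster than a person working through a card file.
print([r["name"] for r in records if r["phone"] == "0161 496 0001"])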

Back to ContentsReturn to Contents List

Drawing and Painting Programs

These allow the user to produce pictures on the screen, perhaps for use in other programs or as illustrations in a word processor document. There is a distinction between drawing and painting programs which is that a painting program generally works by setting individual pixels on the computer's screen to make up an image whereas a drawing program remembers the picture as a series of lines and shapes to be drawn. A painting program would be used to produce a realistic looking image on the screen but has the limitation that the picture is only stored with the same amount of detail as it is displayed so that if part of the image is magnified it becomes grainy. A drawing program on the other hand tends to produce simpler looking pictures (similar to hand-drawn illustrations) but they can be rescaled at will without losing detail. Drawing programs are used to produce technical type drawings rather than photographic quality pictures.

A painting program will have tools which simulate paint brushes of various widths together with spray gun effects to build up a colour gradually, all controlled by the mouse. Unlike a genuine painting a computer image can be invisibly modified as much as desired, including tricks such as changing all occurrences of one colour for another or applying a sepia tint to the whole image.
When using a drawing program the image is put together by specifying lines of controllable thickness, curves which can be reshaped at will (Bezier curves) and common shapes such as rectangles and circles. Each of these objects can be controlled independently and be reshaped or changed in size without affecting the other objects. After a large change or when scrolling it is possible to see the objects being redrawn. Sophisticated drawing programs are known as Computer Aided Design (CAD). For some reason graphics programs tend to have impressive sounding names like Hyper Paintshop Studio Pro.
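The distinction can be sketched in Python (the data structures are invented for illustration and are far simpler than those used by real packages):

# Painting (raster): the picture is the pixels themselves, so enlarging it
# only produces bigger, blockier squares.
raster = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
]

# Drawing (vector): the picture is a list of shapes which can be redrawn
# at any size without losing detail.
shapes = [("line", (0, 0), (10, 5)), ("circle", (20, 20), 8)]

def scaled(shapes, factor):
    result = []
    for kind, *params in shapes:
        new_params = []
        for p in params:
            if isinstance(p, tuple):                  # a coordinate pair
                new_params.append(tuple(v * factor for v in p))
            else:                                     # a plain dimension such as a radius
                new_params.append(p * factor)
        result.append((kind, *new_params))
    return result

print(scaled(shapes, 3))   # the same shapes, three times larger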

Back to ContentsReturn to Contents List

Programming and Programming Languages

As previously stated, a computer needs a set of instructions telling it in complete detail how to perform each task it is meant to do, i.e. a program. There are some so-called 'program generators' which will automatically write some of the more common parts of a program, such as displaying output in a standard way, but most programming is still done by hand. The price of computer hardware has dropped drastically over the last 40 years while the time taken to write a program has only decreased moderately, which means that very often the software costs of a computer system far outweigh the hardware costs. A typical largish program running on a PC may take several person-years of work and may therefore cost in the region of £100,000 to produce. It is only because it costs very little to duplicate a program once written and because of the large number of potential customers that it is possible to sell programs for a reasonable price of up to a few hundred pounds.
In order to reach as many customers as possible programs have to be general in their application. This can be an advantage for a user who later wants to extend their use of the program, but it also means that many programs have a vast number of features which most users do not want and may never use. On the other hand, programs written for one specific purpose generally do not have the polish of commercial software since the author can seldom afford to spend a sufficient length of time on the 'fiddly' bits such as help messages and error checking.

The microprocessor in a computer can only understand very simple instructions in binary language. However, since it is very difficult for humans to understand binary, programs are normally written in a language which more or less resembles English and is then translated into binary by various means. (Because the early computers were designed in Britain and America, English is the usual language for programming. It is of course possible to use any other language and as long ago as 1984 a replacement ROM was available for the Sinclair ZX81 which enabled it to be programmed in Arabic.) It is probably now worthwhile looking at the range of programming languages in existence and their evolution over time.

Back to ContentsReturn to Contents List

Low Level Languages

Machine Code and Assembly Language

Each instruction that a processor understands is given a binary number called the machine code and the first computers had to be programmed directly in binary, so that for example a program to calculate the average of four numbers might look like this:-

          00111011
          00000000
          01111000
          10110011
          01111000
          10110100
          01111000
          10110101
          01111000
          10110110
          11000111
          11000111

which is not exactly easy to read.

A slight improvement on this was to use the hexadecimal system to write down the numbers. Hexadecimal means using 16 as the number base so that the digits 0 to 9 are as in the decimal system then A=10, B=11, C=12, D=13, E=14, F=15. Digits to the left of the units column then represent powers of 16 rather than powers of 10.
For example the decimal number 45 ( = 2 x 16 + 13 ) is written as 2D. Each group of four binary digits can be replaced by one hexadecimal digit so that the above program would become:
          3B
          00
          78
          B3
          78
          B4
          78
          B5
          78
          B6
          C7
          C7

which is shorter if no clearer.
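The correspondence can be checked quickly in Python, each hexadecimal digit standing for exactly four binary digits:

          n = 45
          print(bin(n))          # 0b101101
          print(hex(n))          # 0x2d   (2 x 16 + 13)
          print(int("2D", 16))   # 45
          print(int("3B", 16))   # 59, the value of the first byte in the example program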

A program written purely as a sequence of numbers is known as machine code. It is obviously very difficult to remember the meaning of each of the (possibly several hundred) different instructions written as a number and an additional problem is that the way a number is interpreted often depends on the previous instruction. For these reasons machine code was soon replaced by assembly language. In assembly language each instruction is given an abbreviated name, called a mnemonic or opcode, which describes what the instruction does. Examples are LD to mean load (i.e. put a number into a given location), ADD to add numbers together, and SRA which stands for Shift Right Arithmetically (i.e. move each binary digit one place to the right, which has the effect of dividing a number by two). The above program written in assembly language might look like this:-
          LD A, 0
          ADD (B3)
          ADD (B4)
          ADD (B5)
          ADD (B6)
          SRA A
          SRA A

which means:-

Put zero in the Accumulator (part of the processor)
Add in the contents of memory locations B3 - B6
Shift the result right twice to divide by four
and hence calculate the average.
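The effect of the two shift instructions can be checked with ordinary integer arithmetic (a line of Python, not part of the assembly language program):

          total = 12 + 7 + 9 + 4      # whatever happens to be in locations B3 to B6
          print(total >> 2)           # 8 - shifting right twice divides by four (remainder discarded)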

Such a program is easier to write and for another programmer to understand than one written in machine code, at least for small programs. However it is important to remember that the computer can only understand the original binary version and a program written in assembly language must be translated into numbers either manually or using another program called an assembler. (Since there is a direct correspondence between assembly language and machine code the translation process is simple and the programmer knows exactly which instructions the assembly language will be turned into.)
Machine code and assembly language are known as low level languages because they are based closely on the hardware of the computer. They are still used for writing programs which need to interact directly with the hardware of a computer system, such as the BIOS part of operating systems which have to read from and write to disk drives. They also operate very quickly and are used for the animation in games.
Low level languages have their disadvantages however. One is that they are specific to one particular model of processor and converting a program written in assembly language to run on a different computer is so difficult that it is usually easier to simply rewrite the program. Another is that it is almost impossible to work out what an assembly language program of any complexity actually does unless the original programmer has included plenty of explanation and one is familiar with the computer system in use.

Back to ContentsReturn to Contents List

High Level Languages

From the late 1950s onwards the move has been towards high level languages which tend to use a combination of English and mathematical notation. The idea behind them is that the programmer can concentrate more on what the program has to do and less on precisely how it is achieved. For example to print the letter A on the ninth row and 15th column of the screen using assembly language would usually require the programmer to know the address in memory where the screen image is stored (the base address), to multiply the row number by the number of columns per line and the height of characters in pixels, add the column number multiplied by the number of bytes per column and add the total to the base address. The bit patterns for the character A would then be loaded into memory starting at this address. The assembly language might look like this:-

LD R1, 9               ;the line number
LD R2, 15              ;the column number
LD R3, 80              ;the number of columns per line
LD R4, 16384           ;the start of screen memory
MUL R1, R3, R6         ;9x80 and store in R6
MUL R6, 8, R6          ;8 bytes per character
ADD R6, R2, R6         ;add the column number
                       ;(assume 1 byte per column)
ADD R6, R4, R6         ;add the base address
LD R5, %00011000       ;bit pattern for top line of A
LD (R6), R5            ;put 'A' into the calculated address
. . .                  ;continue with the rest of
. . .                  ;the bit patterns

When writing a section of program such as this it is very easy to get bogged down in the details and lose one's thread of thought about the program as a whole. By contrast the same instructions in a high level language could be as simple as:-
          PRINT AT 9, 15;'A'

which is much clearer and requires less time and effort from the programmer. There are many different high level languages in existence and whilst it is possible to write almost any program in any language it turns out that certain languages are more suited to some purposes than others. This sometimes leads to a new language being specially developed for large programming projects. (The American Department of Defense (sic) currently has over 500 programming languages in use.)
In all languages, spoken as well as computer, there is a grammar which must be adhered to if statements are to make sense. For computer languages the term syntax is used to refer to the allowable ways in which the words in the language can be put together.

Whatever high level language is used, a computer only understands machine code and so the high level language must be translated into machine code before the program can be run. This is sometimes done with a program (which is itself running as machine code) called an interpreter. When an interpreted language is running the interpreter takes each statement in turn, checks the syntax for errors, and decides what it means using a table of commands. Other parameters of the command, such as the line and column numbers in the above example, are also read in and then one of the pre-written machine code routines contained in the interpreter is called to carry out the instruction.
This approach is useful when developing programs because if an error occurs it is sometimes possible to correct the program and continue where it left off without needing to go back to the beginning. Since the program is stored in its high level form the interpreter can normally say where the error occurred and allow the programmer to modify the text ready for another attempt. The disadvantage of interpreters is that when a part of the program is used many times, which is often the case, the whole process of checking the syntax and deciding on what action to take has to be repeated each time. As a result interpreted languages run very slowly, typically tens or even hundreds of times more slowly than if the program had been written in assembly language.
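A toy interpreter can be sketched in a dozen lines of Python; this only illustrates the table look-up idea, it is not how any real interpreter is written:

def do_print(args):
    print(" ".join(args))

def do_add(args):
    print(int(args[0]) + int(args[1]))

commands = {"PRINT": do_print, "ADD": do_add}    # the table of known commands

program = ["PRINT Hello there", "ADD 2 3", "WIBBLE 7"]
for line in program:
    name, *args = line.split()
    if name not in commands:       # the syntax check, repeated every time the line is run
        print("Syntax error in line:", line)
    else:
        commands[name](args)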

The solution to this slowness is to use a compiler. A compiler reads through the entire program and converts it into machine code instructions which are stored in memory or on disk. When the program is run these instructions can be understood directly by the processor, resulting in a large speed increase compared to an interpreter. Since a compiler has to store extra information in the compiled version of a program to allow errors to be related to the original high level language version and because compilers cannot usually produce machine code which is quite as good as if it were hand written, compiled programs tend to take up more memory than ones written directly in assembly language and do not run quite as fast. Typically they are a few times slower than assembly language, but much faster than interpreters.
The main difficulty with compilers is that if the program stops because of an error it is more difficult to find out the cause and when the program has been corrected it must be completely recompiled and restarted from the beginning. Speed of compilation can vary from a few lines per second on the early microcomputers to thousands per second on a mainframe. A large program will contain several thousand lines to hundreds of thousands of lines and may take tens of person-years to write (by a team of programmers).

A brief description of some of the more common high level languages now follows:-

Back to ContentsReturn to Contents List

BASIC

- Beginners All-purpose Symbolic Instruction Code

Originally devised at Dartmouth College in the USA in 1964 to teach programming to students. BASIC was usually interpreted to make debugging of programs easier and was therefore very slow. Now however compilers are available and BASIC can run quite fast.
BASIC as first devised was an unstructured language, which meant that the entire program was one unit rather than being built up out of separate blocks. This meant that different parts of a program could interfere with each other (for example by using the same variable names) and made it extremely difficult to write large reliable programs. BASIC was designed when teletypes were the normal input device and to make program editing possible each line of the program was numbered and lines were executed in line number order unless a jump instruction was used. The use of line numbers remained a particular feature of BASIC until the 1990s.
As a teaching language, BASIC had the minimum number of complicated features possible and the only way to move between sections of the program was to use the GOTO command to jump to a new line number. A program containing many GOTOs is very difficult to understand or to correct and this led to (deserved) criticism that BASIC taught bad programming habits. BASIC became a popular language for home computers when Bill Gates of Microsoft co-wrote a (not particularly good) BASIC interpreter in the mid-1970s.

Most of the early home computers, from 1980 onwards, had BASIC built in (frequently a Microsoft version) and it was still often supplied as the 'free' programming language with home computers until the PC became ubiquitous. In fact PCs were until quite recently supplied with Microsoft's QBASIC, though few owners used it. More recently Visual BASIC has become a common way to develop small programs. This allows full and relatively easy use of the graphical interface of Windows PCs and has many advanced features, but bears little resemblance to the old-style BASICs. As a result BASIC is now one of the most widely used programming languages.
Unlike most other languages there was never a single definitive version of BASIC and so the language was gradually extended to include structured elements (such as named self-contained sub programs) and to add extra commands as the hardware improved. (Colour graphics and sound for example.) Some modern versions of BASIC now contain so many commands that it can no longer be considered a beginner's language. There exists a lot of snobbery in programming languages and, because of its roots as a teaching language and the limitations of early versions, BASIC is often looked down on by 'professional' programmers. The facts seem to be these:

  • BASIC is still poorly equipped with features for handling text, although it copes reasonably with short strings of characters.
  • Early BASICs had no good way of handling information of mixed types, for example records containing both numbers and text, though recent BASICs are much better.
  • Although modern versions of BASIC allow good programming methods they do not enforce them, and care is necessary to produce reliable large programs.
  • BASIC is a good 'number crunching' language. It has a reasonable range of built-in mathematical functions and it is usually possible to add extra user-defined functions.
  • Short, one-off programs for solving a particular problem can generally be written and got running more quickly in BASIC than in any other language.
  • BASICs tend to be better equipped than other, more standardised, languages to take advantage of newer computer features such as colour and WIMP systems.
  • This lack of standardisation often acted against BASIC, in that a program written for one computer required extensive changes to work on another.
  • A well written BASIC program can be clear and reliable. A poorly written one (and many are) is impossible to understand and uncertain in operation.

Example:

library "gemaes"
Sub FileSelect (Path$, Filename$, Ok%)
Local ScreenStore%, Rez%
Rez%=Peekw(Systab) 'get resolution
Select Case Rez%
Case 1
XLeft%=156:XRight%=486:YTop%=50:YBottom%=368
Case 2
XLeft%=156:XRight%=486:YTop%=25:YBottom%=184
Case 4
XLeft%=0:XRight%=319:YTop%=25:YBottom%=184
Case Else :Exit Sub
End Select
Redim ScreenStore%((2*Rez%*(YBottom%-YTop%+1)*((XRight%-XLeft%)\16+1)+6)\2+2)
Get(XLeft%,YTop%)-(XRight%,YBottom%),ScreenStore%
Fsel_Input Path$, Filename$, Ok%
Do Until Right$(Path$, 1)="\"
Path$=Left$(Path$, Len(Path$)-1)
Loop
Put(XLeft%, YTop%), ScreenStore%, Pset
Ok%=-Ok% 'to return a value of -1 (i.e. true) if no errors
End Sub


This program allows the operator to select a file from disk using the WIMP system on the Atari ST computer.

To sum up, BASIC seems set to continue to evolve and can be a very versatile language whilst remaining fairly easy to use. It is not suitable for all purposes however.

Return to Contents List

ALGOL

- ALGOrithmic Language.

An algorithm is a step by step procedure for solving a problem, named after the ninth-century mathematician al-Khwarizmi. ALGOL went through several versions, starting with ALGOL 58 in 1958, but the version usually meant today is ALGOL 68. ALGOL is always compiled.
ALGOL was probably the first fully structured high level language and was designed to be all-purpose. It is extremely logical in its approach and it is possible to write large reliable programs in it. One feature of ALGOL is that each command or statement returns a value which can be used as a parameter in another statement; this can make programs difficult to follow. ALGOL is a complex language with an excessively fussy syntax, and a considerable amount of effort is required to learn its details before any programs can be written. Part of the problem was the refusal of one of the language's main authors (Charles Lindsey) to acknowledge that it could be improved in any way. ALGOL was fairly widely used in the 1970s but is seldom heard of today.
Example:-
(.loc [-100:+100, -100:+100] .bool blocked :=
  ;.proc maze = (.int x, y, f) .bool:
    .if blocked[x, y] .then .false
    .elif x=0 .and y=0 .then .true
    .else .loc .bool found := .false, .loc .int d := f
      ;.while ( d := (d+1) .mod 4 ) /= f
         .and .not (found :=
           maze ( x + .case d+1 .in 0, -1, 0, +1 .esac
                , y + .case d+1 .in -1, 0, +1, 0 .esac
                , d ) )
      .do .skip .od;
      ; found
    .fi
  ;.print ( maze(0, 100, 0) ) )

This determines whether the centre of the maze can be reached from a given starting point.

Return to Contents List

Pascal

- Named after the French mathematician Blaise Pascal

Designed by Niklaus Wirth in 1968 as a derivative of ALGOL, Pascal was intended to be easy to compile, and as a result programs written in it tend to be fast. ("Pascal runs like Blaises.")
Pascal was designed as a language to teach good programming methods and to be difficult to make a mistake in. It has some good features, mainly allowing precise definition of the allowable values of variables. For example if a program needed to deal with the colours Red, White and Blue, in most languages it would be necessary to assign each colour a number (e.g. 1, 2 & 3) and simply store the number. This leads to problems such as what does 4 represent? In Pascal it is possible to define a new data type with :-
    TYPE COLOURS = (RED, WHITE, BLUE);
A variable which is of type COLOURS can then only contain one of the values RED, WHITE or BLUE and this helps to avoid some common programming errors.
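As a rough illustration (a fragment invented for this page, not taken from any particular compiler), a complete little program using such a type might be:-

    PROGRAM FLAG(OUTPUT);
    TYPE
      COLOURS = (RED, WHITE, BLUE);
    VAR
      STRIPE: COLOURS;
    BEGIN
      FOR STRIPE := RED TO BLUE DO
        CASE STRIPE OF
          RED:   WRITELN('Red stripe');
          WHITE: WRITELN('White stripe');
          BLUE:  WRITELN('Blue stripe')
        END
    END.

The compiler will reject any attempt to assign a value outside the list to STRIPE.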
Unfortunately Pascal is extremely limited in many respects. Its ability to handle numbers is poor and text handling is almost non-existent. One of the main advantages intended for Pascal was that it would be a standard language and a program written in it would work on any computer. (Assuming a Pascal compiler was available.) In practice the lack of features has led to Pascal being extended in non standard ways and this advantage has been lost. Pascal remains one of the easiest languages in which to learn the principles of structured programming but apart from a flurry of interest in the early 1980s when it was the fashionable language it is seldom used for large programming projects.

Return to Contents List

Modula 2

- Uses MODULAr programming

Derived from Pascal by Niklaus Wirth. A compiled language. Although Pascal allows a program to be broken down into separate operations, known as procedures, all the procedures used in a program must be part of that program. Modula 2 allows completely separate subprograms to be written which can be compiled individually before being slotted together. Each subprogram or module contains a definition of the input it requires and the output it produces, but what happens within the module is hidden. This approach to programming is useful for large programs written by a team of programmers and can produce reliable programs. It was only in the late 1980s that implementations of Modula 2 became widely available, but they tended to be difficult to use. If it had been developed further it could have become a popular language, but at the moment it has fallen out of favour.
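A minimal sketch of the idea (an illustrative fragment only; the two parts would normally live in separate files) might look like this:-

    DEFINITION MODULE Counter;
      (* only what is listed here is visible to other modules *)
      PROCEDURE Increment;
      PROCEDURE Value(): CARDINAL;
    END Counter.

    IMPLEMENTATION MODULE Counter;
      VAR count: CARDINAL;          (* hidden from the outside world *)

      PROCEDURE Increment;
      BEGIN
        count := count + 1
      END Increment;

      PROCEDURE Value(): CARDINAL;
      BEGIN
        RETURN count
      END Value;

    BEGIN
      count := 0                    (* run once when the module is loaded *)
    END Counter.

Other modules can call Counter.Increment and Counter.Value but can never touch the hidden variable count directly.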

Return to Contents List

ADA

Named after Ada Lovelace, daughter of the poet Lord Byron and close assistant to Charles Babbage.
Descended from Pascal and for many years the main language of the US Department of Defense. Ada is a highly structured language and the general opinion seems to be that it is precisely what one would expect of a civil service language - unnecessarily large, awkward to use and overall a waste of time and money.
Example:-
last_random : real := 0.5 + real'epsilon;   -- 'real' and 'trunc' are assumed to be declared elsewhere

procedure random (rr : out real) is
   a : constant real := last_random * 125.0;
begin
   last_random := a - trunc(a);
   rr := last_random;
end random;

This generates random numbers.

Return to Contents List

COBOL

- COmmon Business Oriented Language

Designed for writing business programs such as record keeping and stock control. COBOL is very verbose in written form and not very clear. It is well equipped with features for defining how information is to be entered and displayed, but has only limited ability to process that information. COBOL is the only common language to use binary coded decimal arithmetic. COBOL seems to have been largely superseded, but many 'legacy' systems in business still use it and some specialised database languages have their roots in it.
Example:-
01  DATA-RECORD.
    02  FIELD1    PIC A(3).
    02  VERSION1.
        03  F1    PIC A(3).
        03  F2    PIC AAX.
        03  F3    PIC 99 OCCURS 6 TIMES.
        03  F3A   REDEFINES F3.
            04  G14   PIC X(3).
            04  G56   PIC X(4).
    02  VERSION2  REDEFINES VERSION1.
        04  H1    PIC A(3)9(3)A(3).
        04  H2    PIC X(9).
    02  FIELD3    PIC X(4).

This defines the layout of a record in a database. Users of more recent and less 'flabby' languages tend to refer to COBOL users as "a load of old COBOLers".

Return to Contents List

FORTRAN

- FORmula TRANslation

The two best known versions are FORTRAN 66 and FORTRAN 77 (as in 1977), although Fortran 90 has since appeared. A compiled language.
FORTRAN was specifically designed for numerical calculations and is the best equipped language for this purpose. It can handle most types of arithmetic, including complex numbers, and for this reason is widely used in the scientific world. Unfortunately FORTRAN shares many of the problems of the early BASICs. It is difficult to exchange information between different parts of the program, and variable names are complicated by the fact that, unless a variable is explicitly declared, its type depends on the first letter of its name (I to N for integers, anything else for real numbers). Badly written FORTRAN programs can suffer from almost random errors.
There are two main reasons why FORTRAN is still used. Firstly it is a reasonably standardised language and so programs are 'portable' between computers. Secondly there exists a vast number of ready-written 'library' subroutines to cover such mathematical tasks as statistical calculations and solving differential equations. (e.g. the NAG or Numerical Algorithms Group library.) Since these subroutines are not easy to write well it is a considerable advantage to the programmer to have tested versions already available.
Example:-
      PROGRAM SQUARES
      INTEGER I
      INTEGER A(1:100)
      DO 9 I = 1, 100
        A(I) = I*I
    9 CONTINUE
      CALL PR(A)
      END

      SUBROUTINE PR(B)
      INTEGER B(1:100)
      DO 9 I = 1, 100
        PRINT *, B(I)
    9 CONTINUE
      END

This calculates the squares of the numbers 1 to 100 then prints them out.

Return to Contents List

LOGO

- From the Greek logos, meaning word or thought.

Designed by Dr Seymour Papert at MIT. Always interpreted.
LOGO is best known for its 'turtle' graphics which consist of a pointer on the screen (sometimes in the shape of a turtle) which responds to commands such as FORWARD 50 or RIGHT 90 to move it around the screen leaving a trail behind. Some versions had a robot turtle controlled by the computer with a pen attached which ran over a sheet of paper on the floor.
LOGO was originally intended to teach young children the principles of good programming whilst also learning geometry. LOGO encourages the user to explore the language and its capabilities interactively. LOGO is more than just a series of drawing commands however. It is a fully structured language which contains a relatively small number of built in features but allows the user to easily add more.
Take the following:-
TO SQUARE :SIDE
    FORWARD :SIDE  ;move forwards
    RIGHT 90       ;turn clockwise 90 degrees
    FORWARD :SIDE
    RIGHT 90
    FORWARD :SIDE
    RIGHT 90
    FORWARD :SIDE
END

This defines how to draw a square. The command SQUARE 40 would then draw a square 40 units to a side. LOGO is limited in its range of application, being useless for numerical applications for example.

Return to Contents List

LISP

- LISt Processing language (or alternatively Lots of Irritating Superfluous Parentheses)

Invented by John McCarthy at MIT in the late 1950s. LISP is based on the idea of variable length lists of 'objects' and trees (which are lists linked together in a specified way) as the fundamental data type. There is no real distinction between variables and commands in that each command returns a value and lists may contain commands which are carried out when the list is used. LISP is mainly used for 'Artificial Intelligence' programming because it allows items of information to be linked together.
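For a flavour of the notation (an illustrative fragment only), both programs and data are written as bracketed lists:-

    (setq shopping '(bread milk eggs))  ; a three-element list
    (car shopping)                      ; returns BREAD, the head of the list
    (cdr shopping)                      ; returns (MILK EGGS), the rest of the list
    (cons 'tea shopping)                ; returns (TEA BREAD MILK EGGS)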

Return to Contents List

PROLOG

- PROgramming in LOGic

Another language used in artificial intelligence. A PROLOG program consists of facts and rules, from which the system draws its own conclusions. The classic example of artificial intelligence is the expert system, such as those used for medical diagnosis. In this case a highly experienced consultant enters his accumulated wisdom into a computer system, which produces a series of rules and conclusions. This can then be used by a less experienced doctor, who will enter the symptoms of the illness together with any test results. The program will apply the rules which it has 'learnt' and list the possible diagnoses, perhaps including suggestions for further tests. It is important to remember that the expert system has no intelligence of its own and is only as good, or bad, as the information that was fed into it. At present such systems tend to be more artificial than intelligent.
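A tiny illustrative fragment (invented for this page, and certainly not a real medical system) shows the style of facts and rules involved:-

    symptom(fred, fever).
    symptom(fred, rash).

    diagnosis(Patient, measles) :-
        symptom(Patient, fever),
        symptom(Patient, rash).

The query ?- diagnosis(fred, Illness). would then reply Illness = measles.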

Return to Contents List

FORTH

- Possibly from 'fourth'

Originally designed in the late 1960s for the control of radio telescopes. Interpreted.
FORTH is intermediate between a low level and a high level language. It retains the ability to directly access the hardware of a computer and for this reason was often used for computer control, for example of model robots. It also allows the user to define new commands as in a high level language. The statements in FORTH are stored in a form which is partly compiled and as a result it runs much faster than a purely interpreted language such as early BASICs. FORTH looked set to become popular on home computers in the early 1980s but in practice it lost out to BASIC compilers and faster processors.

Example:-

: run
fast
9249 x ! 9339 y ! 9479 z !
0 a ! 6 k ! b  3 9249 c !
begin
 m z @ w !
 g w @ z !
 y @ w !
 g w @ y !
 x @ dup y @ =
 swap z @ = or
until
slow 999 999 beep ;
This section of FORTH program is the main control loop for a pacman-type game, which is certainly not obvious from reading it!
FORTH was not a particularly easy language in which to learn programming.

Return to Contents List

C

- So named because it was derived from a language called B, which was itself derived from BCPL

Devised by Dennis Ritchie in the early 1970s. Always compiled.
C is a language which allows access to the hardware of a computer in a similar way to machine code and is therefore often used to write operating systems and small utility programs. It has the advantage over assembly language of containing the advanced features of a high level language. C does not force the use of a structured style, however, and listings of C programs can be very cryptic. It is easy to make mistakes and programs tend to be difficult to debug. C became a fashionable language, which led to it being used in large programming projects, such as databases and spreadsheets, for which it was perhaps not best suited.
The main advantages of C are firstly that it is possible to write programs which operate as fast as if they had been written in assembly language and secondly that C is probably the most portable language. This means that a C program will in most cases work on any computer with a C compiler with little or no modification. This rare attribute is useful to commercial software writers who do not need to produce a different version of their latest program for each make of computer.
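As a flavour of the language (a minimal sketch, not tied to any particular compiler), the following prints the squares of the numbers 1 to 10:-

    #include <stdio.h>

    int main(void)
    {
        int i;
        for (i = 1; i <= 10; i++)                /* loop over 1..10 */
            printf("%d squared is %d\n", i, i * i);
        return 0;
    }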

Return to Contents List

C++

- Named after the increment operator in C

Inspired by a 1960s language called SIMULA, which was written for simulation work. (Also derived from SIMULA was Smalltalk, produced by Xerox PARC in the 1970s, which was the first language to use windows and mice.)
C++ is an object orientated extension of the C language. In most languages a program consists of a list of instructions to manipulate the input to produce the required output. In an object orientated language the program defines entities (objects) which represent the real world, and how those objects interact with each other. In practice it requires a considerable shift in programming style to work with object orientated languages, which many programmers find difficult. C++ has become a popular language since its properties are well suited to manipulating a graphical environment such as Windows.
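A minimal illustrative sketch of the object orientated style (invented for this page, not from any real program), in which a class bundles its data together with the operations allowed on it:-

    #include <iostream>

    class Rectangle {
        double width, height;                  // hidden from the rest of the program
    public:
        Rectangle(double w, double h) : width(w), height(h) {}
        double area() const { return width * height; }
    };

    int main()
    {
        Rectangle page(210.0, 297.0);          // an A4 page in millimetres
        std::cout << "Area: " << page.area() << " square mm\n";
        return 0;
    }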

Return to Contents List

Intercal

- Short for Compiler Language With No Pronounceable Acronym

Intercal was designed on the morning of 26th May 1972 at Princeton University by Donald R Woods and James M Lyon. The only current implementation is C-Intercal, a compiler written in C by Eric S. Raymond.
Intercal is a somewhat idiosyncratic language. It has an elegant solution to the problem of syntax errors: if the compiler (called Ick) does not understand part of a program it simply ignores it. Intercal makes creative use of the character set and a typical line of program may look like this:-

    PLEASE DO IGNORE .1<".1^C'&:51~"#V1^C!12~;&75SUB"V'V.1~

The format of input data to Intercal is numbers, the digits of which are spelt out in English, and the output is also numbers, printed as Roman numerals. Many academic programming experts believe that the GOTO statement, as used in BASIC, is harmful and leads to 'spaghetti code' which jumps around and is impossible to follow. Intercal avoids this criticism by not using the GOTO command. Instead C-Intercal uses the COME FROM statement. COME FROM specifies a line number from which the program will jump to the COME FROM statement, i.e. the opposite of a GOTO.
To make up for the lack of a conditional statement to determine whether or not a line is executed, Intercal provides the ABSTAIN FROM command, which specifies a line number which is not to be executed. Alternatively all statements of a given type can be abstained from. To revoke abstention the REINSTATE command is used.
The designers of Intercal believed that the source code of a program should be easy to understand, or failing that it should be polite. Each program line begins with one of DO, PLEASE or PLEASE DO. The compiler will reject programs which are not sufficiently polite or are excessively obsequious. The line may optionally continue with NOT, N'T or n%. NOT and N'T are self explanatory, while n% represents the percentage probability that the following statement is executed. The language has only five, rather esoteric, operators, which all act on the binary pattern of a number; however a library is supplied to deal with the more usual mathematical functions.
Intercal may be the language of the future. It creates jobs since it is widely observed that if a programming language is unfriendly, unwieldy and difficult to master the response of management is to recruit more programmers rather than improve it. Intercal also helps to sell computer hardware. The convoluted programming needed to achieve anything in Intercal ensures that it runs so slowly that you absolutely must have the latest and fastest processor just to get anything out of it. Intercal is not widely used.

Back to Computers Page

Back to Home Page

 

