The computer you are
using to read this page uses a microprocessor to do its work.
The microprocessor is the heart of any normal computer, whether it
is a desktop machine, a server, or a laptop. The microprocessor you
are using might be a Pentium, a K6, a PowerPC, a Sparc or any of the
many other brands and types of microprocessors, but they all do
approximately the same thing in approximately the same way.
If you have ever wondered what the microprocessor in your
computer is doing, or if you have ever wondered about the
differences between different microprocessors, then read on. You will
learn how fairly simple digital logic techniques allow a computer to
do its job, whether it's playing a game or spell checking a document!
Microprocessor History
A
microprocessor - also known as a CPU or Central Processing Unit - is
a complete computation engine that is fabricated on a single chip.
The first microprocessor was the Intel 4004, introduced in 1971. The
4004 was not very powerful - all it could do was add and subtract,
and it could only do that four bits at a
time. But it was amazing that everything was on one chip. Prior to
the 4004, engineers built computers either from collections of chips
or from discrete components (transistors wired one at a time). The
4004 powered one of the first portable electronic calculators.
The first microprocessor to make it into a home computer was the
Intel 8080, a complete 8-bit computer on one chip introduced in
1974. The first microprocessor to make a real splash in the market
was the Intel 8088, introduced in 1979 and incorporated into the IBM
PC (which first appeared in 1981). If you are familiar with
the PC market and its history, you know that the PC market moved
from the 8088 to the 80286 to the 80386 to the 80486 to the Pentium
to the Pentium-II to the new Pentium-III. All of these
microprocessors are made by Intel and all of them are improvements
on the basic design of the 8088. The new Pentium-IIIs can execute
any piece of code that ran on the original 8088, but the Pentium-III
runs about 3,000 times faster!
The following table helps you to understand the differences
between the different processors that Intel has introduced over the
years.
Name | Date | Transistors | Microns | Clock speed | Data width | MIPS | Notes
8080 | 1974 | 6,000 | 6 | 2 MHz | 8 bits | 0.64 | First home computers
8088 | 1979 | 29,000 | 3 | 5 MHz | 16 bits, 8-bit bus | 0.33 | First IBM PC
80286 | 1982 | 134,000 | 1.5 | 6 MHz | 16 bits | 1 | IBM ATs. Up to 2.66 MIPS at 12 MHz
80386 | 1985 | 275,000 | 1.5 | 16 MHz | 32 bits | 5 | Eventually 33 MHz, 11.4 MIPS
80486 | 1989 | 1,200,000 | 1 | 25 MHz | 32 bits | 20 | Eventually 50 MHz, 41 MIPS
Pentium | 1993 | 3,100,000 | 0.8 | 60 MHz | 32 bits, 64-bit bus | 100 | Eventually 200 MHz
Pentium II | 1997 | 7,500,000 | 0.35 | 233 MHz | 32 bits, 64-bit bus | 400? | Eventually 450 MHz, 800 MIPS?
Pentium III | 1999 | 9,500,000 | 0.25 | 450 MHz | 32 bits, 64-bit bus | 1,000? |
Compiled from The Intel Microprocessor Quick Reference Guide
What is a Chip?
A chip is also called an integrated circuit. Generally it is a
small, thin piece of silicon onto which the transistors making
up the microprocessor have been etched. A chip might be as
large as an inch on a side and can contain as many as 10
million transistors. Simpler processors might consist of a few
thousand transistors etched onto a chip just a few millimeters
square. See How Chips Are Made for details on how transistors are
fabricated on silicon.
Information about this table:
- The date is the year that the processor was first
introduced. Many processors are re-introduced at higher clock
speeds for many years after the original release date.
- Transistors is the number of transistors on the chip.
You can see that the number of transistors on a single chip has
risen steadily over the years.
- Microns is the width, in microns, of the smallest wire
on the chip. For comparison, a human hair is 100 microns thick. As
the feature size on the chip goes down, the number of transistors
rises.
- Clock speed is the maximum rate at which the chip can be
clocked. Clock speed will make more sense in the next section.
- Data Width is the width of the ALU. An 8-bit ALU can
add/subtract/multiply/etc. two 8-bit numbers, while a 32-bit ALU
can manipulate 32-bit numbers. An 8-bit ALU would have to execute
4 instructions to add two 32-bit numbers, while a 32-bit ALU can
do it in one instruction. In many cases the external data bus is
the same width as the ALU, but not always. The 8088 had a 16-bit
ALU and an 8-bit bus, while the modern Pentiums fetch data 64 bits
at a time for their 32-bit ALUs.
- MIPS stands for Millions of Instructions Per Second,
and is a rough measure of the performance of a CPU. Modern CPUs
can do so many different things that MIPS ratings lose a lot of
their meaning, but you can get a general sense of the relative
power of the CPUs from this column.
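To make the data width point concrete, here is a minimal sketch of how a 32-bit addition can be built from four 8-bit additions, passing the carry from one byte to the next. The function names (add8, add32_with_8bit_alu) are made up for this illustration and do not describe any particular chip.

```python
def add8(a, b, carry_in=0):
    """Model an 8-bit adder: return (8-bit sum, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def add32_with_8bit_alu(x, y):
    """Add two 32-bit values one byte at a time, lowest byte first."""
    result = 0
    carry = 0
    for i in range(4):                      # four separate 8-bit additions
        a = (x >> (8 * i)) & 0xFF           # byte i of the first number
        b = (y >> (8 * i)) & 0xFF           # byte i of the second number
        s, carry = add8(a, b, carry)        # the carry ripples into the next byte
        result |= s << (8 * i)
    return result, carry


total, carry = add32_with_8bit_alu(0x0001FFFF, 0x00000001)
print(hex(total), carry)   # 0x20000 0
```

A 32-bit ALU does the same work in a single step, which is why it needs only one instruction.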
From this table you
can see that, in general, there is a relationship between clock
speed and MIPS. The maximum clock speed is a function of the
manufacturing process and delays within the chip. There is also a
relationship between the number of transistors and MIPS. For
example, the 8088 clocked at 5 MHz but only executed at 0.33 MIPS
(about 1 instruction per 15 clock cycles). Modern processors can
often execute at a rate of 2 instructions per clock cycle. That
improvement is directly related to the number of transistors on the
chip and will make more sense in the next section.
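The "1 instruction per 15 clock cycles" figure comes straight from dividing clock speed by MIPS. The short sketch below repeats that arithmetic for several rows of the table; the numbers are the table's own, and only the dictionary and function names are invented here.

```python
# (clock speed in MHz, MIPS) taken from the table above
chips = {
    "8080": (2, 0.64),
    "8088": (5, 0.33),
    "80286": (6, 1),
    "80386": (16, 5),
    "80486": (25, 20),
    "Pentium": (60, 100),
}


def cycles_per_instruction(clock_mhz, mips):
    """Millions of cycles per second divided by millions of instructions per second."""
    return clock_mhz / mips


for name, (mhz, mips) in chips.items():
    cpi = cycles_per_instruction(mhz, mips)
    print(f"{name:8s} {cpi:5.1f} clock cycles per instruction")
```

A result below 1.0, as for the Pentium row, means the chip finishes more than one instruction per clock cycle on average.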
Inside a Microprocessor
To
understand how a microprocessor works, it is helpful to look inside
and learn about the logic used to create one. In the process you can
also learn about assembly language - the native language of a
microprocessor - and many of the things that engineers can do to
boost the speed of a processor.
A microprocessor executes a collection of machine instructions
that tell the processor what to do. Based on the instructions, a
microprocessor does three basic things:
- Using its ALU (Arithmetic/Logic Unit), a microprocessor can
perform mathematical operations like addition, subtraction,
multiplication and division. Modern microprocessors contain
complete floating point processors that can perform extremely
sophisticated operations on large floating point numbers.
- A microprocessor can move data from one memory location to
another.
- A microprocessor can make decisions and jump to a new set of
instructions based on those decisions.
There may be very
sophisticated things that a microprocessor does, but those are its
three basic activities. The following diagram shows an extremely
simple microprocessor capable of doing those three things:
This is about as simple as a microprocessor gets. This
microprocessor has:
- an address bus (that may be 8, 16 or 32 bits wide) that
sends an address to memory
- a data bus (that may be 8, 16 or 32 bits wide) that can
send data to memory or receive data from memory
- an RD (Read) and a WR (Write) line to tell the
memory whether it wants to set or get the addressed location
- a clock line that lets a clock pulse sequence the processor
- a reset line that resets the program counter to zero (or
some other fixed starting value) and restarts execution.
Let's assume that both
the address and data buses are 8 bits wide in this example.
Here are the components of this simple microprocessor:
- Registers A, B and C are simply latches made out of flip-flops
(See the section on "edge-triggered latches" in How Boolean Logic
Works for details).
- The address latch is just like registers A, B and C.
- The program counter is a latch with the extra ability to
increment by 1 when told to do so, and also to reset to zero when
told to do so.
- The ALU could be as simple as an 8-bit adder (See the section
on adders in How Boolean Logic
Works for details), or it might be able to add, subtract,
multiply and divide 8-bit values. Let's assume the latter here.
- The test register is a special latch that can hold values from
comparisons performed in the ALU. An ALU can normally compare two
numbers and determine if they are equal, if one is greater than
the other, etc. The test register can also normally hold a carry
bit from the last stage of the adder. It stores these values in
flip-flops and then the instruction decoder can use the values to
make decisions.
- There are 6 boxes marked "3-State" in the diagram. These are
tri-state buffers. A tri-state buffer can pass a 1, a 0 or
it can essentially disconnect its output (imagine a switch that
totally disconnects the output line from the wire the output is
heading toward). A tri-state buffer allows multiple outputs to
connect to a wire, but only one of them to actually drive a 1 or a
0 onto the line.
- The instruction register and instruction decoder are
responsible for controlling all of the other components.
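The tri-state idea is easier to see in a small model. The sketch below is only an illustration: the class name, the use of None to stand for the disconnected state, and the choice to raise an error when two buffers drive the bus at once are all assumptions made for this example.

```python
class TriStateBuffer:
    """A buffer that either drives its value onto the bus or disconnects."""

    def __init__(self, name, value=0):
        self.name = name
        self.value = value
        self.enabled = False          # disconnected until the decoder enables it

    def output(self):
        # None stands for the disconnected (high-impedance) state
        return self.value if self.enabled else None


def read_bus(buffers):
    """Return the value on the shared wire, or None if nobody is driving it."""
    driving = [b for b in buffers if b.output() is not None]
    if len(driving) > 1:
        names = ", ".join(b.name for b in driving)
        raise RuntimeError("bus contention between " + names)
    return driving[0].output() if driving else None


reg_a = TriStateBuffer("A", value=0x12)
reg_c = TriStateBuffer("C", value=0x34)
bus = [reg_a, reg_c]

reg_c.enabled = True
print(hex(read_bus(bus)))   # 0x34
```

Both buffers are wired to the same bus, but only the enabled one decides what value appears on it.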
Although they are not shown in
this diagram, there would be control lines from the instruction
decoder that would:
- Tell the A register to latch the value currently on the data
bus.
- Tell the B register to latch the value currently on the data
bus.
- Tell the C register to latch the value currently on the data
bus.
- Tell the program counter register to latch the value currently
on the data bus.
- Tell the address register to latch the value currently on the
data bus.
- Tell the instruction register to latch the value currently on
the data bus.
- Tell the program counter to increment.
- Tell the program counter to reset to zero.
- Activate any of the 6 tri-state buffers (6 separate lines).
- Tell the ALU what operation to perform.
- Tell the test register to latch the ALU's test bits.
- Activate the RD line.
- Activate the WR line.
Coming into the instruction
decoder are the bits from the test register and the clock line, as
well as the bits from the instruction register.
RAM and ROM
The previous
section talked about the address and data buses, as well as the RD
and WR lines. These buses and lines connect either to RAM or ROM -
generally both. In our sample microprocessor we have an address bus
8 bits wide and a data bus 8 bits wide. That means that the
microprocessor can address 2^8 = 256 bytes of memory, and
it can read or write 8 bits of the memory at a time. Let's assume
that this simple microprocessor has 128 bytes of ROM starting at
address 0 and 128 bytes of RAM starting at address 128.
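Here is a minimal sketch of that memory map: one 8-bit address space in which addresses 0 to 127 select ROM and addresses 128 to 255 select RAM. The Memory class is hypothetical, and quietly ignoring writes to ROM is an assumption of this example rather than something the text specifies.

```python
ROM_SIZE = 128   # addresses 0..127
RAM_SIZE = 128   # addresses 128..255


class Memory:
    def __init__(self, rom_bytes):
        if len(rom_bytes) > ROM_SIZE:
            raise ValueError("program does not fit in ROM")
        # Unused ROM locations are padded with zeros
        self.rom = bytes(rom_bytes).ljust(ROM_SIZE, b"\x00")
        self.ram = bytearray(RAM_SIZE)

    def read(self, address):
        address &= 0xFF                      # the address bus is 8 bits wide
        if address < ROM_SIZE:
            return self.rom[address]
        return self.ram[address - ROM_SIZE]

    def write(self, address, value):
        address &= 0xFF
        if address < ROM_SIZE:
            return                           # assumption: writes to ROM do nothing
        self.ram[address - ROM_SIZE] = value & 0xFF   # the data bus is 8 bits wide


memory = Memory([3, 7, 4, 128, 18])
memory.write(128, 42)
print(memory.read(0), memory.read(128))   # 3 42
```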
ROM stands for
Read-Only Memory. A ROM chip is programmed with a permanent
collection of pre-set bytes. The address bus tells the ROM chip
which byte to get and place on the data bus. When the RD line
changes state, the ROM chip presents the selected byte onto the data
bus.
RAM stands for
Random Access Memory. RAM contains bytes of information and the
microprocessor can read or write to those bytes depending on whether
the RD or WR line is signaled. One problem with today's RAM chips is
that they forget everything once the power goes off. That is why
the computer needs ROM.
By the way, nearly all computers contain some amount of ROM. It
is possible to create a simple computer that contains no external RAM
(many microcontrollers do this by placing a handful of RAM bytes on
the processor chip itself), but it is generally impossible to create
one that contains no ROM. On a PC, the ROM is called the BIOS (Basic
Input/Output System). When the microprocessor starts, it begins
executing instructions it finds in the BIOS. The BIOS instructions
do things like testing the hardware in the machine, and then they go
to the hard disk to fetch the boot sector (see How Hard Disks
Work for details). This boot sector is another small program,
and the BIOS stores it in RAM after reading it off the disk. The
microprocessor then begins executing the boot sector's instructions
from RAM. The boot sector program will tell the microprocessor to
fetch something else from the hard disk into RAM, which the
microprocessor then executes, and so on. This is how the
microprocessor loads and executes the entire operating
system.
Understanding Microprocessor Instructions
Even the incredibly simple microprocessor
shown in the previous example will have a fairly large set of
instructions that it can perform. The collection of instructions is
implemented as bit patterns, each one of which has a different
meaning when loaded into the instruction register. Humans are not
particularly good at remembering bit patterns, so a set of short
words is defined to represent the different bit patterns. This
collection of words is called the assembly language of the
processor. An assembler can translate the words into their
bit patterns very easily, and then the output of the assembler is
placed in memory for the microprocessor to execute.
Here's the set of assembly language instructions that the
designer might create for the simple microprocessor shown above:
- LOADA mem - Load register A from memory address
- LOADB mem - Load register B from memory address
- CONB con - Load a constant value into register B
- SAVEB mem - Save register B to memory address
- SAVEC mem - Save register C to memory address
- ADD - Add A and B and store the result in C
- SUB - Subtract A and B and store the result in C
- MUL - Multiply A and B and store the result in C
- DIV - Divide A and B and store the result in C
- COM - Compare A and B and store result in test
- JUMP addr - Jump to an address
- JEQ addr - Jump if equal, to address
- JNEQ addr - Jump if not equal, to address
- JG addr - Jump if Greater than, to address
- JGE addr - Jump if Greater than or equal, to address
- JL addr - Jump if Less than, to address
- JLE addr - Jump if Less than or equal, to address
- STOP - Stop execution
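One way to get a feel for this instruction set is to run it in a tiny interpreter. The sketch below is an illustration only, and it makes several assumptions the list above does not state: SUB computes A minus B, DIV is integer division of A by B, results wrap around at 8 bits, and jump targets are positions in the program list rather than byte addresses. The sample program (adding up the numbers 1 through 5) is made up for this example.

```python
def run(program, memory, max_steps=10_000):
    """Run a list of (mnemonic, operand) pairs and return the memory."""
    a = b = c = 0      # the three registers
    test = 0           # result of the last COM: -1, 0 or 1
    pc = 0             # program counter (an index into the program list)
    jump_if = {
        "JUMP": lambda t: True,
        "JEQ": lambda t: t == 0,
        "JNEQ": lambda t: t != 0,
        "JG": lambda t: t > 0,
        "JGE": lambda t: t >= 0,
        "JL": lambda t: t < 0,
        "JLE": lambda t: t <= 0,
    }
    for _ in range(max_steps):
        op, arg = program[pc]
        pc += 1
        if op == "LOADA":
            a = memory[arg]
        elif op == "LOADB":
            b = memory[arg]
        elif op == "CONB":
            b = arg
        elif op == "SAVEB":
            memory[arg] = b
        elif op == "SAVEC":
            memory[arg] = c
        elif op == "ADD":
            c = (a + b) & 0xFF
        elif op == "SUB":
            c = (a - b) & 0xFF       # assumption: A minus B
        elif op == "MUL":
            c = (a * b) & 0xFF
        elif op == "DIV":
            c = a // b               # assumption: integer division, A by B
        elif op == "COM":
            test = (a > b) - (a < b)
        elif op in jump_if:
            if jump_if[op](test):
                pc = arg
        elif op == "STOP":
            return memory
        else:
            raise ValueError("unknown instruction " + op)
    raise RuntimeError("step limit reached without STOP")


# Hypothetical program: total (address 128) = 1 + 2 + 3 + 4 + 5
program = [
    ("CONB", 0), ("SAVEB", 128),                              # total = 0
    ("CONB", 1), ("SAVEB", 129),                              # i = 1
    ("LOADA", 129), ("CONB", 5), ("COM", None), ("JG", 17),   # if i > 5, stop
    ("LOADA", 128), ("LOADB", 129), ("ADD", None), ("SAVEC", 128),  # total += i
    ("LOADA", 129), ("CONB", 1), ("ADD", None), ("SAVEC", 129),     # i += 1
    ("JUMP", 4),                                              # back to the test
    ("STOP", None),
]
memory = run(program, [0] * 256)
print(memory[128])   # 15
```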
So now the question is, "How do all of these instructions look in
ROM?" Each of these assembly language instructions must be
represented by a binary number. For the sake of simplicity, let's
assume each assembly language instruction is given a unique number,
like this:
- LOADA - 1
- LOADB - 2
- CONB - 3
- SAVEB - 4
- SAVEC - 5
- ADD - 6
- SUB - 7
- MUL - 8
- DIV - 9
- COM - 10
- JUMP - 11
- JEQ - 12
- JNEQ - 13
- JG - 14
- JGE - 15
- JL - 16
- JLE - 17
- STOP - 18
The numbers are known as opcodes. In
ROM, our little program would look like this:
You can see that 7 lines of C code became 17 lines of assembly
language and that became 31 bytes in ROM.
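As a rough illustration of what an assembler does with these opcodes, here is a minimal sketch. It assumes that each instruction becomes one opcode byte, followed by one operand byte when the instruction takes an operand, and it takes JGE to be opcode 15 so that the numbering runs in sequence. The three-line program is a made-up example, not the program discussed in this article.

```python
OPCODES = {
    "LOADA": 1, "LOADB": 2, "CONB": 3, "SAVEB": 4, "SAVEC": 5,
    "ADD": 6, "SUB": 7, "MUL": 8, "DIV": 9, "COM": 10,
    "JUMP": 11, "JEQ": 12, "JNEQ": 13, "JG": 14, "JGE": 15,
    "JL": 16, "JLE": 17, "STOP": 18,
}

# Instructions that are followed by a one-byte operand
HAS_OPERAND = {
    "LOADA", "LOADB", "CONB", "SAVEB", "SAVEC",
    "JUMP", "JEQ", "JNEQ", "JG", "JGE", "JL", "JLE",
}


def assemble(source):
    """Translate assembly text into the bytes that would be stored in ROM."""
    rom = bytearray()
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue                          # skip blank lines
        mnemonic = parts[0]
        rom.append(OPCODES[mnemonic])         # the opcode byte
        if mnemonic in HAS_OPERAND:
            rom.append(int(parts[1]) & 0xFF)  # the operand byte
    return bytes(rom)


rom = assemble("""
CONB 7
SAVEB 128
STOP
""")
print(list(rom))   # [3, 7, 4, 128, 18]
```

With this encoding, an instruction that takes an operand occupies two bytes and one without an operand occupies a single byte, so a program's size in ROM is larger than its line count.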
The instruction decoder needs to turn each of the opcodes into a
set of signals that drive the different components inside the
microprocessor. Let's take the ADD instruction as an example and
look at what it needs to do:
- During the first clock cycle we need to actually load the
instruction. Therefore the instruction decoder needs to:
- activate the tri-state buffer for the program counter
- activate the RD line
- activate the data-in tri-state buffer
- latch the instruction into the instruction register
- During the second clock cycle the ADD instruction is decoded.
It needs to do very little:
- Set the operation of the ALU to addition
- Latch the output of the ALU into the C register
- During the third clock cycle, the program counter is
incremented (in theory this could be overlapped into the second
clock cycle).
Every instruction can be broken down as a
set of sequenced operations like these that manipulate the
components of the microprocessor in the proper order. Some
instructions, like this ADD instruction, might take 2 or 3 clock
cycles. Others might take 5 or 6 clock cycles.
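The three clock cycles of ADD can be written down as a small table of control signals, one set per cycle. This is only a sketch: the signal names are invented for the example, and the whole microprocessor is reduced to a dictionary holding the registers.

```python
# Control signals the instruction decoder asserts on each clock cycle of ADD
ADD_STEPS = [
    {"pc_buffer", "rd", "data_in_buffer", "latch_ir"},   # cycle 1: fetch
    {"alu_add", "latch_c"},                              # cycle 2: add into C
    {"increment_pc"},                                    # cycle 3: next address
]


def clock_tick(state, rom, signals):
    """Apply one cycle's worth of control signals to the processor state."""
    if {"pc_buffer", "rd", "data_in_buffer", "latch_ir"} <= signals:
        # The program counter addresses memory; the byte read is latched
        state["IR"] = rom[state["PC"]]
    if {"alu_add", "latch_c"} <= signals:
        state["C"] = (state["A"] + state["B"]) & 0xFF
    if "increment_pc" in signals:
        state["PC"] = (state["PC"] + 1) & 0xFF


state = {"A": 2, "B": 3, "C": 0, "PC": 0, "IR": 0}
rom = [6]                       # opcode 6 is ADD
for signals in ADD_STEPS:
    clock_tick(state, rom, signals)
print(state)   # {'A': 2, 'B': 3, 'C': 5, 'PC': 1, 'IR': 6}
```

An instruction that also has to read an operand from memory would simply need more entries in its list of steps, which is why it takes more clock cycles.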
Performance
The number of
transistors available has a huge effect on the performance of a
processor. As seen earlier, a typical instruction in a processor
like an 8088 took 15 clock cycles to execute. Because of the design
of the multiplier, it took approximately 80 cycles just to do one
16-bit multiplication on the 8088. With more transistors, much more
powerful multipliers capable of single-cycle speeds become possible.
More transistors also allow a technology called
pipelining. In a pipelined architecture, instruction
execution overlaps. So even though it might take 5 clock cycles to
execute each instruction, there can be 5 instructions in various
stages of execution simultaneously. That way it looks like one
instruction completes every clock cycle.
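The arithmetic behind pipelining is simple enough to sketch. The functions below assume an ideal pipeline in which every instruction takes the same number of stages and nothing ever stalls, which real processors only approximate; the function names are made up for this example.

```python
def cycles_without_pipeline(instructions, stages):
    """Each instruction runs start to finish before the next one begins."""
    return instructions * stages


def cycles_with_pipeline(instructions, stages):
    """The first instruction fills the pipeline, then one completes per cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


n, stages = 100, 5
print(cycles_without_pipeline(n, stages))   # 500
print(cycles_with_pipeline(n, stages))      # 104
```

For a long run of instructions, the pipelined count approaches one cycle per instruction even though each instruction still takes 5 cycles from start to finish.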
Many modern processors have multiple instruction decoders, each
with its own pipeline. This allows multiple instruction streams,
which means more than one instruction can complete during each clock
cycle. This technique can be quite complex to implement, so it takes
lots of transistors.
The trend in processor design has been toward full 32-bit ALUs
with fast floating point processors built in and pipelined execution
with multiple instruction streams. There has also been a tendency
toward special instructions (like the MMX instructions) that make
certain operations particularly efficient. There has also been the
addition of hardware virtual memory support and L1 caching on the
processor chip. All of these trends push up the transistor count,
leading to the multi-million transistor powerhouses available today.
These processors can execute about one billion instructions per
second!