To understand the basic idea behind a cache system, let's start with a
super-simple example that uses a librarian to demonstrate caching
concepts. Let's imagine a librarian behind his desk. He is there to give
you the books you ask for. For the sake of simplicity, let's say you can't
get the books yourself -- you have to ask the librarian for any book you
want to read, and he fetches it for you from a set of stacks in a
storeroom (the Library of Congress in Washington, D.C., is set up this
way). First, let's consider a librarian without a cache.
The first customer arrives. He asks for the book
Moby Dick. The librarian goes into the storeroom, gets the
book, returns to the counter and gives the book to the customer. Later,
the customer comes back to return the book. The librarian takes the book
and returns it to the storeroom. He then waits at his counter for
another customer. Let's say the next customer asks for Moby Dick
(you saw it coming...). The librarian then has to return to the storeroom
to get the book he recently handled and give it to the customer. Under this
model, the librarian has to make a complete round trip to fetch every book
-- even very popular ones that are requested frequently. Is there a way to
improve the performance of the librarian?
Yes, there's a way -- we can put a cache on the librarian. Let's
give the librarian a backpack into which he will be able to store 10 books
(in computer terms, the librarian now has a 10-book cache). In this
backpack, he will put the books the clients return to him, up to a maximum
of 10. Let's use the prior example, but now with our new-and-improved
caching librarian.
The day starts. The librarian's backpack is empty. Our first
customer arrives and asks for Moby Dick. No magic here -- the
librarian has to go to the storeroom to get the book. He gives it to the
customer. Later, the customer returns and gives the book back to the
librarian. Instead of walking back to the storeroom to re-shelve the book,
the librarian puts it in his backpack and stays at the counter (he checks
first to see whether the bag is full -- more on that later). Another
customer arrives and asks for Moby Dick. Before going to the storeroom,
the librarian checks to see if this title is in his backpack. He finds it!
All he has to do is take the book from the backpack and hand it to the
customer. There's no journey into the storeroom, so the customer is served
more efficiently.
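The librarian's routine maps directly onto the check-then-fetch logic of a real cache. Here is a minimal sketch in Python; the names (`backpack`, `storeroom`, `fetch_from_storeroom`, and the sample titles) are invented for illustration, and real caches usually evict an old entry when full rather than simply refusing new ones.

```python
CAPACITY = 10  # the backpack holds at most 10 books

backpack = {}  # the cache: small and fast to search
storeroom = {  # the large, slow storage in the back
    "Moby Dick": "book: Moby Dick",
    "Walden": "book: Walden",
}

def fetch_from_storeroom(title):
    # The slow path: a full round trip to the stacks.
    return storeroom[title]

def request_book(title):
    if title in backpack:                 # cache hit: serve it immediately
        return backpack[title]
    return fetch_from_storeroom(title)    # cache miss: make the trip

def return_book(title, book):
    # Returned books go into the backpack, up to its capacity.
    if len(backpack) < CAPACITY:
        backpack[title] = book

book = request_book("Moby Dick")   # first request: a miss, storeroom trip
return_book("Moby Dick", book)     # the returned book is now cached
book = request_book("Moby Dick")   # second request: a hit, no trip
```

The second request for Moby Dick never touches the storeroom, which is exactly the speedup the backpack buys the librarian.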
What if a customer asks for a title that's not in the cache (the backpack)?
In this case, the librarian is slightly less efficient with a cache than
without one, because he takes the time to look for the book in his
backpack first. One of the challenges of cache design is to minimize the
impact of cache searches, and modern hardware has reduced this time delay
to practically zero. Even in our simple librarian example, the latency
time (the waiting time) of searching the cache is so small compared to the
time to walk back to the storeroom that it is irrelevant. The cache is
small (10 books), and the time it takes to notice a miss is only a tiny
fraction of the time that a journey to the storeroom takes.
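This trade-off can be made concrete with the standard average-access-time calculation: every request pays the small search cost, but only misses also pay for the trip. The numbers below are made up purely for illustration; they are not from the text.

```python
# Illustrative (hypothetical) timings for the librarian:
search_time = 2     # seconds to check the backpack
trip_time = 120     # seconds for a storeroom round trip
hit_rate = 0.5      # fraction of requests found in the backpack

# Average time per request with the cache:
# every request pays search_time; misses also pay trip_time.
average = search_time + (1 - hit_rate) * trip_time   # 62.0 seconds

# Without the cache, every request costs the full trip_time (120 seconds),
# so the cache wins on average despite the search overhead on misses.
```

As long as the search is cheap and some requests repeat, the average comes out well below the no-cache cost, which is why the overhead of checking the backpack is considered irrelevant.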
From this example you can see several important facts about caching:
- Cache technology is the use of a faster but smaller memory type to
accelerate a slower but larger memory type.
- When using a cache, you must check the cache to see if an item is in
there. If it is there, it's called a cache hit. If not, it is
called a cache miss and the computer must wait for a round trip
from the larger, slower memory area.
- A cache has some maximum size that is much smaller than the larger
storage area.
- It is possible to have multiple layers of cache. With our librarian
example, the smaller but faster memory type is the backpack, and the
storeroom represents the larger and slower memory type. This is a
one-level cache. There might be another layer of cache consisting of a
shelf that can hold 100 books behind the counter. The librarian can
check the backpack, then the shelf and then the storeroom. This would be
a two-level cache.
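The backpack-then-shelf-then-storeroom order in the last point can be sketched as a lookup that falls through each level in turn. This is a simplified illustration (it doesn't move books between levels on a hit, as real multi-level caches typically do); all the names and titles are invented.

```python
backpack = {}                         # level 1: 10 books, fastest to search
shelf = {"Walden": "book: Walden"}    # level 2: 100 books, slower
storeroom = {                         # backing store: everything, slowest
    "Moby Dick": "book: Moby Dick",
    "Walden": "book: Walden",
}

def request_book(title):
    for level in (backpack, shelf):   # check each cache level in order
        if title in level:
            return level[title]       # hit at this level: stop searching
    return storeroom[title]           # miss in both levels: full trip
```

A request for Walden is satisfied at the shelf, while a request for Moby Dick falls all the way through to the storeroom.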