|
Computational
Biology
A New Paradigm
Sure, you know about your regular biology. Cells,
photosynthesis, reproduction, prokaryotes, and all of that
other stuff. What you probably aren't used to seeing much of
in regular biology is physics, and all of the math that comes
along with it. Yet, if physics is meant to be the
mathematical language that describes all of nature, why
don't we see more of it in biology class? In physics, we can
calculate the trajectories of planets, the mass of planets,
the motions of objects, right? So why can't we calculate the
workings of viruses, how DNA replicates, whether a drug
could destroy cancer cells; why must we physically carry out
all of these experiments in a lab? Well, the answer is that
we don't have to. The catch, though, is that it is not so
easy to calculate the outcomes of such situations.
Thomas Kuhn, in his book The
Structure of Scientific Revolutions, describes two
kinds of science: normal and revolutionary. The normal
science is much like the stuff you did in high school
chemistry labs; everything is done in accordance with some
paradigm. Your procedure is drawn up based on the current
rules of this paradigm, and your results will only support
that paradigm. Revolutionary science, on the other hand,
contradicts the paradigm. It extends it, and creates a new
one in its place. When you finish your high school chemistry
lab, you usually don't expect to rewrite your
textbook with your results. However, when Newton created
physics, he was replacing the prevalent paradigm. Every
other attempt to describe the universe's workings was thrown
out in favor of Newton's laws. Later, when Einstein devised
relativity, it was then realized that Newton's paradigm was
insufficient. Biology has recently experienced such a
paradigm shift.
It all began in 1926, when
Erwin Schrodinger developed the famous wave equation which
today bears his name. This wave equation would become the
foundation of quantum mechanics, a field which in itself was
revolutionary science. Newton's laws of mechanics would
never have been able to describe the properties of systems
the size of molecules. They could not have properly described
the energy levels of a T1 cell, for instance; quantum
mechanics could. The problem with Schrodinger's equation,
however, was its complexity. It looks rather simple, right?
Unfortunately, however, it is impossible to solve exactly for
anything bigger than a hydrogen atom (due to the infamous
many-body problem). It can still be used, though, via
methods of approximation, most of which come from the field
of variational calculus. Even so, the problem remains
extremely difficult to work with. Just performing an
energy calculation on a molecule like benzene would require
you to solve millions upon millions of integrals. And not the
easy kind of integrals, either: they usually call for such
involved approaches as Fourier transforms. To say
the least, back in the first half of this century, solving
Schrodinger's equation for polyelectronic molecules was
ridiculous. People would think you either had no idea what
you were talking about or were crazy. To say you were overly
optimistic would be an absolute understatement.
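To make "methods of approximation" a little more concrete, here is a minimal sketch of the variational idea. It uses the hydrogen atom only because the exact answer (-0.5 hartree) is known and can be compared against. The Gaussian trial wavefunction exp(-alpha * r**2), the textbook energy expression for it in atomic units, and the golden-section search are all assumptions chosen for illustration; none of them come from the methods discussed above.

```python
import math

def trial_energy(alpha: float) -> float:
    """Energy (hartree) of hydrogen for the trial function exp(-alpha * r**2).

    Standard textbook result in atomic units:
    kinetic part 3*alpha/2, Coulomb part -2*sqrt(2*alpha/pi).
    """
    kinetic = 1.5 * alpha
    potential = -2.0 * math.sqrt(2.0 * alpha / math.pi)
    return kinetic + potential

def golden_section_minimum(f, lo, hi, tol=1e-10):
    """Locate the minimum of a single-valley function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0

# Vary the single parameter alpha until the energy is as low as possible.
best_alpha = golden_section_minimum(trial_energy, 0.01, 2.0)
best_energy = trial_energy(best_alpha)
print(f"best alpha  = {best_alpha:.4f}")
print(f"best energy = {best_energy:.4f} hartree (exact value: -0.5)")
```

The analytic optimum for this trial function is alpha = 8/(9*pi), about 0.283, with an energy of -4/(3*pi), about -0.424 hartree. The estimate sits above the true -0.5, as a variational estimate always does, and a better trial function would close the gap. Real molecules need far more parameters than one, which is where the millions of integrals come in.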
Now, imagine another great
development this century: computers. By 1970, the CDC
corporation was building machines that could perform 10
million floating-point operations per second (10 MFLOPS). By
1999, Texas Instruments was building calculators of the same
speed. Today's supercomputing facilities usually harbor
machines such as the Cray T3E, which with up to 128 300-MHz
DEC Alpha processors can achieve peak speeds of nearly 40
billion floating-point operations per second (40 GFLOPS). So,
does this mean that Schrodinger's equation can easily be
solved now, for any problem? We wish. Schrodinger's
equation, for very large molecules or multi-molecule
systems, could still tie up all of the computing facilities
this planet has to offer until the end of time, long
after the whole universe collapses. To remain optimistic,
though, scientists routinely carry out quantum mechanics
calculations on organic molecules, even small proteins.
So, tying the two
developments together now, both quantum mechanics and
computers, we have a new field called computational biology.
No lab is necessary. Just a powerful workstation. Put in the
structure of a molecule, and it can calculate the energy
for you. It can also calculate the actual shape of the
molecule. It can simulate chemical reactions as well, along
with many other things. It is much like the difference
between dropping a ball from the top story of your house and
timing how long it takes to fall, and pulling out your
physics textbook and calculating the time. Computational
biology offers many advantages over conventional biology.
Rather than using a nuclear magnetic resonance (NMR)
spectrometer to observe the tertiary conformation
(three-dimensional shape) of a protein, your computer could
calculate it for you using the laws of quantum physics and
thermodynamics. It also offers several advantages not
available in conventional biology. One could, for instance,
predict the ideal drug treatment for HIV, in terms of
electrostatic, conformational, and other properties, rather
than testing a ton of different candidates believed
to work. With the laws of physics, there is no guessing.
But, does that mean that
computational biology will supplant conventional biology?
Doubtful. First, the mathematical physics is quite
difficult, so don't expect to see it being taught in high
school any time soon. Second, modern computational resources
do not make computational biology applicable to every
problem. Because of the large amount of approximation
necessary to make calculations tractable, computational
results are often less accurate than experimental ones.
Third, conventional biology is still very practical for many
different problems, mostly those that involve systems or
reactions too large for a computer to calculate. For
instance, cell reproduction. It would not be possible to
apply Schrodinger's equation to an atomic system as massive
as a cell using today's computational resources. Even when
it does become possible, it would probably be quicker to
check things out under a microscope. So, until the
computational ability to solve such problems with ease comes
along, we must rely on conventional biology for the study of
many problems.
Nonetheless, in its current
state, computational biology does have many uses. Already,
DuPont Merck laboratories have developed HIV REV inhibitors
to treat the AIDS virus, using only computers.
So What's a Force-Field
or Hypersurface?
Most of the earliest problems attacked in the field of
computational biology (or its predecessor computational
chemistry) had to do with conformations, or the
three-dimensional shapes of molecules. The 2-D diagrams in
your chemistry book are nice, but they don't show exactly
what the molecule may look like. Nor are such diagrams in
three dimensions, as real molecules are. Finding the
conformation of a molecule involves two major parts: an
energy equation and an optimization algorithm. By the laws
of thermodynamics, one knows that a system tends toward
the lowest possible energy. Thus, a molecule should adopt the
conformation that has the lowest energy; this is
another way of saying that it is the most stable. The collective
set of all possible conformations can be envisioned as a
surface, where the height of each point is its energy and
its x and y values represent its conformation, perhaps two
angles in the molecule. The lowest point on this surface is
the most stable, and is called the global minimum.
Now, add in all the different angles and bond-lengths and
your surface suddenly exists in many different dimensions;
thus it is called a hypersurface. The optimization algorithm
searches this surface for the global minimum, using the
energy equation as a criterion.
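A toy version of such a surface can be sketched in a few lines. The energy function below is hypothetical (it is not fitted to any real molecule) and simply describes a conformation by two angles, with several dips of which only one is the global minimum. The "optimization algorithm" here is the crudest one imaginable, a brute-force scan over a grid, chosen only to show what searching the surface means.

```python
import math

def toy_energy(phi: float, psi: float) -> float:
    """Hypothetical energy (arbitrary units) for two angles in radians.

    The cos(3x) terms give three dips per angle; the cos(x) terms make
    the dip at 180 degrees the deepest one.
    """
    threefold = (1.0 + math.cos(3.0 * phi)) + (1.0 + math.cos(3.0 * psi))
    bias = 0.3 * (1.0 + math.cos(phi)) + 0.3 * (1.0 + math.cos(psi))
    return threefold + bias

def grid_search(step_deg: int = 5):
    """Evaluate the energy on a regular grid and keep the lowest point."""
    best = None
    for phi_deg in range(0, 360, step_deg):
        for psi_deg in range(0, 360, step_deg):
            e = toy_energy(math.radians(phi_deg), math.radians(psi_deg))
            if best is None or e < best[0]:
                best = (e, phi_deg, psi_deg)
    return best

energy, phi_deg, psi_deg = grid_search()
print(f"global minimum near phi={phi_deg}, psi={psi_deg}, E={energy:.3f}")
```

With two angles and a 5-degree step, the scan needs 72 x 72 = 5184 energy evaluations. With ten angles it would need 72 to the tenth power, roughly 3.7 x 10^18, which is why nobody scans a real hypersurface this way and smarter search methods are needed.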
Schrodinger's equation can be used as an energy equation. In
the form H Psi = E Psi, we can solve for E, which is the
total energy. This is an extremely time-consuming process,
however, and is only used for very small molecules. A
force-field equation is a nice alternative. It is a simple
equation, usually of the following form: E = E_BONDS
+ E_ANGLES + E_VDW. The first term on
the right side of the equation is usually a sum of each
bond's energy, which may get bigger and bigger as the bond
length gets farther from a reference value. The angles term
is the same, getting larger and larger as the bond angles
depart from a reference value. The final term is a sum of
Van der Waals energies, or inter-atomic energies. This is
the interaction energy between any two different atoms in
the molecule, and could be positive or negative. For
example, as two carbon atoms come very close together, the
strong repulsion between their electron clouds would cause this term to rise
greatly. However, a hydrogen on a water molecule and an
oxygen on another water molecule may have a negative VDW
term, since they are attracted by polar forces (hydrogen
bonding). Often, these force-field equations are parameterised,
or adjusted in such a way that they agree better with
experimental data. The important thing is that force-fields
have no rigorous theoretical basis; they are empirical
estimates. Thus, when compared to equations like
Schrodinger's wave equation, they are significantly less
accurate (but faster!).
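The three-term force-field equation can be sketched as code. The text above only says that the bond and angle terms grow as the geometry departs from a reference value, so the specific forms used here are assumptions: a harmonic k * (x - x0)**2 penalty for bonds and angles, and a Lennard-Jones 12-6 expression for the Van der Waals term. The molecule, the parameters, and the helper names are all made up for illustration.

```python
import math

def bond_energy(coords, bonds):
    """E_BONDS: sum of k * (r - r0)**2 over bonded pairs (harmonic form assumed)."""
    total = 0.0
    for i, j, r0, k in bonds:
        r = math.dist(coords[i], coords[j])
        total += k * (r - r0) ** 2
    return total

def bend_angle(a, b, c):
    """Angle a-b-c in radians, with atom b at the vertex."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    cos_t = sum(x * y for x, y in zip(v1, v2)) / (math.hypot(*v1) * math.hypot(*v2))
    return math.acos(max(-1.0, min(1.0, cos_t)))  # clamp against rounding error

def angle_energy(coords, angles):
    """E_ANGLES: sum of k * (theta - theta0)**2 over bond angles."""
    total = 0.0
    for i, j, c, theta0, k in angles:
        theta = bend_angle(coords[i], coords[j], coords[c])
        total += k * (theta - theta0) ** 2
    return total

def vdw_energy(coords, pairs):
    """E_VDW: Lennard-Jones 12-6 sum; positive when atoms crowd, negative further out."""
    total = 0.0
    for i, j, epsilon, sigma in pairs:
        s6 = (sigma / math.dist(coords[i], coords[j])) ** 6
        total += 4.0 * epsilon * (s6 * s6 - s6)
    return total

def total_energy(coords, bonds, angles, pairs):
    """E = E_BONDS + E_ANGLES + E_VDW."""
    return (bond_energy(coords, bonds) + angle_energy(coords, angles)
            + vdw_energy(coords, pairs))

# Hypothetical bent three-atom molecule (atom 0 in the middle) plus a lone
# fourth atom nearby. Every number below is invented for illustration.
THETA0 = math.radians(104.5)
bonds = [(0, 1, 1.0, 300.0), (0, 2, 1.0, 300.0)]   # (i, j, r0, k)
angles = [(1, 0, 2, THETA0, 50.0)]                 # (i, vertex, c, theta0, k)
pairs = [(0, 3, 0.2, 3.0)]                         # (i, j, epsilon, sigma)

def geometry(bond_length, separation):
    """Coordinates with both bonds at bond_length and atom 3 at separation."""
    return [
        (0.0, 0.0, 0.0),
        (bond_length, 0.0, 0.0),
        (bond_length * math.cos(THETA0), bond_length * math.sin(THETA0), 0.0),
        (0.0, 0.0, separation),
    ]

for label, coords in [
    ("reference geometry, far neighbour", geometry(1.0, 3.5)),
    ("stretched bonds", geometry(1.2, 3.5)),
    ("neighbour pushed too close", geometry(1.0, 2.5)),
]:
    print(f"{label:35s} E = {total_energy(coords, bonds, angles, pairs):8.3f}")
```

Each energy evaluation is a handful of multiplications per bond, angle, and atom pair, which is why a force-field is so much faster than solving Schrodinger's equation. Parameterising the force-field amounts to tuning numbers such as r0, k, epsilon, and sigma until the results agree better with experiment.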
Optimization algorithms come in many forms, from very
mathematical methods such as Newton-Raphson to methods borrowed from other
fields, such as the Monte Carlo method or the genetic
algorithm. Mathematical methods such as Newton-Raphson
work almost exclusively by calculating derivatives to
estimate where the minimum is, based on slope (and
sometimes curvature, the second derivative). Methods like Monte Carlo and
the genetic algorithm simply meander their way around the hypersurface
looking for good spots. The MC method
picks its spots by little more than random guesses. The
genetic algorithm uses the principles of natural selection to try
to identify the good traits of a stable conformation.
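The difference between the two families shows up even on a one-dimensional toy problem. The sketch below uses a hypothetical double-well energy curve with a shallow dip near x = 1.13 and a deeper one near x = -1.30. The Newton-Raphson routine follows slope and curvature; the Monte Carlo routine uses the Metropolis acceptance rule, which is one common way of turning random guesses into a search and is an assumption here, as are the step size, temperature, and seed. The genetic algorithm is not shown.

```python
import math
import random

def energy(x):
    """Hypothetical double-well energy curve (arbitrary units)."""
    return x**4 - 3.0 * x**2 + x

def gradient(x):
    """First derivative of energy (the slope)."""
    return 4.0 * x**3 - 6.0 * x + 1.0

def curvature(x):
    """Second derivative of energy."""
    return 12.0 * x**2 - 6.0

def newton_raphson(x, tol=1e-10, max_iter=50):
    """Repeatedly jump to where the local quadratic fit has zero slope."""
    for _ in range(max_iter):
        step = gradient(x) / curvature(x)
        x -= step
        if abs(step) < tol:
            break
    return x

def monte_carlo(x, n_steps=20000, max_move=0.5, temperature=1.0, seed=0):
    """Random walk with Metropolis acceptance; returns the best point visited."""
    rng = random.Random(seed)
    e = energy(x)
    best_x, best_e = x, e
    for _ in range(n_steps):
        trial = x + rng.uniform(-max_move, max_move)
        trial_e = energy(trial)
        # Always accept downhill moves; accept uphill moves only sometimes.
        if trial_e <= e or rng.random() < math.exp(-(trial_e - e) / temperature):
            x, e = trial, trial_e
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

start = 1.5  # begin on the side of the shallow well
x_nr = newton_raphson(start)
x_mc, e_mc = monte_carlo(start)
print(f"Newton-Raphson: x = {x_nr:.4f}, E = {energy(x_nr):.4f}")
print(f"Monte Carlo:    x = {x_mc:.4f}, E = {e_mc:.4f}")
```

Started at x = 1.5, Newton-Raphson converges in a few iterations, but only to the nearest dip, which here is the shallow one. The Monte Carlo walk takes thousands of steps and lands only approximately on target, yet its occasional uphill moves let it cross the barrier and find the deeper well. In practice the two are often combined: a random search to find the right valley, then a derivative method to polish the answer.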
What's Beyond
Conformations?
A great deal! For instance, a field such as drug design
doesn't limit itself merely to conformations. It contains
many other important aspects, such as 3D-database searching,
ligand-receptor docking simulations, ligand design,
structure generation, and more. The problem is that the
field of computational biology is still growing to such an
extent that it is difficult to tell where it will end, or
what new fields will be created in the years to come. As the
computational resources of the scientific community
continue to grow, that community has an increasing number of options to
choose from. Things that seemed ludicrous fifty years ago
are common-place today. Of course, I can't list every branch
that computational biology has expanded into since its
inception. However, some of the major fields that loom large
today include the protein-folding problem, the design of
new drugs, and the analysis of X-ray and NMR data, among others I
can't recall off the top of my head. Nor is the use of
computers limited to biology. Just recently there has been
great growth in the field of computational quantum chemistry
in solid-state physics. The concept of the quantum computer
is one great application. As you can imagine, this is one of
the most promising developments in the field of computers,
but studying atomic-level computers can't easily be done in
a lab. The need for quantum chemistry and theoretical
methods of studying such small systems has become crucial,
and more and more scientists are beginning to explore this
avenue. Needless to say, there's much beyond conformations. |
|