
Computational Biology: A New Paradigm

Sure, you know about your regular biology: cells, photosynthesis, reproduction, prokaryotes, and all of that other stuff. What you probably aren't used to seeing much of in regular biology is physics, and all of the math that comes along with it. Yet if physics is meant to be the mathematical language that describes all of nature, why don't we see more of it in biology class? In physics, we can calculate the trajectories of planets, the masses of planets, and the motions of objects, right? So why can't we calculate the workings of viruses, how DNA replicates, or whether a drug could destroy cancer cells? Why must we physically carry out all of these experiments in a lab? Well, the answer is that we don't. The important thing, though, is that it is not so easy to calculate the outcomes of such situations.

Thomas Kuhn, in his book The Structure of Scientific Revolutions, describes two kinds of science: normal and revolutionary. Normal science is much like the stuff you did in high school chemistry labs; everything is done in accordance with some paradigm. Your procedure is drawn up based on the current rules of this paradigm, and your results are expected to support that paradigm. Revolutionary science, on the other hand, contradicts the paradigm. It overturns or extends it, and creates a new one in its place. When you finish your high school chemistry lab, you usually don't expect to rewrite your textbook with your results. However, when Newton laid out his laws of motion, he was replacing the prevalent paradigm. Earlier attempts to describe the universe's workings were set aside in favor of Newton's laws. Later, when Einstein devised relativity, it was realized that Newton's paradigm was insufficient.

Biology has recently experienced such a paradigm shift. It all began in 1926, when Erwin Schrodinger developed the famous wave equation which today bears his name.
This wave equation would become the foundation of quantum mechanics, a field which in itself was revolutionary science. Newton's laws of mechanics would never have been able to describe the properties of systems the size of molecules. They could not have properly described the energy levels of a T1 cell, for instance; quantum mechanics could.

The problem with Schrodinger's equation, however, was its complexity. It looks rather simple, right? Unfortunately, it is impossible to solve exactly for anything bigger than a hydrogen atom (due to the infamous many-body problem). It is still possible to use, however, via methods of approximation, most of which come from the field of variational calculus. Even so, the problem remains extremely difficult to work with. Just performing an energy calculation on a molecule like benzene would require you to solve millions upon millions of integrals. And not the easy kind of integrals, either: these usually call for such complex approaches as Fourier transforms. To say the least, back in the first half of the twentieth century, solving Schrodinger's equation for polyelectronic molecules was considered ridiculous. People would think you either had no idea what you were talking about or were crazy. To say you were overly optimistic would be an absolute understatement.

Now, consider another great development of the twentieth century: computers. By 1970, Control Data Corporation (CDC) was building machines that could perform 10 million floating-point operations per second (10 MFLOPS). By 1999, Texas Instruments was building calculators of the same speed. Today's supercomputing facilities usually harbor machines such as the Cray T3E, which with up to 128 DEC Alpha processors running at 300 MHz can achieve peak speeds of nearly 40 billion floating-point operations per second (40 GFLOPS). So, does this mean that Schrodinger's equation can easily be solved now, for any problem? We wish.
Schrodinger's equation, applied to very large molecules or multi-molecule systems, could still tie up all of the computing facilities this planet has to offer, easily until the end of time, long after the whole universe collapses. To remain optimistic, though, scientists routinely carry out quantum mechanics calculations on organic molecules, even small proteins.

So, tying the two developments together, quantum mechanics and computers, we have a new field called computational biology. No lab is necessary, just a powerful workstation. Put in the structure of a molecule, and it can calculate the energy for you. It can also calculate the actual shape of the molecule, simulate chemical reactions, and do many other things. It is much like the difference between dropping a ball from the top story of your house and timing how long it takes to fall, and pulling out your physics textbook and calculating the time.

Computational biology offers many advantages over conventional biology. Rather than using a nuclear magnetic resonance (NMR) spectrometer to observe the tertiary conformation (three-dimensional shape) of a protein, your computer could calculate it for you using the laws of quantum physics and thermodynamics. It also offers capabilities not available in conventional biology. One could, for instance, predict the ideal drug treatment for HIV, in terms of electrostatic, conformational, and other properties, rather than testing a ton of different candidates believed to work. With the laws of physics, there is, in principle, no guessing.

But does that mean that computational biology will supplant conventional biology? Doubtful. First, the mathematical physics is quite difficult, so don't expect to see it being taught in high school any time soon. Second, modern computational resources do not make computational biology applicable to every problem.
Because of the large amount of approximation necessary to make calculations tractable, computational results are often less accurate than experimental ones. Third, conventional biology is still very practical for many different problems, mostly those that involve systems or reactions too large for a computer to calculate. Take cell reproduction, for instance. It would not be possible to apply Schrodinger's equation to an atomic system as massive as a cell using today's computational resources. Even when it does become possible, it would probably be quicker to check things out under a microscope. So, until the computational ability to solve such problems with ease comes along, we must rely on conventional biology for the study of many problems. Nonetheless, in its current state, computational biology does have many uses. Already, DuPont Merck laboratories have developed HIV REV inhibitors to treat AIDS, using only computers.

So What's a Force-Field or Hypersurface?

Most of the earliest problems attacked in the field of computational biology (or its predecessor, computational chemistry) had to do with conformations, or the three-dimensional shapes of molecules. The 2-D diagrams in your chemistry book are nice, but they don't show exactly what the molecule looks like, since they are not three-dimensional, as real molecules are.

Finding the conformation of a molecule involves two major parts: an energy equation and an optimization algorithm. By the laws of thermodynamics, one knows that a system tends toward the lowest possible energy. Thus, a molecule should have a conformation which also has the lowest energy; this is another way of saying that it is most stable. The collective set of all possible conformations can be envisioned as a surface, where the height of each point is its energy and its x and y values represent its conformation, perhaps two angles in the molecule.
The lowest point on this surface is the most stable conformation, and is called the global minimum. Now, add in all the different angles and bond lengths and your surface suddenly exists in many dimensions; thus it is called a hypersurface. The optimization algorithm searches this surface for the global minimum, using the energy equation as a criterion.

Schrodinger's equation can be used as an energy equation. In the form H Psi = E Psi, we can solve for E, which is the total energy. This is an extremely time-consuming process, however, and is only used for very small molecules. A force-field equation is a nice alternative. It is a simple equation, usually of the following form: E = E_BONDS + E_ANGLES + E_VDW. The first term on the right side of the equation is usually a sum of each bond's energy, which gets bigger and bigger as the bond length departs from a reference value. The angles term works the same way, getting larger and larger as the bond angles depart from a reference value. The final term is a sum of Van der Waals energies, or inter-atomic energies. This is the interaction energy between any two different atoms in the molecule, and could be positive or negative. For example, as two carbon atoms come very close together, the strong repulsion between them causes this term to rise greatly. However, a hydrogen on a water molecule and an oxygen on another water molecule may have a negative VDW term, since they are attracted by polar forces (hydrogen bonding). Often, these force-field equations can be parameterised, or adjusted in such a way that they agree better with experimental data. The important thing is that force-fields have no rigorous theoretical basis; they are mere estimates. Thus, when compared to equations like Schrodinger's wave equation, they are significantly less accurate (but faster!).
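As a rough illustration of the force-field form E = E_BONDS + E_ANGLES + E_VDW, here is a minimal Python sketch for a hypothetical bent three-atom molecule. It assumes spring-like (harmonic) bond and angle terms and a Lennard-Jones 12-6 expression for the VDW term; these are common choices, but they are not specified in this article, and every constant and function name below is made up for the example. A crude greedy random search over the bond angle stands in for a real optimization algorithm.

```python
import math
import random

# Every numeric parameter here is hypothetical, chosen only for illustration.
K_BOND = 300.0                 # bond stiffness
R0 = 1.0                       # reference bond length
K_ANGLE = 50.0                 # angle stiffness
THETA0 = math.radians(109.5)   # reference bond angle
EPSILON = 0.1                  # depth of the VDW energy well
SIGMA = 1.5                    # distance at which the VDW energy is zero


def bond_energy(r):
    """Harmonic penalty that grows as a bond length departs from R0."""
    return K_BOND * (r - R0) ** 2


def angle_energy(theta):
    """Harmonic penalty that grows as a bond angle departs from THETA0."""
    return K_ANGLE * (theta - THETA0) ** 2


def vdw_energy(r):
    """Lennard-Jones 12-6: positive (repulsive) for r < SIGMA, negative beyond."""
    s6 = (SIGMA / r) ** 6
    return 4.0 * EPSILON * (s6 * s6 - s6)


def bond_angle(a, b, c):
    """Angle a-b-c in radians, measured at the central atom b."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_t = dot / (math.hypot(*v1) * math.hypot(*v2))
    return math.acos(max(-1.0, min(1.0, cos_t)))  # clamp against rounding


def total_energy(a, b, c):
    """E = E_BONDS + E_ANGLES + E_VDW for a three-atom molecule a-b-c."""
    e_bonds = bond_energy(math.dist(a, b)) + bond_energy(math.dist(b, c))
    e_angles = angle_energy(bond_angle(a, b, c))
    e_vdw = vdw_energy(math.dist(a, c))  # the two end atoms are not bonded
    return e_bonds + e_angles + e_vdw


def bent_molecule(theta, r=R0):
    """Coordinates with b at the origin, both bonds of length r, angle theta."""
    a = (r, 0.0, 0.0)
    b = (0.0, 0.0, 0.0)
    c = (r * math.cos(theta), r * math.sin(theta), 0.0)
    return a, b, c


def random_search(steps=2000, max_step=0.1, seed=0):
    """Greedy random search over the bond angle; keeps only downhill moves."""
    rng = random.Random(seed)
    best_theta = math.radians(180.0)  # start from a linear molecule
    best_e = total_energy(*bent_molecule(best_theta))
    for _ in range(steps):
        trial = best_theta + rng.uniform(-max_step, max_step)
        if not 0.5 < trial <= math.pi:
            continue  # stay within a sensible range of angles
        e = total_energy(*bent_molecule(trial))
        if e < best_e:
            best_theta, best_e = trial, e
    return best_theta, best_e


if __name__ == "__main__":
    theta, energy = random_search()
    print(f"best angle: {math.degrees(theta):.2f} degrees, energy: {energy:.4f}")
```

Because the angle term dominates in this toy setup, the search should settle close to the reference angle, nudged slightly by the VDW term. A real force field has many more terms and atoms, and its constants would be parameterised against experimental data rather than picked by hand.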
Optimization algorithms come in many forms, from very mathematical methods such as Newton-Raphson to methods borrowed from other fields, such as the Monte Carlo method or the genetic algorithm. Mathematical methods such as Newton-Raphson almost exclusively work by calculating derivatives to estimate where the minimum is, based on slope (and sometimes curvature). Methods like Monte Carlo and the genetic algorithm simply meander their way around the hypersurface looking for good spots. The MC method picks its spots by little more than random guesses. The genetic algorithm uses the principles of natural selection to try to identify the good traits of a stable conformation.

What's Beyond Conformations?

A great deal! For instance, a field such as drug design doesn't limit itself merely to conformations. It contains many other important aspects, such as 3D-database searching, ligand-receptor docking simulations, ligand design, structure generation, and more. The problem is that the field of computational biology is still growing to such an extent that it is difficult to tell where it will end, or what new fields will be created in the years to come. As the computational resources of the scientific community continue to grow, it has an increasing number of options to choose from. Things that seemed ludicrous fifty years ago are commonplace today.

Of course, I can't list every branch that computational biology has expanded into since its inception. However, some of the major fields that loom large today include the protein-folding problem, the design of new drugs, and the analysis of X-ray and NMR data, among others I can't recall off the top of my head. Nor is the use of computers limited to biology. Just recently there has been great growth in the use of computational quantum chemistry in solid-state physics. The concept of the quantum computer is one great application.
As you can imagine, this is one of the most promising developments in the field of computers, but studying atomic-level computers can't easily be done in a lab. The need for quantum chemistry and theoretical methods of studying such small systems has become crucial, and more and more scientists are beginning to explore this avenue. Needless to say, there's much beyond conformations.