|
Jochen Hurlebaus, Institute for Biotechnology 2, Research
Centre Jülich, 52425 Jülich, Germany
Join our Team:
-
You are interested in Mathematical Modeling Techniques in Biology?
-
You have experience with Numerical Methods and Algorithms?
-
You have experience in Software Engineering?
-
You want to join a multidisciplinary research team in the largest Research
Center of Germany?
Then you should contact us for
a position leading to Diploma-Thesis or a Ph.D. degree.
Simulating Metabolic Networks in Micro-organisms with
Dynamic Structured Mathematical Models
Project Overview 2000
Jochen Hurlebaus
Project Goal:
Dynamic mathematical models are developed for simulation of metabolic
networks of micro-organisms with focus on bacteria (Escherichia coli).
The goal is to identify control mechanisms in the complicated metabolic
networks of such micro-organisms and make predictions of the influence
of specific genetic modifications (in silico analysis). As a result, simulations
of dynamic cell behaviour under varying fermentation conditions should
be possible (Figure 1).
Figure 1: State of the art tools are able to simulate intracellular
steady-state fluxes. Dynamic response of cells to changing fermentation
conditions is still difficult to predict and require advanced mathematical
models.
Methods:
The project
-
dynamic intracellular data gathered from pulse-experiments with Escherichia
coli
-
steady-state flux data analysed with NMR techniques
-
available information about in-vivo enzyme kinetics
Mathematical models used are structured models based on the stoichiometry
of the metabolic pathway. The models allow simulation of intracellular
metabolite concentrations under varying extracellular substrate concentrations
and fermentation conditions. Hypotheses of regulations present in micro-organisms
can be examined.
Model development and formulation is done with a Modelling Tool
Software that was produced during the project. The software allows data-fitting
and simulation with identified parameters including graphical output that
is useful for experimental design and is based on a relational database.
Current Project State:
The Modelling Tool Software mentioned above is coded in the script
language Tcl/Tk. All relevant information of models is stored in a relational
database (MySQL, free software for non-commercial use):
-
Metabolites, enzymes, alarmones, nucleotides and all other substances (E.g.
names, ECC, chemical composition)
-
Reactions of the metabolic network, i.e. the stoichiometric matrix, also
including uptake and excretion systems of the cell
-
Effects of chemical reactions important for regulation and control of intracellular
dynamics
-
Kinetic relations describing single reaction steps of the cell. Includes
activators, inhibitors, and other effects of enzymatic reactions.
-
Experimental data or connections to external data files
-
Parameters used for the mathematical algorithms for optimisations and simulations
-
Results of simulations and optimisations. This includes simulated intracellular
metabolic concentrations as well as calculated fluxes over time.
The database was build using Normal-Forms 1-3 for relational databases.
The amount of data to be stored is only limited physically (available harddisk
space).
Figure 2 shows presentation of the metabolic network in the Modelling
Tool Software:
The left part of the image contains a differentiation of the network
in different scales set-up by the user (here extracellular and cytoplasm).
-
Scales can be differentiated further into Reactions Groups
(here PTS and Glycolysis), allowing the user to structure the reaction
network according to his own desires.
-
Each Reaction Group contains a set of single Chemical Reactions.
-
Reaction Details of these reactions are shown in the right part
of the window. The figure shows details of the reaction from Fructose-6-Phosphate
F6P to Fructose-1,6-Bisphosphate F1,6P).
-
Name (F6P->F1,6P; 2.7.1.11 (5)) and Description (Phosphorylation by ATP)
-
Site of Reaction (cytoplasm) and Reaction Group (Glyolysis)
-
Stoichiometry of the reaction: Number of molecules of substrate (1 ATP,
1 b-D-Fructose-6P) and products (1 ADP, 1 b-D-Fructose-6P2). These names
are colour-coded , blue and red for substrates and products respectively
is the default, a different colour would indicate a substance of a different
Site
or Reaction Group.
-
Effects (6-Phosphofructokinase, ID: 2.7.1.11)
-
Mathematical formulation of the function (Hill-kinetics)
-
Edit shows details of the functional form. This feature is explained
below.
Figure 2: Main view of a Metabolic Network in the Modelling
Tool Software. An expandable tree on the left shows the structure, on the
right details of single reactions can be seen.
The Edit button shown next to the Hill Kinetics in Figure 2 leads
to another main feature of the Modelling Tool Software, the detailed definition
of each kinetic relation. Figure 3 shows the popup window giving the user
full control over the mathematical model.
Figure 3: Control window for set-up of mathematical model details.
The windows main features are:
-
On top, the reaction name and the mathematical equation (here the Hill
kinetics) are shown, including a short description of the buttons in the
left window part named Substances.
-
The substance window includes each letter of the mathematical equation
in the top of the window. In Figure 3, S1 was defined by the user to be
b-D-Fructose-6P, whereas Kos, V, and h have been defined as parameters.
-
For S1, metabolite b-D-Fructose-6P of the stoichiometric network,
the options dep, dic, data, opt or fixed can be chosen.
-
Dep and dic will include b-D-Fructose-6P as a dependent
variable of the mathematical model, i.e. an ordinary differential
equation will be created for b-D-Fructose-6P concentration.
-
Data means that the simulation code expects b-D-Fructose-6P concentration
as an input, e.g. from experimental data.
-
Opt treats b-D-Fructose-6P as a parameterthat does not change
over time, however, it will be part of the optimisation and might
be changed for the next simulation to improve data fitting.
-
Fix assigns b-D-Fructose-6P concentration a fixed number that does
not change with time and is also the same for all simulations.
-
Parameters Kos, V, and h do not change over time, therefore only the choices
opt and fix apply here, with the same meaning as for b-D-Fructose-6P concentration.
-
The right part of the window allows the user to define starting values
(Value) as well as ranges. Ranges apply only to parameters and substances
that are subject to optimisation during data-fitting. On the very right,
the first column of a different optimisation can be seen. This allows direct
comparison with other data-fitting attempts, simulations, or results.
After a model is metabolic pathway model is defined, simulations and optimisations
(data fitting) can be performed with the Modelling Tool Software.
Complex optimisation - i.e. fitting a high parametric non-linear
differential equation model to given data, requires patience and often
continued retries. Best results will be retrieved with growing experience
on how to set optimisation parameters. This is the reason why the user
of the Modelling Tool Software has full control over all optimisation and
simulation parameters. After gaining insight in the system with the default
values, a user can find parameters that result in the greatest performance
for his model and data.
-
Figure 4 shows a window with details for simulation and parameter identification:
-
Accuracy of the solution algorithm for the differential equations
-
Optimisation algorithm (so far only one available)
-
Random number generator initial
-
Change of parameter when a restart is performed during optimisation (%
of initial value, will be chosen randomly with uniform distribution)
-
Range extension if larger or smaller ranges should be chosen for all parameters
during the next optimisation
-
Result specification: all, after improvement during optimisation, or only
final, best result can be save
-
Weights for each optimised substance and parameters for the specific optimisation
algorithm (e.g. accuracy) are specified in a sub-window.
Figure 4: Main Optimisation Window for data-fitting and simulations.
All results, also intermediate results, can be stored and viewed with View
Results
All results of simulations and optimisations are stored automatically.
The user chooses if also intermediate results are stored. Because all details
- including optimisation and simulation parameters, time-steps, used data-file
for fitting, initial parameters and ranges, are stored in the database,
reproduction of all results is possible. Figure 5 shows a window with a
result table of one specific optimisation. The first column shows the sum
of squares of deviations of the model compared with the experimental data.
Double-click on a row displays all details of an simulation or optimisation
respectively. Figure 5 shows an example of the graphical output displaying
amount of metabolite over time.
Figure 5: Example of Result Overview from an Optimisation
Graphical output
Metabolite concentrations and intracellular fluxes over time calculated
and stored can be seen directly on the screen. Often in situations where
high-parametric
models are fitted without good initial values, graphical output gives good
hints on reasons for unsatisfactory performance. A simple least-square-value
will not give enough information, therefore graphical output has to be
quick and clear. The output of the Modelling Tool Software prints a graph
of metabolite concentrations and fluxes over time for each metabolite and
flux, including also the experimental data if available for comparison.
The Modelling Tool Software can also create Postscript files ready for
printout.
Figure 6: Example of graphical output of metabolite concentrations
over time, produced from parameter fitting to experimental data.
Software Internals
The Modelling Tool Software in Tcl/Tk language interacts with
algorithms in Fortran and C for random number generation, solution of the
differential equations, and optimisation. All routines are public-domain
downloaded from the Netlib collection (www.netlib.org). For solution of
the system of differential equations lsoda was chosen because it
works also for stiff equations. The optimisation algorithm used so far
is a modified simplex method. The algorithm was tested on a system of 9
ordinary
differential equations with 47 parameters that was fitted to experimental
data. Data fitting was not successful for all experimental data at the
same time, however, some effects of control could already be identified.
It is therefore thought that the subplex algorithm is useful for optimisation
but the models so far are not. Different algorithms will be tested to answer
the question whether an incorrect model is the reason for the fitting problem.
Example: Enzyme Kinetics
Two of the 9 differential equations used in the test-model will
be shown here. The first describes the change per time of Glucose-6-Phosphate
in the cell:
The shown influx results from sugar (Glucose) uptake of the cell with the
PTS-system (Phosphotransferase-system). The mathematical equation
a modified standard Michaelis-Menten equation for a two-substrate reaction.
The shown netflux consist of an outflux to Fructose-6-Phosphate and an
backflux from that metabolite due to reversibility of the reaction. A reversible
Michaelis-Menten equation was chosen to model this reaction.
The second kinetic shown here describes the change per time of Fructose-6-Phosphate
in the cell:
The first term is the same occurring in the first reaction for Glucose-6-Phosphate
with changed sign. The second and the third term describe the netflux to
the successor in the metabolic network (Fructose-1,6-Bisphosphate). Again
the modelled chemical reaction is reversible. Therefore the third term
is the flux from Fructose-6-Phosphate to Fructose-1,6-Bisphosphate whereas
the third term describes the backflux. Both fluxes have the same mathematical
representation, the Hill-Kinetic.
In principal all reactions of the metabolic network have a similar form
in our model. Changes in the exact mathematical form as well as the amount
of detail of the model can change results significantly. All mathematical
formulations are suggestions and simplifications only. For most cases an
exact formulation cannot be given, in many cases chemical reaction details
are not known.
Good and Bad Examples: I. PTS-System
Figure 7 shows results of a model of the PTS-system only. Only
the first peak of the Glucose uptake can be simulated. The reason is that
uptake of Glucose requires Phosphoenolpyruvate which is only available
at the beginning. During the experiment the graphs in Figure 8 show that
the Phosphoenolpyruvate concentration recovered at t=15s resulting in a
new peak of Glucose uptake. But the experiment also shows high Glucose-6-Phosphate
concentrations while Phosphoenolpyruvate is rare at the end of the experiment.
This could not be reproduced with the model that just considered the PTS-uptake
system. A biological explanation, however, is also not available for this
phenomena. We hope to explain this phenomena by including more details
in our model. The above mentioned model with 9 differential equations and
47 parameters was also not able to explain the phenomena. It seems to be
still lacking some control mechanisms of the cell.
Figure 7: PTS Model and Experimental Data of intracellular Glucose-6-Phosphate
concentration: Only the beginning is reproduced by the model.
Figure 8: Intracellular Phosphoenolpyruvate concentration from
two experiments. High phosphoenolpyruvate concentration correlates with
high Glucose uptake, low concentration stops the uptake. The solid line
is a visual aid only, the model did not include prediction of Phosphoenolpyruvate
concentration but used the data as input.
Good and Bad Examples: II. Glycolysis System
A much more detailed model for the metabolism from Glucose was
developed. It included a complete model for Glycolysis and black-box models
for the Citric-Acid cycle and the Pentose-phosphate pathway. Figure 9 shows
the model components. Two effects of inhibition could be identified that
allow reproduction of the Glucose-6-Phosphate and the Phosphoenolpyruvate
dynamics after the glucose-pulse. The assumptions made are based on experimental
findings from the literature, whereas parameter values are estimated from
literature and with optimisation. Figure 10 shows simulated dynamics of
intracellular Glucose-6-Phosphate and Phosphoenolpyruvate during a pulse
of Glucose.
Sorry we cannot give you all the details about the findings at this
time. Please contact us if you have specific questions.
Figure 9: Overview of the model that allowed identification of possible
control mechanisms in the cell. All substances shown with a blue background
are simulated by the model. Only Glucose was used as an input variable.
Figure 10: Simulated intracellular Glucose-6-Phosphate and
Phosphoenolpyruvate concentrations over time during a Glucose-pulse experiment.
The pulse was given at time t=0.
Future prospects:
With the availability of the Modelling Tool software now models
can be build and fitted in approximately one hour. The resulting great
development speed for new models has already resulted in a model that describes
dynamic phenomena after a glucose pulse very well. A check of many models
will hopefully provide models that describe all experimental data reasonably.
To choose good models also model selection techniques will be developed.
Additionally, new optimisation algorithms will be tested. In order to identify
shortcomes of current models structure identification would be very helpful.
Some effort will be put in this direction, however, so far no standard
methods are available for structure identification of chemical reaction
networks.
last modified: September 2000
|