Flux Analysis in Large Metabolic Networks

When working towards an overall improvement of the yield of a given product from a certain substrate, it is of great help to identify all possible routes (or pathways) between the substrate and the product and to obtain quantitative information about the relative activities of the different pathways involved in the overall conversion. Particularly in connection with metabolic engineering (see Chapter 1), where directed genetic changes are introduced in order to reroute the carbon fluxes towards the product of interest, it is essential to know how the different pathways operate under different growth conditions. As discussed in Section 2.1.1, the in vivo fluxes are the end result of many different types of regulation within the cell, and quantification of the metabolic fluxes therefore represents a detailed phenotypic characterization. Since quantification of metabolic fluxes goes hand in hand with identification of the active metabolic network, approaches to quantify fluxes have been referred to as metabolic network analysis (Christensen and Nielsen, 1999). Metabolic network analysis basically consists of two steps:

• Identification of the metabolic network structure (or pathway topology).

• Quantification of the fluxes through the branches of the metabolic network.

The extensive biochemistry literature and biochemical databases available on the web (see e.g. www.genome.ad.jp) provide much information relevant for identification of the metabolic network structure. Complete metabolic maps with direct links to sequenced genes and other information about the individual enzymes can typically be retrieved. Thus, there are many reports on the presence of specific enzyme activities in many different species, and for most industrially important microorganisms the major metabolic routes have been identified. However, in many cases the complete metabolic network structure is not known, i.e. some of the pathways carrying significant fluxes have not been identified for the microorganism that is investigated. Here enzyme assays can be used to confirm the presence of specific enzymes and to determine the co-factor requirements, e.g. whether the enzyme uses NADH or NADPH as co-factor. Even though enzyme assays are valuable for confirming that a given pathway is present and active, they are of limited use for a rapid screen of the totality of pathways that are present in the studied microorganism. For this purpose isotope-labeled substrates are a powerful tool, and especially the use of 13C-labeled glucose and subsequent analysis of the labeling pattern of the intracellular metabolites has proved to be very useful for identification of the metabolic network structure. This aspect is discussed further in Section 5.4.2.

When setting up the metabolic network it is important to specify a reaction (or a set of reactions) that leads to biomass formation. This reaction will specify the drain of precursor metabolites, or of building blocks, if the synthesis of these is included in the model. In some cases the stoichiometry for biomass formation has a significant influence on the analysis, and Note 5.3 shows how the so-called biomass equation is set up.

When the metabolic network structure has been identified the next step is to quantify the fluxes through the different branches in the network. In all cases the flux quantification relies on balancing of intracellular metabolites, just as was illustrated with several examples in Section 5.3 for simple networks. In flux analysis more detailed models are applied, but as in Section 5.3 a model with J fluxes v and K constraints will in practice have several degrees of freedom F=J-K, and there is an infinite number of solutions v to the model. In order to identify a unique solution v it is necessary to add more information or impose further constraints on the system. This can be done in three different ways:

• Use of directly measurable non-zero rates. This approach is the same as that discussed for the simple metabolic network models in Section 5.3, and when exactly F rates are measured the fluxes in the network can be calculated using eq. (5.1) - or eq. (5.25). This approach is discussed further in Section 5.4.1.

• Use of labeled substrates. When cells are fed with labeled substrates, e.g., glucose that is enriched for 13C in the first position, then there will be a specific labeling of the intracellular metabolites. As there are different carbon transitions in the different cellular pathways, the labeling pattern of the intracellular metabolites is a function of the activity of the different pathways. Through measurements of the labeling pattern of intracellular metabolites and application of balances for the individual carbon atoms in the different biochemical reactions, additional constraints are added to the system. As discussed in Section 5.4.2 this is used to quantify the fluxes, even when only a few rates are measured.

• Use of linear programming. It is possible using linear programming to identify a solution (or a set of solutions) for the flux vector v that fulfills a specific optimization criterion, e.g., the flux vector that gives maximum growth yield. This approach is the subject of Section 5.4.3.
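The degrees-of-freedom argument above can be illustrated with a minimal sketch. The three-flux toy network, the matrix name `S` and the numerical values of the "measured" rates are assumptions made purely for illustration; they are not taken from the text. The sketch counts F = J - K, shows that the balance alone leaves an F-dimensional family of solutions, and then fixes exactly F rates to obtain a unique flux vector.

```python
import numpy as np

# Hypothetical toy network (illustration only): one intracellular metabolite A.
# v1: substrate -> A,  v2: A -> product 1,  v3: A -> product 2
# Rows are balance constraints (K), columns are fluxes (J).
S = np.array([[1.0, -1.0, -1.0]])

J = S.shape[1]                      # number of fluxes
K = np.linalg.matrix_rank(S)        # number of independent constraints
F = J - K                           # degrees of freedom

# Basis of the null space of S: every S @ v = 0 solution is a
# combination of these F columns, hence infinitely many solutions.
_, _, Vt = np.linalg.svd(S)
null_basis = Vt[K:].T               # shape (J, F)

# Fixing exactly F rates (assumed measurements of v1 and v2)
# turns the problem into a square, solvable system.
A = np.vstack([S,
               [1.0, 0.0, 0.0],     # v1 is measured
               [0.0, 1.0, 0.0]])    # v2 is measured
b = np.array([0.0, 1.0, 0.3])       # balance = 0, v1 = 1.0, v2 = 0.3 (made up)
v = np.linalg.solve(A, b)

print("F =", F)
print("v =", v)
```

With fewer than F measured rates the augmented matrix is not square and no unique solution exists, which is where labeling data or an optimization criterion come in.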

Note 5.3 Biomass equation in metabolic network models

Biomass formation is the result of a large number of different biochemical reactions. In Chapters 2 and 3 we looked at some of the many different reactions that are involved. Biomass synthesis starts with the formation of precursor metabolites (see Table 2.4), which are converted into building blocks (amino acids, nucleotides, lipids etc.). The building blocks are the monomers in macromolecules, which are the major constituents of biomass (see Table 2.5). The macromolecular composition of a given cell depends on the growth conditions, and on the composition, e.g. the amino acids in the proteins, of the different macromolecules. It is therefore not possible to specify a single reaction converting precursor metabolites into biomass. If this is still done one must make an assumption that the macromolecular composition is constant. There are three different ways of setting up an equation for formation of biomass with constant macromolecular composition:

• Direct synthesis from precursor metabolites

• Direct synthesis from building blocks

• Synthesis from macromolecules

In some cases one may use a combination of the three approaches.

In the first approach an overall reaction is specified for conversion of precursor metabolites into biomass. Here information compiled in Table 2.4 is used together with information about the costs of ATP, NADPH and NADH to make biomass from the precursor metabolites. This identifies the stoichiometric coefficients for the different precursor metabolites involved in biomass formation. Reactions leading from the individual precursor metabolites to building blocks are not considered in this model, except perhaps for reactions leading to building blocks that are used for product formation, e.g. the synthesis of valine, cysteine and α-aminoadipic acid may have to be considered in a model for penicillin production. The synthesis of all other amino acids can be lumped into the overall biomass equation describing formation of biomass directly from precursor metabolites.

In the second approach the synthesis of most building blocks is considered in the model, and the biomass equation is described as a reaction where building blocks are converted into biomass. The stoichiometric coefficients for the building blocks are identified from knowledge of the amounts of the different building blocks that are needed for biomass formation. This approach typically results in a substantial increase in the model complexity, since a large number of reactions lead to the many different building blocks. Reactions that lead to many of the building blocks can perhaps be lumped, but the number of reactions considered in the model is still large. The advantage is, however, that it is relatively easy to identify the different elements of the biomass equation.

In the last approach reactions for synthesis of the different macromolecules are included in the model, e.g. reactions for synthesis of proteins, lipids, DNA, RNA and carbohydrates. The biomass equation is described as a reaction where the macromolecules are converted into biomass, and the stoichiometric coefficients for the macromolecules are given by the macromolecular composition of the biomass. With this approach it is relatively easy to study the influence on the calculated fluxes of the macromolecular composition which directly appears in the biomass equation.

Whichever approach is used, substantial information is required about how biomass is synthesized and about the metabolic costs of the different precursor metabolites/building blocks/macromolecules. In addition, the costs of ATP, NADPH and NADH for biomass formation must be available, and this requires information about the biomass composition. In recent years this type of information has become available for many microorganisms as part of flux analysis studies. If no information is available for the investigated system one may use data from related organisms. It is always advisable to calculate the sensitivities of the calculated fluxes to variations in the estimated (or experimentally determined) biomass equation.

5.4.1 Use of Measurable Rates

When F or more rates are measured, all the fluxes can be estimated using matrix inversion as discussed in Section 5.3. Eq. (5.25) directly gives the solution for the flux vector v when exactly F rates are measured. This is often referred to as a determined system. If more than F rates are measured the system is over-determined. Here the matrix T2 is not square and it cannot be inverted directly. To circumvent this problem there are two possibilities:

(i) One may use a sub-set of the measured rates and calculate the fluxes using eq. (5.25) and the other rates using eq. (5.26). Through comparison of the calculated rates and the measured rates that are not used to calculate the elements of the flux vector one may check the consistency of the model (and model predicted values may be found for the measured rates if the model is believed to have the correct structure).

(ii) One may use a statistical procedure on the whole set of data to obtain good estimates for the elements in the flux vector v and obtain new (and better) estimates for the measured rates.

In the first case the solution is found as for a determined system. In the other case a statistical procedure similar to that described in Section 3.6 has to be applied, but there may be different approaches (see Stephanopoulos et al. (1998) for details). Often one will, however, simply use the least-squares estimate for the flux vector. This estimate is found by using the pseudo-inverse of T2, called T2#, in place of the inverse of T2; the resulting expression for the fluxes is eq. (5.28). The pseudo-inverse of T2 is given as:

$$T_2^{\#} = (T_2^{T} T_2)^{-1} T_2^{T}$$

The matrix T2^T T2 is always a square matrix. Furthermore, if T2 has full rank then T2^T T2 is non-singular and the matrix can be inverted. The requirement of T2 having full rank is synonymous with the requirement of T2 being non-singular for the case of a determined system, and it means that there exists a J×J sub-matrix within T2 that is non-singular. The requirements for application of eq. (5.28) are therefore the same as for analysis of the determined system, i.e., that there are no linearly dependent reaction stoichiometries and there are no metabolites with identical or linearly dependent stoichiometric coefficients (such as the co-factor couple NADH/NAD+). Furthermore, the set of measured rates must be chosen such that the fluxes can be calculated using the matrix equation, but this requirement is normally fulfilled for an over-determined system. Whereas it is quite simple to consider only one of the compounds in co-factor couples, it is often more difficult to avoid linearly dependent reaction stoichiometries, and this is therefore discussed further in Note 5.4.
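The least-squares estimate for an over-determined system can be sketched as follows. The generic system A v = b stands in for the flux equations: `A` plays the role of T2 and `b` the role of the right-hand side built from the measured rates. Both names and all numbers are assumptions for illustration and do not reproduce eq. (5.28) of the text. The sketch forms the pseudo-inverse explicitly and compares it with NumPy's built-in least-squares routine.

```python
import numpy as np

# Hypothetical over-determined system: 3 equations, 2 unknown fluxes.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.02, 0.49, 1.48])    # made-up, slightly inconsistent data

# Full column rank is required for A^T A to be invertible.
assert np.linalg.matrix_rank(A) == A.shape[1]

# Pseudo-inverse written out as (A^T A)^-1 A^T.
A_pinv = np.linalg.inv(A.T @ A) @ A.T
v_hat = A_pinv @ b                  # least-squares flux estimate

# Same estimate from the library routine (numerically preferable).
v_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

# Residuals show how far the data are from being consistent.
residual = b - A @ v_hat
print("v_hat    =", v_hat)
print("residual =", residual)
```

In practice `np.linalg.lstsq` or `np.linalg.pinv` is usually preferred over forming the inverse of A^T A explicitly, since the explicit product squares the condition number of the problem.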

Note 5.4 Linear dependency in reaction stoichiometries

Because most living cells are capable of utilizing a large variety of compounds as carbon, energy and nitrogen sources, many complementary pathways exist that would serve similar functions if they operated at the same time. The inclusion of all such pathways may give rise to problems when matrix inversion is applied for flux analysis. This situation usually manifests itself as a matrix singularity, whereby the non-observable pathways appear as linearly dependent reaction stoichiometries. The fluxes through these different pathways cannot be discerned by extracellular measurements alone. Here we will consider two examples:

• Glyoxylate cycle in prokaryotes

• Nitrogen assimilation via the GS-GOGAT system

In prokaryotes, the TCA cycle and all anaplerotic reactions, including the glyoxylate cycle, operate in the cytosol. Often the glyoxylate cycle is considered as a bypass of the TCA cycle because it shares a number of reactions with this cycle (see Fig. 2.5). However, the two pathways serve very different purposes: the TCA cycle has the primary purpose of oxidizing pyruvate to carbon dioxide, whereas the glyoxylate cycle has the purpose of synthesizing precursor metabolites, e.g., oxaloacetate, from acetyl-CoA. Considered individually the TCA cycle and the glyoxylate shunt are not linearly dependent, but if other anaplerotic pathways, e.g. the pyruvate carboxylase reaction, are included, a singularity arises. This may be illustrated by writing lumped reactions for the three pathways (see Fig. 2.10 for an overview of the pathways). In all cases we use pyruvate as the starting point:

TCA cycle: - pyruvate + 3 CO2 + 4 NADH + FADH2 + GTP = 0 (1)

Glyoxylate shunt: - 2 pyruvate + 2 CO2 + oxaloacetate + 4 NADH + FADH2 = 0 (2)

Pyruvate carboxylase: - pyruvate - ATP - CO2 + oxaloacetate = 0 (3)

If ATP and GTP are pooled together (which is often done in the analysis of cellular reactions), it is quite obvious that the glyoxylate shunt is a linear combination of the two other pathways, and all three pathways cannot be determined independently by flux analysis. It may be a difficult task to decide which pathway should be eliminated. Fortunately, these pathways rarely operate at the same time as their enzymes are induced differently. Information about induction and regulation of the corresponding enzymatic activities is critical in making a decision as to the exact pathway to be considered at a given set of environmental conditions. For example, expression of isocitrate lyase (the first enzyme of the glyoxylate shunt) is repressed by glucose in many microorganisms, and consequently the glyoxylate shunt is inactive for growth on glucose. In eukaryotes, the presence of the glyoxylate shunt does not give rise to a linear dependency due to compartmentation of the different reactions, i.e., the TCA cycle operates in the mitochondria and the glyoxylate shunt either in the cytosol or in microbodies. In practice there are many other reactions in the network that involve intermediates of the TCA cycle and the glyoxylate shunt, and these reactions may lead to a removal of the linear dependency between these two pathways (see also Example 5.6). Even in cases where the two pathways are not linearly dependent the inclusion of both pathways in the model may lead to an ill-conditioned system, i.e., the condition number may be high (see Note 5.4).
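The linear dependency between the three lumped reactions can be checked numerically with a rank test. The sketch below uses the stoichiometric coefficients of the TCA cycle, glyoxylate shunt and pyruvate carboxylase reactions as written in this note; the column ordering and variable names are choices made here for illustration.

```python
import numpy as np

# Columns: pyruvate, CO2, oxaloacetate, NADH, FADH2, GTP, ATP
reactions = np.array([
    [-1.0,  3.0, 0.0, 4.0, 1.0, 1.0,  0.0],   # TCA cycle
    [-2.0,  2.0, 1.0, 4.0, 1.0, 0.0,  0.0],   # glyoxylate shunt
    [-1.0, -1.0, 1.0, 0.0, 0.0, 0.0, -1.0],   # pyruvate carboxylase
])

# With GTP and ATP kept as separate compounds the rows are independent.
rank_separate = np.linalg.matrix_rank(reactions)

# Pool GTP and ATP into a single "high-energy phosphate" column.
pooled = np.column_stack([reactions[:, :5],
                          reactions[:, 5] + reactions[:, 6]])
rank_pooled = np.linalg.matrix_rank(pooled)

print("rank with GTP and ATP separate:", rank_separate)
print("rank with GTP and ATP pooled:  ", rank_pooled)

# After pooling, glyoxylate shunt = TCA cycle + pyruvate carboxylase.
print(np.allclose(pooled[0] + pooled[2], pooled[1]))
```

A rank lower than the number of reactions signals that not all of the pathways can be resolved from the balances alone; a high value of `np.linalg.cond` on a full-rank matrix would similarly flag a nearly dependent, ill-conditioned system.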

Another example of linearly dependent reactions is the two ammonia assimilation routes: the glutamate dehydrogenase (GDH) catalyzed reaction and the GS-GOGAT system (see Section 2.4.1). The stoichiometries of these two routes are

GDH: - α-ketoglutarate - NH3 - NADPH + glutamate = 0

GS-GOGAT: - α-ketoglutarate - NH3 - NADPH - ATP + glutamate = 0

Thus, the only difference is that ATP is used in the GS-GOGAT route (which is a high-affinity system) but not in the GDH reaction. The problem here is that an ATP balance is not easy to utilize due to lack of sufficient information about all ATP-consuming reactions. In the absence of an ATP balance to differentiate between them, the two nitrogen assimilation reactions are linearly dependent and, as such, non-observable. Because the only difference between the two routes is the consumption of ATP in the GS-GOGAT system, distinction between the two routes may not be important, and they are therefore often lumped into a single reaction in stoichiometric models.

If a singularity arises in the stoichiometric matrix, one has the following two options:

(1) Remove the linearly dependent reaction(s) from the model, invoking (or postulating) information about specific enzyme regulation and induction.

(2) Introduce additional information such as the relative flux of the two pathways. Such information may be derived from measurements of enzyme activities, e.g., the relative activity of key enzymes in the two routes. However, this approach is hampered by the fact that in vitro enzyme activity measurements often bear little relationship to actual in vivo flux distributions. A more powerful technique is the use of labeled substrates, e.g., 13C-enriched glucose, followed by measurements of the labeling pattern of intracellular metabolites as discussed in Section 5.4.2.
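Option (2), adding information on the relative flux of two otherwise indistinguishable routes, can be sketched for the two ammonia assimilation reactions. Without an ATP balance both routes contribute identically to the glutamate balance, so only their sum is observable. The total assimilation rate `q_total` and the flux ratio `rho` below are hypothetical numbers chosen for illustration, e.g. as a ratio that might come from labeling data; they are not values from the text.

```python
import numpy as np

# Unknowns: v = [v_GDH, v_GS_GOGAT]
# Without an ATP balance, both routes look the same to the model:
balance = np.array([[1.0, 1.0]])         # v_GDH + v_GS_GOGAT = q_total
q_total = 1.0                            # assumed measured total rate

# One equation, two unknowns: the individual fluxes are non-observable.
rank_before = np.linalg.matrix_rank(balance)

# Additional information: assumed flux ratio v_GDH = rho * v_GS_GOGAT.
rho = 4.0                                # hypothetical ratio
A = np.vstack([balance,
               [1.0, -rho]])             # v_GDH - rho * v_GS_GOGAT = 0
b = np.array([q_total, 0.0])

rank_after = np.linalg.matrix_rank(A)
v = np.linalg.solve(A, b)

print("rank before / after:", rank_before, rank_after)
print("v_GDH, v_GS_GOGAT =", v)
```

The extra row plays the same role as any other constraint: it raises the rank of the system so that the matrix equation can be solved, and its reliability determines the reliability of the resolved fluxes.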

The combination of a metabolic model, based only on reaction stoichiometries, and measurement of a few rates is a very simple method for estimation of intracellular fluxes, and it has been used to study many different fermentation processes (Vallino and Stephanopoulos, 1993; Vallino and Stephanopoulos, 1994a,b; Jørgensen et al., 1995; van Gulik and Heijnen, 1995; Sauer et al., 1996; Nissen et al., 1997; Pramanik and Keasling, 1997; Pedersen et al., 1999). Clearly it is valuable to quantify the fluxes through the different branches of the metabolic network considered in the model, and in Section 5.4.2 we discuss how information on fluxes may be used to guide genetic modification, resulting in strains with improved properties. The approach may, however, also be used for analysis of the metabolic network, i.e., which pathways are likely to operate. This will be illustrated in Examples 5.7 and 5.8. It is important to emphasize that such analysis must always be followed up with experimental verification, but clearly the simple approach discussed in this section may be used as an efficient guide to the experimental work.

Example 5.7 Metabolic Flux Analysis of Citric Acid Fermentation by Candida lipolytica

Aiba and Matsuoka (1979) were probably the first to apply the concept of metabolite balancing to analyse fermentation data. They studied the yeast Candida lipolytica producing citric acid, and the aim of their study was not to quantify the fluxes but rather to find which pathways were active during citric acid production. For their analysis they employed the simplified metabolic network shown in Fig. 5.5. The network includes the EMP pathway, the TCA cycle, the glyoxylate shunt, pyruvate carboxylation, and formation of the major macromolecular pools, i.e., proteins, carbohydrates, and lipids. At least one of the two anaplerotic routes is obviously necessary to replenish TCA cycle intermediates when citrate and isocitrate are secreted to the extracellular medium.

In the network there is a total of 16 compounds and of these 8 are intracellular metabolites for which pseudo steady state conditions apply. The compounds for which the rate of formation or consumption can be measured are:

Glucose (glc), ammonia (N), carbon dioxide (c), citrate (cit), isocitrate (ic), protein (prot), carbohydrates (car) and lipids (lipid). The intracellular metabolites for which pseudo steady state applies are:

Glucose-6-phosphate, pyruvate, acetyl-CoA, 2-oxoglutarate, succinate, malate, oxaloacetate, glyoxylate.

Notice that citrate and isocitrate do not remain in pseudo steady state: these metabolites are constantly produced. One could specify these compounds as intracellular metabolites being in pseudo steady state, but this would require including two additional reactions in the network (indicated by the broken secretion lines in Fig. 5.5).

Based on the network shown in Fig. 5.5 we can set up eq. (5.1) for the model. Mole basis is used for all stoichiometric coefficients and the protein synthesis rate is based on moles of OGT consumed. Similarly the rate of carbohydrate synthesis is based on moles of G6P consumed.
