

Genome Informatics 16(1): 164–173 (2005)

Structural and Dynamical Analyses of the Kinase Network Derived from the Transpath Database

Bernd Binder, Reinhart Heinrich

Theoretical Biophysics, Institute of Biology, Humboldt University Berlin, Berlin, Germany

Abstract

We analyze the structural design and the dynamical properties of a protein kinase network derived from the Transpath database [14]. We consider structural properties, such as feedback cycles, pathway lengths, fraction of shortest pathways and crosstalk. Dynamic characteristics of the network are analyzed by using nonlinear differential equations with a special focus on kinase amplitudes and signal propagation times. Comparison with random networks shows that the cellular kinase network exhibits special features which might be a result of natural selection. In particular, the Transpath network contains no cycles, and input kinases and output kinases are generally connected by shortest signalling routes. Moreover, it displays a characteristic spectrum of cross-talk between different pathways.

Keywords: Transpath, kinases, network design, pathway length, crosstalk, amplitudes, signal duration

1 Introduction

In recent years several mathematical models have been developed for calculating the dynamical properties of specific signal transduction networks. Recent examples are the MAPK pathways [4, 6, 7, 15], the JAK/Stat pathway [17] and the Wnt pathway [9]. These models are based on the kinetic properties of the compounds participating in a given signaling pathway. Complementary to these approaches one may envisage theoretical analyses which aim to understand the design of these networks. They are motivated by the fact that biochemical networks are, in contrast to chemical reaction systems of inanimate nature, the result of natural selection. Whereas several such investigations already exist for metabolic networks [3, 10, 16], the consideration of design principles of signaling pathways is still at the beginning [12, 13]. Recently, we systematically analysed the dynamic stability of small kinase/phosphatase networks depending on their structural characteristics such as average connectivity and number of feedback cycles [1].

In the present work we focus on the structural and dynamic properties of large-scale kinase networks. In particular we investigate a network retrieved from the Transpath database [14]. We are interested in structural features such as the number of input/output kinases, the lengths of signaling pathways and the degree of crosstalk. Dynamic properties of kinases are described in terms of signal amplitudes, signal durations and signal propagation times. To identify specific features of cellular signaling networks we compare the results with the corresponding properties of random networks.

2 Basic Properties of the Transpath Network

We analyse the signal transduction network depicted in Figure 1, which has been deduced from the database Transpath. This database was chosen because it provides sufficient information on kinases and their mutual interactions and is therefore most appropriate for the analysis of large kinase networks. The network is shown as a graph in which nodes represent the kinases and edges the phosphorylations. It includes n = 86 kinases $K_i$. We did not concentrate on a specific organism but considered all kinases annotated. The actual number of entries in the database for the keyword “kinase” is much higher (1,100). However, many of them are orthologous enzymes in different species. In our scheme each such group is represented by a single kinase.

Figure 1: Transpath network. Black and white boxes indicate input and output kinases, respectively. Other kinases (grey boxes) receive and transmit signals.

The network contains r = 171 interactions representing direct phosphorylations of kinases by other kinases. Only those phosphorylations were included which have an activating effect on the target. Moreover, isolated kinases, which neither activate any other kinase nor are themselves activated, have not been included. In those cases where the database contains insufficient information concerning direct or indirect phosphorylation, we consulted the original publication from which the information was retrieved or took into account more recent publications, e.g. [2, 8].

From the numbers given above one derives that each single component has on average 2-3 interaction partners. For the connectivity of the network,

$$\kappa = \frac{r}{n(n-1)}, \tag{1}$$

one obtains κ = 0.0234.
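As a quick check of Eq. (1), the following minimal Python sketch recomputes the connectivity from the node and edge counts quoted in the text. The function name `connectivity` is only an illustrative choice.

```python
def connectivity(n_nodes: int, n_edges: int) -> float:
    """Eq. (1): fraction of the n(n-1) possible directed edges that are realised."""
    return n_edges / (n_nodes * (n_nodes - 1))


n, r = 86, 171  # kinases and activating phosphorylations quoted in the text
kappa = connectivity(n, r)
print(f"kappa = {kappa:.4f}")  # prints: kappa = 0.0234
```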

3 Methods

The network in Figure 1 is visualised with the program “graphviz” [18]. The algorithm produces layered drawings of directed graphs. As far as possible, edges are drawn in the same direction (in our case from “top to bottom”) and edge lengths are reduced. Moreover, crossing of edges is avoided when possible. Visualization by this program already reveals several interesting characteristics, in particular a layered structure reflecting groups of upstream and downstream kinases. For example, well-known components like MAP kinases (e.g. ERK, JNK, p38), MAPK kinases (MEK, MKK) and MAPKK kinases (Raf, MEKK) are assigned to three consecutive layers.

To analyse the structural design of the kinase network we use graph theoretical methods, e.g. evaluation of the number and lengths of pathways and feedback cycles. There are 25 input kinases, which are not activated by any other kinase, and 27 output kinases, which do not have a target kinase. A signaling pathway $P_k$ consists of a route of activations starting from an input kinase and leading to an output kinase. If the graph representing a given network does not contain cycles, each pathway has a unique length $L_k$ counted in numbers of activations. Dynamic properties of the kinase network were investigated by using ordinary differential equations, the solutions of which provide activation profiles of all kinases.

To assess whether the kinase network has a specific design we compare its properties with those of random networks. For generating random networks we start with a copy of the Transpath network and use a method described in [11]. It involves successive steps of randomization by reshuffling pairs of edges. In each step two edges are chosen at random and then their target nodes or their nodes of origin are exchanged. This ensures that for each node the number of incoming and outgoing edges, that is, the single node characteristics of the network, remains fixed. In particular, the number of input and output kinases does not change. To arrive at a network which differs considerably from the original network, we applied 5,000 steps of reshuffling. The process was repeated to generate a set of 100,000 random networks.
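The reshuffling procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the paper: the function name `randomize_edges` is hypothetical, edges are given as unique `(source, target)` tuples, and swaps that would create a self-activation or a duplicate edge are simply skipped (the text does not say how such cases were handled).

```python
import random


def randomize_edges(edges, n_steps=5000, seed=None):
    """Degree-preserving randomization by exchanging the targets of edge pairs.

    Swapping targets of (a, b) and (c, d) gives (a, d) and (c, b); swapping the
    origins instead yields the same pair of edges, so one rule covers both.
    """
    rng = random.Random(seed)
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_steps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        new1, new2 = (a, d), (c, b)
        # Assumption: reject swaps producing self-loops or duplicate edges.
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_list


# Toy network (hypothetical): nodes 0 and 1 are inputs, node 5 is the output.
toy_edges = [(0, 2), (0, 3), (1, 2), (1, 4), (2, 5), (3, 5), (4, 5)]
shuffled = randomize_edges(toy_edges, n_steps=200, seed=1)
```

Because every accepted swap keeps each source and each target in the edge list, the in-degree and out-degree of every node are unchanged, and with them the sets of input and output kinases.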

4 Results

4.1 Cycles, Lengths of Pathways, and Degree of Cross-Talk

Cycles: Closer inspection of the Transpath network reveals that it does not contain cycles of any length (c = 0), which is, in view of the high number of interconnections, a rather remarkable property. To assess whether this is a peculiarity of the kinase network, we analysed random networks with respect to the occurrence of cycles. Whereas it is rather cumbersome to determine the exact number of cycles for the whole set of these networks, one may easily calculate the number of networks containing no cycles at all. By analyzing the underlying adjacency matrices we obtained the result that among the 100,000 random networks under consideration there were only 37 networks which did not contain any cycle. This suggests that the probability of finding a “directed acyclic graph” (DAG) among the randomised networks is extremely low (less than 0.04 %). One may therefore conclude that the absence of cycles is a very striking feature of the Transpath network.

Pathway Lengths: MAP kinase pathways which are included in the network depicted in Figure 1 have the remarkable property that they consist of three to four consecutive levels of a phosphorylation cascade. It is an intriguing question whether such a pathway length is typical also for other pathways. Since the network contains 25 input kinases and 27 output kinases, there are 675 pairs of kinases between which signaling routes could exist. Inspection shows that not all output kinases are reachable by routes from every input kinase and that routes exist only between $N^{\mathrm{in,out}} = 222$ such pairs. The total number of routes amounts to $P_{\mathrm{tot}} = 1272$, i.e. there are, on average, about six routes between each pair of connected kinases.
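Whether a given network is cycle-free can be tested without enumerating cycles. The text states that the adjacency matrices were analysed; the sketch below swaps in a different but equivalent test, Kahn's topological-sort algorithm, and is only an illustration. The function name `is_acyclic` and the integer node labels `0..n-1` are assumptions.

```python
from collections import deque


def is_acyclic(n_nodes, edges):
    """Return True if the directed graph has no cycle (Kahn's algorithm).

    Nodes without remaining incoming edges are removed repeatedly; the graph
    is a DAG exactly when every node can be removed this way.
    """
    indegree = [0] * n_nodes
    successors = [[] for _ in range(n_nodes)]
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    queue = deque(i for i in range(n_nodes) if indegree[i] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return removed == n_nodes
```

Applied to each randomized network, the fraction of `True` results estimates the probability of obtaining a DAG; the value reported in the text is 37 out of 100,000.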


We calculated the lengths of the 222 shortest pathways between any two connected input and output kinases. The histogram in Figure 2A shows the number of shortest routes for a given length L. It can be seen that no shortest route is longer than six steps and that there is a considerable number of such routes consisting of only one step. Shortest routes of length three occur more often than routes of other lengths. The typical length of shortest routes in this network therefore corresponds to that of MAPK pathways.

Figure 2: Length distributions of shortest pathways between input and output kinases. (A) Kinase network from Transpath ($N^{\mathrm{in,out}} = 222$); (B) Random network with the same single node degrees and $N^{\mathrm{in,out}} = 349$.

Among the 222 connected pairs of kinases there are 57 pairs which are connected by only one route. In another class, which contains 134 pairs, there exist several routes but all of them are of equal length, that is, all of these routes represent shortest routes. There exist only 31 pairs which are connected by routes of different length. Within the latter class the longest routes between two kinases comprise 10 activation steps. These results indicate that input and output kinases are typically interconnected by shortest routes.

For comparing this structural feature with the distribution of pathway lengths of other networks, we introduce for each connected pair of input and output kinases the fraction of shortest routes

$$f_\mu = \frac{S_\mu}{T_\mu}, \tag{2}$$

where the pairs are numbered by the index µ ($1 \le \mu \le N^{\mathrm{in,out}}$). $S_\mu$ and $T_\mu$ denote the number of shortest routes and the total number of routes, respectively, for pair µ. The fraction of shortest routes among all routes within the network is defined as

$$F = \frac{1}{N^{\mathrm{in,out}}} \sum_{\mu=1}^{N^{\mathrm{in,out}}} f_\mu, \tag{3}$$

with 0 < F ≤ 1. High values of F indicate that a network is rather uniform in the sense that it consists mainly of shortest routes between connected kinases. For the Transpath network we derived $F_{\mathrm{Trans}} = 0.89$.

Pathway lengths are also calculated for cycle-free random networks. We generated 100 networks under the additional constraint that the resulting networks do not contain cycles. Since the degrees of the nodes are preserved, all these networks have the same number of input kinases and output kinases as the Transpath network. However, the number of connected pairs is generally different. The histogram in Figure 2B shows the distribution of shortest pathway lengths for a random network having the highest number of connected pairs of input and output kinases ($N^{\mathrm{in,out}} = 349$). The typical length of shortest routes is L = 4, which is only slightly higher than that of the Transpath network. The results for other random networks are similar, but the typical length decreases slightly with decreasing $N^{\mathrm{in,out}}$ values. The $N^{\mathrm{in,out}}$ values for the 100 random networks vary by more than a factor of two around an average value of 253, as can be seen in Figure 3.
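Equations (2) and (3) can be evaluated for any cycle-free network by counting routes per length with memoised recursion. The following Python sketch is one possible way to do this, under the assumptions that the edge list is acyclic and free of duplicates; the function names `route_length_counts` and `fraction_of_shortest_routes` are illustrative and do not come from the paper.

```python
from collections import defaultdict


def route_length_counts(succ, source, target, memo):
    """Return {route length: number of routes} from source to target in a DAG."""
    if source == target:
        return {0: 1}
    if source not in memo:
        counts = defaultdict(int)
        for nxt in succ.get(source, ()):
            for length, k in route_length_counts(succ, nxt, target, memo).items():
                counts[length + 1] += k
        memo[source] = dict(counts)
    return memo[source]


def fraction_of_shortest_routes(edges):
    """Return (F, number of connected input/output pairs) following Eqs. (2)-(3)."""
    succ = defaultdict(list)
    sources, targets = set(), set()
    for u, v in edges:
        succ[u].append(v)
        sources.add(u)
        targets.add(v)
    inputs = sources - targets    # kinases that are never activated
    outputs = targets - sources   # kinases that activate no other kinase
    fractions = []
    for out in outputs:
        memo = {}                 # route counts towards this output only
        for inp in inputs:
            counts = route_length_counts(succ, inp, out, memo)
            if counts:            # the pair is connected
                s_mu = counts[min(counts)]     # number of shortest routes
                t_mu = sum(counts.values())    # total number of routes
                fractions.append(s_mu / t_mu)
    n_pairs = len(fractions)
    F = sum(fractions) / n_pairs if n_pairs else float("nan")
    return F, n_pairs


# Toy example: routes 0->2 (length 1) and 0->1->2 (length 2), so f = 1/2.
F_toy, n_toy = fraction_of_shortest_routes([(0, 1), (1, 2), (0, 2)])
```

The dictionary returned for each pair also contains the full length distribution, so the histograms of shortest route lengths can be built from the same pass.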

Figure 3: Frequency of pairs of connected input and output kinases ($N^{\mathrm{in,out}}$). The histogram shows the result for 100 random networks. The arrow indicates the number of pairs found in the Transpath network.

Figure 4: Fraction of shortest routes between input and output kinases. Shown are the frequencies with which, among 100 random networks, the fraction of shortest routes is found in certain intervals. The arrow indicates the F value for the Transpath network.

We calculated the fraction of shortest routes for each randomized network, leading to the results depicted in Figure 4. Compared with the Transpath network (whose F value is indicated by an arrow) the fraction of shortest routes within randomized networks is generally much lower, meaning that random networks show a much more diverse distribution of pathway lengths.

Signaling Cross-Talk: Inspection of Figure 1 shows that the signaling pathways between input and output kinases are generally not isolated from each other. Two pathways $P_k$ and $P_l$ may share kinases, that is, they overlap, or they may perform cross-talk in that a kinase from one pathway interacts with a kinase of the other pathway. Examples are different MAPK cascades running in parallel. This is shown in Figure 5, redrawn from data given in [2]. It is seen that specificity of the kinases in these pathways is highest at the level of MAPKK activation of individual MAPKs. Activations exerted by MAPKKKs and MAPKas are generally more unspecific, indicating that crosstalk is more pronounced at these two levels of MAPK pathways.

For a general assessment we introduce a measure which enables us to quantify the degree of cross-talk between two distinct pathways. We consider pairs of pathways which differ in their input and output kinases and which also have no common kinases in between. They may contain, however, kinases which are activated by kinases of the other pathway. A simple example is shown in Figure 6A. An appropriate measure for the strength of cross-talk between two pathways $P_k$ and $P_l$ is given by

$$\rho_{kl} = \frac{r_{kl}}{R_{kl}}. \tag{4}$$

In this formula $R_{kl}$ denotes the maximal number of possible cross interactions which, hypothetically, could exist such that the two pathways under consideration would still be regarded as separate routes. The actual number of encountered interactions between $P_k$ and $P_l$ is denoted as $r_{kl}$.
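The counting step behind Eq. (4) can be sketched as follows. The sketch only determines $r_{kl}$ from the edge list; since the text gives no closed formula for $R_{kl}$, the maximal number of cross interactions is passed in as the parameter `r_max`, which is an assumption of this example, as are the function name `crosstalk_strength` and the toy values used below.

```python
def crosstalk_strength(path_k, path_l, edges, r_max):
    """Eq. (4): rho_kl = r_kl / R_kl for two disjoint pathways.

    path_k, path_l : sequences of kinases along each pathway
    edges          : (source, target) activations of the whole network
    r_max          : R_kl, supplied by the caller (not derived here)
    """
    nodes_k, nodes_l = set(path_k), set(path_l)
    if nodes_k & nodes_l:
        raise ValueError("the measure is defined for pathways without common kinases")
    # r_kl: activations leading from a kinase of one pathway to a kinase of the other.
    r_kl = sum(
        1
        for u, v in set(edges)
        if (u in nodes_k and v in nodes_l) or (u in nodes_l and v in nodes_k)
    )
    return r_kl / r_max


# Toy example with three cross-interactions; r_max = 6 is a hypothetical value.
toy_edges = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 4), (3, 1), (1, 5)]
rho = crosstalk_strength([0, 1, 2], [3, 4, 5], toy_edges, r_max=6)
```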


Figure 5: Crosstalk in MAPK pathways. Shown are three groups of MAPK pathways showing mutual interactions by kinase activations between different levels of the cascade. (Redrawn from data given in [2]).





Figure 6: Crosstalk between pathways. (A) Illustration of crosstalk between two simple pathways, $P_k$ and $P_l$, with three cross-interactions ($r_{kl} = 3$). (B) and (C): Distribution of the degree of crosstalk for the Transpath network and a selected random network, respectively.

Figure 6B shows the distribution of the ρ values for the Transpath network and Figure 6C a corresponding distribution for a selected random network with an average value of $N^{\mathrm{in,out}}$. For the Transpath network the data were derived by analysing the interactions of all 1,272 pathways. The total number of pairs of pathways amounts to $P_{\mathrm{tot}}(P_{\mathrm{tot}} - 1)/2 = 808{,}356$. Since 138,074 of these pairs are not disjoint, a total of 670,282 pairs was taken into account for crosstalk analysis. It is seen that for the random network as well as for the Transpath network the cross-talk strengths are lower than 0.5. The cross-talk strengths for the random network exhibit a rather narrow distribution centered around an intermediate value of ρ ≃ 0.2 (other random networks show the same characteristics, not shown). This implies the occurrence of only low numbers of pathway pairs with weak crosstalk as well as with strong crosstalk. In contrast, the crosstalk distribution of the kinase network from Transpath displays a rather high number of pathway pairs which do not show any crosstalk at all (ρ = 0) as well as a considerably high number of pathway pairs exhibiting rather high crosstalk strength (ρ ≃ 0.32). Pairs of pathways with intermediate ρ-values occur less frequently than in random networks.

Our analysis may give rise to the hypothesis that evolutionary optimization acted in such a way that there are now predominantly pathways which are rather isolated and another group of pathways showing many interactions with other pathways. Corroboration of this hypothesis requires, however, more complete data than those presently contained in Transpath, for example with respect to the number of kinases involved and with respect to the accuracy of existing interactions.

4.2 Amplitudes, Propagation Times and Duration of Signals

Above we studied exclusively aspects of the structural design. In the present section we analyse how the signal spreads through the network. For such an analysis not only kinases have to be taken into account but also phosphatases, which are responsible for downregulating the pathway after stimulation by a receptor. This dynamic analysis is based on a kinetic model which consists of the following differential equations

$$\frac{dX_i}{dt} = R(t)\,\delta_i\,\tilde{X}_i + \sum_{j \neq i}^{n} \alpha_{ij}\,\tilde{X}_i X_j - \beta_i X_i, \tag{5}$$

where $\tilde{X}_i$ and $X_i$ denote the concentrations of the unphosphorylated and phosphorylated forms, respectively, of the ith kinase [5]. They are related to each other by the conservation equation $\tilde{X}_i + X_i = C_i = \mathrm{const.}$; $\alpha_{ij}$ and $\beta_i$ are the rate constants of kinases and phosphatases, and $R(t)$ denotes the time-dependent concentration of the active receptor. Eqs. (5) are the simplest ODEs for a description based on mass action kinetics. It is assumed that the receptor affects exclusively input kinases by transforming them from inactive to active forms. The coefficients $\delta_i$ are chosen such that $\delta_i = 1$ for input kinases and $\delta_i = 0$ otherwise. The signaling off-state is defined by R = 0, which is dynamically stable since the network does not contain cycles (for stability analysis, cf. [1]). Receptor activation is described by the equation $R(t) = R_0 \exp(-\lambda t)$ with λ denoting the reciprocal value of the characteristic lifetime of the activated receptor. To allow for an analytical treatment, the case of weakly activated kinases is studied, that is $X_i \ll \tilde{X}_i$, which implies $\tilde{X}_i \simeq C_i$. In this way, system (5) consists of linear differential equations. Based on these equations we calculated transient activation profiles for all kinases. Each is characterized by an amplitude $S_i$, a signal duration $\theta_i$ and a signal propagation time $\tau_i$ [5]:

$$\tau_i = \frac{1}{I_i}\int_0^\infty t\,X_i(t)\,dt; \qquad \theta_i^2 = \frac{1}{I_i}\int_0^\infty t^2 X_i(t)\,dt - \tau_i^2; \qquad S_i = \frac{I_i}{2\theta_i} \quad \text{with} \quad I_i = \int_0^\infty X_i(t)\,dt. \tag{6}$$

The results for these signaling characteristics are depicted in the histograms of Figures 7A-C.
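For the linearised system ($\tilde{X}_i \simeq C_i$) the integrals in Eq. (6) need not be evaluated by numerical integration. Multiplying the linear equations by $t^k$ and integrating by parts gives a chain of linear algebraic systems for the moments $\int_0^\infty t^k X_i\,dt$. The Python sketch below uses this shortcut; it is an illustration under stated assumptions, not necessarily the procedure used for the paper. The function name `signal_characteristics`, the index convention `alpha[i, j]` (kinase j activates kinase i) and the defaults $C_i = 1$, $R_0 = 1$ are assumptions of this example. The network must be cycle-free and every kinase must be reachable from an input.

```python
import numpy as np


def signal_characteristics(alpha, beta, is_input, C=None, R0=1.0, lam=1.0):
    """Amplitude S, propagation time tau and duration theta of Eq. (6).

    Linearised Eq. (5): dX/dt = A X + b exp(-lam t), X(0) = 0, with
    A[i, j] = C_i alpha_ij (j != i), A[i, i] = -beta_i, b_i = R0 delta_i C_i.
    """
    alpha = np.asarray(alpha, dtype=float)
    n = alpha.shape[0]
    beta = np.asarray(beta, dtype=float)
    C = np.ones(n) if C is None else np.asarray(C, dtype=float)

    A = C[:, None] * alpha
    np.fill_diagonal(A, 0.0)          # the sum in Eq. (5) excludes j = i
    A -= np.diag(beta)
    b = R0 * np.asarray(is_input, dtype=float) * C

    # Moments m_k = integral of t^k X(t) dt obey A m_k = -(k m_{k-1} + k! b / lam^(k+1)).
    m0 = np.linalg.solve(A, -b / lam)
    m1 = np.linalg.solve(A, -(m0 + b / lam**2))
    m2 = np.linalg.solve(A, -(2.0 * m1 + 2.0 * b / lam**3))

    tau = m1 / m0
    theta = np.sqrt(m2 / m0 - tau**2)
    S = m0 / (2.0 * theta)
    return S, tau, theta


# Two-step cascade (hypothetical): kinase 0 is the input and activates kinase 1.
alpha = np.array([[0.0, 0.0],
                  [1.0, 0.0]])
S, tau, theta = signal_characteristics(alpha, beta=[1.0, 1.0], is_input=[1, 0])
```

For this cascade with $\alpha_{ij} = \beta_i = \lambda = 1$ the propagation times come out as 2 and 3, i.e. each step adds $1/\beta_i$ to the delay caused by the receptor lifetime $1/\lambda$.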





Figure 7: Amplitudes (A), signalling propagation times (B) and signal durations (C) for the Transpath network. Kinases with highest amplitudes are marked. Kinetic parameters: αij = βi = λ = 1.


It is seen that the signaling amplitudes are spread over about two orders of magnitude ($0.4 < S_i < 40$). Values $S_i > 1$ indicate that the signal arrives at kinase $K_i$ in an amplified way. Signaling propagation times and signal durations are distributed within smaller ranges and vary by factors of about 4 and 2, respectively. Kinases with high amplitudes are marked in the figure. Except for PFK2, these kinases are mainly involved in processes concerning mRNA translation.

It is of interest how the different signalling characteristics are correlated. As an example, Figure 8 shows for all kinases the values of the pairs $\tau_i$ and $S_i$. In some cases the nodes represent several kinases with identical dynamic characteristics. For such kinases the structure of the upstream network is the same. In most cases a higher signaling time implies a higher amplitude, whereby the corresponding relation is highly nonlinear (note the linear scale for τ and the log scale for S). In a few cases the amplitudes decrease with increasing signaling time. This occurs for kinases which are connected along a linear pathway. Each of these kinases receives only a single activating input from an upstream kinase. In contrast, kinases with high signaling times and very high amplitudes receive inputs from many kinases.

Figure 8: Interrelations between amplitudes and signal propagation times of kinases.

The histograms in Figure 9A, B show distributions of correlation coefficients $R_{S,\tau}$ and $R_{S,\theta}$ evaluated from 100 random networks and the corresponding correlation coefficients for the Transpath network. For the latter network the correlation coefficients are comparatively low, indicating that in real networks high amplitudes do not necessarily coincide with long activation profiles.

Figure 9: Correlation coefficients of 100 random networks and the Transpath network. (A) between amplitudes and signal propagation times and (B) between amplitudes and durations.

5 Discussion

In this paper we studied structural and dynamical properties of a kinase network retrieved from the database Transpath. They are compared with the corresponding properties of random networks of the same size and with the same single node characteristics. Concerning structural features the main differences between these two types of networks can be described as follows: (1) The cellular kinase network contains no cycles, whereas the probability for a random network to be cycle free is less than 0.04 %; (2) Input and output kinases are typically connected by shortest signaling routes. In random networks these kinases are connected by pathways which vary considerably in their length; (3) The cellular network displays a characteristic spectrum of cross-talk between different pathways. It is bimodal in the sense that there is, on the one hand, a considerable fraction of pathway pairs having a very low degree of cross-talk, and, on the other hand, another fraction of pathways showing a rather strong cross-talk. In contrast, cases of very weak or strong cross-talk are extremely rare in random networks.

With respect to the length distribution of shortest pathways between input and output kinases, the cellular network does not differ strongly from random networks (cf. Figure 2A and B). However, this may result partly from the fact that randomization was performed under the constraint that some basic characteristics of the Transpath network remain unchanged. By investigating dynamical properties we found that in the cellular network signaling amplitudes and signal durations are only weakly correlated compared to random networks.

Our investigations suggest that signal transduction networks display characteristic features resulting from natural selection. The absence of cycles ensures the dynamic stability of the signaling off-state as shown previously [1]. Moreover, the high fraction of shortest pathways indicates that long pathways have been avoided as much as possible. Accordingly, natural selection seems to have reduced the degree of pathway redundancy. Concerning pathway dynamics we obtained some evidence that the network design is such that high signal amplification does not necessarily imply long signal duration and vice versa.

For future work one may envisage similar treatments for organism-specific kinase networks. However, this would require more complete information on signaling processes in individual species. Moreover, it would be intriguing to extend the analysis by considering additional kinds of signaling components such as adaptor proteins, G proteins or scaffold proteins. This would also result in a wider spectrum of individual chemical processes. For characterising the structural properties of such complex networks one has to apply more advanced graph theoretical methods. Complexity increases even further when processes of protein synthesis and degradation are also considered, which have been found to play a crucial role in signal transduction [9].

References

[1] Binder, B. and Heinrich, R., Interrelations between dynamical properties and structural characteristics of signal transduction networks, Genome Inform., 15(1):13–23, 2004.
[2] Cowan, K.J. and Storey, K.B., Mitogen activated protein kinases: New signaling pathways functioning in cellular responses to environmental stress, J. Exp. Biol., 206:1107–1115, 2003.
[3] Ebenhöh, O. and Heinrich, R., Stoichiometric design of metabolic networks: Multifunctionality, clusters, optimization, weak and strong robustness, Bull. Math. Biol., 65(2):323–357, 2003.
[4] Hatakeyama, M., Kimura, S., Naka, T., Kawasaki, T., Yumoto, N., Ichikawa, M., Kim, J.H., Saito, K., Saeki, M., Shirouzu, M., Yokoyama, S., and Konagaya, A., A computational model on the modulation of mitogen activated protein kinase (MAPK) and Akt pathways in heregulin-induced ErbB signalling, Biochem. J., 373:451–463, 2003.
[5] Heinrich, R., Neel, B.G., and Rapoport, T.A., Mathematical models of protein kinase signal transduction, Mol. Cell, 9:957–970, 2002.
[6] Hornberg, J.J., Bruggeman, F.J., Binder, B., Geest, R.G., Marjolein Bij de Vaate, A.J., Lankelma, J., Heinrich, R., and Westerhoff, H.V., Principles behind the multifarious control of signal transduction: ERK phosphorylation and kinase/phosphatase control, FEBS J., 272:244–258, 2005.
[7] Kofahl, B. and Klipp, E., Modeling the dynamics of the yeast pheromone pathway, Yeast, 21:831–850, 2004.
[8] Kyriakis, J. and Avruch, J., Mammalian mitogen-activated protein kinase signal transduction pathways activated by stress and inflammation, Physiol. Rev., 81(2):807–869, 2001.
[9] Lee, E., Salic, A., Krüger, R., Heinrich, R., and Kirschner, M.W., The roles of APC and Axin derived from experimental and theoretical analysis of the Wnt pathway, PLoS Biol., 1(1):116–132, 2003.
[10] Melendez-Hevia, E., Waddell, T.G., and Cascante, M., The puzzle of the Krebs citric acid cycle: Assembling the pieces of chemically feasible reactions, and opportunism in the design of metabolic pathways during evolution, J. Mol. Evol., 43:293–303, 1996.
[11] Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D., and Alon, U., Network motifs: Simple building blocks of complex networks, Science, 298:824–827, 2002.
[12] Milo, R., Itzkovitz, S., Kashtan, N., Levitt, L., Shen-Orr, S., Ayzenshtat, I., Sheffer, M., and Alon, U., Superfamilies of designed and evolved networks, Science, 303:1538–1542, 2004.
[13] Papin, J.A. and Palsson, B.O., Topological analysis of mass-balanced signaling networks: A framework to obtain network properties including crosstalk, J. Theor. Biol., 227:283–297, 2004.
[14] Schacherer, F., Choi, C., Gotze, U., Krull, M., Pistor, S., and Wingender, E., The TRANSPATH signal transduction database: A knowledge base on signal transduction networks, Bioinformatics, 17:1053–1057, 2001.
[15] Schoeberl, B., Eichler-Jonsson, C., Gilles, E.D., and Muller, G., Computational modeling of the dynamics of the MAP kinase cascade activated by surface and internalized EGF receptors, Nat. Biotechnol., 20:370–375, 2002.
[16] Stephani, A., Nuno, J.C., and Heinrich, R., Optimal stoichiometric designs of ATP-producing systems as determined by an evolutionary algorithm, J. Theor. Biol., 199:45–61, 1999.
[17] Swameye, I., Müller, T.G., Timmer, J., Sandra, O., and Klingmüller, U., Identification of nucleocytoplasmic cycling as a remote sensor in cellular signaling by databased modeling, Proc. Natl. Acad. Sci. USA, 100:1028–1033, 2003.
[18] http://www.research.att.com/sw/tools/graphviz/