
Prediction of Protein Complexes Based on Protein Interaction Data and Functional Annotation Data Using Kernel Methods

Shi-Hua Zhang1, Xue-Mei Ning1, Hong-Wei Liu2, and Xiang-Sun Zhang1

1 Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100080, China
{zsh, nxm}@amss.ac.cn, [email protected]
2 School of Economics, Renmin University of China, Beijing 100872, China
[email protected]

Abstract. Prediction of protein complexes is a crucial problem in computational biology. The increasing amount of available genomic data can enhance the identification of protein complexes. Here we describe an approach for predicting protein complexes based on the integration of protein-protein interaction (PPI) data and protein functional annotation data. The basic idea is that proteins in a complex often interact with each other and that protein complexes exhibit high functional consistency, often across multiple functions. We create a protein-protein relationship network (PPRN) via a kernel-based integration of these two types of genomic data. We then apply the MCODE algorithm to the PPRN to detect network clusters, which serve as computationally predicted protein complexes. We present the results of applying the approach to the yeast Saccharomyces cerevisiae. Comparison with well-known experimentally derived complexes and with the results of other methods verifies the effectiveness of our approach.

1 Introduction

Cellular organization and function are carried out through gene/protein interactions. With the ever-increasing variety of genomic data, such as DNA sequences, gene expression measurements, protein-protein interactions, and protein phylogenetic profiles, reconstructing biological machinery from these data is a crucial problem. A protein complex is a group of proteins that often interact with each other, forming a specialized piece of biochemical machinery. However, despite recent advances in technologies for detecting protein interactions, only a very few of the many possible protein complexes have been experimentally determined [1]. Prediction of protein complexes is therefore a key problem in computational biology. Some of this work has been done within PPI networks [2,3,4]. Proteins in a complex often interact with each other, so protein complexes generally correspond to dense subgraphs in PPI networks. Recently, three network clustering approaches have been applied to predict protein complexes: the MCODE (Molecular Complex Detection) algorithm [2], restricted neighborhood search clustering (RNSC) [3], and the local clique merging algorithm (LCMA) [4].

D.-S. Huang, K. Li, and G.W. Irwin (Eds.): ICIC 2006, LNBI 4115, pp. 514–524, 2006. © Springer-Verlag Berlin Heidelberg 2006


The MCODE algorithm uses connectivity values in PPI networks to identify complexes; one shortcoming is that its resulting clusters may be too sparse. RNSC partitions the PPI network using a cost function. It is a randomized algorithm, and it predicts relatively few complexes. LCMA, which is based on local clique merging, has been reported in preliminary results to be more efficient than MCODE. It should be noted that proteins in known complexes often share consistent functional annotations [2,3], so the relatively abundant functional annotation information can be employed to identify protein complexes. In ref. [3], functional homogeneity was used as a necessary condition for the prediction of protein complexes.

Kernel representation of heterogeneous genomic information has already proven to be a very useful tool in computational biology [5,6]. Each type of dataset can be represented by means of a kernel function, a real-valued function K(x, y) that defines the similarity between pairs of objects x and y (genes, proteins, and so on). Evaluating the kernel on all pairs of data objects yields a symmetric, positive semi-definite matrix K known as the kernel matrix. The distinguishing characteristic is that all types of data are represented in a unified framework even though they may differ in nature. Various kernels have been developed for integrating different genomic data [5,6]. For example, the linear kernel and the Gaussian kernel are natural choices for datasets represented by vectors, while the diffusion kernel [7] has proven very effective for describing network data. In this study, we define a simple kernel representation for the protein functional annotation data that captures the functional consistency, or even multiple functional consistency, of protein complexes.
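As a small illustration of the kernel matrix described above, the following sketch evaluates a linear and a Gaussian kernel on three toy vectors and stores the results as matrices. The data, the function names, and the bandwidth `sigma` are illustrative assumptions and do not come from the yeast datasets.

```python
import math

def linear_kernel(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def gaussian_kernel(x, y, sigma=1.0):
    """exp(-||x - y||^2 / (2 sigma^2)); sigma is an illustrative bandwidth."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_matrix(objects, kernel):
    """Evaluate the kernel on all pairs of objects."""
    return [[kernel(x, y) for y in objects] for x in objects]

# Toy data (hypothetical): three objects described by binary feature vectors.
toy = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]
K_lin = kernel_matrix(toy, linear_kernel)
K_rbf = kernel_matrix(toy, gaussian_kernel)

for row in K_lin:
    print(row)
```

Both matrices are symmetric by construction, and the Gaussian kernel has ones on its diagonal because every object is at distance zero from itself.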
Here we propose an integrated approach that attempts to identify protein complexes using protein interaction data and functional annotation information. We create an integrated protein-protein relationship network (PPRN) by using kernel methods to integrate these two types of genomic data. The network clustering method MCODE is then applied to the PPRN to detect computationally derived complexes. The MCODE algorithm was developed for detecting complexes in protein interaction networks and can detect overlapping modules, so it is also suitable for finding complexes in our networks. We applied this approach to the yeast Saccharomyces cerevisiae. The predicted protein complexes show good consistency with well-known yeast protein complexes. Comparison with other methods, such as the MCODE algorithm applied directly to the PPI network, shows the effectiveness of our approach.

2 Systems and Methods

2.1 Materials

We use yeast genomic data to predict protein complexes because yeast is currently the organism with the most comprehensive publicly available experimental datasets.

Protein Interaction Data. A physical network of 4713 yeast Saccharomyces cerevisiae proteins containing 14848 protein interactions is used in our work.


The protein-protein interactions were downloaded from the DIP database as of July 2004 and predominantly include data from large-scale experiments [8,9,10,11].

Functional Annotation Data. To exploit the functional consistency of protein complexes, we use the functional annotation of Saccharomyces cerevisiae genes in the MIPS Functional Catalog (FunCat) database [12]. FunCat is an annotation scheme for the functional description of proteins from a variety of organisms and consists of 28 main functional categories (or branches). The main branches exhibit a hierarchical, tree-like structure with up to six levels of increasing specificity, and 1307 functional categories are included in total. Here we use the functional annotation at the second level, comprising 68 categories (for the 4713 proteins), so that each protein corresponds to a 68-dimensional vector in which 1 or 0 indicates whether or not the protein belongs to a category.

Gold Standard Complex Data. To evaluate the effectiveness of our approach for predicting protein complexes, we compare the complexes predicted from the yeast data with known protein complexes in the MIPS yeast complex database [13]. To filter out, to a certain extent, the experimentally predicted protein complexes in that dataset, we use only the manually annotated complexes derived from literature scanning and the known Gavin benchmark data [10] as our gold standard dataset. Finally, a set M of 439 yeast complexes is used as the set of known complexes. Its largest complex contains 88 proteins, and the average complex size is 9.11.

2.2 Methods

The outline of our method is shown in Figure 1. The two genomic datasets are each represented by a kernel matrix. A protein-protein relationship network (PPRN) is then produced by integrating these two kernels. A tool for detecting network modules is applied to the PPRN. The resulting modules are our computationally detected protein complexes, which constitute the predicted complex set P. Validation of these complexes and comparison with related methods support our idea that functional annotation information is helpful for the detection of protein complexes.

Fig. 1. The schematic diagram of our method for detection of protein complexes. [Flow: PPI network → K_PPI and functional annotation data → K_Fa; both → PPRN network → MCODE → predicted protein complexes → match with well-known protein complexes → validation of protein complexes.]

Kernel Representation and Data Integration. Kernel representation is an efficient way to represent each type of genomic information uniformly [5,6]. The PPI network can be represented using the diffusion kernel [7]. Let A denote the adjacency matrix of the PPI network and D the diagonal matrix of node degrees, so that the Laplacian matrix of the network is L = D − A. The diffusion kernel is then defined as

$$K = \operatorname{expm}(-\beta L), \tag{1}$$

where expm is the matrix exponential and β is a parameter that controls the degree of diffusion. The diffusion kernel is then normalized so that all of its diagonal elements are one:

$$K_{PPI}(i,j) = \frac{K_{ij}}{\sqrt{K_{ii} K_{jj}}}. \tag{2}$$
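The two steps in Eqs. (1) and (2) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the matrix exponential is approximated by scaling and squaring with a truncated Taylor series (in practice a library routine would normally be used), the helper names are hypothetical, and the three-node path graph is toy data. Only β = 3 is taken from the paper.

```python
import math

def laplacian(adj):
    """Graph Laplacian L = D - A for a symmetric 0/1 adjacency matrix."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain-Python product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(m, terms=20):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    n = len(m)
    norm = max(sum(abs(v) for v in row) for row in m)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[v / 2 ** squarings for v in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in result]
    for k in range(1, terms + 1):  # accumulate scaled^k / k!
        term = [[v / k for v in row] for row in matmul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):  # undo the scaling by repeated squaring
        result = matmul(result, result)
    return result

def diffusion_kernel(adj, beta):
    """Eq. (1): K = expm(-beta * L)."""
    return expm([[-beta * v for v in row] for row in laplacian(adj)])

def normalize_kernel(k):
    """Eq. (2): K_ij / sqrt(K_ii * K_jj), so every diagonal entry becomes 1."""
    n = len(k)
    return [[k[i][j] / math.sqrt(k[i][i] * k[j][j]) for j in range(n)]
            for i in range(n)]

# Toy network (hypothetical): a path 0 - 1 - 2, with beta = 3 as in the paper.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
K = diffusion_kernel(adj, beta=3.0)
K_ppi = normalize_kernel(K)
```

Because every row of L sums to zero, every row of the unnormalized kernel K sums to one, which gives a quick sanity check on the exponential.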

Functional annotation data are represented by means of a linear kernel:

$$K_{fa}(i,j) = x_i \cdot x_j, \qquad K_{Fa} = \frac{K_{fa}}{\max(K_{fa})}, \tag{3}$$

where $\cdot$ denotes the inner product and $\max(\cdot)$ denotes the maximal entry of the matrix. These two kernels measure the similarity of proteins with respect to each type of genomic data. A new kernel defined as the sum of the two kernels, scaled by one half,

$$K_{Int} = \frac{K_{PPI} + K_{Fa}}{2}, \tag{4}$$

is a simple approach to data integration. Although more complex kernel operations could be employed to create new integration methods, this simple kernel has been used widely [5].

Protein-Protein Relationship Network (PPRN): Kernels describe an implicit similarity between proteins, so any protein kernel matrix K defines a weighted network G(V, E, W) or an unweighted network G(V, E) of protein-protein relationships. The node set V consists of all proteins, the matrix W holds the corresponding kernel values, which serve as edge weights, and the edge set of the network is defined as

$$E = \{(i,j) \mid K_{ij} \ge c\}, \tag{5}$$

where c is a parameter that controls the density of the network. We refer to the network of the kernel $K_{Int}$ as our protein-protein relationship network (PPRN). We believe that a group of proteins whose mutual kernel values are all large enough likely corresponds to a protein complex.

MCODE Algorithm: Bader and Hogue [2] developed a graph-theoretic clustering algorithm, MCODE (http://cbio.mskcc. org/ bader/software/mcode/index.html), which uses connectivity values in PPI networks to detect protein complexes. The algorithm weights each vertex according to its local neighborhood density and then traverses outward from a dense seed protein with a high weight, recursively including neighboring vertices whose weights satisfy a given threshold. Here we apply it to both our PPI network and our PPRN networks to evaluate our idea that functional annotation can improve the prediction of complexes.
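To make the integration in Eqs. (3)-(5) concrete, here is a minimal sketch of the normalized linear kernel, the averaged kernel, and the thresholded edge set. The function names, the toy annotation vectors, and the toy K_PPI matrix are hypothetical; only the threshold c = 0.25 comes from the paper. Restricting edges to pairs with i < j (no self-loops, each undirected edge listed once) is an assumption, since Eq. (5) does not say how the diagonal is treated.

```python
def linear_kernel_matrix(annotations):
    """Eq. (3): inner products of annotation vectors, divided by the largest entry."""
    n = len(annotations)
    k = [[sum(a * b for a, b in zip(annotations[i], annotations[j]))
          for j in range(n)] for i in range(n)]
    k_max = max(max(row) for row in k)
    if k_max == 0:  # no protein has any annotation; nothing to normalize
        return k
    return [[v / k_max for v in row] for row in k]

def integrate(k_ppi, k_fa):
    """Eq. (4): entry-wise average of the two kernel matrices."""
    return [[(a + b) / 2.0 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(k_ppi, k_fa)]

def pprn_edges(k_int, c):
    """Eq. (5): weighted edges (i, j) with K_ij >= c, assuming i < j."""
    n = len(k_int)
    return {(i, j): k_int[i][j]
            for i in range(n) for j in range(i + 1, n) if k_int[i][j] >= c}

# Toy inputs (hypothetical): 3 proteins, 3 functional categories.
annotations = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
K_ppi = [[1.0, 0.3, 0.1], [0.3, 1.0, 0.2], [0.1, 0.2, 1.0]]

K_fa = linear_kernel_matrix(annotations)
K_int = integrate(K_ppi, K_fa)
edges = pprn_edges(K_int, c=0.25)
print(edges)
```

In this toy case proteins 0 and 1 share both annotations, so their integrated similarity clears the threshold even though their PPI similarity alone is modest; protein 2 shares no annotation with the others and stays unconnected.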

3 Experiments and Results

3.1 Validation of Protein Complexes

We assess the precision of the results of applying the MCODE algorithm to our PPRN networks using the evaluation metrics of [2,4], which use the overlap score

$$OS(p,m) = \frac{k^2}{n_1 \times n_2} \tag{6}$$

to determine whether a predicted complex p ∈ P matches a known complex m ∈ M, where k is the size of the overlap of p and m, and n_1 and n_2 are the sizes of p and m respectively. A predicted complex p and a known complex m are considered to match if OS(p, m) ≥ 0.2, where 0.2 is an empirically determined threshold first used in [2] and also used in [4]. Following the notation of [4], we define the set of true positives as TP = {p | ∃m, OS(p, m) ≥ 0.2, p ∈ P, m ∈ M} and the set of false positives as FP = P − TP. Likewise, the set of false negatives is FN = {m | ∀p, OS(p, m) < 0.2, p ∈ P, m ∈ M}, and the matched gold-standard complex set, MS = M − FN, contains the known complexes matched by predicted complexes. Recall (sensitivity) and precision (specificity) are then defined as |TP|/(|TP| + |FN|) and |TP|/(|TP| + |FP|) respectively. The F-measure adopted in [4],

$$F = \frac{2 \times Precision \times Recall}{Precision + Recall}, \tag{7}$$

is used to evaluate the performance of our approach. As the authors of [4] point out, because the set of known complexes is incomplete, the F-measure of each method should be taken only as a comparative measure rather than as an absolute value. To test our approach further, we consider another index that measures the coverage of predicted protein complexes:

$$Cov(p,m) = \frac{k}{n_2}, \qquad C_q = \{m \mid \exists p,\ Cov(p,m) \ge q,\ p \in P,\ m \in M\}, \tag{8}$$

where q is a real number between 0 and 1 and the set $C_q$ contains the known complexes for which at least a fraction q of the members appear in a single predicted complex.

3.2 Experimental Results
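The results in this section are scored with the metrics of Section 3.1. As a reading aid, the definitions in Eqs. (6)-(8) can be sketched as follows. The function names and the toy complexes are hypothetical; the threshold 0.2 and the definitions of TP, FP, and FN follow the text. Returning 0 when a denominator is empty is an added assumption.

```python
def overlap_score(p, m):
    """Eq. (6): OS(p, m) = k^2 / (n1 * n2), with k = |p ∩ m|."""
    k = len(p & m)
    return k * k / (len(p) * len(m))

def evaluate(predicted, known, threshold=0.2):
    """Return (precision, recall, F-measure) as defined in Section 3.1."""
    tp = [p for p in predicted
          if any(overlap_score(p, m) >= threshold for m in known)]
    fn = [m for m in known
          if all(overlap_score(p, m) < threshold for p in predicted)]
    fp = len(predicted) - len(tp)  # |FP| = |P| - |TP|
    precision = len(tp) / (len(tp) + fp) if predicted else 0.0
    recall = len(tp) / (len(tp) + len(fn)) if (tp or fn) else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0  # Eq. (7)
    return precision, recall, f_measure

def covered(predicted, known, q):
    """Eq. (8): known complexes m with Cov(p, m) = k / n2 >= q for some p."""
    return [m for m in known
            if any(len(p & m) / len(m) >= q for p in predicted)]

# Toy complexes (hypothetical protein names).
predicted = [{"A", "B", "C"}, {"D", "E", "F", "G"}, {"X", "Y", "Z"}]
known = [{"A", "B", "C", "H"}, {"D", "E"}, {"P", "Q", "R"}]

precision, recall, f_measure = evaluate(predicted, known)
print(precision, recall, f_measure)
print(len(covered(predicted, known, 0.5)), len(covered(predicted, known, 0.8)))
```

Note that, under these definitions, recall counts predicted complexes in its numerator and unmatched known complexes in its denominator, so it is not the fraction of known complexes recovered; that quantity corresponds to |MS|/|M|.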

Because known protein interaction data are noisy and incomplete, our approach aims to detect more complexes, with high recall and precision, by integrating functional annotation data with the current protein interaction data. The kernel methods are effective because they exploit both the functional consistency of proteins and the implicit relationships between interacting proteins. Functional annotation information can compensate for missing interactions and correct some false interactions, so integrating the two types of genomic data makes network clustering more robust than clustering the noisy protein interaction data alone. Figure 2 shows an example of a MIPS complex of size 18 and the two matching complexes, both of size 13, found by the MCODE algorithm in the PPI network and the PPRN network respectively. Table 1 shows the functional annotation of the proteins in Figure 2 (only annotations shared by at least three proteins are shown). The predicted complex in the PPRN network (with c = 0.24, see below) is contained entirely within the known complex, whereas only ten proteins of the complex predicted in the PPI network belong to it. All the proteins in the given MIPS complex (the first 18 proteins in Table 1) show high multiple functional consistency, whereas the three proteins (the last three proteins, labeled in black font) that are included in the complex predicted in the PPI network but not in the given MIPS complex do not show such consistent multiple functional annotation. This supports our idea that known functional annotation information, and in particular functional annotation consistency, is helpful for detecting complexes.

Fig. 2. An example: the MIPS complex (MIPS-420.50), the F0/F1 ATP synthase complex, and the matching complexes predicted in the PPI network and the PPRN network respectively

Throughout this study, the diffusion kernel of the protein interaction network is computed with β = 3. We chose c = 0.25 and c = 0.24 empirically, producing two PPRN networks with 14423 and 16413 edges respectively. We apply the MCODE algorithm to these two networks to predict protein complexes. For comparison, we also apply it to the original protein interaction network to show the effectiveness of integrating the two types of genomic information using kernel methods. The MCODE algorithm has two important parameters, w and f, which control the number and size of the resulting clusters. For different networks, the optimal results are produced by different parameter pairs [2]. We choose parameter pairs that optimize biological relevance.

Table 1. Functional annotation of proteins in Figure 2
Proteins (rows): Q0080, Q0085, Q0130, YOL077W-A, YLR295C, YBL099W, YBR039W, YDL004W, YDL181W, YDR298C, YDR322C-A, YDR377W, YJR121W, YKL016C, YML081C-A, YPL078C, YPL271W, YPR020W, YJL180C, YNL315C, YBR271W
FunCat categories (columns): 02.11, 02.13, 02.45.15, 14.10, 16.07, 20.01.01.01, 20.01.15, 20.03.22, 20.09.04, 34.01.01.03

Some of the predicted complexes are of size 3, but a complex of size 3 is less statistically significant because it is easily produced in a random graph. We therefore discuss two cases: one in which the predicted complex set includes complexes of size 3 and one in which it does not. Table 2 and Figure 3 show the optimal results, i.e., those with the largest F-measure, for three values of w in the three networks. The results show that integrating functional annotation information with PPI data greatly improves the prediction results. More complexes are detected using the same network clustering method, and the F-measures of the two PPRN networks are clearly higher than that of the PPI network alone. For example, with w = 0.1 the F-measures of the two PPRN networks are 15.85%/22.62% higher than that of the PPI network when complexes of size 3 are included, and 32.66%/39.74% higher when they are excluded. We also test the coverage of predicted complexes, i.e., the degree to which entire known complexes appear within a single predicted complex [16]. Figure 4 shows the large improvement of our results for varying values of q in the two cases. For example, in our two PPRN networks there are 139/140 gold-standard complexes (with w = 0.1 and f = 0) for which 50% or more of the members appear in the same predicted complex, compared with only 70 in the predictions from the PPI network. King et al. [3] applied the RNSC algorithm to predict complexes from protein interaction networks.
However, they predicted only 45 complexes, which match 30 MIPS complexes. A more recent system, the LCMA algorithm based on local clique merging, has been reported to be more efficient than the MCODE algorithm. We do not make a direct comparison because their system is not available to us; we emphasize that our contribution is to complement the incomplete PPI data with functional annotation information by means of kernel methods, which exploit the functional homogeneity of protein complexes, or even the multiple functional consistency of proteins. Like other methods, our approach also predicts complexes that do not match any complex in the current set of known complexes. Since the known complex set is largely incomplete, these unmatched complexes may well be real complexes, so the actual precision of our approach is likely higher than the current results indicate.

Table 2. List of the protein-protein networks and the related results, where P3 and P4 denote the predicted complex sets whose complexes have minimum size 3 and 4 respectively

Protein-protein network   para(c, w, f)        |P3|  |TP|  |MS|   para(c, w, f)        |P4|  |TP|  |MS|
PPI (MCODE)               (no, 0.00, 0.00)     157   74    104    (no, 0.00, 0.20)     147   82    60
PPI (MCODE)               (no, 0.05, 0.00)     237   94    143    (no, 0.05, 0.10)     235   121   74
PPI (MCODE)               (no, 0.10, 0.00)     305   169   105    (no, 0.10, 0.25)     268   130   92
PPRN (Int+MCODE)          (0.25, 0.00, 0.00)   371   216   122    (0.25, 0.00, 0.10)   357   192   112
PPRN (Int+MCODE)          (0.25, 0.05, 0.00)   510   260   139    (0.25, 0.05, 0.00)   360   204   95
PPRN (Int+MCODE)          (0.25, 0.10, 0.00)   591   284   142    (0.25, 0.10, 0.00)   405   225   97
PPRN (Int+MCODE)          (0.24, 0.00, 0.30)   377   239   119    (0.24, 0.00, 0.30)   330   216   104
PPRN (Int+MCODE)          (0.24, 0.05, 0.00)   526   286   141    (0.24, 0.05, 0.00)   368   223   93
PPRN (Int+MCODE)          (0.24, 0.10, 0.00)   633   318   150    (0.24, 0.10, 0.00)   435   248   105

Fig. 3. Comparison of F-measures of applying the MCODE algorithm on various networks with various selected parameters. [Panels A and B: bars for the three PPI, three PPRN(0.25), and three PPRN(0.24) settings; F-measure axis from 0 to 0.5.]

Recent studies have shown that biological networks (e.g., metabolic networks and physical interaction networks) exhibit the characteristics of scale-free networks, just like many natural networks [14]. Here, we examined the scale-free characteristics of the protein-protein relationship networks and the size distribution of the predicted complexes for the PPI network and the two PPRN networks. In the top row of Figure 5, plots A, B, and C show that the probability P(k) that a node has degree k in each of these three networks follows a power law, $P(k) \propto k^{-\gamma}$; in the bottom row, plots B and C show that the size distribution of clusters (modules) in the two PPRN networks also clearly follows a power law, while that of the PPI network has a steep slope (γ = 3.56; see plot A).
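As a cross-check on the F-measure comparison reported earlier in this section, the relative gains at w = 0.1 can be recomputed from the counts in Table 2 using the definitions of Section 3.1, with |FP| = |P| − |TP|, |FN| = |M| − |MS|, and |M| = 439. The function and variable names below are hypothetical; the counts are copied from the w = 0.10 rows of Table 2.

```python
N_KNOWN = 439  # size of the gold-standard set M (Section 2.1)

def f_measure(n_pred, n_tp, n_ms, n_known=N_KNOWN):
    """F-measure from |P|, |TP|, |MS| using the definitions of Section 3.1."""
    fp = n_pred - n_tp   # |FP| = |P| - |TP|
    fn = n_known - n_ms  # |FN| = |M| - |MS|
    precision = n_tp / (n_tp + fp)
    recall = n_tp / (n_tp + fn)
    return 2 * precision * recall / (precision + recall)

# (|P|, |TP|, |MS|) for w = 0.10, copied from Table 2.
min_size_3 = {"PPI": (305, 169, 105),
              "PPRN(0.25)": (591, 284, 142),
              "PPRN(0.24)": (633, 318, 150)}
min_size_4 = {"PPI": (268, 130, 92),
              "PPRN(0.25)": (405, 225, 97),
              "PPRN(0.24)": (435, 248, 105)}

def gains(rows):
    """Percentage by which each PPRN F-measure exceeds the PPI F-measure."""
    base = f_measure(*rows["PPI"])
    return {name: 100 * (f_measure(*counts) / base - 1)
            for name, counts in rows.items() if name != "PPI"}

gains_3 = gains(min_size_3)
gains_4 = gains(min_size_4)
for name in gains_3:
    print(name, round(gains_3[name], 2), round(gains_4[name], 2))
```

Under these assumptions the computed gains agree, to within rounding, with the 15.85%/22.62% and 32.66%/39.74% figures quoted in the text, which suggests that the percentages are relative (not absolute) differences in F-measure.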

Fig. 4. Complex coverage: |C_q| is the number of known complexes whose member proteins appear in the same predicted complex, for various values of q. The left plot shows the results for all predicted complexes, including those of size 3; the right plot shows the results after removing complexes of size 3. [Axes: |C_q| versus q (%), with curves for the PPI, PPRN(0.25), and PPRN(0.24) networks.]

Fig. 5. Top row: degree distributions of the PPI network (A) and the two PPRN networks (B, C). Bottom row: size distributions of the clusters found in these networks by the MCODE algorithm with parameters w = 0.1 and f = 0. [Axes: Frequency (log) versus Degree (log) in the top row; Number of clusters (log) versus Cluster size (log) in the bottom row.]
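A slope such as those annotated in Figure 5 can be estimated by fitting a straight line to the log-log histogram. The sketch below uses ordinary least squares, which is an assumption: the paper does not state how its slopes were fitted. The function name is hypothetical and the degree sequence is synthetic, constructed so that the count of nodes with degree k is exactly 1000/k².

```python
import math
from collections import Counter

def loglog_slope(values):
    """Least-squares slope of log10(frequency) against log10(value)."""
    counts = Counter(v for v in values if v > 0)
    if len(counts) < 2:
        raise ValueError("need at least two distinct positive values")
    points = [(math.log10(k), math.log10(c)) for k, c in counts.items()]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx

# Synthetic degrees (hypothetical): 1000 nodes of degree 1, 250 of degree 2,
# 40 of degree 5, and 10 of degree 10, i.e. count(k) = 1000 / k^2.
degrees = [1] * 1000 + [2] * 250 + [5] * 40 + [10] * 10
slope = loglog_slope(degrees)
print(round(slope, 6))
```

The fitted slope equals −γ, so this synthetic sequence corresponds to γ = 2. The same function could be applied to a list of cluster sizes to estimate the slope of a cluster-size distribution.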

4 Conclusion and Discussion

In this paper, we develop a method for predicting protein complexes based on the integration of two important types of genomic data (physical interaction data and protein functional annotation data) by means of kernel methods. Groups of genes/proteins that may correspond to functional modules have been detected extensively in physical interaction data [15]. However, it is often hard to conclude that these clusters/modules really have such properties, one reason being that the data are very noisy and incomplete. Protein complexes have been predicted from protein interaction data by, for example, the MCODE algorithm [2], the RNSC algorithm [3], and the recent LCMA algorithm [4]. Molecular pathways/functional modules have also been detected by integrating physical interaction data with another important type of genomic data, gene expression data [16]. Here, we introduce functional annotation data to offset the limitations of the physical interaction data, and this approach exploits the functional consistency of protein complexes. Kernel representation has proven very useful for various types of data, e.g., strings, trees, and networks. It has been used widely in bioinformatics, e.g., for the inference of biological networks [5]. In this study, we exploit the characteristics of kernel methods to combine these two types of data. The experimental results on yeast data show the effectiveness of our proposed method. Compared with the results obtained using only protein interaction data, our predicted complexes match or contain more known, experimentally determined protein complexes. The additional novel predicted complexes may help biologists detect new protein complexes experimentally. We conclude that combining these two data sources produces better results than using protein interaction data alone.

Acknowledgements

This work is partly supported by the Important Research Direction Project of CAS “Some Important Problem in Bioinformatics” and the National Natural Science Foundation of China under Grant No. 10471141.

References

1. Sear, R.P.: Specific Protein-Protein Binding in Many-component Mixtures of Proteins. Phys. Biol., 1(2004), 53-60
2. Bader, G.D., Hogue, C.W.: An Automated Method for Finding Molecular Complexes in Large Protein Interaction Networks. BMC Bioinformatics, 4(2003), 2
3. King, A.D., Pržulj, N., Jurisica, I.: Protein Complex Prediction via Cost-based Clustering. Bioinformatics, 20(2004), 3013-3020
4. Li, X.L., Tan, S.H., Foo, C.S., Ng, S.K.: Interaction Graph Mining for Protein Complexes Using Local Clique Merging. Genome Informatics, 16(2005), 260-269
5. Yamanishi, Y., Vert, J.P., Kanehisa, M.: Protein Network Inference from Multiple Genomic Data: a Supervised Approach. Bioinformatics, 20(2004), i363-i370
6. Lanckriet, G.R., De Bie, T., Cristianini, N., Jordan, M.I., Noble, W.S.: A Statistical Framework for Genomic Data Fusion. Bioinformatics, 20(2004), 2626-2635
7. Kondor, R.I., Lafferty, J.: Diffusion Kernels on Graphs and Other Discrete Input Spaces. In Proceedings of the 19th International Conference on Machine Learning, Morgan Kaufmann, University of New South Wales, Sydney, Australia, (2002), 315-322
8. Ito, T., Tashiro, K., Muta, S., Ozawa, R., Chiba, T., Nishizawa, M., Yamamoto, K., Kuhara, S., Sakaki, Y.: Toward a Protein-Protein Interaction Map of the Budding Yeast: a Comprehensive System to Examine Two-hybrid Interactions in All Possible Combinations between the Yeast Proteins. Proc. Natl Acad. Sci., USA, 97(2000), 1143-1147
9. Uetz, P., Giot, L., Cagney, G., Mansfield, T.A., Judson, R.S., Knight, J.R., Lockshon, D., Narayan, V., Srinivasan, M., Pochart, P., et al.: A Comprehensive Analysis of Protein-Protein Interactions in Saccharomyces Cerevisiae. Nature, 403(2000), 623-627
10. Gavin, A.C., Bosche, M., Krause, R., Grandi, P., Marzioch, M., Bauer, A., Schultz, J., Rick, J.M., Michon, A.M., Cruciat, C.M., et al.: Functional Organization of the Yeast Proteome by Systematic Analysis of Protein Complexes. Nature, 415(2002), 141-147
11. Ho, Y., Gruhler, A., Heilbut, A., Bader, G.D., Moore, L., Adams, S.L., Millar, A., Taylor, P., Bennett, K., Boutilier, K., et al.: Systematic Identification of Protein Complexes in Saccharomyces Cerevisiae by Mass Spectrometry. Nature, 415(2002), 180-183
12. Ruepp, A., Zollner, A., Maier, D., Albermann, K., Hani, J., Mokrejs, M., Tetko, I., Guldener, U., Mannhaupt, G., Munsterkotter, M., et al.: The FunCat, a Functional Annotation Scheme for Systematic Classification of Proteins from Whole Genomes. Nucleic Acids Res., 32(2004), 5539-5545
13. Mewes, H.W., Frishman, D., Guldener, U., Mannhaupt, G., Mayer, K., Mokrejs, M., Morgenstern, B., Munsterkotter, M., Rudd, S., Weil, B.: MIPS: a Database for Genomes and Protein Sequences. Nucleic Acids Res., 30(2002), 31-34
14. Barabási, A.-L., Oltvai, Z.N.: Network Biology: Understanding the Cell's Functional Organization. Nature Rev. Genet., 5(2004), 101-114
15. Spirin, V., Mirny, L.A.: Protein Complexes and Functional Modules in Molecular Networks. Proc. Natl Acad. Sci., USA, 100(2003), 12123-12126
16. Segal, E., Wang, H., Koller, D.: Discovering Molecular Pathways from Protein Interaction and Gene Expression Data. Bioinformatics, 19(2003), i264-i272
