Bootstrapping Parameter Estimation in Dynamic Systems

Huma Lodhi and David Gilbert
School of Information Systems, Computing and Mathematics, Brunel University, UB8 3PH, UK
[email protected], [email protected]

Abstract. We propose a novel approach for parameter estimation in dynamic systems. The method is based on the use of bootstrapping for time series data. It estimates parameters within the least square framework. The data points that do not appear in the individual bootstrapped datasets are used to assess the goodness of fit and for adaptive selection of the optimal parameters. We evaluate the efficacy of the proposed method by applying it to estimate parameters of dynamic biochemical systems. Experimental results show that the approach performs accurate estimation in both noise-free and noisy environments, thus validating its effectiveness. It generally outperforms related approaches in the scenarios where data is characterized by noise.

1 Introduction

Developing efficient and effective computational methodologies for parameter estimation in dynamic systems is a challenging task. In recent years designing and applying techniques to compute parameters has gained the attention of researchers, as accurate models can facilitate the understanding of unsolved problems in areas ranging from biology [24] to engineering. Models of dynamic systems are represented by using mathematical formulations [22, 25], generally in the form of ordinary differential equations (ODEs). An ODE describes the change in a variable Xk(t) over time. In order to construct accurate models, it is essential to determine the values of the parameters θ that represent the dynamics of a system.

Parameter estimation is viewed as the most difficult task in modeling systems [29], and the process is characterized by many challenges. For example, in general the size of the datasets that are utilized to estimate parameters is small or moderate, and furthermore the datasets are noisy. In this paper we present an effective and robust computational method for parameter estimation in dynamic systems that addresses these issues.

A number of relational, Bayesian and statistical techniques have been studied for this task. In relational methods [26], probabilistic logic programs are used for system representation, and parameters are estimated by using Expectation Maximization (EM)-type [13] approaches.

Bayesian approaches [23, 21, 8, 20] have also been studied. In the Bayesian framework parameters are estimated by computing the posterior probability distribution over model parameters, which is obtained from the prior and the likelihood of the observed data; the estimation process is generally based on sampling methods. Collocation methods are another class of statistical techniques that have been proposed for computing the parameters of ODE-based dynamic systems [6, 5, 27].

The standard and most widely used method for parameter estimation has been developed within the least square framework. It iteratively solves the ODEs and minimizes a loss function given by the sum of squared residuals. The estimation process starts with an initial set of parameters that are optimized by minimizing the loss function. Despite its widespread use in fields ranging from engineering to biology, the method suffers from several limitations. For example, if the initial guess for the parameters is far from the target (true) parameters, the method may fail to perform an accurate estimation; furthermore, data characterized by noise may also pose challenges. In order to address these limitations, we present a novel strategy that improves the method by using bootstrapping [17, 15, 16]. The underlying aim is to train a highly accurate estimator, which is accomplished by applying a series of estimators to bootstrapped datasets. We propose the use of out-of-bootstrap data [4, 28] to assess the goodness of fit. The final parameter values are obtained by combining the outputs of the individual estimators. The method is adaptive to the estimation process and robust to noise.

In order to evaluate the efficacy of our proposed approach, we apply it to two signaling pathways and a gene regulatory network, where each pathway or network is represented by a set of ODEs. The experimental results validate the premise that the proposed approach is effective for computing parameters in dynamic systems, and it generally performs better than the related approaches in the study.

The paper is organized as follows. Section 2 describes least square parameter estimation and bootstrapping. In Section 3 we present our novel approach for estimating parameters in dynamic systems. In Section 4 we give a brief overview of biochemical systems and present experimental results that show the efficacy of the approach. Section 5 concludes the paper.

2 Least Square Parameter Estimation and Bootstrap Methods

The standard method for parameter estimation in ODE-based dynamic systems, and more specifically biochemical systems, can be viewed as an iterative fitting process. It is provided with an initial guess for the parameters. The method generates estimated data points by integrating the model at different time points, hence producing an (estimated) time series for each variable k. It then updates the parameters by fitting the estimated data points to the observed data. In other words it minimizes a non-linear least square problem given by:

l(θ) = Σ_{i=1}^{n} (x_i − y_i(θ, t_i))^2

Here {y_i(θ, t_i) : i = 1, . . . , n} are n data points that are estimated by using parameters θ for time points {t_i : i = 1, . . . , n}, and {x_i : i = 1, . . . , n} are the observed data points for a variable. In this way the method minimizes a loss function given by the residual sum of squares. In the case where a system is described by K variables, the function to be minimized is:

l(θ) = Σ_{k=1}^{K} Σ_{i=1}^{n} (x_{ki} − y_{ki}(θ, t_i))^2
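For concreteness, the sketch below shows one way to minimize this objective in Python with SciPy, by integrating the ODE model at the observation times and handing the residuals to a trust-region least squares solver. The function and variable names are illustrative assumptions; the experiments reported in Section 4 used Matlab's lsqnonlin instead.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def lse(ode_rhs, x_obs, t_obs, x0, theta0):
    """Least squares fit of ODE parameters theta to an observed time series.

    ode_rhs(t, x, theta) returns dx/dt; x_obs has shape (n_times, n_vars).
    """
    def residuals(theta):
        sol = solve_ivp(ode_rhs, (t_obs[0], t_obs[-1]), x0,
                        t_eval=t_obs, args=(theta,), rtol=1e-8)
        return (sol.y.T - x_obs).ravel()        # residuals over all variables

    # 'trf' is a trust-region solver, similar in spirit to lsqnonlin's default;
    # rate constants are assumed non-negative here.
    fit = least_squares(residuals, theta0, method='trf', bounds=(0.0, np.inf))
    return fit.x                                 # estimated parameter vector
```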

The process described above is repeated until some convergence criterion is met. The method returns the estimated parameters

θ̌ = argmin_θ l(θ)

The minimization problem can be solved by utilizing optimization techniques like the Trust Region Newton method [12, 11]. We term this approach the Least Square Estimator (LSE). In the next section we present a novel method that is effective and robust to noise by exploiting the characteristics of bootstrapping [17, 15, 16], a widely applicable and dependable technique.

The bootstrap technique generates a sample (a bootstrapped dataset) from a dataset drawn from an unknown probability distribution D. Bootstrapping can be categorized into two classes, the parametric bootstrap and the non-parametric bootstrap. The non-parametric bootstrap works by generating samples from the dataset: the sample X_i^B is constructed by randomly drawing, with replacement, n data points from the dataset of size n. It replaces the distribution D (according to which the data is generated) by an empirical distribution D̆, a discrete distribution that assigns probability 1/n to each data point. A bootstrapped dataset may not contain all of the data points from the original dataset, and some points may occur many times. The probability that a given data point is not included in the bootstrapped dataset is (1 − 1/n)^n; for large sample sizes this probability is approximately 1/e ≈ 0.368, so on average a bootstrapped dataset contains 63.2% of the distinct data points. Since the bootstrap sampling is carried out without using any parametric model, it is called the non-parametric bootstrap. In the parametric bootstrap the samples are generated in a parametric fashion: the data points comprising the bootstrapped datasets are sampled from a learned parametric model instead of being resampled from the original dataset.

Fig. 1. Construction of blocks for the block bootstrap. The length l of a block is set to 3.

The data points that do not appear in a bootstrap sample form the out-of-bootstrap dataset. In this paper we focus on the non-parametric bootstrap and its application to time series data. The block bootstrap is a well-known method for generating samples from time series data. In this method a time series is divided into blocks, which can be overlapping or non-overlapping. Previous research [1] has shown that overlapping and non-overlapping blocks produce approximately similar numerical results; we utilize non-overlapping blocks in this paper. Non-overlapping blocks are obtained as follows. A block consists of l adjacent data points from a time series X_i: the first block contains the first l data points, the next block starts at data point l + 1, and so on, so that consecutive blocks share no data points. Figure 1 illustrates the block formation process. Once the blocks of length l are constructed, the bootstrapped time series is obtained by drawing blocks randomly with replacement and concatenating them in the order in which they are drawn. The size of the block-bootstrapped time series is approximately the same as that of the original time series, n ≈ pl, where p is the number of blocks. A block-bootstrapped time series dataset X^B is obtained by generating a bootstrapped time series for each of the K variables. The out-of-bootstrap dataset X^O contains the data points that do not appear in X^B.
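A minimal NumPy sketch of the non-overlapping block bootstrap is given below. It assumes, as in Algorithm 2, that the same block indices are used for all K variables, so that a single pair of index sets I^B and I^O is obtained; the function and variable names are illustrative.

```python
import numpy as np

def block_bootstrap(X, l, rng=None):
    """Non-overlapping block bootstrap for a multivariate time series.

    X : array of shape (K, n) -- K variables observed at n time points.
    l : block length.
    Returns the bootstrapped series X_B, the drawn time indices I_B and
    the out-of-bootstrap indices I_O (time points never drawn).
    """
    rng = np.random.default_rng() if rng is None else rng
    K, n = X.shape
    p = n // l                                   # number of whole blocks
    blocks = np.arange(p * l).reshape(p, l)      # block -> time indices
    drawn = rng.integers(0, p, size=p)           # draw p blocks with replacement
    I_B = blocks[drawn].ravel()                  # concatenate in the drawn order
    X_B = X[:, I_B]
    # Points never sampled form the out-of-bootstrap set; if n is not a
    # multiple of l, the trailing remainder points always land here.
    I_O = np.setdiff1d(np.arange(n), I_B)
    return X_B, I_B, I_O
```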

3 Bootstrapped Parameter Estimator for Dynamic Systems

We now present a novel approach for parameter estimation in dynamic systems that exploits the characteristics of the bootstrap. In the next section we show that the proposed approach generally computes accurate parameters in noise-free and noisy environments. The algorithm is adaptive and computes the parameters according to the progress of the estimation process. We term the method the Bootstrapped Parameter Estimator (BPE). It is based on the idea of resampling and combining: the estimation process of BPE can be viewed as comprising two stages, where base estimations are obtained in the first stage and the computed values are combined in the second. The pseudocode for BPE is given in Algorithm 1.

The method is provided with a dataset X that comprises time series data {X_i : i = 1, . . . , K} for K variables. Here each time series is of the form X_i = {x_{i1}, x_{i2}, . . . , x_{in}}, where the x_{ij} are data points observed at times {t_1, t_2, . . . , t_n}. It also takes as input a set of K ODEs that describe a dynamic system. BPE generates the block bootstrap dataset X^B = {X_1^B, X_2^B, . . . , X_K^B}, where X_i^B = {x_{i1}^B, x_{i2}^B, . . . , x_{in}^B} is a bootstrapped time series corresponding to the ith variable. The out-of-bootstrap dataset is given by X^O = {X_1^O, X_2^O, . . . , X_K^O} and it does not contain the data points that occur in X^B. Once the bootstrapped dataset is generated, the method invokes Algorithm 2, which we call the Adaptive Least Square Estimator (ALSE); this is an adaptive iterative method.

Algorithm 1 Bootstrapped Parameter Estimator (BPE) for Dynamic Systems.
Input: A dataset of K time series X = {X_1, X_2, . . . , X_K} for K variables, where X is the observed dataset; a set E of K ODEs; a base parameter estimation technique, e.g. the Adaptive Least Square Estimator (ALSE).
for r = 1 to R do
  Generate a bootstrap dataset X^B = {X_1^B, X_2^B, . . . , X_K^B} by applying the (non-overlapping) block bootstrap.
  Obtain estimated parameters θ̌_r by invoking Algorithm 2.
end for
Output: The final parameter estimate θ̄ = (1/R) Σ_{r=1}^{R} θ̌_r

Algorithm 2 Adaptive Least Square Estimator (ALSE).
Input: A bootstrapped dataset X^B, an out-of-bootstrap dataset X^O, indices I^B for bootstrapped data points, indices I^O for out-of-bootstrap data points, a set E of ODEs, a tolerance ε̃, a vector θ of initial parameter values, and the number of iterations m.
Initialize the parameters with uniform random numbers on the unit interval, θ ← U(0, 1)
i ← 1
while i ≤ m or θ ≠ θ̌_i do
  Obtain estimated parameters θ̌_i by utilizing LSE; the estimator is provided with the parameters θ and the bootstrapped dataset X^B.
  Obtain the estimated time series dataset Y(θ̌_i, t) for the K variables.
  Generate the estimated bootstrap dataset Y^B(θ̌_i, t) and out-of-bootstrap dataset Y^O(θ̌_i, t) by using I^B and I^O respectively.
  Assess the goodness of fit:
  for k = 1 to K do
    ε[k] = norm2(X_k^O − Y_k^O(θ̌_i, t)) / norm2(Y_k^O(θ̌_i, t))
  end for
  Set the entries in θ to the estimated parameter values if all the corresponding elements of ε satisfy ε[k] ≤ ε̃; set the remaining entries in θ to random values.
  Quantify the fitness:
  ξ_i(θ̌_i) = normF(X^O − Y^O(θ̌_i, t)) / normF(Y^O(θ̌_i, t))
  i ← i + 1
end while
θ̂ = argmin_{θ̌_i} ξ_i(θ̌_i)
if i > m then θ̌ = θ̂ else θ̌ = θ̌_i end if
Output: θ̌

At each iteration i, ALSE applies the Least Square Estimator (LSE) to the bootstrapped sample and computes values of the parameters given by the vector θ̌_i. The goodness of fit corresponding to each variable k is assessed by employing the following measure, where norm2 denotes the 2-norm (the Euclidean norm) of a vector:

ε[k] = norm2(X_k^O − Y_k^O(θ̌_i, t)) / norm2(Y_k^O(θ̌_i, t))   for k = 1, . . . , K

Here X_k^O and Y_k^O are the observed and estimated out-of-bootstrap time series corresponding to the kth variable. If the error ε[k] is less than or equal to a given threshold ε̃ for all K variables, ALSE returns the estimated parameters. In the case where ε[k] ≤ ε̃ for only some of the K variables, the corresponding entries in the vector θ are set to the estimated values of the parameters and the remaining entries are set randomly according to a uniform distribution. As a parameter can be shared by more than one ODE, the estimated value is assigned to the parameter only if the measure is less than or equal to the threshold for all the ODEs in which it appears. In this scenario the fitness of the estimated model is also quantified by the relative error measure ξ_i, where normF denotes the Frobenius matrix norm (the Euclidean norm of a matrix):

ξ_i = normF(X^O − Y^O(θ̌_i, t)) / normF(Y^O(θ̌_i, t))

ALSE repeats the procedure m times. At the end of the process, the parameters with minimum error are selected:

θ̂ = argmin_{θ̌_i} ξ_i(θ̌_i)
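In NumPy terms the two measures can be written as follows (a sketch, with X_oob and Y_oob assumed to be K × m arrays holding the observed and estimated out-of-bootstrap points):

```python
import numpy as np

def fit_errors(X_oob, Y_oob):
    """Per-variable relative errors eps[k] and overall fitness xi,
    computed on the out-of-bootstrap data (rows = variables)."""
    diff = X_oob - Y_oob
    eps = np.linalg.norm(diff, axis=1) / np.linalg.norm(Y_oob, axis=1)   # 2-norm per variable
    xi = np.linalg.norm(diff, 'fro') / np.linalg.norm(Y_oob, 'fro')      # Frobenius norm
    return eps, xi
```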

In this way ALSE adapts to the progress of the estimation process and selects optimal parameters. Once BPE obtains the estimated parameters, it generates another bootstrapped dataset and invokes ALSE again. This process of drawing a bootstrapped dataset and acquiring estimated values of the parameters from ALSE is repeated R times. The final estimated values θ̄ are obtained by computing the average of all the outputs.
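The outer BPE loop therefore reduces to resampling, estimating and averaging. The sketch below assumes the block_bootstrap helper sketched in Section 2 and a hypothetical alse() function implementing Algorithm 2; it is not the authors' code.

```python
import numpy as np

def bpe(X, t_obs, ode_rhs, x0, R=500, l=3, rng=None):
    """Bootstrapped Parameter Estimator: R bootstrap rounds of ALSE, averaged."""
    rng = np.random.default_rng() if rng is None else rng
    estimates = []
    for _ in range(R):
        X_B, I_B, I_O = block_bootstrap(X, l, rng)      # non-overlapping block bootstrap
        theta_r = alse(X_B, X[:, I_O], I_B, I_O,        # hypothetical Algorithm 2 routine
                       t_obs, ode_rhs, x0, rng)
        estimates.append(theta_r)
    return np.mean(estimates, axis=0)                   # combine by averaging
```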

4 Applications: Biochemical Systems

There are many networks of interacting components known to exist as part of the machinery of living organisms. Biochemical networks can be metabolic, regulatory or signal transduction networks. For example, signal transduction is the mechanism which enables a cell to sense changes in its environment and to make appropriate responses. The basis of this mechanism is the conversion of one kind of signal into another.

Table 1. Estimated parameter values for noise-free dataset for signaling pathway. Standard deviations and relative errors for parameters are given. Target values of θ1, θ2, θ3, θ4 and θ5 are also presented.

Parameter  Target  Estimated ± SD (l=3)  Estimated ± SD (l=5)  Relative error (l=3)  Relative error (l=5)
θ1         0.050   0.050 ± 0.000         0.050 ± 0.000         0.000                 0.000
θ2         0.200   0.200 ± 0.005         0.200 ± 0.005         0.000                 0.000
θ3         0.100   0.100 ± 0.002         0.100 ± 0.002         0.000                 0.000
θ4         0.100   0.098 ± 0.016         0.099 ± 0.016         0.020                 0.010
θ5         0.100   0.101 ± 0.006         0.101 ± 0.007         0.010                 0.010

The basic building block of any biological dynamic system is the enzymatic reaction: the conversion of a substrate into a product catalyzed by an enzyme. Such enzymatic reactions can be used to describe metabolic conversions, the activation of signaling molecules and even transport reactions between various sub-cellular compartments. Enzymes greatly accelerate reactions in one direction (often by factors of at least 10^6), and most reactions in biological systems do not occur at perceptible rates in the absence of enzymes. We can illustrate a simple enzymatic reaction involving one substrate S, one product P and an enzyme E, in which E catalyzes the conversion S → P. In order to perform parameter estimation we can write the ordinary differential equation describing the consumption of the substrate S and the production of the product P as:

d[S]/dt = −d[P]/dt = −θ1 [S] / (θ2 + [S])

where [S] and [P] are the concentrations of S and P, and θ1 and θ2 are the (unknown) parameters. Once we have a set of ODEs that describe a biochemical system, we can exploit computational techniques to estimate the unknown parameters.
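As a small worked example, this single reaction can be simulated in Python as shown below, with illustrative values θ1 = 0.1, θ2 = 0.5 and initial concentrations S(0) = 1, P(0) = 0, none of which are taken from the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

def enzymatic_rhs(t, y, theta1, theta2):
    """S -> P conversion: d[S]/dt = -theta1*[S]/(theta2+[S]), d[P]/dt = +theta1*[S]/(theta2+[S])."""
    S, P = y
    rate = theta1 * S / (theta2 + S)
    return [-rate, rate]

theta1, theta2 = 0.1, 0.5                                  # assumed, illustrative values
t_obs = np.linspace(0.0, 50.0, 100)
sol = solve_ivp(enzymatic_rhs, (0.0, 50.0), [1.0, 0.0],    # S(0)=1, P(0)=0 assumed
                t_eval=t_obs, args=(theta1, theta2))
S_t, P_t = sol.y                                           # simulated time courses
```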

4.1 Experiments and Results

In order to assess the efficacy of the proposed method we performed a series of experiments for estimating the parameters of signaling pathways and a gene regulatory network. Signaling pathways play a pivotal role in many key cellular processes [18]; abnormalities in cell signaling can cause the uncontrollable division of cells, which may lead to cancer. The developmental process in animals is controlled by gene regulatory networks [14]. We applied our approach to a synthetic signaling pathway, the Ras/Raf-1/MEK/ERK signaling pathway, and the p53 gene regulatory network. Experiments were performed using Matlab 7, and the Matlab function "lsqnonlin" was utilized as the Least Square Estimator.

Dataset 1 We first consider a synthetic signaling pathway described in [21]. The interactions that are modeled are as follows: an incoming protein S activates a protein R into its phosphorylated form Rpp, and the enzyme S that is involved in the reaction is then degraded to D. The corresponding ODEs given in [21] are presented below:

dS/dt = −θ1 S
dD/dt = θ1 S
dR/dt = −θ2 R S / (θ4 + R) + θ3 Rpp / (θ5 + Rpp)
dRpp/dt = θ2 R S / (θ4 + R) − θ3 Rpp / (θ5 + Rpp)

The mathematical model of the pathway is described by five kinetic parameters, θ1, θ2, θ3, θ4, θ5. The target (nominal) values of the parameters that we used in conjunction with the model to generate the observed time series data are given in Table 1.
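The right-hand side of this model translates directly into Python and is compatible with the lse sketch of Section 2. The sketch below uses the target parameters from Table 1, but the initial concentrations and the time span are illustrative assumptions, not values taken from the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

def synthetic_pathway_rhs(t, y, theta):
    """ODE right-hand side for the synthetic pathway: variables S, D, R, Rpp."""
    S, D, R, Rpp = y
    t1, t2, t3, t4, t5 = theta
    act = t2 * R * S / (t4 + R)          # activation of R by S
    deact = t3 * Rpp / (t5 + Rpp)        # dephosphorylation of Rpp
    return [-t1 * S, t1 * S, -act + deact, act - deact]

theta_target = [0.05, 0.2, 0.1, 0.1, 0.1]     # target values from Table 1
t_obs = np.linspace(0.0, 100.0, 100)          # 100 time points (time span assumed)
y0 = [1.0, 0.0, 1.0, 0.0]                     # assumed initial concentrations
sol = solve_ivp(synthetic_pathway_rhs, (t_obs[0], t_obs[-1]), y0,
                t_eval=t_obs, args=(theta_target,))
```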


Fig. 2. Estimated values of parameters in noisy environment where data is corrupted by additive Gaussian noise with mean 0 and standard deviation σ. Figure 2(a) shows estimated parameters for σ = 0.02 and Figure 2(b) shows computed parameters for σ = 0.10. Target values of the parameters are also plotted.

Table 2. Comparison of BM with BPE for synthetic pathway. Target values and relative errors are also given. The estimated values for BM are taken from [21].

Parameter  Target  BM Estimated ± SD  BM Relative Error  BPE Estimated ± SD  BPE Relative Error
θ1         0.050   0.050 ± 0.001      0.000              0.050 ± 0.000       0.000
θ2         0.200   0.168 ± 0.019      0.160              0.208 ± 0.023       0.040
θ3         0.100   0.090 ± 0.005      0.100              0.100 ± 0.006       0.000
θ4         0.100   0.094 ± 0.005      0.060              0.129 ± 0.085       0.290
θ5         0.100   0.107 ± 0.018      0.070              0.100 ± 0.024       0.000

Data was generated at 100 time points. The free parameter R in BPE was set to 500, whereas the free parameter m in ALSE was chosen to be 100. The block length l was set to 3 and to 5. The initial values of the parameters θ1, θ2, θ3, θ4, θ5 were set using a uniform distribution between 0 and 1. Table 1 shows the estimated values and the relative errors of the computed parameters, where relative error = |(target − estimated)/target|. The results show that the proposed approach computes parameters in such a way that the estimated parameters are approximately the same as the target parameters. From the experiments we find that the performance of the method does not vary between block lengths 3 and 5.

We performed further experiments to evaluate the robustness of BPE to noise. Additive Gaussian noise with mean 0 and standard deviation σ was incorporated into the data. The values for σ were set to 0.02 and 0.10. Given that there was no significant difference in performance for block length 3 or 5, we set l to 3. Figures 2(a) and 2(b) illustrate the performance of BPE in noisy environments; the figures demonstrate correct estimation of the parameters.
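The noise model and the reported error measure are simple; a small sketch of both is given below (the helper names are illustrative, not the authors' code).

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(X, sigma):
    """Corrupt a (K, n) data array with additive zero-mean Gaussian noise."""
    return X + rng.normal(0.0, sigma, size=X.shape)

def relative_error(target, estimated):
    """Relative error |(target - estimated) / target| as reported in the tables."""
    target, estimated = np.asarray(target), np.asarray(estimated)
    return np.abs((target - estimated) / target)

# e.g. relative_error(0.200, 0.208) -> 0.04, matching the BPE entry for θ2 in Table 2
```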

Table 3. Target and estimated values (with standard deviations) for noise-free dataset for RKIP inhibited ERK pathway. Relative errors for varying block lengths are also presented.

Parameter  Target   Estimated ± SD (l=3)  Estimated ± SD (l=5)  Relative Error (l=3)  Relative Error (l=5)
θ1         0.53000  0.53000 ± 0.00001     0.53000 ± 0.00000     0.00000               0.00000
θ2         0.00720  0.00830 ± 0.00117     0.00827 ± 0.00112     0.15278               0.14861
θ3         0.62500  0.62500 ± 0.00000     0.62500 ± 0.00000     0.00000               0.00000
θ4         0.00245  0.00250 ± 0.00005     0.00250 ± 0.00005     0.02041               0.02041
θ5         0.03150  0.03144 ± 0.00004     0.03144 ± 0.00004     0.00190               0.00190
θ6         0.80000  0.80000 ± 0.00000     0.80000 ± 0.00000     0.00000               0.00000
θ7         0.00750  0.00769 ± 0.00012     0.00769 ± 0.00012     0.02533               0.02533
θ8         0.07100  0.07084 ± 0.00010     0.07084 ± 0.00011     0.00225               0.00225
θ9         0.92000  0.92001 ± 0.00001     0.92001 ± 0.00002     0.00001               0.00001
θ10        0.00122  0.00221 ± 0.00083     0.00218 ± 0.00083     0.81148               0.78689
θ11        0.87000  0.86964 ± 0.00026     0.86963 ± 0.00027     0.00041               0.00043

The results show the efficacy of the proposed approach for datasets which are corrupted by noise. Next we compare the performance of BPE with the approach presented in [21], where a Bayesian method (BM, a product space sampling technique) was utilized. Table 2 shows the results reported in [21], which were obtained by assuming that the data was corrupted by additive Gaussian noise with σ = 0.02; for each parameter the better of the two estimates is the one with the lower relative error. Experimental comparison of BPE with BM demonstrates improved results: the performance of BPE is better than that of BM for most of the parameters. The relative errors for the parameters estimated by BPE are (0.000, 0.350, 0.070, 2.480, 0.320) for σ = 0.10.

Dataset 2 The Ras/Raf-1/MEK/ERK signaling pathway (also called the ERK pathway) is one of the most important and intensively studied signaling pathways; it transfers mitogenic signals from the cell membrane to the nucleus [32]. It is deregulated in various diseases, ranging from cancer to immunological, inflammatory and degenerative syndromes, and thus represents an important drug target. Ras is activated by an external stimulus, via one of many growth factor receptors; it then binds to and activates Raf-1 to become Raf-1*, or activated Raf, which in turn activates MAPK/ERK Kinase (MEK), which in turn activates Extracellular signal Regulated Kinase (ERK). This cascade (Raf-1 → Raf-1* → MEK → ERK) of protein interactions controls cell differentiation, the effect being dependent upon the activity of ERK.

Fig. 3. Graphical representation of the ERK signaling pathway regulated by RKIP.


Fig. 4. Estimated values of parameters in a noisy environment where the data is corrupted by zero-mean additive Gaussian noise with standard deviation σ. Figures 4(a) and 4(b) plot the estimated parameters for σ = 0.02 and σ = 0.10 respectively. The figures also show the target values of the parameters.

RKIP inhibits the activation of Raf-1 by binding to it, disrupting the interaction between Raf-1 and MEK, thus playing a part in regulating the activity of the ERK pathway [31]. A number of computational models have been developed in order to understand the role of RKIP in the pathway and ultimately to develop new therapies [9, 7]. In this paper we use the RKIP inhibited ERK pathway as described in [9]. The set of ODEs is given below and a graphical representation of the ERK signaling pathway regulated by RKIP is shown in Figure 3. The parameters θ1, . . . , θ11 are shown as r1, . . . , r11 in Figure 3.

dS1/dt = −θ1 S1 S2 + θ2 S3 + θ5 S4
dS2/dt = −θ1 S1 S2 + θ2 S3 + θ11 S11
dS3/dt = θ1 S1 S2 − θ2 S3 − θ3 S3 S9 + θ4 S4
dS4/dt = θ3 S3 S9 − θ4 S4 − θ5 S4
dS5/dt = θ5 S4 − θ6 S5 S7 + θ7 S8
dS6/dt = θ5 S4 − θ9 S6 S10 + θ10 S11
dS7/dt = −θ6 S5 S7 + θ7 S8 + θ8 S8
dS8/dt = θ6 S5 S7 − θ7 S8 − θ8 S8
dS9/dt = −θ3 S3 S9 + θ4 S4 + θ8 S8
dS10/dt = −θ9 S6 S10 + θ10 S11 + θ11 S11
dS11/dt = θ9 S6 S10 − θ10 S11 − θ11 S11

We computed the concentrations of the species at 100 time points and treated this as the observed time series dataset. The data was generated by utilizing the above model with parameters θ1 = 0.53, θ2 = 0.0072, θ3 = 0.625, θ4 = 0.00245, θ5 = 0.0315, θ6 = 0.8, θ7 = 0.0075, θ8 = 0.071, θ9 = 0.92, θ10 = 0.00122, θ11 = 0.87. The free parameters R and m were set to 500 and 100 respectively.
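For reference, these eleven equations translate directly into a Python right-hand side of the form expected by the lse sketch of Section 2 (a sketch; the state ordering S1, . . . , S11 is assumed):

```python
def erk_rkip_rhs(t, S, theta):
    """Right-hand side of the RKIP-inhibited ERK pathway ODEs (S = [S1..S11])."""
    S1, S2, S3, S4, S5, S6, S7, S8, S9, S10, S11 = S
    t1, t2, t3, t4, t5, t6, t7, t8, t9, t10, t11 = theta
    return [
        -t1*S1*S2 + t2*S3 + t5*S4,             # dS1/dt
        -t1*S1*S2 + t2*S3 + t11*S11,           # dS2/dt
         t1*S1*S2 - t2*S3 - t3*S3*S9 + t4*S4,  # dS3/dt
         t3*S3*S9 - t4*S4 - t5*S4,             # dS4/dt
         t5*S4 - t6*S5*S7 + t7*S8,             # dS5/dt
         t5*S4 - t9*S6*S10 + t10*S11,          # dS6/dt
        -t6*S5*S7 + t7*S8 + t8*S8,             # dS7/dt
         t6*S5*S7 - t7*S8 - t8*S8,             # dS8/dt
        -t3*S3*S9 + t4*S4 + t8*S8,             # dS9/dt
        -t9*S6*S10 + t10*S11 + t11*S11,        # dS10/dt
         t9*S6*S10 - t10*S11 - t11*S11,        # dS11/dt
    ]
```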

The length of the blocks for generating the bootstrapped dataset was set to 3 and to 5. The initial guess for the parameters was provided by setting the values according to a uniform distribution between 0 and 1. Table 3 shows the relative errors and estimated values for the pathway. The results illustrate the efficacy of BPE in estimating the parameters: the values of the parameters θ1, θ3, θ5, θ6, θ8, θ9 and θ11 are almost identical to the target values. We next analyze the effect of varying l on the performance of BPE for the RKIP inhibited ERK pathway. It is interesting to note that there is no substantial difference in performance for the varying block lengths, and the algorithm computes accurate parameters for l = 3 and l = 5. In order to observe the influence of noise on the performance of BPE, we performed simulations by adding random noise drawn from a Gaussian distribution with mean zero. The standard deviation σ was set to 0.02 and 0.10, and the value of l was set to 3. Figures 4(a), 4(b) and Table 4 illustrate the results in noisy environments. The results show that in the scenario where the level of noise is low the method is successful in performing accurate computation of most of the parameters. In the case where the data is corrupted with a high level of noise, the performance of the method can decrease for some parameters.

Table 4. Relative errors for RKIP inhibited ERK pathway in noisy environment.

σ     θ1    θ2    θ3    θ4    θ5    θ6    θ7    θ8    θ9    θ10      θ11
0.02  0.000 0.794 0.000 1.159 0.109 0.000 1.545 0.180 0.001 44.377   0.024
0.10  0.001 2.565 0.002 4.294 0.429 0.004 6.200 0.753 0.006 186.402  0.093

Dataset 3 The p53 gene is viewed as an important tumor repressor gene [30]. A rise in p53 caused by damage to DNA produces apoptosis [33]. The important members of the p53 network are p53, Mouse double minute two (Mdm2), Ataxia telangiectasia mutated (ATM), Chk2, E2F1, and the alternative reading frame product (ARF). MDM2 is a ubiquitin ligase, whereas ATM and CHK2 are protein kinases; E2F1 is a transcription factor and ARF is a protein. The p53 network is considered highly complex [3] due to issues like its varied behaviour in tissues and cell lines [19]. A better understanding of the p53 network may lead to the development of effective cancer treatments. In order to study and analyze the p53 network a number of models have been proposed [2, 10, 6, 5]. We used the model described in [6, 5]: p53 is activated by the process of phosphorylation while Mdm2 is inactivated/degraded; Mdm2 molecules are transcribed by active p53, and Mdm2 promotes rapid degradation of active and inactive p53. The corresponding ODE model described in [6, 5] is given below.

dS1/dt = −θ1 S1
dS2/dt = θ2 − θ5 S2 − θ3 S4 S2 − θ4 S1 S2
dS3/dt = −θ5 S3 − θ3 S4 S3 + θ4 S1 S2
dS4/dt = θ6 + θ7 S3 − θ8 S4 − θ9 S1 S4

In the model, variables S1, S2, S3 and S4 represent active ATM, inactive p53, active p53 and Mdm2, respectively.
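In code form (a sketch; the state ordering follows the variable definitions above):

```python
def p53_rhs(t, S, theta):
    """p53/Mdm2 model: S1 = active ATM, S2 = inactive p53, S3 = active p53, S4 = Mdm2."""
    S1, S2, S3, S4 = S
    t1, t2, t3, t4, t5, t6, t7, t8, t9 = theta
    return [
        -t1*S1,                                 # dS1/dt
         t2 - t5*S2 - t3*S4*S2 - t4*S1*S2,      # dS2/dt
        -t5*S3 - t3*S4*S3 + t4*S1*S2,           # dS3/dt
         t6 + t7*S3 - t8*S4 - t9*S1*S4,         # dS4/dt
    ]
```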

Table 5. Estimated parameter values for noise-free dataset for the p53 gene regulatory network. Standard deviations, target values and relative errors are given.

Parameter  Target  Estimated ± SD (l=3)  Estimated ± SD (l=5)  Relative Error (l=3)  Relative Error (l=5)
θ1         0.050   0.050 ± 0.000         0.050 ± 0.000         0.000                 0.000
θ2         0.520   0.520 ± 0.000         0.520 ± 0.001         0.000                 0.000
θ3         1.420   1.420 ± 0.001         1.420 ± 0.001         0.000                 0.000
θ4         0.390   0.390 ± 0.000         0.390 ± 0.000         0.000                 0.000
θ5         0.041   0.041 ± 0.001         0.041 ± 0.001         0.000                 0.000
θ6         0.010   0.010 ± 0.000         0.010 ± 0.001         0.000                 0.000
θ7         2.500   2.500 ± 0.001         2.501 ± 0.003         0.000                 0.000
θ8         0.180   0.181 ± 0.001         0.181 ± 0.001         0.006                 0.006
θ9         0.750   0.750 ± 0.000         0.750 ± 0.001         0.000                 0.000

Table 6. Relative errors for noisy dataset for the p53 network.

σ     θ1    θ2    θ3    θ4    θ5    θ6    θ7    θ8    θ9
0.02  0.000 0.000 0.036 0.015 0.537 0.500 0.042 0.100 0.068
0.10  0.020 0.027 0.026 0.044 0.780 0.100 0.040 0.111 0.149

The parameters represent interaction rates, basal production rates and degradation rates. We now present the experiments for computing the parameters of the p53 network for noise-free and noisy datasets. The dataset was generated from the above model with the target values of the parameters given in Table 5. The noisy dataset was obtained by adding uniformly distributed random noise with mean 0 and standard deviation σ. The values of the free parameters R, m, l, and σ were set as described in the preceding paragraphs. LSE was provided with initial parameters that were set randomly according to a uniform distribution between 0 and 1. The results are reported in Tables 5 and 6 and are illustrated in Figures 5(a) and 5(b). The computed values of the parameters are the same as the target values for the noise-free dataset. The method also successfully estimates parameters in noisy environments. These results validate the efficacy of the method.

Table 7. Estimated parameter values (Est) obtained by applying the spline based collocation method (CM) and BPE to the p53 network noisy dataset. Target values are also given. Data is corrupted by additive random noise from a Gaussian distribution with mean zero and standard deviation σ = 0.06. The estimated values for CM are taken from [6]. Relative errors (Rel Err) for CM and BPE are also given.

Parameter  Target  CM Est  CM Rel Err  BPE Est  BPE Rel Err
θ1         0.05    0.049   0.020       0.049    0.020
θ2         0.52    0.455   0.125       0.570    0.096
θ3         1.42    1.37    0.035       1.549    0.091
θ4         0.39    0.37    0.051       0.407    0.044
θ5         0.04    -0.04   1.976       0.069    0.683
θ6         0.01    -0.05   6.000       0.016    0.600
θ7         2.5     2.05    0.180       2.456    0.018
θ8         0.18    0.14    0.222       0.186    0.033
θ9         0.75    0.42    0.440       0.660    0.120


Fig. 5. Estimated values of parameters in a noisy environment. Random noise from a Gaussian distribution with mean zero and standard deviation σ (σ = 0.02 for Figure 5(a) and σ = 0.10 for Figure 5(b)) was added to the data. Target values of the parameters are also shown.

We now compare the performance of BPE with a recently proposed technique [6]. The authors proposed a spline based collocation method (CM) for estimating the parameters of ODEs and applied their technique to the p53 gene regulatory network. The method was successful in providing accurate estimates of the parameters for the noise-free dataset, but the performance of CM degraded in a noisy environment. Table 7 presents the results of CM and BPE for the noisy dataset; the estimated parameter values for CM were reported in [6], and for each parameter the better of the two estimates is the one with the lower relative error. Table 7 shows that the performance of BPE is better than that of the collocation method for parameters θ2, θ4, θ5, θ6, θ7, θ8 and θ9, whereas the spline based collocation method achieved better performance for only one parameter, namely θ3. The results demonstrate that the performance of BPE is substantially better than that of CM, BPE being more robust to noise.

In this section we have demonstrated the efficacy of the proposed approach for parameter estimation in dynamic systems. The adaptive selection of parameters and the averaging of the individual estimates make the algorithm robust and highly useful in both noise-free and noisy environments.

5 Conclusion

This paper focuses on the task of parameter estimation and presents a technique that is inspired by bootstrapping. Our approach addresses the problems associated with the task and provides accurate estimates of the parameters in scenarios where the data is characterized by noise. It is based on the idea of adaptive computation of parameters followed by combination of the individual estimates. We applied our proposed technique for computing parameters of dynamic biochemical systems to several examples: two signaling pathways and a gene regulatory network. These experiments validate the efficacy of the approach in noise-free and noisy environments, and the results show that our approach generally outperforms the related methods considered in the study.

Techniques like spline-based collocation methods are also used for parameter estimation in dynamic systems; we believe that BPE in conjunction with these approaches will be effective and efficient, and we are looking at ways to develop such combined methods. One of the important issues to be addressed in our future work is to learn unknown parameters for very large dynamic biochemical and other systems. One way to achieve this goal is to adopt a divide and conquer based strategy.

Acknowledgements
The research presented in this paper is supported by the Biotechnology and Biological Sciences Research Council (BBSRC), grant reference number BB/F005679/1.

References

1. Andrews, D.W.K.: The block-block bootstrap: improved asymptotic refinements. Econometrica 72(3), 673–700 (2004)
2. Bar-Or, R.L., Maya, R., Segel, L.A., Alon, U., Levine, A.J., Oren, M.: Generation of oscillations by the p53-mdm2 feedback loop: a theoretical and experimental study. Proc Natl Acad Sci U S A 97(21), 11250–11255 (2000)
3. Braithwaite, A.W., Royds, J.A., Jackson, P.: The p53 story: layers of complexity. Carcinogenesis 26(7), 1161–1169 (2005)
4. Breiman, L.: Random forests. Machine Learning 45(1), 5–32 (2001)
5. Brewer, D.: Modelling the p53 gene regulatory network. Ph.D. thesis, University of London (2006)
6. Brewer, D., Barenco, M., Callard, R., Hubank, M., Stark, J.: Fitting ordinary differential equations to short time course data. Philosophical Transactions of the Royal Society A 366, 519–544 (2008)
7. Calder, M., Gilmore, S., Hillston, J.: Modelling the influence of RKIP on the ERK signalling pathway using the stochastic process algebra PEPA. In: Trans. Computational Systems Biology, vol. 4230, pp. 1–23. Springer (2004)
8. Calderhead, B., Girolami, M., Lawrence, N.: Accelerating Bayesian inference over nonlinear differential equations with Gaussian processes. In: Advances in Neural Information Processing Systems 21, pp. 217–224 (2009)
9. Cho, K.H., Shin, S.Y., Kim, H.W., Wolkenhauer, O., McFerran, B., Kolch, W.: Mathematical modeling of the influence of RKIP on the ERK signaling pathway. In: Priami, C. (ed.) Computational Methods in Systems Biology (CMSB'03). LNCS, vol. 2602, pp. 127–141. Springer (2003)
10. Ciliberto, A., Novak, B., Tyson, J.J.: Steady states and oscillations in the p53/Mdm2 network cell cycle. Cell Cycle 4(3), 488–493 (2005)
11. Coleman, T.F., Li, Y.: On the convergence of reflective Newton methods for large-scale nonlinear minimization subject to bounds. Mathematical Programming 67(2), 189–224 (1994)
12. Coleman, T.F., Li, Y.: An interior, trust region approach for nonlinear minimization subject to bounds. SIAM Journal on Optimization 6(2), 418–445 (1996)
13. Cussens, J.: Parameter estimation in stochastic logic programs. Machine Learning 44(3), 245–271 (2001)
14. Davidson, E., Levin, M.: Gene regulatory networks. Proc. Natl. Acad. Sci. USA 102(14), 4935 (2005)
15. Efron, B.: The Jackknife, the Bootstrap and Other Resampling Plans. Society for Industrial and Applied Mathematics (1982)
16. Efron, B.: Bootstrap methods: another look at the jackknife. The Annals of Statistics 7(1), 1–26 (1979)
17. Efron, B., Tibshirani, R.: An Introduction to the Bootstrap. Chapman and Hall (1993)
18. Elliot, W., Elliot, D.: Biochemistry and Molecular Biology. Oxford University Press, 2nd edn. (2002)
19. Fridman, J.S., Lowe, S.W.: Control of apoptosis by p53. Oncogene 22(56), 9030–9040 (2003)
20. Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B.: Bayesian Data Analysis. Chapman and Hall/CRC (2004)
21. Girolami, M.: Bayesian inference for differential equations. Theoretical Computer Science 408(1), 4–16 (2008)
22. Gunawardena, J.: Models in systems biology: the parameter problem and the meanings of robustness. In: Lodhi, H., Muggleton, S. (eds.) Elements of Computational Systems Biology, vol. 1. Wiley (2010)
23. Kirk, P.D.W., Stumpf, M.P.H.: Gaussian process regression bootstrapping: exploring the effects of uncertainty in time course data. Bioinformatics 25(10), 1300–1306 (2009)
24. Levins, R.: The strategy of model building in population biology. American Scientist 54, 421–429 (1966)
25. Lodhi, H.: Advances in systems biology. In: Lodhi, H., Muggleton, S. (eds.) Elements of Computational Systems Biology. Wiley (2010)
26. Lodhi, H., Muggleton, S.: Modelling metabolic pathways using stochastic logic programs-based ensemble methods. In: Danos, V., Schachter, V. (eds.) Second International Conference on Computational Methods in Systems Biology (CMSB04), pp. 119–133. LNCS, Springer Verlag (2004)
27. Ramsay, J.O., Hooker, G., Campbell, D., Cao, J.: Parameter estimation for differential equations: a generalized smoothing approach. J. R. Statist. Soc. B 69(5), 741–796 (2007)
28. Rao, J.S., Tibshirani, R.: The out-of-bootstrap method for model averaging and selection. Tech. rep., University of Toronto (1997)
29. Tyson, J.: Models of cell cycle control in eukaryotes. Journal of Biotechnology 71(1–3), 239–244 (1999)
30. Vogelstein, B., Lane, D., Levine, A.: Surfing the p53 network. Nature 408(6810), 307–310 (2000)
31. Yeung, K., Seitz, T., Li, S., Janosch, P., McFerran, B., Kaiser, C., Fee, F., Katsanakis, K.D., Rose, D.W., Mischak, H., Sedivy, J.M., Kolch, W.: Suppression of Raf-1 kinase activity and MAP kinase signaling by RKIP. Nature 401, 173–177 (1999)
32. Yeung, K., Janosch, P., McFerran, B., Rose, D.W., Mischak, H., Sedivy, J.M., Kolch, W.: Mechanism of suppression of the Raf/MEK/extracellular signal-regulated kinase pathway by the Raf kinase inhibitor protein. Mol. Cell Biol. 20(9), 3079–3085 (2000)
33. Yonish-Rouach, Y., Resnitzky, D., Lotem, J., Sachs, L., Kimchi, A., Oren, M.: Wild-type p53 induces apoptosis of myeloid leukaemic cells that is inhibited by interleukin-6. Nature 352(6333), 345–347 (1991)