Stochastic analysis and inference of a two-state genetic promoter model Abhyudai Singh, Cesar A. Vargas and Rajesh Karmakar
Abstract— Transcription is the process by which messenger RNA (mRNA) transcripts are synthesized from genes. Measurements in individual living cells reveal fluctuations in mRNA copy numbers over time suggesting that transcription is an intrinsically random process driven by thermal molecular motion of biochemical species. We here use a stochastic model of the transcription process that captures both the extent and timescale of fluctuations in mRNA population counts. In particular, randomness in the transcription process is captured through a two-state model, where the promoter of a gene stochastically switches between an active and inactive state. High levels of transcription occur from the active state, while the inactive state allows for a low basal rate of transcription. For the two-state model we derive exact analytical formulas for the steady-state mRNA probability distribution and the mRNA auto-correlation function. These results are applied to recent data from the Human Immunodeficiency Virus (HIV) system. Using Akaike Information Criterion (AIC) we select the most likely stochastic model for the transcription process given mRNA histogram data. For the selected model, maximum likelihood estimates of the different kinetic rates associated with the viral promoter are inferred. Analysis reveals that the viral promoter resides mostly in the inactive state and there is a 100fold difference in the rate of mRNA synthesis from the active and inactive state. In summary, formulas presented here are an important resource for reverse engineering genetic promoters from single-cell mRNA copy number data.
I. I NTRODUCTION Genetically identical cells exposed to the same environment exhibit significant stochastic variability in the levels of mRNAs and protein expressed from a given gene [1]– [4]. Given that such stochastic variability plays important functional roles within cellular processes [5]–[7], much work has focused on understanding the origins of stochasticity in the gene-expression process using both theoretical and experimental techniques. Here, we build systems-level mathematical models of the transcription process for capturing stochasticity in mRNA copy numbers. Particular focus is on model selection and inference using experimentally obtained mRNA copy number distributions. Perhaps the simplest stochastic model of the transcription process is one where individual mRNA transcripts are created at exponential distributed time intervals (Poisson process) and each mRNA lives for an exponentially distributed time. In this case, the mRNA distribution is predicted to be a Poisson distribution [8]. However, experimentally obtained mRNA histograms show considerable deviations from a A. Singh and C. Vargas are with the Department of Electrical and Computer Engineering, University of Delaware, Newark, DE 19716.
[email protected],
[email protected] R. Karmakar Mahavidyalaya,
is with Bengai,
the Department of Hooghly, West
[email protected] Physics, Bengal,
A.K.P.C. INDIA
Poisson distribution indicating that more complex models are needed to explain the data [9]–[12]. A popular model used for capturing stochasticity in the transcription process is a two-state promoter model where the promoter of a gene randomly fluctuates between an active and inactive state (see Fig. 1; [13]–[17]). For the two-state model we derive an exact analytical expression for the steady-state mRNA probability distribution and auto-correlation function. The above results are used to investigate the promoter within HIV. Previous results have shown that stochastic expression of viral mRNAs from this promoter during human cell infection drives the virus into latency, a drug resistant state of HIV [18]–[21]. Thus, this is an important system to build models that can capture gene-expression variability and reveal new insight about the promoter. Recent experiments measuring mRNAs expressed from the viral promoter inside individual cells [22] show considerable variation in mRNA copy numbers across a cell population, in spite of the fact that these cells are genetically identical, share a common environment, and have similar shapes/sizes. Combining analytically predicted mRNA distributions with histogram data, different kinetic parameters are estimated for the HIV promoter. Finally, using parameters inferred for the HIV promoter we predict the mRNA auto-correlation function, which could be verified with additional experiments. The paper is organized as follows. In Section II, we introduce the two-state promoter model and derive an analytical expression for the mRNA probability distribution. Procedure for inferring model parameters from data and its application to HIV is presented in Section III. Auto-correlation function of mRNA copy numbers is derived in Section IV. Finally, our conclusions are presented in Section V. II. T WO - STATE PROMOTE MODEL To model the transcription process we consider a two-state promoter architecture where the promoter fluctuates between an active and inactive state with rates kon and ko f f (see Fig. 1). mRNA transcripts are synthesized from the active state at a rate km . We assume transcription occurs at a low basal rate kb from the inactive state. Finally, synthesized mRNAs degrade at a rate γm . We represent the state of the promoter at time t by g(t), with g(t) = 1 (g(t) = 0) indicating that the promoter is in the active (inactive) state. Let m(t) denote the mRNA copy number inside the cell at time t. Then, the twostate promoter model can be described by a set of discrete events listed in Table I. As in the stochastic formulation of chemical kinetics [23]–[25], events “fire” at exponentially distributed time intervals, and whenever a specific event
!"#$#%&"'*+,-&'
3(45' 1"*)/+"(2,#)'!#$
!m kon
koff
!"#$#%&"'()*+,-&'
.*/*0'' 1"*)/+"(2,#)'!"'
' 6
!"#
%$Figure 1. Schematic of a two-state model where a promoter stochastically transitions between an active and inactive state. mRNAs are synthesized from the active (inactive) state at a rate km (kb ). Since transcription from the inactive state is inefficient, we assume kb < km . Table I: Frequency and reset maps for different stochastic events in the two-state promoter model Propensity function f (g, m) kon (1 − g(t))
Event description Promoter transition to ON state
Reset in population count g(t) → g(t) + 1
Promoter transition to OFF state
g(t) → g(t) − 1
ko f f g(t)
mRNA transcription from ON state
m(t) → m(t) + 1
km g(t)
mRNA transcription from OFF state
m(t) → m(t) + 1
kb (1 − g(t))
mRNA degradation
m(t) → m(t) − 1
γm m(t)
occurs the state of the system is updated accordingly to a reset map (see second column in Table I). The third column lists the propensity functions f (g, m), which determine how often an event occurs. More specifically, the probability that an event will occur in the next infinitesimal time interval (t,t + dt] is given by f (g, m)dt. In summary, probabilistic events described in Table I define a stochastic model where bursts of mRNAs are produced as the promoter transitions from the inactive to the active state, and then back to the inactive state. The two-state promoter model described above has been studied in great detail in previous works [9], [14], [15]. However, analysis of this model is mostly restricted to the case where kb = 0. Here, we derive an exact analytical expression for the steady-state mRNA probability for the full two-state model, where transcription even occurs when the promoter resides in the inactive state. Let p1 (m,t) (p0 (m,t)) be the probability that at time t, the gene is in the active (inactive) state with m number of mRNA molecules. Then, these probabilities evolve according to the following Chemical Master Equation (CME) [24], [25]: ∂ p0 (m,t) ∂t
∂ p1 (m,t) ∂t
= ko f f p1 (m,t) − kon p0 (m,t) + kb [p0 (m − 1,t) − p0 (m,t)] + γm [(m + 1)p0 (m + 1,t) − m p0 (m,t)]
(1a)
= kon p0 (m,t) − ko f f p1 (m,t) + km [p1 (m − 1,t) − p1 (m,t)] + γm [(m + 1)p1 (m + 1,t) − m p1 (m,t)].
Note that p(m,t) = p0 (m,t) + p1 (m,t)
is the probability of observing m mRNA transcripts at time t. The standard approach for solving the CME is to convert it into a partial differential equation on the probability generating function [26], [27]. Generating functions corresponding to p0 (m,t), p1 (m,t) and p(m,t) are defined as F0 (z, t) :=
∞
∑ zm p0 (m,t),
m=0
F1 (z, t) :=
F(z,t) := F0 (z, t) + F1 (z, t) =
∞
∞
∑ zm p1 (m,t)
(3a)
m=0
∑ zm p(m,t),
(3b)
m=0
respectively. The CME (1) can be written in terms of the generating functions as follows ∂ F0 (z,t) = ko f f F1 (z,t) − kon F0 (z,t) + kb (z − 1)F0 (z,t) ∂t ∂ F0 (z,t) + γm (1 − z) (4a) ∂z ∂ F1 (z,t) = kon F0 (z,t) − ko f f F1 (z,t) + km (z − 1)F1 (z,t) ∂t ∂ F1 (z,t) . (4b) + γm (1 − z) ∂z At steady-state 0 = ko f f F 1 (z) − kon F 0 (z,t) + kb (z − 1)F 0 (z) + γm (1 − z)
(1b)
(2)
dF 0 (z) dz
(5a)
0 = kon F 0 (z) − ko f f F 1 (z,t) + km (z − 1)F 1 (z) + γm (1 − z)
dF 1 (z) , dz
(5b)
where t→∞
t→∞
(6)
Adding equations (5a) and (5b) we obtain dF(z) , F(z) := lim F(z,t). t→∞ dz Solving equations (3b) and (7) yields km F 1 (z) + kb F 0 (z) = γm
(7)
F 1 (z) =
γm dF(z) kb F(z) − km − kb dz km − kb
(8a)
F 0 (z) =
km F(z) γm dF(z) − . km − kb km − kb dz
(8b)
Using (8), equation (5a) can be written as
where
dF(z) d 2 F(z) (z − 1) + (K − (kˆ m + kˆ b )z) 2 dz dz + (kˆ b kˆ m z − Ke )F(z) = 0, K = kˆ on + kˆ o f f + kˆ m + kˆ b Ke = kˆ on kˆ m + kˆ o f f kˆ b + kˆ m kˆ b
(9)
0.06
0.04
0.02
20
40
60
80
100
m !"#$%&'()%*+!,-.%/!0%
Figure 2. Steady-state mRNA probability distribution functions for a two-state promoter model with kˆ m = 100, kˆ on = 0.9 and kˆ o f f = 5. The basal transcription rate from the inactive promoter is either taken as kˆ b = 0 (solid line) or kˆ b = 10 (dashed line). All rates have been normalized by the mRNA degradation rate.
(10a) (10b)
and kˆ on = kon /γm , kˆ o f f = ko f f /γm , kˆ m = km /γm , kˆ b = kb /γm (11) are model parameters normalized by the mRNA degradation rate. The exact solution of equation (9) is given by ˆ
F(z) =C ekb z ˆ ˆ ˆ ˆ ˆ 1 F1 (kon ; kon + ko f f ; (z − 1)(km − kb ))
0.08
p!m" 1.',2,3435)%%"/!0%
F 0 (z) := lim F0 (z,t), F 1 (z) := lim F1 (z,t).
(12)
where 1 F1 (a; b; z) is the confluent hypergeometric function ˆ of the first kind and C = e−kb is a normalization constant determined from the condition F(1) = 1. Let p(m) be the steady-state mRNA probability distribution function, then d m F(z) p(m) = |z=0 . (13) m!dzm Differentiating equation (12) m times with respect to z at z = 0 and then comparing both sides one obtains � � � � m k ˆ m−i (kˆ m − kˆ b )i Γ kˆ on + i Γ kˆ on + kˆ o f f ˆ � � � � p(m) = e−kb ∑ b (m − i)! i! Γ kˆ on Γ kˆ on + kˆ o f f + i i=0 � � ˆ ˆ ˆ ˆ ˆ (14) 1 F1 kon + i; kon + ko f f + i; kb − km
which is characterized by the four normalized model parameters: kˆ m , kˆ b , kˆ on , kˆ o f f . Plots of p(m) for different parameter values are shown in Figure 2. There are two cases under which (14) reduces to a simpler form. Firstly, if kˆ b = 0 ( i.e., no transcription from the inactive state), then � � � � mΓ k ˆ on + m Γ kˆ on + kˆ o f f kˆ m � p(m) = � � � Γ kˆ on Γ kˆ on + kˆ o f f + m Γ(m + 1) � � ˆ ˆ ˆ ˆ (15) 1 F1 kon + m; kon + ko f f + m; −km .
Secondly, if kˆ b = 0 and kˆ o f f � 1 (i.e., the active promoter state is unstable) then we obtain the following negative binomial distribution p(m) = �
kˆ m 1+ kˆ o f f
�−kˆ on
� � kˆ m m ˆ Γ kˆ on + m ko f f (16) � � ˆ Γ kon Γ(m + 1) 1 + kˆ m kˆ o f f
characterized by two parameters kˆ on and kˆ m /kˆ o f f [9].
III. R EVERSE ENGINEERING THE HIV PROMOTER Measurements of mRNAs transcribed from the HIV promoter show considerable variation in copy numbers across genetically identical cell populations (see mRNA histogram data in Figure 3 taken from [22]). Given data, our goal is to find the most likely underlying stochastic model. It is important to point out that the mRNA population count data shown in Figure 3 has a Fano factor (variance divided by mean) close to 100, and hence, has a significant deviation from a Poisson distribution, for which Fano factor = 1. We consider three candidate mRNA distributions that arise from different underlying stochastic models: 1) Distribution associated with the full two-state model characterized by four parameters (Eq. 14). 2) Distribution associated with the two-state model where kˆ b = 0 and characterized by three parameters (Eq. 15). 3) Negative binomial characterized by two parameters (Eq. 16). Given measurements m1 , m2 , . . . , mN of mRNA population counts from N cells, parameters associated with each distribution can be estimated by maximizing the likelihood
Table II: Fitting different mRNA probability distributions to data
Model description
Probability distribution
Number of parameters k
∆AIC
Two-state promoter model
Eq. (14)
4 (kˆ m , kˆ b , kˆ on , kˆ o f f )
0
Two-state promoter model with kˆ b = 0
Eq. (15)
3 (kˆ m , kˆ on , kˆ o f f )
9
Two-state promoter model with kˆ b = 0 and kˆ o f f � 1
Eq. (16)
2 (kˆ m /kˆ o f f , kˆ on )
7.2
function: N
L(θ ) = ∏ p(mi , θ )
22
(17)
i=1
20
where parameters θ = {kˆ on , kˆ o f f , kˆ m , kˆ b }, θ = {kˆ on , kˆ o f f , kˆ m }, θ = {kˆ on , kˆ m /kˆ o f f } for distributions (14), (15) and (16), respectively. To select between the different possible models we use the Akaike information criterion (AIC) [28], [29]. Briefly, for each distribution, an AIC score is computed using
18
(18)
where k is the number of parameters in the distribution and Lmax is the maximized value of the likelihood function (17). As can be seen from (18), the AIC score rewards a model for goodness of fit but also penalizes it for complexity, with the penalty increasing linearly with the number of parameters that need to estimated in the distribution. Given AIC scores for different candidate models, the model with the least AIC score is selected [28], [29]. Using mRNA histogram data for the HIV system, AIC scores for different candidate distributions were computed by maximizing the corresponding likelihood function on MATHEMATICA. Analysis shows that the full two-state model has the lowest AIC score. Our results are summarized in Table II, which lists the ∆AIC = AIC − AICmin value for each model, where AICmin is the minimal AIC score across all the three candidate models. Both distributions (15) and (16) yield AIC scores that are at least 7 points higher than the AIC score for (14). To conclude, model inference using AIC selects the full two-state promoter model as the most likely stochastic model for the HIV promoter. The maximum-likelihood parameter estimates for this model are as follows: kˆ on = 0.9, kˆ o f f = 5.75, kˆ m = 759.4, kˆ b = 6.5.
(19)
A plot of the mRNA probability distribution (14) for these parameter values together with the histogram data is shown in Figure 3. IV.
M RNA AUTO - CORRELATION TIME
We next focus on the time scale of mRNA fluctuations and derive an analytical formula for the steady-state autocorrelation function of the stochastic process m(t). To determine the auto-correlation function we use the fact that �m(t + s)m(s)� = �m(s)�m(t + s) | m(s), g(s)�� ,
(20)
Cell Count
AIC = 2k − 2 ln Lmax ,
Experimental p(m)
16 14 12 10 8 6 4 2 0
0
100
200
300
400
500
mRNA Count Figure 3. Histogram of mRNA population counts (sample size ≈ 200) taken from [22] and the mRNA probability distribution p(m) (Eq. 14) for the two-state promoter model with kˆ on = 0.9, kˆ o f f = 5.75, kˆ m = 759.4 and kˆ b = 6.5.
where �.� denotes the expected value and �m(t + s) | m(s), m(s)�
(21)
is the expected number of mRNA transcripts at time t + s given m(s) and g(s). To compute the conditional expectation (21) we derive the linear system of differential equations that describe the time evolution of the statistical moments �g(t)�, �m(t)�, �g2 (t)�, �m2 (t)� and �g(t)m(t)�. For the two-state promoter model described in Table I, the time derivate of the expected value of any differential function ϕ(g, m) is given by � � d�ϕ(g, m)� = (22) ∑ ∆ϕ(g, m) f (g, m) , dt Events where ∆ϕ(g, m) is the change in ϕ(g, m) when an event occurs and f (g, m) is the event propensity function [30], [31]. For appropriate choices of ϕ(g, m) we obtain the following
2
R(t) :=
�m(t + s)m(s)� − �m�
(28a)
2
�m2 � − �m�
� 2 � � � km − kb2 γm exp(−kont − ko f f t) − exp(−γmt) � � = exp(−γmt) + (k +k )(γ +k +k )(k k +k k ) 2 − k2 (ko f f + kon − γm ) o f f on m okof ff f konon b o f f m on + km b
d�g� = kon − (kon + ko f f )�g� (23a) dt d�m� = km �g� + kb (1 − �g�) − γm �m� (23b) dt d�g2 � = kon + ko f f �g� + kon �g� − 2ko f f �g2 � − 2kon �g2 � dt (23c) d�m2 � = km �g� + kb (1 − �g�) + 2kb �m� dt + γm �m� + 2(km − kb )�gm� − 2γm �m2 � d�gm� = kb �g� + (km − kb )�m2 � + kon �m� dt − (γm + kon + ko f f )�gm�.
(23d)
(23e)
Steady-state analysis of the above linear equations yields �g� =
kb ko f f + km kon kon , �m� = kon + ko f f γm (kon + ko f f ) 2
�m2 � = �m� + �m�F
(24a) (24b)
(km − kb )2 kon ko f f (kon km + ko f f kb )(kon + ko f f )(γm + kon + ko f f ) (24c) (km − kb )kon ko f f �gm� = �g� �m� + , (24d) (kon + ko f f )2 (γm + kon + ko f f ) F = 1+
where F is the Fano factor (variance divided by mean) of the mRNA population count. Note that when km = kb , i.e., same rate of transcription from both promoter states, then F = 1, implying Poisson mRNA statistics. We next use (23) to compute the condition expectation (21). By solving the linear system of differential equations (23a) and (23b) with initial conditions g(s) and m(s), respectively, we obtain �m(t + s) | m(s), g(s)� = a1 (t)�m�+ a2 (t)g(s) + a3 (t)m(s)
t→∞
t→∞
t→∞
1.0 0.8 0.6 0.4 0.2 2
4
6
8
23!-%45)',67%
10
12
14
Figure 4. mRNA auto-correlation function (dashed line) for the two-state promoter model corresponding to parameters inferred for the HIV promoter in Section IV (Eq. 19). For comparison purposes, exp(−γmt) is also plotted (solid line) where γm in the mRNA degradation rate and taken as 0.23 hour−1 . V. C ONCLUSION
(25)
for some time-varying coefficients a1 (t), a1 (t) and a1 (t) that satisfy a1 (0) = 0, a2 (0) = 0, , a3 (0) = 1, lim a1 (t) = 1, lim a2 (t) = 0, lim a3 (t) = 0.
(23) results in the auto-correlation function R(t) given by Eq. 28. As expected, when km = kb , mRNA production is a Poisson process and the auto-correlation function reduces to R(t) = exp(−γmt). Figure 4 plots R(t) for parameters inferred in Section IV (Eq. 19). Experiments used for quantifying mRNA histograms in Figure 3, report an mRNA half-life of 3 hours [22]. Hence, we assume γm = .23 hour−1 in Eq. (28). Figure 4 shows that the mRNA auto-correlation function has shifted to the right compared to exp(−γmt) suggesting that the timescale of mRNA fluctuations is larger than the mRNA halflife. In particular, we obtain the mRNA auto-correlation time tm (defined as R(tm ) := 0.5) to be approximately 3.7 hours. !"#$%&'()*+),,-.&/)0%1'0+/)0%
moment dynamics:
(28b)
(26a) (26b)
Substituting (25) in (20) yields 2
�m(t + s)m(s)� = a1 (t)�m� + a2 (t)�gm� + a3 (t)�m2 �, (27)
which using values of steady-state moments computed in
The two-state promoter model has been widely used to capture stochasticity in the gene transcription process [8], [9], [16]. We derived exact analytical expressions for the steady-state mRNA probability distribution function (Eq. 14) and auto-correlation function (Eq. 28) for the full twostate promoter model. These results were used to infer kinetic parameters associated with the HIV promoter. Fitting analytically predicted distributions to mRNA histogram data showed that the full two-state promoter model was the most probable stochastic model for the viral promoter. Model inference revealed that
1) Even in the inactive state, the HIV promoter synthesizes transcripts, albeit at low rate (kb /γm = 6.5). Assuming γm = 0.23 hour−1 this corresponds to approximately 1-2 mRNAs being transcribed per hour. 2) In the active state, transcription occurs at a rate km /γm = 759, which corresponds to 2-3 mRNAs transcribed per minute. 3) The promoter spends only 15% of its time in the active state. Using the inferred parameters for the HIV promoter we predicted the mRNA auto-correlation function (Figure 4), which could be verified through additional experiments. In summary, given advances in experimental techniques to measure mRNA levels inside individual cells in real time, analytical results presented here could be useful in inferring genetic promoter models from data. An implicit assumption in the two-state promoter model is that the time spent in each promoter state is exponentially distributed. However, recent data shows that in some cases time spent in the inactive state is better approximated by a gamma distribution [32]. Future work will extend current analysis to such promoters and develop techniques for discriminating between alternative promoter architectures using measured mRNA population count statistics. R EFERENCES [1] A. Bar-Even, J. Paulsson, N. Maheshri, M. Carmi, E. O’Shea, Y. Pilpel, and N. Barkai, “Noise in protein expression scales with natural protein abundance,” Nature Genetics, vol. 38, pp. 636–643, 2006. [2] J. M. Raser and E. K. O’Shea, “Noise in gene expression: Origins, consequences, and control,” Science, vol. 309, pp. 2010–2013, 2005. [3] M. Kaern, T. Elston, W. Blake, and J. Collins, “Stochasticity in gene expression: from theories to phenotypes,” Nature Reviews Genetics, vol. 6, pp. 451–464, 2005. [4] A. Arkin, J. Ross, and H. H. McAdams, “Stochastic kinetic analysis of developmental pathway bifurcation in phage λ -infected Escherichia coli cells,” Genetics, vol. 149, pp. 1633–1648, 1998. [5] A. Raj and A. van Oudenaarden, “Nature, nurture, or chance: stochastic gene expression and its consequences,” Cell, vol. 135, pp. 216–226, 2008. [6] R. Losick and C. Desplan, “Stochasticity and cell fate,” Science, vol. 320, pp. 65–68, 2008. [7] A. Eldar and M. B. Elowitz, “Functional roles for noise in genetic circuits,” Nature, vol. 467, no. 7312, pp. 167–173, Sept. 2010. [8] J. Paulsson, “Model of stochastic gene expression,” Physics of Life Reviews, vol. 2, pp. 157–175, 2005. [9] A. Raj, C. Peskin, D. Tranchina, D. Vargas, and S. Tyagi, “Stochastic mRNA synthesis in mammalian cells,” PLoS Biology, vol. 4, p. e309, 2006. [10] I. Golding, J. Paulsson, S. Zawilski, and E. Cox, “Real-time kinetics of gene activity in individual bacteria,” Cell, vol. 123, pp. 1025–1036, 2005. [11] S. J. Gandhi, D. Zenklusen, T. Lionnet, and R. H. Singer, “Transcription of functionally related constitutive genes is not coordinated,” Nature Structural Molecular Biology, vol. 18, pp. 27–34, 2011. [12] J. R. Chubb, T. Trcek, S. M. Shenoy, and R. H. Singer, “Transcriptional pulsing of a developmental gene,” Current biology, vol. 16, pp. 1018– 1025, 2006. [13] T. B. Kepler and T. C. Elston, “Stochasticity in transcriptional regulation: Origins, consequences, and mathematical representations,” Biophysical Journal, vol. 81, no. 6, pp. 3116–3136, 2001. [14] M. Simpson, C. Cox, and G. S. Sayler, “Frequency domain chemical langevin analysis of stochasticity in gene transcriptional regulation,” J. of Theoretical Biology, vol. 229, pp. 383–394, 2004. [15] V. Shahrezaei and P. S. Swain, “Analytical distributions for stochastic gene expression,” Proceedings of the National Academy of Sciences, vol. 105, pp. 17 256–17 261, 2008.
[16] A. Singh, B. Razooky, C. D. Cox, M. L. Simpson, and L. S. Weinberger, “Transcriptional bursting from the HIV-1 promoter is a significant source of stochastic noise in HIV-1 gene expression,” Biophysical Journal, vol. 98, pp. L32–L34, 2010. [17] R. D. Dar, B. S. Razooky, A. Singh, T. Trimeloni, J. McCollum, C. Cox, M. Simpson, and L. Weinberger., “Transcriptional burst frequency and burst size are equally modulated across the human genome,” Proceedings of the National Academy of Sciences, vol. 109, pp. 17 454–17 459, 2012. [18] L. S. Weinberger, J. C. Burnett, J. E. Toettcher, A. Arkin, and D. Schaffer, “Stochastic gene expression in a lentiviral positive-feedback loop: HIV-1 Tat fluctuations drive phenotypic diversity,” Cell, vol. 122, pp. 169–182, 2005. [19] A. Singh and L. S. Weinberger, “Stochastic gene expression as a molecular switch for viral latency,” Current Opinion in Microbiology, vol. 12, pp. 460–466, 2009. [20] L. S. Weinberger, R. D. Dar, and M. L. Simpson, “Transient-mediated fate determination in a transcriptional circuit of HIV,” Nature Genetics, vol. 40, pp. 466–470, 2008. [21] A. Singh, “Stochastic analysis of genetic feedback circuit controlling cell-fate decision in HIV,” in Proc. of the 51st IEEE Conference on Decision and Control, Maui, Hawaii, 2012. [22] A. Singh, B. S. Razooky, R. D. Dar, and L. S. Weinberger, “Dynamics of protein noise can distinguish between alternate sources of geneexpression variability,” Molecular Systems Biology, vol. 8, p. 607, 2012. [23] D. T. Gillespie, “Approximate accelerated stochastic simulation of chemically reacting systems,” J. of Chemical Physics, vol. 115, no. 4, pp. 1716–1733, 2001. [24] D. A. McQuarrie, “Stochastic approach to chemical kinetics,” J. of Applied Probability, vol. 4, pp. 413–478, 1967. [25] D. J. Wilkinson, Stochastic Modelling for Systems Biology. Chapman and Hall/CRC, 2011. [26] R. Karmakar, “Conversion of graded to binary response in an activatorrepressor system,” Phys. Rev. E, vol. 81, p. 021905, 2010. [27] N. G. V. Kampen, Stochastic Processes in Physics and Chemistry. Amsterdam, The Netherlands: Elsevier Science, 2001. [28] H. Akaike, “A new look at the statistical model identification,” IEEE Trans. on Automatic Control, vol. 19, pp. 716–723, 1974. [29] M. J. Dunlop, E. Franco, and R. M. Murray, “A multi-model approach to identification of biosynthetic pathways,” in Proc. of the 2007 Amer. Control Conference, New York, NY, 2007. [30] A. Singh and J. P. Hespanha, “Approximate moment dynamics for chemically reacting systems,” IEEE Trans. on Automatic Control, vol. 56, pp. 414 – 418, 2011. [31] ——, “Stochastic hybrid systems for studying biochemical processes,” Phil. Trans. R. Soc. A, vol. 368, pp. 4995–5011, 2010. [32] D. M. Suter, N. Molina, D. Gatfield, K. Schneider, U. Schibler, and F. Naef, “Mammalian genes are transcribed with widely different bursting kinetics,” Science, vol. 332, pp. 472–474, 2011.