Monte
Carlo
Simulation
of Microbial
Population
B. H. SHAH,
Growth
J. D. BORWANKER
AND
D. RAMKRISHNA* Indian Instituie of Technology, Kanpur, U.P. 208016, India Communicated
by Charles
J. Mode
ABSTRACT Mathematical models individual cells differing
of cell populations from one another
which recognize them to be segregated into in state and behavior lead to complicated
integro-differential
equations.
Analysis
even more complex
situation
of coupled
integro-differential
the behavior
of populations
by Kendall
for simulating
model has been extended cellular state and behavior. that the random
variables
of fluctuations
in this work to include The resulting algorithm to be generated
in small populations equations. which
obey
leads to the
An early technique the birth-and-death
the factor of variation in individual is probabilistically exact in the sense
have exactly
derived
distribution
functions,
and
involves no arbitrary discretization of the time interval, characteristic of simulation techniques in general. The simulated systems include populations distributed according to their age in a birth-and-death process and a batch culture of cells distributed according to their mass which grow and reproduce by binary fission.
INTRODUCTION Microbial populations consist of individual cells that differ from one another in their physiological states and consequently in their growth and reproductive processes in any particular environment. Mathematical models have attempted to account for this complexity by assigning a simple identifying structure such as cell mass or age to the individual cell and accounting for variation among individual cells through a statistical distribution function for which equations are obtained based on the concept of a
*Current address: School Lafayette, Indiana 47907. MATHEMATICAL 0 American
Elsevier
of
BIOSCIENCES Publishing
Chemical
Engineering,
31, l-23 (1976)
Company,
Inc., 1976
Purdue
University,
West
1
2
B. H. SHAH ET AL.
number balance. The penalty for this sophistication, even with such gross approximations of the physiological state of the cell, has been the solution of integro-differential equations which result from the model. Small populations add to the complexity by the presence of fluctuations whose calculation would require the solution of coupled integro-differential equations [9]. Solutions have been obtained for large populations, which are distributed according to their age [2, 141 or mass [l, 121. A mathematical model of the type just referred to is essentially a set of stipulations regarding the behavior of the individual cell. Thus a characteristic state such as age or mass may be defined for the individual cell, together with the kinetics of its growth and a transition probability function associated with the birth of new cells from the parent cell. The term “growth” in the present context refers to the rate of change of state, which for cell age is of course unity. The growth rate and the transition probability of cell division will in general depend on the cell state, time and the “environment”, which may be defined by concentrations of one or more essential nutrient substances. Also other events such as death of an individual may be accounted for through a transition-probability function for them. These stipulations, together with a knowledge of the initial state of the population and the environment, should suffice for the prediction of all future states. Kendall [4] considered an “artificial realization” of a simple birth-anddeath process in the following framework. The process involves the random appearance of new individuals and the disappearance of existing individuals with respective transition probabilities or “frequency functions” which continually change the strength of the total population, indicated by the total number. Thus a “birth” event will add one to the population while a “death” event will delete one from it. As is customary, during a time interval dt, multiple events will occur with probability of O(dt2). Kendall defines a “time interval of quiescence”, the period between successive events during which the number of individuals would remain constant. Obviously, this interval of quiescence is a random quantity because of the random nature of the events themselves. Kendall shows that the interval of quiescence has an exponential distribution with a coefficient parameter which depends on the number of individuals at the beginning of the interval, although this latter qualification depends upon the model for the transition probabilities. Thus if the size of the population is known at the beginning of the interval of quiescence, the size at the end of the interval will depend on whether the last event is a birth or a death. Given that a birth or a death has occurred, the probability of either event is readily obtained as the ratio of the corresponding transition probability to the sum of the two frequencies. An artificial realization of the process is now made possible by the generation of random numbers which represent successive
SIMULATION
OF MICROBIAL GROWTH
3
quiescence intervals and identify whether the event at the termination of each interval is a birth or a death. The random numbers to be generated should obey their governing probabilistic laws. Several such realizations can be obtained from the instant of reckoning of the process, from which all information about the process, probabilistic or average, must be calculable. The objective of this paper is to show that Kendall’s approach is easily extended to obtain artificial realizations of mathematical models of populations which incorporate the additional detail of individual variation in some characteristic state. The resulting algorithm is a powerful tool for the simulation of particulate systems in general. Consequently, we derive the algorithm for an abstract particulate system and subsequently discuss its application to examples of biological populations. THE SIMULATION
ALGORITHM
We consider a system of particles in which the individual particle is distinguished by a single scalar property X which can take on values between 0 and co. The extension to a vector-valued particle state will become evident in the course of the following discussion. We assume that the rate of change of particle state, denoted X, is a function of the instantaneous state. The situation when X may depend on some environmental variables is, in principle, not difficult to accommodate, although computationally burdensome. The procedure for this generalization is indicated at a subsequent stage, but the derivation of the algorithm and the examples of application will exclude this feature. The particles in the system are subject to events which we will call particulate events, the occurrence of any of which will result in a change in the number of particles. Thus the breakage of a particle into smaller particles or fission of a cell into daughter cells are particulate events which increase the number of particles, while the agglomeration of two particles to form a single particle diminishes the number. The latter type, which necessarily involves the participation of more than one particle (i.e., biparticulate events), will not be treated here. Thus, this paper will deal with monoparticulate events such as cell division or cell death. Random particulate events are modeled through frequencies or transition probabilities. We suppose for the sake of generality that there are m monoparticulate events, the ith event possessing a frequency bi, which is assumed to depend on the particle state. Since environmental variables are excluded for the present, their influence, if any, on the frequencies b, would require the generalization outlined subsequently. At this point we recognize that the model may require additional stipulations pertaining to the states of newborn particles. The state of a newborn particle may depend on the state of the parent particle and may have a probabilistic distribution. Thus a daughter cell formed from the
B.H.SHAH
4
ETAI..
division of a mother cell of mass (say) m’ may have any mass between 0 and m’. If, however, age is the characteristic state of a cell, newborn cells will have age zero. Following KendalI 141,we first seek to determine the probability distribution of the interval of quiescence following an instant at which the state of the system is known, i.e.. the number of particles along with the states of individual particles is known. DISi-RIEUTION
OF QUIESCENCE
TIME
INTERVALS
The probability distribution to be determined is conditional on a known state of the system a? time t. Suppose there are N particles at time I with states _x,.x2,. . ,xN. We will refer to the particle of state x, as thejth particle. The state of thejth particle at any time 7 subsequent to I is given by the solution of the differential equation
with the initial condition that x=x, at T=O; the solution is most conveniently represented by x(! + T.x,,I). We denote the state of the system at time I by A, and define the following conditional probability: P (TIA,) = Pr{ interval of quiescence > TIA,) The interval of quiescence will be greater than 7 + do, if it is greater than 7 and during the time interval (r? 7 + dr) none of the N particles then existing underwent any of the m particulate events. The mathematical equivalent of the preceding statement is
1
P(r+dTlA,)=P(TjA,)
I-
2 5 4(X(t+T,Xj.t))dT].
!=I
Rearranging the terms in Eq. (2). dividing obtains the differential equation
i P(rjAt)
must
satisfy
by d7 and letting
dT+0,
one
2 5 bi(x(t+7,xj,t))
-$P(TiA,)= -P(rlA,) Further
(2)
J-1
i=l ,‘I
the obvious
(3) 1
initial
condition
P(O]A,)=
I,
SIMULATION
5
OF MICROBIAL GROWTH
subject to which the solution
of Eq. (3) is readily found. Thus
which is an exponential distribution with a variable parameter. The cumulative distribution function F(rlA,), which is the probability that the quiescence interval is less than r, is then given by
F(rlA,)=
l-exp
1
- 2 2 i=i
JTb,(,(r+~‘,~l,l))d7r
j=l
O
.1
(5)
Next we compute the probability that, given that a particulate event has occurred after a quiescence interval T following time t, thejth particle was involved in the ith event. This is conveniently done by defining a twodimensional indicator variable u which takes on the value (ij) for the event just mentioned to be true. It is readily seen that
Pr{ u = (W% T } = m
bi(x(t+ T,Xj,t)) Ncrj
(6)
Z: C bi(X(t+ T,Xj,t)) . i=l
j=l
When the quiescence interval T is specified along with the identity of the particle and the identity of the event it has undergone, the state of the system at t+ T is exactly calculable from that at time t, if the states of any newborns are precisely known. SIMULATION
OF SYSTEM
EVOLUTION
The particulate system under study evolves in time starting from a given initial condition. The preceding discussion has established that any particular realization of the system is represented by a sequence of random variables
(T,,u,)>
(Tz>dr...>(T,,,u,,)
(7)
which represents the state of the system until time t =Cy=,Tj with changes in the number of particles occurring at T,, T, + T2, T, + T,+ T3,. . . , T, + T2+ . . . + T,,. Of course, the tacit assumption has been made that the states of newborns are precisely known. The situation when this assumption does
6
B. H. SHAH
ET AL.
not hold will be treated through an example. Since the probability distributions of the random variables in the sequence are known, artificial realizations are possible by generating the random variables on a digital computer. The random variables are sampled sequentially until a specified value of the final time is just exceeded. A sufficient number of repetitions starting from the initial condition would then allow us to estimate the expected values of various quantities of interest and fluctuations about them. If the initial condition is known only in a probabilistic sense, then this would require the generation of specific realizations of the random initial conditions. We will not concern outselves here with the method of generating random variables, since it is available elsewhere. The simulation strategy should now be evident. The state of the system at any time t, represented by A,, starting from that at t = 0, is obtained as follows. From A,, generate the interval of quiescence T,. Update the state of the jth particle to i= 1,2 )...) N,. x = x ( T,, x/7O), Compute the event frequencies at t = T, and simulate the event by generating the indicator variable u. At this point one may need to generate additional random variables, especially in assigning the “birth states” of newborns. Since samples of a random variable with an arbitrary cumulative distribution function are obtainable by the methods in the preceding section, the states of newborns can be generated when a probabilistic model is available for their distribution. ESTIMATION
OF SYSTEM
PROPERTIES
Essentially, the number of simulations must be large enough to provide consistent estimates of the various quantities of interest which describe the system. Some of the quantities evaluated in this work are as follows. If n(x, t)dx represents the number of particles at time t with states between x and x+ dx, then N (x, t), the number of particles with states less than x, is given by N(x,t)=fn(x’,t)dx’.
Since N (x, t) is in general a random
(8)
quantity,
,8, (xpt) = E
one may obtain
[N (x, 41.
(9)
One may also write PI as P, (x,t)=
Lxf, (x’,t)dx’.
(10)
SIMULATION
7
OF MICROBIAL GROWTH
wheref,(x, t) is a product density of order 1, a concept first developed by A. Ramakrishnan [7] in the analysis of cosmic-ray showers; f,(x,t)dx is the expected number of particles between x and x + dx at time t. One may also obtain a function /?*(x,y, t) (x = J’,yyf2 (X’>Y’> t)dx'dy'.
fi as
(12)
,B2is thus associated with the calculation of fluctuations about mean values The particulate process defined in the preceding sections leads to integrodifferential equations in f,, fi etc. [9] which may be solved subject to appropriate initial conditions to obtain the values of /It and & However, the present aim is to compute their values from the results of direct simulation. Cumulative properties of the system derived from individual properties may also be evaluated. Let W(X) be the property associated with a particle of state x. Then define W(t) by
W(t)=
~mw(x)n(x,l)dx,
which is obviously a stochastic process. nature of the particle population,
Equivalently,
(13) due to the discrete
N(r)
W(t)=
z
w(x,).
(14)
j=l The expected value of W(t) and the variance
may be obtained
EW(1)=Jmw(x>f, (x,t)b
(15)
0
+
Yv2(x)f, I
0
(x,t)dx-
as in [9]:
V
mw(x)fi
0
2
(-Gt)dx
1 .
B. H. SHAH ET AL.
8
The formulae for estimating the quantities (9) (1 l), (15) and (16) from the simulation results are easily identified. Representing the estimated quantities by a circumflex on the symbol, we obtain for a total of S simulation runs the following results.
(17)
(18)
EW(t)= +
E Wj(l),
J=l
PW(t)=
&
,g [ u:(+~~wl*.
(21)
The confidence intervals of estimates were also determined using a confidence limit of 95%. The confidence interval for EW, for example, is given by l?W-k<EW(,x,y, r), where x and y now refer to age. If X(x) and p(x) are respectively the birth and death frequencies for an individual of age x, the differential equations in .f, and fi are easily derived:
(23)
B. H. SHAH
10
ET AL.
The differential equations contain no birth frequency, because birth has been assumed to leave the parent unaltered while adding an individual of age zero to the population. The possibility of multiplets requires some modifications, and A. Ramakrishnan and Srinivasan [8] have accounted for twins. The simulation procedure in this paper needs no special consideration to account for this effect, which is done naturally and readily. Equations (22) and (23) must be subject to the boundary conditions (24)
(25) If the initial population consists of No individuals with ages xo,,xo2, then the initial conditions for f, and fiare given by f,
(x,0)=
2 stx--oj)
XONo7
(26)
j=l
NO
f2 Equations arbitrary on letting simulation dence of increased.
NO
c 8(Y - XOj>, j=l
(XYY> 0) = Ix f3tx- XOi) i=l
i#j.
(27)
(22) and (23) are readily solved subject to (24) through (27) for frequencies h(x) and p(x). The algebra is considerably simplified h and p be independent of age, i.e., constants. Of course, the procedure is in no way complicated by allowing for age depenA and CL, only the number of simulations required may be For h and p constant we obtain r
Noe(X-“)‘-ti 3
O<x