Johan Andreasson Stanford University 12/19/2014
Scoring of riboswitches in EteRNA
Riboswitches are exciting! Here we give a brief introduction to riboswitches, riboswitch puzzles in EteRNA, and the scoring system we have introduced to assess riboswitch designs.
Introduction to riboswitches
Riboswitches are RNA molecules that change structure in response to signals such as small molecules inside the cell. They are present in all kingdoms of life and are commonly used by bacteria to fine-tune gene expression. Riboswitches may also be very useful in biotechnology and the development of RNA-based drugs, so the design of accurate and efficient switches is an important problem to solve.
The simplest switches contain an aptamer, a short RNA motif that binds a ligand. The switch alternates between two states, [1] and [2] (Figure 1). In the absence of ligand, the probability of being in state [2] is determined by the energy difference between the states, $\Delta G$. As seen in the blue trace in Figure 2, the switch is in state [2] almost all the time (96%) if the energy of that state is 2 kcal/mol lower than the energy of state [1]. When the states have the same energy, i.e., $\Delta G = 0$ kcal/mol, they are equally populated.
Figure 1. Example of an EteRNA switch puzzle. (Diagram labels: states [1], [1*], [2], and [2*]; ligand; aptamer; MS2 hairpin; MS2.)
Figure 2. Left: Probability of being in state [2] as a function of the difference in energy, $\Delta G$, between states [2] and [1] (blue curve). The probability is given by $P = 1/(1 + e^{\Delta G / k_B T})$, where $k_B$ is the Boltzmann constant and $T$ is the absolute temperature. The addition of ligand (red curve) shifts the equilibrium away from state [2]. Right: Free energy diagram of a simple switch (vertical axis: free energy; states 1 and 2). With no ligand present (blue line), state [2] has the lowest energy. The addition of ligand lowers the energy of state [1], which now becomes the most stable state.
State [1] has a properly folded aptamer, and the free energy of ligand binding lowers the energy of the state. This alters the energy difference, as indicated by the red arrow in the left panel of Figure 2. For a good switch, this change is larger than the native $\Delta G$ and shifts the balance to state [1]. In the example of Figure 1, the binding energy of 4 kcal/mol is twice as large as the native $\Delta G$, so state [2] is only occupied 4% of the time in the presence of ligand. The switch range (92%) is the difference in occupancy between the two conditions.
A good way to think about switching is through a free energy diagram, like the one in the right panel of Figure 2. The RNA will fold into the lowest-energy state, and ligand binding shifts the minimum from one state to the other.
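The occupancies quoted above can be checked with a short two-state Boltzmann calculation. The sketch below is illustrative only: the function name `prob_state2` is hypothetical, and the thermal energy of about 0.616 kcal/mol (roughly 37 °C) is an assumption, since the text does not state the temperature.

```python
import math

KT = 0.616  # kcal/mol; assumed thermal energy near 37 °C (not stated in the text)

def prob_state2(delta_g, kt=KT):
    """Two-state Boltzmann occupancy of state [2].

    delta_g is the energy of state [2] minus the energy of state [1], in kcal/mol.
    """
    return 1.0 / (1.0 + math.exp(delta_g / kt))

# Without ligand: state [2] lies 2 kcal/mol below state [1].
p_no_ligand = prob_state2(-2.0)

# With ligand: 4 kcal/mol of binding energy lowers state [1],
# so the energy difference becomes -2 + 4 = +2 kcal/mol.
p_with_ligand = prob_state2(-2.0 + 4.0)

# Switch range: difference in occupancy between the two conditions.
switch_range = p_no_ligand - p_with_ligand

print(f"no ligand: {p_no_ligand:.2f}, with ligand: {p_with_ligand:.2f}")
```

Under these assumptions the two occupancies round to the 96% and 4% quoted in the text.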
Riboswitches in EteRNA
Switch puzzles
In EteRNA, switch puzzles show two desired structures ([1] and [2] in Figure 1), and the players have to design a sequence that simultaneously solves both states. One conformation includes an aptamer and is stabilized by a ligand ([1*]). The ligand binding results in an energy bonus that is indicated by a symbol within the aptamer structure. Early switch puzzles in EteRNA were tested with chemical mapping methods that directly report the secondary structure. For our new switch puzzles, we instead use a new RNA array that allows us to probe thousands of switches at once using fluorescently labeled RNA-binding proteins or RNA oligonucleotides. In the first iterations, we are using the MS2 viral capsid protein (MS2), which binds the MS2 hairpin with high affinity. Because we can only observe binding, we are now measuring the structure of the riboswitch indirectly (state [2*]).
Aptamers and ligands
There are hundreds of aptamers in nature that sense various small molecules. New synthetic aptamers can also be obtained for new molecules using in vitro selection methods (e.g., SELEX), where the RNA is evolved from random starting sequences. For the first switch puzzles in EteRNA, we use an aptamer for the small molecule flavin mononucleotide (FMN) (Figure 3). The aptamer has a relatively strong affinity for FMN, with a dissociation constant $K_{d,\mathrm{FMN}} = 0.3$ µM. At 200 µM FMN, our experimental condition, most of these aptamers are expected to bind FMN.
Figure 3. Flavin mononucleotide (FMN). Source: Wikipedia
For future puzzles, we also want to use microRNAs (miRNAs), ~22-nt RNA molecules that modulate gene expression in humans and many other organisms. There are more than 2000 such miRNAs in humans, some of which may be used as therapeutic targets (http://www.mirbase.org/cgi-bin/browse.pl?org=hsa).
Binding of MS2 protein
The fluorescent signal, $F$, of binding by proteins like MS2 is characterized by a binding curve,
$$F = F_{\max} \frac{[\mathrm{MS2}]}{[\mathrm{MS2}] + K_{d,\mathrm{MS2}}},$$
where $[\mathrm{MS2}]$ is the MS2 concentration, $F_{\max}$ is the maximum signal, and $K_{d,\mathrm{MS2}}$ is the MS2 dissociation constant, i.e., the concentration at which the signal reaches half of its maximum.
For an RNA molecule that only contains an MS2 hairpin, $K_{d,\mathrm{MS2}} = 16$ nM (Figure 4, left panel). These measurements were done with an RNA array, where millions of RNA clusters are generated on a next-generation sequencing flow cell. For RNA molecules without an MS2 hairpin sequence, signals are low (Figure 4, right panel).
Figure 4. Left: Binding of MS2 to the MS2 hairpin. The normalized fluorescence signal for more than 18,000 clusters is shown as small dots. The median fluorescence in the presence and absence of FMN was fit to a binding curve (solid lines), with $K_{d,\mathrm{MS2}}$ as indicated. Right: Binding to a randomly generated RNA sequence without the MS2 hairpin (>15,000 clusters). Some binding is observed, but the median value is close to zero.
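As a quick illustration of the binding curve for the isolated MS2 hairpin, the sketch below evaluates a one-site binding curve at a few MS2 concentrations. The function name `binding_signal` and the choice of a normalized maximum signal of 1 are assumptions made for this example; only the 16 nM dissociation constant comes from the text.

```python
def binding_signal(ms2_nM, kd_nM, f_max=1.0):
    """Fluorescence signal for a simple one-site binding curve.

    ms2_nM : MS2 concentration in nM
    kd_nM  : dissociation constant in nM
    f_max  : maximum signal (1.0 for a normalized signal; an assumption here)
    """
    return f_max * ms2_nM / (ms2_nM + kd_nM)

KD_MS2 = 16.0  # nM, value reported for an RNA containing only the MS2 hairpin

# Evaluate the curve from low to high MS2 concentrations.
for conc in (1.0, 16.0, 100.0, 3000.0):
    print(f"[MS2] = {conc:7.1f} nM -> signal = {binding_signal(conc, KD_MS2):.3f}")
```

At an MS2 concentration equal to the dissociation constant the signal is exactly half of its maximum, which is the defining property used in the text.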
Signals for switches
OFF-switches
We define an OFF-switch as a riboswitch that turns off in the presence of ligand. Figure 1 shows one example where FMN binds to the aptamer and destabilizes the MS2 hairpin, leading to lower signal. For this example, the fluorescent signal is governed by a binding curve equation:
$$F = F_{\max} \frac{[\mathrm{MS2}]}{[\mathrm{MS2}] + K_{d,\mathrm{obs}}},$$
where
$$K_{d,\mathrm{obs}} = K_{d,\mathrm{MS2}} \left( 1 + e^{\Delta G / k_B T} \left( 1 + \frac{[\mathrm{FMN}]}{K_{d,\mathrm{FMN}}} \right) \right)$$
is the observed $K_d$, $k_B$ is the Boltzmann constant, and $T$ is the temperature. For a derivation of this equation and related expressions, see Appendix 1.
The range of switching for a single design could be determined at a single MS2 concentration by measuring the fluorescence in the presence and absence of ligand, e.g., for 0 and 200 µM FMN. In practice, the concentration of maximum switching, $[\mathrm{MS2}] = \sqrt{K_{d,\mathrm{obs}}^{\mathrm{ON}} K_{d,\mathrm{obs}}^{\mathrm{OFF}}}$ (the geometric mean of the observed dissociation constants in the ON- and OFF-states), is not known beforehand and is different for each design, so instead we measure the entire binding curve, from low MS2 concentrations to $[\mathrm{MS2}] = 3000$ nM.
ON-switches
An ON-switch enables the MS2 signal in the presence of ligand. The MS2 fluorescence signal is again governed by an apparent binding curve:
$$F = F_{\max} \frac{[\mathrm{MS2}]}{[\mathrm{MS2}] + K_{d,\mathrm{obs}}},$$
where
$$K_{d,\mathrm{obs}} = K_{d,\mathrm{MS2}} \left( 1 + \frac{e^{\Delta G / k_B T}}{1 + [\mathrm{FMN}]/K_{d,\mathrm{FMN}}} \right)$$
is the observed $K_d$.
The observed $K_d$ has a different expression from that of the OFF-switch. Here, a more negative $\Delta G$ represents a more stable state with the aptamer and the hairpin formed. Details are given in Appendix 1.
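To make the two observed dissociation constants concrete, the sketch below evaluates them with and without FMN. It assumes the forms that follow from the state energies in Appendix 1: for the OFF-switch, $K_{d,\mathrm{MS2}}(1 + e^{\Delta G/k_BT}(1 + [\mathrm{FMN}]/K_{d,\mathrm{FMN}}))$, and for the ON-switch, $K_{d,\mathrm{MS2}}(1 + e^{\Delta G/k_BT}/(1 + [\mathrm{FMN}]/K_{d,\mathrm{FMN}}))$. The function names and the thermal energy of 0.616 kcal/mol are assumptions for illustration.

```python
import math

KT = 0.616      # kcal/mol; assumed thermal energy (about 37 °C)
KD_MS2 = 16.0   # nM, MS2 hairpin dissociation constant from the text
KD_FMN = 0.3    # µM, FMN aptamer dissociation constant from the text
FMN = 200.0     # µM, FMN concentration used in the experiments

def kd_obs_off(delta_g, fmn=FMN):
    """Observed MS2 Kd (nM) for the OFF-switch; delta_g in kcal/mol, fmn in µM."""
    boltz = math.exp(delta_g / KT)
    return KD_MS2 * (1.0 + boltz * (1.0 + fmn / KD_FMN))

def kd_obs_on(delta_g, fmn=FMN):
    """Observed MS2 Kd (nM) for the ON-switch; delta_g in kcal/mol, fmn in µM."""
    boltz = math.exp(delta_g / KT)
    return KD_MS2 * (1.0 + boltz / (1.0 + fmn / KD_FMN))

# OFF-switch: adding FMN raises the observed Kd (weaker MS2 binding).
print(kd_obs_off(-2.0, fmn=0.0), kd_obs_off(-2.0, fmn=FMN))
# ON-switch: adding FMN lowers the observed Kd (stronger MS2 binding).
print(kd_obs_on(2.0, fmn=0.0), kd_obs_on(2.0, fmn=FMN))
```

In both cases the observed value can never drop below the 16 nM of the bare hairpin; ligand only moves it between the two limits.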
Switch design score
To evaluate submitted and tested designs, we use an EteRNA Score (0–100) that adds a Switch Subscore (0–40), a Baseline Subscore (0–30), and a Folding Subscore (0–30). Together, these scores capture the three main qualities of a good switch (Figure 5). The Switch Subscore (red arrow) quantifies the fold-increase in $K_{d,\mathrm{obs}}$ between the OFF- and the ON-states, and indirectly the range of switching at the optimal MS2 concentration. The Baseline Subscore (green arrow) quantifies how close the affinity of the ON-state is to that of the native MS2 hairpin. The Folding Subscore rewards switches that properly fold the MS2 hairpin. If the maximum signal falls below a threshold, the Folding Subscore decreases. The three subscores are explained in more detail below.
Figure 5. EteRNA subscores. The arrows indicate the direction the binding curves should move to increase the scores. The EteRNA score is the sum of the Switch Subscore (red), the Baseline Subscore (green), and the Folding Subscore (blue). (Figure labels: Folding Subscore, Baseline Subscore, Switch Subscore, $F_{\mathrm{high}}$, $F_{\mathrm{low}}$.)
Switch Subscore
The first quality of a riboswitch design is that it has to switch in the presence of ligand. Here, we use the fold-change in $K_{d,\mathrm{obs}}$ upon addition of ligand, $K_{d,\mathrm{obs}}^{\mathrm{OFF}} / K_{d,\mathrm{obs}}^{\mathrm{ON}}$ (the observed dissociation constant in the OFF-state divided by that in the ON-state), as a metric for the switching (Figure 6, left panel). The switch range, i.e., the difference between maximum and minimum signal at the optimal MS2 concentration, is directly related to the fold-change (Figure 6, right panel and Appendix 1), so the higher the fold-change, the better the switching.
Figure 6. Left: Switch metrics for an OFF-switch. In this example, the switch range is 0.70, or 70%, and the fold-change in $K_{d,\mathrm{obs}}$ is 89. Right: Switch range as a function of switch efficiency.
The formal expression for the Switch Subscore is
$$40 \times \max\left(0, \min\left(1, \frac{\log(\text{fold-change})}{\log(\text{cutoff})}\right)\right).$$
All designs with a fold-change above the cutoff receive the maximum score (40). Below the cutoff, the score is proportional to the logarithm of the fold-change, and designs with a fold-change less than 1 receive a score of 0 (Figure 7).
Figure 7. Switch Subscore as a function of fold-change in $K_{d,\mathrm{obs}}$.
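The clamped logarithmic form described for the Switch Subscore can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the scoring code used by EteRNA: the function name `switch_subscore` and its parameters are hypothetical, and the default cutoff of 26 is the value the text quotes for 200 µM FMN.

```python
import math

def switch_subscore(fold_change, cutoff=26.0, max_points=40.0):
    """Switch Subscore: 0 below a fold-change of 1, max_points at or above the cutoff,
    and proportional to log(fold_change) in between."""
    if fold_change <= 1.0:
        # No switching (or switching in the wrong direction) earns no points.
        return 0.0
    fraction = math.log(fold_change) / math.log(cutoff)
    return max_points * min(1.0, fraction)

for fc in (0.5, 1.0, 5.0, 26.0, 89.0):
    print(f"fold-change {fc:5.1f} -> Switch Subscore {switch_subscore(fc):.1f}")
```

Because the score depends on a ratio of logarithms, the base of the logarithm does not matter.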
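For our simple OFF-switch, the fold-change is
$$\frac{1 + e^{\Delta G / k_B T} \left( 1 + [\mathrm{FMN}]/K_{d,\mathrm{FMN}} \right)}{1 + e^{\Delta G / k_B T}},$$
and for the ON-switch,
$$\frac{1 + e^{\Delta G / k_B T}}{1 + e^{\Delta G / k_B T} / \left( 1 + [\mathrm{FMN}]/K_{d,\mathrm{FMN}} \right)}.$$
The cutoff for maximum score is set to $\sqrt{[\mathrm{FMN}]/K_{d,\mathrm{FMN}}} \approx 26$ when $[\mathrm{FMN}] = 200$ µM, the conditions used in our experiments. For both types of switches, this value is expected for switches with $|\Delta G| = \tfrac{1}{2} k_B T \ln\left([\mathrm{FMN}]/K_{d,\mathrm{FMN}}\right)$.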
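It turns out that for both OFF- and ON-switches, the fold-change increases with $\Delta G$, i.e., when the ON-state is destabilized. For large positive $\Delta G$, the fold-change approaches its upper limit of approximately $[\mathrm{FMN}]/K_{d,\mathrm{FMN}}$, but only at the cost of also increasing $K_{d,\mathrm{obs}}$ in the ON-state (Figure 8).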
Figure 8. Fold-change of $K_{d,\mathrm{obs}}$ upon switching (red) and Baseline Ratio (green) as functions of $\Delta G$ for two example switches. Left: The OFF-switch illustrated in Figure 1. Right: The ON-switch described in Appendix 1. The $\Delta G$ scales are different for the left and right panels, but the general behavior of the OFF- and ON-switches is similar. The shaded areas indicate switch qualities that are considered good and receive the highest EteRNA scores. The overlap represents the window in free energy, $\Delta G$, for designs that can receive the maximum score.
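The trade-off shown in Figure 8 can be reproduced numerically. The sketch below assumes the observed dissociation constants that follow from the state energies in Appendix 1 and a thermal energy of 0.616 kcal/mol (about 37 °C, not stated in the text); the function names are hypothetical. For each $\Delta G$ it returns the fold-change in the observed $K_d$ and the Baseline Ratio (observed $K_d$ in the ON-state divided by the bare MS2 $K_d$).

```python
import math

KT = 0.616               # kcal/mol; assumed thermal energy
F_RATIO = 200.0 / 0.3    # [FMN] / Kd_FMN under the experimental conditions

def off_switch_metrics(delta_g):
    """(fold_change, baseline_ratio) for the OFF-switch at a given delta_g."""
    b = math.exp(delta_g / KT)
    baseline_ratio = 1.0 + b                       # ON-state: no FMN present
    fold_change = (1.0 + b * (1.0 + F_RATIO)) / baseline_ratio
    return fold_change, baseline_ratio

def on_switch_metrics(delta_g):
    """(fold_change, baseline_ratio) for the ON-switch at a given delta_g."""
    b = math.exp(delta_g / KT)
    baseline_ratio = 1.0 + b / (1.0 + F_RATIO)     # ON-state: FMN present
    fold_change = (1.0 + b) / baseline_ratio
    return fold_change, baseline_ratio

for dg in (-4.0, -2.0, 0.0, 2.0, 4.0, 6.0):
    fc_off, br_off = off_switch_metrics(dg)
    fc_on, br_on = on_switch_metrics(dg)
    print(f"dG={dg:+.0f}  OFF: FC={fc_off:8.1f} BR={br_off:8.1f}   "
          f"ON: FC={fc_on:8.1f} BR={br_on:8.1f}")
```

Both fold-changes rise monotonically with $\Delta G$ toward the same ceiling, while the Baseline Ratio grows without bound, which is why only a window of $\Delta G$ values scores well on both metrics.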
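Baseline Subscore
The observed switching between the two states increases as we bias the OFF-state, but for a good design we want the molecule to switch close to the $K_d$ of our detection molecule (MS2). We therefore introduce a second metric, the Baseline Subscore, which is based on the Baseline Ratio, $K_{d,\mathrm{obs}}^{\mathrm{ON}} / K_{d,\mathrm{MS2}}$. The Baseline Ratio quantifies the affinity for the ON-state, $K_{d,\mathrm{obs}}^{\mathrm{ON}}$, compared with the baseline MS2 affinity, $K_{d,\mathrm{MS2}}$. The formal expression for the Baseline Subscore is
$$30 \times \max\left(0, \min\left(1, 1 - \frac{\text{Baseline Ratio} - \text{threshold}_1}{\text{threshold}_2 - \text{threshold}_1}\right)\right).$$
For a low Baseline Ratio, designs are awarded the maximum subscore (30). When the Baseline Ratio exceeds an initial threshold, $\text{threshold}_1$, the score decreases linearly. Beyond a second threshold, $\text{threshold}_2$, the score is 0 (Figure 9).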
Figure 9. Baseline Subscore as a function of Baseline Ratio.
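The piecewise-linear behavior described for the Baseline Subscore can be sketched as follows. This is an illustration under stated assumptions rather than the actual scoring code: the function name `baseline_subscore` and its parameter names are hypothetical, the linear interpolation between thresholds is inferred from the description, and the default thresholds of 2 and 6 are the values given in the text.

```python
def baseline_subscore(baseline_ratio, first_threshold=2.0, second_threshold=6.0,
                      max_points=30.0):
    """Baseline Subscore: max_points up to the first threshold, then a linear
    decrease that reaches 0 at the second threshold."""
    fraction = 1.0 - (baseline_ratio - first_threshold) / (second_threshold - first_threshold)
    # Clamp to [0, 1] so ratios outside the two thresholds give 30 or 0 points.
    return max_points * max(0.0, min(1.0, fraction))

for ratio in (1.0, 2.0, 3.0, 4.0, 6.0, 10.0):
    print(f"Baseline Ratio {ratio:4.1f} -> Baseline Subscore {baseline_subscore(ratio):.1f}")
```

A Baseline Ratio of 1 corresponds to an ON-state that binds MS2 as tightly as the bare hairpin.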
Based on our theoretical example switches, we set the first threshold to 2. For the OFF-switch, this corresponds to $\Delta G = 0$, or a 50% probability that the switch forms the MS2 hairpin in the absence of ligand. Similarly for the ON-switch, a Baseline Ratio of 2 gives a 50% probability of forming the hairpin in the presence of ligand, i.e., the energy from FMN binding is equal to $\Delta G$. The second threshold is set to 6.
Folding Subscore
A switch is only useful if it actually folds into the proper structure. To this end, we define the Folding Score, which is based on the maximum fluorescence intensity in the ON-state, $F_{\max}$. This intensity is measured as a fraction of the intensity for the MS2 control. The Folding Subscore ensures that high-scoring designs are fully functional and do not return high scores by generating unusual binding curves. The formal expression for the Folding Subscore is
$$30 \times \max\left(0, \min\left(1, \frac{F_{\max} - 0.3}{0.7 - 0.3}\right)\right).$$
The maximum Folding Score (30) is awarded to all designs for which the maximum intensity is above 0.7. The score is 0 if $F_{\max} < 0.3$ and increases linearly for intermediate values, i.e., between 0.3 and 0.7 (Figure 10).
Figure 10. Folding Subscore as a function of the maximum fluorescence intensity in the ON-state.
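The Folding Subscore rule (0 below 0.3, full score above 0.7, linear in between) can be written as a small clamped function. The name `folding_subscore` and its parameters are hypothetical and chosen for this sketch; only the thresholds and the 30-point maximum come from the text.

```python
def folding_subscore(max_intensity, low=0.3, high=0.7, max_points=30.0):
    """Folding Subscore from the maximum ON-state intensity, expressed as a
    fraction of the MS2 control intensity."""
    fraction = (max_intensity - low) / (high - low)
    # Clamp so that intensities below `low` score 0 and above `high` score max_points.
    return max_points * max(0.0, min(1.0, fraction))

for intensity in (0.2, 0.3, 0.5, 0.7, 1.0):
    print(f"max intensity {intensity:.1f} -> Folding Subscore {folding_subscore(intensity):.1f}")
```

The total EteRNA Score is then the sum of this value and the Switch and Baseline Subscores.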
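Appendix 1: Derivation of expressions for observed MS2 binding curves
OFF-switch
We define an OFF-switch as a riboswitch that turns off in the presence of ligand. In other words, the MS2 signal should decrease as the ligand concentration increases. Here, the ligand is FMN at a concentration of $[\mathrm{FMN}] = 200$ µM. The aptamer affinity for FMN is given by the dissociation constant, $K_{d,\mathrm{FMN}} = 0.3$ µM. Using the same states as in Figure 1, we find the following energies:
[1*] (Aptamer formed, FMN bound, no hairpin, no MS2 bound): $-k_B T \ln\left([\mathrm{FMN}]/K_{d,\mathrm{FMN}}\right)$
[1] (Aptamer formed, no FMN bound, no hairpin, no MS2 bound): $0$
[2] (No aptamer, no FMN bound, hairpin formed, no MS2 bound): $\Delta G$
[2*] (No aptamer, no FMN bound, hairpin formed, MS2 bound): $\Delta G - k_B T \ln\left([\mathrm{MS2}]/K_{d,\mathrm{MS2}}\right)$
The fluorescent signal is only observed in [2*], and the probability of being in this state is given by:
$$P_{[2*]} = \frac{e^{-G_{[2*]}/k_B T}}{e^{-G_{[1*]}/k_B T} + e^{-G_{[1]}/k_B T} + e^{-G_{[2]}/k_B T} + e^{-G_{[2*]}/k_B T}},$$
where $G_{[i]}$ is the energy of state $[i]$. Inserting the energies above gives
$$\mathrm{Signal} = F_{\max} P_{[2*]} = F_{\max} \frac{[\mathrm{MS2}]}{[\mathrm{MS2}] + K_{d,\mathrm{obs}}},$$
where
$$K_{d,\mathrm{obs}} = K_{d,\mathrm{MS2}} \left( 1 + e^{\Delta G / k_B T} \left( 1 + \frac{[\mathrm{FMN}]}{K_{d,\mathrm{FMN}}} \right) \right)$$
and $F_{\max}$ is a constant scaling factor. For a normalized fluorescent signal, $F_{\max} = 1$. For our experiments, $[\mathrm{FMN}]/K_{d,\mathrm{FMN}} = 667 \gg 1$.
The fold-change in $K_{d,\mathrm{obs}}$, i.e., the value with FMN (OFF-state) divided by the value without FMN (ON-state), is
$$\frac{K_{d,\mathrm{obs}}^{\mathrm{OFF}}}{K_{d,\mathrm{obs}}^{\mathrm{ON}}} = \frac{1 + e^{\Delta G / k_B T} \left( 1 + [\mathrm{FMN}]/K_{d,\mathrm{FMN}} \right)}{1 + e^{\Delta G / k_B T}}.$$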
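The OFF-switch derivation can be checked numerically by comparing an explicit Boltzmann sum over the four states with the closed-form binding curve. The sketch below assumes the state energies listed in this appendix, a normalized signal, and a thermal energy of 0.616 kcal/mol (about 37 °C, not stated in the text); all function names are hypothetical.

```python
import math

KT = 0.616  # kcal/mol; assumed thermal energy, not given in the text

def off_switch_energies(ms2_nM, fmn_uM, delta_g, kd_ms2_nM=16.0, kd_fmn_uM=0.3):
    """Energies (kcal/mol) of the four OFF-switch states, relative to state [1]."""
    # A binder at zero concentration makes its bound state inaccessible (+inf energy).
    fmn_term = -KT * math.log(fmn_uM / kd_fmn_uM) if fmn_uM > 0 else math.inf
    ms2_term = -KT * math.log(ms2_nM / kd_ms2_nM) if ms2_nM > 0 else math.inf
    return {"1*": fmn_term, "1": 0.0, "2": delta_g, "2*": delta_g + ms2_term}

def prob_ms2_bound(ms2_nM, fmn_uM, delta_g):
    """Probability of state [2*] from the explicit Boltzmann sum."""
    energies = off_switch_energies(ms2_nM, fmn_uM, delta_g)
    weights = {state: math.exp(-g / KT) for state, g in energies.items()}
    return weights["2*"] / sum(weights.values())

def prob_closed_form(ms2_nM, fmn_uM, delta_g, kd_ms2_nM=16.0, kd_fmn_uM=0.3):
    """Same probability written as a binding curve with an observed Kd."""
    kd_obs = kd_ms2_nM * (1.0 + math.exp(delta_g / KT) * (1.0 + fmn_uM / kd_fmn_uM))
    return ms2_nM / (ms2_nM + kd_obs)

for fmn in (0.0, 200.0):
    print(fmn, prob_ms2_bound(100.0, fmn, -2.0), prob_closed_form(100.0, fmn, -2.0))
```

The two functions agree to numerical precision, which confirms that the four-state model collapses to a single binding curve with a shifted dissociation constant.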
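ON-switch
ON-switches form the MS2 hairpin in the presence of ligand. The derivation for the observed signal is similar to that for the OFF-switch, but here we need to consider five states:
[0] (No aptamer, no FMN bound, no hairpin, no MS2 bound): $0$
[1] (Aptamer formed, no FMN bound, hairpin formed, no MS2 bound): $\Delta G$
[1*] (Aptamer formed, FMN bound, hairpin formed, no MS2 bound): $\Delta G - k_B T \ln\left([\mathrm{FMN}]/K_{d,\mathrm{FMN}}\right)$
[2] (Aptamer formed, no FMN bound, hairpin formed, MS2 bound): $\Delta G - k_B T \ln\left([\mathrm{MS2}]/K_{d,\mathrm{MS2}}\right)$
[2*] (Aptamer formed, FMN bound, hairpin formed, MS2 bound): $\Delta G - k_B T \ln\left([\mathrm{FMN}]/K_{d,\mathrm{FMN}}\right) - k_B T \ln\left([\mathrm{MS2}]/K_{d,\mathrm{MS2}}\right)$
The fluorescent signal is observed in the two MS2-bound states, [2] and [2*]:
$$\mathrm{Signal} = F_{\max} \left( P_{[2]} + P_{[2*]} \right) = F_{\max} \frac{[\mathrm{MS2}]}{[\mathrm{MS2}] + K_{d,\mathrm{obs}}},$$
where
$$K_{d,\mathrm{obs}} = K_{d,\mathrm{MS2}} \left( 1 + \frac{e^{\Delta G / k_B T}}{1 + [\mathrm{FMN}]/K_{d,\mathrm{FMN}}} \right).$$
The fold-change in $K_{d,\mathrm{obs}}$, i.e., the value without FMN (OFF-state) divided by the value with FMN (ON-state), is
$$\frac{K_{d,\mathrm{obs}}^{\mathrm{OFF}}}{K_{d,\mathrm{obs}}^{\mathrm{ON}}} = \frac{1 + e^{\Delta G / k_B T}}{1 + e^{\Delta G / k_B T} / \left( 1 + [\mathrm{FMN}]/K_{d,\mathrm{FMN}} \right)}.$$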
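The five-state ON-switch model can be verified in the same spirit: sum the Boltzmann weights of the two MS2-bound states and compare with a single binding curve. The sketch assumes the state energies listed above, a normalized signal, and a thermal energy of 0.616 kcal/mol (an assumption, since the temperature is not stated); the helper and function names are hypothetical.

```python
import math

KT = 0.616  # kcal/mol; assumed thermal energy

def binding_energy(conc, kd):
    """Energy change on binding at concentration conc; +inf if the binder is absent."""
    return -KT * math.log(conc / kd) if conc > 0 else math.inf

def on_switch_signal(ms2_nM, fmn_uM, delta_g, kd_ms2_nM=16.0, kd_fmn_uM=0.3):
    """Normalized signal from the explicit Boltzmann sum over the five states."""
    fmn_e = binding_energy(fmn_uM, kd_fmn_uM)
    ms2_e = binding_energy(ms2_nM, kd_ms2_nM)
    energies = {
        "0": 0.0,                       # unfolded reference state
        "1": delta_g,                   # aptamer + hairpin, nothing bound
        "1*": delta_g + fmn_e,          # FMN bound
        "2": delta_g + ms2_e,           # MS2 bound
        "2*": delta_g + fmn_e + ms2_e,  # FMN and MS2 bound
    }
    weights = {state: math.exp(-g / KT) for state, g in energies.items()}
    return (weights["2"] + weights["2*"]) / sum(weights.values())

def on_switch_closed_form(ms2_nM, fmn_uM, delta_g, kd_ms2_nM=16.0, kd_fmn_uM=0.3):
    """Same signal written as a binding curve with an observed Kd."""
    kd_obs = kd_ms2_nM * (1.0 + math.exp(delta_g / KT) / (1.0 + fmn_uM / kd_fmn_uM))
    return ms2_nM / (ms2_nM + kd_obs)

for fmn in (0.0, 200.0):
    print(fmn, on_switch_signal(100.0, fmn, 2.0), on_switch_closed_form(100.0, fmn, 2.0))
```

Here adding FMN raises the signal at a fixed MS2 concentration, the opposite of the OFF-switch behavior.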
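Switch range
The switch range is the maximum difference in signal for the ON- and the OFF-states. For normalized signals, the difference at a given MS2 concentration is
$$\Delta F = \frac{[\mathrm{MS2}]}{[\mathrm{MS2}] + K_{d,\mathrm{obs}}^{\mathrm{ON}}} - \frac{[\mathrm{MS2}]}{[\mathrm{MS2}] + K_{d,\mathrm{obs}}^{\mathrm{OFF}}},$$
where $K_{d,\mathrm{obs}}^{\mathrm{ON}}$ and $K_{d,\mathrm{obs}}^{\mathrm{OFF}}$ are the observed dissociation constants in the ON- and OFF-states.
The difference is maximized for $[\mathrm{MS2}] = \sqrt{K_{d,\mathrm{obs}}^{\mathrm{ON}} K_{d,\mathrm{obs}}^{\mathrm{OFF}}}$, such that
$$\Delta F_{\max} = \frac{\sqrt{K_{d,\mathrm{obs}}^{\mathrm{OFF}}} - \sqrt{K_{d,\mathrm{obs}}^{\mathrm{ON}}}}{\sqrt{K_{d,\mathrm{obs}}^{\mathrm{OFF}}} + \sqrt{K_{d,\mathrm{obs}}^{\mathrm{ON}}}},$$
or, in terms of the fold-change $K_{d,\mathrm{obs}}^{\mathrm{OFF}} / K_{d,\mathrm{obs}}^{\mathrm{ON}}$,
$$\Delta F_{\max} = \frac{\sqrt{\text{fold-change}} - 1}{\sqrt{\text{fold-change}} + 1}.$$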
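The relation between fold-change and switch range can be checked with a short calculation. The sketch assumes normalized binding curves that share the same maximum signal in the ON- and OFF-states; under that assumption the difference between the two curves peaks at the geometric mean of the two dissociation constants. The function names and the example values (16 nM and 26 times 16 nM) are chosen for illustration only.

```python
import math

def signal_difference(ms2, kd_on, kd_off):
    """ON-state signal minus OFF-state signal at one MS2 concentration."""
    return ms2 / (ms2 + kd_on) - ms2 / (ms2 + kd_off)

def optimal_ms2(kd_on, kd_off):
    """MS2 concentration that maximizes the signal difference."""
    return math.sqrt(kd_on * kd_off)

def max_switch_range(fold_change):
    """Largest achievable signal difference for a given fold-change in Kd."""
    root = math.sqrt(fold_change)
    return (root - 1.0) / (root + 1.0)

kd_on, kd_off = 16.0, 26.0 * 16.0   # illustrative values, in nM
best = optimal_ms2(kd_on, kd_off)
print(best, signal_difference(best, kd_on, kd_off), max_switch_range(kd_off / kd_on))
```

Because the switch range depends only on the fold-change, ranking designs by fold-change also ranks them by their best achievable signal difference.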