Programming Chemical Kinetics: Engineering Dynamic Reaction Networks with DNA Strand Displacement

His work was the intellectual foundation for my efforts to design chemical reaction networks with DNA strand displacement. Bernie Yurke was a collaborator on our work on the biophysics and kinetics of DNA strand displacement.

Motivation and context: the molecular programming perspective

Design complexity (plotted in log scale to the base 10) was evaluated as the number of nucleotides of synthetic DNA incorporated into the experimentally demonstrated system. Information processing takes place in many different flavors in the living world; Figure 1.4 provides three clear examples.

Figure 1.1: The developmental program, like software, is sensitive to small changes. Mutations in the Antennapedia (Antp) gene in the fruit fly Drosophila melanogaster can result in a leg growing out of the head rather than an antenna [4]

Well-mixed chemical systems with complex dynamical behavior

Furthermore, mitotic events occur within minutes of each other in distant parts of the cell. However, with successive cycles, the trigger waves occupied more and more of the tube.

Figure 2 | Rapid, linear propagation of mitotic entry and exit through Xenopus cytoplasm

The language of formal chemical reaction networks

Discrete stochastic CRNs

The model
Computation with discrete stochastic CRNs

Connections between discrete stochastic CRNs and some of these computational models are discussed in Cook et al. The proof is based on the simulation of register machines (known to be Turing universal [82]) with discrete stochastic CRNs.

Continuous deterministic CRNs

So far we have discussed transformations from ODE systems to chemical systems with mass action kinetics. A naturally related question is whether dynamical systems corresponding to physical or electrical systems can be approximated by chemical systems with mass action kinetics.

DNA strand displacement as a candidate architecture

Nucleic acid nanotechnology

In 1982, Ned Seeman proposed that synthetic DNA molecules could be designed to form immobile three-armed and four-armed junctions, which in turn could be used to create three-dimensional lattices [130]. As with Seeman's original goal, several scaling challenges limit the possibility of using DNA molecules to solve NP-hard problems (at least in the way Adleman envisioned) [140].

DNA strand displacement

In particular, one reaction mechanism called toehold-mediated DNA strand displacement [145–148] is a major workhorse of dynamic DNA nanotechnology [8].

Summary of contributions

How does the kinetics of strand displacement depend on the length of the branch migration domain, or on the temperature and buffer conditions? Let us assume that the rate constant for the formation of the toehold base pair is of the order of 10^6 /M/s.

Figure 2.1: (A) Domain notation. Arrows indicate 3’ ends; * indicates Watson-Crick complementarity

Materials, Methods, and Results

Intuitive Energy Landscape model

The final step of successful displacement involves the dissociation of the incumbent (state E) followed by the formation of the final base pair between invader and substrate (state F). Figure 2.3 shows the first base pair of the toehold next to the helix, where it interacts favorably with the adjacent duplex end.

Figure 2.3: Free energy landscape of the IEL at 25 °C for a 6-base toehold. States A-F and the sawtooth amplitude (∆G_s) and plateau height (∆G_p) parameters are described in the text

Secondary structure kinetics model

It is the number of strands in the complex and ∆G_init = ∆G_assoc + ∆G_volume is, as in the IEL, the free energy cost of bringing two separate strands together. The SSK analysis confirms that in order to understand what the IEL parameters ∆G_s and ∆G_p represent, it is necessary to examine features that are not present in the NN model.

Measuring relative stability of branch migration intermediates

Measuring the relative stability of these frozen snapshots is expected to be indicative of the relative free energies of branch migration intermediates. By comparing the free energies of different complexes, we infer the contribution of the poly-T overhangs.

Figure 2.6: (A) Complex Xi:Yj comprises hairpin Xi and strand Yj, with poly-T overhangs of length i and j respectively

Coarse-grained molecular modelling

The red crosses show the free energy as a function of the index of the most advanced base pair between the invading strand and the substrate (base pair 1 is the base pair in the toehold farthest from the incumbent). The contribution of the single-stranded overhangs to the free energy of association ∆G° is expected to be independent.

Figure 2.10: Rate of displacement, as a function of toehold length, observed in simulations (crosses, left axis).

Discussion

However, understanding the process at an effective secondary structure level is useful: oxDNA then justifies tuning the IEL to use an effective sawtooth amplitude significantly larger than the free energy of a single base-pair stack to slow the rate of branch migration. We argue that the slow initiation of branch migration relative to fraying is a key aspect of understanding strand displacement.

Figure 2.15: (A) Free-energy landscape of a system with a two base-pair toehold, with the system prevented from forming other base pairs between invader and substrate and also prevented from having either base pair separation exceed 3.7 nm

Conclusions

Their proposal is essentially an algorithm that, given a set of chemical reaction equations and rate constants, yields a molecular DNA-based implementation in which reactions are mediated via DNA strand displacement cascades. Although my research into the biophysics and kinetics of DNA strand displacement (Chapter 2) started completely independently, there appeared to be a significant synergy between the two projects.

Figure 3.1: Examples of formal CRNs exhibiting different dynamical behaviors in the mass action setting (based on numerical solutions to mass action ODEs)

DNA strand displacement architecture

We then use the CRN-to-DNA scheme described in this chapter to translate the formal CRN into a DNA strand displacement implementation, where the formal species are represented by single DNA strands called "signal" species. In the regime where the fuel species are at high concentration, the signal species approach the dynamics of the formal species in the original CRN.

Figure 3.2: a. Overview of our CRN-to-DNA efforts. We start with a desired dynamical behavior (oscillation, in this case) and a CRN program that captures the desired dynamics

Test case: engineering a strand displacement oscillator

Modeling the DNA implementation

For clarity, Equations 3.1–3.4 specify the chemical reaction equations in the strand displacement level model for the autocatalytic module B + A→2B. Therefore, in principle we should be able to construct oscillatory dynamics that persist as long as the fuel species are in significant excess, even in 'batch reactor' mode where the fuel species are not replenished.
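As a rough illustration of the formal dynamics that the DNA implementation is meant to approximate, the sketch below integrates mass action ODEs for three autocatalytic modules: B + A → 2B, C + B → 2C (the module shown in Figure 4.3), and a third module A + C → 2A. This is a minimal sketch under stated assumptions: the third module, the single shared rate constant k, the dimensionless units, the initial values, and the hand-written Runge-Kutta integrator are all illustrative choices, and the code does not reproduce the strand-displacement-level model of Equations 3.1–3.4.

```python
def rps_derivatives(state, k):
    """Mass action rates for A + B -> 2B, B + C -> 2C, C + A -> 2A.

    The third reaction and the shared rate constant k are assumptions
    made for this illustration.
    """
    a, b, c = state
    ab = k * a * b  # flux through A + B -> 2B
    bc = k * b * c  # flux through B + C -> 2C
    ca = k * c * a  # flux through C + A -> 2A
    return (ca - ab, ab - bc, bc - ca)


def rk4_step(state, k, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(x + h * s for x, s in zip(base, slope))

    k1 = rps_derivatives(state, k)
    k2 = rps_derivatives(shifted(state, k1, dt / 2), k)
    k3 = rps_derivatives(shifted(state, k2, dt / 2), k)
    k4 = rps_derivatives(shifted(state, k3, dt), k)
    return tuple(
        x + dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
        for x, s1, s2, s3, s4 in zip(state, k1, k2, k3, k4)
    )


def simulate(initial, k, dt, steps):
    """Return the list of (A, B, C) states, including the initial one."""
    trajectory = [tuple(initial)]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], k, dt))
    return trajectory


if __name__ == "__main__":
    # Illustrative, dimensionless values (not taken from the experiments).
    for a, b, c in simulate((1.0, 0.5, 0.2), k=1.0, dt=0.01, steps=4000)[::400]:
        print(f"A={a:.3f}  B={b:.3f}  C={c:.3f}")
```

In this idealized formal CRN the total A + B + C is conserved and the oscillation does not decay. The DNA implementation can only approach this behavior while the fuel species remain in significant excess.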

Non-idealities in the DNA implementation

These leak reactions are a direct consequence of the fact that blunt-end strand displacement occurs at a non-zero rate. Strand displacement can then result in the release of the first output of the output gate (here, Cj) and the formation of a spurious species.

Figure 3.11: Modeling the DNA implementation of the oscillator at the level of individual strand displacement and toehold exchange reactions

Sequence design challenges

First, they should be as fast as possible compared to gradual leak pathways, such as blunt-end strand displacement. First, such strong toeholds result in fast strand displacement rates compared to gradual leak rates.

Figure 3.14: Illustrative (but not exhaustive) examples of toehold-only interactions in the molecular implementation of our oscillator

Sequence design 1

That is, the gradual leakage rate does not scale with the concentration of the reactants exactly as we expect a bimolecular process to. These clamps are intended to mitigate some of the gradual leakage paths shown in Figure 3.12, such as the React-Produce gradual leaks in panel (c).

Figure 3.15: Experiment illustrating leaks in Design 1, with Produce CApAq - Helper AAq leaks as an example

Choosing algorithms for sequence design and verification

Heuristics for evaluating sequence designs in silico

The “Top Strand Interactions (TSI)” score is the sum of the interaction scores for each individual pair of top strands (Signal, Flux, Back and Helper strands). The “Toehold Occlusion (TO)” score is the sum of I(t∗, S) for each toehold complement t∗ and top strand S, assuming that S contains no toehold t.
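The two scores can be sketched as plain sums once an interaction score I is fixed. The definition of I is not reproduced in this section, so the `interaction` function below is only a stand-in (the longest stretch of one strand that is Watson-Crick complementary to a stretch of the other). The function names and the example sequences are likewise hypothetical.

```python
from itertools import combinations

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def longest_common_substring(x, y):
    """Length of the longest contiguous substring shared by x and y."""
    best = 0
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        cur = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def interaction(s1, s2):
    """Placeholder for I(s1, s2): longest contiguous complementary stretch.

    This is an assumed stand-in, not the score used in the thesis.
    """
    return longest_common_substring(s1, reverse_complement(s2))


def top_strand_interaction_score(top_strands):
    """TSI: sum of interaction scores over every pair of top strands."""
    return sum(interaction(a, b) for a, b in combinations(top_strands, 2))


def toehold_occlusion_score(toehold_complements, top_strands):
    """TO: sum of I(t*, S) over strands S that do not contain toehold t."""
    total = 0
    for t_star in toehold_complements:
        toehold = reverse_complement(t_star)
        for strand in top_strands:
            if toehold not in strand:  # skip strands designed to bind t*
                total += interaction(t_star, strand)
    return total


if __name__ == "__main__":
    strands = ["GGAAAAGG", "CCAACC"]  # made-up sequences
    print(top_strand_interaction_score(strands))
    print(toehold_occlusion_score(["TTTT"], strands))
```

Swapping in a thermodynamic estimate for `interaction` leaves the two summation functions unchanged, which is the main reason for keeping the score pluggable here.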

Candidate sequence design methods

First, our heuristic measures include measures that focus on false matches at the level of sequence identity, without a thermodynamic or kinetic evaluation of how physically important those false sequence identity matches might be in the test tube. We did not test the performance of the second-generation NUPACK sequence design algorithms in this analysis.

Sequence design 2

In this strategy, there is a trade-off between the ACT alphabet and the prevention of branch migration at the junction in the Produce molecules. With the ACT alphabet, branch migration back and forth of 2 nucleotides around the junction is inevitable as both will have to start with "CC".

Sequence design 3

Kinetics of desired pathways

In addition to the MFE structure, we found that the first two bases of the mA branch migration domain, both G, were almost always bound to one or the other (weak) hairpin. In particular, the first two bases (GG) of the branch migration region are base-paired most of the time, and these base pairs appear as part of several weak hairpins.

Figure 3.19: Based on experiments measuring gradual leak with single-base changes at the positions illustrated (‘ATCC’ in Helper AAq and ‘GGTA’ in Produce CApAq), these bases contribute to the high gradual leak between Produce CApAq and Helper AAq

Sequence design 4

To observe both the consumption of the threshold and the autocatalytic amplification simultaneously in the same sample, we combine the threshold readout with an "Auxiliary readout". The main difference between the semi-quantitative model and the experimental data lies in the shape of the threshold concentration curves.

Figure 3.21: Energy imbalance between the external forward toehold (fA) and the internal backward toehold (fC) causes slow triggering of React ACAp

Displacillator: a de novo strand displacement oscillator

Counteracting damping: Catalytic helper mechanism

The CatHelper strand is nothing but the Helper strand extended at the 5' end with the history domain of the first output of the Produce species (here, hCj). Apart from releasing the second output (here, Ck), the catalytic Helper also displaces the Flux strand through toehold exchange. The Flux strand is then free to interact with more Produce species to release more outputs, thereby effectively "tuning" the output stoichiometry of the desired CRN.

Figure 4.3: Catalytic Helper mechanism. a. Standard produce step for the reaction C + B → 2C facilitated by the traditional “Helper” species

Optimized Displacillator experiments

Two samples are used for each experiment: Sample 1 uses “plain” versions of Helper and CatHelper (indicated by a †), which do not contain fluorophores, and ThA, ThB, and ThC thresholds with fluorophores. In particular, the Produce species in both samples are labeled with a quencher on the bottom strand.

Figure 4.4: Experimental setup for Displacillator experiments. Two samples are used for each experiment: Sample 1 uses “plain” versions of Helper and CatHelper (indicated with a †), which do not contain fluorophores, and thresholds ThA, ThB, and ThC

Inferring signal strand concentrations

Ideal stoichiometry approach
Phenomenological model

For a given autocatalytic module, r_i is interpreted as the average number of reactants consumed per unit consumption of total Helper species for that module. Similarly, p_i is interpreted as the average number of products released per unit consumption of total Helper species for that module.

Figure 4.6: Inferences of A(t), B(t) and C(t) concentrations with the ideal stoichiometry approach outlined in Section 4.2.3.1

Mechanistic model of the Displacillator

Mechanistic model

We now present a mechanistic model of the Displacillator that models each elementary strand displacement and toehold exchange reaction. The mechanistic model predicts much faster oscillations than observed experimentally; the oscillation periods differ substantially.

Mechanistic-occlusion model

Figure 4.9 shows predictions of the mechanistic model and experimental data for one run of the Displacillator. Fixing c_on = 2 × 10^6 /M/s for simplicity, we found the predictions of the mechanistic-occlusion model to be quite sensitive.

Characterizing individual strand displacement and toehold exchange rates

Here we summarize what we learned about sequence design and in silico verification while developing experimental DNA strand displacement systems based on our CRN-to-DNA scheme. We believe that while these design rules are likely particularly relevant to our CRN-to-DNA scheme, the general principles can apply to any DNA strand displacement system.

Table 4.2: Measured rate constants (all in /M /s) for designed strand displacement and toehold exchange reactions in the Displacillator

Challenges in scaling up CRN-to-DNA approaches

Second, scaling up will require (i) better mechanistic understanding of the initial and gradual leaks, so that they can be further mitigated, and (ii) design principles to modify the domain-level specifications of current CRN-to-DNA schemes for increased fault tolerance and robust performance in the face of molecular non-idealities. However, with further work to understand the experimental limitations and their implications for our design pipeline, answers to this question will begin to emerge.

Appendix to Chapter 2

Introduction

Intuitive Energy Landscape model

Here, k_first is the rate at which the first base of the incumbent is displaced by the invader once the toehold is bound. The probability that the invader displaces the first base before the toehold detaches is simply k_first/(k_first + k_r(h)).
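The ratio above is the standard branching probability for two competing first-order processes, and it can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical, and the rate values in the example are made-up numbers chosen to show the trend, not fitted IEL parameters. The dependence of the detachment rate on toehold length h is left to the caller.

```python
def first_step_probability(k_first, k_r):
    """Probability that displacement of the first base (rate k_first)
    happens before toehold detachment (rate k_r).

    Both rates are first-order rates in the same units (for example, /s).
    """
    if k_first < 0 or k_r < 0:
        raise ValueError("rates must be non-negative")
    if k_first + k_r == 0:
        raise ValueError("at least one rate must be positive")
    return k_first / (k_first + k_r)


if __name__ == "__main__":
    k_first = 1.0  # arbitrary illustrative value
    # Made-up detachment rates, decreasing as the toehold gets longer.
    for h, k_r in [(1, 1000.0), (3, 10.0), (6, 0.01)]:
        p = first_step_probability(k_first, k_r)
        print(f"h={h}: P(first base displaced before detachment) = {p:.4f}")
```

When k_r is much larger than k_first, the probability is approximately k_first/k_r, so most binding events end in detachment. When k_r is much smaller, nearly every binding event proceeds to branch migration.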

Figure 5.2: IEL free energy landscape at 25 °C for a 0-base toehold. First, the invader and the substrate-incumbent complex are unconstrained by each other (A)

Augmented Energy Landscape model

The AEL has the same rate model as the IEL, except for transitions involving states in which the toehold is partially formed. In general, these modifications to the IEL result in a self-consistent model with an initial binding rate that is linear in the toehold length.

Secondary structure kinetics model

For toehold lengths less than 15, the toehold of the invader is truncated to the appropriate length, measured from the 5' end. Depending on the length of the invader's toehold, a subset of this overhang is complementary to the toehold.

Figure 5.6: Multistrand simulations at 25 °C with different choices: (A) (i) in treating free energy contributions due to dangles [178] (options “Some”(default), “None” and “All” in the NUPACK [123] energy model [120]) and (ii) with substrate overha

Measuring relative stability of strand displacement intermediates

For each of these samples, two runs of the temperature dependent absorbance experiment described above were performed. For each data set, we perform a simultaneous nonlinear least-squares fit (using the Levenberg-Marquardt algorithm, implemented by a built-in MATLAB function) of the predicted melt fraction curves to the smoothed and normalized absorbance data across all three concentrations that are present in the dataset.

Figure 5.8: Raw absorbance data (at 260 nm), while annealing, at a concentration of 200 nM

Coarse-grained molecular modeling

Assuming that we have obtained a representative set of conditions at each interface, the later stages can be modeled as Bernoulli trials: the probability of success measured after N attempts has a variance of p(1−p)/N, where p is the true probability of success. Umbrella sampling was performed by biasing the system according to the number of base pairs between the substrate and the incumbent strand, and between the substrate and the invading strand.
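The Bernoulli-trial variance quoted above translates into a simple error estimate for each measured success probability. The sketch below is a minimal illustration with one stated assumption: since the true probability is unknown in practice, the observed fraction of successes is plugged in for p. The function name and the example counts are hypothetical.

```python
import math


def success_estimate(successes, attempts):
    """Return (p_hat, standard_error) for a series of Bernoulli trials.

    The variance p(1 - p)/N is evaluated with p replaced by the observed
    fraction p_hat, which is an approximation.
    """
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    if not 0 <= successes <= attempts:
        raise ValueError("successes must lie between 0 and attempts")
    p_hat = successes / attempts
    variance = p_hat * (1.0 - p_hat) / attempts
    return p_hat, math.sqrt(variance)


if __name__ == "__main__":
    p_hat, se = success_estimate(successes=25, attempts=100)  # made-up counts
    print(f"p = {p_hat:.3f} +/- {se:.3f}")
```

Because the standard error scales as 1/sqrt(N), halving the uncertainty on a given stage requires roughly four times as many attempts.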

Table 5.5: Leave-one-concentration-out mean and standard deviation for ∆G° at 25 °C and 55 °C, for each complex

Notes on 1D Landscape Models

IEL(5.3, 2.0)'s predictions are marked with filled circles and solid lines, while predictions of the phenomenological model of Zhang and Winfree [147] are indicated with crosses and dashed lines.

Figure 5.14: The sequence-dependent free energy landscape of strand displacement for a 10-base toehold at 25 °C predicted by efn2 for RNA molecules

Appendix to Chapters 3 and 4

Materials and Methods

Each round of dialysis is expected to achieve an approximately 50-fold reduction as 1 ml of purified multistranded fuel species is dialyzed for 2 hours with approximately 50 ml of TE/Na+ buffer using a 2 ml Thermo Scientific Slide-A-Lyzer MINI dialysis device with a 10K MWCO membrane. The procedure for quantifying multistranded fuel species is essentially identical to the procedure for single strands, except for the calculation of extinction coefficients, which involve corrections for hyperchromicity [219].
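The fold reduction per round follows from simple dilution arithmetic. The sketch below assumes that small molecules equilibrate fully between the sample and the buffer in each round, which is an idealization. The function name and the number of rounds in the example are illustrative and not taken from the protocol.

```python
def dialysis_fold_reduction(sample_ml, buffer_ml, rounds=1):
    """Ideal fold reduction of a freely diffusing solute after dialysis.

    Assumes complete equilibration in every round and fresh buffer each
    time, so each round dilutes by (sample + buffer) / sample.
    """
    if sample_ml <= 0 or buffer_ml < 0 or rounds < 0:
        raise ValueError("volumes must be positive and rounds non-negative")
    per_round = (sample_ml + buffer_ml) / sample_ml
    return per_round ** rounds


if __name__ == "__main__":
    # 1 ml of sample against about 50 ml of buffer, as in the text.
    print(dialysis_fold_reduction(1.0, 50.0))            # one round
    print(dialysis_fold_reduction(1.0, 50.0, rounds=2))  # two rounds (illustrative)
```

With 1 ml of sample and 50 ml of buffer the ideal factor is 51 per round, consistent with the approximately 50-fold reduction quoted above.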

Measuring individual strand displacement and toehold exchange rates

Sequences from Designs 1, 2, 3 and 4

D4_Rep_Flux_BCj_Top /5IAbRQ/CATCTTCCCTCCACCG
D4_Rep_Flux_BCj_Bot AAATGGGCGGTGGAGGGAAGATG/3Rox_N
D4_Rep_Flux_BCj_Top /5IAbRQ/CATCTTCCCTCCACCACCG
D4_RepGGTGGAAAt_Rep
D4_Rep_Flux_CAp_Top /5IAbRQ/CTCTTCACACCACTCT
D4_Rep_Flux_CAp_Bot GTAAAGAAGAGTGGTGTGAAGAG/3Rox_N

These species are added at the end of that experiment to extinguish those persistent Helper species in Design 4.