following “inverse” question: given a finite set of formal chemical reaction equations (with specified rate constants) between formal species X1, X2, ..., Xn, can we design a set of “real” molecules M1, M2, ..., Mp that interact in a well-mixed solution to approximate the mass-action kinetics specified by the formal system?
Of course, this question regarding a general strategy for implementing arbitrary formal CRNs is interesting only if CRNs are capable of exhibiting a wide range of dynamical behaviors. This is indeed the case: given a system of polynomial ODEs with nonnegative integer powers, one may explicitly construct a formal CRN some of whose species will approximate the solution to the system of ODEs on the positive orthant, up to arbitrary accuracy over any time interval [87,88].
In fact, the CRN constructed has particularly nice properties: (i) all reactions conserve mass, (ii) every reaction has at most two reactants and two products, and (iii) no reaction is autocatalytic.
Soloveichik et al. [110] show that, given a formal CRN, it is indeed possible to engineer a molecular implementation that will, assuming certain “fuel” species are in large excess, approximate the mass-action kinetics specified by the formal CRN (up to scaling rate constants). Specifically, they provide a construction which “compiles” any set of formal chemical reactions into a set of DNA strand displacement reactions, which approximate the prescribed dynamics up to arbitrary accuracy. The “DNA implementation” of the formal CRN has a larger set of interacting molecules, some of which represent the formal species of the formal CRN and approximate their dynamics, while the others are auxiliary species that mediate the desired reactions. Following this theoretical advance, other such CRN-to-DNA compilation schemes have been proposed [111,224–226].
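The role of the “large excess” assumption can be illustrated with a deliberately simplified toy model, which is not the construction of [110]: a formal unimolecular reaction B → X with rate constant k is replaced by a single bimolecular step B + Fuel → X + Waste whose rate constant is scaled to q = k / F0, where F0 is the initial fuel concentration. The function names and the one-step fuel model are assumptions made for illustration only. The sketch compares the closed-form solutions of the two systems as the fuel excess grows.

```python
import math

def ideal_decay(b0, k, t):
    """Formal CRN B -> X with rate constant k: B(t) = b0 * exp(-k t)."""
    return b0 * math.exp(-k * t)

def fuel_mediated_decay(b0, f0, k, t):
    """Toy implementation B + Fuel -> X + Waste with rate constant q = k / f0.

    Closed-form solution of dB/dt = -q * B * F, where F = B + (f0 - b0).
    Requires f0 > b0 (fuel in excess of the signal).
    """
    q = k / f0
    d = f0 - b0  # fuel left over once all of B has reacted
    return d * b0 / (f0 * math.exp(q * d * t) - b0)

def max_deviation(b0, f0, k, times):
    """Largest absolute gap between the toy implementation and the formal CRN."""
    return max(abs(fuel_mediated_decay(b0, f0, k, t) - ideal_decay(b0, k, t))
               for t in times)

times = [0.1 * i for i in range(51)]  # t = 0 .. 5, arbitrary units
for excess in (2, 10, 100):
    print(excess, max_deviation(1.0, float(excess), 1.0, times))
```

In this toy model the deviation from the formal kinetics shrinks as the fuel excess increases, because the fuel concentration stays closer to its initial value throughout the reaction.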
We describe our DNA strand displacement architecture, adapted from Soloveichik et al. [110], in Section 3.2.
[Figure 3.2, panels a to c. Panel a: “Desired dynamics” (concentration vs. time); “Molecular program” B + A → 2B, C + B → 2C, A + C → 2A, each with rate constant k; “DNA architecture” showing signal species and fuel species; “DNA dynamics” (concentrations of free A, B, C vs. time). Panel b: domain notation, with example sequences and domains u, b, t and their complements. Panel c: toehold exchange.]
Figure 3.2: a. Overview of our CRN-to-DNA efforts. We start with a desired dynamical behavior (oscillation, in this case) and a CRN program that captures the desired dynamics. We then use the CRN-to-DNA scheme described in this chapter to translate the formal CRN into a DNA strand displacement implementation, where the formal species are represented by single strands of DNA called “signal” species. Desired reactions between signal species are mediated by “fuel” species, which provide both the logic and the free energy for the reaction. Some of the fuel species are multi-stranded complexes which are pre-prepared and purified. In the regime where the fuel species are at high concentration, the signal species approximate the dynamics of the formal species in the original CRN. Our reactions are performed in “batch reactor” mode, which means that fuel species are not replenished. Therefore, the test tube dynamics is expected to deviate from idealized formal CRN dynamics. b. Domain notation. A “domain” comprises contiguously located bases whose binding and unbinding occurs as one logical unit. * indicates Watson-Crick complementarity. Arrows indicate 3’ ends. c. Toehold exchange. “Short” (5-7 nucleotide) domains which bind fleetingly to their complements at room temperature and reversibly co-localize distinct molecules are called “toeholds”. Here toehold t reversibly co-localizes the molecules to form a three-stranded intermediate, where the two b domains can exchange base pairs by a process called three-way branch migration. Eventually, either toehold u dissociates (leading to the products) or toehold t dissociates (leading to the reactants). Notice that the entire process is reversible and toehold u can also carry out toehold exchange.
[Figure 3.3: signal strands Br, Bs, Ap, Xi and Yj drawn at the domain level. Each strand consists of a history domain (the versioning unit, e.g. hBr) and a logical unit (e.g. fB mB sB).]
Figure 3.3: The formal species are represented by single-stranded DNA molecules (signal strands). Each signal strand comprises a history domain in black (versioning unit, e.g. hBr) and a logical unit. The logical unit comprises three domains: the first toehold (e.g. fB), a branch migration region (e.g. mB), and the second toehold (e.g. sB). Signal strands are designed to not interact with each other. Signal strands with the same logical unit (e.g. Br and Bs) represent the same formal species (B) and are designed to behave identically in solution.
behavior that CRNs are capable of (such as oscillations, chaos, etc.), rather than just the steady state end point, remains elusive.
We now describe our attempt to exploit a modified version of the CRN-to-DNA scheme of Soloveichik et al. [110] to engineer prescribed dynamical behaviors in chemical systems. Figure 3.2 provides a pictorial overview of our efforts.
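As a point of reference for the “desired dynamics” in Figure 3.2, the ideal mass-action behavior of the three-reaction oscillator B + A → 2B, C + B → 2C, A + C → 2A can be integrated numerically. The following is a minimal sketch only: the rate constant k = 1, the initial concentrations, the step size and the hand-written Runge-Kutta integrator are all illustrative assumptions, and the model describes the formal CRN, not the DNA implementation.

```python
def oscillator_rhs(state, k=1.0):
    """Mass-action ODEs for B + A -> 2B, C + B -> 2C, A + C -> 2A (all rate k)."""
    a, b, c = state
    return (k * a * (c - b),   # A: consumed by B + A -> 2B, produced by A + C -> 2A
            k * b * (a - c),   # B: produced by B + A -> 2B, consumed by C + B -> 2C
            k * c * (b - a))   # C: produced by C + B -> 2C, consumed by A + C -> 2A

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * d for s, d in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * d for s, d in zip(state, k2)))
    k4 = f(tuple(s + dt * d for s, d in zip(state, k3)))
    return tuple(s + dt / 6.0 * (d1 + 2 * d2 + 2 * d3 + d4)
                 for s, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4))

def simulate(initial, dt=0.01, steps=3000):
    """Integrate the oscillator from `initial` = (A, B, C) and return all states."""
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(rk4_step(oscillator_rhs, trajectory[-1], dt))
    return trajectory

trajectory = simulate((1.2, 0.6, 0.3))
a_values = [state[0] for state in trajectory]
print(min(a_values), max(a_values))
```

For these equations both the total concentration A + B + C and the product A·B·C are constants of motion, so the ideal trajectories are closed orbits. A batch-reactor implementation, in which fuel is consumed and not replenished, would be expected to drift away from such orbits.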
Figure 3.3 illustrates the single-stranded representation of formal species employed by our scheme. Each formal species (e.g. B) is represented by single strands that contain a history domain (in black, e.g. hBr) followed by three logical domains: a first toehold (e.g. fB), a branch migration domain (e.g. mB) and a second toehold (e.g. sB). Strands that have identical logical domains (e.g. Br and Bs) are designed to behave identically in solution, as they both represent formal species B, regardless of their history domain. The reason for this will become clear once the mechanism for implementing reactions is illustrated.
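The representation described above maps naturally onto a small data structure. The sketch below is illustrative only: the class and method names are hypothetical, and the domain labels simply follow the hBr, fB, mB, sB convention of Figure 3.3.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SignalStrand:
    """Domain-level model of a signal strand (hypothetical helper class)."""
    species: str  # formal species, e.g. "B"
    version: str  # label distinguishing history domains, e.g. "r"

    @property
    def name(self):
        return f"{self.species}{self.version}"

    @property
    def history_domain(self):
        return f"h{self.species}{self.version}"

    @property
    def logical_unit(self):
        # first toehold, branch migration domain, second toehold
        s = self.species
        return (f"f{s}", f"m{s}", f"s{s}")

    def domains(self):
        """History domain followed by the three logical domains."""
        return (self.history_domain,) + self.logical_unit

    def same_formal_species(self, other):
        """Strands with identical logical units represent the same formal species."""
        return self.logical_unit == other.logical_unit

Br = SignalStrand("B", "r")
Bs = SignalStrand("B", "s")
print(Br.name, Br.domains(), Br.same_formal_species(Bs))
```

Comparing strands by logical unit alone mirrors the design intent that Br and Bs behave identically in solution despite their different history domains.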
Strands representing formal species (“signal strands”) are designed to have orthogonal domains: they are not supposed to interact with each other directly. Desired reactions between signal strands are mediated by auxiliary species. Some of those auxiliary species are fuel species, which are present in large excess at the beginning of the reaction and perform the dual functions of encoding the logical flow of the desired reactions and providing the free energy required to drive them. This design principle ensures that (i) signal strands do not have any sequence inter-dependence and (ii) if a formal CRN, say CRN1, is extended to CRN2, then the DNA implementation of CRN1 may also be extended to a DNA implementation of CRN2 merely by adding to the test tube the fuel species necessary for the additional reactions in CRN2.
Figure 3.4 illustrates how a general bimolecular reaction of the form B + A → X + Y would be implemented. Logically, the DNA implementation is performed in two steps. First, the “react” step consumes reactants B and A and releases FluxABi if and only if both reactants are present. If one or both of the reactants are absent, no irreversible reactions occur. Next, FluxABi gets consumed in the “produce” step and releases both outputs X and Y. Therefore, taking the react and produce steps together, the reactants B and A have been consumed and the products X and Y have been released. For completeness, we also illustrate how the general unimolecular reaction (B → X), degradation reaction (B → φ), and production reaction (φ → X) are implemented (Figures 3.5, 3.6 and 3.7).
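The bookkeeping behind the react and produce steps can be checked by summing the stoichiometry of the individual strand displacement reactions described in the caption of Figure 3.4. The sketch below is a hedged illustration: the labels "Int1" and "Int2" are hypothetical names for the two unnamed intermediates, reversible steps are listed in their forward direction only, and no kinetics are modeled.

```python
from collections import Counter

# (reactants, products) for each step of the B + A -> X + Y pathway.
# "Int1" and "Int2" are hypothetical labels for the unnamed intermediates.
PATHWAY = [
    (["Br", "ReactBAXi"], ["BackBA", "Int1"]),       # react: reversible toehold exchange
    (["Ap", "Int1"], ["FluxABi", "WasteBrAp"]),      # react: irreversible displacement
    (["FluxABi", "ProduceAXiYj"], ["Xi", "Int2"]),   # produce: reversible release of Xi
    (["HelperXYj", "Int2"], ["Yj", "WasteAXiYj"]),   # produce: irreversible release of Yj
]

def net_change(pathway):
    """Net change in copy number of every species after one pass through the pathway."""
    net = Counter()
    for reactants, products in pathway:
        net.subtract(reactants)
        net.update(products)
    return {name: n for name, n in net.items() if n != 0}

SIGNALS = {"Br", "Ap", "Xi", "Yj"}
net = net_change(PATHWAY)
signal_net = {name: n for name, n in net.items() if name in SIGNALS}
print(signal_net)
```

Restricted to signal strands, the net change is the loss of one Br and one Ap and the gain of one Xi and one Yj, while FluxABi and both intermediates cancel out; the remaining entries are the fuel species consumed and the waste produced.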
The naming scheme we use for the species involved in the reaction pathways in our CRN-to-DNA scheme is both precise and general. By this we mean that, given an arbitrary formal CRN, the naming scheme allows the user to write down the names and molecular specifications for all the species involved in the DNA strand displacement reactions needed to implement the given formal CRN. Moreover, given just the name, the associated molecule can be immediately constructed at the domain level. Essentially, our naming scheme is consistent with a compiler that could be used to generate the DNA implementation for any specified formal CRN.
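To make the compiler analogy concrete, the following sketch generates species names for one bimolecular reaction. It is not the authors' compiler: the function is hypothetical, and the string templates are inferred solely from the names that appear in Figure 3.4 (ReactBAXi, BackBA, FluxABi, ProduceAXiYj, HelperXYj, WasteBrAp, WasteAXiYj), so they should be read as an assumption about the pattern.

```python
def bimolecular_names(b, r, a, p, x, i, y, j):
    """Names for implementing b + a -> x + y (hypothetical helper).

    b, a are the reactant species with version labels r, p;
    x, y are the product species with version labels i, j.
    Templates are inferred from the names shown in Figure 3.4.
    """
    return {
        "reactants": [f"{b}{r}", f"{a}{p}"],
        "products": [f"{x}{i}", f"{y}{j}"],
        "fuels": [
            f"React{b}{a}{x}{i}",
            f"Back{b}{a}",
            f"Produce{a}{x}{i}{y}{j}",
            f"Helper{x}{y}{j}",
        ],
        "flux": f"Flux{a}{b}{i}",
        "wastes": [f"Waste{b}{r}{a}{p}", f"Waste{a}{x}{i}{y}{j}"],
    }

names = bimolecular_names("B", "r", "A", "p", "X", "i", "Y", "j")
for role, value in names.items():
    print(role, value)
```

Called with the symbols used in Figure 3.4, the function reproduces the names in that figure; whether the same templates hold for every reaction type in the scheme is not established here.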
[Figure 3.4, panels a and b. Panel a: signal strands Br (fB mB sB hBr), Ap (fA mA sA hAp), Xi (fX mX sX hXi) and Yj (fY mY sY hYj). Panel b: domain-level diagrams of the React step (Br + Ap → FluxABi + WasteBrAp, with fuel species ReactBAXi and BackBA) and the Produce step (FluxABi → Xi + Yj + WasteAXiYj, with fuel species ProduceAXiYj and HelperXYj).]
Figure 3.4: a. CRN-to-DNA scheme illustrated with the general bimolecular reaction B + A → X + Y. Note that the reactants (Br, Ap) and the products (Xi, Yj) have completely independent sequences. The same mechanism can occur with different versions of the formal species B and A. b. Names of fuel species are enclosed in a dashed box. The reaction is implemented in two steps: React and Produce. The React step is mediated by the ReactBAXi and BackBA fuel species. B reacts with ReactBAXi to reversibly displace BackBA by toehold exchange and produces an intermediate species. Note that this process exposes the previously sequestered toehold fA*. If Ap is present, it can react with the intermediate by strand displacement, using the toehold fA, to irreversibly displace FluxABi and produce WasteBrAp. WasteBrAp has no free toeholds and is therefore inert. FluxABi releases the outputs through the Produce step, which is mediated by the fuel molecules ProduceAXiYj and HelperXYj. FluxABi reacts reversibly with ProduceAXiYj to release the first output Xi and an intermediate species. HelperXYj reacts irreversibly by strand displacement, using the newly exposed toehold fX*, to release the second output Yj and WasteAXiYj.
[Figure 3.5, panels a and b. Panel a: signal strands Br (fB mB sB hBr) and Xi (fX mX sX hXi). Panel b: domain-level diagrams of the React step (Br → FluxBXi + WasteBr, with fuel species ReactBXi) and the Produce step (FluxBXi → Xi + WasteBXi, with fuel species ProduceBXi).]
Figure 3.5: Implementation for the reaction B → X. The same mechanism can occur with different versions of B. Names of fuel species are enclosed in a dashed box.
[Figure 3.6, panels a and b. Panel a: signal strand Br (fB mB sB hBr). Panel b: domain-level diagram of the React step (Br → WasteBr, with fuel species ReactBø).]
Figure 3.6: Molecular implementation for B → φ. The same mechanism can occur with different versions of B. Names of fuel species are enclosed in a dashed box.
[Figure 3.7, panels a and b. Panel a: strand GX (fGX mGX sGX) and signal strand Xi (fX mX sX hXi). Panel b: domain-level diagrams of the React step (GX → FluxGXi + WasteGX, with fuel species ReactGXi) and the Produce step (FluxGXi → Xi + WasteGXi, with fuel species ProduceGXi).]
Figure 3.7: Molecular implementation for φ → X. Names of fuel species are enclosed in a dashed box.