3. Diversity-oriented enzymatic synthesis of cyclopropane building blocks
3.2 Initial activity determination and enhancement via directed evolution
Substituted cyclopropanes are prevalent in pharmaceutical and agrochemical compounds.20–22 The ability to rapidly produce derivatives of a stereopure cyclopropane could be used in lead fragment optimization to assist future drug discovery and development efforts. Boronate ester moieties are ubiquitous in medicinal chemistry due to their robust activity in the presence of a wide range of functional groups and their efficacy in convergent synthesis of complex molecules.12 Therefore, I envisioned that an enzymatically produced pinacolboronate (Bpin)-substituted chiral cyclopropane could be used as a substrate for Suzuki-Miyaura cross-coupling reactions to generate a diverse array of substituted cyclopropanes. In light of our previous studies showing that heme proteins can be engineered to generate cyclopropane-containing pharmaceuticals,23–26 I chose to develop a chemoenzymatic strategy to produce a cyclopropane motif with a functional handle, which could then be derivatized to form substituted chiral cyclopropanes. To realize this approach, I set out to engineer heme proteins to catalyze the stereoselective cyclopropanation of vinylboronic acid pinacol ester via carbene transfer from ethyl diazoacetate (EDA), generating 2-(4,4,5,5-tetramethyl-1,3,2-dioxaborolan-2-yl)-cyclopropanecarboxylic acid ethyl ester (Figure 3-3).
Figure 3-3. Proposed enzymatic reaction of vinylboronic acid pinacol ester (1) and ethyl diazoacetate (2) to form the cyclopropylboronate ester (3).
cytochromes c, and globins (Supplementary Information, Table 3-1). Activity was determined by GC-MS, with the mass fragmentation pattern and retention time of the cyclopropylboronate compared to those of an authentic product standard. The best initial activities for the formation of cis- and trans-cyclopropylboronate were shown by Aeropyrum pernix protoglobin W59A Y60G F145W (ApePgb AGW) and Rhodothermus marinus nitric oxide dioxygenase Q52A (RmaNOD Q52A), respectively. Both variants had been developed during engineering for the cyclopropanation of linear, aliphatic alkenes.28 These proteins were also tested for their ability to use alternative commercially available boronate esters, such as the MIDA ester and the dibutyl ester, but only the pinacol ester showed detectable cyclopropanation activity. The initial stereoselectivity was good, with ApePgb AGW catalyzing the formation of the cyclopropylboronate ester with a 96:4 diastereomeric ratio (dr) and 96% enantiomeric excess (ee, determined by chiral GC-FID).
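The selectivity figures above can be related to the underlying isomer ratios with a short calculation. The following is a minimal sketch using the standard definition ee = (major − minor)/(major + minor); the function names are illustrative and do not come from this work.

```python
from typing import Tuple


def ee_to_ratio(ee_percent: float) -> Tuple[float, float]:
    """Convert an enantiomeric excess (%) into (major %, minor %)."""
    major = (100.0 + ee_percent) / 2.0
    return major, 100.0 - major


def ratio_to_ee(major: float, minor: float) -> float:
    """Enantiomeric excess (%) from two peak areas or percentages."""
    return 100.0 * (major - minor) / (major + minor)


if __name__ == "__main__":
    # 96% ee, as reported for ApePgb AGW, expressed as an enantiomer ratio.
    print(ee_to_ratio(96.0))
```

Under this definition, 96% ee corresponds to a 98:2 ratio of the two enantiomers of the major diastereomer.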
Single site-saturation libraries were generated for ApePgb AGW at amino-acid residues 63, 86, and 90. Initial screening identified apparent hits, but none were confirmed as improved variants upon subsequent validation. In parallel, screening RmaNOD Q52A for enhanced trans-cyclopropylboronate activity yielded variant RmaNOD Y32T Q52A, with 420 TTN and an inversion of diastereoselectivity from 17:83 cis:trans to 90:10 cis:trans (Figure 3-4). RmaNOD Y32T Q52A produced the same major cis-enantiomer as ApePgb AGW with 99% ee, but with higher activity. We therefore used the RmaNOD Y32T Q52A variant for further evolution.
Figure 3-4. Location of residues 32 and 52 in the RmaNOD scaffold, based on the RmaNOD Q52V crystal structure (PDB ID: 6WK3). Residues Y32 and A52 are shown in blue. The heme-bound acetate is omitted for clarity.
Because we had thus far targeted only a small subset of the active-site residues, we expanded the search to include several more. To screen a larger sequence space without increasing the overall screening time of our medium-throughput GC-based method, we reduced redundancy by screening 44 clones per library rather than the usual 88. Screening only 44 clones of a 22-member library still gives 86% library coverage (compared to 98% coverage for 88 clones per library) while allowing for higher sequence-space coverage (Supplementary Information, Figure 3-11).
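The coverage figures quoted above are consistent with the usual estimate for sampling a saturation library with replacement. The following is a minimal sketch, assuming that all 22 library members are equally likely and that each clone is an independent draw. The function names are illustrative, and the text does not state which formula was actually used.

```python
import math


def coverage_exact(n_clones: int, library_size: int) -> float:
    """Expected fraction of equally likely variants sampled at least once."""
    return 1.0 - (1.0 - 1.0 / library_size) ** n_clones


def coverage_poisson(n_clones: int, library_size: int) -> float:
    """Poisson approximation of the same quantity: 1 - exp(-n / V)."""
    return 1.0 - math.exp(-n_clones / library_size)


if __name__ == "__main__":
    for n in (44, 88):
        exact = coverage_exact(n, 22)
        approx = coverage_poisson(n, 22)
        print(f"{n} clones: exact {exact:.3f}, Poisson {approx:.3f}")
```

Under these assumptions, the Poisson approximation gives about 86% for 44 clones and 98% for 88 clones, in line with the quoted values; the exact expression gives a slightly higher figure (about 87%) for 44 clones.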
Using RmaNOD Y32T Q52A as the parent, in a second round of site-saturation mutagenesis we targeted residues M31, F36, Y39, F46, L48, P49, I53, L56, R79, R86, V89, and L101, screening 44 colonies per library (Figure 3-5). This round of engineering yielded beneficial mutations at positions 39, 48, 79, and 86. Residues 39 and 48 are in the distal heme pocket, whereas R79 and R86 coordinate with the heme carboxylate; the underlying cause of the activity enhancement at the latter two positions is unclear. Notably, mutations at R79 enhanced activity at the cost of diastereoselectivity, while mutations at residues 39 and 48 enhanced diastereoselectivity. We therefore recombined the mutations at these positions; the combination of mutations Y32T, Y39H, L48R, Q52A, and R79W was found under screening conditions to improve both activity and stereoselectivity. This variant, RmaNOD THRAW, displays 1300 total turnovers (TTN) while maintaining high dr (94:6 cis:trans) and greater than 99% ee (Figure 3-6a).
Figure 3-5. RmaNOD scaffold (PDB ID: 6WK3) displaying residues targeted in the cis-selective lineage. Residues targeted for mutagenesis are shown as α-carbon spheres. Positions at which the final variant contained a mutation relative to RmaNOD WT are shown in blue; other targeted positions are shown in green.
Figure 3-6. Activity and diastereoselectivity of the cis- and trans-specific lineages for the formation of 3. Variant names are listed as parent protein plus new amino-acid substitution(s). Data points are from biological duplicates of technical triplicates.
In parallel with the engineering that generated the cis-selective variant RmaNOD THRAW, we targeted the RmaNOD scaffold to engineer a trans-selective protein variant. Evolution of the trans-selective lineage was led by my collaborator, Bruce Wittmann. Site-saturation mutagenesis libraries were generated at positions 31, 36, 46, 49, 53, 56, 79, 89, and 101 using RmaNOD Q52A as the parent (Figure 3-7). We screened 44 colonies per library and identified mutation L101N, which improved diastereoselectivity for the trans product without significantly affecting enzyme activity (Figure 3-6b). In the second generation, site-saturation mutagenesis libraries were generated at positions 35, 42, 53, 56, 60, 79, 96, 97, and 125, using RmaNOD Q52A L101N as the parent. Here, we identified beneficial mutations at positions 56 and 60, with L60H as the most beneficial one (defined as enhancing activity the most while retaining diastereoselectivity).
Figure 3-7. RmaNOD scaffold (PDB ID: 6WK3) displaying residues targeted in the trans-selective lineage. Residues targeted for mutagenesis are shown as α-carbon spheres. Positions at which the final variant contained a mutation relative to RmaNOD WT are shown in blue; other targeted positions are shown in green.
Because the activity of RmaNOD Q52A L60H L101N was still low, we continued with two further rounds of site-saturation mutagenesis. Using RmaNOD Q52A L60H L101N as the parent, in the first round, we targeted positions 27, 28, 43, 47, 48, 55, 59, 94, 105, 111, and 121, identifying beneficial mutations at positions 55 and 105, with mutation I105M being the most beneficial. In the final round of site-saturation mutagenesis, we increased the coverage of the first- and second-shell residues by targeting positions 16, 19, 20, 23, 24, 27, 28, 31, 36, 46, 49, 56, 79, 98, 100, 108, 109, 112, 117, and 118, now using RmaNOD Q52A L60H L101N I105M as the parent. From these libraries, we discovered the beneficial mutations L20W, M31F, L56A, and L56I. In a recombination library of these four mutations, we identified RmaNOD L20W Q52A L56I L60H L101N I105M as the combination with the greatest enhancement in activity with 2300 TTN, while maintaining greater than 99:1 trans:cis dr and 99% ee (Figure 3-6b).
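The recombination step can be pictured as enumerating every combination of parent and mutant residues at the three affected positions. The following is a minimal sketch, assuming a full combinatorial mix of the four mutations on the RmaNOD Q52A L60H L101N I105M parent; the text does not specify how the library was constructed, and all names in the code are hypothetical.

```python
from itertools import product

# Residue options at each recombined position: parent residue first, then the
# beneficial substitutions reported in the text. Full combinatorial mixing is
# an assumption made for illustration only.
OPTIONS = {
    20: ("L", "W"),
    31: ("M", "F"),
    56: ("L", "A", "I"),
}
PARENT = {20: "L", 31: "M", 56: "L"}


def enumerate_variants(options=OPTIONS, parent=PARENT):
    """Return each combination as a list of substitutions relative to the parent."""
    positions = sorted(options)
    variants = []
    for residues in product(*(options[p] for p in positions)):
        subs = [
            f"{parent[p]}{p}{r}"
            for p, r in zip(positions, residues)
            if r != parent[p]
        ]
        variants.append(subs)
    return variants


if __name__ == "__main__":
    for subs in enumerate_variants():
        print(" ".join(subs) or "parent")
```

Under this assumption the library contains 12 combinations, including the unmodified parent; the best variant reported above corresponds to the L20W plus L56I combination.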