Toward Rules for 1:1 Polyamide:DNA Recognition 2

Purpose. At the forefront of the endeavor to control gene expression by small molecules is the elucidation of chemical principles for direct read-out of predetermined sequences of double-stranded DNA. Although there exists a significant body of literature on the 1:1 mode of binding (Lown et al., 1986), much of this was carried out before quantitative footprinting methods were introduced to the field. In an effort to characterize more rigorously the 1:1 motif, we address several questions quantitatively: 1) Can a 1:1 recognition code be established, which uses individual ligand residues (e.g. Py, Im, Hp or β) to specify individual Watson•Crick base pairs? 2) How is orientation related to overall DNA sequence type? and 3) What is the effect of ligand size on binding affinity in the 1:1 motif?

Specificity of Py, Im, Hp and β

Approach. The polyamide Im-β-ImPy-β-Im-β-ImPy-β-Dp (2) was chosen as the template to examine the specificity at Im, Py and β residues in an oriented 1:1 complex with DNA. Because Hp/Py pairs were shown to discriminate T•A from A•T in the 2:1 motif, a second ligand, Im-β-ImHp-β-Im-β-ImPy-β-Dp (4), was prepared to explore any possible specificity the Hp residue may have in a 1:1 complex. Specificity at a single and unique carboxamide position was determined by varying a single base pair within the parent sequence context, 5’-AAAGAGAAGAG-3’, to all four Watson•Crick base pairs and comparing the relative affinities for the four possible complexes. To meet this end, three plasmids were cloned, each containing four binding sites: pAU8 (for Im), 5’-AAAGAXAAGAG-3’; pAU15 (for β), 5’-AAAGXGAAGAG-3’; and pAU12 (for Py and Hp), 5’-AAAGAGAXGAG-3’, where X = A, T, G, and C (Figure 13). The Hp-containing polyamide (4) was synthesized by solid phase methods, as described previously (Urbach et al., 1999).

2 The text of this section is taken from Urbach and Dervan, 2001.

Figure 13 Examination of sequence selectivity at a single imidazole (Im), beta alanine (β), pyrrole (Py), or hydroxypyrrole (Hp) position within the parent context, 5’-AAAGAGAAGAG-3’. Imidazole and pyrrole rings are represented as shaded and nonshaded circles, respectively; β-alanines are shown as gray diamonds; and hydroxypyrrole is indicated by a circle containing the letter H.

DNA Binding Affinity and Sequence Specificity. Quantitative DNase I footprinting was carried out for polyamides 2 and 4 on PCR products of pAU8, pAU15, and pAU12 (Figure 14). The variable base pair position was chosen opposite the amino acid residue in question. Specificity of Im: polyamide 2 binds the four DNA sites 5’-AAAGAXAAGAG-3’ (X = A, T, G, C) with similar high affinities, K_a = 1.1 – 2.6 x 10^10 M^-1, revealing that Im tolerates all four base pairs.

[Figure 13 schematic: polyamides 2 and 4 aligned against the variable sites 5'-AAAGAXAAGAG-3' / 3'-TTTCTYTTCTC-5', 5'-AAAGXGAAGAG-3' / 3'-TTTCYCTTCTC-5', and 5'-AAAGAGAXGAG-3' / 3'-TTTCTCTYCTC-5'; X•Y = A•T, T•A, G•C, C•G.]
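The designed sites described in the text can be enumerated programmatically. The following is a minimal sketch, assuming only the three site templates and plasmid names quoted above; the function name `site_variants` and the dictionary `TEMPLATES` are hypothetical helpers introduced for illustration, not part of the original work.

```python
# Enumerate the four variants of each designed binding site together with
# the paired bottom strand, as drawn in the binding schematics.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def site_variants(template, bases="ATGC"):
    """Return {base: (top_strand, bottom_strand)} for a template containing 'X'.

    The top strand is written 5'->3'. The bottom strand is written 3'->5',
    so it is the base-by-base complement of the top strand (not reversed).
    """
    variants = {}
    for base in bases:
        top = template.replace("X", base)
        bottom = "".join(COMPLEMENT[b] for b in top)
        variants[base] = (top, bottom)
    return variants


# Site templates quoted in the text (plasmid names as given there).
TEMPLATES = {
    "pAU8 (Im)": "AAAGAXAAGAG",
    "pAU15 (beta)": "AAAGXGAAGAG",
    "pAU12 (Py, Hp)": "AAAGAGAXGAG",
}

for plasmid, template in TEMPLATES.items():
    for base, (top, bottom) in site_variants(template).items():
        print(f"{plasmid:15s} X={base}  5'-{top}-3'  3'-{bottom}-5'")
```

Because `str.replace` substitutes every `X`, the same helper would also handle a template with several variable positions, each set to the same base.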

Figure 14 Quantitative DNase I footprint titration experiments for polyamide 2 on the 298 bp, 5’-end-labelled PCR product of plasmids pAU8 (A), pAU15 (B), and pAU12 (C), as well as polyamide 4 on the PCR product of pAU12 (D): (A) and (C), lane 1, intact DNA; lane 2, G reaction; lane 3, A reaction; lane 4, DNase I standard; lanes 5 – 14, 300 fM, 1 pM, 3 pM, 10 pM, 30 pM, 100 pM, 300 pM, 1 nM, 3 nM, 10 nM 2, respectively; (B) lane 1, intact DNA; lane 2, G reaction; lane 3, A reaction; lane 4, DNase I standard; lanes 5 – 15, 100 fM, 300 fM, 1 pM, 3 pM, 10 pM, 30 pM, 100 pM, 300 pM, 1 nM, 3 nM, 10 nM 2, respectively; (D) lane 1, intact DNA; lane 2, G reaction; lane 3, A reaction; lane 4, DNase I standard; lanes 5 – 14, 10 pM, 30 pM, 100 pM, 300 pM, 1 nM, 3 nM, 10 nM, 30 nM, 100 nM, 300 nM 4, respectively. Each footprinting gel is accompanied by the following: (above) binding schematic with the mutated position boxed; (left, top) chemical structure of the monomer of interest; and (left, bottom) Langmuir binding isotherms for the four designed sites. θnorm values were obtained using a nonlinear least-squares algorithm.

[Figure 14 graphic, panels (A) through (D): footprinting gels with lanes labelled Intact, G, A, and DNase I; binding schematics for the sites 5'-CCAAAGAXAAGAGAAACC-3' (A), 5'-CCAAAGXGAAGAGAAACC-3' (B), and 5'-CCAAAGAGAXGAGAAACC-3' (C, D); and binding isotherms plotting θapp against [polyamide] for X•Y = A•T, T•A, G•C, C•G.]

Specificity of Py and β: polyamide 2 binds the target sites, 5’-AAAGAGAXGAG-3’ and 5’-AAAGXGAAGAG-3’, respectively (X = A, T, G, C), with high affinity and in both cases displays a preference for A•T and T•A > G•C and C•G by at least a factor of 10. Remarkably, substituting one Py residue with Hp afforded the most specific polyamide (4), which binds the sequences 5’-AAAGAGAXGAG-3’ with a modest loss in affinity, characteristic of the Hp residue (White et al., 1998; Kielkopf et al., 1998b), and with a tenfold single site preference for X = A > T > G/C (Table 2). Langmuir binding isotherms for each complex fit well to an n = 1 Hill equation (see Experimental for equation), which is consistent with a 1:1 ligand:DNA stoichiometry (Figure 14). To establish that the specificity of Im and β is not position dependent, controls were performed on polyamide 2 at different Im (pAU16, 5’-AAAGAGAAXAG-3’) and β positions (pAU13, 5’-AAAGAGXAGAG-3’). The observed complex affinities and sequence specificities were similar to those described above for pAU8 and pAU15, respectively. The more significant effect of position on the sequence specificity of Hp is presented in the next section.
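To make the n = 1 Hill fit concrete, the sketch below evaluates a Hill-type isotherm over a footprinting concentration series. The fitting equation itself is given in the Experimental section and is not reproduced here, so the standard textbook form θ = (K_a[L])^n / (1 + (K_a[L])^n) is assumed; the function name `theta_hill` is hypothetical, and the K_a used is simply the upper end of the range quoted in the text for the Im sites.

```python
def theta_hill(conc, k_a, n=1.0):
    """Fractional site occupancy for a Hill-type isotherm (assumed textbook form).

    conc : free polyamide concentration (M)
    k_a  : equilibrium association constant (M^-1)
    n    : Hill coefficient; n = 1 reduces to the Langmuir isotherm
    """
    x = (k_a * conc) ** n
    return x / (1.0 + x)


# Concentration series listed in the Figure 14 caption for panels (A) and (C).
TITRATION_M = [300e-15, 1e-12, 3e-12, 10e-12, 30e-12,
               100e-12, 300e-12, 1e-9, 3e-9, 10e-9]

# Upper end of the K_a range quoted in the text for the Im sites (M^-1).
K_A = 2.6e10

for conc in TITRATION_M:
    print(f"{conc:9.1e} M   theta = {theta_hill(conc, K_A):.3f}")
```

With n = 1 the site is half occupied when [L] = 1/K_a, which is why a titration spanning several decades on either side of 1/K_a is needed to define the isotherm.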

Table 2 Equilibrium Association Constants, K_a (M^-1)^a,b

Residue    A•T                  T•A                  G•C                  C•G
Im (2)^c   2.5 (±0.2) x 10^10   1.1 (±0.1) x 10^10   2.6 (±0.4) x 10^10   1.3 (±0.3) x 10^10
β (2)      2.4 (±0.1) x 10^10   1.3 (±0.1) x 10^10   4.3 (±1.4) x 10^8    7.8 (±1.9) x 10^8
Py (2)     3.4 (±0.3) x 10^10   1.5 (±0.2) x 10^10   1.8 (±0.2) x 10^9    8.6 (±1.5) x 10^8
Hp (4)     1.6 (±0.2) x 10^9    1.3 (±0.2) x 10^8    4.9 (±0.8) x 10^7    1.0 (±0.3) x 10^7

a Values reported are the mean values from at least three DNase I footprint titration experiments, with the standard deviation given in parentheses. b Assays were performed at 22 °C in a buffer of 10 mM Tris•HCl, 10 mM KCl, 10 mM MgCl2, and 5 mM CaCl2 at pH 7.0. c The number in parentheses indicates the compound containing the unique residue.
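The fold preferences quoted in the text follow directly from ratios of association constants. As a small worked example, the sketch below uses the Im and Hp values of Table 2, the two rows whose values are also described in the running text; the helper name `fold_preferences` is hypothetical.

```python
# K_a values (M^-1) for the Im and Hp residues, as listed in Table 2.
KA = {
    "Im": {"A•T": 2.5e10, "T•A": 1.1e10, "G•C": 2.6e10, "C•G": 1.3e10},
    "Hp": {"A•T": 1.6e9, "T•A": 1.3e8, "G•C": 4.9e7, "C•G": 1.0e7},
}


def fold_preferences(row):
    """Return the best-bound base pair and its K_a ratio over each other base pair."""
    best = max(row, key=row.get)
    ratios = {bp: row[best] / ka for bp, ka in row.items() if bp != best}
    return best, ratios


for residue, row in KA.items():
    best, ratios = fold_preferences(row)
    summary = ", ".join(f"{bp}: {ratio:.1f}x" for bp, ratio in ratios.items())
    print(f"{residue}: best site {best}; preference over {summary}")
```

For Hp the A•T over T•A ratio comes out slightly above 12, in line with the roughly tenfold single-site preference described above, whereas every Im ratio stays below 3.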

Discussion. This quantitative study helps to elucidate the current state of the art for 1:1 polyamide:DNA complexes and creates a baseline for future specificity studies. An Im residue binds each of the four Watson•Crick base pairs with high affinity, whereas a β or Py residue prefers A,T over G,C base pairs but does not discriminate A from T. Steric inhibition between the exocyclic amino group of guanine and the Py and β residues may explain their A,T preference, as previously suggested by Dickerson from x-ray structural analysis of 1:1 complexes. Based on the study of netropsin bound in a 1:1 complex with DNA, the promiscuous nature of the Im residue accepting G,C as well as A,T was anticipated (Kopka et al., 1985). The unanticipated result of this study is the observation that an Hp residue in this polyamide sequence context can distinguish one of the four Watson•Crick base pairs. Whether the hydroxyl moiety lies asymmetrically in the cleft between A and T and makes a specific hydrogen bond to the O2 of T, as observed for Hp/Py recognition of T•A (Kielkopf et al., 1998b), is unclear. However, the fact that a single aromatic carboxamide residue can select one of the four Watson•Crick base pairs within the 1:1 motif is an encouraging step toward a set of rules for DNA recognition similar to the 2:1 motif. Whether new aromatic residues can be invented to complete a 1:1 recognition code is addressed in the next section.

Figure 15 Family of five-membered aromatic heterocycles.

Specificity of Novel Heterocyclic Amino Acids

Approach. In an effort to improve upon the 1:1 recognition code established in the previous section, we address the issues of whether new heterocycles can be developed to discriminate between the four Watson-Crick base pairs in the 1:1 motif, and whether we can understand the relationship between overall heterocycle structure and DNA sequence specificity. This specificity has been attributed largely to the unique functionality presented by each heterocycle to the floor of the minor groove. However, little has been done to assess the ramifications of functional groups pointing away from the DNA.

Figure 15 shows a family of five-membered aromatic, heterocyclic residues grouped into columns by the type of functionality directed toward the DNA minor groove. Py and 1H-pyrrole (Nh) project a hydrogen with positive potential toward the DNA; Im, 5-methylthiazole (Nt), and furan (Fr) project an sp² lone pair from nitrogen or oxygen; Hp and 3-hydroxythiophene (Ht) project a hydroxyl group; and 4-methylthiazole (Th) and 4-methylthiophene (Tn) project an sp² lone pair from sulfur. Comparative analysis of new residues within this five-membered heterocyclic framework should enable us to retain overall ligand morphology and to observe the effects of small structural changes, such as single atom substitution, on DNA base pair specificity. Specificity at the unique carboxamide position was determined by varying a single base pair (X) within the sequence context 5’-AAAGAXAAGAG-3’ to all four Watson•Crick base pairs and comparing the relative affinities for the four possible complexes. Ab initio computational modeling of the heterocyclic amino acids was implemented to derive their inherent geometric and electronic parameters. The combination of these techniques has provided an interesting perspective on the origin of DNA sequence discrimination by polyamides.

3 The text of this section is taken from Marques et al., in preparation.

Synthesis. Synthesis of Boc-protected Nh, Fr, Ht, Nt, and Tn amino acids and the corresponding polyamides Im-β-ImPy-β-X-β-ImPy-β-Dp (X = unique heterocycle) required new solution and solid-phase synthetic methodologies, which were developed by Michael Marques and Raymond Doss and will be reported elsewhere (Marques et al., in preparation). The structures of polyamides Im-β-Im-Py-β-Py-β-Im-Py-β-Dp (5), Im-β-Im-Py-β-Hp-β-Im-Py-β-Dp (6), Im-β-Im-Py-β-Nh-β-Im-Py-β-Dp (7), Im-β-Im-Py-β-Ht-β-Im-Py-β-Dp (8), Im-β-Im-Py-β-Fr-β-Im-Py-β-Dp (9), Im-β-Im-Py-β-Nt-β-Im-Py-β-Dp (10), Im-β-Im-Py-β-Tn-β-Im-Py-β-Dp (11), and Im-β-Im-Py-β-Th-β-Im-Py-β-Dp (12), together with the parent Im polyamide (2), are shown in Figure 16.

Figure 16 Chemical structures for 1:1 polyamides containing novel heterocyclic residues, with the variable positions in the central ring indicated by the letters X, Y, and Z. A binding model is shown at top with the variable polyamide position indicated by a circle containing the letter X proximal to the variable base pair, X•Y.

Figure 16 legend: binding site 5'-AAAGAXAAGAG-3' / 3'-TTTCTYTTCTC-5' (X•Y = A•T, T•A, G•C, C•G); ring positions X, Y, and Z of the central heterocycle for each polyamide:
(2) Im-β-ImPy-β-Im-β-ImPy-β-Dp, X = N, Y = N-Me, Z = C-H
(5) Im-β-ImPy-β-Py-β-ImPy-β-Dp, X = C-H, Y = N-Me, Z = C-H
(6) Im-β-ImPy-β-Hp-β-ImPy-β-Dp, X = C-OH, Y = N-Me, Z = C-H
(7) Im-β-ImPy-β-Nh-β-ImPy-β-Dp, X = N-H, Y = C-H, Z = C-H
(8) Im-β-ImPy-β-Ht-β-ImPy-β-Dp, X = C-OH, Y = S, Z = C-H
(9) Im-β-ImPy-β-Fr-β-ImPy-β-Dp, X = O, Y = C-H, Z = C-H
(10) Im-β-ImPy-β-Nt-β-ImPy-β-Dp, X = N, Y = C-Me, Z = S
(11) Im-β-ImPy-β-Tn-β-ImPy-β-Dp, X = S, Y = C-Me, Z = C-H
(12) Im-β-ImPy-β-Th-β-ImPy-β-Dp, X = S, Y = C-Me, Z = N

DNA Binding Affinity and Sequence Specificity. Quantitative DNase I footprinting was carried out for polyamides 5 – 12 on the 298 bp PCR product of pAU8 (Figures 17 and 18). The variable base pair position was installed opposite the amino acid residue in question. Equilibrium association constants (K_a) for 1:1 polyamides containing Im, Py, and Hp residues tested against the four Watson-Crick base pairs were discussed in the previous section. However, in that study only the Im specificity experiment was performed at the more flexible central residue, as with the new polyamides reported here. Therefore, new polyamides containing Py and Hp residues at the central position have been included in this study for a more controlled comparison. Polyamide 5 (Py) binds with very high affinity (K_a ~ 6 x 10^10 M^-1) at the X = A, T sites (5'-AAAGAXAAGAG-3') with a 5- to 10-fold preference over X = G, C (Table 3). Polyamide 6 (Hp) binds with lower affinity (K_a ~ 3 x 10^9 M^-1) but with similar specificity to 5, preferring X = A, T > G, C by 5- to 10-fold. The Nh-containing polyamide (7) bound with very high affinity to the X = A, T sites (K_a = 7.5 x 10^10 M^-1) but with a mere 3- to 5-fold selectivity over the high-affinity X = G, C sites. Compound 8 (Ht) bound with subnanomolar affinities to the X = A, T sites, similar to 6 but with ≥ 40-fold specificity for X = A, T > G, C. Polyamide 9 (Fr) showed high affinity for the X = A, T sites (K_a ~ 10^10 M^-1) with a small 2- to 4-fold preference over X = G, C. The 5-methylthiazole-containing polyamide (10, Nt), which places the thiazole ring nitrogen into the floor of the minor groove, bound all four sites with similar high affinities (K_a ~ 5 x 10^9 M^-1). Thiophene-containing polyamide (11, Tn) showed modest single-site specificity, binding the X = A site at K_a = 3.0 x 10^10 M^-1 with 5-fold preference over X = T and ~ 70-fold preference over X = G, C. The 4-methylthiazole-containing polyamide (12, Th), which places the thiazole ring sulfur into the floor of the minor groove, bound with similar X = A, T affinity as 11 (Tn) but with > 400-fold preference over X = G, C. In all cases, binding isotherms fit well to an n = 1 Hill equation, which is consistent with a 1:1 polyamide:DNA stoichiometry (Figures 17 and 18).

Figure 17 (A-D) Quantitative DNase I footprint titration experiments for polyamides 5-8, respectively, on the 298 bp, 5’-end-labelled PCR product of plasmid pAU8: (A and B) lane 1, intact DNA; lane 2, G reaction; lane 3, A reaction; lane 4, DNase I standard; lanes 5-15, 100 fM, 300 fM, 1 pM, 3 pM, 10 pM, 30 pM, 100 pM, 300 pM, 1 nM, 3 nM, 10 nM polyamide, respectively. (C) lane 1, intact DNA; lane 2, G reaction; lane 3, A reaction; lane 4, DNase I standard; lanes 5-15, 1 pM, 3 pM, 10 pM, 30 pM, 100 pM, 300 pM, 1 nM, 3 nM, 10 nM, 30 nM, 100 nM polyamide, respectively. (D) lane 1, intact DNA; lane 2, G reaction; lane 3, A reaction; lane 4, DNase I standard; lanes 5-15, 3 pM, 10 pM, 30 pM, 100 pM, 300 pM, 1 nM, 3 nM, 10 nM, 30 nM, 100 nM, 300 nM polyamide, respectively. Each footprinting gel is accompanied by the following: (left, top) chemical structure of the residue of interest; and (left, bottom) Langmuir binding isotherm for the four designed sites. θnorm values were obtained using a nonlinear least-squares fit.

Figure 18 (A-D) Quantitative DNase I footprinting experiments for polyamides 9-12, respectively, on the 298 bp, 5’-end-labelled PCR product of plasmid pAU8: lane 1, intact DNA; lane 2, G reaction; lane 3, A reaction; lane 4, DNase I standard; lanes 5-15, 100 fM, 300 fM, 1 pM, 3 pM, 10 pM, 30 pM, 100 pM, 300 pM, 1 nM, 3 nM, 10 nM, respectively. Each footprinting gel is accompanied by the following: (left, top) chemical structure of the residue of interest; and (left, bottom) Langmuir binding isotherm for the four designed sites. θnorm values were obtained using a nonlinear least-squares fit. Isotherms for C and D were generated from gels run out to a final concentration of 1 µM (not shown).

Table 3 Equilibrium Association Constants, K_a (M^-1)^a,b for Polyamides Containing Novel Heterocycles (X) within Im-β-ImPy-β-X-β-ImPy-β-Dp

Ring       A•T                  T•A                  G•C                  C•G
Im (2)^c   2.5 (±0.2) x 10^10   1.1 (±0.1) x 10^10   2.6 (±0.4) x 10^10   1.3 (±0.3) x 10^10
Py (5)     7.2 (±0.3) x 10^10   5.3 (±0.1) x 10^10   3.2 (±0.4) x 10^9    9.4 (±0.2) x 10^9
Hp (6)     3.9 (±0.1) x 10^9    2.5 (±0.3) x 10^9    5.3 (±0.5) x 10^8    1.9 (±0.5) x 10^8
Nh (7)     7.5 (±0.2) x 10^10   7.4 (±0.1) x 10^10   1.6 (±0.2) x 10^10   2.3 (±0.1) x 10^10
Ht (8)     2.8 (±0.5) x 10^9    1.6 (±0.6) x 10^9    3.8 (±1.3) x 10^7    3.7 (±0.7) x 10^7
Fr (9)     2.2 (±0.5) x 10^10   1.0 (±1.3) x 10^10   4.4 (±0.5) x 10^9    5.0 (±0.5) x 10^9
Nt (10)    5.4 (±0.9) x 10^9    2.9 (±0.6) x 10^9    8.0 (±1.3) x 10^9    4.2 (±0.6) x 10^9
Tn (11)    3.0 (±0.2) x 10^10   5.7 (±0.4) x 10^9    8.1 (±0.4) x 10^7    8.3 (±0.2) x 10^7
Th (12)    1.5 (±0.2) x 10^10   3.0 (±0.7) x 10^9    6.5 (±0.5) x 10^6    7.4 (±0.5) x 10^6

a Values reported are the mean values from at least three DNase I footprint titration experiments, with the standard deviation given in parentheses. b Assays were performed at 22 °C in a buffer of 10 mM Tris•HCl, 10 mM KCl, 10 mM MgCl2, and 5 mM CaCl2 at pH 7.0. c The number in parentheses indicates the compound containing the unique residue.

Figure 19 Geometric and electrostatic profiles for nine heterocyclic amino acids, derived from ab initio molecular modeling calculations using Spartan Essential software (Wavefunction, Inc.). (Top) Schematic illustrating the amide-ring-amide angle of curvature, θ. X, Y, and Z denote variable functionality at the different ring positions for each heterocycle. (Bottom) Table listing the functional groups at X, Y, and Z, along with the angle θ, and the electrostatic partial charge on X. For Ht, Nh, Py, and Hp, the positive charge on X is listed for the H atom.
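The fold preferences reported in Table 3 can also be restated as free energy differences. The sketch below is not part of the original analysis; it assumes the standard relation ΔG° = -RT ln K_a at the stated assay temperature of 22 °C, and the function names `delta_g` and `delta_delta_g` are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1
T_KELVIN = 295.15     # 22 °C, the assay temperature given in the table footnote


def delta_g(k_a, temperature=T_KELVIN):
    """Standard binding free energy (kcal/mol) from an association constant (M^-1)."""
    return -R_KCAL * temperature * math.log(k_a)


def delta_delta_g(k_a_preferred, k_a_other, temperature=T_KELVIN):
    """Free energy penalty (kcal/mol) of the weaker site relative to the preferred one."""
    return delta_g(k_a_other, temperature) - delta_g(k_a_preferred, temperature)


# Table 3 values for polyamide 12 (Th): X = A (A•T) versus X = G (G•C).
KA_TH_AT = 1.5e10
KA_TH_GC = 6.5e6

print(f"dG(A•T)  = {delta_g(KA_TH_AT):.1f} kcal/mol")
print(f"ddG(G•C) = {delta_delta_g(KA_TH_AT, KA_TH_GC):.1f} kcal/mol")
```

Under these assumptions the roughly 2300-fold difference between the A•T and G•C sites for Th corresponds to a penalty of about 4.5 kcal/mol, while a 10-fold preference corresponds to about 1.35 kcal/mol.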

Calculations. Molecular modeling calculations were performed by Michael Marques using the Spartan Essential software package (Wavefunction, Inc.). Each ring was first minimized using an AM1 model, followed by ab initio calculations using the Hartree-Fock model and a 6-31G* polarization basis set. Each heterocycle exhibited a unique geometric and electronic profile (Figure 19). Bonding geometries for imidazole, pyrrole, and 3-hydroxypyrrole were in excellent agreement with coordinates derived from x-ray structures of polyamides containing these heterocycles (Kielkopf et al., 1998a; 1998b). The overall curvature of each monomer was calculated to be the sweep angle (θ) created by the theoretical intersection of the two ring-to-amide bonds in each ring. The structures were ranked by increasing θ as follows: Fr < Nt < Ht < Nh < Im < Py < Hp < Tn < Th. The ring atom in closest proximity to the floor of the DNA minor groove was examined for partial charge. The structures were ranked by decreasing partial charge on this atom as follows: Hp > Ht > Nh > Py > Tn > Th > Fr > Nt > Im.

Figure 19 (bottom), tabulated values:

Ring   X      Y      Z      θ (degrees)   Charge on X (e)
Fr     O      C-H    C-H    126           -0.31
Nt     N      C-Me   S      127           -0.60
Ht     O-H    S      C-H    133           +0.40
Nh     N-H    C-H    C-H    136           +0.34
Im     N      N-Me   C-H    137           -0.71
Py     C-H    N-Me   C-H    146           +0.21
Hp     O-H    N-Me   C-H    148           +0.50
Tn     S      C-Me   C-H    149           -0.21
Th     S      C-Me   N      153           -0.25
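The sweep angle described above can be computed from atomic coordinates as the angle between the two ring-to-amide bond vectors. The sketch below is an illustration only: the function `sweep_angle` is hypothetical, and the coordinates are those of an idealized regular pentagon with radial exocyclic bonds, not output from the Spartan calculations.

```python
import math


def sweep_angle(ring_a, amide_a, ring_b, amide_b):
    """Angle in degrees between two ring-to-amide bond vectors.

    Each argument is an (x, y) or (x, y, z) coordinate tuple. The bond vectors
    point from the ring atom to the attached amide atom; extending both bonds
    back into the ring gives two lines that meet at this angle.
    """
    v1 = [e - s for s, e in zip(ring_a, amide_a)]
    v2 = [e - s for s, e in zip(ring_b, amide_b)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    # Clamp to guard against rounding just outside the domain of acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def pentagon_point(index, radius):
    """Point on a circle at the angular position of vertex `index` of a regular pentagon."""
    angle = math.radians(90.0 + 72.0 * index)
    return (radius * math.cos(angle), radius * math.sin(angle))


# Hypothetical geometry: substituents on vertices 0 and 2, bonds pointing radially outward.
ring_a, amide_a = pentagon_point(0, 1.0), pentagon_point(0, 2.0)
ring_b, amide_b = pentagon_point(2, 1.0), pentagon_point(2, 2.0)

print(f"{sweep_angle(ring_a, amide_a, ring_b, amide_b):.1f}")  # 144.0 for this idealized geometry
```

The idealized pentagon gives 144 degrees; deviations from that value in a real heterocycle reflect unequal bond lengths and ring angles, which is the kind of variation the θ values in Figure 19 capture.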

Discussion. The single-subunit•DNA complexes of the 1:1 motif provide a relatively flexible system for the exploration of novel recognition elements. Due to the conformational freedom imparted by the β residues, changes in heterocycle geometry do not have as much of an impact on DNA sequence recognition as in the hairpin motif (Marques et al., in preparation). In fact, all 1:1 compounds in this study bind with high affinity to the X = A, T sites but with varying degrees of X = A, T > G, C specificity. The high-resolution solution structure presented later in this Thesis reveals an important register of amide NH groups with the purine N3 and pyrimidine O2 groups on the floor of the DNA minor groove (Urbach et al., 2002). Given this alignment as a driving force for DNA recognition in the 1:1 motif, one may view the subtle differences in heterocycle curvature as merely placing the central ring atom (X in Figure 19) closer to or farther from the DNA. In this view, increasing the ring curvature decreases the polyamide-DNA intimacy, thereby diminishing DNA specificity. The results presented here fit well within this picture.

Polyamides 5 (Py) and 7 (Nh) present a hydrogen with a positive potential to the minor groove floor. Both compounds exhibit a modest 3- to 5-fold selectivity for X = A, T > G, C, but 7 binds with higher affinity to all sites. The selectivity is probably due to the negative steric X-H to G-NH₂ interaction (X = C3 for Py and N1 for Nh), which was predicted by Dickerson and coworkers for netropsin and supported by NMR studies discussed later in this Thesis (Kopka et al., 1985; Urbach et al., 2002). The higher affinity for 7 may be attributed to a combination of greater positive charge on N1-H and higher ring curvature, both of which should reduce specificity.

Polyamides 2 (Im), 9 (Fr), and 10 (Nt) present a small atom with an sp² lone pair directed toward the minor groove floor. Polyamide 2 was discussed in the previous section, binding all sites with high affinity and displaying virtually no discrimination between sites. 9 and 10 behave quite similarly. It is likely that the small atom (N for Im and Nt or O for Fr) presented to the DNA provides no steric clash with G-NH₂, and therefore all sites are bound with similarly high affinities.

Polyamides 6 (Hp) and 8 (Ht) present a hydroxyl group to the DNA minor groove. Previously in both hairpin and 1:1 systems, hydroxypyrrole successfully discriminated between A•T and T•A base pairs (White et al., 1998; Urbach and Dervan, 2001). Yet when flanked on both sides by β-alanine residues, as with the Hp compound presented here, single base-pair specificity is lost. This loss may be attributed to a larger degree of conformational freedom afforded to the Hp ring by the two flanking aliphatic linkers (Urbach et al., 2002). Nonetheless, both 6 and 8 exhibit significant X = A, T > G, C specificity, as expected from a negative 3C-OH to G-NH₂ steric clash.

Polyamides 11 (Tn) and 12 (Th) present a sulfur atom with an sp² lone pair to the DNA minor groove. These compounds exhibit substantial X = A, T > G, C specificity ranging from ≥ 70 to ≥ 2300-fold. This remarkable selectivity may be attributed to the decreased curvature of thiazole and thiophene rings, which forces a more intimate interaction of the large sulfur atom and the minor groove floor. In the case of X = G and C, this interaction is very negative, resulting in complete loss in measurable binding affinity.

Sequence Dependence of Polyamide Orientation

Approach. Because Im binds all four base pairs at a single position, it would be interesting to ask what happens if one varies simultaneously the base pairs proximal to all four Im residues. To meet this end, the plasmid pAU18 was prepared, which contains the four binding sites 5'-AAAXAXAAXAXAAA-3' (X = G, C, A, and T). Equilibrium association constants were determined for the complexes, and the orientation at each site was determined by affinity cleaving experiments.

DNA Binding Affinity and Sequence Specificity. Quantitative DNase I footprint titrations were performed to determine the equilibrium association constants for polyamide 2 at the designed sites on pAU18 (Figure 20). A slight preference for the X = G site is revealed, although all measured sites are bound with high affinity. Affinity cleavage analysis with polyamide 2 confirms a single orientation when X = G (Figure 20), consistent with 1:1 binding, which reverses when X = C, similar to observations made by Laemmli and coworkers (Janssen et al., 2000a). Although the X = A site lacks a DNase I footprint because of a characteristic lack of cleavage at A-tracts by DNase, the cleavage pattern of 2 is visible by affinity cleavage and is oriented to the 5' side of this binding site, similar to X = G. The cleavage pattern is broader than that observed at the other sites, likely because of an ensemble of slipped complexes. The X = T site reveals cleavage, however, on both sides of the binding site and to different extents.
