Libraries and Learning Services
University of Auckland Research Repository, ResearchSpace
Version
This is the publisher’s version. This version is defined in the NISO recommended practice RP-8-2008 http://www.niso.org/publications/rp/
Suggested Reference
Kwon, H., Squire, C. J., Young, P. G., & Baker, E. N. (2014). Autocatalytically generated Thr-Gln ester bond cross-links stabilize the repetitive Ig-domain shaft of a bacterial cell surface adhesin. Proceedings of the National Academy of Sciences of USA, 111(4), 1367-1372. doi:10.1073/pnas.1316855111
Copyright
Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. Previously published items are made available in accordance with the copyright policy of the publisher.
For more information, see General copyright, Publisher copyright,
SHERPA/RoMEO.
Autocatalytically generated Thr-Gln ester bond cross-links stabilize the repetitive Ig-domain shaft of a bacterial cell surface adhesin
Hanna Kwon, Christopher J. Squire1, Paul G. Young, and Edward N. Baker1
Maurice Wilkins Centre for Molecular Biodiscovery and School of Biological Sciences, University of Auckland, Auckland 1010, New Zealand Edited by S. James Remington, University of Oregon, Eugene, OR, and accepted by the Editorial Board November 18, 2013 (received for review September 6, 2013)
Gram-positive bacteria are decorated by a variety of proteins that are anchored to the cell wall and project from it to mediate colonization, attachment to host cells, and pathogenesis. These proteins, and protein assemblies, such as pili, are typically long and thin yet must withstand high levels of mechanical stress and proteolytic attack. The recent discovery of intramolecular isopep- tide bond cross-links, formed autocatalytically, in the pili from Streptococcus pyogeneshas highlighted the role that such cross- links can play in stabilizing such structures. We have investigated a putative cell-surface adhesin fromClostridium perfringenscom- prising an N-terminal adhesin domain followed by 11 repeat domains. The crystal structure of a two-domain fragment shows that each domain has an IgG-like fold and contains an unprece- dented ester bond joining Thr and Gln side chains. MS confirms the presence of these bonds. We show that the bonds form through an autocatalytic intramolecular reaction catalyzed by an adjacent His residue in a serine protease-like mechanism. Two buried acidic residues assist in the reaction. By mutagenesis, we show that loss of the ester bond reduces the thermal stability drastically and increases susceptibility to proteolysis. As in pilin domains, the bonds are placed at a strategic position joining the first and last strands, even though the Ig fold type differs. Bioinformatic analy- sis suggests that similar domains and ester bond cross-links are widespread in Gram-positive bacterial adhesins.
intramolecular ester bond
|
protein stabilityA
striking feature of globular proteins is that despite the chemical diversity inherent in the side chains of their constituent amino acids, chemical reactions between these side chains are very rare.This may be explained by evolutionary selection, which minimizes reactions that could prejudice proper protein folding. Thus, the only common example of a covalent cross-link between protein side chains is the disulfide bond, which forms only in an appropriate redox en- vironment when two Cys residues are brought together by protein folding. Nevertheless, some surprising examples of unexpected cross- links have been brought to light by protein structure analysis or by the observation of unusual spectroscopic or biophysical properties.
Examples include the Cys-Tyr bond in galactose oxidase (1), which provides a radical center; similar bonds in some catalases (2); the His-Tyr bond in cytochrome C oxidase (3); and the remarkable chro- mophore of GFP (4). These, and other examples, arise through intramolecular reactions facilitated by particular local environments.
The recent discovery of isopeptide bonds joining Lys and Asn side chains in the proteins that make up pili on the Gram-posi- tive bacterium Streptococcus pyogenes(5), as well as on other Gram-positive pathogens (6), has highlighted a class of proteins in which intramolecular cross-links seem to be remarkably prev- alent. It includes not only Gram-positive pili but a number of other cell surface adhesins, known as microbial surface components rec- ognizing adhesive matrix molecules (MSCRAMMs) (7). Examples of the latter include the collagen-binding A domain and repetitive B domains from theStaphylococcus aureuscollagen-binding surface protein Cna (8, 9), the fibronectin-binding protein FbaB from
S. pyogenes(10), and the adhesin SspB fromStreptococcus gordonii (11). In contrast to the Gram-positive pili, which are assembled from discrete protein subunits (pilins) by sortase enzymes (12), the MSCRAMMs are typically single polypeptides folded into many domains. What both pili and MSCRAMMs have in common is that they are very long and thin but also subject to large mechanical shear stresses and protease-rich environments.
The pilus components and MSCRAMMs share a common domain organization: an N-terminal signal sequence followed by the protein segment that is to be displayed; a sorting motif (LPXTG or similar) that is processed by a sortase that attaches the protein to the cell wall or incorporates it into pili; and a C-terminal hydrophobic transmembrane segment and short, positively charged tail (13). MSCRAMMs commonly possess an N-terminal functional region followed by a repetitive series of domains that provide a supporting “stalk” that holds the func- tional domain(s) away from the cell surface (9).
Isopeptide bonds, both Lys-Asn and Lys-Asp, now appear to be common in the Ig-like domains that make up the shafts, or stalks, of these structures, providing tensile strength and stability along the length of the assembly (14). These bonds form spon- taneously on protein folding; the hydrophobic environment lowers the pKaof the lysine residue, enabling its nucleophilic attack on
Significance
We describe an unprecedented type of intramolecular cross- link in a protein molecule, which we have found in the repeti- tive domains of a cell surface adhesin from the Gram-positive organismClostridium perfringens. From high-resolution crystal structures of the protein, coupled with MS, we show that these domains contain intramolecular ester bonds joining Thr and Gln side chains. These bonds are generated autocatalytically by a serine protease-like mechanism and provide the long, thin protein with greatly enhanced mechanical strength and pro- tection from proteolytic attack. The bonds provide an intriguing parallel with the internal isopeptide bonds that stabilize Gram- positive pili. Bioinformatics analysis suggests that these intra- molecular ester bonds are widespread and common in cell sur- face adhesion proteins from Gram-positive bacteria.
Author contributions: H.K., C.J.S., P.G.Y., and E.N.B. designed research; H.K. performed research; H.K., C.J.S., P.G.Y., and E.N.B. analyzed data; and H.K., C.J.S., P.G.Y., and E.N.B.
wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. S.J.R. is a guest editor invited by the Editorial Board.
Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank,www.pdb.org(PDB ID codes4NI6and4MKM).
See Commentary on page 1229.
1To whom correspondence may be addressed. E-mail: [email protected] or ted.
This article contains supporting information online atwww.pnas.org/lookup/suppl/doi:10.
1073/pnas.1316855111/-/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.1316855111 PNAS | January 28, 2014 | vol. 111 | no. 4 | 1367–1372
BIOCHEMISTRYSEECOMMENTARY
the Cγof the Asn/Asp, aided by proton transfer via an adjacent Glu or Asp. The latter also polarizes the C=O bond of the Asn or Asp side chain, resulting in a partial positive charge on Cγ(10, 14). This is essentially a one-turnover autocatalytic reaction de- pendent on the polarity of the environment and the proximity of the reacting groups. So far, the bonds are found in just two types of Ig-like domain, labeled CnaB and CnaA, and appear in characteristic positions in each (14).
In an effort to find how prevalent intramolecular isopeptide bonds are in bacterial cell surface proteins, we carried out a bioinformatic analysis of ∼100 sequences for putative cell wall-anchored proteins (identified by their LPXTG motifs) from a variety of Gram-positive organisms, seeking potential MSCRAMMs. Among these was a putative surface-anchored protein from Clostridium perfringens (GenBank accession no.
EDT23863.1), which we refer to as Cpe0147 in the following discussion. This protein has an N-terminal domain that resem- bles, at the sequence level, the thioester-containing adhesin do- main from S. pyogenes pili (15). This domain is followed by a series of repetitive domains of∼150 residues each that share remarkably high sequence similarity, more than 85% identity between any pair of domains.
Mass spectral analysis of a two-domain fragment showed a loss of 34 Da from the expected molecular mass, suggesting the formation of two isopeptide bonds (a loss of 2×17 Da), yet no sequence pattern characteristic of isopeptide bonds could be
found. Further mass spectrometric and crystallographic studies, reported here, show that these domains contain unprecedented ester bonds joining Thr and Gln side chains, formed by an es- terification reaction in a solvent-exposed environment. A fo- cused search of sequence databases further suggests that these intramolecular ester bonds are a widespread and common fea- ture of cell surface adhesion proteins in Gram-positive bacteria.
Results
Structure Determination. Cpe0147 is an ∼220 kDa cell wall- anchored adhesion protein from C. perfringens. Bioinformatic analysis predicts that it comprises an N-terminal adhesin domain tethered to the cell wall by a shaft composed of 11 repeating domains terminating with a C-terminal cell wall-anchoring motif (LPKTG) (Fig. 1A). The 11 repeat domains are predicted to each have an all β-strand IgG-like fold such as is commonly found in both Gram-positive pili and other MSCRAMMs (6, 16).
The stalk domain sequences are highly conserved with a mini- mum pairwise sequence identity of 85%. In an effort to define the stalk domain boundaries, a construct spanning the first two predicted repeats was expressed inEscherichia coliand purified.
This construct (C2) encompasses residues 292–625 of Cpe0147;
however, for convenience, we number the start of C2 from res- idue 1. An additional construct comprising a single putative domain (residues 8–152 of the C2 construct) was also expressed
Fig. 1. Structure of Cpe0147. (A) Domain structure of Cpe0147 with 11 repeat domains (green) and an adhesin domain (blue). Location and size of C1 and C2 constructs are indicated. (B) Two-domain C2 structure. Strands A and G are linked by an internal ester bond (yellow sticks) such that an extended covalent connectivity (indicated in blue) is propa- gated along the 11 repeat domains from the cell wall to the adhesin domain. Strand G is interrupted by a loop (cyan), which is shown in detail inF. Cal- cium ions are shown as light blue spheres. (C) To- pology of the repeat domain highlights the IgG-like fold, the location of the ester bond (thick black line), and the loop interrupting strand G. (D) Cal- cium site in the linker region. The Ca2+ion is shown as a light blue sphere, the coordinated water is shown as a small red sphere, and metal–ligand bonds are shown as dashed yellow lines. (E) Second calcium site (inside repeat domain) shows two co- ordinated waters. (F) Close-up view of the internal ester bond between Thr-11 and Gln-141 side chains.
Side chains involved in ester bond formation or stabilization are labeled. The loop-interrupting strand G is highlighted in cyan and contains two of the amino acids critical to ester bond formation.
and purified inE. coli. For simplicity, the single- and two-domain constructs are referred to herein as C1 and C2, respectively.
Domain boundaries of C2 were determined by limited pro- teolysis, revealing a highly stable trypsin-resistant fragment of∼32 kDa (a loss of∼6 kDa). Crystals of proteolyzed selenomethionine (SeMet)-substituted C2 protein were subsequently grown, and the structure was solved by single-wavelength anomalous dis- persion (SAD). The crystal structure has one C2 molecule per asymmetric unit and was refined at a resolution of 1.90 Å (R= 19.1%,Rfree=21.9%). Data collection and refinement statistics are shown inTable S1.
Stalk Domain Structure.C2 is an elongated molecule that is∼104 Å long and ∼17 Å wide (Fig. 1B). It contains two IgG-like domains connected by a long linker. The two repeat IgG-like domains are virtually identical with an rmsd of 0.59 Å over 143 aligned Cαatoms. The protein forms aβ-sandwich consisting of opposing three- and four-stranded β-sheets, similar to an IgG constant domain (IgGC). The C2 domain fold differs from the IgGCfold, however, in that one edge strand (D) is switched from the first sheet of the β-sandwich to the second; the two β-sheets have the strand order A-B-E and G-F-C-D (Fig. 1C).
This corresponds to the switched-type Ig fold, as defined by Bork et al. (17).
The interdomain linker is ∼35 Å long comprising residues 140–152. The length of the linker implies interdomain flexibility and may allow large domain motions. The linker is predominantly stabilized by an extended loop, residues 256–274, that projects
∼20 Å up from the C-terminal domain betweenβ-strands F and G. This F-G loop stacks against the linker peptide and makes primarily hydrophobic interactions. A metal ion is bound at the end of the extended loop, octahedrally coordinated by Asp-267 (Oδ1), Asp-269 (Oδ1), Asp-271 (Oδ1), Asn-273 (O), Asp-275 (Oδ2), and a water molecule (Fig. 1D). The coordination envi- ronment and average metal–ligand bond length (2.3 Å) are in- dicative of a Ca2+ion, and although the C2 crystals were grown in the presence of both 30 mM Mg2+and 30 mM Ca2+, the metal ions are modeled as calcium (18). A second Ca2+ ion is co- ordinated by Asp-165 (Oδ1) fromβ-strand A, Asn-181 (O), Asp- 184 (Oδ1), and Gly-185 (O) on a short helix betweenβ-strands A and B, and two water molecules (Fig. 1E).
Both domains of C2 contain equivalent Ca2+ binding sites (Fig. 1B). In the single-domain C1 structure, however, which was crystallized without either Ca2+or Mg2+, neither site contains a bound metal ion (Fig. S1). We conclude that neither site is of high affinity. Thus, although bound calcium ions appear to be a common stabilizing feature of both pilin proteins, for example, both SpaA from Corynebacterium diphtheriae (19) and GBS80
from Streptococcus agalactiae (20), and multidomain adhesins, such asS. gordoniiSspB (11), the Ca2+binding sites in Cpe0147 seem unlikely to play a major role in overall stability. Ca2+
binding may enhance local stability, however, as shown by or- dering of the F-G loop in C2.
The most striking feature of the C2 structure is the presence of two clearly defined Thr-Gln covalent bonds, one in each domain, joining the side chains of Thr-11 and Gln-141 in the first domain and Thr-160 and Gln-290 in the second domain. In each case, the Thr residue is located on the firstβ-strand and the Gln residue is located on the last β-strand of the domain, reminiscent of the isopeptide bonds found in Gram-positive pili (5, 14, 21, 22) (Fig.
1F). However, this apparently spontaneous self-catalytic bond forms in an entirely different environmental context from that of autocatalytic isopeptide bonds.
Intramolecular Ester Bonds.To aid characterization of the covalent linkage we observed, and to probe its formation, a single-domain construct (C1) was produced. The crystal structure of C1 was solved by molecular replacement with a partial C2 model and refined at a resolution of 1.1 Å (R=18.2%,Rfree=21.0%). Data collection and refinement statistics are shown inTable S1. The structure of this single-domain protein (Fig. S1) is very similar to that of the same domain in the C2 protein, with an rmsd over 136 Cαpositions of 0.52 Å. The only significant difference is the loss of the two Ca2+ions, although this has little effect on the protein conformation. As in the C2 structure, there is clearly defined electron density linking the side chains of Thr-11 and Gln-141 (Fig. 2A). A careful analysis of bond geometry and interatomic contacts in this atomic resolution structure suggests that the covalent linkage is an ester bond formed between Thr-11 Oγ1 and Gln-141 Cδ; the side chain amino group of Gln-141 has been eliminated (Fig. 2B). The ester bond is stabilized by hydrogen bonding between Gln-141 Oe1 and a protonated Asp-41 Oδ2.
Asp-41 is itself stabilized by hydrogen bonding with Glu-108, which is buried in the hydrophobic core of the domain. Both Asp-41 and Glu-108 appear protonated as judged by the hy- drogen bonding interactions.
The presence of the ester bonds in both the C1 and C2 protein constructs was independently confirmed by electrospray ioniza- tion (ESI) TOF MS. The molecular masses of C1 and C2 were found to be 16,646 Da and 33,298 Da, respectively, which are 17 Da and 33 Da less than the theoretical masses and consistent with the elimination of one NH3 molecule from each domain (Table S2). The specific location of the ester bond was confirmed by proteolytic digestion of the C1 protein and analysis by liquid chromatography tandem MS (MS/MS). A peptide was identified that gave a parent ion with an m/z of 676.93+containing the
Fig. 2. Internal ester bond. (A) Electron density map (2Fo-Fc omit map contoured at 1σ) shows continuous density unambiguously assigned as an internal ester bond. (B) Stereo view of side chains involved in ester bond formation and stabilization. The hydrogen atoms on His-133, Asp-41, and Glu-108 are shown as small black spheres, and hydrogen bonds are shown as dashed white lines (H-bond distances in angstroms). Water molecules are shown as small light gray spheres.
Kwon et al. PNAS | January 28, 2014 | vol. 111 | no. 4 | 1369
BIOCHEMISTRYSEECOMMENTARY
ester bond joining Thr-11 and Gln-141, in conformity with the crystal structures (Fig. 3 andTable S3).
Like isopeptide bonds in CnaB folds (14), the ester bonds provide a covalent cross-link between the first and lastβ-strands of each domain, and as with isopeptide bonds, the ester bonds contribute to the proteolytic stability of the protein. C1 protein digested with trypsin at 37 °C for 24 h is found to be completely intact when analyzed by SDS/PAGE. In contrast, mutant pro- teins in which the bond is eliminated (T11A or Q141A) were completely digested after 6 h (Fig. S2A).
To investigate the requirements for ester bond formation, the following protein variants were made: T11A, T11S, D41A, E108A, H133A, D138A, and Q141A. All mutants were suc- cessfully expressed and purified, although they eluted as broad peaks on size-exclusion chromatography, suggestive of multiple species, possibly aggregated. With the exception of D138A, none of the mutant proteins contained an ester bond, as determined by MS analysis. The D138A mutation produces a mixed pop- ulation of cross-linked and non–cross-linked protein; immedi- ately after purification fromE. coli,∼50% of the protein has an intact ester bond (Fig. S3A). Incubation of the D138A protein at 37 °C, however, reduces the proportion with an intact ester bond to ∼40% after 48 h (Fig. S3B), and further to<20% of total protein after 150 h (Fig. S3C).
The contribution the ester bond makes to the stability of the C1 construct was examined by CD spectroscopy and differential scanning fluorimetry (DSF). WT type C1 gave a CD spectrum typical of a well-folded all-βprotein, whereas all mutants gave CD spectra characteristic of unfolded proteins (Fig. S2B). Melting curves measured by DSF showed that the WT C1 has a single unfolding curve with a melting temperature (Tm) of 68 °C, in- dicating that the protein is well folded and stable (Fig. S2C). In contrast, the D138A mutant has a much broader unfolding curve, reflecting the heterogeneous ester bond formation, with the species lacking an ester bond starting to unfold at 25 °C. All other mutants display characteristics of unfolded or aggregated protein (23).
Discussion
Proposed Mechanism of Ester Bond Formation. A distinguishing feature of the C2 fold is the presence of a seven-residue insertion in the middle of the lastβ-strand (G) of each domain (Fig. 1F).
Taking the first domain as the example, the insertion of these
seven residues between His-133 and Gln-141 forms a loop that positions His-133 and Asp-138 adjacent to Thr-11 and Gln-141.
In this arrangement, Thr-11, His-133, and Asp-138 form a triad similar to that seen in serine proteases. We therefore propose a mechanism for ester bond formation that is similar to that of the serine protease family (24), in which the hydroxyl oxygen of Thr-11 acts as a nucleophile in attacking the side chain carbonyl carbon of Gln-141; in this case, the Gln side chain is analogous to the peptide substrate of serine proteases (Fig. 4). The reaction appears to be specific to Thr, because no bond is formed when Thr-11 is substituted by Ser in the T11S mutant protein. We assume that steric factors in the state that exists before bond formation are responsible for this discrimination.
Additional residues promote the reaction, as shown by the fact that the mutant proteins H133A, D41A, and E108A do not form the ester bond and D138A forms the bond, but to a reduced extent. His-133 is positioned adjacent to Thr-11 Oγ1, where it is presumed to act as a catalytic base, accepting the hydrogen from the Thr-OH group, and thus enhancing the nucleophilic poten- tial of the Oγ1 oxygen. Although we only see the catalytic site structure after bond formation, we propose that His-133 would also hold the Thr-11 side chain in a suitable geometry for ca- talysis. Asp-138 hydrogen-bonds to His-133, and we propose that it plays a dual role in holding His-133 in a catalytically ideal orientation and in making the nonprotonated His-133 nitrogen more electronegative for proton abstraction. Asp-138 is clearly
Fig. 3. MS/MS spectrum of the peptide generated after trypsin digest of the C1 construct. Fragmentation spectra of the peptides containing the ester bond are shown. A full list of the assigned structures is provided inTable S3.
The observed peaks indicate a stable Thr-11/Gln-141 cross-linked fragment.
amu, atomic mass units.
Fig. 4. Proposed mechanism of ester bond formation. The first step is nu- cleophilic attack of Thr-11 on Gln-141, proton abstraction by His-133, and bond polarization by the Asp-41/Glu-108 pair. The next step highlights an oxyanion intermediate that rearranges, abstracts a proton from His-133, and results in the elimination of NH3. The final state (crystal structure) shows an internal ester bond stabilized by the Asp-41/Glu-108 pair and highlights the unusual pKavalues of these residues. The ester bond is prevented from hy- drolysis by the His-133/Asp-138 interaction.
not essential for the reaction, however, because the D138A mutant is still capable of bond formation, albeit less effectively. On the other side of the active site, a pair of acidic residues also promotes catalysis; both Glu-108 and Asp-41 are buried in the protein in- terior, and hydrogen bonding considerations suggest that both are protonated. Glu-108 hydrogen-bonds to Asp-41, which, in turn, hydrogen-bonds to the carbonyl oxygen of the Gln-141 side chain, increasing the electrophilic potential of the carbonyl carbon.
Nucleophilic attack should produce a tetrahedral intermediate, an adduct of Thr-11 and Gln-141 side chains with a high-energy oxyanion that is stabilized by the“proton shuttle”arrangement of the Glu-108/Asp-41 pair. The pKaof Asp-41 as calculated by PROPKA (25) from the crystal structure coordinates has an unusually high value of 10.7. Although this number may not be completely reliable, it indicates that at biological pH, the side chain is exclusively protonated as is required for stabilizing the oxyanion species; no other chemistry in the active site appears to stabilize the oxyanion. The tetrahedral intermediate breaks down with the ref- ormation of the carbonyl oxygen double bond and a concerted attack by the Gln-141 amino group on the now protonated His-133 residue to abstract a proton; one molecule of ammonia (NH3) is eliminated, resulting in the 17-Da or 33-Da loss of mass observed by MS of the single-domain and double-domain constructs.
The formation of the new bond and the elimination of am- monia produce the equivalent of the acyl intermediate in serine proteases. However, unlike the acyl intermediate of a protease, which is then attacked by water to break the ester bond and regenerate the catalytic site, the ester bond of the C1 protein, as previously noted, is stable; it does not react further, and the ester bond species is essentially trapped. In serine proteases, the his- tidine equivalent to His-133 mediates the hydrolysis of the ester bond (24). In our C1 structure, however, the hydrogen bond between Asp-138 (pKa3.0 as calculated by PROPKA) and His- 133 sequesters the histidine in a conformation in which it is unable to interact with the ester bond. The Asp-138/His-133 hydrogen-bonded pair is also in a position where it could block the entry of a water molecule into an appropriate location for ester bond hydrolysis.
Site-directed mutagenesis of the catalytic residues (as outlined above) supports our proposed catalytic mechanism, showing that in all but one case, ester bond formation is eliminated in mutant proteins. Mutation of Asp-138 to alanine produces a mixed population of ester-bonded and nonbonded protein as outlined above. Does this fit the proposed mechanism? A time-course analysis of ester bond presence in the D138A protein shows that over time, the proportion of ester bond-formed species is re- duced (Fig. S3). Consistent with our hypothesis that the Asp-138/
His-133 pair, in essence, traps the ester bond species, we believe that elimination of this critical interaction allows water-mediated hydrolysis to occur, again mirroring the serine proteases. In our proposed hydrolysis mechanism for the D138A mutant (Fig. S4), water binds between an unrestricted His-133 and the ester bond and His-133 functions as a base in abstracting a water proton and enhancing the nucleophilic attack of the water oxygen on the Gln-141 carbonyl carbon. A tetrahedral intermediate results, with a high-energy oxyanion species that is again stabilized by hydrogen bonding to the Glu-108/Asp-41 pair. The intermediate collapses as the threonine oxygen accepts a proton from His-133 and the oxyanion coalesces back into a carbonyl species; the ester bond has broken, the threonine is regenerated, and we presume that Gln-141 is converted to Glu-141.
Conservation of Side Chain Ester Bonds in Bacterial Cell Surface Proteins.A sequence search suggests that ester bonds of this type are a conserved feature in many cell surface proteins of Gram-positive bacteria. Despite low overall sequence identities between C2 and other proteins, all residues that are presumed to be involved in bond formation appear conserved (Fig. S5).
Both the CD and DSF data show that the ester bonds confer significant stability to the proteins. The WT C1 protein has high thermal stability, unfolding in a single transition at aTmof 68 °C, whereas the mutants that lack the ester bond are destabilized to the point of being unfolded when measured by these techniques.
This contrasts with the case of isopeptide bonds in pilin as de- scribed by Kang and Baker (22), where the loss of the isopeptide bond thermally destabilized the protein by∼25 °C but the pro- tein remained demonstrably folded as measured by DSF. The proteolytic stabilities of the C1 mutants are similarly compro- mised. The WT protein remains intact over 24 h, whereas the mutants that lack ester bonds are almost fully degraded after a few hours of incubation with trypsin.
Several factors may contribute toward the stability of Cpe0147.
The addition of an intramolecular covalent bond provides in- herent stability, but the positioning of these cross-links is highly significant (14). Like the intramolecular isopeptide bonds of bacterial surface proteins, the ester bonds join the first and last β-strands of each domain at a point where molecular dynamics simulations and force distribution analysis suggest the critical point of stress concentration exists (26). Mutagenesis studies of Ig-like domains, coupled with atomic force microscopy analysis, come to similar conclusions (27). In these multidomain proteins, the re- peated ester (or isopeptide) bonds essentially produce a covalent connectivity that extends from the cell wall anchor to adhesin along the axis of the protein and provides the main force-bearing feature of these cell surface proteins: greatly increased tensile strength.
Bond-Forming Chemistry.Why do some cell surface proteins use ester bonds rather than isopeptide bonds to provide strength?
We suggest that this relates to the specific environment at the cross-linking site. For isopeptide bond formation, a hydrophobic environment is critical in manipulating side chain pKa, whereas in domains such as those in Cpe0147, the chemistry that forms the cross-linking ester bond occurs in a relatively solvent-exposed environment and uses a well-established mechanism similar to that of serine/threonine proteases, an example of convergent evolution.
We hypothesize that Asp-138 is a key residue that separates the chemistry of Cpe0147 from that of serine proteases. The Asp-138/
His-133 interaction prevents the hydrolysis of the ester bond after its formation, in essence trapping an acyl intermediate.
Materials and Methods
Bioinformatics. To investigate how widespread intramolecular isopeptide bonds are in Gram-positive bacteria, the Jpred server (www.compbio.dundee.
ac.uk/jpred) (28) was searched using major pilin sequences to produce mul- tiple sequence alignments. Based on sequence alignment and on secondary and tertiary structure prediction (Jpred/I-TASSER) (28, 29), two sequences were chosen as potentially containing intramolecular isopeptide bonds, one of which was Cpe0147.
Gene Synthesis, Protein Expression, and Purification.The C2 construct was synthesized by GENEART (Life Technologies) from the GenBank sequence EDT23863.1 (residues 292–625) with codon optimization. The single-domain C1 construct (residues 8–152 of C2) was PCR-amplified from the C2 gene.
Both constructs were subcloned into pET22b, overexpressed inE. coli as C-terminal His-tagged proteins, and purified by nickel-affinity chromatog- raphy followed by size-exclusion chromatography according to Kang et al.
(30). SeMet-substituted C2 was produced according to a protocol from Sreenath et al. (31) and purified similar to the native protein. Full details of protein expression and purification are given inSI Materials and Methods.
Crystallization and Structure Determination. Crystallization conditions for native and SeMet C2 were screened after incubation with trypsin at a 1:20,000 trypsin/protein (wt/wt) ratio at 310 K for 2 h. Cleavage was monitored by SDS/
PAGE. Crystals were grown in sitting drops. The best native C2 crystals were obtained using a precipitant solution comprising 12.5% (vol/vol) PEG 1000, 12.5% (vol/vol) PEG 3350, 12.5% (vol/vol) 2-Methyl-2,4-pentanediol, 0.03 M MgCl2, 0.03 M CaCl2, and 0.1 M bicine/Trizma base (pH 8.5). The SeMet C2 crystals were macroseeded from native C2 crystals. The best C1 crystals were
Kwon et al. PNAS | January 28, 2014 | vol. 111 | no. 4 | 1371
BIOCHEMISTRYSEECOMMENTARY
obtained with 10% (vol/vol) PEG 20,000, 20% (vol/vol) PEG monomethyl ether 550, 0.02 M amino acids (0.02 M sodiumL-glutamate, 0.02 MDL-ala- nine, 0.02 M glycine, 0.02 MDL-lysine, 0.02 MDL-serine), and 0.1 M 3-mor- pholinopropane-1-sulfonic acid (MOPS)/HEPES (pH 7.5). All three proteins were crystallized at 100 mg/mL. Full details of crystallization protocols are given inSI Materials and Methods. Crystals were flash-cooled without fur- ther cryoprotection. X-ray diffraction data were collected at the Australian Synchrotron (MX1 beamline) to a resolution of 1.9 Å for SeMet C2 and to a resolution of 1.1 Å for C1. Data were processed and scaled with X-ray Detector (XDS) (32) and SCALA (Scala, Inc.) (33) software. The structure of C2 was solved by SAD phasing. Phase determination, density modification, and model building used PHENIX software (34). Model building was completed with Crystallographic Object-Oriented Toolkit (Coot) software (35). The SeMet C2 structure was refined with BUSTER software (36). The C1 structure was solved by molecular replacement using the C2 structure and refined using REFMAC software (37). Final validation used MOLPROBITY (38). Data collection and refinement statistics are shown inTable S1.
MS.SDS/PAGE gel bands containing recombinant protein were excised and trypsinized. Peptides were analyzed using a Q-STAR XL Hybrid MS/MS system (Applied Biosystems) and identified using the MASCOT search engine v.2.0.05 (Matrix Science). Unmatched peptides were manually inspected to identify cross-linked sequences. Accurate protein molecular mass was determined by ESI TOF MS. Samples were analyzed in the positive ionization mode on a Q-STAR XL Hybrid MS/MS system. Data were acquired in them/zrange of 500–1,600. Raw data were deconvoluted to give accurate molecular mass using the Bayesian Protein Reconstruct tool from the Bioanalyst extensions within Analyst QS 1.1 (Applied Biosystems).
Mutagenesis.All mutations were made on the C1 construct using 5′-phos- phorylated primers (Table S4). PCR-amplified products were gel-purified and then ligated at 18 °C overnight. Mutagenesis was sequence-verified. All mutants were expressed and purified similar to WT. Bond formation for each mutant was confirmed by MS. Mutants were subjected to proteolysis as for the native protein.
DSF.Protein thermal unfolding was monitored by the increase in fluorescence of SyproOrange (Sigma), using a real-time PCR device (7900HT Fast RT PCR System; Applied Biosystems). A reaction volume of 50μL contained 30μM protein, 5 μL of 25×SyproOrange, and 15μL of protein storage buffer.
Experiments were performed in triplicate in a 96-well plate with a temper- ature gradient from 25–95 °C in steps of 1 °C/min. Fluorescence emission at 600 nm was plotted as a function of temperature.Tmvalues were fitted to the Boltzmann equation using Microsoft Excel (39).
CD.WT and mutant C1 proteins were buffer-exchanged into 5 mM sodium phosphate buffer (pH 8.0) and 50 mM sodium fluoride to a final concen- tration of 2.5μM. CD spectra were recorded on a PiStar-180 (Applied Pho- tophysics) spectrometer. To obtain overall CD spectra, wavelength scans between 180 and 320 nM were collected at 20 °C using a 2-nm bandwidth, 1-nm step size, and time per step of 2 s. The data were collected over five accumulations and averaged.
ACKNOWLEDGMENTS.We thank Martin Middleditch for help with MS. This research was undertaken on the MX1 beamline at the Australian Synchro- tron. This work was supported by the Marsden Fund of New Zealand, the Health Research Council of New Zealand, and the Maurice Wilkins Centre for Molecular Biodiscovery.
1. Firbank SJ, et al. (2001) Crystal structure of the precursor of galactose oxidase: An unusual self-processing enzyme.Proc Natl Acad Sci USA98(23):12932–12937.
2. Díaz A, Horjales E, Rudiño-Piñera E, Arreola R, Hansberg W (2004) Unusual Cys-Tyr covalent bond in a large catalase.J Mol Biol342(3):971–985.
3. Kaila VRI, Johansson MP, Sundholm D, Laakkonen L, Wikstrom MR (2009) The chemistry of the CuB site in cytochrome c oxidase and the importance of its unique His-Tyr bond.
Biochim Biophys Acta, Bioenerget1787(4):221–233.
4. Barondeau DP, Putnam CD, Kassmann CJ, Tainer JA, Getzoff ED (2003) Mechanism and energetics of green fluorescent protein chromophore synthesis revealed by trapped intermediate structures.Proc Natl Acad Sci USA100(21):12111–12116.
5. Kang HJ, Coulibaly F, Clow F, Proft T, Baker EN (2007) Stabilizing isopeptide bonds revealed in gram-positive bacterial pilus structure.Science318(5856):1625–1628.
6. Kang HJ, Baker EN (2012) Structure and assembly of Gram-positive bacterial pili:
Unique covalent polymers.Curr Opin Struct Biol22(2):200–207.
7. Patti JM, Allen BL, McGavin MJ, Höök M (1994) MSCRAMM-mediated adherence of microorganisms to host tissues.Annu Rev Microbiol48:585–617.
8. Symersky J, et al. (1997) Structure of the collagen-binding domain from aStaphylo- coccus aureusadhesin.Nat Struct Biol4(10):833–838.
9. Deivanayagam CCS, et al. (2000) Novel fold and assembly of the repetitive B region of theStaphylococcus aureuscollagen-binding surface protein.Structure8(1):67–78.
10. Hagan RM, et al. (2010) NMR spectroscopic and theoretical analysis of a spontane- ously formed Lys-Asp isopeptide bond.Angew Chem Int Ed Engl49(45):8421–8425.
11. Forsgren N, Lamont RJ, Persson K (2010) Two intramolecular isopeptide bonds are identified in the crystal structure of theStreptococcus gordoniiSspB C-terminal do- main.J Mol Biol397(3):740–751.
12. Telford JL, Barocchi MA, Margarit I, Rappuoli R, Grandi G (2006) Pili in gram-positive pathogens.Nat Rev Microbiol4(7):509–519.
13. Marraffini LA, Dedent AC, Schneewind O (2006) Sortases and the art of anchoring proteins to the envelopes of gram-positive bacteria.Microbiol Mol Biol Rev70(1):
192–221.
14. Kang HJ, Baker EN (2011) Intramolecular isopeptide bonds: Protein crosslinks built for stress?Trends Biochem Sci36(4):229–237.
15. Pointon JA, et al. (2010) A highly unusual thioester bond in a pilus adhesin is required for efficient host cell interaction.J Biol Chem285(44):33858–33866.
16. Vengadesan K, Narayana SVL (2011) Structural biology of Gram-positive bacterial adhesins.Protein Sci20(5):759–772.
17. Bork P, Holm L, Sander C (1994) The immunoglobulin fold. Structural classification, sequence patterns and common core.J Mol Biol242(4):309–320.
18. Harding MM (2006) Small revisions to predicted distances around metal sites in proteins.Acta Crystallogr D Biol Crystallogr62(Pt 6):678–682.
19. Kang HJ, Paterson NG, Gaspar AH, Ton-That H, Baker EN (2009) TheCorynebacterium diphtheriaeshaft pilin SpaA is built of tandem Ig-like modules with stabilizing iso- peptide and disulfide bonds.Proc Natl Acad Sci USA106(40):16967–16971.
20. Vengadesan K, Ma X, Dwivedi P, Ton-That H, Narayana SVL (2011) A model for group B Streptococcus pilus type 1: The structure of a 35-kDa C-terminal fragment of the major pilin GBS80.J Mol Biol407(5):731–743.
21. Kang HJ, Middleditch M, Proft T, Baker EN (2009) Isopeptide bonds in bacterial pili and their characterization by X-ray crystallography and mass spectrometry.Bio- polymers91(12):1126–1134.
22. Kang HJ, Baker EN (2009) Intramolecular isopeptide bonds give thermodynamic and proteolytic stability to the major pilin protein ofStreptococcus pyogenes.J Biol Chem 284(31):20729–20737.
23. Lavinder JJ, Hari SB, Sullivan BJ, Magliery TJ (2009) High-throughput thermal scan- ning: A general, rapid dye-binding thermal shift screen for protein engineering.J Am Chem Soc131(11):3794–3795.
24. Hedstrom L (2002) Serine protease mechanism and specificity.Chem Rev102(12):
4501–4524.
25. Li H, Robertson AD, Jensen JH (2005) Very fast empirical prediction and ration- alization of protein pKa values.Proteins61(4):704–721.
26. Wang B, Xiao S, Edwards SA, Gräter F (2013) Isopeptide bonds mechanically stabilize spy0128 in bacterial pili.Biophys J104(9):2051–2057.
27. Li H, Carrion-Vazquez M, Oberhauser AF, Marszalek PE, Fernandez JM (2000) Point mutations alter the mechanical stability of immunoglobulin modules.Nat Struct Biol 7(12):1117–1120.
28. Cole C, Barber JD, Barton GJ (2008) The Jpred 3 secondary structure prediction server.
Nucleic Acids Res36(Web Server issue, Suppl 2):W197–W201.
29. Zhang Y (2008) I-TASSER server for protein 3D structure prediction.BMC Bioinformatics 9(1):40.
30. Kang HJ, Paterson NG, Baker EN (2009) Expression, purification, crystallization and preliminary crystallographic analysis of SpaA, a major pilin fromCorynebacterium diphtheriae.Acta Crystallogr Sect F Struct Biol Cryst Commun65(Pt 8):802–804.
31. Sreenath HK, et al. (2005) Protocols for production of selenomethionine-labeled proteins in 2-L polyethylene terephthalate bottles using auto-induction medium.
Protein Expr Purif40(2):256–267.
32. Kabsch W (1988) Automatic indexing of rotation diffraction patterns.J Appl Cryst 21(1):67–72.
33. Evans P (2006) Scaling and assessment of data quality.Acta Crystallogr D Biol Crys- tallogr62(Pt 1):72–82.
34. Adams PD, et al. (2002) PHENIX: Building new software for automated crystal- lographic structure determination.Acta Crystallogr D Biol Crystallogr58(Pt 11):
1948–1954.
35. Emsley P, Cowtan K (2004) Coot: Model-building tools for molecular graphics.Acta Crystallogr D Biol Crystallogr60(Pt 12 Pt 1):2126–2132.
36. Smart OS, et al. (2012) Exploiting structure similarity in refinement: Automated NCS and target-structure restraints in BUSTER.Acta Crystallogr D Biol Crystallogr68(Pt 4):368–380.
37. Murshudov GN, Vagin AA, Dodson EJ (1997) Refinement of macromolecular structures by the maximum-likelihood method.Acta Crystallogr D Biol Crystallogr53(Pt 3):240–255.
38. Davis IW, Murray LW, Richardson JS, Richardson DC (2004) MOLPROBITY: Structure validation and all-atom contact analysis for nucleic acids and their complexes.Nucleic Acids Res32(Web Server issue, Suppl 2):W615–W619.
39. Brown AM (2001) A step-by-step guide to non-linear regression analysis of experi- mental data using a Microsoft Excel spreadsheet.Comput Methods Programs Biomed 65(3):191–200.
Supporting Information
Kwon et al. 10.1073/pnas.1316855111
SI Materials and Methods
Protein Expression and Purification. C1 and C2 constructs were transformed intoEscherichia coliBL21 (DE3) cells. Cells were expressed in ZYP-5052 autoinduction media (1) for 20 h at 28 °C.
Cells were harvested by centrifugation (4,000×gat 277 K for 20 min) and resuspended in lysis buffer [50 mM Tris·HCl (pH 8.0), 300 mM NaCl, 20 mM imidazole] supplemented with Mini EDTA-free protease inhibitor (Roche). Cells were lysed by son- ication, and the lysate was clarified by centrifugation (13,000×g for 30 min). The supernatant was then filtered (0.2 μm) and loaded onto a 5-mL HiTrap Chelating column (GE Healthcare) charged with Ni2+ ions and equilibrated with the lysis buffer.
After washing the loaded column with the lysis buffer, proteins were eluted by a gradient of 20–500 mM imidazole. The proteins were further purified by size-exclusion chromatography using a Superdex 75 16/60 column equilibrated with 50 mM Tris·HCl (pH 8.0) and 150 mM NaCl. Fractions containing the target proteins were concentrated to 100 mg/mL in the elution buffer, 50 mM Tris·HCl (pH 8.0), and 150 mM NaCl.
For selenomethionine (SeMet) C2 expression, the C2 construct was transformed into DL41 (DE3) cells (Stratagene) and expressed in autoinduction media according to the method of Sreenath et al. (2). SeMet C2 protein was purified similar to native C2.
The C1 mutant proteins were expressed and purified as for the WT protein but eluted earlier than the WT from the size- exclusion column, and in broad peaks suggestive of multiple and possibly aggregated species. The mutant proteins could be con- centrated, but only to∼50 mg/mL.
Crystallization.Crystallization experiments of trypsin-treated C2 and native C1 were carried out by sitting drop vapor diffusion (100 nL of protein and 100 nL of precipitant) at 291 K using a Cartesian nanoliter dispensing robot (Genomic Systems) and
a locally compiled crystallization screen with 480 conditions (3).
Subsequent fine screens were performed in hanging drop vapor diffusion experiments with varying concentrations of protein. In these optimizations, reservoir solution (1μL) was added to 1μL of protein on a siliconized coverslip (Hampton Research) and the drops were incubated at 291 K over the reservoir solution.
Crystals for both C1 and trypsin-treated SeMet C2 were seeded by transferring small pieces of the respective native crystals into the hanging drops that had been equilibrated overnight. Multiple rounds of seeding improved the quality of the crystals, especially for the C2 construct.
The final trypsin-treated native and SeMet C2 crystals were obtained using 100 mg/mL protein [in 50 mM Tris·HCl (pH 8.0), 150 mM NaCl] equilibrated with 12.5% (vol/vol) PEG 1000, 12.5% (vol/vol) PEG 3350, 12.5% (vol/vol) 2-Methyl-2,4-penta- nediol, 0.03 M MgCl2, 0.03 M CaCl2, and 0.1 M bicine/Trizma base (pH 8.5). C1 crystals were obtained from 100 mg/mL pro- tein [in 50 mM Tris·HCl (pH 8.0), 150 mM NaCl] equilibrated with 10% (vol/vol) PEG 20,000, 20% (vol/vol) PEG monomethyl ether 550, 0.02 M amino acids (0.02 M sodiumL-glutamate, 0.02 M
DL-alanine, 0.02 M glycine, 0.02 MDL-lysine, 0.02 MDL-ser- ine), and 0.1 M 3-morpholinopropane-1-sulfonic acid (MOPS)/
Hepes (pH 7.5).
Bioinformatics Search for Internal Ester Bonds in Other Proteins.The amino acid sequence of C1 Cpe0147 was blasted against all known nonredundant sequences using the blastp (protein–
protein BLAST) algorithm on the National Center for Bio- technology Information BLAST server (http://blast.ncbi.nlm.
nih.gov). The sequence matches were visually inspected for conservation of the ester bond-forming residues (equivalent to Cpe0147 residues Thr-11, Asp-41, Glu-108, His-133, Asp-138, and Gln-141).
1. Studier FW (2005) Protein production by auto-induction in high density shaking cultures.Protein Expr Purif41(1):207–234.
2. Sreenath HK, et al. (2005) Protocols for production of selenomethionine-labeled proteins in 2-L polyethylene terephthalate bottles using auto-induction medium.Protein Expr Purif40(2):256–267.
3. Moreland N, et al. (2005) A flexible and economical medium-throughput strategy for protein production and crystallization.Acta Crystallogr D Biol Crystallogr61(Pt 10):
1378–1385.
Kwon et al.www.pnas.org/cgi/content/short/1316855111 1 of 8
Fig. S1. Structure of the C1 construct. A single-domain C1 structure highlights sections of strands A and G (blue) linked by the internal ester bond (yellow sticks). The loop interrupting the G strand is shown in cyan. (Inset) Close-up view of a structural superimposition of the calcium binding F-G loop of C2 (calcium is shown as a sphere) and the same loop in C1, which shows a different conformation and disorder, such that the tip of the C1 loop could not be modeled.
Fig. S2. Proteolytic and thermal stability data for the C1 construct. (A) SDS/PAGE analysis of WT, Q141A, and D138A proteins. Lane 1 shows the mass ladder (size in kilodaltons is indicated to the left), lane 2 shows untreated WT, lane 3 shows trypsin-treated WT (6-h incubation), lane 4 shows trypsin-treated WT (24-h incubation), lane 5 shows untreated Q141A, lane 6 shows trypsin-treated Q141A (6-h incubation), lane 7 shows trypsin-treated Q141A (24-h incubation), lane 8 shows untreated D138A, lane 9 shows trypsin-treated D138A (6-h incubation), and lane 10 shows trypsin-treated D138A (24-h incubation). The results show that WT protein is stable to trypsin proteolysis, whereas the Q141A (minus an internal ester bond) is readily proteolyzed. The D138A sample shows a mixed population of ester bond-formed and -unformed protein. (B) CD data for WT and mutant C1 proteins. The WT protein curve is indicative of a folded allβ-sheet protein and is consistent with the crystal structure. The mutant proteins (H133A, E108A, T11S, and Q141A) produce curves that are indicative of random coils.
(C) Differential scanning fluorimetry plot of SYPRO orange fluorescence vs. temperature for WT, D138A, T11A, D41A, E108A, and Q141A proteins. The WT curve is indicative of stable, folded protein. The curves corresponding to all mutants except D138A are indicative of unfolded protein. The curve for the D138A sample suggests a mixed population of ester bond-formed and -unformed protein and is consistent with the SDS/PAGE results.
Kwon et al.www.pnas.org/cgi/content/short/1316855111 3 of 8
Fig. S3. Time course of ester bond hydrolysis in D138A C1 protein. (A) Immediately after purification, a mixed population of ester bond-formed (16,603 Da, calculated and observed) and ester bond-unformed (16,620 Da calculated, 16,621 Da observed) protein is observed at 50:50 ratio. (B) After incubation at 37 °C for 48 h, the formed/unformed ratio is now 40:60. (C) After further incubation for 150 h, the ratio has dropped further to 20:80. Ratios were calculated from integrated curve data. The time course is suggestive of a slow hydrolysis mechanism in D138A protein, as shown in Fig. S4. amu, atomic mass units.
Fig. S4. Proposed D138A hydrolysis mechanism. In the first step, a water molecule initiates nucleophilic attack in concert with protein abstraction by the His- 133 base. The resulting high-energy tetrahedral intermediate containing an oxyanion species is stabilized by the Asp-41/Glu-108 hydrogen bond pair. The tetrahedral species spontaneously collapses, and a proton is transferred from His-133, now acting as an acid, to Thr-11. The Thr-11 side chain is regenerated, and the Gln-141 side chain has been converted to a glutamic acid.
Kwon et al.www.pnas.org/cgi/content/short/1316855111 5 of 8
Fig. S5. Multiple sequence alignment of proteins containing potential ester bonds from various bacterial species. In each of these proteins, the cross-linking Thr and Gln residues are conserved (red), as are the accessory residues equivalent to Asp-41, Glu-108, His-133, and Asp-138 (green). The locations of secondary structure elements (β-strands, green arrows;α-helix, blue cylinder) assigned from the crystal structure and calcium-binding residues (yellow stars) are indicated above the sequences. Sequences shown are as follows:Clostridium perfringens(EDT72456 putative surface-anchored protein) repeat domain 1,C. perfringens (EDT72456 putative surface-anchored protein) repeat domain 2,Streptococcus pneumoniae(EHD91134 Cna B-type domain protein, partial),Parvimonassp.
oral taxon 393 strain F0440 (EGV09726 Cna B-type domain protein),Bulleidia extructa(EFC05496 Cna B-type domain protein),Enterococcus raffinosus (WP_010744397 LPXTG domain-containing protein cell wall anchor domain),Mobiluncus mulieris(EFN93563 LPXTG motif cell wall anchor domain protein), Bacillus thuringiensis(AFQ30314 collagen adhesin protein),Bacillus cereus(EJR64346 LPXTG domain-containing protein cell wall anchor domain),Helcococcus kunzii(EHR34835 hypothetical protein HMPREF9709_00577), andCorynebacterium bovis(ZP_08516882 hypothetical protein CbovD2_04897).
Table S1. Data collection, phasing, and refinement statistics
Parameters SeMet C2 Native C1
Space group P212121 P21
Unit cell, Å 48.19, 63.45, 104.98 48.86, 28.98, 49.31
Data collection statistics
Resolution, Å 19.6–1.9 (1.94–1.90) 19.4–1.10 (1.16–1.10)
Wavelength, Å 0.95370 0.72930
No. of unique reflections 26,100 (1,733) 49,641 (7,065)
Redundancy 14.5 (14.6) 14.9 (14.8)
Completeness, % 99.9 (99.7) 97.7 (95.9)
Mean I/σ 23.3 (1.5) 21.6 (2.2)
CC1/2* 1.00 (0.78) 1.00 (0.89)
Phasing statistics for SeMet
No. of sites 2 Se N/A
Figure of merit 0.26
Autobuilt residues 286
CC 0.78
Refinement statistics
Resolution range, Å 19.6–1.90 19.4–1.10
No. of reflections 26,023 47,054
No. of reflections forRfree 1,332 2,571
R/Rfree, %† 19.1/21.9 18.2/21.0
rmsd from standard bond length/angles, Å/° 0.010/1.23 0.013/1.34
Ramachandran favored, % 98.3 97.8
MOLPROBITY score 99th percentile 93rd percentile
Figures in parentheses are for the high-resolution shell. N/A,▪▪▪.
*Mn(I) half-set correlation CC1/2as calculated by SCALA.
†R=Σhkl❘❘Fo(hkl)❘−❘Fc(hkl)❘❘/Σkhl❘Fo(hkl)❘. TheRvalue is calculated using 95% of the data selected randomly and used in refinement.
Rfreeis calculated from the remaining 5% of the data not used in refinement.
Table S2. Mutant MS details Mutant
Molecular mass observed, Da
Molecular mass
calculated, Da Δobs-calc, Da
Bond formation
WT C1 16,647 16,664 −17 Yes
WT C2 33,265 33,298 −33 Yes
T11A 16,633 16,634 −1 No
D41A 16,619 16,620 −1 No
H133A 16,598 16,598 0 No
D138A 16,619/16,603 16,620 −1/−17 Mixed
Q141A 16,606 16,607 −1 No
Δobs-calc, difference between observed and calculated molecular mass.
Table S3. Tandem MS of a peptide at 676.93+containing Thr11-Gln141 ester bond of Cpe0147 Observed,m/z* Charge Calculated,m/z† Δobs-calc Proposed structure‡
716.37 +3 716.37 0 AQTLIVEKPLEHHHHHH and T (−NH3)
750.05 +3 750.05 0 AQTLIVEKPLEHHHHHH and TT (−NH3)
783.06 +3 783.07 −0.01 AQTLIVEKPLEHHHHHH and TTV (−NH3) 806.74 +3 806.75 −0.01 AQTLIVEKPLEHHHHHH and TTVA (−NH3)
830.43 +3 830.43 0 AQTLIVEKPLEHHHHHH and TTVAA (−NH3)
868.77 +3 868.77 0 AQTLIVEKPLEHHHHHH and TTVAAD (−NH3)
887.77 +3 887.78 −0.01 AQTLIVEKPLEHHHHHH and TTVAADG (−NH3)
*Monoisotropic masses of observed ions.
†Calculated ions. Monoisotropic masses were calculated using the fragment ion calculator (http://db.systemsbiology.
net:8080/proteomicsToolkit/FragIonServlet.html).
‡Loss of 17 Da from losing NH3is shown in parentheses.
Kwon et al.www.pnas.org/cgi/content/short/1316855111 7 of 8
Table S4. Primer sequences
Primer name Primer direction Sequences
C1 cloning Fw TAT AAT GGC CAGCAT ATGACC CTG AAA ACC ACC GTT GCA GC Rev GGC TATCTC GAGCGG TTT TTC CAC AAT CAG GGT CTG G
T11A Fw GCAACC GTT GCA GCA GAT GGT G
Rev TTT CAG GGT CAT ATG TAT ATC TCC TTC TTA AAG
D41A Fw GCAACC GTT GCA GCA GAT GGT G
Rev TTT CAG GGT CAT ATG TAT ATC TCC TTC TTA AAG
H133A Fw GCAGAA GAT AAA AAT GAT AAA GCC CAG AC
Rev TTT CAC CAC CTG TTT GGT ATC CGA TTC
D138A Fw GCAAAA GCC CAG ACC CTG ATT GTG
Rev ATT TTT ATC TTC GTG TTT CAC CAC CTG TTT GG
Q141A Fw GCAACC CTG ATT GTG GAA AAA CCG
Rev GGC TTT ATC ATT TTT ATC TTC GTG TTT CAC Bold letters indicate restriction enzyme sites or mutation sites. Fw, forward; Rev, reverse.