ABSTRACT
4. Engineering of Protein Nanomaterials
4.1 In Silico Design of Protein Nanomaterials
Computational design of protein-based nanomaterials is a promising avenue for nanomaterial engineering and builds largely on the techniques of in silico protein design. Rational design of new protein molecules aims to introduce novel function and/or behavior into a target protein structure (Richardson and Richardson 1989). Protein design is therefore partially equivalent to protein structure-function prediction.
Proteins can be designed de novo by fully automated sequence selection (Dahiyat and Mayo 1997, Harbury et al. 1998, Kuhlman et al. 2003) or redesigned by making calculated variations on a known protein structure and its sequence (Wu et al. 2010). Rational protein design approaches predict protein sequences that will fold into specific 3D structures. These predicted structures can then be validated experimentally at the atomic level through X-ray crystallography, electron microscopy and nuclear magnetic resonance spectroscopy (Harbury et al. 1998). In the 1970s, initial protein design approaches were based mostly on sequence redesign and composition and did not involve predicting the 3D structures of designed proteins or the specific interactions between side chains at the atomic level.
Recent advances in molecular force fields, protein design algorithms, and structural bioinformatics resources, such as libraries of preferred side-chain conformations, have enabled the development of computational protein design tools that can, in many cases, efficiently predict 3D atomic models of designed protein structures. These computational tools perform thorough calculations of protein energetics, structures, interactions and flexibility, and carry out exhaustive searches over enormous conformational spaces. Thanks to these developments, in silico rational protein design has become one of the most important tools in protein engineering.
In rational protein design, amino acid sequences are predicted such that a protein with the designed sequence is expected to fold into a specific 3D structure that possesses pre-defined properties and functions. Even under functional constraints, the number of possible amino acid sequences grows exponentially with the length of the protein chain and can therefore be astronomical for the design of a large protein subunit. According to the energy landscape theory of protein folding, only a very small subset of these sequences would fold reliably to a native structure. Protein design involves identifying novel sequences within this subset; it is therefore the search for sequences that have the chosen structure as a free energy minimum. In other words, a tertiary structure is specified based on its intended function, and a sequence that can fold into it is then screened for and selected.
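The exponential growth of sequence space can be made concrete with a short calculation. The sketch below assumes an unconstrained alphabet of the 20 standard amino acids, so a chain of N residues admits 20^N sequences; the function names are illustrative and not taken from any design package.

```python
import math

NUM_AMINO_ACIDS = 20  # assumption: the 20 standard amino acids, no constraints


def sequence_space_size(chain_length: int) -> int:
    """Exact number of distinct sequences of the given length (20**N)."""
    if chain_length < 0:
        raise ValueError("chain_length must be non-negative")
    return NUM_AMINO_ACIDS ** chain_length


def log10_sequence_space(chain_length: int) -> float:
    """Base-10 logarithm of the sequence count, convenient for long chains."""
    return chain_length * math.log10(NUM_AMINO_ACIDS)


if __name__ == "__main__":
    for n in (10, 100, 300):
        exponent = log10_sequence_space(n)
        print(f"{n:4d} residues: about 10^{exponent:.0f} possible sequences")
```

Under this assumption, even a 100-residue chain has on the order of 10^130 possible sequences, which is why design methods rely on guided search rather than enumeration.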
Protein design can thus be cast as an optimization problem: how to choose an optimal sequence that will fold into the desired structure, as judged by given scoring criteria and energy functions. More recently, the principle of protein design was extended to predict the protein-protein interaction interfaces that can guide the self-assembly of de novo designed proteins into higher-order architectures and biomaterials (King et al. 2012, King et al. 2014). A general computational method for designing self-assembling protein nanomaterials was proposed, which consists of two steps: (1) symmetrical docking of protein building blocks in a target symmetric architecture, followed by (2) design of low-energy protein-protein interfaces between the building blocks to drive self-assembly. The approach was used to design a 24-subunit, 13-nm diameter complex with octahedral symmetry and a 12-subunit, 11-nm diameter complex with tetrahedral symmetry (King et al. 2012). It was later generalized to the design of co-assembling, multi-component protein nanomaterials (King et al. 2014). In this case, the protein design program RosettaDesign (Leaver-Fay 2011) was used to sample the identities and configurations of the side chains near the inter-building-block interface, predicting interfacial interactions with features resembling those found in natural protein assemblies, such as well-packed hydrophobic cores surrounded by polar side chains. The outcome is a pair of designed amino acid sequences, one for each building block component, which stabilize the multi-component interface and drive assembly toward the target configuration. These approaches pave the way to the design of novel protein-based molecular machines with programmable structures, dynamics and functions.
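To illustrate what treating design as an optimization problem can look like, the following is a minimal sketch of a Metropolis Monte Carlo search with simulated annealing over sequences. It is not RosettaDesign and does not use its energy function: the "structure" is reduced to a hypothetical per-position buried/exposed pattern, and the toy score simply rewards hydrophobic residues at buried positions and polar residues at exposed ones. The residue groupings, function names and parameters are all assumptions made for illustration.

```python
import math
import random

# Assumed, simplified residue classes for the toy score.
HYDROPHOBIC = set("AVLIMFW")
POLAR = set("DEKRNQSTH")
ALPHABET = sorted(HYDROPHOBIC | POLAR)


def toy_energy(sequence, buried):
    """Toy score: -1 if a residue's class suits its environment, +1 otherwise."""
    energy = 0
    for residue, is_buried in zip(sequence, buried):
        suits_environment = (residue in HYDROPHOBIC) == is_buried
        energy += -1 if suits_environment else 1
    return energy


def design_sequence(buried, steps=5000, start_temperature=1.0, seed=0):
    """Search for a low-energy sequence by single-site Metropolis moves."""
    rng = random.Random(seed)
    current = [rng.choice(ALPHABET) for _ in buried]
    current_energy = toy_energy(current, buried)
    best, best_energy = list(current), current_energy

    for step in range(steps):
        # Linear cooling schedule; the small offset avoids division by zero.
        temperature = start_temperature * (1 - step / steps) + 1e-3
        position = rng.randrange(len(current))
        previous_residue = current[position]
        current[position] = rng.choice(ALPHABET)
        new_energy = toy_energy(current, buried)
        delta = new_energy - current_energy

        # Metropolis criterion: always accept downhill, sometimes uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_energy = new_energy
            if current_energy < best_energy:
                best, best_energy = list(current), current_energy
        else:
            current[position] = previous_residue  # reject the move

    return "".join(best), best_energy


if __name__ == "__main__":
    # Hypothetical target: alternating buried and exposed positions.
    pattern = [True, False] * 6
    sequence, energy = design_sequence(pattern)
    print(sequence, energy)
```

Real design tools replace the toy score with physically motivated energy terms and sample side-chain conformations as well as identities, but the accept/reject loop over candidate sequences follows the same general idea.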
4.2 Directed Evolution of Protein Nanomaterials
In contrast to computational protein design, directed evolution is a generic experimental approach widely used in protein engineering. It mimics natural selection and cycles of genetic evolution in a laboratory setting to evolve engineered proteins toward a pre-defined functional feature (MacBeath et al. 1998, Lutz 2010). Directed evolution is used both for protein engineering, as an alternative to rational protein design, and for studies of fundamental evolutionary principles in a controlled laboratory environment (Voigt et al. 2000). It can be performed either in vivo or in vitro and consists of three major steps in each round of selection or evolution (Dalby 2011).
First, a gene is subjected to iterative rounds of mutagenesis, creating a library of variants. Second, the variants are expressed in appropriate host cells, and members with the desired function are isolated and selected for the next step. Third, one selected variant is amplified and used as a template for the next round. In principle, evolution requires three conditions (Barrick et al. 2009). First, there must be variation among the replicated copies of the ancestor gene; second, the variation among offspring must cause fitness differences upon which selection of certain descendants acts; third, the feature introduced by the variation must be heritable in the next round of selection. In directed evolution, a single gene is evolved through multiple rounds of mutagenesis, selection, and amplification. These steps are typically repeated, using the best variant from one round as the template for the next, to achieve incremental improvements in the re-engineered functional features. The chance of success in a directed evolution experiment is directly related to the total library size, as evaluating more mutants increases the chances of finding one with the desired properties. Directed evolution has been widely adopted in the biotechnology industry to re-engineer utility proteins and improve their enzymatic activity, such as the DNA polymerases frequently used in molecular cloning (Arndt and Muller 2007). The combination of directed evolution and computational de novo design will likely open up more opportunities to explore the universe of smart protein nanomaterials.
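The mutagenesis, selection and amplification cycle described above can be sketched as a toy simulation. Everything here is a simplifying assumption: fitness is a hypothetical screen that counts matches to an arbitrary target sequence, mutagenesis is uniform random substitution, and the parent is kept in each library so the best score never decreases. The helper `hit_probability` additionally assumes that library members are independent and that each has the same chance `p_hit` of having the desired property, which gives a chance of at least one hit of 1 - (1 - p_hit)^N for a library of size N.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 amino acids


def fitness(variant, target):
    """Hypothetical screen: number of positions matching a target sequence."""
    return sum(a == b for a, b in zip(variant, target))


def mutate(parent, rng, rate=0.05):
    """Random mutagenesis: each position is resampled with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else residue
        for residue in parent
    )


def directed_evolution(start, target, rounds=10, library_size=200, seed=0):
    """Run repeated rounds of mutagenesis, screening and template selection."""
    rng = random.Random(seed)
    template = start
    history = [fitness(template, target)]
    for _ in range(rounds):
        # Step 1: mutagenesis creates a library of variants.
        library = [mutate(template, rng) for _ in range(library_size)]
        library.append(template)  # keep the parent so fitness never drops
        # Step 2: screen the library and select the best member.
        template = max(library, key=lambda variant: fitness(variant, target))
        # Step 3: the selected variant becomes the template for the next round.
        history.append(fitness(template, target))
    return template, history


def hit_probability(p_hit, library_size):
    """Chance that at least one of `library_size` independent variants is a hit."""
    return 1.0 - (1.0 - p_hit) ** library_size


if __name__ == "__main__":
    best, scores = directed_evolution("A" * 20, "ACDEFGHIKLMNPQRSTVWY")
    print(best, scores)
    print(hit_probability(1e-4, 10_000))
```

In this toy model, larger libraries raise the chance of finding an improved variant in each round, which mirrors the dependence on library size noted in the text.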
4.3 Reprogramming Proteins into Smart Nanomaterials
Building on the principles of rational protein design and directed evolution in protein engineering, one may also integrate other biochemical approaches to modify the protein-protein interactions or oligomerization chemistry and thereby program the self-assembly and functioning of protein nanomaterials. For example, to circumvent the challenge of programming extensive non-covalent interactions to control protein self-assembly, the directionality and strength of metal coordination interactions were exploited to guide the assembly of closed, homo-oligomeric protein nanostructures (Brodin et al. 2012). This strategy was further extended to program protein self-assembly into one-dimensional nanotubes and two- and three-dimensional crystalline arrays. The assembly of these arrays is tunable by external stimuli, such as metal concentration and pH. In another example, an ATP-driven group II chaperonin, which resembles a barrel with a built-in lid, was reprogrammed to open and close on illumination with different wavelengths of light (Hoersch et al. 2013). By engineering photoswitchable azobenzene-based molecules into the structure, light-triggered changes in interatomic distances in the azobenzene moiety are able to drive large-scale conformational changes of the protein assembly. The different states of the assembly can be visualized with single-particle cryo-electron microscopy, and the nanocages can be used to capture and release non-native cargos.
Similar strategies that switch atomic distances with light could be used to build other controllable nanoscale machines.
Small proteins and short peptides have been engineered and designed to self-assemble into various nanostructures. Several surfactant-like peptides undergo self-assembly to form nanotubes and nanovesicles with an average diameter of 30–50 nm, which exhibit helical symmetry (Vauthey et al. 2002). The peptide monomer contains 7–8 residues and has a hydrophilic head composed of aspartic acid and a tail of hydrophobic amino acids such as alanine, valine, or leucine. Similar self-assembling peptides were redesigned to form nanofiber scaffolds (Yokoi et al. 2005). Using the mammalian visual system as a model, it was demonstrated that the nanofiber scaffold could create a permissive environment for axon regeneration at the site of an acute injury, as well as knit the brain tissue together (Ellis-Behnke et al. 2006). The peptide scaffold forms a network of nanofibers similar to the native extracellular matrix, thereby providing a 3D cell-culture environment for cell growth, migration and differentiation. A protein fiber system co-assembled from two peptides exhibits programmable assembly, stability and morphology (Papapostolou et al. 2007). The fibers display surface striations separated by a distance of 4.2 nm, which precisely matches the length expected for the designed α-helical peptides. The spacing of the striations can be programmed by changing the length of the peptides, demonstrating the potential programmability of proteins and peptides in higher-order nanomaterial self-assembly.
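The link between peptide length and striation spacing can be checked with simple arithmetic. The sketch below assumes an ideal α-helix with an axial rise of 0.15 nm per residue, a standard textbook value; the residue count it returns is an inference from the 4.2-nm spacing, not a figure stated in the text, and the function names are illustrative.

```python
HELIX_RISE_NM = 0.15  # assumed axial rise per residue in an ideal alpha-helix


def helix_length_nm(num_residues: int) -> float:
    """Approximate end-to-end length of an ideal alpha-helical peptide."""
    return num_residues * HELIX_RISE_NM


def residues_for_length(length_nm: float) -> int:
    """Nearest whole number of helical residues spanning a given length."""
    return round(length_nm / HELIX_RISE_NM)


if __name__ == "__main__":
    spacing_nm = 4.2  # striation spacing reported in the text
    n = residues_for_length(spacing_nm)
    print(f"{spacing_nm} nm corresponds to about {n} helical residues")
    # Lengthening the peptide by one heptad (7 residues) shifts the spacing.
    print(f"{n + 7} residues span about {helix_length_nm(n + 7):.2f} nm")
```

Under this assumption, a 4.2-nm spacing corresponds to a helix of about 28 residues, and adding or removing residues changes the expected spacing in steps of 0.15 nm.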
Engineering the interface between protein nanomachines and inorganic nanomaterials represents another important avenue for smart biomaterial development (Ishii et al. 2003). A clear understanding of the interactions between proteins and inorganic nanomaterials, such as nanoparticles, solid-state nanopores and nanowires, could make possible the design of precise and versatile hybrid nanosystems. Although the surface properties of inorganic nanomaterials can be modified in only a limited number of ways in specific cases, the interactions between proteins and inorganic nanomaterials are conceptually more programmable, because the large number of possible amino acid sequences gives rise to a wide range of variation in a protein's ability to interact with its inorganic counterpart. In addition, proteins can be covalently bio-conjugated to the surface of inorganic nanomaterials, integrating them into programmable hybrid building blocks (Marco et al. 2010). With greater understanding of the patterns and paradigms at the interfaces between proteins and inorganic nanomaterials, the computational tools used to predict protein-protein assemblies may be adapted to program protein-inorganic-nanomaterial assemblies in the near future.