The present thesis, entitled "Exploring molecular adaptations of extremophilic proteins: A platform for protein engineering", is an attempt in this direction.
Extremophile Protein Database (ExProtDB)
Information about extremophiles, their proteins and contributing genomic and proteomic factors
Understanding the specific codon usage patterns in extremophiles
Creation of comparative datasets and enumeration of statistically significant codons
Prioritizing the codons to understand their preference in extremophiles
Dataset creation and enumeration of statistically significant codons for extremophiles
Analysis of attributes contributing to extreme-stability of proteins
In silico prediction of mutations to attain extreme-stability of proteins and their validation
In vitro validation and characterization of predicted mutations
Secondary structure prediction of wild-type and mutant lipase by FTIR (ATR)
Employing predicted extreme-stable mutants for applications
Testing lipase compatibility with surfactants, oxidizing agents and commercial detergents before and after immobilization
Compatibility of lipase with surfactants, oxidizing agents and commercial detergents
FTIR spectroscopic analysis for characterization of naïve and bioconjugated ZnO nanoparticles
5.4 Interaction of p-NPO with the active site of wild-type and mutant lipases
5.5 Graphical representation of unique contacts formed in wild-type and mutants
The green arrow represents the optimal activity of the wild-type lipase, and the blue arrow represents the optimal activity of the most stable mutant.
Introduction and literature review
Introduction
Organisms that tolerate extreme environments are known as extremophiles and have been present on Earth for 4 billion years1. Extremophiles and their proteins (enzymes) are used in many industries in preference to their conventional mesophilic counterparts, as they offer faster catalysis and higher solubility and reduce microbial contamination in heavy industrial processes15.
Extremophilic adaptations
Further complications arise in understanding the rationale behind the extreme stability of proteins, for example when extremophilic proteins belong to archaea, bacteria and eukaryotes. Therefore, understanding the molecular adaptations of extremophilic proteins is critical for engineering efficient enzymes that are extremely stable.
Multiple factors involved in extreme-stability of proteins
It has been reported that changes in nucleotide composition can have very significant effects on codon usage patterns and thermal stability39,40. Furthermore, it has also been reported that codon usage preferences can be based on a selection that minimizes errors at the protein level.
Proteomic adaptations of extremophiles: role in extreme-stabilization of proteins
The rapid progress in the sequencing of genomes has opened many new avenues of
- Role of amino acid in rendering proteins extreme-stabilization
- The biophysical attributes responsible for extreme-stability of proteins
A decreasing trend in the number of hydrogen bonds at domain interfaces was previously reported in psychrophiles compared to their mesophilic counterparts105–107. Hydrogen bonds of the types H…O and N–H…N are said to have higher energy than other types of hydrogen bonds and to be more biologically important116,117.
Approaches to attain extreme-stable proteins by protein engineering
Reliable recipes for making proteins thermostable through protein engineering approaches are lacking. Therefore, to overcome the disadvantages of such approaches, there is a need for a guided method that can predict extremostabilizing mutations, so that these mutations can then be introduced in vitro by site-directed mutagenesis.
Available theoretical prediction models for predicting stabilizing mutations
To overcome the demerits of directed evolution approaches numerous in silico algorithms
MUSTAB: support vector machine-based, sequence as input, multiple mutations (ref. 192)
AUTO-MUTE: machine learning-based, structure as input, single mutation (ref. 193)
Despite the success of the method, the only drawback is that it is sensitive and specific.
Salient features deduced from engineered extreme-stable mutants to better understand extreme-stability of proteins
Protein stability increases with increasing degree of glycosylation and, to a much lesser extent, with polysaccharide size205. The stabilizing effect depends on the position of the glycans. The barostable and thermostable behavior of the enzyme is consistent with the barophilic growth of M.
Vivid role of chaperones and extremolytes in stabilizing proteins in extreme milieu
Spin-labeling of the active-site serine revealed that the active-site geometry of the M. Molecular chaperones of the Hsp60 and Hsp70 families are the proteins characteristic of the heat-shock response in extremely thermophilic microorganisms of the domain Archaea222.
Conclusion
Apart from the intrinsic properties of a protein that aid extreme stability, molecular chaperones and extremolytes were postulated to contribute as well. The review also showed that extrinsic methods exist to achieve extreme stability.
Origin of the research work
Another question that needs further investigation is the development of an algorithm that can rationalize the design of extremostabilizing mutations for proteins. Further experiments are also required to compare the available methods and to develop an easier way to achieve extreme stability of proteins, i.e., either by modification of intrinsic properties through mutations or by extrinsic engineering, including glycosylation or the addition of extremolytes.
Thesis organization
In this chapter, a comprehensive literature review is reported that provides insight into research on the multifactorial nature of extreme protein stability. This chapter describes the in silico prediction of reliable mutations and their validation in Bacillus subtilis lipase (a selected mesophilic enzyme), in which preferred protein attributes are substituted to achieve extreme protein stability by designing mutants.
Analysis of the genome of an alkaliphilic Bacillus strain from an industrial point of view.
A coarse-grained elastic network atom contact model and its application in simulation of protein dynamics and prediction of the effect of mutations.
Evolution of the chaperonin families (HSP60, HSP10 and TCP-1) of proteins and the origin of eukaryotic cells.
Extremophile Protein Database (ExProtDB)
- Introduction
- Methodology
- Servers and software used
- Organism and genome information
- Collection of protein data and their refinement
- Amino acid composition analysis of all thermostable proteins
- Structural analysis of all extremophilic proteins and feature generation
- Tools and downloads
- Literature and patent
- Data integration
- Features of ExProtDB
- Information about extremophiles, their proteins and contributing genomic and proteomic factors
- Genomic information
- Protein information
- Mutant information
- Physicochemical protein factors
- Literature and patents
- Tools
- Database architecture and integration
- Data collection and analysis: statistics
- Unique ID generation for collected extremophiles and their proteins
- Conclusion
- References
In addition, amino acid composition and intra-protein interaction data are integrated with each protein entry in the database and are searchable for each individual entry. Tools for calculating amino acid composition and intra-protein interactions can also be downloaded.
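The amino acid composition attached to each entry can be illustrated with a short calculation. The following is a minimal sketch, not the tool distributed with ExProtDB: the function name `amino_acid_composition` and the toy sequence are assumptions made for illustration, and composition is expressed as a percentage of the standard residues found.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the percentage of each standard amino acid in a protein sequence."""
    seq = sequence.upper()
    # Non-standard symbols (gaps, X, etc.) are ignored rather than counted.
    counts = Counter(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical toy sequence, used only to show the call.
composition = amino_acid_composition("MKKLLEEA")
```

Reporting all 20 residues, including those with zero counts, keeps the output of different proteins directly comparable.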
Understanding the specific codon usage pattern in extremophiles
Introduction
In addition, codon usage trends in microorganisms often carry signals from the surrounding environment. Thus, the adaptability of the extremophile proteome to extreme environmental conditions depends predominantly on codon usage bias 14, 15, that is, on genomic adaptation through the codon usage pattern 16.
Methodology
- Creation of comparative datasets and enumeration of statistically significant codons
- Statistical analysis of significant codons in different extremophiles
- Relative abundance analysis of codons
- Prioritizing the codons to understand their preference in extremophiles
  The most significant codons in each extremophile class were prioritized or ranked in 1-
- Analysis of AT- or GC-rich and A/T- or G/C-ending codons
- Analysing data-points of highest and lowest ranked codon
- Finding codon harmony among different classes of extremophiles
- Generation of machine learning models to classify and predict extremophilic codons
In the “Relative abundance analysis of codons” section, the codons with a positive weighted mean difference showed a higher preference towards extremophiles and were taken up for the analysis of AT- or GC-rich codons and A/T- or G/C-ending codons. The A/T- or G/C-ending codons in the comparative datasets were identified by examining the nucleotide (A, T, G or C) at the third position of each codon. In the section “Prioritizing the codons to understand their preference in extremophiles”, the resulting highest and lowest ranked significant codons of each dataset were used for data point analysis by plotting their percentage scores in their respective CDS.
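The third-position analysis described above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the rule that a codon is GC-rich when at least two of its three nucleotides are G or C is an assumption, since the threshold is not stated here.

```python
def gc_content(codon):
    """Number of G or C nucleotides in a codon (0 to 3)."""
    return sum(base in "GC" for base in codon.upper())


def classify_codon(codon):
    """Label a codon by overall richness and by its third-position nucleotide."""
    codon = codon.upper().replace("U", "T")  # accept RNA or DNA spelling
    if len(codon) != 3 or any(base not in "ACGT" for base in codon):
        raise ValueError(f"not a valid codon: {codon!r}")
    # Assumption: two or more G/C nucleotides make a codon GC-rich.
    richness = "GC-rich" if gc_content(codon) >= 2 else "AT-rich"
    ending = "G/C-ending" if codon[2] in "GC" else "A/T-ending"
    return richness, ending


labels = {codon: classify_codon(codon) for codon in ["GCG", "AAA", "GCA", "AUG"]}
```

Keeping richness and ending as separate labels lets a codon such as GCA count as GC-rich while still being A/T-ending.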
Results and Discussion
- Dataset creation and enumeration of statistically significant codons for extremophiles
- Analysing relative abundance of codons among extremophiles and non-extremophiles
- Understanding codon preferences in extremophiles by ranking them on a 1-9 scale
  The statistically significant codons were grouped out into 1 to 9 ranks according to their
- Analysis of AT- or GC-rich and A/T- or G/C-ending codons
- Exploration of codon harmony among various extremophiles
- Generation of machine learning models to classify and predict extremophilic codons
The 1-9 rank analysis identified the highest (i.e., rank 9) and lowest (i.e., rank 1) ranked codons in the comparative datasets. The analysis showed significant variability in the highest and lowest ranked codons across all extremophile types. Nevertheless, different types of extremophiles showed commonalities in their codon usage patterns, which are termed codon harmony.
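The ranking and the search for shared codons can be sketched as follows. The scoring criterion behind the 1-9 ranks is not given in this section, so the sketch assumes equal-width binning of a per-codon score, and it interprets codon harmony as the set of codons that reach the top rank in every class. Both choices, together with the function names and the example values, are assumptions for illustration.

```python
def rank_on_scale(scores, levels=9):
    """Map each codon's score to an integer rank from 1 to `levels` (equal-width bins)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {codon: 1 for codon in scores}
    width = (hi - lo) / levels
    ranks = {}
    for codon, score in scores.items():
        rank = int((score - lo) / width) + 1
        ranks[codon] = min(rank, levels)  # the maximum score falls in the top bin
    return ranks


def codon_harmony(ranks_by_class, min_rank=9):
    """Codons ranked at least `min_rank` in every extremophile class."""
    top_sets = [
        {codon for codon, rank in ranks.items() if rank >= min_rank}
        for ranks in ranks_by_class.values()
    ]
    return set.intersection(*top_sets) if top_sets else set()


# Hypothetical scores and ranks, used only to show the calls.
ranks = rank_on_scale({"GAG": 9.0, "CTG": 4.5, "AAA": 0.0})
shared = codon_harmony({
    "thermophile": {"GAG": 9, "AAA": 1},
    "halophile": {"GAG": 9, "GAC": 9},
})
```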
Conclusion
Here, the main advantages of using decision tree prediction were that it (i) reduces ambiguity in decision making, (ii) provides alternatives to possible courses of action, and (iii) is easy to interpret.
Amino acid contacts in proteins adapted to different temperatures: Hydrophobic interactions and surface charges play a key role.
Codon usage regulates protein structure and function by influencing translation elongation speed in Drosophila cells.
Investigating the causes of codon and amino acid usage variation between thermophilic Aquifex aeolicus and mesophilic Bacillus subtilis.
Analysis of attributes contributing to extreme-stability of proteins
Introduction
Alkaliphiles adapt through increased surface exposure of acidic residues, which makes the net charge of the molecule negative. Charged accessible surface, hydrogen bonds, hydrophobic interactions, electrostatic interactions and salt bridges were found to make the structure of proteins more robust and stable in most extremophiles, such as thermophiles, alkaliphiles, barophiles and halophiles. Therefore, this chapter is devoted to collecting comparative datasets of homologous extremophilic and non-extremophilic proteins that have available 3D structures, followed by differentiating the key attributes.
Materials and methods
- Data collection and enumeration of statistically significant features
- Statistical analysis of feature selection using Kolmogorov-Smirnov test
- Finding relative abundance of selected significant features
- Classification of extremophilic proteins by employing machine learning analysis using statistically significant attributes
- Ranking of attributes contributing to extreme-stability through multi-criteria decision making approach
- Testing the performance of AHP generated ranking model by blind test
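The two-sample Kolmogorov-Smirnov statistic used for feature selection can be sketched as the largest gap between the empirical distribution functions of a feature in the two protein groups. The function name and the example values below are hypothetical, and the sketch returns only the statistic D. In practice a library routine such as `scipy.stats.ks_2samp` would also supply the p-value needed to judge significance.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D = max |F_a(x) - F_b(x)|."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical feature values (e.g. salt bridge counts) for two protein groups.
d_separated = ks_statistic([1, 2, 3], [4, 5, 6])
d_overlap = ks_statistic([1, 2], [2, 3])
```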
The relative priority vectors (eigenvectors) were calculated to find the comparative preference of each feature in both extremophilic and non-extremophilic proteins. FP is the number of non-extremophilic proteins that the model incorrectly identifies as extremophilic proteins. FN is the number of extremophilic proteins that the model misidentifies as non-extremophilic proteins.
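A priority vector of this kind can be approximated from a pairwise comparison matrix by power iteration, and the FP and FN counts feed the usual sensitivity and specificity measures. The sketch below is illustrative rather than the procedure used in this work: the function names, the 3 x 3 matrix and the counts are hypothetical, and the consistency check that normally accompanies AHP is omitted.

```python
def ahp_priority_vector(matrix, iterations=100):
    """Approximate the principal eigenvector of a pairwise comparison matrix.

    The result is normalized so that the priorities sum to 1.
    """
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        new = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(new)
        weights = [value / total for value in new]
    return weights


def sensitivity(tp, fn):
    """Fraction of extremophilic proteins that are correctly identified."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-extremophilic proteins that are correctly identified."""
    return tn / (tn + fp)


# Hypothetical comparison of three features; entry [i][j] states how strongly
# feature i is preferred over feature j.
comparisons = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
priorities = ahp_priority_vector(comparisons)
```

For a perfectly consistent matrix such as this one, the priorities equal the underlying weights (here 0.6, 0.3 and 0.1); real judgement matrices are only approximately consistent.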
Results and discussion
- Data collection and statistically significant feature generation
- Relative abundance of protein feature in extremophiles
- Machine learning model generation for extremophilic proteins
- Generation of ranking models for protein extreme attributes by AHP
- Blind test analysis: validation of generated ranking model
- Salient findings
Logistic regression, support vector machines and artificial neural networks were used to build a model for extreme protein stability. The goal was to rank or prioritize protein properties according to their contribution to extreme protein stability; using this method, all significant features were prioritized by their importance.
Conclusion
Machine learning algorithms for predicting protein folding rates and mutant protein stability: Comparison with statistical methods.
This chapter describes the in silico prediction of plausible mutations and their validation in Bacillus subtilis lipase (a selected mesophilic enzyme), in which preferred protein attributes are substituted to achieve extreme protein stability by designing in silico homology models. In silico validation of the mutants was done using AHP-generated ranking models, various mutation prediction servers (HotSpot Wizard, I-Mutant2, CUPSAT, iPTREE-STAB, WET-STAB and ERIS web servers), substrate docking (AutoDock version 4.2), Ramachandran analysis (PROCHECK) and contact map analysis (CMView 1.1.1).
Introduction
This involves a series of cumbersome experimental techniques to mutate and recombine the protein and to select variants at random based on a particular property2. In our previous study, we developed a two-step protocol to perform site-directed mutagenesis for engineering thermostable proteins. Therefore, the goal was to validate the AHP ranking models in the design of extremostabilizing mutations that can be introduced by site-directed mutagenesis to increase the extreme stability of the lipase.
Methodology
- Selection of mesophilic candidate enzyme for experimentation
- In silico prediction and validation of mutations
- Predicted mutations in Bacillus subtilis lipase
- In silico validation of designed mutations through AHP ranking models
- Validation of mutants through various mutation prediction tools/servers
- Validation of mutants by Ramachandran plot
- Structure superimposition validation of generated mutations
- Enzyme-substrate docking analysis for functional efficiency
- Contact map analysis for enumeration of interactions in lipases
- In silico prediction of secondary structure of mutants
Finally, the lipases were validated by secondary structure prediction, in terms of the helix and sheet content of wild-type and mutant lipases, using the PSIPRED server39. The rotations of the polypeptide backbone of wild-type and mutant lipases around the N-Cα bond (phi, φ) and the Cα-C bond (psi, ψ) were determined by Ramachandran plot analysis.
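The φ and ψ angles plotted in a Ramachandran analysis are dihedral angles over four consecutive backbone atoms: C(i-1), N(i), Cα(i), C(i) for φ and N(i), Cα(i), C(i), N(i+1) for ψ. The following sketch shows the underlying geometry only. It is not the PROCHECK implementation, the function names are hypothetical, and atom coordinates are assumed to be available as (x, y, z) tuples.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points, in the range -180 to 180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """Backbone phi: rotation around the N-CA bond."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """Backbone psi: rotation around the CA-C bond."""
    return dihedral(n, ca, c, n_next)
```

Each residue's (φ, ψ) pair then gives one point on the Ramachandran plot.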
Conclusion
That is, local interactions dominate in a helix, while a sheet is stabilized by long-distance contacts. The amino acids contributing to the formation of helices and sheets were found to be increased in the wild-type lipases, expected from psychrophilic mutants (bsl_psy1 and bsl_psy2). An increased content of α-helices and β-sheets was reported, which stiffens the lipase mutants by reducing the content of random coil and enables the protein to withstand extreme conditions.
- Strains, plasmids, enzymes, reagents and instruments
- Cloning and mutagenesis of wild-type and mutant enzymes
- Kinetic characterization of wild-type and mutant enzymes
- Differential scanning fluorimetry analysis for determining unfolding (Tm value)
  Differential scanning fluorimetry (DSF) can identify the temperature at which a protein
Cloning and mutagenesis of wild-type and mutant lipases were confirmed by restriction digestion and sequence analysis (1st BASE DNA Sequencing Services, Singapore). The recombinant plasmids of wild-type and mutant lipases were purified using the StrataPrep Plasmid Miniprep Kit 9. Wild-type and mutant lipases were purified from the supernatants obtained by affinity chromatography using Ni-NTA columns10.
Results and Discussion
- Cloning and in vitro mutagenesis of Bacillus subtilis lipase
- Enzyme expression and purification
This assay was performed for all lipases to measure the melting temperature (Tm) of the wild type and the mutants. A 20 µl spot of lipase solution was placed on one side of the ATR, the peak positions in the amide I range were measured, and the entire experiment was performed at room temperature. Kinetic stability, melting temperature and secondary structure analysis confirmed the robustness of the mutants under extreme conditions compared to the wild type.
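One common way to read a melting temperature from a thermal unfolding curve is to take the temperature at which the signal rises most steeply. The sketch below assumes this first-derivative approach and a signal that increases on unfolding. It is not necessarily the analysis used for the lipases here, and the function name and the synthetic curve are hypothetical.

```python
import math


def estimate_tm(temperatures, fluorescence):
    """Estimate Tm as the midpoint of the interval with the steepest signal increase."""
    if len(temperatures) != len(fluorescence) or len(temperatures) < 2:
        raise ValueError("need two equally long series with at least two points")
    best_slope, tm = None, None
    for i in range(len(temperatures) - 1):
        dt = temperatures[i + 1] - temperatures[i]
        if dt <= 0:
            raise ValueError("temperatures must be strictly increasing")
        slope = (fluorescence[i + 1] - fluorescence[i]) / dt
        if best_slope is None or slope > best_slope:
            best_slope = slope
            tm = (temperatures[i] + temperatures[i + 1]) / 2
    return tm


# Synthetic two-state curve with a hypothetical midpoint of 55.5 degrees Celsius.
temps = [float(t) for t in range(30, 81)]
signal = [1 / (1 + math.exp(-(t - 55.5) / 2.0)) for t in temps]
tm = estimate_tm(temps, signal)
```

With measured data, smoothing the curve before differentiation is usually advisable, since noise can otherwise produce a spurious steepest interval.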