

Introduction and literature review

1.3 Multiple factors involved in extreme-stability of proteins

Current research focuses mainly on genomic and proteomic adaptations of extremophiles, and each level comprises numerous attributes that require further exploration31. Our understanding of extremophilic adaptation has also improved significantly with the availability of complete genome sequences and high-resolution three-dimensional protein structures. Most work in this direction has concluded that the stability of extremophilic proteins results from the cumulative effect of various stabilizing factors, and that each protein takes a different route to extreme-stability, governed by its biophysical properties. Figure 1.1 illustrates the contribution of the various factors to extreme-stability reported to date.

Thermophilic adaptations appear to be the most versatile of all, drawing on the majority of the listed stabilization features. Because the bulk of the literature concerns thermophilic adaptations, further exploration of other types of extremophilicity will help build a better picture of how robust the adaptations of other extremophiles are. It is also interesting to note that thermophiles and psychrophiles rely on opposite features to enhance their stability. For example, thermostability increases with protein rigidity, whereas psychrophilicity increases with flexibility.

Similarly, alkaliphiles show an increase in basic amino acid residues, whereas acidophiles show an increase in acidic amino acid residues. Another complication in understanding extreme-stability is that certain proteins do not behave as their source organism would suggest: for example, lipases from Candida antarctica, which would be expected to be psychrophilic, are actually thermostable. They are stable owing to rigidity, a higher number of van der Waals interactions, a reduction in electrostatic interactions and an increase in hydrogen bonding32. Therefore, further in-depth knowledge of extreme-stability is needed before firm conclusions can be drawn about the range of features that distinguish the different types of extreme-stable proteins.

Figure 1.1: Positive contribution of features related to genome and proteome of extremophilic organisms towards extreme-stability17,33–36. The type of extremophile has been color coded. The figure has been deduced from the available data in literature. It is interesting to note that thermophiles are the most robust in adapting to elevated temperature conditions by employing the maximum number of versatile.

1.4 Genomic adaptations of extremophiles: codon adaptability and GC-content

A large repertoire of extremophilic organisms has been identified. However, only a few of them have been successfully sequenced. A search of NCBI Genome with the keywords “thermophilic”, “hyperthermophilic”, “psychrophilic”, “halophilic”, “alkaliphilic”, “acidophilic” and “barophilic” returned, as of 2017, a total of 156 hits for thermophiles and hyperthermophiles, 14 hits for psychrophiles, 70 hits for halophiles, 13 hits for alkaliphiles, 29 hits for acidophiles and 8 hits for barophiles. These search results indicate that most of the work to understand extreme-stability has been carried out on thermophiles.
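The imbalance in these hit counts can be made explicit with a short tally. The sketch below uses only the numbers quoted above; treating the six categories as non-overlapping is an assumption made for illustration, since a single genome could in principle match more than one keyword.

```python
# Hit counts quoted in the text (NCBI genome search, as of 2017).
# Assumption: the six categories are treated as non-overlapping.
hits = {
    "thermophiles and hyperthermophiles": 156,
    "psychrophiles": 14,
    "halophiles": 70,
    "alkaliphiles": 13,
    "acidophiles": 29,
    "barophiles": 8,
}

total = sum(hits.values())
shares = {group: count / total for group, count in hits.items()}

# Print the categories from most to least represented.
for group, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{group:36s} {hits[group]:4d}  {share:6.1%}")
```

Under that assumption, thermophiles and hyperthermophiles account for a little over half of the 290 hits.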

The genomes of extremophiles have evolved through lateral gene transfer from bacteria, by orthologous replacement or by incorporation of paralogous genes37. Researchers compare extremophiles on the basis of genomic features such as genome size, gene order, codon usage bias and GC-content to determine which mechanisms could have produced the great variety of genomes that exist today38. Comparative analysis of the genomes of extremophiles and non-extremophiles reveals some common trends in extremophiles that cross phylogenetic boundaries, in features such as genome size, DNA repair systems, DNA stabilizers, genomic GC-content and genome supercoiling. Sequencing and analysis of extremophile genomes have shed light on their genetic evolution.

It has been reported that variations in nucleotide composition can have very significant effects on the patterns of codon usage and on thermostability39,40. Thermophiles and hyperthermophiles have been reported to have a high GC-content41,42. Correspondingly, codons such as ATC, CGG, TTG and CGC, which are G/C-ending and therefore likely to give higher stability to the codon–anticodon interaction, have been reported to be more frequent in thermophiles than in mesophiles43. Zeldovich et al. concluded that an increase in the purine (A+G) content of thermophilic bacterial genomes, arising from a preference for isoleucine, valine, tyrosine, tryptophan, arginine, glutamate and leucine, which have purine-rich codon patterns, is a possible primary adaptation mechanism for thermophilicity44. These amino acid residues increase the content of hydrophobic and charged amino acids, enhancing thermostability. In contrast, acidophiles prefer pyrimidine-rich codons45. Accordingly, the genomes of thermoacidophiles such as P. torridus have evolved by reducing the purine-containing codons in long open reading frames46. A comparison of the base composition and the codon and amino acid usage of thermophilic Aquifex aeolicus with those of Bacillus subtilis shows a significant increase in purine content and GC composition in codon selection. This type of alteration in base composition also influences the codon and amino acid usage of A. aeolicus47.

Like the genomes of hyperthermophiles and thermophiles, those of halophiles also have a high GC content, of around 60 to 70%, which can avoid ultraviolet-induced thymidine dimer formation and mutations48. At the DNA level, compared to non-halophilic genomes, halophiles exhibit distinct dinucleotides (CG, GA/TC, and AC/GT) at the first and second codon positions, reflecting an abundance of aspartate, glutamate, threonine and valine residues in halophile proteins, which contributes to their stability49. The presence of high levels of CG dinucleotides leads to an increase in stacking energy and, thus, genome stability. Taken together, these reports show that GC content alone does not provide a clear basis for selection for extreme-stability.
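The compositional measures discussed above (GC content, purine content, G/C-ending codons and the dinucleotide at the first two codon positions) are simple to compute for a coding sequence. The following is a minimal sketch, not the method used in the cited studies; the function names and the toy open reading frame are hypothetical, and the input is assumed to be an in-frame sequence containing only A, C, G and T (or U).

```python
from collections import Counter


def _clean(seq):
    """Upper-case the sequence and write RNA 'U' as DNA 'T'."""
    return seq.upper().replace("U", "T")


def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = _clean(seq)
    return sum(base in "GC" for base in seq) / len(seq)


def purine_content(seq):
    """Fraction of bases that are purines (A or G)."""
    seq = _clean(seq)
    return sum(base in "AG" for base in seq) / len(seq)


def codons(seq):
    """Split an in-frame sequence into codons, dropping any trailing partial codon."""
    seq = _clean(seq)
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def gc3_content(seq):
    """Fraction of codons that end in G or C (G/C-ending codons)."""
    triplets = codons(seq)
    return sum(codon[2] in "GC" for codon in triplets) / len(triplets)


def leading_dinucleotides(seq):
    """Count the dinucleotide at codon positions 1 and 2 for every codon."""
    return Counter(codon[:2] for codon in codons(seq))


# Hypothetical toy ORF for illustration only: ATG GAC GAA ACC GTC CGC TAA
toy_orf = "ATGGACGAAACCGTCCGCTAA"
print(f"GC = {gc_content(toy_orf):.3f}")
print(f"purine (A+G) = {purine_content(toy_orf):.3f}")
print(f"GC3 = {gc3_content(toy_orf):.3f}")
print(leading_dinucleotides(toy_orf).most_common(3))
```

In the toy sequence the GA dinucleotide at positions 1 and 2 comes from the aspartate (GAC) and glutamate (GAA) codons, and AC and GT come from threonine and valine codons, which is the link between dinucleotide and amino acid composition described above. A real analysis would run over all annotated coding sequences of a genome rather than a single short ORF.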

Further, it has been reported that codon usage preferences might be based on error-minimizing selection at the protein level. In this view, the codons favoured in thermophilic adaptation are those in which a mutation tends to yield a similar amino acid, and hence a similar protein conformation, so that the mutation is less deleterious than others. For example, a change from an Arg codon to a Lys codon has a less deleterious effect on protein conformation because the two amino acids have similar properties50. Arginine is encoded by two sets of codons, AGR (AGA and AGG) and CGN (CGU, CGC, CGA and CGG), and a bias is also seen between these two sets. Whenever lysine increases as part of thermophilic adaptation, AGR codons are the preferred codons for arginine51. This reflects positive error minimization between arginine and lysine codons (AAA and AAG): a single mutation at the second position can turn an AGR arginine codon into a lysine codon, a comparatively benign substitution. The decrease in CGN codon usage in organisms that live at high temperatures can, in turn, be interpreted as negative error minimization51, since a single mutation at the second position could turn a CGN arginine codon into a codon for histidine (CAU and CAC) or glutamine (CAA and CAG), which can be harmful for thermostability51.
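The second-position argument can be checked directly against the standard genetic code. The sketch below is an illustration rather than the analysis of the cited work: it builds the standard codon table, enumerates every single substitution at the second position of the six arginine codons, and reports the resulting amino acids. Codons are written in the DNA alphabet (T in place of U), and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # TNN
    "LLLLPPPPHHQQRRRR"   # CNN
    "IIIMTTTTNNKKSSRR"   # ANN
    "VVVVAAAADDEEGGGG"   # GNN
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def second_position_mutants(codon):
    """Map each single substitution at codon position 2 to its amino acid."""
    first, second, third = codon
    return {
        first + base + third: CODON_TABLE[first + base + third]
        for base in BASES
        if base != second
    }


AGR = ["AGA", "AGG"]                 # arginine codons of the AGR set
CGN = ["CGT", "CGC", "CGA", "CGG"]   # arginine codons of the CGN set

for codon in AGR + CGN:
    print(codon, second_position_mutants(codon))
```

For the G to A change at the second position, the AGR codons become AAA and AAG (lysine, K), whereas the CGN codons become CAT and CAC (histidine, H) or CAA and CAG (glutamine, Q), matching the contrast drawn above.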

Additionally, DNA was observed to possess positive supercoils, resulting in greater stability52. Genome size has also been reported to matter for extreme stability: the genomes of acidophiles are smaller than those of neutrophiles. For example, the smallest genome belongs to the thermoacidophilic Thermoplasmatales (<2 Mb)53. Tyson et al. reported a variety of genes in acidophiles involved in membrane biosynthesis, indicative of the acid tolerance of these microorganisms.

Similarly, alkali-stable genomes are adapted to maintain pH homeostasis. The complete genomes of Bacillus subtilis and Bacillus halodurans C-125 have been sequenced, and genes responsible for the alkaliphily of B. halodurans C-125 and Bacillus firmus OF4 have been analyzed54. In their genomes, several open reading frames for Na+/H+ antiporters responsible for pH homeostasis in alkaliphiles have been characterized55. The tupA gene, identified in the B. halodurans genome, is responsible for the synthesis of teichuronopeptide, a major structural component of the cell wall that is important for maintaining pH homeostasis54.

To the best of our knowledge, little work on genomes from barophilic microorganisms has been reported. Reports on barophilic adaptations can be found in the work of Di Giulio, who began by comparing homologous genomic sequences of barophilic Pyrococcus abyssi and non-barophilic Pyrococcus furiosus and reported that GC-rich codons were significant in barophiles56. It has also been reported that barophilic microorganisms require robust DNA repair systems to survive in such environments: because high pressure can damage DNA and proteins, survival requires either avoidance of damage or high repair rates14,57. Additionally, pressure-regulated operons (ompH) responsible for high-pressure adaptation have evolved in barophilic genomes58,59.

In conclusion, comparisons between extremophile and non-extremophile genomes suggest that differences in codon usage patterns confer partial resistance to changes in the concentration of a given amino acid. This buffering capacity might explain the observed differences in codon usage trends between the genes of the two sets of organisms35. Such studies help in understanding preferred codon usage in an organism, and they are powerful tools for improving function prediction and genome-environment mapping35. However, knowledge of extremophile genomes and global codon usage is still lacking. Moreover, codon usage studies have mainly been carried out for thermophiles, without a conclusive finding as to which codons are preferred in thermophiles.

1.5 Proteomic adaptations of extremophiles: role in extreme-stabilization of proteins