
RankProt: Validation and Accuracy

To validate the thermostability predictor, a first blind test was carried out on a set of 100 thermostable/mesostable protein pairs randomly chosen from the dataset as alternatives (Appendix III, Table A3.3). Table 4.3 presents the accuracy and ranking capacity of the predictor.

Table 4.3. Validation and accuracy of RankProt

| Blind test set | No. of proteins | Homology | Accuracy (%) | Mean rank | Rank difference |
|---|---|---|---|---|---|
| 1 | 100 TP | Homologous | 91 | 0.54 | 0.09 |
| | 100 MP | Homologous | | 0.45 | |
| 2 | 40 MP | Homologous | | 0.49 | 0.003 |
| | 40 MP | Homologous | | 0.50 | |

The accuracy of RankProt for thermostability was calculated to be 91%. In a second blind test, in which mesostable-mesostable protein pairs were compared, the rank value difference was 0.003, far smaller than the difference of 0.09 obtained for the thermostable-mesostable pairs. The method can therefore differentiate thermostable proteins from their mesostable homologues. The larger rank value difference for the thermostable-mesostable set also strongly indicates that the priorities assigned to the 17 intra-protein interactions are correct, and that these priority vectors can be used to distinguish thermostabilizing from non-thermostabilizing mutations. A mutated protein that attains a rank of 0.54 or higher when compared with its mesostable homologue through this method is predicted to be thermostabilized.
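The text does not spell out how the accuracy and the rank value difference were computed. The following minimal Python sketch shows one plausible reading. It assumes that a pair counts as correct when the thermostable member receives the higher rank, and that the rank difference is the mean of the per-pair differences. The function names and the example values are hypothetical and are not taken from Table A3.3.

```python
from statistics import mean


def pair_accuracy(pairs):
    """Percentage of pairs in which the thermostable protein outranks
    its mesostable partner (assumed definition of accuracy).

    pairs: list of (rank_thermostable, rank_mesostable) tuples.
    """
    correct = sum(1 for tp_rank, mp_rank in pairs if tp_rank > mp_rank)
    return 100.0 * correct / len(pairs)


def mean_rank_difference(pairs):
    """Mean of (thermostable rank - mesostable rank) over all pairs."""
    return mean(tp_rank - mp_rank for tp_rank, mp_rank in pairs)


# Hypothetical rank values, for illustration only.
example_pairs = [(0.56, 0.44), (0.53, 0.47), (0.48, 0.52), (0.55, 0.45)]

print(pair_accuracy(example_pairs))
print(round(mean_rank_difference(example_pairs), 3))
```

Under this reading, a set of pairs whose members are equally stable should give a mean difference close to zero, which is the behaviour reported for the mesostable-mesostable set.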

The third blind test involved 5 thermostable mutants of Bacillus subtilis lipase with optimum temperatures above 50°C. These were ranked against their mesostable wild-type counterpart (1i6w) using RankProt. The results are presented in Table 4.4.

Table 4.4. Ranking of thermostable mutants of Bacillus subtilis lipases

| Sl. No. | Mutant | Rank | T (°C)* | No. of mutations | Mutations | Method of mutation |
|---|---|---|---|---|---|---|
| 1 | 1t2n | 0.546 | 55 | 3 | L114P, A132D, N166Y | Error-prone PCR |
| | 1i6w | 0.453 | 35 | 0 | Wild type | |
| 2 | 1t4m | 0.541 | 55 | 2 | A132D, N166Y | Error-prone PCR |
| | 1i6w | 0.458 | 35 | 0 | Wild type | |
| 3 | 3d2c | 0.573 | 60 | 9 | A15S, F17S, A20E, N89Y, G111D, L114P, A132D, I157M, N166Y | Directed evolution |
| | 1i6w | 0.426 | 35 | 0 | Wild type | |
| 4 | 3qmm | 0.566 | 78 | 12 | A15S, F17S, A20E, N89Y, G111D, L114P, A132D, M134E, M137P, I157M, S163P, N166Y | Directed evolution |
| | 1i6w | 0.433 | 35 | 0 | Wild type | |
| 5 | 3qzu | 0.537 | 60 | 7 | R33Q, D34N, K35D, K112D, M134D, Y139C, I157M | Iterative saturation mutagenesis with randomization sites chosen on the basis of the highest B-factors |
| | 1i6w | 0.462 | 35 | 0 | Wild type | |

*T: Temperature

RankProt clearly assigned higher ranks to the thermostable mutants. Because the ranks are relative and are obtained by matrix normalization with column sums equal to 1, the rank value of the mesostable protein is not constant but differs from case to case. The rank of the subject protein is therefore always judged against the rank of its own reference protein. These results show that RankProt can correctly identify thermostabilizing mutants. Moreover, it can evaluate the effect of multiple mutations in a single run, and it can provide information on the changes in the physicochemical properties of the protein that led to the enhanced thermal stability of the mutated enzyme structure.
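The effect of column normalization on the ranks can be illustrated with a minimal Python sketch. It is not the RankProt implementation: it assumes that each criterion's raw values are divided by their sum over the alternatives, that a higher raw value is favourable for every criterion, and that the final rank is the priority-weighted sum of these shares. The three feature values and the weights below are hypothetical (RankProt uses 17 features).

```python
def rank_alternatives(features, weights):
    """Rank alternatives by a weighted sum of column-normalized features.

    features: dict mapping an alternative's name to its list of raw
              feature values, one per criterion.
    weights:  criterion priorities, assumed to sum to 1.
    """
    names = list(features)
    scores = {name: 0.0 for name in names}
    for j, weight in enumerate(weights):
        # Normalize criterion j so the shares of all alternatives sum to 1.
        column_sum = sum(features[name][j] for name in names)
        for name in names:
            if column_sum:
                share = features[name][j] / column_sum
            else:
                # Criterion absent in every alternative: split equally.
                share = 1.0 / len(names)
            scores[name] += weight * share
    return scores


# Hypothetical feature values and priorities, for illustration only.
example_features = {"mutant": [12, 80, 30], "wild_type": [8, 80, 34]}
example_weights = [0.5, 0.3, 0.2]

ranks = rank_alternatives(example_features, example_weights)
print(ranks)
```

Because every column is normalized to 1 and the weights sum to 1, the ranks of the two alternatives always add up to 1. This is why the wild-type rank changes with the mutant it is compared against, as seen for 1i6w in Table 4.4.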

The fourth blind test involved a total of 104 mutated and wild-type structures of bacteriophage T4 lysozyme retrieved from the PDB. The wild-type structure, 2lzm, has a Tm of 41.9°C (Matsumura et al. 1989). All the mutants were gain-of-function mutants with reported increases in stability. To test RankProt, the mutants and the wild type were therefore ranked (Appendix III, Table A3.4).

RankProt performed extremely well in identifying the stable mutants, assigning them higher ranks than the wild type. A snippet of the full table is presented in Table 4.5. Of the 104 proteins ranked, 99 were assigned the correct rank.

Table 4.5. Snippet of the ranking obtained for bacteriophage T4 lysozyme and its mutants

| Mutant | T (°C)* | Rank_Mutant | Rank_wild type |
|---|---|---|---|
| 1dya | 53.08 | 0.52 | 0.43 |
| 1dyb | 53.08 | 0.5 | 0.45 |
| 1dyc | 68.3 | 0.52 | 0.43 |
| 1dyd | 64.7 | 0.5 | 0.45 |
| 1dye | 53.08 | 0.51 | 0.44 |
| 1dyf | 53.08 | 0.5 | 0.44 |
| 1dyg | 53.08 | 0.51 | 0.44 |
| 1l00 | 66.3 | 0.5 | 0.45 |
| 1l02 | 53.6 | 0.51 | 0.44 |
| 1l03 | 42 | 0.5 | 0.42 |
| 1l04 | 42 | 0.52 | 0.43 |
| 1l06 | 53.6 | 0.5 | 0.42 |

*T: Temperature

The fifth blind test involved a total of 47 structures of loss-of-stability mutants of human lysozyme (PDB ID: 1lz1). They were ranked against the wild type, which has a melting temperature of 64.9°C (Takano et al. 1997). In principle, the mutants should therefore be ranked lower than the wild type. Of the 47 structures, 42 were given lower ranks than the wild-type protein. Table 4.6 is a snippet of the full table provided in Appendix III, Table A3.5. RankProt is thus able to efficiently recognize thermostabilizing mutants.

In conclusion, when predicted mutations are ranked with RankProt, thermostable mutants are expected to receive higher ranks, just as thermostable proteins are expected to rank higher than their mesostable counterparts.

When a protein and its mutated structure are considered as alternatives for ranking, the mutated protein will receive a higher rank if and only if the mutation contributes positively towards thermostability by increasing the features ranked highest through AHP. RankProt therefore serves both purposes: identifying thermostable mutations and classifying thermostable proteins.

Table 4.6. Snippet of the ranking obtained for human lysozyme and its mutants

| Mutant | T (°C)* | Rank_Mutant | Rank_wild type |
|---|---|---|---|
| 2bqb | 46 | 0.39 | 0.41 |
| 2bqc | 47 | 0.38 | 0.41 |
| 2bqd | 42.9 | 0.39 | 0.41 |
| 2bqe | 44.3 | 0.39 | 0.41 |
| 2bqf | 46 | 0.39 | 0.41 |
| 2bqg | 46 | 0.39 | 0.41 |
| 2bqh | 49.5 | 0.39 | 0.41 |
| 2bqi | 39.9 | 0.39 | 0.41 |
| 2bqj | 42.2 | 0.39 | 0.41 |
| 2bqk | 44.4 | 0.39 | 0.41 |
| 2bqm | 47 | 0.39 | 0.41 |
| 2bqn | 44 | 0.39 | 0.41 |

*T: Temperature

4.5. Conclusions

Directed evolution approaches often do not provide a minimalist design for obtaining a desired property in proteins. Furthermore, thermostabilizing protein features have not previously been ranked or prioritized, because the interplay of the factors involved in protein stability makes this complicated. Moreover, the literature still lacks information on which factors are most important in rendering proteins stable in such an extreme milieu. In this work we have tried to prioritize a statistically significant set of 17 physicochemical features for thermostability prediction through a novel multi-criteria decision-making approach, the Analytic Hierarchy Process (AHP). This work also produced a scoring model that can differentiate thermophilic from mesophilic proteins. A set of 127 structurally homologous thermostable/mesostable protein structures formed the final dataset. The problem was decomposed into hierarchies and the factors were ranked with the aid of eigenvectors.
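The eigenvector step of AHP can be sketched in a few lines of Python. The example below is a generic illustration and not the code used in this work: it estimates the principal eigenvector of a reciprocal pairwise comparison matrix by power iteration and normalizes it to sum to 1. The 3 x 3 matrix and its judgement values are hypothetical, whereas the actual analysis involved 17 features.

```python
def ahp_priorities(matrix, iterations=100):
    """Approximate the AHP priority vector (principal eigenvector,
    normalized to sum to 1) of a pairwise comparison matrix by
    power iteration."""
    n = len(matrix)
    vector = [1.0 / n] * n
    for _ in range(iterations):
        # Multiply the matrix by the current estimate.
        product = [
            sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)
        ]
        # Rescale so that the priorities sum to 1.
        total = sum(product)
        vector = [value / total for value in product]
    return vector


# Hypothetical, perfectly consistent comparison matrix: feature 1 is
# judged twice as important as feature 2 and four times as important
# as feature 3.
comparison = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

priorities = ahp_priorities(comparison)
print([round(p, 3) for p in priorities])
```

For a perfectly consistent matrix such as this one, the priorities are proportional to the entries of any column (here 4:2:1). With real judgements the matrix is usually only approximately consistent, and the principal eigenvector provides the compromise weights.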

Ionic interactions and main chain to main chain hydrogen bonding were given the highest priority for conferring thermal stability. The priority vectors for thermostability were then used to develop the tool RankProt, which was validated through blind tests. A random set of 100 thermostable/mesostable protein pairs was ranked, and the thermostable proteins were assigned a mean rank value of 0.54. The accuracy of the method was calculated to be 91%.

Three case studies were carried out to check the efficiency of RankProt, with 5 thermostable mutants of Bacillus subtilis lipase, 100 thermostable mutants of bacteriophage T4 lysozyme and 47 loss-of-stability mutants of human lysozyme. In all three cases RankProt successfully recognized the thermostabilizing mutants, showing that the method can identify thermostabilizing mutations. A further advantage of the method is that multiple combinations of mutations can be prioritized in a single run, with higher ranks assigned to the more stabilizing ones. In summary, an efficient ranking model has been developed. It is used stand-alone on the CentOS Linux platform because VADAR, a Fortran program that requires an F77 compiler, is not web compatible. The software package can be downloaded on request, and the download link is available on the web interface of the Thermostable Protein Structural Database.

Chapter V

Attaining Plausible Mutations