VIETNAM JOURNAL OF CHEMISTRY VOL. 50(5) 557-561 OCTOBER 2012

MULTIVARIATE QSAR STUDY ON ANTIOXIDANT ACTIVITY OF TRIPEPTIDES USING 2D AND 3D STRUCTURAL DESCRIPTORS

Nong Thi Hong Duyen1, Pham Van Tat2*

1Department of Chemistry, University of Hue Science

2Department of Engineering Chemistry, Industrial University of Ho Chi Minh City

Received 19th May 2012

Abstract

Quantitative structure-activity relationships (QSARs) offer an economical way to predict the antioxidant activity of tripeptide analogues. In this work, the antioxidant activity of a group of 23 tripeptides was modeled using interpretable two-dimensional (2D) and three-dimensional (3D) structural descriptors. Quantitative relationships between the structural descriptors and the antioxidant activities were constructed by combining multivariate regression analysis with a genetic algorithm. Among the models developed with different chemometric tools, the best model was chosen on the basis of both internal and external validation. The important molecular descriptors xch6, xvch5, SaaN, SdO, Hother, Dipole, SpcPolarizability and Volume were selected by the forward and backward techniques with the genetic algorithm. The best 4-variable model QSAR_linear, including SaaN, Hother, SdO and Dipole, was derived from those techniques. The quality of this model was shown by the values R²_fitness of 97.8524, R²_test of 92.784, standard error of estimation SE of 0.0355 and F-stat of 148.0816. For the non-linear model QSAR_neural, the architecture I(4)-HL(4)-O(1) with R²_fitness of 99.4293 was constructed from the predictors of the linear model QSAR_linear. The antioxidant activities of tripeptides predicted by the models QSAR_linear and QSAR_neural gave MARE, % values of 31.0854 and 27.9484, respectively.

Keywords: Quantitative structure-activity relationships (QSARs), multiple regression, neural network.

1. INTRODUCTION

In recent years science and technology have developed rapidly, and people wish to prolong life and to counteract aging and the oxidation processes taking place in the living body. Oxidation reactions generate free radicals, which are implicated in the initiation and progression of many diseases, especially neurodegenerative diseases [1], and are a major cause of aging. Hydrolysates of various proteins, such as soybean, casein, bullfrog, royal jelly, venison, α-lactalbumin, myofibrillar and rice endosperm proteins, have been shown to have antioxidant activity against the peroxidation of lipids or radical-scavenging activity [4].

There is an essential need for computation-based quantitative structure-activity relationship (QSAR) modeling to provide information about the physicochemical properties of chemicals, their environmental fate and their effects on human health [4]. Quantitative relationships between structural descriptors and antioxidant activities (QSARs) may indicate how the biological activity changes with the composition of amino acids in the peptide chain [2, 3].

In the present paper, we report the use of an antioxidant activity dataset of 23 tripeptides and the development of predictive global QSAR models on this dataset using 2D and 3D descriptors along with topological descriptors. The linear and non-linear models for the tripeptides are constructed by multivariate analysis and a neuro-fuzzy technique with a genetic algorithm. The 2D and 3D structural descriptors of the 23 tripeptides, calculated by the molecular mechanics method MM+ and the semiempirical quantum calculation SCF PM3, were used to construct the QSAR models. The antioxidant activities of the tripeptides resulting from the linear and non-linear models are compared with those from the literature.

2. MATERIALS AND METHODS

2.1. Data and Software

The experimental antioxidant activities of 23 tripeptides (AC_exp; the antioxidant activities of the peptides were measured by the ferric thiocyanate method and are relative activities, with the control set to 1.0) were taken from Li Yao-Wang et al. [1]. The original data set was split randomly into a training set of 18 tripeptides (Table 1), with AC_exp values in the range 0.0441-0.6369, and a test set of 5 tripeptides (Table 3), with AC_exp values in the range 0.3170-0.6369. The test set was used to evaluate the predictive ability of the models.
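A random 18/5 split of this kind can be sketched as follows. This is only an illustration: the seed and the helper name `random_split` are assumptions, and the sketch does not reproduce the actual test set of the paper (HHN, HHQ, PWK, RWQ and YYC, see Table 3).

```python
import random

# The 23 tripeptides of Table 1 (one-letter amino-acid codes).
TRIPEPTIDES = [
    "CYY", "HHA", "HHC", "HHD", "HHE", "HHG", "HHI", "HHK", "HHL", "HHM",
    "HHN", "HHQ", "HHR", "HHS", "HHT", "HKH", "HRH", "LWL", "PWK", "RWK",
    "RWQ", "RWV", "YYC",
]


def random_split(items, n_test, seed):
    """Return (training, test) lists drawn without replacement."""
    rng = random.Random(seed)  # the seed is an assumption, used for repeatability
    test = set(rng.sample(items, n_test))
    training = [item for item in items if item not in test]
    return training, sorted(test)


training_set, test_set = random_split(TRIPEPTIDES, n_test=5, seed=0)
print(len(training_set), len(test_set))
```

Fixing the seed makes the split repeatable, which matters here because a test set of only 5 compounds can change the external validation statistics noticeably.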

2.2. Molecular modeling and Structural descriptors

In order to calculate the 2D and 3D structural descriptors, the experimental tripeptide structures were optimized using the MM+ molecular mechanics method with a gradient level of 0.05 in the HyperChem program [5]. After the molecular geometries had been optimized, the 2D and 3D structural descriptors of the 23 tripeptides in Table 1 were calculated using the QSARIS package [7,9].

Table 1: Experimental structures and antioxidant activities AC_exp of 23 tripeptides [1].

No   Tripeptide   AC_exp      No   Tripeptide   AC_exp
1    CYY          0.4699      13   HHR          0.0635
2    HHA          0.0680      14   HHS          0.0862
3    HHC          0.1277      15   HHT          0.0862
4    HHD          0.1877      16   HKH          0.0441
5    HHE          0.1877      17   HRH          0.0441
6    HHG          0.3170      18   LWL          0.6061
7    HHI          0.0680      19   PWK          0.4066
8    HHK          0.0635      20   RWK          0.6061
9    HHL          0.0680      21   RWQ          0.6061
10   HHM          0.0817      22   RWV          0.6061
11   HHN          0.3170      23   YYC          0.6369
12   HHQ          0.3170

2.3. Multiple linear modeling

For quantitative structure-activity relationships (QSAR), the structural descriptors (X) are correlated with the response variable (Y). The relationship is well represented by a model that is linear in the regressed predictors:

$$Y = \sum_{i} b_i X_i + C \tag{1}$$

where the parameters b_i are unknown regression coefficients and C is a constant.

Multiple linear regression analysis based on the least-squares procedure is very often used for estimating the regression coefficients. The multiple linear models QSAR_linear were constructed using the programs Regress [8] and QSARIS [7,9].
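The least-squares estimate of the coefficients in Eq. (1) can be sketched by solving the normal equations. The code below is a minimal pure-Python illustration, not the procedure implemented in Regress or QSARIS; the function names and the two-descriptor data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_mlr(X, y):
    """Least-squares fit of y = sum_i b_i*x_i + C; returns (b, C)."""
    rows = [list(x) + [1.0] for x in X]  # trailing 1 carries the constant C
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve_linear_system(xtx, xty)
    return coef[:-1], coef[-1]


# Hypothetical two-descriptor data generated from y = 2*x1 - 3*x2 + 0.5.
X = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (3.0, 5.0)]
y = [2 * x1 - 3 * x2 + 0.5 for x1, x2 in X]
b, C = fit_mlr(X, y)
print([round(v, 6) for v in b], round(C, 6))
```

Because the synthetic responses are exactly linear in the descriptors, the fit should recover the generating coefficients; with real descriptor data the residuals would not vanish.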

2.4. Non-Linear modeling

In recent years, neural networks (NNs) used as non-linear models QSAR_neural have proven fruitful in the study of the activities and properties of organic compounds [9,10,11].

Artificial Neural Networks (ANNs) are computational architectures constructed with the goal of mimicking biological neural networks. The artificial neuron reproduces a functionality similar to that of its biological counterpart: it calculates a weighted sum of its input signals and passes it through an activation function [6]. A collection of well-distributed, sufficiently numerous and accurately measured or simulated input data is the basic requirement for obtaining an acceptable non-linear model. Multilayer Perceptron ANNs (MLP-ANNs) are the best known and most widely used. An MLP-ANN model, generally written I(l)-HL(n)-O(m), consists of an input layer I(l) with l neurons, a hidden layer HL(n) with n neurons, and an output layer O(m) with m neurons. The ANN training process is performed using the error back-propagation algorithm. The non-linear model QSAR_neural was constructed using the 2D and 3D structural descriptors of Eq. (1) as neurons of the input layer and the antioxidant activity AC as the neuron of the output layer.
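The forward pass of an I(4)-HL(4)-O(1) perceptron can be sketched as below. All weights, biases and input values are hypothetical placeholders (the trained weights are not reported in the paper), and a sigmoid transfer function in the hidden and output layers is assumed for illustration.

```python
import math


def sigmoid(z):
    """Logistic transfer function, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """Each neuron forms a weighted sum of its inputs plus a bias,
    then applies the sigmoid transfer function."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Propagate one input vector through hidden and output layers."""
    hidden = layer(x, w_hidden, b_hidden)
    return layer(hidden, w_out, b_out)[0]  # single output neuron


# Hypothetical weights for an I(4)-HL(4)-O(1) network.
W_HIDDEN = [[0.2, -0.1, 0.4, 0.3],
            [-0.3, 0.5, 0.1, -0.2],
            [0.1, 0.1, -0.4, 0.6],
            [0.5, -0.6, 0.2, 0.1]]
B_HIDDEN = [0.0, 0.1, -0.1, 0.05]
W_OUT = [[0.7, -0.5, 0.3, 0.2]]
B_OUT = [-0.1]

# Hypothetical scaled descriptor values for one tripeptide.
scaled_descriptors = [0.3, 0.8, 0.5, 0.1]
ac_pred = mlp_forward(scaled_descriptors, W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)
print(round(ac_pred, 4))
```

With a sigmoid output neuron the prediction is confined to (0, 1), which suits the relative activities of Table 1 (all below 1.0) but would require rescaling for unbounded responses.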

3. RESULTS AND DISCUSSION

3.1. Building linear model

As a first step, the best linear model QSAR_linear was sought by exploring regression models, with the purpose of identifying the structural descriptors of the tripeptides that govern their antioxidant behavior. The most important descriptors, listed in Table 2, were found by multiple regression techniques. A brief description of the meaning of each descriptor is given in [9]. It is clear that four descriptors are sufficient to give the best statistical values. The monitoring set was partly considered during model development.

The best linear model QSAR_linear was established by combining the forward and backward techniques with genetic algorithms in the package Regress [8] and the QSARIS system [7]. All models were selected on the basis of the statistical values R²_fitness, standard error SE, R²_adj, R²_test and F-stat. The suitable linear models QSAR_linear consist of k = 1-5 2D and 3D structural predictors. Changing the number of predictors in the QSAR model changes the values R²_fitness and R²_test, as depicted in Figure 1.

Figure 1: (a) Change of R²_fitness and R²_test versus the number of predictors k; (b) correlation between AC_exp and AC_pred resulting from the linear model QSAR_linear

The most appropriate linear models QSAR_linear are shown in Table 2. The linear models and their descriptors for k = 1-5 were found using the 18 tripeptides of the training group. R²_fitness is the multiple determination coefficient, SE stands for the standard error, and R²_test was obtained with the leave-one-out cross-validation technique [8,9].
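Leave-one-out cross-validation can be sketched as follows for a one-predictor model. The definition used here, 1 - PRESS/SS_total, is the common cross-validated R² and is assumed, not confirmed, to correspond to R²_test as computed by Regress and QSARIS; the data and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope*x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def loo_r2(xs, ys):
    """Cross-validated R2 = 1 - PRESS / SS_total, where each point is
    predicted by a model fitted to all the other points."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    my = sum(ys) / len(ys)
    ss_total = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical descriptor/activity pairs with a little scatter.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
r2_loo = loo_r2(xs, ys)
print(round(100 * r2_loo, 2))
```

Because every prediction is made for a compound the model has not seen, this statistic typically falls below R²_fitness and drops when predictors are added that only fit noise, which is the behavior reported for k = 5 in Table 2.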

Of the 2D and 3D models shown in Table 2, the linear model with k = 4 is the best model QSAR_linear, with the highest R²_test value of 92.780; R²_test then decreases as the number of predictors k increases further, as also depicted in Figure 1. The quality of this model is also reflected in the R²_fitness value of 97.850, the standard error SE of 0.0355, the F-stat of 148.0816, the R²_test of 92.780 and the P-values of the regression coefficients, which lie in the range 0.0001-0.0132.
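The adjusted R² and F-stat of the 4-predictor model can be cross-checked from R²_fitness alone. The sketch below assumes the standard textbook definitions of both statistics for a model with an intercept; whether Regress and QSARIS use exactly these forms is an assumption, although the results agree with Table 2 to within rounding.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2 for n observations and k predictors (plus intercept)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall regression F-statistic with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Values reported for the best model: R2_fitness = 97.8524 %,
# 18 training tripeptides, 4 predictors.
r2_fit, n_train, k_pred = 0.978524, 18, 4
r2_adj = adjusted_r2(r2_fit, n_train, k_pred)
f_stat = f_statistic(r2_fit, n_train, k_pred)
print(round(100 * r2_adj, 2), round(f_stat, 2))
```

The computed values are close to the tabulated R²_adj of 97.19 and F-stat of 148.082, which supports the reading of the garbled statistics in Table 2.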

The form of the best linear model QSAR_linear is

$$AC = 1.1463 - 0.0788\,\mathrm{SaaN} - 0.0520\,\mathrm{Hother} + 0.0087\,\mathrm{SdO} - 0.0094\,\mathrm{Dipole} \tag{2}$$

The cross-validation technique showed that the linear model QSAR_linear (2) can be used to predict the antioxidant activities AC of tripeptides. The regression coefficients of model (2) were also tested statistically at the significance level α = 0.05, and all of them are satisfactory.
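Applying model (2) to a new tripeptide amounts to one weighted sum. In the sketch below the coefficients are those of Eq. (2), while the descriptor values are hypothetical placeholders, since the descriptor table is not given in the paper.

```python
# Coefficients and constant of the 4-predictor model, Eq. (2).
COEFFS = {"SaaN": -0.0788, "Hother": -0.0520, "SdO": 0.0087, "Dipole": -0.0094}
INTERCEPT = 1.1463


def predict_ac(descriptors):
    """Predicted antioxidant activity AC from a dict of descriptor values."""
    return INTERCEPT + sum(c * descriptors[name] for name, c in COEFFS.items())


# Hypothetical descriptor values, for illustration only.
hypothetical = {"SaaN": 4.0, "Hother": 10.0, "SdO": 30.0, "Dipole": 5.0}
print(round(predict_ac(hypothetical), 4))
```

The signs of the coefficients show that, within this model, larger SaaN, Hother and Dipole values lower the predicted activity, whereas a larger SdO value raises it.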

Table 2: The linear models QSAR_linear (with k from 1 to 5) and statistical values

k   Linear model QSAR_linear
1   AC = 0.5726 - 0.0601 SaaN
2   AC = 1.5683 - 0.0841 SaaN - 0.0551 Hother
3   AC = 1.2665 - 0.0864 SaaN - 0.0600 Hother + 0.0080 SdO
4   AC = 1.1463 - 0.0788 SaaN - 0.0520 Hother + 0.0087 SdO - 0.0094 Dipole
5   AC = 23.005 - 136.0187 xch6 - 308.1490 xvch5 + 0.0093 SdO - 0.0101 Dipole - 0.0012 Volume

k   R²_fitness   R²_adj   SE      F-stat    R²_test
1   88.62        87.91    0.074   124.616   85.70
2   94.82        94.12    0.051   137.166   91.54
3   96.49        95.74    0.044   128.437   92.24
4   97.85        97.19    0.036   148.082   92.78
5   98.11        97.32    0.035   124.553   88.33

The contribution of the molecular descriptors in the linear models QSAR_linear of Table 2 is expressed by the values P_m x_k,% and MPx_k,%. The values MPx_k,% were used to rank the importance of the predictors [9]: SaaN > Hother > xvch5 > SdO > Dipole. In contrast, ranking the predictors of model QSAR_linear (2) by the absolute magnitude of their regression coefficients gives the order SaaN > Hother > Dipole > SdO. The values P_m x_k,% and MPx_k,% of each predictor in these models were calculated with the following formulas, and the results are shown in Figure 2:

$$P_m x_k,\% = \frac{100\,|b_k x_{m,k}|}{C_{\mathrm{total}}} \tag{3}$$

$$MPx_k,\% = \frac{1}{N}\sum_{m=1}^{N}\frac{100\,|b_k x_{m,k}|}{C_{\mathrm{total}}}, \quad \text{with } C_{\mathrm{total}} = \sum_{j}|b_j x_{m,j}| \tag{4}$$

where N = 18 is the number of tripeptides in the training set, m indexes these tripeptides, and the sum over j runs over the predictors of the model (4 predictors in the best linear model QSAR_linear).

Figure 2: (a) Contribution percentage MPx_k,%; (b) frequencies, %, of the predictors in the linear models

The values MPx_k,% and the frequencies, %, of each predictor in the QSAR models with k = 1-5 are shown in Figures 2a and 2b, respectively.
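The contribution percentages can be sketched as below, under one reading of Eqs. (3) and (4): for each tripeptide, every term |b·x| is expressed as a percentage of the sum of such terms over the predictors of the model, and these percentages are then averaged over the training set. This reading is an assumption, and the descriptor rows are hypothetical; only the coefficients come from Eq. (2).

```python
def contribution_percent(coeffs, descriptor_rows):
    """Return (per-compound percentages, mean percentage per predictor)."""
    per_compound = []
    for row in descriptor_rows:
        terms = [abs(b * x) for b, x in zip(coeffs, row)]
        c_total = sum(terms)  # normalizing sum for this compound
        per_compound.append([100.0 * t / c_total for t in terms])
    n = len(per_compound)
    mean_contribution = [sum(p[j] for p in per_compound) / n
                         for j in range(len(coeffs))]
    return per_compound, mean_contribution


# Coefficients of Eq. (2), in the order SaaN, Hother, SdO, Dipole.
COEFFS = [-0.0788, -0.0520, 0.0087, -0.0094]
# Hypothetical descriptor values for three tripeptides.
HYPOTHETICAL_ROWS = [
    [4.0, 10.0, 30.0, 5.0],
    [6.0, 9.0, 28.0, 3.0],
    [2.0, 12.0, 35.0, 7.0],
]
per_compound, mean_contribution = contribution_percent(COEFFS, HYPOTHETICAL_ROWS)
print([round(v, 2) for v in mean_contribution])
```

Unlike a ranking by coefficient magnitude alone, this measure weights each coefficient by the size of the descriptor values, which is why the two orderings of predictors can differ.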

3.2. Building non-linear model

The non-linear model QSAR_neural was built by combining the basic fuzzy logic technique with genetic algorithms in the INForm system [6]. The neural network architecture consists of three layers, I(4)-HL(4)-O(1). The input layer I(4) contains four neurons, SaaN, Hother, Dipole and SdO, the predictors of the best linear model QSAR_linear (2). The output layer O(1) has one node, the antioxidant activity AC. The hidden layer HL(4) includes four neurons. The error back-propagation algorithm was used to train the network, with a sigmoid transfer function on each node of the input and output layers.

The parameters for training the neural network were a learning rate of 0.7, a momentum of 0.7, 2000 target epochs and a mean square error MSE of 0.0005443.
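The role of the learning rate and momentum can be illustrated with a gradient-descent update on a toy one-parameter problem. This is a sketch of the classical momentum rule under assumed names; it is not the INForm training procedure, and the quadratic error function and tolerance are hypothetical.

```python
def train_with_momentum(grad, w0, learning_rate=0.7, momentum=0.7,
                        target_epochs=2000, tolerance=1e-12):
    """Gradient descent with momentum on a single weight.

    Each epoch the step is the previous step scaled by the momentum
    minus the learning rate times the current gradient."""
    w, velocity = w0, 0.0
    for epoch in range(1, target_epochs + 1):
        velocity = momentum * velocity - learning_rate * grad(w)
        w += velocity
        if abs(grad(w)) < tolerance:  # hypothetical early-stopping rule
            return w, epoch
    return w, target_epochs


# Toy error surface E(w) = (w - 3)^2 / 2, whose gradient is w - 3.
w_final, epochs_used = train_with_momentum(lambda w: w - 3.0, w0=0.0)
print(round(w_final, 6), epochs_used)
```

The momentum term reuses part of the previous step, which damps oscillations across narrow valleys of the error surface; in a real network the same update is applied to every weight using gradients from back-propagation.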

After the neural network training process the value R²_fitness is 99.4293, whereas the best linear model QSAR_linear gave R²_fitness of 97.8524, as shown in Table 2. Thus the non-linear model QSAR_neural I(4)-HL(4)-O(1) fits the data better than the 4-predictor linear model QSAR_linear.

3.3. Predictability of linear and non-linear model

The predictive ability of the linear model QSAR_linear and the non-linear model QSAR_neural was evaluated carefully by the leave-one-out cross-validation technique. The predicted antioxidant activities of the 5 tripeptides in the test group are given in Table 3.

Table 3: Experimental AC_exp and predicted antioxidant activities AC_pred resulting from the models.

                              Linear model QSAR_linear    Non-linear model QSAR_neural
No   Tripeptide   AC_exp [1]  AC_pred    ARE, %           AC_pred    ARE, %
1    HHN          0.3170      0.2576     18.7445          0.3975     25.3943
2    HHQ          0.3170      0.1742     45.0513          0.1775     44.0063
3    PWK          0.4066      0.6448     58.5720          0.6025     48.1800
4    RWQ          0.6061      0.6738     11.1669          0.6052     0.1485
5    YYC          0.6369      0.4975     21.8926          0.4967     22.0129
MARE, %                                  31.0854                     27.9484


One-factor ANOVA indicated that the values predicted by the linear model QSAR_linear and by the non-linear model QSAR_neural do not differ significantly (F = 0.0024 < F_0.05 = 5.3177). However, the non-linear model QSAR_neural gave a lower MARE, % value than the 4-predictor linear model QSAR_linear.
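The one-factor ANOVA statistic can be recomputed from the predicted values of Table 3. The sketch below uses the standard between-group and within-group mean squares; the function name is an assumption, and the critical value 5.3177 is taken from the text, not computed.

```python
def one_way_anova_f(groups):
    """F-statistic of a one-factor ANOVA for a list of sample groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Predicted activities AC_pred of the 5 test tripeptides (Table 3).
AC_PRED_LINEAR = [0.2576, 0.1742, 0.6448, 0.6738, 0.4975]
AC_PRED_NEURAL = [0.3975, 0.1775, 0.6025, 0.6052, 0.4967]

f_value = one_way_anova_f([AC_PRED_LINEAR, AC_PRED_NEURAL])
print(round(f_value, 4))
```

With two groups of five values the statistic has (1, 8) degrees of freedom, and the result agrees with the reported F = 0.0024, far below the critical value quoted in the text.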

The predicted values resulting from the QSAR models were judged by the absolute value of the relative error ARE, %:

$$ARE,\% = 100\left|\frac{AC_{\mathrm{exp}} - AC_{\mathrm{pred}}}{AC_{\mathrm{exp}}}\right| \tag{5}$$

The mean absolute value of the relative error, MARE, %, was used to assess the overall error of the QSAR models:

$$MARE,\% = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{AC_{\mathrm{exp},i} - AC_{\mathrm{pred},i}}{AC_{\mathrm{exp},i}}\right| \tag{6}$$

where N = 5 is the number of tripeptides in the test set, and AC_exp and AC_pred are the experimental and predicted antioxidant activities.
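Eqs. (5) and (6) can be checked directly against Table 3. The sketch below uses the tabulated, rounded activities, so the results reproduce the reported MARE, % values only to within rounding; the function names are assumptions.

```python
def are_percent(ac_exp, ac_pred):
    """Absolute relative error in percent, Eq. (5)."""
    return 100.0 * abs((ac_exp - ac_pred) / ac_exp)


def mare_percent(exp_values, pred_values):
    """Mean absolute relative error in percent, Eq. (6)."""
    errors = [are_percent(e, p) for e, p in zip(exp_values, pred_values)]
    return sum(errors) / len(errors)


# Test set of Table 3: HHN, HHQ, PWK, RWQ, YYC.
AC_EXP = [0.3170, 0.3170, 0.4066, 0.6061, 0.6369]
AC_PRED_LINEAR = [0.2576, 0.1742, 0.6448, 0.6738, 0.4975]
AC_PRED_NEURAL = [0.3975, 0.1775, 0.6025, 0.6052, 0.4967]

mare_linear = mare_percent(AC_EXP, AC_PRED_LINEAR)
mare_neural = mare_percent(AC_EXP, AC_PRED_NEURAL)
print(round(mare_linear, 2), round(mare_neural, 2))
```

Because the error is relative to AC_exp, compounds with small experimental activities such as HHQ dominate the average even when their absolute deviation is modest.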

4. CONCLUSIONS

The chemical information encoded by the 2D and 3D topological molecular descriptors in the linear model QSAR_linear explained the variation of the experimental antioxidant activities of the tripeptides to a satisfactory extent, and allowed a proper structural characterization of the tripeptides in both the training and test sets.

The QSARs designed involved 2D and 3D molecular descriptors that have a quite direct interpretation, and this relationship proved to have general applicability. The statistical parameters of the proposed model compare fairly well with others published previously [1].

This work successfully constructed the linear model QSAR_linear and the non-linear model QSAR_neural. The genetic algorithm helped to select the important predictors for the QSAR models accurately.

The neural network I(4)-HL(4)-O(1) has better predictive ability than the linear model QSAR_linear. The results obtained in this work show a way to predict the antioxidant activities of new tripeptides.

REFERENCES

1. Li Yao-Wang, B. Li, J. He, P. Qian. J. Molecular Structure, No. 998, 53-61 (2011).

2. S. Mittermayr, M. Olajos, T. Chovan, G. K. Bonn, A. Guttman. Trends in Analytical Chemistry, 27(5) (2008).

3. K. Saito, J. Dong-hao, T. Ogawa, K. Muramoto, E. Hatakeyama, T. Yasuhara, and K. Nokihara. J. Agric. Food Chem., Issue 51, 3668-3674 (2003).

4. H. Z. Zhang, D. P. Yang and G. Y. Tang. 11(15/16), P. 749-754 (2006).

5. I. Lessigiarska, A. P. Worth, T. I. Netzeva, J. C. Dearden, M. T. D. Cronin. Quantitative structure-activity-activity and quantitative structure-activity investigations of human and rodent toxicity, Chemosphere, 65, 1878-1887 (2006).

6. HyperChem Release 8.05, Hypercube Inc., USA (2008).

7. INForm v2.0, Intelligensys Ltd., UK (2000).

8. QSARIS 1.1, Statistical Solutions Ltd., USA (2001).

9. D. D. Steppan, J. Werner, P. R. Yeater. Essential Regression and Experimental Design for Chemists and Engineers (2000).

10. Pham Van Tat. Development of Quantitative Structure-Activity Relationship and Quantitative Structure-Property Relationship, Natural science and technology publisher, Hanoi (2009).

11. Pham Van Tat, Pham Thi Tra My. Vietnam Journal of Chemistry and Application, Issue 4, 10-15 (2010).

12. Pham Van Tat. Vietnam Journal of Chemistry, 47(4A), 611-616 (2009).

Corresponding author: Pham Van Tat

Industrial University of Ho Chi Minh City

12 Nguyen Van Bao, Go Vap district, Ho Chi Minh City. Email: [email protected].
