Computational Toxicology 26 (2023) 100265
Available online 11 March 2023
2468-1113/© 2023 Elsevier B.V. All rights reserved.
2D-QSAR study and design of novel pyrazole derivatives as an anticancer lead compound against A-549, MCF-7, HeLa, HepG-2, PaCa-2, DLD-1
Fatima Ezzahra Bennani a,b,e,*, Latifa Doudach c,2, Khalid Karrouchi b,2, Youssef El rhayam d, Christopher E. Rudd e,f,g,1, M'hammed Ansar h,1, My El Abbes Faouzi a,1
a Laboratory of Pharmacology and Toxicology, Bio Pharmaceutical and Toxicological Analysis Research Team, Faculty of Medicine and Pharmacy, Mohammed V University in Rabat, BP6203 Rabat, Morocco
b Laboratory of Analytical Chemistry, Faculty of Medicine and Pharmacy, Mohammed V University in Rabat, BP6203 Rabat, Morocco
c Department of Biomedical Engineering Medical Physiology, Higher School of Technical Education of Rabat, Mohammed V University in Rabat, BP6203 Rabat, Morocco
d Agro-Resources Laboratory, Organic Polymers and Process Engineering (LRGP)/Organic and Polymer Chemistry Team (ECOP), Faculty of Sciences Ibn Tofail University, Kenitra, Morocco
e Division of Immunology-Oncology, Centre de Recherche Hôpital Maisonneuve-Rosemont (CR-HMR), Montreal, QC, Canada
f Department of Microbiology, Infection and Immunology, Faculty of Medicine, Université de Montreal, Montreal, QC, Canada
g Division of Experimental Medicine, Department of Medicine, McGill University Health Center, McGill University, Montreal, QC, Canada
h Laboratory of Medicinal Chemistry, Faculty of Medicine and Pharmacy, Mohammed V University in Rabat, BP6203 Rabat, Morocco
ARTICLE INFO
Keywords: 2D-QSAR; PCA; PLS; Pyrazole derivatives; Anti-cancer activity
ABSTRACT
In this study, local quantitative structure–activity relationship (QSAR) models were developed for a set of compounds tested for their inhibitory activity against six cancer cell lines, viz. A-549, MCF-7, HeLa, HepG-2, PaCa-2 and DLD-1. Two statistical approaches, Principal Component Analysis (PCA) and Partial Least Squares (PLS), were employed to develop the QSAR models. The pIC50 values of 63 in-house synthesized pyrazole derivatives were then predicted using the most significant QSAR model developed for each cancer cell line. Several statistical parameters, namely the correlation coefficient R2, RMSE, cross-validated R2, cross-validated RMSE, internal validation Q2 and external validation R2, indicated that the developed models meet the criteria for acceptable QSAR models. The results highlighted several compounds as the most promising lead candidates against the six cancer cell lines, with significant pIC50 values. Considering the contribution of the most important descriptors, we designed new molecules that were found to have greater inhibitory potential than the reference compounds. Overall, the results suggest that the developed QSAR models may be useful as a theoretical reference for experimental studies and for designing more potent pyrazole-based anti-cancer compounds.
1. Introduction
Cancer remains a major source of morbidity and mortality, despite decades of scientific research and clinical trials of promising new treatments. There are many types and subtypes of cancer, since each organ and each cell type can give rise to a cancerous tumor under certain conditions. Examples include lung, breast, cervical, liver, pancreatic and colon cancer, which are the subject of the current in silico study.
Lung cancer is one of the most common cancers in the world and among the leading causes of death (~1.80 million deaths were reported in 2020, according to the World Health Organization (WHO)). More precisely, non-small cell lung cancer (NSCLC) causes >1.4 million deaths per year [1,2]. Another important cancer type studied in this work is breast cancer, which is considered the most common cancer in women and is ranked as the second leading cause of death after lung cancer [3,4]. Breast cancer is a heterogeneous group of neoplasms arising from the epithelial cells lining the milk ducts. The
* Corresponding author at: Division of Immunology-Oncology, Centre de Recherche Hôpital Maisonneuve-Rosemont (CR-HMR), Montreal, QC, Canada. E-mail address: [email protected] (F.E. Bennani).
1 Pr Faouzi, Pr Ansar and Pr Rudd contributed equally to this work.
2 Pr Doudach and Pr Karrouchi contributed equally to this work.
https://doi.org/10.1016/j.comtox.2023.100265
Received 18 September 2021; Received in revised form 7 January 2023; Accepted 3 March 2023
heterogeneity of cancer cell phenotypes, accompanied by the dynamic plasticity of the tumor microenvironment, makes tumor categorization a demanding task, especially as it relates to therapeutic responses and disease progression; thus, novel therapeutic drugs for this aggressive tumor type are urgently needed [5]. Cervical cancer is another common type of cancer and has the third highest mortality in women worldwide [6]. According to WHO statistics, cervical cancer accounts for approximately 12% of all cancers and is regarded as the fourth most common cancer in women worldwide; it is particularly common in developing countries [7]. In 2018, an estimated 570,000 women were diagnosed with cervical cancer worldwide and about 311,000 women died from the disease [8]. In practice, drugs such as cisplatin, paclitaxel, topotecan and gemcitabine, along with other chemotherapies, are used to treat cervical cancer.
Moreover, in 2020, liver cancer ranked third among the known cancer types as a cause of death, and it is one of the few cancers whose incidence is rising rapidly worldwide [9]. Among the important factors that lead to the high mortality rate in patients with liver cancer are the difficulty of early diagnosis, rapid development and resistance to treatment [10]. Considerable efforts have been made to develop new targeted chemotherapeutic agents for all types of cancer, including liver cancer [11–13].
Pancreatic cancer is a type of malignant tumor of the gastrointestinal tract in which malignant cells form in the pancreatic tissues [14,15]. It is regarded as one of the deadliest cancers in the world, because in the majority of cases it is difficult to diagnose until it has reached an advanced stage. To date, only a few effective therapies are available [14]. According to the American Cancer Society, the 5-year survival rate for pancreatic adenocarcinoma across all stages is about 7% [16]. However, the cause of pancreatic cancer (PC) has yet to be clearly determined [17]. According to WHO data, cancer in all of its forms is now one of the top causes of mortality [18], surpassing a number of diseases, including cardiovascular disorders, as a leading cause of death [19].
Herein, we computationally explored datasets of compounds that had been experimentally tested against six cancer cell lines representing the cancer types mentioned above, in order to establish six significant 2D quantitative structure–activity relationship (QSAR) models. The models were derived from the collected datasets using two methods, Principal Component Analysis (PCA) and Partial Least Squares (PLS). The datasets cover different classes of chemical moieties with a wide range of anti-cancer activities against the six cancer cell lines.
Then, based on the resulting QSAR models, the primary goal of the current work was to predict the anti-cancer activity of 63 in-house synthesized pyrazole derivatives belonging to the hydrazone-carbohydrazide, hydrazone-acetohydrazide, triazole-thiol and pyrazole oxadiazole-thiol families. The synthesis of these 63 pyrazole derivatives was reported in previous work by Karrouchi et al. [20–26]. Pyrazoles were chosen because they have grown in popularity owing to their numerous applications. They are one of the most active families of chemicals, with a wide range of pharmacological effects including anti-bacterial, anti-convulsant, analgesic, anti-inflammatory, anti-tubercular, cardiovascular and anti-cancer properties [27,28].
This study was carried out using the quantitative structure–activity relationship (QSAR) modeling technique, which is widely used in many research areas, including expediting the identification of potent chemical entities for diverse therapeutic applications. This mathematical method was chosen because it is a powerful chemometric approach [29,30] for establishing an empirical rule relating the structural descriptors of the compounds under investigation to their biological activities.
Different types of molecular descriptors usually describe different aspects of a molecule. They are numerical values connected with the chemical constitution that are used to correlate chemical structure with physical characteristics, chemical reactivity and biological activity [31–34]. Currently, a large number of molecular descriptors are available; they can be classified as physicochemical descriptors (hydrophobic, steric or electronic), structural descriptors (based on the frequency of occurrence of a substructure), topological and electronic descriptors (based on molecular orbital calculations), geometric descriptors (based on a molecular surface area calculation) and simple indicator parameters (dummy variables) [35,36]. These molecular descriptors are used in different studies to explore the essential features of congeneric series of compounds. Herein, the QSAR method was used to predict the bioactivities of a congeneric series of in-house synthesized compounds, as well as to design new compounds based on their structural descriptors.
2. Materials and methods
2.1. Experimental data sources
In the present work, IC50 values of sets of compounds experimentally tested (MTT assay) against six cancer cell lines, namely A-549 (lung cancer) [37], MCF-7 (breast cancer) [37], HeLa (cervical cancer) [37], HepG-2 (liver cancer) [38,39], PaCa-2 (pancreatic cancer) [40–42] and DLD-1 (colon cancer) [43,44], were retrieved from the literature (the sources are summarized in Supplementary Information, Tables S1 to S6). All compounds considered for QSAR model development had been studied for inhibition of cancer cell proliferation (anti-cancer activity), with IC50 values measured and reported in the original studies. For each cancer cell line, all collected experimental activity values (IC50) were normalized to µM and then converted to the negative logarithm, i.e., $\mathrm{pIC}_{50} = -\log_{10}(\mathrm{IC}_{50})$. Using pIC50 instead of IC50 has several advantages. First, it encourages thinking about the data on a logarithmic rather than an arithmetic scale, which makes sense because dose–response inhibition is essentially a logarithmic phenomenon. Second, more potent compounds have higher rather than lower values. Third, logarithmic values make it easier to examine the reliability of the data. Fourth, the logarithm makes it convenient to compare numbers that differ by several orders of magnitude. Finally, pIC50 can be plotted on linear axes when creating an SAR plot, instead of using logarithmic axes for IC50.
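The conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: it assumes the usual convention that pIC50 is taken on the molar scale, so a value in µM is first converted to mol/L. The function name and the IC50 values are hypothetical.

```python
import math

def pic50_from_ic50_um(ic50_um: float) -> float:
    """Convert an IC50 given in micromolar (µM) to pIC50.

    Assumption: pIC50 is defined on the molar scale, so the µM value is
    converted to mol/L (1 µM = 1e-6 M) before taking -log10.
    """
    if ic50_um <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_um * 1e-6)

# Hypothetical IC50 values (µM), for illustration only
for ic50 in (1.0, 10.0, 25.0):
    print(f"IC50 = {ic50:5.1f} µM -> pIC50 = {pic50_from_ic50_um(ic50):.2f}")
```

Under this assumption a tenfold drop in IC50 raises pIC50 by exactly one unit, which is why potent compounds end up with the larger numbers.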
2.2. Molecular descriptors calculation
After the compound information had been retrieved, all structures were drawn using ChemDraw 15.0 [45]. Thereafter, a total of 184 two-dimensional (2D) descriptors were calculated with MOE 2008.10 (Molecular Operating Environment) software [46]. The descriptors used in the QSAR models of this study, together with their meanings, are summarized in Table S7 of the Supplementary Information. A correlation matrix was then used to remove highly correlated descriptors (correlation values higher than 0.75). Removing these descriptors helps in choosing the appropriate ones, avoids overfitting after model generation and enhances the predictive ability of the model [47].
From the large pool of molecular descriptors used in QSAR studies, a few selected descriptors, once statistically validated, can be used to predict the biological activity of untested compounds. In this study, after QSAR model development, the best model for each cell line and its descriptors were employed to predict the activity of the 63 in-house synthesized pyrazole derivatives.
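The correlation-based pruning step can be illustrated as follows. This is a sketch under stated assumptions: the text does not say which member of a correlated pair was discarded, so the example simply keeps the descriptor encountered first. The function names and all descriptor values are invented for illustration; only the 0.75 threshold comes from the text.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant column: treated here as uncorrelated
    return sxy / sqrt(sxx * syy)

def drop_correlated(descriptors, threshold=0.75):
    """Greedy filter: keep a descriptor only if |r| with every
    already-kept descriptor does not exceed `threshold`."""
    kept = []
    for name, values in descriptors.items():
        if all(abs(pearson_r(values, descriptors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical descriptor table (MOE-style names, invented values)
descriptors = {
    "a_nCl":  [0, 1, 2, 0, 1, 3],
    "Weight": [250.1, 284.6, 319.0, 251.3, 285.2, 353.5],  # tracks a_nCl closely
    "TPSA":   [60.2, 45.1, 72.4, 88.0, 51.3, 66.7],
}
print(drop_correlated(descriptors))
```

In this toy table the second column is almost a linear function of the first, so it is discarded, while the third column survives.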
2.3. Data set division/splitting
In line with the fundamental principles of QSAR model development, the dataset for each cancer cell line was divided into two sets: a training set to develop the QSAR models and a test set to validate and examine the prediction quality of the developed models [48]. Specifically, approximately 75% of the molecules were assigned to the training set and the remaining approximately 25% to the test set (Table 1). The test set compounds were selected manually, considering the structural diversity and wide range of activity in the dataset, with a 5:1 ratio for small datasets. Note that, in general, models can be generated either from a randomly split dataset (80:20, 90:10, 95:5, etc.) or from the full dataset (100% of the molecules in the training set, in which case external validation is not possible).
Table 1 summarizes the number of molecules in the training and test sets. All compounds used in the test sets are indicated by an asterisk in Supplementary Information, Tables S1 to S6.
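The test sets in this study were chosen manually, so no code can reproduce that selection. As a rough, automated stand-in, the sketch below ranks compounds by pIC50 and assigns every fourth one to the test set, which gives an approximately 75:25 split that spans the activity range. The function name, compound identifiers and activity values are all hypothetical, and structural diversity is not considered.

```python
def split_by_activity(pic50_by_compound, test_every=4):
    """Rank compounds by pIC50 and send every `test_every`-th one to the
    test set, so that both sets cover the whole activity range."""
    ranked = sorted(pic50_by_compound, key=pic50_by_compound.get)
    test = ranked[test_every - 1::test_every]
    train = [c for c in ranked if c not in test]
    return train, test

# Hypothetical compound IDs and pIC50 values
data = {f"cpd{i}": 4.5 + 0.1 * i for i in range(1, 13)}
train, test = split_by_activity(data)
print(len(train), "training compounds;", len(test), "test compounds")
print("test set:", test)
```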
2.4. QSAR models development and validation
For each cancer cell line, the QSAR models were developed following the OECD (Organization for Economic Co-operation and Development) guidelines for acceptable QSAR models, ensuring unambiguity and transparency [49]. Among several modeling methods, such as ANN, SVM and MLR, two statistical approaches were employed for QSAR model development: principal component analysis (PCA) and partial least squares (PLS). PCA was chosen because it summarizes the information encoded in the structures of the compounds. It is also very useful for understanding the distribution of the studied compounds in the descriptor space. It is an essentially descriptive statistical method that aims to present, in graphic form, the maximum of the information contained in the data. In addition, PCA and PLS are widely used and accepted methods. The set of descriptors was given as input to perform both the PCA and PLS analyses using the XLSTAT 2014 software package, a flexible Excel data analysis add-on that allows users to analyze, customize and share results within Microsoft Excel [50]. To validate the quality of the developed QSAR models, different statistical parameters were calculated, in particular the correlation coefficient (R2) and the root mean square error (RMSE). The models were also validated by internal cross-validation (Q2), external validation (R2 external), cross-validated R2 and cross-validated RMSE values [51–53]. The whole validation process was repeated four times, using the default settings of XLSTAT.
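The validation statistics named above can be made concrete with a small sketch. It is not the XLSTAT procedure: for brevity it fits a one-descriptor least-squares line instead of a PLS model, and it assumes leave-one-out cross-validation for Q2, which the text does not specify. All function names and data are hypothetical.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (single descriptor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return sqrt(sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / len(y_true))

def loo_predictions(x, y):
    """Leave-one-out: refit without compound i, then predict compound i."""
    preds = []
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(a + b * x[i])
    return preds

# Hypothetical training data: one descriptor and the observed pIC50
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
pic50 = [5.9, 5.6, 5.4, 5.0, 4.9, 4.5]

a, b = fit_line(descriptor, pic50)
fitted = [a + b * d for d in descriptor]
r2_fit = r2(pic50, fitted)                          # fitting quality (R2)
q2 = r2(pic50, loo_predictions(descriptor, pic50))  # internal validation (Q2)
print(f"R2 = {r2_fit:.3f}, RMSE = {rmse(pic50, fitted):.3f}, Q2 = {q2:.3f}")
```

The external R2 would be obtained the same way, by applying `r2` to the test set compounds using a model fitted on the training set only.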
2.5. Applicability domain analysis
The applicability domain (AD) is defined as the structural domain of a given QSAR model, or “represents a chemical space from which a model is derived and where a prediction is considered to be reliable” [54,55]. The prediction reliability of a particular model therefore depends on its AD assessment. The primary benefit of a QSAR model is the accuracy of its predictions for novel chemical entities; therefore, once the model is constructed, its AD must be determined. A model is valid only within its training domain, and new molecules must be shown to belong to that domain before the model is applied (OECD Principle 3) [56]; only predictions for novel compounds falling within the AD, not model extrapolations, may be considered trustworthy. Moreover, the AD is useful for identifying compounds that lie outside the domain of the QSAR model and for detecting outliers among the training set compounds. The most common method for determining the AD is to calculate the leverage value of each compound [57]. The leverage is calculated as $h_i = x_i^{T}(X^{T}X)^{-1}x_i$, where $x_i$ is the descriptor row vector of the query compound, $X$ is the descriptor matrix derived from the training set descriptor values, and the superscript $T$ denotes the transpose of a matrix or vector [58].
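The leverage formula can be evaluated directly, as in the following sketch. It uses only the Python standard library and solves the linear system $(X^{T}X)z = x_i$ rather than forming the inverse explicitly. Including a constant column of ones in the descriptor matrix is an assumption made here, and the function names and descriptor values are hypothetical.

```python
def solve(a, b):
    """Solve a·z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z

def leverage(x_query, X_train):
    """h = x^T (X^T X)^-1 x for one query descriptor vector."""
    p = len(x_query)
    xtx = [[sum(row[i] * row[j] for row in X_train) for j in range(p)]
           for i in range(p)]
    z = solve(xtx, x_query)  # z = (X^T X)^-1 x
    return sum(xi * zi for xi, zi in zip(x_query, z))

# Hypothetical training matrix: constant term plus two descriptors
X_train = [
    [1.0, 0.5, 2.0],
    [1.0, 1.5, 1.0],
    [1.0, 2.5, 3.5],
    [1.0, 3.0, 0.5],
    [1.0, 4.0, 2.5],
    [1.0, 5.5, 4.0],
]
query = [1.0, 20.0, 20.0]  # hypothetical compound far from the training data

for row in X_train:
    print(f"training leverage: {leverage(row, X_train):.3f}")
print(f"query leverage:    {leverage(query, X_train):.3f}")
```

Training leverages lie between 0 and 1 and sum to the number of columns of $X$, whereas a compound far from the training data receives a much larger value, which is what flags it as outside the domain.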
3. Results and discussion
The results of the PCA and PLS analyses, the prediction of pIC50 for the in-house synthesized pyrazole derivatives and the design of new compounds are presented below, while the applicability domain results for each QSAR model are given in the Supplementary Information (Figures S1 to S6).
3.1. Correlation matrix and PCA analyses for all compound dataset of each cancer cell line
PCA [59] was conducted to examine the correlations between the variables under study. To decrease the redundancy in the data matrix, highly correlated descriptors (R ≥ 0.75) were excluded. The remaining descriptors of the training set were submitted to PCA. These descriptors showed significant correlation with pIC50 and weak correlation with each other. For all cell lines, the first two principal axes, F1 and F2, were sufficient to describe and explore the information given by the correlation matrix, which was established to obtain information on the negative and positive correlations between the descriptors. Then, to better understand the influence of each descriptor on anticancer activity, we used the correlation circle and the biplot to represent the projections of the variables on the first two axes, F1 and F2.
3.1.1. PCA based QSAR model of compounds dataset for lung cancer cell line - A-549
The correlation between the 4 selected descriptors and pIC50 against lung cancer cell line is illustrated in the correlation matrix given in Supplementary Information Table S8.
For the lung cancer cell line, the PCA correlation circle showed good collinearity between anti-cancer activity and the descriptors, notably the partition coefficient (Fig. 1A). The biplot in Fig. 1B illustrates the clustering of the training set molecules into two groups.
The first group, formed by molecules 12, 23, 14, 43 and 31 in the top right, showed moderate anticancer activity (4.60 ≤ pIC50 ≤ 4.95) against the lung cancer cell line. This may be due to the descriptor 'chi1v_C' (carbon connectivity index), which contributed negatively to the anticancer activity, with a correlation coefficient r = −0.559. The second group, comprising compounds 5–13, 15, 16, 19–21, 24, 25, 27–30, 34–36 and 39–42 in the bottom right, showed much better anticancer activity (4.95 ≤ pIC50 ≤ 5.95) against the lung cancer cell line. The descriptor 'a_nCl' (number of chlorine atoms) negatively affects the anticancer activity (r = −0.417), and the descriptor 'SlogP_VSA6' (the accessible van der Waals surface area in the interval (0.20, 0.25)) also contributed negatively to the anticancer activity, with a correlation coefficient r = −0.296.
Table 1
Splitting of the entire dataset into training and test sets for the six cancer cell lines.
| Cancer cell line | Total number of molecules | Training set | Test set |
| --- | --- | --- | --- |
| Lung cancer (A-549) | 33 | 26 | 7 |
| Breast cancer (MCF-7) | 38 | 30 | 8 |
| Cervical cancer (HeLa) | 30 | 25 | 5 |
| Liver cancer (HepG-2) | 23 | 19 | 4 |
| Pancreatic cancer (PaCa-2) | 33 | 26 | 7 |
| Colon cancer (DLD-1) | 28 | 22 | 6 |

3.1.2. PCA based QSAR model of compounds dataset for breast cancer cell line - MCF-7
The correlation between the 4 selected descriptors and pIC50 for compounds against the breast cancer cell line is illustrated in the correlation matrix given in Supplementary Information Table S9.
For the breast cancer cell line, the PCA correlation circle shows a strong correlation between pIC50 and SlogP_VSA4 (Fig. 2A). The biplot (Fig. 2B) revealed three distinct clusters of molecules.
The first cluster consists of three molecules (compounds 44–46) in the top right of the biplot. These compounds show anti-cancer activity of pIC50 = 4.60–5.01 against the MCF-7 cell line and are mainly characterized by the descriptor SlogP_VSA4, with a Pearson coefficient r = −0.564. The second cluster consists of molecules 64, 74 and 84, characterized by moderate anticancer activity against the MCF-7 breast cancer cell line, probably due to the negative contribution of the descriptor 'GCUT_SMR_1', with a correlation coefficient r = −0.448. The third cluster is made up of the other molecules in the dataset, which clump together around the center of the biplot and have limited anti-cancer activity.
Fig. 1. PCA plot: (A) correlation circle of the relevant descriptors associated with anticancer activity (pIC50); (B) biplot of the molecules on the two axes F1 and F2. The percentages of variance are estimated as 53.9% and 34.06% for F1 and F2, respectively. The total estimated variance is 87.97%.
Fig. 2. PCA plot: (A) correlation circle of the relevant descriptors associated with anti-cancer activity (pIC50); (B) biplot of the molecules on the two axes F1 and F2. The percentages of variance are estimated as 40.13% and 36.56% for F1 and F2, respectively. The total estimated variance is 76.69%.

3.1.3. PCA based QSAR model of compounds dataset for cervical cancer cell line - HeLa
The correlation matrix presented in Supplementary Information Table S10 shows the correlation values between the 6 selected descriptors and the pIC50 for compounds against the cervical cancer cell line.
The correlation circle obtained from the PCA for the cervical cancer cell line (Fig. 3A) shows a strong correlation between anti-cancer activity against the HeLa line and the TPSA descriptor (the 2D topological polar surface area, Å²). The correlation circle also indicates a certain level of correlation between BCUT_SLOGP_3, PEOE_VSA_FNEG and anti-cancer activity against HeLa. PEOE_VSA_FNEG is the fractional negative van der Waals surface area, i.e., the sum of the v_i for which q_i is negative, divided by the total surface area; the v_i are calculated using a connection table approximation.
The biplot in Fig. 3B distinguishes three clusters of molecules separated according to their anti-cancer activities.
Molecules 99, 109, 119 and 129 belong to the first cluster, represented by two descriptors, 'a_nCl' (number of chlorine atoms) and 'GCUT_SMR_1' (atomic contribution to molar refraction), which contributed negatively to activity against the HeLa cell line. The second cluster consists of molecules 115, 125 and 126, which show good anticancer activity. This can be attributed to the descriptor 'TPSA', the 2D topological polar surface area based on the contributions of the functional groups of all polar atoms (e.g., oxygen and nitrogen). This descriptor contributed to increasing the anti-cancer activity of the molecules against the HeLa cell line, with a correlation coefficient r = 0.641. The descriptor 'BCUT_SLOGP_3' (molecular polarity and the way in which negative and positive electric charges are distributed in a molecule) also contributed positively to the anticancer activity, with a correlation coefficient r = 0.472. The third cluster, which includes molecules 92, 94, 95, 124 and 126, showed moderate anticancer activity. This group is described by the descriptor 'a_nBr' (the number of bromine atoms in the molecule), with a correlation coefficient r = 0.412. This suggests that introducing a bromine atom into the molecule will probably increase the anticancer activity against cervical cancer.
3.1.4. PCA based QSAR model of compounds dataset for liver cancer cell line - HepG-2
The correlation matrix in Supplementary Information Table S11 shows the relationship between the 3 selected descriptors and the pIC50 for compounds against the liver cancer cell line.
For the liver cancer cell line, the correlation circle obtained through PCA (Fig. 4A) suggests a correlation between cytotoxicity and the descriptor PEOE_VSA_PPOS. The biplot in Fig. 4B highlights the contributions of the molecules to anti-cancer activity against the HepG-2 cell line, which fall into two distinct groups. The first group, consisting of molecules 155, 159 and 160, shows higher anti-cancer activity against the HepG-2 liver cancer cell line. This might be because the descriptor 'PEOE_VSA_PPOS' (total positive polar van der Waals surface area, 2D) contributed positively, with a correlation coefficient r = 0.559. The second group, consisting of compounds 134 and 146, is characterized by moderate anticancer activity, which might be because the descriptor 'PEOE_VSA+2' (total positive van der Waals surface area 2, 2D) contributed positively, with a moderate correlation coefficient r = 0.374.
Fig. 3. PCA plot: (A) correlation circle of the relevant descriptors associated with anti-cancer activity (pIC50); (B) biplot of the molecules on the two axes F1 and F2. The percentages of variance are estimated as 38.86% and 21.69% for F1 and F2, respectively. The total estimated variance is 63.56%.

3.1.5. PCA based QSAR model of compounds dataset for pancreatic cancer cell line - PaCa-2
The correlation matrix in Supplementary Information Table S12 shows the relationship between the four chosen descriptors and the pIC50 for compounds against the pancreatic cancer cell line.
Fig. 5A shows the correlation circle corresponding to a projection of the variables on the two-dimensional plane obtained in the PCA. A good correlation is observed between the descriptors 'opr_nrot' and 'BCUT_PEOE_1', with a coefficient r = 0.710. On the other hand, a negative correlation is observed between the descriptors 'BCUT_PEOE_1' and 'PEOE_VSA+5'. The biplot (Fig. 5B) shows the compounds distributed in four distinct groups. The first group consists of compounds 172, 171, 161, 162 and 167; these molecules show relatively lower anticancer activity against the PaCa-2 pancreatic cancer cell line, probably because the descriptor 'Q_RPC+' (the relative positive partial charge, 2D) contributed to decreasing the anticancer activity, with a correlation coefficient r = −0.305. The second group consists of molecules 164, 165, 182, 183, 185, 186 and 194–197 and is characterized by moderate anticancer activity against the PaCa-2 pancreatic cancer cell line. The descriptor 'opr_nrot' (the number of rotatable single bonds in the molecule) could contribute to decreasing the anticancer activity, as reflected by a correlation coefficient r = −0.581. The third group, consisting of molecules 163, 178, 181, 184, 191, 192 and 194–196, shows moderate cytotoxicity against this cancer cell line and is characterized by the descriptor 'BCUT_PEOE_1', which probably contributed negatively to the anticancer activity, with a correlation coefficient r = −0.238. This indicates that the anticancer activity of these molecules might be affected by partial charge. Finally, the fourth group, consisting of molecules 169, 170, 171, 174, 175, 180, 187, 190 and 193, showed a promising cytotoxic effect against PaCa-2 and is characterized by the descriptor 'PEOE_VSA+5', which contributed positively to the anticancer activity, with a correlation coefficient r = 0.449.
3.1.6. PCA based QSAR model of compounds dataset for colon cancer cell line - DLD-1
The correlation matrix in Supplementary Information Table S13 shows the relationship between the 3 selected descriptors and the pIC50 for compounds against the colon cancer cell line.
Fig. 4. PCA plot: (A) correlation circle of the relevant descriptors associated with anti-cancer activity (pIC50); (B) biplot of the molecules on the two axes F1 and F2. The percentages of variance are estimated as 40.68% and 30.73% for F1 and F2, respectively. The total estimated variance is 71.42%.
Fig. 5. PCA plot: (A) correlation circle of the relevant descriptors associated with anti-cancer activity (pIC50); (B) biplot of the molecules on the two axes F1 and F2. The percentages of variance are estimated as 50.10% and 26.60% for F1 and F2, respectively. The total estimated variance is 76.70%.
In Fig. 6A, the correlation circle generated by PCA projects the variables on a two-dimensional plane and shows a good correlation coefficient (r > 0.5) between most of the variables. Specifically, a negative correlation is observed between the descriptors 'GCUT_SMR_1' and 'rings' (r = −0.505) along the F1 axis. However, a low correlation is observed between 'GCUT_SMR_1' and 'a_nCl' (r = 0.009), as well as between 'petitjeanSC' and 'a_nCl' (r = 0.240). The biplot (Fig. 6B) distinguishes three groups of molecules. The first group, consisting of molecules 201, 204 and 208–211, shows moderate anticancer activity against the DLD-1 cell line. This might be due to the negative contribution of the descriptor 'a_nCl' (number of chlorine atoms in the molecule), reflected by a correlation coefficient r = −0.197. The second group, formed by molecules 221, 225, 233, 235 and 236, also shows moderate anticancer activity, explained by the descriptor 'GCUT_SMR_1' (molar refraction), with a correlation coefficient r = 0.357. The third group, constituted by molecules 213, 215–218, 220, 225, 226 and 232, shows slightly better anticancer activity against the DLD-1 cell line, which might be due to the positive contribution of the descriptor 'petitjeanSC', reflected by a correlation coefficient r = 0.653.
3.2. PLS models for each cell line
Using the same datasets as in the PCA, six more QSAR models were developed using the PLS technique, correlating biological activity with the relevant descriptors for each cancer cell line. All PLS-based models were then rigorously evaluated for statistical validity and prediction performance. The best models obtained with the PLS method were retained after several attempts to develop a strong relationship with the cytotoxicity indicator variable pIC50, and models that did not meet the OECD guidelines were excluded. Table 2 shows the PLS-derived equations for each cancer cell line in the present study.
The model equations in Table 2 are statistically significant with appropriate internal and external validation values (Table 3), except for
A B
Fig. 6.PCA plot - (A) Correlation circle of the relevant descriptors associated with anti-cancer activity (pIC50), (B) Biplot of molecules explained in two axes F1 and F2. The percentages of the variance are estimated as 42.56% and 32.62% for the two axes F1 and F2, respectively. The total estimated variance is 66.18%.
Table 2
QSAR equations derived from the PLS method for each cancer cell line.
| Cancer cell line | QSAR model equation developed with the PLS method |
| --- | --- |
| A-549 | pIC50 = 10.40364 − 0.79685 × chi1v_C − 0.35955 × a_nCl − 0.02641 × SlogP_VSA6 + 0.11718 × logP(o/w) [Equation 1] |
| MCF-7 | pIC50 = −0.18535 − 1.79854 × BCUT_PEOE_0 − 20.79526 × GCUT_SMR_1 − 0.40536 × chi1v_C − 0.06209 × SlogP_VSA4 [Equation 2] |
| HeLa | pIC50 = −15.79229 + 10.09778 × BCUT_SLOGP_3 − 0.94880 × GCUT_SMR_1 + 0.81794 × a_nBr + 0.01606 × a_nCl − 5.23240 × PEOE_VSA_FNEG − 0.01542 × TPSA [Equation 3] |
| HepG-2 | pIC50 = 4.23965 + 0.40534 × BCUT_PEOE_0 + 0.04445 × PEOE_VSA+2 + 0.03732 × PEOE_VSA_PPOS [Equation 4] |
| PaCa-2 | pIC50 = 8.79762 + 0.01637 × PEOE_VSA+5 − 5.58363 × Q_RPC+ − 0.33990 × opr_nrot + 2.43658 × BCUT_PEOE_1 [Equation 5] |
| DLD-1 | pIC50 = 2.83887 + 6.02707 × petitjeanSC + 13.63331 × GCUT_SMR_1 + 0.33955 × rings − 0.43086 × a_nCl [Equation 6] |
Table 3
Statistical parameters obtained for the six cancer cell lines during PLS-based QSAR model development.
| Cancer cell line | Correlation coefficient (R2) | RMSE | Cross validated RMSE | Internal validation (Q2) | External validation (R2) |
| --- | --- | --- | --- | --- | --- |
| A-549 | 86.35% | 14.20 | 18.98 | 76.20% | 74.0% |
| MCF-7 | 85.44% | 3.75 | 19.87 | 70.50% | 80.8% |
| HeLa | 82.80% | 15.20 | 20.70 | 68.70% | 58.9% |
| HepG-2 | 86.46% | 7.82 | 9.70 | 79.67% | 80.4% |
| PaCa-2 | 90.15% | 11.95 | 15.66 | 83.18% | 59.4% |
| DLD-1 | 82.08% | 29.80 | 36.96 | 72.90% | 87.1% |
the two models corresponding to the HeLa and PaCa-2 cell lines, which are moderate. The values of the statistical parameters obtained during model development and validation are given in Table 3 for each cancer cell line. In particular, the correlation coefficient (R2), the cross-validated R2 (Q2), and the low RMSE and cross-validated RMSE suggest that these models are statistically significant and have good external predictive ability. All estimated R2 and Q2 values are > 0.5.
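The statistics reported in Table 3 can be illustrated with a small sketch. It assumes a single-descriptor least-squares model and leave-one-out cross-validation, which is a simplification and not the PLS procedure used in the study; the data values and function names are hypothetical.

```python
import math


def fit_line(x, y):
    """Least-squares fit y = a + b*x for a single descriptor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r2_and_rmse(y_obs, y_pred):
    """Return (1 - SS_res / SS_tot, root-mean-square error)."""
    n = len(y_obs)
    mean_y = sum(y_obs) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean_y) ** 2 for o in y_obs)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


def leave_one_out_predictions(x, y):
    """Predict each point from a model fitted on all the other points."""
    preds = []
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(a + b * x[i])
    return preds


# Hypothetical descriptor and pIC50 values (assumption, not data from the study).
descriptor = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
pic50 = [8.9, 8.6, 8.1, 7.9, 7.2, 7.0]

intercept, slope = fit_line(descriptor, pic50)
fitted = [intercept + slope * v for v in descriptor]
r2, rmse = r2_and_rmse(pic50, fitted)
# Q2 is computed here as 1 - PRESS / SS_tot, using the full-data mean in SS_tot.
q2, cv_rmse = r2_and_rmse(pic50, leave_one_out_predictions(descriptor, pic50))
print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f}, Q2 = {q2:.3f}, CV-RMSE = {cv_rmse:.3f}")
```

For a least-squares fit, Q2 cannot exceed R2 and the cross-validated RMSE cannot fall below the fitted RMSE, which matches the pattern seen across the rows of Table 3.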
3.2.1. PLS based QSAR model analysis for compounds against lung cancer cell line - A-549
The PLS-derived QSAR model equation (Equation 1) indicates that the number of chlorine atoms (a_nCl) makes a strong contribution to the predicted anti-cancer activity. Its coefficient is negative (−0.35955), which suggests that introducing chlorine atoms into the structures has an unfavorable effect on the cytotoxic activity.
3.2.2. PLS based QSAR model analysis for compounds against breast cancer cell line-MCF-7
For the breast cancer cell line, the PLS-generated QSAR model shows that the descriptors GCUT_SMR_1 and BCUT_PEOE_0 have a negative influence in the model equation. The model therefore suggests that increased activity can be achieved by decreasing the molar refractivity and relative partial positive charge values.
3.2.3. PLS based QSAR model analysis for compounds against cervical cancer cell line - HeLa
The QSAR model generated through the PLS method indicates that the negative van der Waals surface area descriptor (PEOE_VSA_FNEG), the molar refractivity descriptor (GCUT_SMR_1) and the topological polar surface area (TPSA) have a negative contribution in the model equation, suggesting that increased anti-cancer activity can be gained by decreasing the negative charge, the molar refractivity and the polar surface area of the molecule. The number of bromine atoms (a_nBr) has a positive coefficient and may therefore contribute to increased anti-cancer activity. Overall, the model equation suggests that introducing bromine atoms into the chemical structure might enhance cytotoxic activity.
3.2.4. PLS based QSAR model analysis for compounds against liver cancer cell line -HepG-2
For compounds experimentally tested against the liver cancer cell line HepG-2, the PLS-based QSAR model suggests that the positive polar van der Waals surface area (PEOE_VSA_PPOS) has a positive influence and can contribute to increased anticancer activity. This suggests that positive partial charge in the structures can favorably affect the cytotoxic activity. The total positive van der Waals area descriptor (PEOE_VSA+2) also has a positive coefficient in the model equation, suggesting that increased activity can be achieved by increasing the value of this descriptor.
3.2.5. PLS based QSAR model analysis for compounds against pancreatic cancer cell line-PaCa-2
The PLS-derived QSAR model equation shows that the relative positive partial charge (Q_RPC+) has a negative coefficient, suggesting that increased activity can be achieved by decreasing the value of this descriptor in the molecule. The number of rotatable single bonds (opr_nrot) also enters the model with a negative coefficient (−0.33990), so a smaller number of rotatable bonds is expected to favor anti-cancer activity against the pancreatic cancer cell line PaCa-2.
3.2.6. PLS based QSAR model analysis for compounds against colon cancer cell line -DLD-1
The PLS-based QSAR model developed for compounds against the colon cancer cell line DLD-1 shows that the number of chlorine atoms (a_nCl) has a negative coefficient, indicating that reducing the number of chlorine atoms can lead to an increase in anticancer activity. Moreover, the number of rings in the molecule (rings) has a positive coefficient, the smallest in magnitude in the model equation, implying that the ring count makes only a modest positive contribution to activity.
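The sign and size of each coefficient in Equation 6 (Table 2) can be read as the change in predicted pIC50 per unit change of the descriptor, with the other descriptors held fixed. The sketch below illustrates this; the coefficients are those of Equation 6, while the baseline descriptor values and function names are hypothetical.

```python
# Intercept and coefficients of Equation 6 (DLD-1), as listed in Table 2.
INTERCEPT = 2.83887
COEFFS = {
    "petitjeanSC": 6.02707,
    "GCUT_SMR_1": 13.63331,
    "rings": 0.33955,
    "a_nCl": -0.43086,
}


def predict_pic50(descriptors):
    """Evaluate the linear model for a dict of descriptor values."""
    return INTERCEPT + sum(COEFFS[name] * descriptors[name] for name in COEFFS)


def effect_of_change(descriptors, name, delta):
    """Change in predicted pIC50 when one descriptor is shifted by delta."""
    changed = dict(descriptors)
    changed[name] += delta
    return predict_pic50(changed) - predict_pic50(descriptors)


# Hypothetical baseline molecule (assumption, not a compound from the study).
base = {"petitjeanSC": 0.5, "GCUT_SMR_1": -0.2, "rings": 3, "a_nCl": 1}

for name in ("a_nCl", "rings"):
    print(f"one more unit of {name}: {effect_of_change(base, name, 1):+.5f}")
```

Because the model is linear, these per-unit effects do not depend on the chosen baseline: adding one chlorine atom lowers the prediction by the a_nCl coefficient, and adding one ring raises it by the rings coefficient.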
3.3. Prediction of anticancer activity of the in-house synthesized pyrazole derivatives
The present study also aimed to predict the anticancer activity of an in-house synthesized series of pyrazole derivatives. After deriving statistically significant, calibrated and classified QSAR models, we predicted the pIC50 values of the 63 in-house synthesized pyrazole derivatives. The best pIC50 values of the four top-ranked pyrazole derivatives for each cell line are given in Table 4. The results for all 63 in-house synthesized compounds, along with their pIC50 values, are summarized in Supplementary Information Table S14.
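For readers who want to translate the predicted pIC50 values into concentrations, the sketch below uses the standard definition pIC50 = −log10(IC50), with IC50 expressed in mol/L. That the IC50 values underlying these models are in molar units is an assumption here, and the function names are illustrative.

```python
import math


def pic50_to_ic50_molar(pic50):
    """Convert pIC50 to IC50 in mol/L, assuming pIC50 = -log10(IC50 [M])."""
    return 10.0 ** (-pic50)


def ic50_molar_to_pic50(ic50_molar):
    """Convert IC50 in mol/L to pIC50."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_molar)


# Example: the highest predicted pIC50 in Table 4 (compound 63, A-549).
ic50_nm = pic50_to_ic50_molar(9.40) * 1e9
print(f"pIC50 = 9.40 corresponds to IC50 = {ic50_nm:.3f} nM")
```

Under this definition, each additional pIC50 unit corresponds to a tenfold lower IC50.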
3.3.1. Anti-proliferative effects of in-house synthesized pyrazole derivatives against the lung cancer cell line -A-549
According to the predicted results (Table 8) obtained for our pyrazole derivatives through structure–activity relationship (SAR) analyses, compound 63 shows the most promising anticancer activity against the lung cancer cell line (A-549), with a pIC50 value of 9.40. Compounds 1, 44 and 46 (Fig. 7) showed moderate to good potency against this cell line, with pIC50 values of 9.13, 9.22 and 8.85, respectively.
The predicted pIC50 values for all 63 pyrazole derivatives range from 7.57 to 9.40.
In particular, compound 63, bearing an electron-donating methyl substituent (–CH3) on the pyrazole ring A (Fig. 8) and a thiol substituent (–SH) on the oxadiazole ring, showed the most promising cytotoxic effect against the lung cancer cell line A-549 (pIC50 = 9.40). This value is higher than that reported for some other pyrazole chalcone derivatives (pIC50 = 3.82) [60]. Compound 1, which also has a methyl group on the pyrazole ring A together with a hydrazide-substituted ketone function, showed moderate cytotoxicity against the A-549 cell line (pIC50 = 9.13) compared with the result reported in the same literature [60]. Conjugate 44, bearing a methyl group on ring A (pyrazole) and a thiol group on ring B (triazole), showed good anticancer activity against the A-549 cell line, with pIC50 = 9.22.
Finally, conjugate 46, having a 3,5-dimethyl substitution on ring A (pyrazole) and a thiol substituent on ring B (triazole), showed a moderate cytotoxic effect against the A-549 cell line, with pIC50 = 8.85.
3.3.2. Anti-proliferative effects of in-house synthesized pyrazole derivatives against the breast cancer cell line - MCF-7
The pIC50 prediction study for the breast cancer cell line shows that four compounds, pyrazole derivatives 5, 9, 10 and 13, have the strongest anti-proliferative potential, as observed from the model equation.
Table 4
Predicted anti-cancer potential (pIC50) of the four best in-house synthesized pyrazole derivatives, based on the QSAR model derived for each cancer cell line.
| Cell line | Cancer type | pIC50 values of the best potential inhibitors for each cell line |
| --- | --- | --- |
| A-549 | Lung cancer | M1 = 9.13; M44 = 9.22; M46 = 8.85; M63 = 9.40 |
| MCF-7 | Breast cancer | M5 = 6.80; M9 = 6.53; M10 = 6.88; M13 = 6.54 |
| HeLa | Cervical cancer | M30 = 7.17; M32 = 7.20; M42 = 7.27; M43 = 7.19 |
| HepG-2 | Liver cancer | M48 = 5.80; M50 = 5.80; M55 = 5.80; M59 = 5.83 |
| PaCa-2 | Pancreatic cancer | M11 = 5.38; M44 = 5.97; M46 = 5.89; M63 = 6.22 |
| DLD-1 | Colon cancer | M46 = 8.15; M49 = 8.32; M50 = 8.21; M52 = 7.88 |
Compound 10, with a methyl substituent on the pyrazole ring (ring A) and a benzene ring (ring B) substituted by the electron-donating groups 4-hydroxy (4-OH) and 3-methoxy (3-OCH3), shows the best cytotoxic effect against the MCF-7 cell line (pIC50 = 6.88). This observation suggests that the presence of electron-donating groups such as methyl (–CH3), hydroxy (–OH) and methoxy (–OCH3) increases the anti-proliferative activity. The pIC50 value of 6.88 for compound 10 is also higher than that of the pyrazoline derivatives (pIC50 = 5.17 ± 3.54) against the MCF-7 cell line reported earlier by Irfan Khan et al. and Fatihtok et al. [60,61]. In contrast, the synthesized compounds carrying electron-withdrawing groups have a low anti-proliferative potential. Based on this activity profile, the presence of 2,4-Cl, Br or N(CH3)2 groups on ring B can be considered a key factor responsible for decreased cytotoxicity. In addition to compound 10, derivatives 5, 9 and 13 are also predicted to be potent, with pIC50 values of 6.80, 6.54 and 6.53, respectively (Fig. 9). Interestingly, these pyrazole compounds contain a methyl group at the carbon 5 position of ring A and nitro, furan and fluoro groups on ring B, respectively, which indicates that electron-withdrawing groups on ring B probably enhance the cytotoxicity for this cancer cell line (Fig. 10). Therefore, the descending order of potential activity by substituent on the ring might be proposed as 2-OH > OCH3 > furan > NO2 > (-N-CH3)2 > 2,4-Cl > Br > CH3. For comparison with the range predicted here (pIC50 = 6.53–6.88), previously reported values for the MCF-7 cell line include pyrazole chalcones (IC50 = 3.82 ± 0.22) [60], pyrazolopyrimidine derivatives (IC50 = 7.69) [60] and pyrazoline derivatives (pIC50 = 5.18 ± 3.54).
3.3.3. Anti-proliferative effects of in-house synthesized pyrazole derivatives against the cervical cancer cell line - HeLa
For the cervical cancer cell line HeLa, we found four pyrazole derivatives, compounds 30, 32, 42 and 43 (Fig. 11), to be the most potent of all based on the predicted pIC50 values.
In particular, structural analyses suggest that dimethyl substitution on the pyrazole ring A as well as the phenyl, methoxy, N-dimethyl, bromo and methyl substitutions on the benzene ring B might be
Fig. 7. Structures of the novel series of pyrazole triazole thiol and pyrazole oxadiazole thiol compounds found to exhibit the best pIC50 values against the lung cancer cell line A-549.
Fig. 8. Structure-activity relationship study (SAR) of pyrazole triazole thiol derivative as potent anticancer agents.
promising for cytotoxic activity against the cervical cancer cell line HeLa. The predicted pIC50 values of compounds 30, 32, 42 and 43 are 7.17, 7.20, 7.27 and 7.19, respectively. This observation is consistent with a previous study [60], in which pyrazoline derivatives were synthesized and evaluated as potential anticancer agents against the HeLa cell line and a pIC50 value of 9.41 ± 2.20 was obtained. Another study on synthesized pyrazolo derivatives reported similar findings against the HeLa cell line [61]. Based on these observations, the descending order of activity for substituents on the benzene ring B might be (-N-CH3)2 > phenyl > Br > CH3 (Fig. 12). In contrast, compounds bearing –NO2, 2,4-Cl and –CH3 groups exhibited less anti-proliferative activity against the cervical cancer cell line.
3.3.4. Anti-proliferative effects of in-house synthesized pyrazole derivatives against the liver cancer cell line – HepG-2
Predicted pIC50 values of the in-house synthesized pyrazole derivatives against the liver cancer cell line show that compounds 48, 50, 55 and 59 (Fig. 13) have greater anti-proliferative potential than the other synthesized compounds, as observed from the model equation.
Compounds 48 and 50, bearing a methyl (–CH3) substituent on ring A and bromo (Br) and methoxy (OCH3) substituents on benzene ring B, show a promising cytotoxic effect against this cancer cell line, with the same pIC50 value of 5.80 (Fig. 14). Compounds 55 and 59, which have a –CH3 group on pyrazole ring A and a benzene ring B substituted by electron-withdrawing groups, exhibit significant cytotoxicity against the liver cancer cell line, with pIC50 = 5.80 and 5.83, respectively. This observation indicates that the presence of electron-withdrawing groups can promote anti-cancer activity. Our findings are also comparable to earlier reports [49], in which pyrazole thiazole[2,3-b]quinazolinone compounds showed promising anti-cancer activity against the HepG-2 cell line with pIC50 values of 5.12–6.14 [63]. Afifi et al. [64] also synthesized a new series of pyrazoles and pyrazolo[1,5-a]pyrimidines containing benzothiazole that showed significant anticancer activity against the HepG-2 cell line (pIC50 = 4.73–4.02). Moreover, phenyl, chloro (-Cl) and bromo (-Br) groups on ring B give significant anti-cancer potential compared with compounds bearing a furan group on the same ring, suggesting that the presence of an electron-withdrawing group on ring B can improve the cytotoxicity profile.
Fig. 9. Structures of in-house synthesized pyrazole acetohydrazide showing the best pIC50 values against the breast cancer cell line MCF-7.
Fig. 10. Structure-activity relationship study (QSAR) of pyrazole acetohydrazide derivative as potent anticancer agents.
Fig. 11. Structures of in-house synthesized series of pyrazole acetohydrazide showing the best pIC50 values against the cervical cancer cell line HeLa.
Fig. 12. Structure-activity relationship study (SAR) of pyrazole acetohydrazide derivative as potent anticancer agents.
3.3.5. Anti-proliferative effects of in-house synthesized pyrazole derivatives against the pancreatic cancer cell line – PaCa-2
Based on the QSAR model equation obtained for the pancreatic cancer cell line PaCa-2, prediction for the in-house synthesized pyrazole derivatives gave the most prominent pIC50 values for compounds 11, 44, 46 and 63 (Fig. 15).
In particular, close structural analysis shows that oxadiazole compound 63, which contains a thiol group, has a very prominent cytotoxic effect against the PaCa-2 cell line, with pIC50 = 6.22. Compounds 44 and 46 also show moderate anticancer activity, with pIC50 values of 5.97 and 5.88, respectively, suggesting that the presence of –NH2 as an electron-donating group on the triazole ring (compound 44) and dimethyl substitution on the pyrazole ring (compound 46) probably enhances the cytotoxic effect against the PaCa-2 pancreatic cancer cell line. Compound 11, which contains a methyl group on the benzene ring, shows more modest anti-cancer activity on this cell line, with a pIC50 value of 5.38.
3.3.6. Anti-proliferative effects of in-house synthesized pyrazole derivatives against the colon cancer cell line – DLD-1
For the colon cancer cell line DLD-1, our QSAR model's predictions allowed us to identify the four best pyrazole derivatives, 46, 49, 50 and 52, with pIC50 values of 8.15, 8.32, 8.21 and 7.88, respectively (Fig. 16).
Specifically, the triazole pyrazole compound 46, bearing an amino group (–NH2) on the triazole ring and a dimethyl group on the pyrazole ring, shows the highest cytotoxicity against the DLD-1 cell line. Compounds 49 and 50, which bear a methyl group (–CH3) on the pyrazole ring (A) and either another –CH3 group (compound 49) or a methoxy group (compound 50) on the benzene ring (B), also exhibited significant anticancer activity, suggesting that the presence of –CH3 and –OCH3 as electron-donating groups on the benzene ring (B) can probably increase cytotoxicity. Compared with compounds 46, 49 and 50, compound 52, which contains a phenyl group on the pyrazole ring (A) and a dimethylamino substituent on the benzene ring (B), shows less growth inhibition. In particular, the findings suggest that the pyrazole carbohydrazide compounds show moderate to low cytotoxic activity against this cancer cell line. Therefore, the contribution of the following groups to anticancer activity can be proposed in the descending order NH2 > OCH3 > N(CH3)2 > Cl > 2,4-Cl > 2-Br > furan > F > CH3 (Fig. 17).
Fig. 13. Structures of in-house synthesized pyrazole triazole thiol derivatives showing the best pIC50 values against the liver cancer cell line HepG-2.
4. Design and prediction of pIC50 of newly designed pyrazole derivatives
Apart from predicting the activity of the already synthesized in-house pyrazole derivatives, we also designed a few new compounds, guided by the contribution of the descriptors, that might have greater potency than the previously synthesized compounds. Based on the pIC50 values obtained through the best model equation for each cell line, we calculated the standardized coefficient (t-test value) of each descriptor and noted the descriptors with the largest t-test values, which have the most influence on pIC50. The t-test values obtained for each cancer cell line are listed below. The compound used as the template for designing the new structures is also given, together with the predicted pIC50 values for each cancer cell line. The top five ranked designed compounds for each cancer cell line, based on their predicted pIC50, are displayed below, while the 15 newly designed compounds for each cancer cell line are given in Supplementary Data, Tables S15 to S20.
4.1. Design and prediction of new potential compound against lung cancer cell line A-549
The contribution of each descriptor in the developed QSAR model was analyzed carefully. The descriptors chi1v_C, a_nCl and SlogP_VSA6 have a negative influence on anticancer activity (pIC50), whereas logP(o/w) has a positive influence. The t-test values of the model descriptors are 33.402068, 15.07148591, 1.107044758 and 4.91190855 for chi1v_C, a_nCl, SlogP_VSA6 and logP(o/w), respectively; those of chi1v_C and a_nCl are the largest. Considering these results, we carried out substitutions of chemical groups at specific positions and then recalculated the activities using the equation of the proposed model. The compound that served as the template for the substitution is compound 63, displayed below.
Fig. 14. Structure-activity relationship study (SAR) of pyrazole triazole thiol derivative as potent anticancer agents.
Fig. 15. Structures of in-house synthesized pyrazole acetohydrazide, oxadiazole and triazole thiol compounds showing the best pIC50 values against the PaCa-2 pancreatic cancer cell line.
pIC50 = 10.40364 − 0.79685 × chi1v_C − 0.35955 × a_nCl − 0.02641 × SlogP_VSA6 + 0.11718 × logP(o/w)
To increase the anti-cancer activity, it is necessary to decrease the value of the descriptor chi1v_C (the carbon valence connectivity index of order 1), which must be lower than 1.4571. To do so, we replaced the ‘R’ group with different chemical groups characterized by low carbon connectivity (if the ‘R’ group contains more carbon atoms, an increase in the chi1v_C value may be observed) while avoiding groups containing a chlorine atom (a_nCl = 0.00). The SlogP_VSA6 descriptor (a sum of van der Waals surface area by SlogP contribution) can be kept at zero (SlogP_VSA6 = 0.00) by substituting the ‘R’ group with smaller chemical species of stronger electron-accepting ability, such as POCl2, NH2, N(CH3)2, NO2 and OH (with the exception of the fluoroform-substituted compound 15). Increasing the value of the descriptor logP(o/w) (logP(o/w) > 1.3280) indicates that the activity is higher for groups containing OH, NO2, SH, Br and the phosphonic dichloride function. Table 5 lists the newly designed structures with their substituted chemical groups and predicted activity for lung cancer.
Fig. 16. Structures of in-house synthesized series of pyrazole triazole thiol compounds showing the best pIC50 values against the DLD-1 colon cancer cell line.
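As a worked check of how the A-549 model equation turns descriptor values into a predicted activity, the sketch below evaluates it for the three chlorine-free designed compounds listed in Table 5 (R = nitro, dimethylamine, amine). The coefficients and descriptor values are taken from the text; the function and variable names are illustrative.

```python
def predict_pic50_a549(chi1v_c, a_ncl, slogp_vsa6, logp_ow):
    """Evaluate the A-549 QSAR equation for one set of descriptor values."""
    return (
        10.40364
        - 0.79685 * chi1v_c
        - 0.35955 * a_ncl
        - 0.02641 * slogp_vsa6
        + 0.11718 * logp_ow
    )


# Descriptor values (chi1v_C, a_nCl, SlogP_VSA6, logP(o/w)) from Table 5.
designed = {
    "R = nitro": (0.96, 0, 0, 1.40),
    "R = dimethylamine": (0.96, 0, 0, 1.40),
    "R = amine": (0.96, 0, 0, 1.10),
}

for label, values in designed.items():
    print(f"{label}: predicted pIC50 = {predict_pic50_a549(*values):.2f}")
```

With chi1v_C fixed at 0.96 and no chlorine, the differences between these compounds come entirely from the logP(o/w) term.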
4.2. Design and prediction of new potential compound against breast cancer cell line MCF-7
The mathematical equation obtained from the QSAR model for the breast cancer cell line MCF-7 indicates that changes in some descriptors can increase or decrease the anticancer activity. According to the equation, the descriptors BCUT_PEOE_0, GCUT_SMR_1, chi1v_C and SlogP_VSA4 have a negative influence on anticancer activity. The calculated t-test values for BCUT_PEOE_0, GCUT_SMR_1, chi1v_C and SlogP_VSA4 are 66.69459188, 771.1429153, 15.03181456 and 2.302460446, respectively. The values for BCUT_PEOE_0 and GCUT_SMR_1 are the largest.
Therefore, to design new compounds with improved anti-cancer activity, we made appropriate substitutions using specific chemical groups and then recalculated the anti-cancer activities using the proposed model equation. The compound that served as the template for the substitution is compound 10, presented below.
pIC50 = −0.18535 − 1.79854 × BCUT_PEOE_0 − 20.79526 × GCUT_SMR_1 − 0.40536 × chi1v_C − 0.06209 × SlogP_VSA4
In particular, decreasing the value of the descriptor BCUT_PEOE_0 (a BCUT descriptor based on partial charges) should make new compounds more potent. Such compounds can be made by replacing the methyl group attached to the pyrazole ring with groups carrying a partial charge, such as NO2, CN, OCH3 and imidamide.
Similarly, the value of the descriptor GCUT_SMR_1 (a GCUT descriptor that uses the atomic contribution to molar refractivity instead of partial charge) in new compounds should be lower than that of the parent pyrazole molecule (GCUT_SMR_1 < −0.2175); such compounds can be made by replacing the methyl group attached to the pyrazole with electron-withdrawing groups such as NO2 and CN. In addition, to improve the pIC50 value, the value of the descriptor SlogP_VSA4 (a sum of van der Waals surface area by SlogP contribution) must be close to the
Fig. 17. Structure-activity relationship study (SAR) of pyrazole triazole thiol derivative as potent anticancer agent.
Table 5
Descriptor values for newly designed pyrazole carbohydrazide derivatives and their predicted anticancer activity against the lung cancer cell line A-549, calculated according to the developed QSAR mathematical model.
| New designed compound | R | chi1v_C | a_nCl | SlogP_VSA6 | logP(o/w) | pIC50 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Phosphonic dichloride | 0.96 | 2 | 0 | 2.08 | 9.88 |
| 2 | Nitro | 0.96 | 0 | 0 | 1.40 | 9.80 |
| 3 | Dimethylamine | 0.96 | 0 | 0 | 1.40 | 9.80 |
| 4 | Amine | 0.96 | 0 | 0 | 1.10 | 9.77 |