pharmacologic applications
William C. Reinhold
Genomics and Bioinformatics Group, DTB, CCR, NCI, NIH, Bethesda, MD 20892
Introduction
http://discover.nci.nih.gov/cellminer/
NCI60
Nine tissues of origin: Breast, Prostate, Melanoma, Leukemia, Lung (NSCL), Colon, Ovarian, CNS, Renal
6 are epithelial
3 are non-epithelial
DTP web site: http://dtp.nci.nih.gov/
Inventory (>500,000 cmpds) -> 60-cell line screen (>100,000 cmpds) -> Animal studies -> Clinical trials
Our Genomics and Bioinformatics Group (GBG) web site originated under the auspices of Dr. John Weinstein. It provides web applications and (Kohn) molecular interaction maps:
http://discover.nci.nih.gov/
Liu et al, MCT, 2010 Weinstein, et al., Science, 1997
Of note is our CIMminer tool, the original cluster image map (CIM) tool, conceptualized and developed by Dr. John Weinstein. CIMs are now ubiquitous in the field.
The GBG site contains MIMminer, which contains the detailed scholarly molecular interaction maps developed by Dr Kurt Kohn.
Kohn, et al., MBC, 2006
Luna, BMC Bioinformatics, 2011; Aladjem, Sci. STKE, 2004
Zeeberg, et al., Genome Biol, 2003
Zeeberg, et al., BMC Bioinformatics, 2005
The GBG site contains CellMiner, our web application containing data and tools that provide access to, and enhance interpretation of, the NCI-60:
http://discover.nci.nih.gov/cellminer/
UT Shankavaram, et al., BMC Genomics, 2009.
WC Reinhold, et al., Cancer Research, 2012.
WC Reinhold et al, Clin Cancer Res, 2015.
Countries using the site in rank order
The GBG web site is internationally recognized and heavily used, with 5,362 unique users from 112 countries in May
Our focus today will be CellMiner
http://discover.nci.nih.gov/cellminer/
Database types
DNA data
Sequencing
Sequence mutation analysis of 24 known cancer genes (Ikediobi et al., 2006)
Whole exome DNA sequencing (Abaan, Doroshow, Pommier, Meltzer et al., 2013)
DNA methylation
Bisulfite sequencing of E-cadherin promoter (Reinhold et al., MCT, 2007)
DNA fingerprinting
For the NCI-60 cell line panel (Lorenzi et al., 2009)
aCGH
Agilent 44,000 feature Human Genome CGH Microarray (Varma, PLOS One, 2014).
NimbleGen 385,000 feature Human Whole-Genome array CGH Microarray (Varma, PLOS One, 2014).
SNP / aCGH
Affymetrix 500,000 SNP GeneChip Human Mapping Array (Ruan et al., 2012)
Illumina 1,000,000 feature Human1M-Duo BeadChip (Varma, 2014).
Spectral karyotyping
A collaborative study, not on CellMiner (Roschke et al., Cancer Res, 2003), available at http://www.ncbi.nlm.nih.gov/sky/skyquery.cgi or SKYGRAMS at NCI DTP.
Expression data
Protein
Reverse-phase lysate array, 94 genes (Nishizuka et al., 2003).
microRNA
Agilent 799 feature Human miRNA Microarray V2 (Liu et al., 2010).
mRNA
Affymetrix HG-U95 65K probeset microarray (Shankavaram et al., 2007).
Affymetrix HG-U133 44K probeset microarray (Shankavaram et al., 2007).
Affymetrix ~47,000 transcript Human Genome U133 Plus 2.0 Microarray (Reinhold et al., 2010).
Agilent 44,000 feature Whole Human Genome Oligo (transcript) Microarrays (Liu et al., 2010).
Affymetrix 1.4 x 10^6 probeset GeneChip Human Exon 1.0 ST array (Gmeiner et al., 2010).
Compound and Drug activity
Assayed by DTP, J. Collins, S. Holbeck, J. Morris, B. Teicher et al (http://dtp.nci.nih.gov)
In CellMiner 1.6 we have:
20,861 compounds (with another 20k investigational)
158 FDA-approved drugs
79 in clinical trials
427 with known mechanism of action
Growth inhibition 50%, total protein (at 48 hours by sulforhodamine B assay):
Figure: activity z scores for NSC #: 168411 across the NCI-60 cell lines, grouped by tissue (BR, CNS, CO, LE, ME, LC, OV, PR, RE); axis from -2 to 4, labelled Resistant to Sensitive.
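As a rough illustration of what an activity z score is, the sketch below standardizes one compound's activity values across cell lines as (value - mean) / standard deviation. The input values are invented, the use of the sample standard deviation is an assumption, and the function name `z_scores` is hypothetical; CellMiner's own normalization may differ in detail.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values across cell lines: (x - mean) / sample standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        raise ValueError("all values are identical; z scores are undefined")
    return [(v - mu) / sd for v in values]

# Hypothetical activity values for five cell lines (illustrative only, not real data).
activity = {
    "BR:MCF7": 5.0,
    "CO:HT29": 6.0,
    "LE:K_562": 7.0,
    "ME:M14": 6.0,
    "RE:A498": 6.0,
}
z = dict(zip(activity, z_scores(list(activity.values()))))

for cell_line, score in z.items():
    print(f"{cell_line}\t{score:+.2f}")
```

With this convention a cell line whose activity is above the panel mean gets a positive z score and one below the mean gets a negative z score, which is what lets patterns from different compounds be compared on a common scale.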
Current initiatives
DNA methylation, Illumina Human Methylation 450 DNA Analysis BeadChip Kit
Keith Killian, Holly Stevenson, William Reinhold, Sudhir Varma, Yves Pommier, David Goldstein, Paul S Meltzer -in progress
Omic protein analysis of the NCI60 using mass spectrometry.
Vinodh Rajapakse, Augustin Luna, Tiannan Guo, Rudolf Aebersold (Institute of Biotechnology, Switzerland)
-in progress
Whole exome RNA sequencing
Sean Davis, William Reinhold, Sudhir Varma, Susan Holbeck, James Doroshow, Yves Pommier, Paul S Meltzer -in progress
Cross institute assessment of drug activities with the Cancer Genome Project (CGP).
Vinodh Rajapakse, Augustin Luna, Michael Garnett, Michael Stratton, Cyril Benes (Sanger Institute / MIT) -in progress
CellMiner database access:
Data is accessed in 3 forms
The first is as whole datasets, at the “Download Data Sets” tab.
Home | NCI-60 Analysis Tools | Query Genomic Data | Query Drug Data | Download Data Sets | Cell Line Metadata | Data Set Metadata
Download Raw Data Set
Download .cel files, pixel intensities, or the like, depending on platform. See the "Data Set Metadata" page to see the raw data description
Step 1 - Select a data set
Large datasets available from GEO (due to size constraints)
DNA: Affy 500K; DNA: Illumina 1M SNP; RNA: Affy HuEx 1.0
All others available from CellMiner
DNA: aCGH Agilent 44K; DNA: E-cadherin methylation; DNA: Fingerprinting; DNA: aCGH Roche; DNA: Sanger sequencing; DNA: Sequenom methylation
RNA: Affy HGU133 Plus 2; RNA: Affy HGU133A; RNA: Affy HGU133B; RNA: Affy HGU95; RNA: Agilent microRNA; RNA: Agilent RNA; RNA: microRNA OSU V3; RNA: microRNA OSU Transporter
Protein: Lysate Array; Compound activity: DTP NCI-6
Go to download page | Reset
Download Normalized Dataset
Download processed values ready for analysis and/or integration with other data sets.
Step 1 - Select one or more chip/normalization method(s) to download:
Data type
DNA: aCGH Agilent 44K; DNA: E-cadherin methylation; DNA: Fingerprinting; DNA: aCGH Roche; DNA: Sanger sequencing; DNA: Sequenom methylation
RNA: Affy HGU133 Plus 2; RNA: Affy HGU133A; RNA: Affy HGU133B; RNA: Affy HGU95; RNA: Agilent microRNA; RNA: Agilent RNA; RNA: microRNA OSU V3; RNA: microRNA OSU Transporter
Protein: Lysate Array; Compound activity: DTP NCI-6
Go to download page | Reset
The second allows selection of subsets of the data, either by gene or by drug (NSC). These are accessed at either the “Query Genomic Data” or “Query Drug Data” tabs.
The third provides signatures for each data type, selected by gene or drug (NSC). It makes assumptions as to the “best” representation of that data type, and is designed to facilitate data integration and systems pharmacology. It is accessed using the “NCI-60 Analysis Tools” tab.
Step 1 - Select analysis type:
Cell line signature
Gene transcript z scores (input HUGO name)
microRNA mean values
Drug activity z scores (input NSC#)
Gene DNA copy number (input HUGO name)
Genetic variant summation (input HUGO name)
Protein mean values (input HUGO name)
Gene methylation values (input HUGO name)
CellMiner: Tools
Each of these is designed to facilitate systems or molecular pharmacology
Cell line signatures: These provide signatures for 71,781 molecular alterations and 20,861 drug and compound activities
Cross correlations: Compares all combinations of gene or microRNA transcript levels, and drug activities (for 2 to 150 inputs)
Pattern comparison: Our tool that compares any single input pattern to 92,642 molecular, pharmacologic, and phenotypic parameters (using Pearson Correlation)
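The idea behind a pattern comparison can be sketched in a few lines: correlate one input pattern against every candidate pattern measured over the same cell lines, then rank by Pearson correlation. This is a minimal illustration under stated assumptions, not CellMiner's implementation; the function names `pearson` and `pattern_comparison`, the candidate names, and all values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant pattern")
    return sxy / sqrt(sxx * syy)

def pattern_comparison(query, candidates):
    """Rank candidate patterns by their Pearson correlation with the query."""
    scored = [(name, pearson(query, values)) for name, values in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Invented patterns over four cell lines (same cell line order in every list).
query = [1.0, 2.0, 3.0, 4.0]
candidates = {
    "gene_A": [2.0, 4.0, 6.0, 8.0],  # rises with the query
    "gene_B": [4.0, 3.0, 2.0, 1.0],  # falls as the query rises
    "drug_C": [1.0, 3.0, 2.0, 4.0],  # partially concordant
}
ranked = pattern_comparison(query, candidates)

for name, r in ranked:
    print(f"{name}\t{r:+.2f}")
```

A cross-correlation of several inputs is the same calculation applied to every pair of patterns rather than to one query against many.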
Genetic variant versus drug visualization: Provides an assessment of the association of one gene’s variants and one drug’s activity
Input: 761431:BRAF
Vemurafenib: a kinase inhibitor, FDA-approved, NSC761431
BRAF: a kinase. Elevated in melanoma (63%) and thyroid cancer (62%).
WC Reinhold et al, PLOS One, 2014
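One simple way to quantify the association between a gene's variant status and a drug's activity is a Pearson correlation between a 0/1 variant indicator and the activity z scores, alongside the mean activity in each group. The sketch below assumes that encoding; the data are invented, the helper names are hypothetical, and the tool's own computation may differ.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return sxy / sqrt(sxx * syy)

def group_means(status, activity):
    """Mean activity for variant (status 1) and non-variant (status 0) cell lines."""
    variant = [a for s, a in zip(status, activity) if s == 1]
    wild_type = [a for s, a in zip(status, activity) if s == 0]
    return sum(variant) / len(variant), sum(wild_type) / len(wild_type)

# Invented example: 1 = cell line carries a variant in the gene, 0 = it does not.
variant_status = [1, 1, 0, 0, 0, 0]
activity_z = [1.5, 0.5, -0.5, -0.5, -0.5, -0.5]

r = pearson(variant_status, activity_z)
means = group_means(variant_status, activity_z)
print(f"R = {r:.2f}; mean z with variant = {means[0]:.2f}, without = {means[1]:.2f}")
```

A positive R here means activity is higher in the cell lines carrying variants; it says nothing about causality, which needs separate experimental support.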
Genetic variant summation: The logical extension of the idea that a change in one gene will determine a pharmacological outcome is that multiple genes might be involved. That is the idea behind this tool.
Identifies variants in two forms, for up to 150 genes: i) those that are amino acid changing, or ii) those that are protein function affecting and absent in normal genomes
Figure: variant summation across cell lines, annotated with the contributing genes (KRAS, EGFR, ERBB2, BRAF).
WC Reinhold et al, PLOS One, 2014
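The summation step itself is straightforward to sketch: for each cell line, add up the variant values of every gene in the input list. The code below is an illustration only. The table layout (gene, then cell line, then value), the function name `variant_summation`, and the numbers are assumptions; the 150-gene cap is taken from the description of the tool.

```python
MAX_GENES = 150  # input limit stated for the tool

def variant_summation(variant_table, genes):
    """Sum per-cell-line variant values over the requested genes."""
    if not 1 <= len(genes) <= MAX_GENES:
        raise ValueError(f"expected between 1 and {MAX_GENES} genes")
    missing = [g for g in genes if g not in variant_table]
    if missing:
        raise KeyError(f"no variant data for: {', '.join(missing)}")
    # Every cell line seen for any requested gene; absent entries count as 0.
    cell_lines = sorted({cl for g in genes for cl in variant_table[g]})
    return {cl: sum(variant_table[g].get(cl, 0) for g in genes) for cl in cell_lines}

# Invented variant values; cell line names are placeholders, not NCI-60 lines.
variants = {
    "KRAS": {"cell_1": 1, "cell_2": 0, "cell_3": 1},
    "EGFR": {"cell_1": 0, "cell_2": 0, "cell_3": 2},
    "BRAF": {"cell_1": 1, "cell_2": 0},
}
summed = variant_summation(variants, ["KRAS", "EGFR", "BRAF"])
print(summed)
```

The resulting per-cell-line pattern can then be compared with drug activity patterns in the same way as any other signature.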
“rcellminer: An R package for exploring molecular profiles and drug response of the NCI-60 cell lines”, A Luna, V Rajapakse, et al, Bioinformatics, 2016
rcellminer: an R package that provides R objects for the CellMiner data, extending the user's ability to explore the data
General systems pharmacological considerations
There is substantial effort in the field at large to integrate various large databases to provide insight for and improvement of pharmacological interventions
JN Weinstein, Nature, 2012
Datahub
Cancer Genome Project (CGP): Wellcome Trust Sanger Institute / Massachusetts General Hospital Cancer Center
GlaxoSmithKline (GSK)
Cancer Cell Line Encyclopedia (CCLE): Novartis Institutes for Biomedical Research / The Broad Institute
Developmental Therapeutics Program (DTP): CCR, NCI, NIH
The Cancer Genome Atlas (TCGA): NCI / National Human Genome Research Institute (NHGRI)
International Cancer Genome Consortium (ICGC)
There is a wide selection of algorithms that can be used, dependent on the question being asked, the nature of the data, and the bias of the researcher
In order of increasing complexity:
Statistical: t-Tests, Pearson Correlation, Spearman Correlation, ...
Linear: Linear Regression, Elastic Net, ...
Non-Linear: Random Forest, Nearest Neighbor, ...
WC Reinhold et al, Human Gen, 2015
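How much the choice of algorithm matters shows up even at the simplest level. The sketch below compares Pearson and Spearman correlation on an invented monotonic but non-linear relationship: Spearman, computed here as Pearson on average ranks, reports a perfect association while Pearson does not. Function names and data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented data: y increases with x, but as a cubic rather than a straight line.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson = {r_pearson:.3f}, Spearman = {r_spearman:.3f}")
```

Neither answer is wrong; they answer different questions (linear versus monotonic association), which is why the question being asked should drive the choice.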
The success of these algorithms, or lack thereof, depends on selecting the relevant biological factors to consider mathematically, and on how well the algorithmic approach models the inherent biological complexities.
K Kohn, EGFR MIM
Omics databases and approaches are subject to the reactions described by the Gartner hype curve
Gartner Group, Inc.
Applications of the molecular pharmacological data and tools:
Genetic variant vs pharmacological response examples
Variants in the DNA repair gene SLX4 affect the activities of the DNA-affecting drugs raltitrexed, cytarabine and camptothecin
Correlation between SLX4 mutations and drug activities.
Causality between SLX4 and raltitrexed (Ds), cytarabine (Ds) and camptothecin (T1) activity was demonstrated. Sousa, Pommier, et al., PNAS, 2014
MUS81 variants are associated with cladribine activity.
Cladribine: a DNA synthesis inhibitor, FDA-approved, NSC105014
MUS81: a DNA replication gene. Elevated in head and neck cancer (4.6%) and melanoma (4.3%).
Cladribine activity increases in the presence of the variants (R = 0.64)
Figure: genetic variant versus drug visualization (inputs NSC:HUGO name). Panels: MUS81 variant status and NCI-60 drug activity z scores across the 60 cell lines; drug activity (z score) axis from -2 to 3, labelled Resistant to Sensitive.
WC Reinhold et al, Clinical Can Res, 2015
RAD52 variants are associated with ifosfamide activity
Ifosfamide: a DNA damaging drug (AA), FDA-approved, NSC109724
RAD52: a DNA repair gene. Elevated in bladder (15%) and ovarian cancer (11%).
Ifosfamide activity increases in the presence of the variants (R = 0.63)
Figure: genetic variant versus drug visualization (inputs NSC:HUGO name), showing RAD52 variant status.
WC Reinhold et al, Clinical Can Res, 2015
Input this pattern into “Pattern comparison”
Pattern comparison
EGFR-ERBB2 pathway genes, and their association with drugs that target that pathway.
Significant enrichment for drugs targeting the pathway: p = 3 x 10^-6
Figure: summation of amino acid changing variant(s) absent in the normal genomes, per NCI-60 cell line; axis from 0 to 250.
Pathway genes: EGFR - ERBB2; PIK3CB, PIK3C3, PIK3R5, PTEN; AKT1, 2, 3; TSC1, 2; MTOR; HRAS, KRAS, NRAS; RAF1, BRAF; (MEK) MAP2K1, MAP2K3, MAP2K6; (ERK) MAPK1, 3, 15
Drugs targeting the pathway (NSC): Erlotinib (718781), AG-1478 (693255), Temsirolimus (683864), Everolimus (733594), Vemurafenib (761431), PD-99059 (679828), Selumetinib (741079), Hypothemycin (354462)
WC Reinhold et al, PLOS One, 2014
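An enrichment p-value of this kind can be obtained in several ways, and the slides do not state which test was used. As one common option, the sketch below computes a one-sided hypergeometric tail probability: the chance of seeing at least the observed number of pathway-targeting drugs among the top correlated drugs if the top hits were drawn at random. All counts are invented for illustration and the function name `hypergeom_tail` is hypothetical.

```python
from math import comb

def hypergeom_tail(population, successes, draws, observed):
    """P(X >= observed) when drawing `draws` items without replacement
    from `population` items, of which `successes` are of interest."""
    if not (0 <= successes <= population and 0 <= draws <= population):
        raise ValueError("successes and draws must lie between 0 and population")
    if observed < 0:
        raise ValueError("observed must be non-negative")
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total

# Invented counts: 20 drugs tested, 5 target the pathway,
# 4 drugs pass the correlation cut-off, and 3 of those 4 target the pathway.
p_value = hypergeom_tail(population=20, successes=5, draws=4, observed=3)
print(f"p = {p_value:.4f}")
```

With real inputs the population would be the full set of drugs compared against the variant summation pattern, so the counts (and therefore the p-value) would be very different from this toy case.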