10.1021/pr4001527.s002
Emma Yue Zhang, Massimo Cristofanilli, Fredika Robertson, James M. Reuben, Zhaomei Mu, Ronald C. Beavis, Hogune Im, Michael Snyder, Matan Hofree, Trey Ideker, Gilbert S. Omenn, Susan Fanayan, Seul-Ki Jeong, Young-ki Paik, Anna Fan Zhang, Shiaw-Lin Wu, William S. Hancock
Genome Wide Proteomics of ERBB2 and EGFR and Other Oncogenic Pathways in Inflammatory Breast Cancer
American Chemical Society 2013
bioinformatics sites GeneGo CRKL p 53 subpathway EPH proteomic data sets 190 cell lines PYCARD MYC 2D JAK genome Wide Proteomics ACTN PYD cell lines 3K MTDH SFN oncogene expression levels PI Interologous Interaction Database CTNND GRB NCL ERBB 2 PLEC Other Oncogenic Pathways SUM proteomics data sets RPKM SKBR breast cancer cell lines SERPINB CAV EPHA IBC EGFR S 100A caveolin 1 protein CAD TFRC NCI ERBB 2 transcript Inflammatory Breast Cancer FLNA BCAT 3 cell lines
2013-06-07 00:00:00 Dataset
https://acs.figshare.com/articles/dataset/Genome_Wide_Proteomics_of_ERBB2_and_EGFR_and_Other_Oncogenic_Pathways_in_Inflammatory_Breast_Cancer/2408779
In this study we selected three breast cancer cell lines (SKBR3, SUM149 and SUM190) with different oncogene expression levels involved in ERBB2 and EGFR signaling pathways as a model system for the evaluation of selective integration of subsets of transcriptomic and proteomic data. We assessed the oncogene status with reads per kilobase per million mapped reads (RPKM) values for ERBB2 (14.4, 400, and 300 for SUM149, SUM190, and SKBR3, respectively) and for EGFR (60.1, not detected, and 1.4 for the same 3 cell lines).
We then used RNA-Seq data to identify those oncogenes with significant transcript levels in these cell lines (31 in total) and interrogated the corresponding proteomics data sets for proteins with significant interaction values with these oncogenes. The number of observed interactors for each oncogene varied widely, e.g., from 4.2% (JAK1) to 27.3% (MYC). The percentage is measured as the protein interactions observed in a given data set as a fraction of the total interactors for that oncogene in STRING (Search Tool for the Retrieval of Interacting Genes/Proteins, version 9.0) and I2D (Interologous Interaction Database, version 1.95). This approach allowed us to focus on four main oncogenes, ERBB2, EGFR, MYC, and GRB2, for pathway analysis. We used the bioinformatics sites GeneGo, PathwayCommons, and NCI receptor signaling networks to identify pathways that contained the four main oncogenes and had good coverage in the transcriptomic and proteomic data sets as well as a significant number of oncogene interactors. The four pathways identified were ERBB signaling, EGFR1 signaling, integrin outside-in signaling, and validated targets of C-MYC transcriptional activation. The greater dynamic range of the RNA-Seq values allowed the use of transcript ratios to correlate observed protein values with the relative levels of the ERBB2 and EGFR transcripts in each of the four pathways.
This provided us with potential proteomic signatures for the SUM149 and SUM190 cell lines: growth factor receptor-bound protein 7 (GRB7), Crk-like protein (CRKL), and catenin delta-1 (CTNND1) for ERBB signaling; caveolin 1 (CAV1) and plectin (PLEC) for EGFR signaling; filamin A (FLNA) and actinin alpha 1 (ACTN1) (associated with high levels of EGFR transcript) for integrin signaling; branched chain amino-acid transaminase 1 (BCAT1), carbamoyl-phosphate synthetase (CAD), and nucleolin (NCL) (high levels of EGFR transcript) together with transferrin receptor (TFRC) and metadherin (MTDH) (high levels of ERBB2 transcript) for MYC signaling; and S100-A2 protein (S100A2), caveolin 1 (CAV1), serpin B5 (SERPINB5), stratifin (SFN), PYD and CARD domain containing (PYCARD), and EPH receptor A2 (EPHA2) for PI3K signaling, p53 subpathway. Future studies of inflammatory breast cancer (IBC), from which the cell lines were derived, will be used to explore the significance of these observations.
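
The three quantities the abstract relies on (RPKM, the percentage of observed interactors per oncogene, and transcript ratios between cell lines) can be illustrated with a short Python sketch. This is only a sketch under stated assumptions, not the authors' pipeline: the function names and the placeholder protein identifiers are hypothetical, RPKM follows its standard definition, and the percentage is read here as the known interactors (e.g., from STRING or I2D) that also appear in a proteomic data set, divided by all known interactors for that oncogene.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads (standard definition)."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # reads / (length in kb * mapped reads in millions)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def interactor_coverage(known_interactors, observed_proteins):
    """Percentage of an oncogene's known interactors found in a proteomic data set.

    Assumption: 'known_interactors' stands in for the interactor list taken
    from an interaction database; how the two databases were combined is not
    stated in the abstract.
    """
    known = set(known_interactors)
    if not known:
        return 0.0
    observed = known & set(observed_proteins)
    return 100.0 * len(observed) / len(known)


def transcript_ratio(rpkm_a, rpkm_b):
    """Ratio of two RPKM values; None when either transcript was not detected."""
    if rpkm_a is None or rpkm_b is None or rpkm_b == 0:
        return None
    return rpkm_a / rpkm_b


# Toy inputs with placeholder identifiers (hypothetical, not from the data set).
known = {"P1", "P2", "P3", "P4"}
observed = {"P1", "P3", "X9"}
coverage = interactor_coverage(known, observed)

# ERBB2 RPKM values quoted in the abstract: 400 (SUM190) vs 14.4 (SUM149).
erbb2_ratio = transcript_ratio(400, 14.4)

# EGFR was not detected in SUM190, so no ratio is defined for that pair.
egfr_ratio = transcript_ratio(60.1, None)
```

Returning None for an undetected transcript, rather than zero or infinity, keeps "not detected" distinct from a measured low value when ratios are later compared with protein observations.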