2Omicsway Corp., 340 S Lemon Ave, 6040, Walnut, 91789 CA, USA
3Oncobox Ltd., 121205 Moscow, Russia
4Moscow Institute of Physics and Technology, 141701 Dolgoprudny, Moscow Region, Russia
5Sechenov First Moscow State Medical University, 119991 Moscow, Russia
6Shemyakin–Ovchinnikov Institute of Bioorganic Chemistry, 117997 Moscow, Russia
Received October 31, 2023; Revised March 13, 2024; Accepted March 13, 2024
Identification of genes and molecular pathways with congruent profiles in proteomic and transcriptomic datasets may lead to the discovery of promising transcriptomic biomarkers that are more relevant to phenotypic changes. In this study, we conducted a comparative analysis of 943 paired RNA and proteomic profiles obtained for the same samples of seven human cancer types from The Cancer Genome Atlas (TCGA) and the NCI Clinical Proteomic Tumor Analysis Consortium (CPTAC), two major open human cancer proteomic and transcriptomic databases. The profiles covered 15,112 protein-coding genes and 1611 molecular pathways. Overall, our findings demonstrated a statistically significant improvement in the congruence between RNA and proteomic profiles when the analysis was performed at the level of molecular pathways rather than at the level of individual gene products. Transition to the molecular pathway level of data analysis increased the correlation to 0.19-0.57 (Pearson) and 0.14-0.57 (Spearman), a 2-3-fold increase for some cancer types. Evaluating the gain in correlation upon transition to pathway-level data analysis can be used to refine omics data by identifying outliers that can be excluded from the comparison of RNA and proteomic profiles. We suggest using sample-wise and gene-wise correlations for individual genes and molecular pathways as a measure of the quality of paired RNA/protein molecular data. We also provide a database of human genes, molecular pathways, and samples annotated with the correlation between RNA and protein products to facilitate the exploration of new cancer transcriptomic biomarkers and molecular mechanisms at different levels of human gene expression.
KEY WORDS: transcriptomics, proteomics, high-throughput analysis of human gene expression, cancer genomics, pathway activation level
DOI: 10.1134/S0006297924040126
Publisher’s Note. Pleiades Publishing remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.