Introduction (Part 3) - Public Data Retrieval e Databases [Corrected 3].docx

**INTRODUCTION**

**[3 Public Data Retrieval and Databases]**

In recent years, publicly accessible databases have become fundamental to biological research. A concrete example is the study of complex pathologies such as gastric cancer. "The Cancer Genome Atlas" (TCGA) and the "Cancer Cell Line Encyclopedia" (CCLE) are two of the most widely used public databases in cancer research. Both platforms provide the same types of data in different settings: TCGA offers integrated genomic, transcriptomic, and epigenomic information from patient samples, while the CCLE provides the same for cancer cell lines.

With molecular characterizations of more than 20,000 primary tumors and matched normal samples spanning 33 cancer types, TCGA is a major tool in cancer genomics. Researchers can use this abundance of comparable data to investigate the molecular basis of cancer across patient populations and to find novel therapeutic targets. Because it is built on patient samples, TCGA describes in detail the types of genetic alterations and molecular subtypes found in stomach cancer.

Conversely, the CCLE contains data derived from cancer cell lines, including pharmacological profiles, gene expression profiles, and genomic data from hundreds of cancer cell lines. For instance, studies using gastric cancer cell lines from the CCLE provide a more controlled setting for assessing the molecular pathways of the disease and testing new drugs. Combining the information from both sources allows a better understanding of gastric cancer. By approaching the pathology from both the patient level and the cellular and molecular level, it becomes possible to develop improved and more effective treatments.
**[3.1 TCGA (The Cancer Genome Atlas) Database:]**

"The Cancer Genome Atlas" (TCGA) is one of the most significant advancements in the field of cancer genomics^1^. The National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) launched the program in 2006. It molecularly characterized over 20,000 primary cancer samples and matched normal samples spanning 33 cancer types. The aim of TCGA was to gain a thorough understanding of cancer biology in order to improve the diagnosis, treatment, and prevention of the disease. Over the course of the program, TCGA generated more than 2.5 petabytes of genomic, epigenomic, transcriptomic, and proteomic data. This data collection is a turning point in cancer research: it has advanced our knowledge of the different pathologies and has made the data accessible to researchers anywhere in the world.

*3.1.1 TCGA's Contributions and Resources*

TCGA has significantly changed both the treatment of cancer patients and the field's understanding of cancer biology. The Pan-Cancer Atlas, a cross-cancer resource published in 2018, is one of its major achievements^2^. It addresses overarching themes in cancer research, including signaling pathways, oncogenic processes, and cell-of-origin patterns. In addition, TCGA has developed a suite of computational tools for data processing and visualization, enabling scientists to explore its extensive and detailed datasets from different perspectives. The Genomic Data Commons (GDC) Data Portal provides access to TCGA data together with web-based analysis and visualization capabilities, which increases the value of the dataset for a wide variety of research purposes.
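As an illustration of the programmatic access mentioned above, the following minimal sketch builds a search request for the public GDC API without sending it. It assumes the `/files` endpoint at `https://api.gdc.cancer.gov`, the `TCGA-STAD` project identifier for stomach adenocarcinoma, and the `Transcriptome Profiling` data category. The helper name `build_gdc_files_query` is hypothetical and was introduced only for this example; it is not part of any TCGA or GDC tool.

```python
import json
from urllib.parse import urlencode

# Public GDC API endpoint for file metadata searches (assumed here).
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_gdc_files_query(project_id, data_category, size=10):
    """Return (url, params) for a GDC file search. Nothing is sent.

    Hypothetical helper for illustration only.
    """
    # GDC filters are nested JSON objects combining "op" and "content".
    filters = {
        "op": "and",
        "content": [
            {"op": "in",
             "content": {"field": "cases.project.project_id",
                         "value": [project_id]}},
            {"op": "in",
             "content": {"field": "data_category",
                         "value": [data_category]}},
        ],
    }
    params = {
        "filters": json.dumps(filters),
        "fields": "file_id,file_name,data_type",
        "format": "JSON",
        "size": str(size),
    }
    return GDC_FILES_ENDPOINT, params


# Example: gastric cancer (TCGA-STAD) transcriptome files.
url, params = build_gdc_files_query("TCGA-STAD", "Transcriptome Profiling")
print(url + "?" + urlencode(params))
```

The printed URL could then be passed to any HTTP client. Keeping request construction separate from the network call makes the query easy to inspect before any data is downloaded.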
*3.1.2 Methodology and Selection in TCGA*

To represent the biological heterogeneity of cancer, 33 cancer types were selected for molecular characterization in TCGA^3^. The selection criteria and procedures were chosen to support a global and comprehensive approach to cancer genomics. The data available through TCGA were obtained using multi-omics platforms and technologies (genomics, epigenomics, transcriptomics, proteomics, etc.) in order to produce a comprehensive characterization, and the resources and methods used are documented alongside the data. The TCGA timeline and milestones, from the start of the program to its most promising results, show how public and accessible data have become increasingly central to the study of the molecular biology of cancer.

*3.1.3 Gastric Cancer in The Cancer Genome Atlas (TCGA)*

TCGA has played a key role in expanding knowledge of gastric cancer, an important and complex form of cancer. Gastric cancer has been studied extensively within the TCGA dataset, which provides insights into its molecular and genetic origins. Numerous gastric cancer samples have been characterized as part of TCGA's comprehensive approach to the disease, yielding a detailed map of the genomic changes and molecular subtypes linked to gastric cancer. One of the crucial advances achieved through this work was the definition of distinct molecular subtypes of gastric cancer^4^. This classification made it possible to understand the variability of gastric cancer, which also has implications for individualized treatment plans. TCGA recognizes four main subtypes of stomach cancer:

1. **Epstein-Barr Virus (EBV)-Positive**: This tumor subtype is identified by the presence of EBV and is characterized by DNA hypermethylation and amplification of the PD-L1 and PD-L2 genes.
2. **Microsatellite Instability (MSI)**: This tumor subtype shows a high mutation rate due to defects in the DNA mismatch repair machinery. The high mutation rate leads to the production of neo-antigens, which the immune system does not recognize as self, increasing sensitivity to immunotherapy.
3. **Genomically Stable (GS)**: This subtype is enriched for the "Diffuse" histological type of gastric cancer. It is characterized by a smaller number of mutations and by specific alterations in genes such as RHOA and genes involved in cell adhesion pathways.
4. **Chromosomal Instability (CIN)**: This subtype is the predominant form of gastric carcinoma and is characterized by a high frequency of chromosomal amplifications and deletions involving a large number of genes linked to cell cycle regulation.

The molecular characterization of gastric cancer obtained by TCGA significantly influences the development of targeted therapies and personalized medicine. Understanding the molecular subtypes can help predict prognosis and response to treatment. For example, patients with the EBV-positive subtype may benefit from targeted treatments, while those with the MSI subtype may benefit from immunotherapies. Additionally, the TCGA dataset is a valuable tool for further studies, allowing researchers to better understand the molecular causes of stomach cancer and to design more effective treatment strategies.

**[3.2 CCLE (Cancer Cell Line Encyclopedia) Database:]**

In the field of cancer research, the Cancer Cell Line Encyclopedia (CCLE) is a valuable resource that provides a large collection of genomic and pharmacological information.
This section covers the organization, the development and the extensive data collection of the CCLE. The Broad Institute and the Novartis Institutes for Biomedical Research led the development of the CCLE database, with contributions from other, smaller partners.

*3.2.1 Data Collection and Development*

The objective of the CCLE project was to perform genomic and molecular characterizations of a broad range of cancer cell lines, given the need to develop targeted therapies and to understand the molecular heterogeneity of cancer. With this broad methodology, the CCLE project seeks to provide a comprehensive genetic and molecular map of a wide range of tumor cell lines, thereby promoting a better understanding of cancer biology and supporting the creation of more potent therapeutic approaches^5^. Achieving this required the combined efforts of experts in genomics, pharmacology, genetics and bioinformatics, whose joint work has broadened the scope of oncology research.

The CCLE database includes a broad range of biological data for more than 1,000 tumor cell lines: genomic data, transcriptomic analyses, and pharmacological profiles^6^. Together, these data cover mutation profiles, copy number variations and gene expression. In particular, the transcriptomic data have contributed substantially to knowledge of cancer biology by describing the gene expression patterns of the different cell lines. Furthermore, the CCLE has proven essential in identifying new oncogenic factors and possible therapeutic targets for distinct cancer types.
Indeed, as part of the genomic characterization of the CCLE, over 1,650 genes have been sequenced, providing a detailed view of the genomic alterations observed in tumor cell lines. This large dataset has been essential in understanding the genomic basis of cancer and has greatly supported the search for new therapies. The later addition of DNA methylation data for CCLE cell lines has provided more concrete knowledge of the epigenetic changes in tumors^6^. The discovery of biomarkers and the advancement of personalized medicine in cancer depend on this level of in-depth analysis.

*3.2.2 CCLE Database Features and Accessibility*

Researchers can access the CCLE database through the Broad Institute's DepMap portal and the CCLE website, whose intuitive user interfaces make navigation and data retrieval straightforward. These platforms are not only gateways to the data: they also provide advanced tools for visualization and analysis, as well as customized download options, and so meet the diverse needs of the scientific community. The strength of the CCLE lies in its ability to unify and integrate large volumes of data, offering a single resource that combines comprehensive transcriptomic, pharmacological and genomic profiles. This integration is critical for scientists seeking in-depth knowledge of cancer biology, especially for therapeutic discovery and development^6^ and for understanding complex genomic landscapes.
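The integration of profiles described above can be pictured as a join of per-cell-line tables on a shared cell line identifier. The sketch below is only a toy illustration of that idea and does not reflect the actual CCLE or DepMap file formats. The function name, the identifiers (`LINE_A`, `GENE_X`, `DRUG_1`) and all numeric values are made-up placeholders, not CCLE measurements.

```python
def integrate_profiles(expression, drug_response):
    """Join two per-cell-line tables on the cell line identifier.

    Only cell lines present in both tables are kept (an inner join).
    Hypothetical helper for illustration only.
    """
    shared = sorted(set(expression) & set(drug_response))
    return {
        line: {
            "expression": expression[line],
            "drug_response": drug_response[line],
        }
        for line in shared
    }


# Placeholder data: identifiers and values are invented for the example.
expression = {
    "LINE_A": {"GENE_X": 5.2},
    "LINE_B": {"GENE_X": 1.1},
    "LINE_C": {"GENE_X": 3.3},
}
drug_response = {
    "LINE_A": {"DRUG_1": 0.4},
    "LINE_C": {"DRUG_1": 0.9},
}

merged = integrate_profiles(expression, drug_response)
print(sorted(merged))
```

An inner join is used here so that every retained cell line has both data types. An analysis that tolerates missing profiles might keep all cell lines instead.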
The CCLE's multifaceted perspective on cancer biology, encompassing genetic, molecular and pharmacological components, is further strengthened by its integration with platforms such as Expression Atlas, cBioPortal and Reactome^6^. For this reason, the CCLE is considered a vital resource for cancer research: its high degree of integration and its analytical tools support new discoveries in the discipline.

*3.2.3 Applications in Gastric Cancer Research*

The CCLE database, which provides an in-depth characterization of the cell lines it contains, has proven very useful in advancing gastric cancer research. The combination of DNA alterations, gene expression patterns and the pharmacological profiles of individual cell lines gives researchers important insights into the genomic and molecular landscape of gastric cancer. Gastric cancer-derived cell lines from the CCLE have proven crucial for both drug screening and basic biological research in oncology, although, given the complexity of cancer biology, results obtained in cell models should be interpreted with caution^7^.

Further, the CCLE plays a critical role in drug sensitivity studies, allowing scientists to examine how different drugs affect gastric cancer cells. This information is essential for evaluating the effectiveness of current therapies and for developing new therapeutic agents. The CCLE database is also important for biomarker discovery because it connects genetic traits with disease characteristics, which facilitates the development of personalized treatment regimens and diagnostic tools. Furthermore, it allows gastric cancer cells to be compared with other cancer types, providing insights into distinct and shared cancer growth pathways at the molecular level^8^. Additionally, the CCLE helps elucidate the molecular pathways that underlie stomach cancer.
Researchers can identify the mechanisms and processes that cause the disease by examining genetic and molecular data. Finally, the CCLE serves as a collaborative platform, bringing together researchers from around the world and encouraging the exchange of ideas that accelerates the study of the causes and treatments of gastric cancer^6^.

**[BIBLIOGRAPHY:]**

1. Wang, Z., Jensen, M. A. & Zenklusen, J. C. A Practical Guide to The Cancer Genome Atlas (TCGA). *Methods Mol. Biol. Clifton NJ* **1418**, 111--141 (2016).
2. Cooper, L. A. *et al.* PanCancer insights from The Cancer Genome Atlas: the pathologist's perspective. *J. Pathol.* **244**, 512--524 (2018).
3. Tomczak, K., Czerwińska, P. & Wiznerowicz, M. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. *Contemp. Oncol.* **19**, A68--A77 (2015).
4. Zhang, W. TCGA divides gastric cancer into four molecular subtypes: implications for individualized therapeutics. *Chin. J. Cancer* **33**, 469--470 (2014).
5. Barretina, J. *et al.* The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. *Nature* **483**, 603--607 (2012).
6. Ghandi, M. *et al.* Next-generation characterization of the Cancer Cell Line Encyclopedia. *Nature* **569**, 503--508 (2019).
7. Cai, S. *et al.* Cautions should be taken when using cell models for gastric cancer research. *Gene* **806**, 145922 (2022).
8. Smith, M.-G., Hold, G.-L., Tahara, E. & El-Omar, E.-M. Cellular and molecular aspects of gastric cancer. *World J. Gastroenterol.* **12**, 2979--2990 (2006).
