Lecture 7: Proteomics
Document Details
Uploaded by AmbitiousHazel759
دار الفرسان الأهلية
Summary
This is a lecture on proteomics, covering methods, types and applications, including 2D gel electrophoresis and mass spectrometry, and discussion of relevant databases like UniProtKB. It examines the connections between genes and protein expression, and the significance of proteomics in understanding diseases and processes in biology.
Full Transcript
Proteomics

Lecture outline
Protein Sequence and Structure Analyses
Methods in proteomics
2D gel approach
Comparisons between RNA and proteomic data

Proteomics
Proteomics is the study of the interactions, function, composition, and structures of proteins and their cellular activities. Proteomics provides a better understanding of the structure and function of an organism than genomics does. In particular, it supports:
(1) Protein characterization
(2) Protein interaction
(3) Identification of disease biomarkers
Proteomics is a relatively new 'omics' field that has developed rapidly, especially in therapeutics. The word "proteome" was coined by Marc Wilkins in 1994 and first appeared in print in 1995.

[Figure: Proteomics meaning]

[Figure: Proteomics application]

Types of Proteomics
There are three types of proteomics:
1. Expression proteomics
2. Structural proteomics
3. Functional proteomics

Expression proteomics
Expression proteomics studies the qualitative and quantitative expression of total proteins under two different conditions, such as a normal and a diseased state (e.g., normal versus tumor cells). It determines whether a protein is overexpressed or underexpressed. Example technique: 2-D electrophoresis.

Structural proteomics
Structural proteomics helps to understand the three-dimensional shape and structural complexities of functional proteins. The structure can be predicted from the amino acid sequence of the protein, or from the gene that encodes it, by comparison with a related protein of known structure; this process is known as homology modeling. Structural proteomics also identifies all the proteins present in a complex system and maps protein-protein interactions. Mass spectrometry is used for structure determination.

Functional proteomics
Functional proteomics aims to understand protein functions and to unravel the molecular mechanisms within the cell, which depends on identifying the interacting protein partners.
A detailed description of cellular signaling pathways can therefore benefit greatly from the elucidation of protein-protein interactions.

Importance of proteomics
Many types of information cannot be obtained from the study of genes alone. For example, proteins, not genes, are responsible for the phenotypes of cells. It is impossible to elucidate mechanisms of disease, aging, and effects of the environment solely by studying the genome.

[Figure: Tools of proteomics]

Protein Sequence and Structure Analyses
Protein sequence analysis is the process of subjecting a protein or peptide sequence to one of a wide range of analytical methods to study its features, function, structure, or evolution. Methodologies used include sequence alignment, searches against biological databases, and other methods.
With the development of high-throughput methods for producing protein sequences, the rate at which new sequences are added to the databases has increased exponentially. Such a collection of sequences does not, by itself, increase the researcher's understanding of the biology of organisms. However, comparing these new sequences to those with known functions is an important way of studying the biology of the organism from which a novel sequence comes. Thus, protein sequence analysis can be used to assign function to proteins by studying the similarities between distinct sequences. Many tools and techniques are now available to analyze alignments and compare sequences in order to study the underlying biology.

Methods of proteomics
Several high-throughput technologies have been developed to investigate proteomes in depth. The most commonly applied are mass spectrometry (MS)-based techniques such as tandem MS and gel-based techniques such as differential in-gel electrophoresis (DIGE). These high-throughput technologies generate huge amounts of data.
Databases are critical for recording and carefully storing these data, allowing researchers to make connections between their results and existing knowledge.

Tandem mass spectrometry (MS/MS)
Overview of analysis by tandem mass spectrometry. The digested peptides are ionized, passed through the first mass analyzer, analyzed in the second analyzer, and detected as MS spectra (also called survey spectra or MS1). The high-intensity peaks from the resulting MS1 spectra are then selected in the first mass analyzer, fragmented in the collision cell, and analyzed in the second mass analyzer. The resulting product ions are detected in MS/MS spectra (MS2), which contain the information for peptide sequencing followed by protein identification.

Overview of the different modes of data collection in tandem mass spectrometry. In data-independent acquisition (DIA), all precursor ions are analyzed simultaneously in MS mode (i.e., with low collision energy) and then fragmented in the collision cell using high collision energy. Multiple product ions resulting from the fragmentation of multiple precursor ions are then detected in one spectrum. In data-dependent acquisition (DDA), all precursor ions are detected in the survey MS spectrum; the precursor ions with the highest intensity are then selected in the first analyzer (e.g., a quadrupole), fragmented in the collision cell, analyzed in the second analyzer (e.g., time-of-flight, TOF), and detected as the MS/MS spectrum.

2D gel approach
Two-dimensional gel electrophoresis (2DE) is an established technique for high-resolution profiling of complex protein mixtures. 2DE can resolve thousands of protein "spots" on a single gel and is considered a key method in proteomics research. It is widely applied in protein expression profiling experiments to identify changes in protein expression levels resulting from a disease state or drug treatment. 2DE is also a valuable tool to study post-translational modifications and identify protein isoforms.
Small changes in protein mass or isoelectric point (pI) translate into a detectable shift in the position of the protein spot.

2D electrophoresis consists of two separation steps:
1. In the first dimension, proteins are separated according to their isoelectric point (pI), the pH at which the net charge of the protein is zero, in a process known as isoelectric focusing (IEF). During conventional IEF, amphoteric molecules such as proteins migrate until they reach a region where the pH of the matrix matches their isoelectric point, and they "focus" into sharp bands.
2. The second dimension consists of a conventional SDS-PAGE electrophoretic separation in which proteins are separated according to their molecular weight.
2DE is a robust method in which the resolution acquired during the first-dimension separation is preserved during the second dimension.

In silico approach
EMBL-EBI hosts up-to-date and accurate databases to enable rapid searching and retrieval of these data. The five major EMBL-EBI databases related to proteomic research are:
I. UniProtKB database (https://www.uniprot.org/)
II. IntAct (https://www.ebi.ac.uk/intact/home)
III. Reactome (https://reactome.org/)
IV. PRIDE (https://www.ebi.ac.uk/pride/)
V. AlphaFold Protein Structure Database (https://alphafold.ebi.ac.uk/)
These five databases (especially UniProtKB and AlphaFold) draw on gene sequence data (e.g., Ensembl) and annotation tools (e.g., InterPro) that are also hosted by EMBL-EBI.

UniProtKB database
The UniProt Knowledgebase (UniProtKB), the centrepiece of the UniProt Consortium's activities, is an expertly and richly curated protein database consisting of two sections, UniProtKB/Swiss-Prot and UniProtKB/TrEMBL.
1. UniProtKB/Swiss-Prot: contains high-quality, expertly curated, non-redundant protein sequence records. UniProtKB/Swiss-Prot is the manually annotated and reviewed section of UniProtKB.
It is a high-quality, annotated, non-redundant protein sequence database that brings together experimental results, computed features, and scientific conclusions.
2. UniProtKB/TrEMBL: contains high-quality computationally analysed records enriched with automatic annotation and classification. These entries are largely ...

AlphaFold Protein Structure Database
AlphaFold is an artificial intelligence (AI) program developed by DeepMind, a subsidiary of Alphabet, that predicts protein structure. The program is designed as a deep learning system.
AlphaFold 3 was co-developed by Google DeepMind and Isomorphic Labs, both subsidiaries of Alphabet. AlphaFold 3 is not limited to single-chain proteins: it can also predict the structures of protein complexes with DNA, RNA, post-translational modifications, and selected ligands and ions.
AlphaFold 3 introduces the "Pairformer", a deep learning architecture inspired by the transformer and considered similar to, but simpler than, the Evoformer introduced with AlphaFold 2. The raw predictions from the Pairformer module are passed to a diffusion model, which starts with a cloud of atoms and uses these predictions to progress iteratively towards a 3D depiction of the molecular structure.
The AlphaFold server was created to provide free access to AlphaFold 3 for non-commercial research.

Comparisons between RNA and proteomic data
Sample-specific protein databases derived from RNA-Seq data can better approximate the real protein pools in cell and tissue samples and thus improve protein identification. Meanwhile, proteomics data provide essential confirmation of the validity and functional relevance of novel findings from RNA-Seq data.
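
As a rough illustration of the first point, a sample-specific protein database can be sketched by translating the coding sequences of transcripts that RNA-Seq detects above some expression threshold. The Python sketch below is only a minimal example under stated assumptions: the inputs (transcript IDs, in-frame coding sequences, TPM values), the cutoff of 1 TPM, and the function names are all hypothetical and do not come from any particular pipeline. Real workflows also deal with sequence variants, splice junctions, and reading-frame selection, which are left out here.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA letters
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE.get(seq[i:i + 3], "X")  # X = unknown codon
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def build_sample_database(transcripts, min_tpm=1.0):
    """Keep proteins from transcripts expressed at or above min_tpm (assumed cutoff)."""
    database = {}
    for tx_id, cds, tpm in transcripts:
        if tpm < min_tpm:
            continue  # transcript not considered expressed in this sample
        protein = translate(cds)
        if protein:
            database[tx_id] = protein
    return database


def to_fasta(database):
    """Render the database as FASTA text, one record per transcript."""
    return "".join(f">{tx_id}\n{protein}\n" for tx_id, protein in database.items())


if __name__ == "__main__":
    # Hypothetical toy data: (transcript ID, coding sequence, TPM).
    transcripts = [
        ("tx1", "ATGGCCAAGTAA", 12.5),
        ("tx2", "ATGTTTGGATGA", 0.2),   # below the cutoff, so excluded
        ("tx3", "ATGGATCGTTAG", 3.0),
    ]
    print(to_fasta(build_sample_database(transcripts)), end="")
```

The resulting FASTA text could then be given to an MS/MS search engine in place of, or alongside, a reference proteome, so that peptide spectra are matched only against proteins the sample is likely to contain.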