Bioinformatics Lecture 2 PDF
Assiut University
Fatma Hussein
Summary
This lecture provides an introduction to bioinformatics and biological databases. It covers the importance of databases in bioinformatics, including sequence and structure databases, and discusses resources such as NCBI, EMBL, and GenBank. The lecture concludes with some multiple-choice questions.
Full Transcript
Bioinformatics Lecture 2
Dr. Fatma Hussein
Lecturer at the Department of Mathematics, Faculty of Science, Assiut University

Remember that
Bioinformatics is the field where biology, computer science and information technology merge into a single discipline: math solutions to biological problems.
The scientific area of bioinformatics involves, among many other things, databases, algorithms, modeling and simulations.
Bioinformatics is a theoretical science; the resulting hypotheses always have to be experimentally corroborated.

Bioinformatics databases

Learning objectives
▪ Become familiar with the major biological databases.
▪ Describe the types of data each database contains and their relevance to different areas of biological research.
▪ Learn database search techniques.
▪ Examine case studies and applications.
▪ Explore database tools and resources.

Biological Databases
One of the branches of bioinformatics has to do with the creation, growth and maintenance of databases with biological information.
A database is an organized collection of structured information, or data, typically stored electronically in a computer system. A database is usually controlled by a database management system (DBMS).
In the context of computing and information technology, databases are central to storing and managing large volumes of data efficiently.
The data is usually structured in tables consisting of rows and columns, or in other easy-to-handle frameworks that facilitate the search and manipulation of the information contained in that database.

Why Databases?
The purpose of databases is not merely to collect and organize data, but to allow intelligent data retrieval.
A query is a method to retrieve information from the database.
The organization of each record into predetermined fields allows us to use queries on fields.
Slide keywords: Easy, Speed, Organization, Security.

Classifications of Databases
Databases are classified as primary or derived (secondary).
Primary databases: experimental results are deposited directly into the database.
Secondary databases: results of analysis of primary databases.

Biological data
Biological data encompasses a vast array of information derived from living organisms, spanning the molecular, cellular, organismal, and ecological levels of organization. It comes from different sources, such as scientific experiments, published literature, or computational analyses.

Types of Biological Data:
▪ Nucleotide Sequence
▪ Metabolic Pathways
▪ Protein Sequence
▪ Genomic Data
▪ Population Data

Classification of Biological databases
[Diagram: sequence databases and structure databases; branch labels: nucleic acid, proteins, protein sequences]

Sequence databases
In the field of bioinformatics, a sequence database is a type of biological database composed of a large collection of computerized ("digital") nucleic acid sequences, protein sequences, or other polymer sequences.

Standard contents of a sequence database:
▪ Sequences
▪ Accession number
▪ References
▪ Taxonomic data
▪ Annotation/curation
▪ Keywords
▪ Cross-references
▪ Documentation

Tools for analysis:
▪ BLAST
▪ Primer-BLAST
▪ B-Link
▪ ORF finder
▪ Genome Workbench

Nucleotide databases (Comprehensive Databases)
International Nucleotide Sequence Database Collaboration (INSDC)
❑ National Center for Biotechnology Information (NCBI): NCBI is a resource for molecular biology information.
❑ European Molecular Biology Laboratory (EMBL): EMBL provides biomolecular databases and bioinformatics tools.
❑ DNA Data Bank of Japan (DDBJ): DDBJ is a DNA data bank in Japan.

Protein Sequence databases
❑ UniProt: Universal Protein Resource.
❑ PFAM: proteins contain conserved regions; based on these conserved regions, proteins are classified into families.
❑ Gene Index project: a project aimed at indexing genes and their variants in the various genome sequences.

Structure databases
A structure database is a database that is modeled around the various experimentally determined protein structures.
The aim of most protein structure databases is to organize the protein structures, giving the biological community access to the experimental data in a useful way.
❑ PDB (Protein Data Bank): contains information about experimentally determined structures of proteins, nucleic acids, and complex assemblies.
❑ CATH (Class, Architecture, Topology, Homologous superfamily): classification of proteins based on domain structures.
❑ SCOP (Structural Classification of Proteins): description of the structural and evolutionary relationships between all the proteins with known structures.

Bioinformatics database
One of the branches of bioinformatics involves the establishment, expansion, and upkeep of databases containing biological information.
These databases serve as comprehensive repositories of our biological knowledge. They comprise both the computer hardware and the software dedicated to managing this vast amount of data effectively.
These databases contain many types of scientific information that come from different sources, such as laboratory experiments or computational analyses, mainly based on the so-called omics areas.
Normally, each database entry is described by a unique accession number. Moreover, the data is usually structured in tables or other easy-to-handle frameworks that facilitate the search and manipulation of the information contained in that database.

Biological Databases
Nucleic Acids Research (NAR) is an open-access scientific journal published since 1974 by Oxford University Press. The journal covers research on nucleic acids, such as DNA and RNA, and related work.
The journal publishes two yearly special issues:
▪ The first issue of each year is dedicated to biological databases, published in January since 1993.
▪ The other is devoted to papers describing web-based software resources of value to the biological community, published in July since 2003.
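The earlier points about records organized into predetermined fields, unique accession numbers, and queries can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the table name, the column names, the DEMO accession numbers, and the sequences are all made up for the example and do not reflect the schema or contents of any real biological database.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding a few made-up sequence records."""
    conn = sqlite3.connect(":memory:")
    # Each column is a predetermined field; the accession number is the
    # unique key of an entry (hypothetical schema, for illustration only).
    conn.execute(
        """
        CREATE TABLE sequence_record (
            accession TEXT PRIMARY KEY,
            organism  TEXT NOT NULL,
            molecule  TEXT NOT NULL,
            sequence  TEXT NOT NULL
        )
        """
    )
    # Made-up records; they are not taken from any real database.
    records = [
        ("DEMO0001", "Organism A", "DNA", "ATGGCCATTGTA"),
        ("DEMO0002", "Organism A", "protein", "MAIVMGR"),
        ("DEMO0003", "Organism B", "DNA", "ATGAAACGCATT"),
    ]
    conn.executemany(
        "INSERT INTO sequence_record VALUES (?, ?, ?, ?)", records
    )
    conn.commit()
    return conn


def find_by_accession(conn, accession):
    """Query on the unique key: returns one record as a tuple, or None."""
    cursor = conn.execute(
        "SELECT accession, organism, molecule, sequence "
        "FROM sequence_record WHERE accession = ?",
        (accession,),
    )
    return cursor.fetchone()


def find_by_fields(conn, organism, molecule):
    """Query on two fields: returns the matching accession numbers, sorted."""
    cursor = conn.execute(
        "SELECT accession FROM sequence_record "
        "WHERE organism = ? AND molecule = ? ORDER BY accession",
        (organism, molecule),
    )
    return [row[0] for row in cursor.fetchall()]


if __name__ == "__main__":
    conn = build_demo_db()
    print(find_by_accession(conn, "DEMO0002"))
    print(find_by_fields(conn, "Organism A", "DNA"))
```

Because every record has the same fields, the same query pattern works whether the search is by accession number, by organism, or by molecule type. That is the sense in which a database allows intelligent retrieval rather than mere storage.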
[Diagram: NAR database issue: Database Descriptions; Database Updates; Highlight the Role of Database Applications; Database Tools and Methods]

Databases are fundamental in computational biology, serving as foundational resources for a wide range of research endeavors. Protein structure databases are of particular importance in this field.
Protein structure databases are critical for many efforts in computational biology, such as structure-based drug design, both in developing the computational methods used and in providing the large experimental dataset that some methods use to gain insight into the function of a protein.

Some of the most commonly used biological databases, which are specially designed for researchers, are:
▪ PubMed: an open access database comprising citations for life science journals, online books and biomedical literature.
▪ GenBank: an open access collection of nucleotide sequences.
▪ UniProt: an open access database of protein sequences.
▪ Protein Data Bank (PDB): an open access information portal to biological macromolecular structures of proteins and nucleic acids.

To be continued …
Thank you

Multiple Choice

1. Proteomics refers to the study of __________.
A) Set of proteins in a specific region of the cell
B) Biomolecules
C) The entire set of expressed proteins in the cell
D) Set of proteins

2. Which of the following is not an application of bioinformatics?
A) Drug designing
B) Biomolecules
C) The entire set of expressed proteins in the cell
D) None of these

3. A query is a method to retrieve information from the database.
A) True
B) False

4. Nucleic Acids Research (NAR) is a scientific journal that publishes articles on nucleic acids, such as DNA and RNA.
A) True
B) False

5. What is the term for drug identification through genomic study?
A) Genomics
B) Pharmacogenomics
C) Pharmacogenetics
D) Cheminformatics

6. The three-dimensional shape of a single-piece peptide chain with a secondary structure is defined by:
A) Primary structure
B) Secondary structure
C) Tertiary structure
D) Quaternary structure

7. What type of bond is primarily responsible for maintaining the primary structure of a protein?
A) Hydrogen bond
B) Ionic bond
C) Peptide bond
D) Disulfide bond

Thank you