Introduction to Bioinformatics PDF

Nevena Ackovska

Tags

bioinformatics biology computer science molecular biology

Summary

This document is a lecture on introductory bioinformatics. It covers fundamental concepts, historical developments, and various aspects of molecular biology, including the structure of DNA.

Full Transcript

Introduction to Bioinformatics
Lecture 1: Fundamentals of Bioinformatics. Historical steps
Prof. Dr. Nevena Ackovska

Content
- Introductory terms
- Bioinformatics
- Historical events in bioinformatics
- Important technological breakthroughs

Bioinformatics
Bioinformatics: information technologies and methodologies that support research into life functions at the cellular level, especially at the level of molecular genetics.

Other definitions
- More broadly: bioinformatics is the research, development, or application of computational tools and approaches that expand the use of biological, medical, behavioral and health data. It also covers the methods and tools for acquiring and visualizing such data.
- More narrowly: the use of computational methods and tools to analyze the sequence structures and products of biological macromolecules.

Evolution of even the definitions!
- The area is developing very quickly
- It is hard to find a single definition that covers everything

What are we investigating?
- Detecting genes and other significant elements in DNA sequences
- Similarity between DNA sequences
- Detection of regions that regulate gene expression
- Prediction of molecular structures and functions

Why is bioinformatics interesting?
- Demand for professionals: few people are adequately trained in both biology and computer science
- Genome sequencing and the analysis of genes with microarrays have produced a large quantity of data to be analyzed
- It leads to important discoveries
- It offers great research opportunities even for beginners
- The most important discoveries are still waiting to be made!
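
One of the investigation topics listed above is the similarity between DNA sequences. As a rough illustration only, the sketch below computes the percentage of matching positions between two sequences. It assumes the sequences are already aligned and have the same length; the function name percent_identity is hypothetical and is not taken from the lecture. Real similarity searches use alignment algorithms that also handle insertions and deletions.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of positions at which two sequences match.

    Assumption: both sequences are already aligned and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")

    # Compare case-insensitively, position by position.
    a, b = seq_a.upper(), seq_b.upper()
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    # Two short, made-up sequences used purely for illustration.
    print(percent_identity("GATTACA", "GACTATA"))
```

Position-by-position comparison is the simplest possible similarity measure; it is shown here because it needs nothing beyond treating DNA as a string of letters.
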
Application
- Understanding life and evolution
- Discovery of disease-causing genes
- Timely diagnosis in medicine
- Finding new drugs that work at the molecular level in organisms
- Understanding the interaction between genes
- Selective modification of features in living organisms (raises ethical problems)

Two basic branches
- Development of databases and computing tools (practical part)
- Generating knowledge for a better understanding of living systems as a whole (theoretical part)

Basic terms in molecular biology
- Cells
- DNA
- Chromosomes
- RNA
- Amino acids
- Proteins
- Genome

What is a cell?
- Basic building block of every living organism
- A human is made up of trillions of cells
Functions
- Cells give the body its structure
- They extract nutrients from food
- They convert nutrients into energy
- They perform specialized functions (organelles)

Living world: division of cell types
- Prokaryotes: do not have organelles or a nucleus (single-celled organisms)
- Archaea: do not have a nucleus, but have a more developed cellular system (live in extreme conditions)
- Eukaryotes: have organelles and a nucleus
- Not all unicellular organisms are prokaryotes; some are eukaryotes, e.g. Paramecium

Cell types on planet Earth

DNA
- DNA: deoxyribonucleic acid
- It carries all the hereditary material
- 4 different chemical bases: Adenine (A), Cytosine (C), Guanine (G), Thymine (T)

DNA
- Human DNA has about 3 billion bases, with 99% similarity in all humans
- The sequence of bases determines the information for building and maintaining an organism
- Analogies: the order of letters forms words and sentences; musical notes form a melody

DNA
- Almost every cell in the human body contains the same DNA
- Most of the DNA is located in the nucleus of the cell, but a small part is located in the mitochondria (mitochondrial DNA)
- Mitochondria are structures that convert energy from food into a form that cells can use

What are chromosomes?
- A way of packing DNA molecules in cells
- Each chromosome is made of DNA tightly wrapped around proteins called histones, which maintain its structure
- Chromosomes are not visible even under a microscope, except during cell division, when the packing becomes even denser

Chromosomes
- In eukaryotes, the nucleus contains DNA molecules organized as chromosomes
- In humans: 22 pairs of chromosomes (numbered by size) and 1 pair of sex chromosomes
- The 22 pairs (autosomes) are the same in both sexes
- The 23rd pair (sex chromosomes) differs between the two sexes: females have 2 copies of the X chromosome, males have one X and one Y chromosome

RNA
- RNA: ribonucleic acid
- Chemically, very similar to DNA. The differences are:
  - RNA uses the sugar ribose instead of deoxyribose
  - RNA uses the base Uracil (U) instead of Thymine (T); U is also complementary to A
  - RNA tends to be single-stranded
- Functional difference between RNA and DNA: DNA has one function, RNA has multiple functions

Amino acids and proteins
- The building blocks of proteins are amino acids (20 different ones)
- A short linear chain of fewer than 30 amino acids is called a peptide
- A long chain of amino acids (sometimes up to 4000 elements) is called a polypeptide
- Proteins are polypeptides with a 3-dimensional structure

Gene
- A physical and functional hereditary unit that transmits information from one generation to the next
- A DNA sequence that is necessary to synthesize a functional protein or RNA molecule
- Genes vary in length from 100 to 2 million bases
- Different versions of the same gene (with small differences in the DNA nucleotides) are called alleles
- There are approximately 25,000 protein-coding genes in the human genome

Genome
- The complete DNA sequence (of all types) found in each of an organism's cells
- The number of chromosomes and the size of the genome vary widely among different organisms
- The size of the genome and the number of genes do not necessarily determine the complexity of the organism
- In prokaryotes, the size of the genome is directly proportional to the complexity of the organism
- In eukaryotes, this does not apply (the C-value paradox)
- The largest genome has been found in an amoeba, with a size of 686,000 Mb (200 times larger than that of humans)

Genome
- Much of the genome contains sequences that are assumed to serve no useful function
- The non-coding parts of the DNA make up a large part of the genome in many eukaryotes
- Older sources claim that 97% of the human genome is non-coding DNA; more recently published articles put it at up to 98.7%

What is known so far
- Macromolecules such as DNA, RNA and proteins can be mapped into searchable text sequences
- Today, learning about a specific DNA sequence is a matter of searching existing databases for that sequence, rather than hands-on work in a biochemical laboratory
- But a wet biology laboratory must still verify the result!

General knowledge
- DNA and RNA sequences are strings over a 4-letter alphabet
- Protein sequences are strings over a 20-letter alphabet
- Objective: for each biological molecule, the following should be known: the sequence, the structure, the function

How large is the information in organisms?

Year  Genome                  Genes found
2003  human                   25,000
1998  nematode
1997  budding yeast
1997  E. coli
1995  Haemophilus influenzae  1,700
1990  cytomegalovirus
1982  phage λ
1977  phage ΦX174             11

Scale markers, letters: 10^9, 10^7, 10^5, 10^3
Scale markers, bits: 2^32 = 4,294,967,296; 2^24 = 16,777,216; 2^16 = 65,536; 2^8 = 256

How much is needed to design a life?
- To design an organism that needs a host to reproduce (phage or virus): 10^3 to 10^5 letters
- To design a free-living organism that reproduces without a foreign host: a description between a million (10^6) and a billion (10^9) letters long

Historical events
- 1865: basic laws of heredity (Mendel)
- 1900: rediscovery of Mendel's work
- 1905: the concept of "human inborn error" introduced (Garrod)
- 1913: first linear gene map drawn (Sturtevant)
- 1944: the genetic material is DNA! (Avery, MacLeod, McCarty)
- 1953: the structure of DNA is a double helix! (Watson, Crick)

DNA: information processing in prokaryotes
- 1966: the genetic code published (Nirenberg, Khorana, Holley)
- 1972: insertion of a DNA segment into natural DNA (Cohen, Boyer)
- 1977: DNA sequencing methods developed (Sanger, Maxam, Gilbert)

Beginnings of understanding information processing in eukaryotes
- 1982: GenBank database established
- 1983: a disease gene annotated
- 1985: polymerase chain reaction (PCR)
- 1986: first instrument for DNA sequencing developed
- 1987: first map of the human genome presented
- 1988: yeast artificial chromosome (YAC)

The most important project
- Human Genome Project (HGP)
- The problem was posed in the late 1980s: to read all the letters of human DNA (about 3 billion) and preferably to find all the genes (more than 25,000)
- It ran from 1988 to 2003, with the formal start in the USA in 1990
- The first version was announced on June 26, 2000
- The main tool was the computer
- Biology and computer science came together in this project

Development of HGP
- 1990: Human Genome Project started in the USA
- 1996: first archaeal genome sequenced
- 1996: first yeast genome
- 1997: genome of Escherichia coli
- 1998: the roundworm
- 1999: first human chromosome (chromosome 22)
- 2000: fruit fly genome
- 2000: genome of the first plant (mustard cress)
- 2001: initial version of the human genome published
- 2002: initial versions for mouse, rat and rice
- 2003: official end of HGP

Important technological breakthroughs
- Basic instrument: the microscope

Other instruments and methods introduced

Period  Method                              Description
1920s   Ultracentrifuge                     Estimation of the size and shape of a molecule
1930s   Electrophoresis                     Separation of proteins or nucleic acids by size and/or charge
1930s   Electron microscope                 Direct visualization of cellular structure, including nucleic acids
1940s   Radioisotope tracers                Tracking the flow of a molecule through metabolic pathways
1950s   X-ray diffraction                   Precise measurement of the 3-D structure of a protein or nucleic acid
1950    Amino acid sequencing               Determining the order of amino acids in proteins
1960    Hybridization of nucleic acids      Quantitative assessment of similarity between RNAs and/or DNAs
1970s   Nucleotide sequencing               Determining the sequence of bases in DNA
1970s   Recombinant DNA technology          Genetic engineering of new genes
1980s   DNA synthesis                       Synthesis of a desired DNA sequence
1980s   Monoclonal antibody histochemistry  Highly specific reagents for protein detection
1980s   Polymerase chain reaction (PCR)     Producing large numbers (millions) of copies of small DNA sequences

Genomic bioinformatics
- Genomic bioinformatics finds all genes and relevant DNA sequences in organisms
- The new era does not only read the sequences: the main task is to find their functions
- For now, half of the genes found have no known function

Postgenomic bioinformatics
- The new era is called postgenomic bioinformatics; it searches for relationships between genes and life functions through synthesis
- The main task is to extract new knowledge from the sequenced genomes

Other important research activities
- It is no longer enough just to find all the genes (the genome) of an organism
- It is important to find all the proteins (the proteome) in a cell
- It is important to find all the metabolic processes (the metabolome) in a cell
- Metabolism: the sum of all chemical reactions involved in catabolism and anabolism
- Catabolism: the breakdown of complex molecules into simpler ones
- Anabolism: the synthesis of complex molecules from simpler ones
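
The lecture notes that DNA and RNA sequences are strings over a 4-letter alphabet, that protein sequences use a 20-letter alphabet, and that RNA uses U where DNA uses T. The sketch below illustrates what treating sequences as plain strings can look like. All names (DNA_ALPHABET, is_valid, transcribe, max_information_bits) are hypothetical and chosen for illustration. The transcribe function only rewrites a DNA string in the RNA alphabet and ignores strand complementarity and other biological detail. The bit count is an upper bound that assumes every letter is equally likely.

```python
import math

# Alphabets as described in the lecture: 4 letters for DNA and RNA.
DNA_ALPHABET = frozenset("ACGT")
RNA_ALPHABET = frozenset("ACGU")


def is_valid(sequence: str, alphabet: frozenset) -> bool:
    """True if the sequence is non-empty and uses only letters of the alphabet."""
    return bool(sequence) and set(sequence.upper()) <= alphabet


def transcribe(dna: str) -> str:
    """Rewrite a DNA string in the RNA alphabet (T becomes U).

    Simplification: the input is read as-is; no complementary strand is built.
    """
    dna = dna.upper()
    if not is_valid(dna, DNA_ALPHABET):
        raise ValueError("not a DNA sequence")
    return dna.replace("T", "U")


def max_information_bits(length: int, alphabet_size: int) -> float:
    """Upper bound on information content: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    print(transcribe("GATTACA"))
    print(max_information_bits(10**9, 4))   # genome-scale DNA string
    print(max_information_bits(300, 20))    # a 300-residue protein
```

With a 4-letter alphabet each letter carries at most 2 bits, so a sequence of 10^9 letters corresponds to at most 2 x 10^9 bits.
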
