Molecular Genetics Lecture 4 PDF
New Mansoura University
Amr M. Mowafy
Summary
This lecture covers molecular genetics, focusing on gene classification, evolution, and their role in various biological processes. The lecture details different types of genes, their functions, evolutionary conservation, and roles in diseases.
Full Transcript
Molecular Genetics Lecture 4
Prof. Amr M. Mowafy

- Gene classification based on purpose of study
- Gene evolution
  - Some Genes Evolve Rapidly; Others Are Highly Conserved
  - New Genes Are Generated from Preexisting Genes
  - Gene Duplications Give Rise to Families of Related Genes Within a Single Cell
  - The Function of a Gene Can Often Be Deduced from Its Sequence
  - More Than 200 Gene Families Are Common to All Three Primary Branches of the Tree of Life

Gene classification based on purpose of study

Genes can be classified in various ways based on different criteria. Here are a few common ways in which genes are classified:

1. Function: Genes can be classified based on their function, such as structural genes that code for proteins, regulatory genes that control gene expression, non-coding genes that do not code for proteins but have other regulatory functions, and pseudogenes (non-functional copies of genes).
2. Inheritance: Genes can be classified based on their inheritance patterns, such as autosomal genes located on autosomes (non-sex chromosomes) or sex-linked genes located on sex chromosomes.
3. Expression: Genes can be classified based on their expression patterns, such as housekeeping genes that are expressed constitutively in all cell types, tissue-specific genes that are expressed only in specific tissues, or inducible genes that are expressed in response to specific stimuli.
4. Evolutionary Conservation: Genes can be classified based on their evolutionary conservation across species, such as orthologs (genes in different species that evolved from a common ancestor) and paralogs (genes that arise from gene duplication within a species).
5. Location: Genes can be classified based on their genomic location, such as genes located on different chromosomes or genes clustered together in the same genomic region (gene families).
6. Role in Disease: Genes can be classified based on their role in disease.

GENE EVOLUTION

Some Genes Evolve Rapidly; Others Are Highly Conserved

Both in the storage and in the copying of genetic information, random accidents and errors occur, altering the nucleotide sequence, that is, creating mutations. Therefore, when a cell divides, its two daughters are often not quite identical to one another or to their parent.

- On rare occasions, the error may represent a change for the better;
- more probably, it will cause no significant difference in the cell's prospects;
- and in many cases, the error will cause serious damage, for example by disrupting the coding sequence for a key protein.

Changes due to mistakes of the first type will tend to be perpetuated, because the altered cell has an increased likelihood of reproducing itself. Changes due to mistakes of the second type (selectively neutral changes) may be perpetuated or not: in the competition for limited resources, it is a matter of chance whether the altered cell or its cousins will succeed. But changes that cause serious damage lead nowhere: the cell that suffers them dies, leaving no progeny. Through endless repetition of this cycle of error and trial (of mutation and natural selection), organisms evolve: their genetic specifications change, giving them new ways to exploit the environment more effectively, to survive in competition with others, and to reproduce successfully.

Clearly, some parts of the genome change more easily than others in the course of evolution. A segment of DNA that does not code for protein and has no significant regulatory role is free to change at a rate limited only by the frequency of random errors. In contrast, a gene that codes for a highly optimized essential protein or RNA molecule cannot alter so easily: when mistakes occur, the faulty cells are almost always eliminated. Genes of this latter sort are therefore highly conserved.
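The contrast between freely changing DNA and conserved genes can be illustrated with a small simulation. This is only a sketch under strong simplifying assumptions that are not part of the lecture: the sequence is made up, one random substitution is attempted per generation, every change at a "conserved" site is treated as lethal (the text says such cells are "almost always" eliminated), and beneficial mutations are ignored.

```python
import random

BASES = "ACGT"


def evolve(seq, conserved, generations, rng):
    """Attempt one random substitution per generation along a single lineage.

    A substitution that hits a conserved site is modelled as lethal: the
    mutant leaves no progeny, so the surviving lineage keeps the old base.
    """
    current = list(seq)
    for _ in range(generations):
        pos = rng.randrange(len(current))
        new_base = rng.choice([b for b in BASES if b != current[pos]])
        if pos in conserved:
            continue  # purifying selection removes this mutant
        current[pos] = new_base  # neutral change, free to persist
    return "".join(current)


rng = random.Random(0)  # fixed seed so the run is repeatable
ancestor = "ATGGCTAAAGGTCTGACCGATCAGTTTGCA"  # invented 30-nucleotide sequence
conserved_sites = range(0, 15)  # assumption: first half is essential

descendant = evolve(ancestor, conserved_sites, generations=200, rng=rng)

# Count differences separately in the conserved and unconstrained halves.
changed_conserved = sum(a != d for a, d in zip(ancestor[:15], descendant[:15]))
changed_free = sum(a != d for a, d in zip(ancestor[15:], descendant[15:]))
print(changed_conserved, changed_free)
```

In this model the conserved half never differs from the ancestor, while the unconstrained half accumulates changes at a rate set only by how often substitutions occur.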
Through 3.5 billion years or more of evolutionary history, many features of the genome have changed beyond all recognition; but the most highly conserved genes remain perfectly recognizable in all living species.

These latter genes are the ones that must be examined if we wish to trace family relationships between the most distantly related organisms in the tree of life. The studies that led to the classification of the living world into the three domains of bacteria, archaea, and eucaryotes were based chiefly on analysis of one of the ribosomal RNA subunits, the so-called 16S RNA, which is about 1500 nucleotides long. Because the process of translation is fundamental to all living cells, this component of the ribosome has been well conserved since early in the history of life on Earth (Figure 1-22).

Figure 1-22. Genetic information conserved since the beginnings of life. A part of the gene for the smaller of the two main RNA components of the ribosome is shown. Corresponding segments of nucleotide sequence from an archaean (Methanococcus jannaschii), a eubacterium (Escherichia coli) and a eucaryote (Homo sapiens) are aligned in parallel. Sites where the nucleotides are identical between species are indicated by a vertical line; the human sequence is repeated at the bottom of the alignment so that all three two-way comparisons can be seen. A dot halfway along the E. coli sequence denotes a site where a nucleotide has been either deleted from the eubacterial lineage in the course of evolution, or inserted in the other two lineages. Note that the sequences from these three organisms, representative of the three domains of the living world, all differ from one another to a roughly similar degree, while still retaining unmistakable similarities.

New Genes Are Generated from Preexisting Genes

The raw material of evolution is the DNA sequence that already exists: there is no natural mechanism for making long stretches of new random sequence.
In this sense, no gene is ever entirely new. Innovation can, however, occur in several ways (Figure 1-23):

- Intragenic mutation: an existing gene can be modified by mutations in its DNA sequence.
- Gene duplication: an existing gene can be duplicated so as to create a pair of closely related genes within a single cell.
- Segment shuffling: two or more existing genes can be broken and rejoined to make a hybrid gene consisting of DNA segments that originally belonged to separate genes.
- Horizontal (intercellular) transfer: a piece of DNA can be transferred from the genome of one cell to that of another, even to that of another species. This process is in contrast with the usual vertical transfer of genetic information from parent to progeny.

Figure 1-23. Four modes of genetic innovation and their effects on the DNA sequence of an organism.

Each of these types of change leaves a characteristic trace in the DNA sequence of the organism, providing clear evidence that all four processes have occurred.

Gene Duplications Give Rise to Families of Related Genes Within a Single Cell

A cell must duplicate its entire genome each time it divides into two daughter cells. However, accidents occasionally result in the duplication of just part of the genome, with retention of original and duplicate segments in a single cell. Once a gene has been duplicated in this way, one of the two gene copies is free to mutate and become specialized to perform a different function within the same cell. Repeated rounds of this process of duplication and divergence, over many millions of years, have enabled one gene to give rise to a whole family of genes within a single genome. Analysis of the DNA sequence of procaryotic genomes reveals many examples of such gene families: in Bacillus subtilis, for example, 47% of the genes have one or more obvious relatives (Figure 1-24).
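Duplication followed by divergence, and the kind of tally behind a figure such as "47% of the genes have one or more obvious relatives", can be sketched on a toy genome. Everything here is an illustrative assumption: the gene names and sequences are invented, relatedness is scored as plain position-by-position identity between equal-length sequences, and the 0.8 cutoff is arbitrary.

```python
def identity(a: str, b: str) -> float:
    """Fraction of positions carrying the same nucleotide (equal lengths only)."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def point_mutate(seq: str, changes: dict) -> str:
    """Return a copy of seq with the given position -> nucleotide substitutions."""
    bases = list(seq)
    for pos, base in changes.items():
        bases[pos] = base
    return "".join(bases)


def fraction_with_relatives(genome: dict, threshold: float) -> float:
    """Fraction of genes with at least one other gene at or above the threshold."""
    names = list(genome)
    with_relative = 0
    for name in names:
        if any(
            other != name and identity(genome[name], genome[other]) >= threshold
            for other in names
        ):
            with_relative += 1
    return with_relative / len(names)


# Toy genome with three unrelated, made-up genes.
genome = {
    "geneA": "ATGGCTAAAGGTCTGACC",
    "geneB": "ATGTTTCCGCAGAACTGA",
    "geneC": "ATGCACGATTGGAGCTAA",
}

# Duplication: geneA is copied. Divergence: the copy picks up two substitutions.
genome["geneA2"] = point_mutate(genome["geneA"], {4: "A", 10: "C"})

paralog_identity = identity(genome["geneA"], genome["geneA2"])
family_fraction = fraction_with_relatives(genome, threshold=0.8)
print(paralog_identity, family_fraction)
```

Here the two copies of geneA remain 16/18 identical and form a two-member family, so half of the four genes in the toy genome have a recognizable relative.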
When genes duplicate and diverge in this way, the individuals of one species become endowed with multiple variants of a primordial gene. This evolutionary process has to be distinguished from the genetic divergence that occurs when one species of organism splits into two separate lines of descent at a branch point in the family tree, for example when the human line of descent became separate from that of chimpanzees. There, the genes gradually become different in the course of evolution, but they are likely to continue to have corresponding functions in the two sister species. Genes that are related in this way (that is, genes in two separate species that derive from the same ancestral gene in the last common ancestor of those two species) are said to be orthologs. Related genes that have resulted from a gene duplication event within a single genome, and are likely to have diverged in their function, are said to be paralogs. Genes that are related by descent in either way are called homologs, a general term used to cover both types of relationship.

Figure: Paralogous genes and orthologous genes: two types of gene homology based on different evolutionary pathways. (A) and (B) The most basic possibilities. (C) A more complex pattern of events that can occur.

The Function of a Gene Can Often Be Deduced from Its Sequence

Family relationships among genes are important not just for their historical interest, but because they lead to a spectacular simplification in the task of deciphering gene functions. Once the sequence of a newly discovered gene has been determined, it is now possible, by tapping a few keys on a computer, to search the entire database of known gene sequences for genes related to it.
In many cases, the function of one or more of these homologs will have been already determined experimentally, and thus, since gene sequence determines gene function, one can frequently make a good guess at the function of the new gene: it is likely to be similar to that of the already-known homologs.

In this way, it becomes possible to decipher a great deal of the biology of an organism simply by analyzing the DNA sequence of its genome and using the information we already have about the functions of genes in other organisms that have been more intensively studied. Mycobacterium tuberculosis, the eubacterium that causes tuberculosis, is extremely difficult to study experimentally in the laboratory and provides an example of the power of comparative genomics. DNA sequencing has revealed that this organism has a genome of 4,411,529 nucleotide pairs, containing approximately 4000 genes.

Of these genes, 40% were immediately recognizable (when the genome was sequenced, in 1998) as homologs of known genes in other species, and could be tentatively assigned a function on that basis. Another 44% showed some informative similarity to other known genes, for example containing a conserved protein domain within a longer amino acid sequence. Only 16% of the 4000 genes were totally unfamiliar. As we saw also for Bacillus subtilis (see Figure 1-24), about half the genes have sequences closely similar to those of other genes in the M. tuberculosis genome, showing that they must have arisen through relatively recent gene duplications. Compared with other bacteria, M. tuberculosis contains an exceptionally large number of genes coding for enzymes involved in the synthesis and degradation of lipid (fatty) molecules. This presumably reflects this bacterium's production of an unusual outer coat that is rich in these substances; the coat, and the enzymes that produce it, may explain how M. tuberculosis escapes destruction by the immune system of tuberculosis patients.
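The idea of guessing a new gene's function from its closest known homolog can be sketched as a tiny database search. This is a minimal illustration, not how real search tools work: the sequences, gene names, and function labels are invented, similarity is scored as the overlap between sets of 4-nucleotide words (a stand-in chosen for brevity, where real tools use sequence alignment), and the 0.5 cutoff is an arbitrary assumption.

```python
from typing import Optional


def kmers(seq: str, k: int = 4) -> set:
    """All overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a: str, b: str, k: int = 4) -> float:
    """Jaccard overlap of the two k-mer sets (0 = nothing shared, 1 = same sets)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def guess_function(query: str, database: dict, min_similarity: float = 0.5) -> Optional[tuple]:
    """Return (name, function, score) of the best match, or None if nothing is similar enough."""
    best = None
    for name, (seq, function) in database.items():
        score = similarity(query, seq)
        if score >= min_similarity and (best is None or score > best[2]):
            best = (name, function, score)
    return best


# Hypothetical database: name -> (sequence, experimentally determined function).
known_genes = {
    "known_gene_1": ("ATGGCTAAAGGTCTGACCGATCAG", "lipid synthesis"),
    "known_gene_2": ("ATGTTCCCGCAGAACTGGAGCTAA", "translation"),
}

# A "newly sequenced" gene differing from known_gene_1 by one substitution.
new_gene = "ATGGCTAAAGGTCTGACGGATCAG"

hit = guess_function(new_gene, known_genes)
miss = guess_function("TTTTTTTTTTTT", known_genes)
print(hit, miss)
```

The new gene is assigned the tentative function of known_gene_1, while the second query shares no words with any known gene and is left unassigned, which mirrors the split between recognizable and "totally unfamiliar" genes described for M. tuberculosis.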
More Than 200 Gene Families Are Common to All Three Primary Branches of the Tree of Life

Given the complete genome sequences of representative organisms from all three domains (archaea, eubacteria, and eucaryotes), one can search systematically for homologies that span this enormous evolutionary divide. In this way we can begin to take stock of the common inheritance of all living things. Because individual lineages have lost some ancestral genes, acquired others by horizontal transfer, and changed some beyond recognition, it seems that only a small proportion of ancestral gene families have been universally retained in a recognizable form. Thus, out of 2264 protein-coding gene families recently defined by comparing the genomes of 18 bacteria, 6 archaeans and 1 eucaryote (yeast), only 76 are truly ubiquitous (that is, represented in all the genomes analyzed). The great majority of these universal families include components of the translation and transcription systems.

A better, though still crude, idea of the ancestral gene set can be obtained by tallying the gene families that have representatives in multiple, but not necessarily all, species from all three major domains. Such an analysis reveals 239 ancient conserved families. With a single exception, these families can be assigned a function (at least in terms of general biochemical activity, but usually with more precision), with the largest number of shared gene families being involved in translation and ribosome production and in amino acid metabolism and transport (Table 1-2).

Reference: Molecular Biology of the Cell, https://www.ncbi.nlm.nih.gov/books/NBK26866/
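The two tallies described in the last section (families present in every genome, versus families represented in all three domains but not necessarily in every species) come down to set operations. The sketch below uses invented genome and family names, and it simplifies "multiple species from all three domains" to "at least one genome in each domain"; only the counts 76 and 2264 are taken from the lecture.

```python
# Hypothetical presence data: genome name -> (domain, gene families found in it).
genomes = {
    "bacterium_1": ("bacteria", {"translation_A", "transcription_A", "aa_transport_A", "motility_A"}),
    "bacterium_2": ("bacteria", {"translation_A", "transcription_A", "aa_transport_A"}),
    "archaeon_1": ("archaea", {"translation_A", "transcription_A", "aa_transport_A", "chromatin_A"}),
    "archaeon_2": ("archaea", {"translation_A", "transcription_A"}),
    "yeast": ("eucaryotes", {"translation_A", "transcription_A", "aa_transport_A", "chromatin_A"}),
}


def ubiquitous_families(genomes: dict) -> set:
    """Families represented in every genome analyzed."""
    family_sets = [families for _, families in genomes.values()]
    return set.intersection(*family_sets)


def families_in_all_domains(genomes: dict) -> set:
    """Families with at least one representative genome in each domain."""
    domains_seen = {}
    for domain, families in genomes.values():
        for family in families:
            domains_seen.setdefault(family, set()).add(domain)
    all_domains = {domain for domain, _ in genomes.values()}
    return {family for family, seen in domains_seen.items() if seen == all_domains}


ubiquitous = ubiquitous_families(genomes)
shared = families_in_all_domains(genomes)

# Share of truly ubiquitous families, using the counts quoted in the lecture.
ubiquitous_share = 76 / 2264
print(sorted(ubiquitous), sorted(shared), round(ubiquitous_share * 100, 1))
```

In the toy data the amino acid transport family is missing from one archaeon, so it drops out of the strict "ubiquitous" set but still counts as shared across all three domains. With the lecture's numbers, the ubiquitous families are roughly 3.4% of the 2264 families defined.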