2024 Cell#6 Genes & Chromosomes Notes PDF
Burman University
Summary
These notes cover genes, chromatin, and chromosomes, focusing on eukaryotic gene structure and genomics. They compare prokaryotic and eukaryotic gene structures, detail major classes of nuclear eukaryotic DNA, and discuss chromosomal structure and function, including the histone code. The notes also describe tools of genomics and define key terms related to gene expression and DNA structure.
Genes, Chromatin and Chromosomes, Chapter 7

Learning Objectives. At the end of this unit, you will be able to:
1. Describe the components of a gene.
2. Compare and contrast prokaryotic and eukaryotic gene structure.
3. Differentiate the major classes of nuclear eukaryotic DNA.
4. Describe the chromosomal structure and function of chromatin.
5. State what the “histone code” means, and how histone tail modifications affect chromatin structure.
6. List the tools of genomics, specifically for computer searches, comparison and analysis.
7. Define terms like coding versus noncoding DNA, exons, introns, LINEs, SINEs, transposons, retrotransposons, mobile elements, epigenetics, X-chromosome inactivation, chromo- and bromo-domains, euchromatin versus heterochromatin, genomics, paralogs, orthologs, ORFs and SNPs.
8. Compare organelle DNA to nuclear DNA structure.

Text: Lodish et al., Ch. 7

Verse: But God gives it a body as he has determined, and to each kind of seed he gives its own body. Not all flesh is the same: People have one kind of flesh, animals have another, birds another and fish another. 1 Cor 15:38-39 (NIV)

The genome is the total genetic information carried by a cell or organism. I believe that the “seed” referenced in this verse represents the genome that was designed for each organism. Genomics, the study of the genomes of different species, has shown that many similarities, and identities, exist. For example, the human genome is approximately 96% similar to the genomes of the great apes. I believe that when God created the species, He used the same genomic design for all creatures (i.e., similar organ systems, anatomic structures and others) but adapted different designs to allow individuals to better survive in their biological niches. This would explain the large percentage of similarities that exist between humans, apes and other mammals. It is much like a contractor building homes to suit his/her customers.
Some customers who love to cook may require a large and well-designed kitchen; others who enjoy soaking in a large tub may like a large marbled bathroom suite. To each its own. Let us enjoy this biological diversity in our world and celebrate the genomes that provide the blueprints for this diversity.

BIOL374, Genes, Chromatin and Chromosomes, p.2

6.1 Eukaryotic Gene Structure
§ A gene is the entire nucleic acid sequence that is necessary for the synthesis of a functional gene product (polypeptide or RNA).
§ Genes are also DNA regions that code for RNA molecules such as tRNA and rRNA.
§ In eukaryotes, genes lie amidst a large expanse of nonfunctional, noncoding DNA, and genes may also contain regions of noncoding DNA.

A. Components of a gene:
1. The coding region codes for the amino acid/RNA sequence.
2. Transcriptional-control region (or promoter):
  – In prokaryotes this is the 5’-upstream region (200 bp – 1 kb long).
  – In eukaryotes this can be at the 5’ or 3’ end, but is mostly 5’ and up to 50 kb long.
3. Other critical noncoding regions, including the 3’ cleavage, polyadenylation and splice sites, and introns (eukaryotes; F5-27).

B. Prokaryotic genes:
– Bacterial operons produce polycistronic mRNAs, which are translated to yield several different proteins.
– An operon is a single transcriptional unit that encodes proteins involved in related functions (F5-13a; Old text).
– An operon has a single transcriptional-control region; therefore, mutations in this region influence expression of all genes in the operon.
– Prokaryotic genes lack introns.

C. Eukaryotic genes:
– Most eukaryotic mRNAs are monocistronic, and most eukaryotic genes contain lengthy introns.
– More than 95% of the sequence of an average 50 kb human gene lies in introns and noncoding 5’ and 3’ regions.
– Several forms of transcriptional units occur in eukaryotic genes:
1. Simple Transcriptional Units
A simple transcriptional unit produces a single monocistronic mRNA, transcribed from a particular promoter, which is translated into a single protein [F7-3a].
2. Complex Transcriptional Units
A complex transcriptional unit is transcribed into a 1º transcript that can be processed into 2 or more different mRNAs depending on the choice of:
  a. splice sites (or internal exons): the mRNAs share the same 5’ and 3’ exons but use different internal exons (F7-3b top).
  b. polyadenylation sites: the mRNAs share the same 5’ exons but have different 3’ exons (F7-3b middle).
  c. promoters: the mRNAs have different 5’ exons but common 3’ exons (F7-3b bottom).
§ Cell-Specific Expression of Genes
– Many complex transcriptional units express one mRNA in one cell type and an alternative mRNA in a different cell type, e.g., fibronectin (F5-28).
– This is often accomplished by using distinct cell-type-specific promoters, as in the last option (F7-3b bottom).

D. Five Major Classes of Nuclear Eukaryotic DNA (T7-1):
1. Genes that encode proteins and functional RNAs, including gene families, constitute ~40% of the human genome.
– Protein-coding genes may be solitary or belong to a gene family.
– Solitary protein-coding genes, whose sequence occurs only once in the haploid genome, account for ~25-50% of the protein-coding genes.
– Duplicated genes constitute the 2nd group of protein-coding genes. These are genes with close but nonidentical sequences that are generally located within 5-50 kb of one another.
– Duplicated gene family: A set of duplicated genes that encode proteins with similar but nonidentical amino acid sequences is called a gene family; the encoded, closely related, homologous proteins constitute a protein family. Generally located in a gene cluster, gene families are made up of a few to 30+ members, e.g., immunoglobulins, kinases, egg shell proteins in silkworms (50 genes).
Different members of a family have similar, but subtly different, properties suited to the particular type of cells in which they are expressed (e.g., hemoglobin in RBCs and myoglobin in muscle both carry O2 and have similar structures but different O2-binding affinities).
– Duplicated genes arose by duplication of an ancestral gene and subsequent independent mutations leading to sequence drift (F7-2b).
– The presence of pseudogenes, nonfunctional sequences similar to those of functional genes, is one of the best pieces of evidence for sequence drift (F7-4a).
– Sequence drift generated sequences that either terminate translation or block mRNA processing, rendering such regions nonfunctional even when they are transcribed into RNA.
2. Tandemly Repeated Genes.
– Tandemly repeated genes, or arrays, encode rRNA, tRNA, snRNA, miRNA and histones.
– They make up < 0.5% of the human genome.
– Unlike gene families, the repeated genes encode identical or nearly identical proteins or functional RNAs.
– The copies are usually joined head to tail, with non-transcribed spacer regions between the transcribed regions.
– tRNA and histone genes often occur in clusters, but generally not in tandem arrays.
– Gene products in high demand are encoded by multiple copies of genes.
– In addition to highly repeated rRNA and tRNA genes, many other genes are transcribed into nonprotein-coding RNAs.
– Some examples include snRNAs (small nuclear RNAs; RNA splicing), snoRNAs (small nucleolar RNAs; rRNA processing), miRNAs (micro RNAs; regulate stability and translation of some mRNAs) and others (T7-2).
3. Repetitious DNA. (See Sections 6.2 and 6.3 Notes)
– Makes up ~50% of the human genome, but these sequences overlap with the protein-coding and intergenic regions (i.e., classes 1 and 5), which is why the categories in T7-1 add up to > 100%.
4. Long Noncoding RNA Genes
– lncRNAs (long noncoding RNAs) have been discovered in mammalian nuclei.
– They make up approximately 15% of the human genome.
– Some may function to regulate expression of specific protein-coding genes; most do not have a known function yet.
5. Intergenic Regions (Unclassified Spacer DNA; see Section 6.2).

6.2 Chromosomal Organization of Genes and Noncoding DNA
§ Genomes of higher eukaryotes contain much nonfunctional DNA.
§ The density of genes varies greatly in different regions of human chromosomal DNA, from “gene-rich” regions, such as the β-globin cluster, to large gene-poor “deserts”.
§ Synthesis of nonfunctional DNA requires time, raw materials and energy. There is a selection pressure to lose nonfunctional DNA in lower eukaryotes.
§ Only ~2% of human DNA lies in exons (and only 1.2% codes for proteins).

A. Noncoding (nonfunctional) DNA
– Amongst eukaryotes, cellular DNA content does not correlate with phylogeny, e.g., the vertebrates with the greatest amount of DNA per cell are amphibians; plants have about 2-5x (tulips 10x) more DNA than humans.
– Variation in the amount of nonfunctional DNA in the genomes of various species is largely responsible for the lack of a consistent relationship between the amount of DNA in the haploid chromosomes of an animal or plant and its phylogenetic complexity.

B. Human Exons and Introns
– Exons range from 50 to 200 bp (except for the 3’ exons, which tend to be longer).
– Introns vary considerably in length. The median length is 3.3 kb (range ~90 bp to very long).
– About half of human genomic DNA is transcribed into pre-mRNAs, but some 95% of these sequences are in introns, which are removed by splicing.
– There are several types of eukaryotic DNA, much of which is never transcribed.

C. Repetitious DNA
– Besides the protein-coding genes and tandemly repeated genes, eukaryotic DNA contains repetitious DNA (which also resides within protein-coding and intergenic regions).
– Constituting 6% of the human genome, simple-sequence DNA is composed of perfect or nearly perfect repeats of relatively short sequences.
– Longer sequences, or interspersed repeats, make up the rest of the repetitious DNA (~45% of the human genome).
– Simple-sequence DNA, also known as satellite DNA:
  Short sequences (1 to 500 bp) repeated in long tandem arrays (up to 150 copies).
  Although some satellites have repeat units as short as 1-4 bp (microsatellites), most are composed of 14-500 bp repeats, in tandem arrays of 20-100 kb.
  Preferentially located in centromeres, telomeres, and specific locations within the arms of chromosomes [F7-6].
  Called satellites because they band at a different position from the bulk cellular DNA in equilibrium buoyant-density centrifugation.
  Satellites with 1 to 13-bp repeats are called microsatellites.
  Microsatellites are occasionally found within transcriptional units.
  Expanded microsatellites (presumably due to daughter-strand slippage and inherited from parents) have been implicated in many neuromuscular diseases.
  In some cases, they behave like a recessive mutation because they interfere with the function or the expression of the gene in which they occur.
  In other cases, the repeats occur within a gene, as in Huntington disease, where a triplet repeat results in long polymers of a single amino acid.
  In still other cases, repeats in a noncoding region can interfere with the processing of a specific subset of mRNAs, e.g., myotonic dystrophy type 1, where an expanded repeat in the 3’-UTR results in the incorrect splicing of pre-mRNAs.
  Simple-sequence DNA can be useful for identifying particular chromosomes by fluorescence in situ hybridization (FISH).
  Microsatellites possibly originated by “backward slippage” of a daughter strand on the template during DNA replication [F7-5].
  Unequal crossing-over during meiosis can generate differences in the lengths of simple-sequence DNA tandem arrays.
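The definition of a microsatellite given above (a repeat unit of 1 to 13 bp repeated in tandem) can be sketched as a small search routine. This is only an illustration: the function name `find_microsatellites` and the cutoff of at least 4 tandem copies are assumptions made here, not values from the notes or from Lodish.

```python
import re


def find_microsatellites(dna, max_unit=13, min_repeats=4):
    """Return (start, repeat_unit, copies) for tandem repeats of short units.

    max_unit=13 follows the 1 to 13-bp definition in the notes;
    min_repeats=4 is an arbitrary, illustrative threshold (assumption).
    """
    dna = dna.upper()
    # A lazily matched unit of 1..max_unit bases, followed by at least
    # (min_repeats - 1) further back-to-back copies of that same unit.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(dna):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# A CAG triplet repeat (the kind expanded in Huntington disease) with short flanks.
print(find_microsatellites("TTA" + "CAG" * 5 + "ACGT"))
```

Comparing the `copies` value at the same locus in two individuals is the idea behind length-based DNA fingerprinting; real tools also tolerate imperfect repeats, which this sketch does not.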
Such differences in array length are polymorphic among individuals. DNA fingerprinting depends on differences in the length of simple-sequence DNAs [F7-7].
§ Differences in the length of simple-sequence repeats can be used in DNA fingerprinting (or RFLP, restriction fragment length polymorphism).

D. Intergenic Regions (Unclassified spacer DNA)
– A substantial portion of the genome, at ~25%.
– Located between transcriptional units, and not repeated anywhere.
– Transcriptional-control regions, 50-200 base pairs in length, may exist in these regions. These control regions usually help to regulate transcription from distant promoters.
– They may also contribute to the structure of the chromosomes.
– Most genome scientists believe that these sequences insulate, or space, genes from one another.

6.3 Transposable (Mobile) DNA
§ 2nd type of repetitious DNA.
§ Moderately repeated, mobile DNA sequences are interspersed throughout the genomes of prokaryotes, higher plants and animals.
§ These sequences range in size from hundreds to thousands of bp and make up ~25-50% of the genome.
§ Because they have the unique ability to “move” in the genome, they are called mobile DNA elements or transposable elements.
§ A transposable element (or mobile DNA element) is a DNA sequence that can change its position within a genome, sometimes creating or reversing mutations and altering the cell's genetic identity and genome size.
§ The sequences are copied and inserted into a new site in the genome by the process of transposition.
§ In most cases, transposition of mobile elements shows no detrimental effects (if they transpose into a noncoding region). Occasionally, they can cause mutations.
§ These molecular parasites (sequences) appear to serve no useful function but exist only to maintain themselves (Francis Crick called them “selfish DNA”).
§ Two classes of mobile elements, transposons and retrotransposons:
1. Transposons or Insertion Sequences (IS) are elements that transpose to new sites directly as DNA.
2. Retrotransposons are those that are first transcribed into RNA, and then reverse transcribed into DNA before transposing.
§ Both types produce short direct repeats at the site of insertion, which flank the mobile element. The sequence of these repeats depends on the target DNA, but their length depends on the type of mobile element [F7-8].

A. DNA Transposons
§ Most mobile elements in bacteria transpose directly as DNA.
§ In contrast, most mobile elements in eukaryotes are retrotransposons, but eukaryotic DNA transposons do exist (the original mobile elements discovered by Barbara McClintock are eukaryotic DNA transposons).
1. Bacterial Insertion Sequences
– Also referred to as transposons, insertion sequences, or IS elements (F7-9).
– Transposition of an IS element is very rare, ~1 in 10⁵-10⁷ cells/generation, because higher rates of transposition would probably result in too great a mutation rate (if the IS inactivates an essential gene), killing the host cells and the IS element they carry.
– IS elements can insert into plasmids and lysogenic viruses, and thus be transferred to other cells.
– The general structure of bacterial IS elements includes a protein-coding region that encodes transposase, an enzyme required for transposition of the IS element to a new site [F7-9].
– In the transposition of a bacterial IS element [F7-10], duplication of the target DNA results in 5-11 bp direct repeats.
2. Eukaryotic DNA Transposons
– Similar in structure to bacterial IS elements but may contain introns.
– The Drosophila P element is used to create transgenic flies (although most Drosophila mobile elements are retrotransposons).
– Ac and Ds elements in corn.
  1st mobile DNA elements discovered (by Nobel Laureate Barbara McClintock).
  Wild-type corn kernels have a purple pigment (anthocyanin) that is synthesized by an enzyme pathway.
  If a mutation occurs in one of the enzymes in the pathway, the cells do not produce pigment and the result is white corn kernels.
  Mutations are caused by Ac and Ds elements:
  Ac elements (Activator), like IS elements, insert and disrupt gene sequences. They are fully functional transposons that can also remove themselves from the insertion site. This results in a highly reversible mutant phenotype.
  Ds elements (Dissociation) can also disrupt genes when inserted but lack part of the sequence encoding transposase. Therefore, they cannot transpose by themselves but require the presence of an Ac element.
– Transposition during the S phase can result in copy number changes [F7-11].

B. Retrotransposons
§ Transpose via an RNA intermediate that is transcribed from the mobile element by an RNA polymerase and then converted back into dsDNA by a reverse transcriptase before incorporating into a new location in the genome of the host cell.
§ Retrotransposons are divided into those containing and those lacking long terminal repeats (LTRs).
1. Viral retrotransposons
– Viral retrotransposons contain LTRs and behave like retroviruses in the genome [F7-12].
– Also called LTR retrotransposons. LTRs are direct repeats of ~250-600 bp.
– ~8% of the human genome.
– Include Ty elements in yeast and copia elements in Drosophila.
– Encode most of the proteins found in common retroviruses; however, elements such as Ty912 and copia lack a portion of the env gene needed to form the retroviral envelope.
– Lacking the envelope proteins, they cannot bud from their host cells.
– However, they can transpose to new sites in the DNA of the host cells.
– In the retroviral life cycle:
  § The left LTR functions as a promoter that directs host-cell RNA pol II to initiate transcription at the 5’-end of the R sequence.
  § The right LTR directs host-cell RNA processing enzymes to cleave the primary transcript and add a poly(A) tail at the 3’-end of the R sequence.
  § The resulting retroviral RNA genome, which lacks complete left and right LTRs, is packaged into a virion that buds from the cell [F7-13].
  § After a retrovirus infects a cell, reverse transcriptase (encoded by the viral genome) synthesizes a dsDNA containing complete LTRs (F7-14).
  § Integrase, another virus-encoded enzyme (closely related to the transposases of DNA transposons), inserts the ds retroviral DNA into the host cell genome.
– ERVs
  § The most common LTR sequences in humans are derived from endogenous retroviruses (ERVs).
  § Of the 443,000 ERV-related sequences, almost all consist of LTRs derived from full-length proviral DNA that had recombined, deleting the internal retroviral sequences during homologous recombination.
2. Nonviral retrotransposons
– Lack LTRs, and are therefore called nonviral retrotransposons.
– Most abundant mobile elements in mammals.
– Two classes: LINEs (long interspersed elements; ~6 kb) and SINEs (short interspersed elements; ~300 bp).
– Also found in protozoans, insects, and plants.
– LINEs
  § There are ~900,000 LINEs in the human genome, or 21% of the genome.
  § Composed of L1, L2 and L3 LINEs, which have similar mechanisms of transposition but differ in their sequences.
  § The L1 LINE is the most common LINE element in the human genome [F7-16].
    a. ORF1, 1 kb, encodes an RNA-binding protein.
    b. ORF2, 4 kb, encodes a bifunctional protein with reverse transcriptase (homologous to the reverse transcriptases of retroviruses and viral retrotransposons) and DNA endonuclease activities.
    c. The A/T-rich region is important for retrotransposition [F7-17].
– SINEs
  § 2nd most abundant mobile element, ~13% of total human DNA (~1.6 million copies).
  § ~300 bp long; do not encode proteins.
  § Most contain a 3’ A/T-rich sequence similar to that of the LINEs.
  § SINEs are transcribed by RNA pol III, the same RNA pol that transcribes tRNAs, 5S rRNAs and other small stable RNAs.
  § Most likely the ORF1 and ORF2 proteins of LINEs mediate transposition of SINEs.
  § Alu SINEs, or Alu elements, are the most common, at ~1,100,000 copies. (Among the retrotransposons, ~40% are L1 LINEs and 60% SINEs, of which ~90% are Alu elements.)
    a. Each Alu sequence contains the AluI restriction site.
    b. Alu sequences can be inserted into introns and noncoding 5’ and 3’ regions.

C. Pseudogenes
§ Nonfunctional genes that arose by sequence drift (mutations) of duplicated genes.
§ Processed pseudogenes, which lack introns, are nonfunctional genomic copies of cellular mRNAs that were reverse transcribed and inserted into the genome.
§ Most processed pseudogenes are flanked by short direct repeats, suggesting they were generated by rare retrotransposition events.

D. Mobile elements and gene evolution
§ Mobile DNA elements probably had a significant influence on gene evolution.
§ Spontaneous mutations may result from the insertion of a mobile DNA element into or near a transcription unit.
§ Homologous recombination between mobile DNA elements may contribute to gene duplication (F7-2b) and other rearrangements, including duplication of introns, recombination of introns to create new genes, and control of gene expression.
§ Mobile elements and mutations
– In Drosophila, ~½ of the spontaneous mutations result from the insertion of a mobile element into or near a transcription unit.
– In mammals, this number is smaller: ~10% in mice, and only ~0.1-0.2% in humans.
– Nevertheless, mobile elements have been found in mutant alleles associated with several human genetic diseases.
§ Exon shuffling
– Exon shuffling refers to recombination between interspersed repeats in the introns of two separate genes, generating new genes from a novel combination of preexisting exons [F7-18].
– Both DNA transposons [F7-19a] and LINE retrotransposons [F7-19b] can occasionally carry unrelated flanking sequences when they insert into new sites. This may result in exon shuffling or altered transcription control (e.g., if an enhancer is carried along with the DNA).

6.4 Genomics: Genome-wide Analysis of Gene Structure and Expression
§ The entire genomic sequences of humans and many experimental organisms (E. coli, C. elegans, Arabidopsis thaliana, M. musculus, etc.) have been determined and stored in three primary databases:
1. GenBank at the National Institutes of Health (NIH), Bethesda, Maryland, USA.
2. EMBL Sequence Data Base at the European Molecular Biology Laboratory, Heidelberg, Germany.
3. Protein Data Bank, Osaka, Japan.
§ Genomics
– Genomics is the study of whole genomes of organisms.
– It includes the comparative analysis of the complete genomic sequences from different organisms.
– It is used to assess evolutionary relations between species and to predict the number and general types of proteins produced by an organism.
– The genome is the total genetic information carried by a cell or organism.
§ BLAST (basic local alignment search tool)
– The function of a protein that has not been isolated can often be predicted on the basis of the similarity of its amino acid sequence to proteins of known function.
– A computer algorithm known as BLAST rapidly searches databases of known protein sequences to find those with significant similarity to a new (query) sequence.
– Proteins that share only short functional motifs may not be identified in a typical BLAST search.
– These short motif sequences may instead be located by searches of motif databases.
§ Protein Relatedness
– A protein family comprises multiple proteins that are related (believed to be derived from the same ancestral protein).
– The corresponding genes constitute a gene family.
– A phylogenetic tree can represent the relatedness between similar, or homologous, sequences.
– Paralogs and Orthologs: Within a species, related genes that arose by gene duplication, and their encoded proteins, are paralogous (or paralogs); those that derive from speciation are orthologous (or orthologs). Orthologs usually have a similar function.
§ ORF
– Open reading frames (ORFs) are regions of genomic DNA containing at least 100 codons located between a start codon and a stop codon.
– A computer search of the entire bacterial and yeast genomic sequences for ORFs correctly identifies most protein-coding genes.
– Additional data must be used to identify probable genes in the genomic sequences of humans and other higher eukaryotes because of the more complex gene structure in these organisms.
§ Ontogeny and Phylogeny
– Ontogeny does not correlate with phylogeny, i.e., biological complexity is not directly related to the number of protein-coding genes.
– For example, amphibians have more genes than humans, which have a more complex body plan and more complex behavior.
§ Variation between Individuals - SNPs
– The DNA sequences of humans who are not closely related differ in ~1-2% of the 3×10⁹ bp in the human genome.
– Most of these differences are single nucleotide polymorphisms (SNPs).
– SNPs are usually not functionally significant because they mostly occur in long introns or between genes, or result in silent codon changes in coding regions.
§ Variation between Individuals - Copy Number
– The number of copies of DNA segments, changed by deletions and tandem duplications, varies among individuals in ~12% of the genome.
– Deletions and duplications probably arose from unequal crossing over between chromosomes during meiotic recombination (F7-2).
– In some cases, deletions result in only one allele being present (one copy deleted in one of the homologous chromosomes only). Likewise, duplications could result in more copies in one of the homologous chromosomes.
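The ORF definition above (at least 100 codons between a start codon and a stop codon) can be turned into a very small search routine, which is roughly what makes ORF scanning effective on bacterial and yeast genomes. The sketch below is a simplified illustration under stated assumptions: the name `find_orfs` is hypothetical, only the three forward-strand reading frames are scanned, ATG is the only start codon considered, and the start codon is counted toward the codon total. It ignores introns, which is exactly why this approach falls short for human genes.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=100):
    """Return (start, end, frame) for forward-strand ORFs.

    start/end are 0-based, end-exclusive, and include the stop codon.
    min_codons=100 follows the threshold quoted in the notes.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first ATG seen since the last stop codon
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # ATG through the last sense codon
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Toy sequence: ATG + 5 alanine codons + stop, with a lowered threshold.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
print(find_orfs(toy, min_codons=3))
```

With the default threshold of 100 codons the toy sequence yields no ORFs; a fuller treatment would also scan the reverse complement.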
6.5 Structural Organization of Eukaryotic Chromosomes
§ Eukaryotic DNA associates with histones, the most abundant nuclear proteins, to form chromatin.
§ About half of the mass of chromosomes is attributed to proteins.
§ The metaphase chromosome is the most compact, folded form of the chromosome.

A. Histones
§ Histones are a family of small basic proteins, rich in positively charged basic amino acids (Arg & Lys), which interact with the negative charges on DNA.
§ 5 major types: H1, H2A, H2B, H3, & H4.
§ H2A, H2B, H3, & H4 are highly conserved in eukaryotes (i.e., in amino acid sequence and hence tertiary protein structure). The H1 sequence is more divergent; H1 is sometimes referred to as a linker histone as it interacts with both a nucleosome and linker DNA (F7-21).

B. Chromatin exists in extended & condensed forms
§ “Beads on a string” (F7-22a)
– Isolated at low [salt] in the absence of Mg2+.
– Extended form where the string is a thin filament of “linker” DNA connecting beadlike structures called nucleosomes.
– Nucleosomes are complexes of histones and DNA [F7-20]:
  DNA is wound around an octameric protein core made of 2 copies each of histones H2A, H2B, H3 and H4 (like thread around a spool).
  147 bp are wrapped in slightly less than 2 turns.
  Linker DNA ranges from 50-150 bp.
  Nucleosomes are 10 nm in diameter.
  Nucleosomes are assembled right after DNA replication.
  A single H1 histone is associated with each nucleosome and is bound to the DNA as it enters and exits the nucleosome core (F7-21).
§ More condensed (F7-23)
– Isolated in isotonic physiological buffers.
– Structurally disordered fiber, 5 to 24 nm (F7-24a):
  Nucleosomes contact each other, some appearing to stack on each other (F7-23b), others arranged in a helical conformation (F7-23c), and sometimes forming a chain that loops back on itself (F7-23d).
  Chromatin that is not transcribed exists as this thicker fiber.
  Transcribed DNA assumes the beads-on-a-string structure.

C. Nonhistone proteins provide a structural scaffold for long chromatin loops
§ Megabases of DNA loops are anchored to a chromosome scaffold consisting of nonhistone proteins.
§ The scaffold is shaped like the metaphase chromosome and persists even after nuclease digestion of DNA [F6-35, old text].
§ SMC Proteins
– The structural maintenance of chromosomes (SMC) proteins are U-shaped dimeric complexes that link two chromatin fibers (F7-25a,b).
– Long loops of chromatin are tethered at the base of each loop by several SMC complexes, forming a topological knot between transcription units (F7-25c).
– “SMC knots” insulate transcription units from each other, so that proteins regulating transcription of one gene do not influence the transcription of a neighboring gene [F8-30, old text].

D. Metaphase Chromosomes [F8-36c, old text].
§ The 30-nm fiber folds into a 100-130-nm chromonema fiber, which then folds into the 200-250-nm middle prophase chromatid, which then folds into the 500-750-nm diameter chromatids seen at metaphase.

E. Interphase Chromosomes [F7-29].
§ Chromosomes are less condensed in interphase than in metaphase.
§ During interphase, human chromosomes remain in nonoverlapping regions of the nucleus, called chromosome territories.

F. Eukaryotic Chromosomes
§ Each eukaryotic chromosome is a single linear DNA molecule.
§ In humans, the longest DNA molecule is 2.8×10⁸ bp, or almost 10 cm.
§ Chromatin contains small amounts of other proteins, e.g., DNA-binding transcription factors and HMG (high mobility group) proteins, in addition to histones and SMC proteins.

G. Nucleosomes – Histone tails and condensation of DNA
§ The N-terminal tails of the histones and the C-terminal tails of H2A and H2B, which are involved in condensation of chromatin, are more “disordered” in the structure [F7-26a].
§ Histone tails are required for the condensation of DNA from the beads-on-a-string conformation to the 5-24 nm fiber.
§ Histone tails are subject to multiple post-translational modifications such as acetylation, methylation, phosphorylation and ubiquitination.
§ The precise amino acids that are modified may provide a “histone code” that controls chromatin condensation [F7-26b].
§ For example, positively charged Lys residues on the histone tails interact with linker DNA and negatively charged histone patches.
§ When these lysines are acetylated by nuclear lysine acetyltransferases (KATs; K for lysine), these interactions cannot form, resulting in the decondensing of DNA from the condensed fiber form.
§ Proteins involved in transcription, replication, etc., have easier access to chromatin with hyperacetylated tails than to chromatin with hypoacetylated tails.
§ Histone tails also bind to other proteins associated with chromatin changes during transcription and replication.
§ Highly condensed chromatin is referred to as heterochromatin, whereas chromatin that is less condensed is euchromatin [F7-28].
§ Histone deacetylases (HDACs), which remove acetyl groups from acetylated lysines on histone tails, promote condensation of chromatin.
§ Similarly, methylation at Lys9 of H3 promotes chromatin condensation.
§ The histone code of modified amino acids in the histone tails is read by proteins that bind to the modified tails, in turn promoting condensation or decondensation of chromatin.
§ Some bound proteins contain a chromodomain, which recognizes and binds histone tails that are methylated.
– One such protein is heterochromatin protein 1 (HP1), which binds to histone H3 trimethylated on lysine 9.
– HP1 also associates with itself and with the histone methyl transferase (H3K9 HMT, which methylates H3 lysine 9) via another domain, called the chromoshadow domain, commonly present in these proteins.
– These interactions cause condensation of the 30-nm chromatin fiber and spreading of the heterochromatic structure along the chromosome until a boundary element is encountered [F7-28c].
§ Boundary elements are regions in the chromosome where nonhistone proteins bind to block histone methylation on the other side of the boundary [F7-28c].
§ Histone-tail modifications typical of euchromatin (e.g., hyperacetylation) are bound by bromodomains; thus bromodomain-containing proteins are associated with transcriptionally active chromatin.
– The largest subunit of the transcription factor TFIID contains 2 bromodomains.
– Some bromodomain-containing proteins also have histone acetylase activity.

H. Epigenetic Memory
§ Epigenetics is the study of changes in organisms caused by modification of gene expression rather than alteration of the genetic code itself.
§ Heterochromatin is reestablished following DNA replication by means of the “histone code” present on half of the DNA molecule (i.e., the parental strand).
§ The H3K9 HMT associated with methylated lysines on the parental strands will methylate the lysines of the newly assembled nucleosomes on the new daughter strands, regenerating the heterochromatin in both daughter chromosomes.
§ Consequently, heterochromatin is marked with an epigenetic code, which does not depend on the DNA base sequence, that maintains the repression of the associated genes.
§ The epigenetic code is also present on euchromatin (due to the presence of hyperacetylation on the parental strands).
§ KATs hyperacetylate the newly synthesized daughter strands.
§ Consequently, the epigenetic code associated with euchromatin helps to maintain the transcriptional activity of genes in euchromatin through successive cell divisions.
§ X-chromosome inactivation in mammalian females.
– One of the X chromosomes is randomly inactivated to preserve the same gene dosage as in males (i.e., one copy of each allele is active).
– During early embryonic development, either the Xm or the Xp is inactivated in each cell, so that roughly half of the somatic cells have an inactive Xm and the other half have an inactive Xp.
– Subsequently, all daughter cells maintain the same inactive X chromosome as their parent cells.
– X-chromosome inactivation is a form of epigenetic control.
– Histones associated with the inactive X chromosome have post-translational modifications characteristic of other regions of heterochromatin, i.e., hypoacetylation of lysines and di- or tri-methylation of H3 lysines.
– Inactivation is controlled by the X-inactivation center, which contains the XIST gene; XIST RNA (a form of lncRNA) triggers silencing of the X chromosome that expresses it.
– Polycomb protein complexes are also involved in X-inactivation.
– Thus, the activity of genes on the X chromosome in female mammals is controlled by chromatin structure, rather than by the underlying DNA sequence.
– The inactivated X chromosome (either Xp or Xm) is maintained as the inactive chromosome in the progeny of all future cell divisions because the histones are modified in a specific, repressing manner that is faithfully inherited through each cell division.
6.6 Morphology and Functional Elements of Eukaryotic Chromosomes
§ Microscopic observations of the number and size of chromosomes and their staining patterns have revealed important aspects of chromosome structure.
§ Each condensed metaphase chromosome has 2 sister chromatids, which are attached at the centromere [F7-31].
A. Karyotype
§ The number, sizes and shapes of the metaphase chromosomes, which are species-specific, constitute the karyotype.
§ Chromosome painting, using fluorescence in situ hybridization (FISH), distinguishes each homologous pair by color [F7-33].
These FISH-painted chromosomes are useful in revealing chromosome anomalies and karyotypes.
§ Karyotypes are species-specific, e.g., Indian muntjac [F7-32a] and Reeves muntjac [F7-32b].
B. Banding Patterns in Metaphase Chromosomes
§ Dyes selectively stain certain regions, producing banding patterns that are specific for individual chromosomes.
§ Giemsa, a permanent DNA dye, produces G bands (brief mild heat or proteolysis followed by Giemsa).
§ R bands (hot alkali treatment before Giemsa) form a pattern that is almost the reverse of the G bands.
§ Banding patterns can also be revealed by using fluorescent probes in chromosome painting. These probes are specific for sites scattered along the length of each chromosome to “paint” chromosomes.
§ Banding patterns allow one to locate the sites of chromosomal breaks and rearrangements, e.g., chromosomal translocation analyzed by banding patterns and multicolor FISH [F7-34].
§ In chromosome diagrams, p denotes the short arm of the chromosome and q the long arm.
§ e.g., the HLA region (green) lies between 6p22.1 and 6p21.2; a label such as 6p21.2 reads as chromosome 6, short arm, region 2, band 1, sub-band 2 (right fig. from old text).
§ Heterochromatin consists of chromosome regions that do not uncoil. It appears as dark-staining regions of condensed chromatin [F7-28a].
§ Euchromatin appears as light-staining regions of less condensed chromatin.
§ Heterochromatin frequently occurs at centromeres and telomeres.
C. Three functional elements are required for replication and stable inheritance of eukaryotic chromosomes.
§ In yeast, ARS (autonomously replicating sequences) are the replication origins for initiation of DNA replication.
§ In yeast, CEN, or the centromere, is required for chromosome migration during mitosis.
§ The two ends (telomeres or TEL) are repeated sequences with high G content [e.g., (G4T2)n] added on by telomerase (or telomere terminal transferase).
§ CEN sequences are conserved [F7-38a]. They are required for kinetochore-microtubule interactions [F7-38c].
D. Telomerase
§ Because the chromosome is a linear DNA molecule, the 5’-end of each newly synthesized lagging strand cannot be completed during DNA synthesis, so sequence would be lost from the chromosome ends at each round of replication [F7-39].
§ Telomerase, a protein-RNA complex, prevents this loss.
§ It has a special reverse transcriptase activity that completes replication of telomeres [F7-40].
6.7 Organelle DNAs
A. Mitochondrial DNA (mtDNA)
§ Mitochondria contain multiple mtDNA molecules, which encode rRNAs, tRNAs, and some mitochondrial proteins involved in the electron transport chain and ATP synthesis [F12-10; old text].
§ Circular, ~16.5 kb in mammals; contains no introns and very little noncoding DNA.
§ The size and coding capacity of mtDNA vary considerably in different organisms.
§ The products of mitochondrial genes are not exported (mtDNA-encoded proteins are synthesized on mitochondrial ribosomes).
§ Human mtDNA: transcription of the outer (H) strand is clockwise and transcription of the inner (L) strand is counterclockwise. The H (or heavy) strand encodes most of the mitochondrial genes.
§ Mitochondrial growth and division are not coupled to nuclear division.
– Organelles grow by the incorporation of proteins and lipids, most of which are synthesized as cytosolic precursors containing uptake-targeting sequences.
– This process occurs continually during the interphase of the cell cycle.
– As organelles increase in size, one or more daughters pinch off in a manner similar to bacterial cell division.
§ Mitochondrial inheritance
– In mammals and most other animals, the sperm contributes little (if any) cytoplasm to the zygote; virtually all of the mitochondria in the embryo are from the egg.
– The amount of mtDNA in a cell depends on the number of mitochondria, and the mtDNA population reflects the uniparental (maternal) inheritance of mitochondria, e.g., in mice, 99.99% maternal and 0.01% paternal.
– mt genes thus show cytoplasmic inheritance [F12-9; old text].
– Petite yeast mutants form small colonies because oxidative phosphorylation is defective owing to a deletion in mtDNA.
§ Mutations in mtDNA cause several genetic diseases in humans.
– These diseases show a maternal, cytoplasmic pattern of inheritance.
– They can cause neuromuscular disorders because of the high demand for ATP in nerve and muscle tissues.
– e.g., in Leber’s hereditary optic neuropathy, degeneration of the optic nerve occurs (eventual blindness) due to a missense mutation in the gene for a protein subunit of NADH-CoQ reductase. Patients have a mixture of wild-type and mutant mtDNA in their cells (heteroplasmy); the severity of the phenotype is greater when the fraction of mutant mtDNA is higher.
§ Mitochondrial genetic codes differ from the standard genetic code.
– The codes of animal and fungal mtDNA differ from the standard (bacterial/nuclear) code and also differ among animals and fungi.
– Some codons specify alternative amino acids or act as stop codons.
– Plant mtDNA uses the standard genetic code.
B. Chloroplasts contain large circular DNAs, or cpDNA.
§ Circular, ~120-160 kb (size depends on plant species).
§ Encodes ~120 genes, mostly for photosynthesis and chloroplast gene expression.
– rRNAs, tRNAs, RNA polymerase and ribosome subunits
– Photosynthetic electron transport complexes
– One of the 2 subunits (the large subunit) of ribulose 1,5-bisphosphate carboxylase (fixation of CO2 during photosynthesis)
§ Uses the standard genetic code.
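The point made under mtDNA, that some codons specify a different amino acid or a stop in mitochondria, can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the four reassignments are those of the vertebrate mitochondrial code (UGA = Trp, AUA = Met, AGA and AGG = stop), which these notes do not list; the example mRNA and the names `STANDARD`, `VERT_MITO` and `translate` are invented for this sketch.

```python
# Sketch: translate one mRNA with the standard code and with the
# vertebrate mitochondrial code to show how codon reassignment changes the product.

BASES = "UCAG"

# Amino acids of the standard genetic code, ordered by first, second,
# then third codon base (each cycling U, C, A, G). "*" marks a stop codon.
STANDARD_AAS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
STANDARD = dict(zip(CODONS, STANDARD_AAS))

# Assumption (not from the notes): the vertebrate mitochondrial code
# differs from the standard code at these four codons.
VERT_MITO = dict(STANDARD)
VERT_MITO.update({"UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"})


def translate(mrna, table):
    """Translate an mRNA from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical mRNA: AUG AUA UGA AGA UUU
mrna = "AUGAUAUGAAGAUUU"
print("standard code:       ", translate(mrna, STANDARD))
print("vertebrate mito code:", translate(mrna, VERT_MITO))
```

With the standard table, translation ends at UGA after two residues; with the mitochondrial table, AUA is read as Met, UGA as Trp, and AGA ends translation. Other lineages (e.g., yeast mtDNA) use different reassignments, so the table would need to be swapped accordingly.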