Essential Cell Biology, 5th Edition - Chapter 5: DNA and Chromosomes PDF
Summary
This chapter from Essential Cell Biology, 5th Edition, examines the structure and function of DNA and chromosomes. It explores how genetic information is stored, replicated, and regulated in eukaryotic cells. The chapter also covers the basics of DNA structure and its role in heredity and development.
Full Transcript
CHAPTER FIVE
DNA and Chromosomes

Chapter outline: THE STRUCTURE OF DNA; THE STRUCTURE OF EUKARYOTIC CHROMOSOMES; THE REGULATION OF CHROMOSOME STRUCTURE

Life depends on the ability of cells to store, retrieve, and translate the genetic instructions required to make and maintain a living organism. These instructions are stored within every living cell in its genes—the information-bearing elements that determine the characteristics of a species as a whole and of the individuals within it.

At the beginning of the twentieth century, when genetics emerged as a science, scientists became intrigued by the chemical nature of genes. The information in genes is copied and transmitted from a cell to its daughter cells millions of times during the life of a multicellular organism, and passed from generation to generation through the reproductive cells—eggs and sperm. Genes survive this process of replication and transmission essentially unchanged. What kind of molecule could be capable of such accurate and almost unlimited replication, and also be able to direct the development of an organism and the daily life of a cell? What kind of instructions does the genetic information contain? How are these instructions physically organized so that the enormous amount of information required for the development and maintenance of even the simplest organism can be contained within the tiny space of a cell?

The answers to some of these questions began to emerge in the 1940s, when it was discovered from studies in simple fungi that genetic information consists primarily of instructions for making proteins. As described in the previous chapter, proteins perform most of the cell’s functions: they serve as building blocks for cell structures; they form the enzymes that catalyze the cell’s chemical reactions; they regulate the activity of genes; and they enable cells to move and to communicate with one another.
With hindsight, it is hard to imagine what other type of instructions the genetic information could have contained.

The other crucial advance made in the 1940s was the recognition that deoxyribonucleic acid (DNA) is the carrier of the cell’s genetic information. But the mechanism whereby the information could be copied for transmission from one generation of cells to the next, and how proteins might be specified by instructions in DNA, remained completely mysterious until 1953, when the structure of DNA was determined by James Watson and Francis Crick. The structure immediately revealed how DNA might be copied, or replicated, and it provided the first clues about how a molecule of DNA might encode the instructions for making proteins. Today, the fact that DNA is the genetic material is so fundamental to our understanding of life that it can be difficult to appreciate what an enormous intellectual gap this discovery filled.

In this chapter, we begin by describing the structure of DNA. We see how, despite its chemical simplicity, the structure and chemical properties of DNA make it ideally suited for carrying genetic information. We then consider how genes and other important segments of DNA are arranged in the single, long DNA molecule that forms each chromosome in the cell. Finally, we discuss how eukaryotic cells fold these long DNA molecules into compact chromosomes inside the nucleus. This packing has to be done in an orderly fashion so that the chromosomes can be apportioned correctly between the two daughter cells at each cell division. At the same time, chromosomal packaging must allow DNA to be accessed by the large number of proteins that replicate and repair it, and that determine the activity of the cell’s many genes.

This is the first of five chapters that deal with basic genetic mechanisms—the ways in which the cell maintains and makes use of the genetic information carried in its DNA.
In Chapter 6, we discuss the mechanisms by which the cell accurately replicates and repairs its DNA. In Chapter 7, we consider gene expression—how genes are used to produce RNA and protein molecules. In Chapter 8, we describe how a cell controls gene expression to ensure that each of the many thousands of proteins encoded in its DNA is manufactured at the proper time and place. In Chapter 9, we discuss how present-day genes evolved, and, in Chapter 10, we consider some of the ways that DNA can be experimentally manipulated to study fundamental cell processes.

An enormous amount has been learned about these subjects in the past 60 years. Much less obvious, but equally important, is the fact that our knowledge is very incomplete; thus a great deal still remains to be discovered about how DNA provides the instructions to build living things.

THE STRUCTURE OF DNA

Long before biologists understood the structure of DNA, they had recognized that inherited traits and the genes that determine them were associated with chromosomes. Chromosomes (named from the Greek chroma, “color,” because of their staining properties) were discovered in the nineteenth century as threadlike structures in the nucleus of eukaryotic cells that become visible as the cells begin to divide (Figure 5–1). As biochemical analyses became possible, researchers learned that chromosomes contain both DNA and protein. But which of these components encoded the organism’s genetic information was not immediately clear.

Figure 5–1 Chromosomes become visible as eukaryotic cells prepare to divide. (A) Two adjacent plant cells photographed using a fluorescence microscope. The DNA, which is labeled with a fluorescent dye (DAPI), is packaged into multiple chromosomes; these become visible as distinct structures only when they condense in preparation for cell division, as can be seen in the cell on the left. For clarity, a single chromosome has been shaded (brown) in the dividing cell. The cell on the right, which is not dividing, contains the identical chromosomes, but they cannot be distinguished as individual entities because the DNA is in a much more extended conformation at this phase in the cell’s division cycle. (B) Schematic diagram of the outlines of the two cells and their chromosomes. Scale bar, 10 μm. (A, courtesy of Peter Shaw.)

We now know that the DNA carries the genetic information of the cell and that the protein components of chromosomes function largely to package and control the enormously long DNA molecules. But biologists in the 1940s had difficulty accepting DNA as the genetic material because of the apparent simplicity of its chemistry (see How We Know, pp. 193–195). DNA, after all, is simply a long polymer composed of only four types of nucleotide subunits, which are chemically very similar to one another.

Then, early in the 1950s, Maurice Wilkins and Rosalind Franklin examined DNA using x-ray diffraction analysis, a technique for determining the three-dimensional atomic structure of a molecule (see Panel 4–6, pp. 168–169). Their results provided one of the crucial pieces of evidence that led, in 1953, to Watson and Crick’s model of the double-helical structure of DNA. This structure—in which two strands of DNA are wound around each other to form a helix—immediately suggested how DNA could encode the instructions necessary for life, and how these instructions could be copied and passed along when cells divide. In this section, we examine the structure of DNA and explain in general terms how it is able to store hereditary information.

A DNA Molecule Consists of Two Complementary Chains of Nucleotides

A molecule of deoxyribonucleic acid (DNA) consists of two long polynucleotide chains.
Each chain, or strand, is composed of four types of nucleotide subunits, and the two strands are held together by hydrogen bonds between the base portions of the nucleotides (Figure 5–2).

As we saw in Chapter 2 (Panel 2–7, pp. 78–79), nucleotides are composed of a nitrogen-containing base and a five-carbon sugar, to which a phosphate group is attached. For the nucleotides in DNA, the sugar is deoxyribose (hence the name deoxyribonucleic acid) and the base can be either adenine (A), cytosine (C), guanine (G), or thymine (T). The nucleotides are covalently linked together in a chain through the sugars and phosphates, which form a backbone of alternating sugar–phosphate–sugar–phosphate (see Figure 5–2B). Because only the base differs in each of the four types of subunits, each polynucleotide chain resembles a necklace: a sugar–phosphate backbone strung with four types of tiny beads (the four bases A, C, G, and T). These same symbols (A, C, G, and T) are also commonly used to denote the four different nucleotides—that is, the bases with their attached sugar phosphates.

Figure 5–2 DNA is made of four nucleotide building blocks. (A) Each nucleotide is composed of a sugar phosphate covalently linked to a base—guanine (G) in this figure. (B) The nucleotides are covalently linked together into polynucleotide chains, with a sugar–phosphate backbone from which the bases—adenine, cytosine, guanine, and thymine (A, C, G, and T)—extend. (C) A DNA molecule is composed of two polynucleotide chains (DNA strands) held together by hydrogen bonds between the paired bases. The arrows on the DNA strands indicate the polarities of the two strands, which run antiparallel to each other (with opposite chemical polarities) in the DNA molecule. (D) Although the DNA is shown straightened out in (C), in reality, it is wound into a double helix, as shown here. [Panel titles: (A) building blocks of DNA; (B) DNA strand; (C) double-stranded DNA; (D) DNA double helix. Labels in the figure include the sugar–phosphate backbone, the hydrogen-bonded base pairs, the 5′ and 3′ ends of each strand, and a spacing of 0.34 nm.]

The nucleotide subunits within a DNA strand are held together by phosphodiester bonds that link the 5ʹ end of one sugar with the 3ʹ end of the next (Figure 5–3). Because the ester linkages to the sugar molecules on either side of the bond are different, each DNA strand has a chemical polarity. If we imagine that each nucleotide has a phosphate “knob” and a hydroxyl “hole” (see Figure 5–2A), each strand, formed by interlocking knobs with holes, will have all of its subunits lined up in the same orientation. Moreover, the two ends of the strand can be easily distinguished, as one will have a hole (the 3ʹ hydroxyl) and the other a knob (the 5ʹ phosphate). This polarity in a DNA strand is indicated by referring to one end as the 3ʹ end and the other as the 5ʹ end (see Figure 5–3).

Figure 5–3 The nucleotide subunits within a DNA strand are held together by phosphodiester bonds. These bonds connect one sugar to the next. The chemical differences in the ester linkages—between the 5ʹ carbon of one sugar and the 3ʹ carbon of the other—give rise to the polarity of the resulting DNA strand. For simplicity, only two nucleotides are shown here. [The drawing runs from the 5ʹ end of the chain to the 3ʹ end and labels the bases, the sugars, and the phosphodiester bond.]

The two polynucleotide chains in the DNA double helix are held together by hydrogen-bonding between the bases on the different strands. All the bases are therefore on the inside of the double helix, with the sugar–phosphate backbones on the outside (see Figure 5–2D). The bases do not pair at random, however; A always pairs with T, and G always pairs with C (Figure 5–4). In each case, a bulkier two-ring base (a purine, see Panel 2–7, pp. 78–79) is paired with a single-ring base (a pyrimidine). Each purine–pyrimidine pair is called a base pair, and this complementary base-pairing enables the base pairs to be packed in the energetically most favorable arrangement along the interior of the double helix. In this arrangement, each base pair has the same width, thus holding the sugar–phosphate backbones an equal distance apart along the DNA molecule.

For the members of each base pair to fit together within the double helix, the two strands of the helix must run antiparallel to each other—that is, be oriented with opposite polarities (see Figure 5–2C and D). The antiparallel sugar–phosphate strands then twist around each other to form a double helix containing 10 base pairs per helical turn (Figure 5–5). This twisting also contributes to the energetically favorable conformation of the DNA double helix.

As a consequence of the base-pairing arrangement shown in Figure 5–4, each strand of a DNA double helix contains a sequence of nucleotides that is exactly complementary to the nucleotide sequence of its partner strand—an A always matches a T on the opposite strand, and a C always matches a G. This complementarity is of crucial importance when it comes to both copying and maintaining the DNA structure, as we discuss in Chapter 6. An animated version of the DNA double helix can be seen in Movie 5.1.

The Structure of DNA Provides a Mechanism for Heredity

QUESTION 5–1
Which of the following statements are correct? Explain your answers.
A. A DNA strand has a polarity because its two ends contain different bases.
B. G-C base pairs are more stable than A-T base pairs.
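Because A pairs only with T and G only with C, and because the two strands run antiparallel, the sequence of one strand fully determines the sequence of its partner. The complementarity described above can be made concrete with a minimal Python sketch. The function name `partner_strand` and the example sequence are hypothetical, chosen only for illustration; the sketch complements each base and then reverses the order so that the result is also written 5′ to 3′.

```python
# Pairing rules stated in the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand_5_to_3):
    """Return the complementary strand, also written 5' to 3'.

    The input is one DNA strand read 5' to 3'. Its partner runs
    antiparallel, so the complemented bases are reversed before returning.
    """
    complement = [PAIRS[base] for base in strand_5_to_3.upper()]
    return "".join(reversed(complement))


example = "GCATTA"  # hypothetical six-nucleotide strand
print(example, "pairs with", partner_strand(example))
```

Applying the function twice returns the original sequence, which reflects the point made in the text: each strand carries all the information needed to specify the other.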
The fact that genes encode information that must be copied and transmitted accurately when a cell divides raised two fundamental issues: how can the information for specifying an organism be carried in chemical form, and how can the information be accurately copied? The structure of DNA provides the answer to both questions.

Figure 5–4 The two strands of the DNA double helix are held together by hydrogen bonds between complementary base pairs. (A) Schematic illustration showing how the shapes and chemical structures of the bases allow hydrogen bonds to form efficiently only between A and T and between G and C. The atoms that form the hydrogen bonds between these nucleotides (see Panel 2–3, pp. 70–71) can be brought close together without perturbing the double helix. As shown, two hydrogen bonds form between A and T, whereas three form between G and C. The bases can pair in this way only if the two polynucleotide chains that contain them are antiparallel—that is, oriented in opposite directions. (B) A short section of the double helix viewed from its side. Four base pairs are illustrated; note that they lie perpendicular to the axis of the helix, unlike the schematic shown in (A). As shown in Figure 5–3, the nucleotides are linked together covalently by phosphodiester bonds that connect the 3ʹ-hydroxyl (–OH) group of one sugar and the 5ʹ phosphate (–PO3) attached to the next (see Panel 2–7, pp. 78–79, to review how the carbon atoms in the sugar ring are numbered). This linkage gives each polynucleotide strand a chemical polarity; that is, its two ends are chemically different. The 3ʹ end carries an unlinked –OH group attached to the 3ʹ position on the sugar ring; the 5ʹ end carries a free phosphate group attached to the 5ʹ position on the sugar ring. [Chemical structures omitted. Labels in the figure include the bases, the hydrogen bonds, the sugar–phosphate backbone, the phosphodiester bonds, the 5′ and 3′ ends, and dimensions of 0.34 nm and 1 nm.]

Information is encoded in the order, or sequence, of the nucleotides along each DNA strand. Each base—A, C, T, or G—can be considered a letter in a four-letter alphabet that is used to spell out biological messages (Figure 5–6). Organisms differ from one another because their respective DNA molecules have different nucleotide sequences and, consequently, carry different biological messages. But how is the nucleotide alphabet used to make up messages, and what do they spell out?

Before the structure of DNA was determined, investigators had established that genes contain the instructions for producing proteins. Thus, it was clear that DNA messages must somehow be able to encode proteins. Consideration of the chemical character of proteins makes the problem easier to define. As discussed in Chapter 4, the function of a protein is determined by its three-dimensional structure, which in turn is determined by the sequence of the amino acids in its polypeptide chain. The linear sequence of nucleotides in a gene, therefore, must somehow spell out the linear sequence of amino acids in a protein.

Figure 5–5 A space-filling model shows the conformation of the DNA double helix. The two DNA strands wind around each other to form a right-handed helix (see Figure 4–14) with 10 bases per turn. Shown here are 1.5 turns of the DNA double helix. The coiling of the two strands around each other creates two grooves in the double helix. The wider groove is called the major groove and the smaller one the minor groove. The colors of the atoms are: N, blue; O, red; P, yellow; H, white; and C, black. (See Movie 5.1.) [A dimension of 2 nm is marked on the figure.]

Figure 5–6 Linear messages come in many forms. The languages shown are (A) English, (B) a musical score, (C) Morse code, (D) Japanese, and (E) DNA. [Of the examples, only two survive as text: (A) “molecular biology is...” and (E) TTCGAGCGACCTAACCTATAG.]

The exact correspondence between the 4-letter nucleotide alphabet of DNA and the 20-letter amino acid alphabet of proteins—the genetic code—is not at all obvious from the structure of the DNA molecule. It took more than a decade of clever experiments after the discovery of the double helix to work this code out. In Chapter 7, we describe the genetic code in detail when we discuss gene expression—the process by which the nucleotide sequence of a gene is transcribed into the nucleotide sequence of an RNA molecule—and then, in most cases, translated into the amino acid sequence of a protein (Figure 5–7).

The amount of information in an organism’s DNA is staggering: written out in the four-letter nucleotide alphabet, the nucleotide sequence of a very small protein-coding gene from humans occupies a quarter of a page of text, while the complete human DNA sequence would fill more than 1000 books the size of this one. Herein lies a problem that affects the architecture of all eukaryotic chromosomes: How can all this information be packed neatly into the cell nucleus? In the remainder of this chapter, we discuss the answer to this question.

THE STRUCTURE OF EUKARYOTIC CHROMOSOMES

Large amounts of DNA are required to encode all the information needed to make a single-celled bacterium, and far more DNA is needed to encode the information to make a multicellular organism like you. Each human cell contains about 2 meters (m) of DNA; yet the cell nucleus is only 5–8 μm in diameter. Tucking all this material into such a small space is the equivalent of trying to fold 40 km (24 miles) of extremely fine thread into a tennis ball.
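The figure of about 2 meters can be checked with simple arithmetic. The Python sketch below is a rough estimate under stated assumptions: a spacing of 0.34 nm between adjacent base pairs (the dimension labeled in Figure 5–2), about 3.2 × 10^9 nucleotide pairs per chromosome set (the value this chapter gives for the human genome), and two chromosome sets per cell. The function and constant names are hypothetical.

```python
# Rough estimate of the total length of DNA in one human cell.
# All three inputs are assumptions described in the lead-in.
SPACING_PER_BASE_PAIR_NM = 0.34   # nm between adjacent base pairs
BASE_PAIRS_PER_SET = 3.2e9        # nucleotide pairs in one chromosome set
SETS_PER_CELL = 2                 # one maternal set and one paternal set


def dna_length_m(base_pairs, spacing_nm=SPACING_PER_BASE_PAIR_NM):
    """Length in meters of a stretched-out double helix of `base_pairs` pairs."""
    return base_pairs * spacing_nm * 1e-9  # convert nanometers to meters


total_m = dna_length_m(BASE_PAIRS_PER_SET * SETS_PER_CELL)
print(f"Estimated DNA per cell: {total_m:.1f} m")
```

Under these assumptions the estimate comes out a little above 2 m, consistent with the value quoted in the text. Set against a nucleus only 5–8 μm across, that is a length several hundred thousand times the diameter of its container.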
In eukaryotic cells, very long, double-stranded DNA molecules are packaged into chromosomes. These chromosomes not only fit handily inside the nucleus, but, after they are duplicated, they can be accurately apportioned between the two daughter cells at each cell division. The complex task of packaging DNA is accomplished by specialized proteins that bind to and fold the DNA, generating a series of coils and loops that provide increasingly higher levels of organization and prevent the DNA from becoming a tangled, unmanageable mess. Amazingly, this DNA is folded in a way that allows it to remain accessible to all of the enzymes and other proteins that replicate and repair it, and that cause the expression of its genes.

Figure 5–7 Most genes contain information to make proteins. As we discuss in Chapter 7, protein-coding genes each produce a set of RNA molecules, which then direct the production of a specific protein molecule. Note that for a minority of genes, the final product is the RNA molecule itself, as shown here for gene C. In these cases, gene expression is complete once the nucleotide sequence of the DNA has been transcribed into the nucleotide sequence of its RNA. [The diagram shows genes A–D on a DNA double helix, RNAs A–D, and proteins A, B, and D.]

Bacteria typically carry their genes on a single, circular DNA molecule. This molecule is also associated with proteins that condense the DNA, but these bacterial proteins differ from the ones that package eukaryotic DNA. Although this prokaryotic DNA is called a bacterial “chromosome,” it does not have the same structure as eukaryotic chromosomes, and less is known about how it is packaged. Our discussion of chromosome structure in this chapter will therefore focus entirely on eukaryotic chromosomes.

Eukaryotic DNA Is Packaged into Multiple Chromosomes

In eukaryotes, such as ourselves, nuclear DNA is distributed among a set of different chromosomes. The DNA in a human nucleus, for example, is parceled out into 23 or 24 different types of chromosome, depending on an individual’s sex (males, with their Y chromosome, have an extra type of chromosome that females do not). Each of these chromosomes consists of a single, enormously long, linear DNA molecule associated with proteins that fold and pack the fine thread of DNA into a more compact structure. This complex of DNA and protein is called chromatin. In addition to the proteins involved in packaging the DNA, chromosomes also associate with many other proteins involved in DNA replication, DNA repair, and gene expression.

With the exception of the gametes (sperm and eggs) and highly specialized cells that lack DNA entirely (such as mature red blood cells), human cells each contain two copies of every chromosome, one inherited from the mother and one from the father. The maternal and paternal versions of each chromosome are called homologous chromosomes (homologs). The only nonhomologous chromosome pairs in humans are the sex chromosomes in males, where a Y chromosome is inherited from the father and an X chromosome from the mother. (Females inherit one X chromosome from each parent and have no Y chromosome.) Each full set of human chromosomes contains a total of approximately 3.2 × 10^9 nucleotide pairs of DNA—which together comprise the human genome.

In addition to being different sizes, the different human chromosomes can be distinguished from one another by a variety of techniques. Each chromosome can be “painted” a different color using sets of chromosome-specific DNA molecules coupled to different fluorescent dyes (Figure 5–8A). An earlier and more traditional way of distinguishing one chromosome from another involves staining the chromosomes with dyes that bind to certain types of DNA sequences. These dyes mainly distinguish between DNA that is rich in A-T nucleotide pairs and DNA that is G-C rich, and they produce a predictable pattern of bands along each type of chromosome. The resulting patterns allow each chromosome to be identified and numbered.

Figure 5–8 Each human chromosome can be “painted” a different color to allow its unambiguous identification. The chromosomes shown here were isolated from a cell undergoing nuclear division (mitosis) and are therefore in a highly compact (condensed) state. Chromosome painting is carried out by exposing the chromosomes to a collection of single-stranded DNA molecules that have been coupled to a combination of fluorescent dyes. For example, single-stranded DNA molecules that match sequences in chromosome 1 are labeled with one specific dye combination, those that match sequences in chromosome 2 with another, and so on. Because the labeled DNA can form base pairs (hybridize) only with its specific chromosome (discussed in Chapter 10), each chromosome is differently colored. For such experiments, the chromosomes are treated so that the individual strands of its double-helical DNA partly separate to enable base-pairing with the labeled, single-stranded DNA. (A) Micrograph showing the array of chromosomes as they originally spilled from the lysed cell. (B) The same chromosomes artificially lined up in their numerical order. This arrangement of the full chromosome set is called a karyotype. Scale bar, 10 μm. (Adapted from N. McNeil and T. Ried, Expert Rev. Mol. Med. 2:1–14, 2000. With permission from Cambridge University Press.)

An ordered display of the full set of 46 human chromosomes is called the human karyotype (Figure 5–8B). If parts of a chromosome are lost, or switched between chromosomes, these changes can be detected. Cytogeneticists analyze karyotypes to detect chromosomal abnormalities that are associated with some inherited disorders (Figure 5–9) and with certain types of cancer (as we see in Chapter 20).

Figure 5–9 Abnormal chromosomes are associated with some inherited genetic disorders. (A) Two normal human chromosomes, chromosome 6 and chromosome 4, have been subjected to chromosome painting as described in Figure 5–8. (B) In an individual with a reciprocal chromosomal translocation, a segment of one chromosome has been swapped with a segment from the other. Such chromosomal translocations are a frequent event in cancer cells. (Courtesy of Zhenya Tang and the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research.)

Chromosomes Organize and Carry Genetic Information

The most important function of chromosomes is to carry genes—the functional units of heredity. A gene is often defined as a segment of DNA that contains the instructions for making a particular protein or RNA molecule. Most of the RNA molecules encoded by genes are subsequently used to produce a protein. In some cases, however, the RNA molecule is the final product (see Figure 5–7). Like proteins, these RNA molecules have diverse functions in the cell, including structural, catalytic, and gene regulatory roles, as we discuss in later chapters.

Together, the total genetic information carried by a complete set of the chromosomes present in a cell or organism constitutes its genome. Complete genome sequences have been determined for thousands of organisms, from E. coli to humans. As might be expected, some correlation exists between the complexity of an organism and the number of genes in its genome. For example, the total number of genes is about 500 for the simplest bacterium and about 24,000 for humans. Bacteria and some single-celled eukaryotes, including the budding yeast S. cerevisiae, have especially compact genomes: the DNA molecules that make up their chromosomes are little more than strings of closely packed genes (Figure 5–10). However, chromosomes from many eukaryotes—including humans—contain, in addition to genes and the specific nucleotide sequences required for normal gene expression, a large excess of interspersed DNA (Figure 5–11). This extra DNA is sometimes erroneously called “junk DNA,” because its usefulness to the cell has not yet been demonstrated. Although this spare DNA does not code for protein, much of it may serve some other biological function. Comparisons of the genome sequences from many different species reveal that small portions of this extra DNA are highly conserved among related species, suggesting their importance for these organisms.

Figure 5–10 In yeast, genes are closely packed along chromosomes. This figure shows a small region of the DNA double helix in one chromosome from the budding yeast S. cerevisiae. The S. cerevisiae genome contains about 12.5 million nucleotide pairs and 6600 genes—spread across 16 chromosomes. Note that, for each gene, only one of the two DNA strands actually encodes the information to make an RNA molecule. This coding region can fall on either strand, as indicated by the light red bars. However, each “gene” is considered to include both the “coding strand” and its complement. The high density of genes is characteristic of S. cerevisiae. [The segment of double-stranded DNA shown comprises 0.5% of the DNA of the yeast genome; scale bar, 10,000 nucleotide pairs.]

Figure 5–11 In many eukaryotes, genes include an excess of interspersed, noncoding DNA. Presented here is the nucleotide sequence of the human β-globin gene. This gene carries the information that specifies the amino acid sequence of one of the two types of subunits found in hemoglobin, a protein that carries oxygen in the blood. Only the sequence of the coding strand is shown here; the noncoding strand of the double helix carries the complementary sequence. Starting from its 5′ end, such a sequence is read from left to right, like any piece of English text. The segments of the DNA sequence that encode the amino acid sequence of β-globin are highlighted in yellow. We will see in Chapter 7 how this information is transcribed and translated to produce a full-length β-globin protein.

In general, the more complex an organism, the larger is its genome. But this relationship does not always hold true. The human genome, for example, is 200 times larger than that of the yeast S. cerevisiae, but 30 times smaller than that of some plants and at least 60 times smaller than some species of amoeba (see Figure 1–41). Furthermore, how the DNA is apportioned over chromosomes also differs from one species to another. Humans have a total of 46 chromosomes (including both maternal and paternal sets), but a species of small deer has only 7, while some carp species have more than 100. Even closely related species with similar genome sizes can have very different chromosome numbers and sizes (Figure 5–12). Thus, although gene number is roughly correlated with species complexity, there is no simple relationship between gene number, chromosome number, and total genome size. The genomes and chromosomes of modern species have each been shaped by a unique history of seemingly random genetic events, acted on by specific selection pressures, as we discuss in Chapter 9.

Figure 5–12 Two closely related species can have similar genome sizes but very different chromosome numbers. In the evolution of the Indian muntjac deer, chromosomes that were initially separate, and that remain separate in the Chinese species, fused without having a major effect on the number of genes—or the animal. [The panels show the Chinese muntjac and the Indian muntjac.] (Image left, courtesy of Deborah Carreno, Natural Wonders Photography; image right, courtesy of Beatrice Bourgery.)

Specialized DNA Sequences Are Required for DNA Replication and Chromosome Segregation

To form a functional chromosome, a DNA molecule must do more than simply carry genes: it must be able to be replicated, and the replicated copies must be separated and partitioned equally and reliably into the two daughter cells at each cell division. These processes occur through an ordered series of events, known collectively as the cell cycle. This cycle of cell growth and division is summarized—very briefly—in Figure 5–13 and will be discussed in detail in Chapter 18. Only two broad stages of the cell cycle need concern us in this chapter: interphase, when chromosomes are duplicated, and mitosis, the much more brief stage when the duplicated chromosomes are distributed, or segregated, to the two daughter nuclei.

During interphase, chromosomes are extended as long, thin, tangled threads of DNA in the nucleus and cannot be easily distinguished in the light microscope (see Figure 5–1). We refer to chromosomes in this extended state as interphase chromosomes. It is during interphase that DNA replication takes place. As we discuss in Chapter 6, two specialized DNA sequences, found in all eukaryotes, ensure that this process occurs efficiently. One type of nucleotide sequence, called a replication origin, is where DNA replication begins; eukaryotic chromosomes contain many replication origins to allow the long DNA molecules to be replicated rapidly (Figure 5–14). Another DNA sequence forms the telomeres that mark the ends of each chromosome. Telomeres contain repeated nucleotide sequences that are required for the ends of chromosomes to be fully replicated. They also serve as a protective cap that keeps the chromosome tips from being mistaken by the cell as broken DNA in need of repair.
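To see why many replication origins let a long DNA molecule be copied rapidly, a back-of-the-envelope estimate helps. The Python sketch below is a simplified model, not a description of how cells actually schedule replication: it assumes that each origin sits at the center of an equal-sized segment of the chromosome, that all origins start at once, and that replication proceeds in both directions from each origin at a constant speed. The chromosome length (150 million nucleotide pairs) and the speed (100 nucleotide pairs per second) are illustrative assumptions, not values given in this chapter, and the function name is hypothetical.

```python
def replication_time_hours(length_bp, n_origins, speed_bp_per_s):
    """Hours needed to copy a linear DNA molecule in a simplified model.

    Assumes evenly spaced origins that all start together, with
    replication moving outward in both directions from each origin.
    """
    if n_origins < 1:
        raise ValueError("at least one origin is required")
    # Each origin copies length_bp / n_origins pairs, half in each direction.
    seconds = length_bp / (2 * n_origins * speed_bp_per_s)
    return seconds / 3600


CHROMOSOME_BP = 150e6   # assumed chromosome length, nucleotide pairs
SPEED_BP_PER_S = 100    # assumed replication speed, nucleotide pairs per second

for origins in (1, 10, 100, 1000):
    hours = replication_time_hours(CHROMOSOME_BP, origins, SPEED_BP_PER_S)
    print(f"{origins:>5} origins: {hours:8.2f} hours")
```

In this model the time falls in direct proportion to the number of origins, so a chromosome that would take more than a week to copy from a single origin could be copied in well under an hour from a thousand. The specific numbers depend entirely on the assumed inputs.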
Eukaryotic chromosomes also contain a third type of specialized DNA sequence, called the centromere, that allows duplicated chromosomes to be separated during M phase (see Figure 5–14). During this stage of the cell cycle, the DNA coils up, adopting a more and more compact structure, ultimately forming highly compacted, or condensed, mitotic chromosomes (Figure 5–15). This is the state in which the duplicated chromosomes can be most easily visualized (see Figure 5–1). Once the chromosomes have condensed, the centromere allows the mitotic spindle to attach to each duplicated chromosome in a way that directs one copy of each chromosome to be segregated to each of the two daughter cells (see Figure 5–13). We describe the central role that centromeres play in cell division in Chapter 18.

Figure 5–13 The duplication and segregation of chromosomes occur through an ordered cell cycle in proliferating cells. During interphase, the cell expresses many of its genes, and, during part of this phase, it duplicates its chromosomes. Once chromosome duplication is complete, the cell can enter M phase, during which nuclear division, or mitosis, occurs. In mitosis, the duplicated chromosomes condense, gene expression largely ceases, the nuclear envelope breaks down, and the mitotic spindle forms from microtubules and other proteins. The condensed chromosomes are then captured by the mitotic spindle, one complete set is pulled to each end of the cell, and a nuclear envelope forms around each chromosome set. In the final step of M phase, the cell divides to produce two daughter cells. Only two different chromosomes are shown here for simplicity.

Figure 5–14 Three DNA sequence elements are needed to produce a eukaryotic chromosome that can be duplicated and then segregated at mitosis. Each chromosome has multiple origins of replication, one centromere, and two telomeres. The sequence of events that a typical chromosome follows during the cell cycle is shown schematically. The DNA replicates in interphase, beginning at the origins of replication and proceeding bidirectionally from each origin along the chromosome. In M phase, the centromere attaches the compact, duplicated chromosomes to the mitotic spindle so that one copy will be distributed to each daughter cell when the cell divides. Prior to cell division, the centromere also helps to hold the duplicated chromosomes together until they are ready to be pulled apart. Telomeres contain DNA sequences that allow for the complete replication of chromosome ends.

Interphase Chromosomes Are Not Randomly Distributed Within the Nucleus

Interphase chromosomes are much longer and finer than mitotic chromosomes. They are nevertheless organized within the nucleus in several ways. First, although interphase chromosomes are constantly undergoing dynamic rearrangements, each tends to occupy a particular region, or territory, of the interphase nucleus (Figure 5–16). This loose organization prevents interphase chromosomes from becoming extensively entangled, like spaghetti in a bowl. In addition, some chromosomal regions are physically attached to particular sites on the nuclear envelope (the pair of concentric membranes that surround the nucleus) or to the underlying nuclear lamina, the protein meshwork that supports the envelope (discussed in Chapter 17).
These attachments also help interphase chromosomes remain within their distinct territories.

The most obvious example of chromosomal organization in the interphase nucleus is the nucleolus, a structure large enough to be seen in the light microscope (Figure 5–17A). During interphase, the parts of different chromosomes that carry genes encoding ribosomal RNAs come together to form the nucleolus. In human cells, several hundred copies of these genes are distributed in 10 clusters, located near the tips of five different chromosome pairs (Figure 5–17B). In the nucleolus, ribosomal RNAs are synthesized and combine with proteins to form ribosomes, the cell’s protein-synthesizing machines. As we discuss in Chapter 7, ribosomal RNAs play both structural and catalytic roles in the ribosome.

Figure 5–15 A typical duplicated mitotic chromosome is highly compact. Because DNA is replicated during interphase, each mitotic chromosome contains two identical duplicated DNA molecules (see Figure 5–14). Each of these very long DNA molecules, with its associated proteins, is called a chromatid; as soon as the two sister chromatids separate, they are considered individual chromosomes. (A) A scanning electron micrograph of a mitotic chromosome. The two chromatids are tightly joined together. The constricted region reveals the position of the centromere. (B) A cartoon representation of a mitotic chromosome. (Scale bar: 1 μm. A, courtesy of Terry D. Allen.)

Figure 5–16 Interphase chromosomes occupy their own distinct territories within the nucleus. DNA probes coupled with different fluorescent markers are used to paint individual interphase chromosomes in a human cell. (A) Viewed in a fluorescence microscope, the nucleus is seen to be filled with a patchwork of discrete colors. (B) To highlight their distinct locations, three sets of chromosomes are singled out: chromosomes 3, 5, and 11. Note that pairs of homologous chromosomes, such as the two copies of chromosome 3, are not generally located in the same position. (Scale bar: 10 µm. Adapted from M.R. Hübner and D.L. Spector, Annu. Rev. Biophys. 39:471–489, 2010.)

The DNA in Chromosomes Is Always Highly Condensed

As we have seen, all eukaryotic cells, whether in interphase or mitosis, package their DNA tightly into chromosomes. Human chromosome 22, for example, contains about 48 million nucleotide pairs; stretched out end-to-end, its DNA would extend about 1.5 cm. Yet, during mitosis, chromosome 22 measures only about 2 μm in length; that is, nearly 10,000 times more compact than the DNA would be if it were extended to its full length. This remarkable feat of compression is performed by proteins that coil and fold the DNA into higher and higher levels of organization. Although the DNA of interphase chromosomes is packed tightly into the nucleus, it is about 20 times less condensed than that of mitotic chromosomes (Figure 5–18).

In the next sections, we introduce the specialized proteins that make this compression possible. Bear in mind, though, that chromosome structure is dynamic. Not only do chromosomes condense and decondense during the cell cycle, but chromosome packaging must be flexible enough to allow rapid, on-demand access to different regions of the interphase chromosome, unpacking enough to allow protein complexes access to specific, localized nucleotide sequences for DNA replication, DNA repair, or gene expression.

Nucleosomes Are the Basic Units of Eukaryotic Chromosome Structure

The proteins that bind to DNA to form eukaryotic chromosomes are traditionally divided into two general classes: the histones and the nonhistone chromosomal proteins. Histones are present in enormous quantities (more than 60 million molecules of several different types in each human cell), and their total mass in chromosomes is about equal to that of the DNA itself.
Nonhistone chromosomal proteins are also present in large numbers; they include hundreds of different chromatin-associated proteins. In contrast, only a handful of different histone proteins are present in eukaryotic cells. The complex of both classes of protein with nuclear DNA is called chromatin.

Figure 5–17 The nucleolus is the most prominent structure in the interphase nucleus. (A) Electron micrograph of a thin section through the nucleus of a human fibroblast. The nucleus is surrounded by the nuclear envelope. Inside the nucleus, the chromatin appears as a diffuse speckled mass; regions that are especially dense are called heterochromatin (dark staining). Heterochromatin contains few genes and is located mainly around the periphery of the nucleus, immediately under the nuclear envelope. The large, dark region within the nucleus is the nucleolus, which contains the genes for ribosomal RNAs. (B) Schematic illustration showing how ribosomal RNA genes, which are clustered near the tips of five different human chromosomes (13, 14, 15, 21, and 22), come together to form the nucleolus, which is a biochemical subcompartment produced by the aggregation of a set of macromolecules: DNA, RNAs, and proteins (see Figure 4–54). (Panel B label: 10 chromosomes each contribute a loop containing rRNA genes to the nucleolus. Scale bar: 2 μm. A, courtesy of E.G. Jordan and J. McGovern.)

Figure 5–18 DNA in interphase chromosomes is less compact than in mitotic chromosomes. (A) An electron micrograph showing an enormous tangle of chromatin (DNA with its associated proteins) spilling out of a lysed interphase nucleus. (B) For comparison, a compact, human mitotic chromosome is shown at the same scale. (Scale bar: 5 μm. A, courtesy of Victoria Foe; B, courtesy of Terry D. Allen.)

Histones are responsible for the first and most fundamental level of chromatin packing: the formation of the nucleosome. Nucleosomes convert the DNA molecules in an interphase nucleus into a chromatin fiber that is approximately one-third the length of the initial DNA. These chromatin fibers, when examined with an electron microscope, contain clusters of closely packed nucleosomes (Figure 5–19A). If this chromatin is then subjected to treatments that cause it to unfold partially, it can then be seen in the electron microscope as a series of “beads on a string” (Figure 5–19B). The string is DNA, and each bead is a nucleosome core particle, which consists of DNA wound around a core of histone proteins.

To determine the structure of the nucleosome core particle, investigators treated chromatin in its unfolded, “beads-on-a-string” form with enzymes called nucleases, which cut the DNA by breaking the phosphodiester bonds between nucleotides. When this nuclease digestion is carried out for a short time, only the exposed DNA between the core particles (the linker DNA) will be cleaved, allowing the core particles to be isolated. An individual nucleosome core particle consists of a complex of eight histone proteins (two molecules each of histones H2A, H2B, H3, and H4) along with a segment of double-stranded DNA, 147 nucleotide pairs long, that winds around this histone octamer (Figure 5–20). The high-resolution structure of the nucleosome core particle was solved in 1997, revealing in atomic detail the disc-shaped histone octamer around which the DNA is tightly wrapped, making 1.7 turns in a left-handed coil (Figure 5–21).

The linker DNA between each nucleosome core particle can vary in length from a few nucleotide pairs up to about 80. Technically speaking, a “nucleosome” consists of a nucleosome core particle plus one of its adjacent DNA linkers, as shown in Figure 5–20; however, the term is often used to refer to the nucleosome core particle itself.
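The packing figures quoted in this section can be checked with a few lines of arithmetic. The sketch below is a back-of-the-envelope calculation under stated assumptions: the rise of 0.34 nm per nucleotide pair is a standard value for B-form DNA that is not given in this section, the 200-nucleotide-pair repeat is taken from the approximate label in Figure 5–20, and the function names are hypothetical.

```python
# Back-of-the-envelope packing arithmetic for human chromosome 22.

NM_PER_BP = 0.34          # assumption: rise per nucleotide pair of B-form DNA (nm)
CHR22_BP = 48_000_000     # from the text: about 48 million nucleotide pairs
MITOTIC_LENGTH_UM = 2.0   # from the text: mitotic chromosome 22 is about 2 um long
CORE_DNA_BP = 147         # from the text: DNA wound around one histone octamer
REPEAT_BP = 200           # assumption: ~200 nucleotide pairs per nucleosome (Figure 5-20)


def extended_length_um(n_bp: int) -> float:
    """Length of a fully extended double helix, in micrometers."""
    return n_bp * NM_PER_BP / 1000.0   # 1000 nm per micrometer


def compaction_ratio(n_bp: int, packed_length_um: float) -> float:
    """How many times shorter the packed form is than the extended DNA."""
    return extended_length_um(n_bp) / packed_length_um


def nucleosome_count(n_bp: int, repeat_bp: int = REPEAT_BP) -> int:
    """Approximate number of nucleosomes if one forms every repeat_bp pairs."""
    return n_bp // repeat_bp


def average_linker_bp(repeat_bp: int = REPEAT_BP, core_bp: int = CORE_DNA_BP) -> int:
    """Average linker length implied by the repeat and core lengths."""
    return repeat_bp - core_bp


length_um = extended_length_um(CHR22_BP)
print(f"extended length: {length_um / 10_000:.2f} cm")   # 10,000 um per cm
print(f"compaction in mitosis: about {compaction_ratio(CHR22_BP, MITOTIC_LENGTH_UM):,.0f}-fold")
print(f"nucleosomes on chromosome 22: about {nucleosome_count(CHR22_BP):,}")
print(f"average linker: {average_linker_bp()} nucleotide pairs")
print(f"DNA in one core particle: about {CORE_DNA_BP * NM_PER_BP:.0f} nm")
```

With these inputs the extended length comes out near 1.6 cm and the mitotic compaction near 8000-fold, the same order of magnitude as the rounded values of about 1.5 cm and nearly 10,000-fold quoted in the text. The implied average linker of 53 nucleotide pairs falls inside the stated range of a few nucleotide pairs up to about 80.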
Figure 5–19 Nucleosomes can be seen in the electron microscope. (A) Chromatin isolated directly from an interphase nucleus can appear in the electron microscope as a chromatin fiber, composed of packed nucleosomes. (B) Another electron micrograph shows a length of a chromatin fiber that has been experimentally unpacked, or decondensed, after isolation to show the “beads-on-a-string” appearance of the nucleosomes. (Scale bar: 50 nm. A, courtesy of Barbara Hamkalo; B, courtesy of Victoria Foe.)

Figure 5–20 Nucleosomes contain DNA wrapped around a protein core of eight histone molecules. In a test tube, the nucleosome core particle can be released from chromatin by digestion of the linker DNA with a nuclease, which cleaves the exposed linker DNA but not the DNA wound tightly around the nucleosome core. When the DNA around each isolated nucleosome core particle is released, its length is found to be 147 nucleotide pairs; this DNA wraps around the histone octamer that forms the nucleosome core nearly twice. (Figure labels: in the “beads-on-a-string” form of chromatin, each nucleosome includes ~200 nucleotide pairs of DNA; nuclease digests linker DNA; released nucleosome core particle, 11 nm.)

All four of the histones that make up the octamer are relatively small proteins with a high proportion of positively charged amino acids (lysine and arginine). The positive charges help the histones bind tightly to the negatively charged sugar–phosphate backbone of DNA. These numerous electrostatic interactions explain in part why DNA of virtually any sequence can bind to a histone octamer. Each of the histones in the octamer also has a long, unstructured N-terminal amino acid “tail” that extends out from the nucleosome core particle (see the H3 tail in Figure 5–21).
These histone tails are subject to several types of reversible, covalent chemical modifications that control many aspects of chromatin structure.

The histones that form the nucleosome core are among the most highly conserved of all known eukaryotic proteins: there are only two differences between the amino acid sequences of histone H4 from peas and cows, for example. This extreme evolutionary conservation reflects the vital role of histones in controlling eukaryotic chromosome structure.

(Figure 5–20 labels, continued: dissociation with high concentration of salt separates the histone octamer from the 147-nucleotide-pair DNA double helix; further dissociation releases histones H2A, H2B, H3, and H4.)

Figure 5–21 The structure of the nucleosome core particle, as determined by x-ray diffraction analysis, reveals how DNA is tightly wrapped around a disc-shaped histone octamer. Two views of a nucleosome core particle are shown here, one face-on and one from the edge. The two strands of the DNA double helix are shown in gray. A portion of an H3 histone tail (green) can be seen extending from the nucleosome core particle, but the tails of the other histones have been truncated. (From K. Luger et al., Nature 389:251–260, 1997.)

Chromosome Packing Occurs on Multiple Levels

Although long strings of nucleosomes form on most chromosomal DNA, chromatin in the living cell rarely adopts the extended beads-on-a-string form seen in Figure 5–19B. Instead, the nucleosomes are further packed on top of one another to generate a more compact structure, such as the chromatin fiber shown in Figure 5–19A and Movie 5.2. This additional packing of nucleosomes into a chromatin fiber depends on a fifth histone called histone H1, which is thought to pull adjacent nucleosomes together into a regular repeating array. This “linker” histone changes the path the DNA takes as it exits the nucleosome core, allowing it to form a more condensed chromatin fiber.

We saw earlier that, during mitosis, chromatin becomes so highly condensed that individual chromosomes can be seen in the light microscope. How is a chromatin fiber folded to produce mitotic chromosomes? Although the answer is not yet known in detail, it is known that specialized nonhistone chromosomal proteins fold the chromatin into a series of loops (Figure 5–22). These loops are further condensed to produce the interphase chromosome. Finally, this compact string of loops is thought to undergo at least one more level of packing to form the mitotic chromosome (Figure 5–23).

Figure 5–22 The chromatin in human chromosomes is folded into looped domains. These loops are established by special nonhistone chromosomal proteins that bind to specific DNA sequences, creating a clamp at the base of each loop. (Figure labels: looped domain; matching specific DNA sequences; chromosome loop-forming clamp proteins.)

QUESTION 5–2 Assuming that the histone octamer (shown in Figure 5–20) forms a cylinder 9 nm in diameter and 5 nm in height and that the human genome forms 32 million nucleosomes, what volume of the nucleus (6 μm in diameter) is occupied by histone octamers? (Volume of a cylinder is πr²h; volume of a sphere is 4/3 πr³.) What fraction of the total volume of the nucleus do the histone octamers occupy? How does this compare with the volume of the nucleus occupied by human DNA?

Figure 5–23 DNA packing occurs on several levels in chromosomes. This schematic drawing shows some of the levels thought to give rise to the highly condensed mitotic chromosome. Both histone H1 and a set of specialized nonhistone chromosomal proteins are known to help drive these condensations, including the loop-forming clamp proteins and the abundant nonhistone protein condensin (see Figure 18–18). However, the actual structures are still uncertain. (Levels shown, with the dimensions labeled in the figure: short region of DNA double helix, 2 nm; “beads-on-a-string” form of chromatin, 11 nm; chromatin fiber of packed nucleosomes, 30 nm; chromatin fiber folded into loops, 700 nm; entire mitotic chromosome, 1400 nm. Net result: each DNA molecule has been packaged into a mitotic chromosome that is 10,000-fold shorter than its fully extended length.)

THE REGULATION OF CHROMOSOME STRUCTURE

So far, we have discussed how DNA is packed tightly into chromatin. We now turn to the question of how this packaging can be adjusted to allow rapid access to the underlying DNA. The DNA in cells carries enormous amounts of coded information, and cells must be able to retrieve this information as needed.

In this section, we discuss how a cell can alter its chromatin structure to expose localized regions of DNA and allow access to specific proteins and protein complexes, particularly those involved in gene expression and in DNA replication and repair. We then discuss how chromatin structure is established and maintained, and how a cell can pass on some forms of this structure to its descendants, helping different cell types to sustain their identity. Although many of the details remain to be deciphered, the regulation and inheritance of chromatin structure play crucial roles in the development of eukaryotic organisms.

Changes in Nucleosome Structure Allow Access to DNA

Eukaryotic cells have several ways to rapidly adjust the local structure of their chromatin. One way takes advantage of a set of ATP-dependent chromatin-remodeling complexes. These protein machines use the energy of ATP hydrolysis to change the position of the DNA wrapped around nucleosomes (Figure 5–24).
By interacting with both the histone octamer and the DNA wrapped around it, chromatin-remodeling complexes can locally alter the arrangement of the nucleosomes, rendering the DNA more accessible (or less accessible) to other proteins in the cell. During mitosis, many of these complexes are inactivated, which may help mitotic chromosomes maintain their tightly packed structure.

Figure 5–24 Chromatin-remodeling complexes locally reposition the DNA wrapped around nucleosomes. (A) The complexes use energy derived from ATP hydrolysis to loosen the nucleosomal DNA and push it along the histone octamer. In this way, the enzyme can expose or hide a sequence of DNA, controlling its availability to other DNA-binding proteins. The blue stripes have been added to show how the DNA shifts its position. Many cycles of ATP hydrolysis are required to produce such a shift. (B) The structure of a chromatin-remodeling complex, showing how the enzyme cradles a nucleosome core particle, including a histone octamer (orange) and the DNA wrapped around it (light green). This large complex, purified from yeast, contains 15 subunits, including one that hydrolyzes ATP and four that recognize specific covalently modified histones. (Scale bar: 10 nm. B, adapted from A.E. Leschziner et al., Proc. Natl. Acad. Sci. USA 104:4913–4918, 2007.)

Another way of altering chromatin structure relies on the reversible chemical modification of histones, catalyzed by a large number of different histone-modifying enzymes. The tails of all four of the core histones are particularly subject to these covalent modifications, which include the addition (and removal) of acetyl, phosphate, or methyl groups (Figure 5–25A). These and other modifications can have important consequences for the packing of the chromatin fiber. Acetylation of lysines, for instance, can reduce the affinity of the tails for adjacent nucleosomes, thereby loosening chromatin structure and allowing access to particular nuclear proteins.

Most importantly, however, these modifications generally serve as docking sites on the histone tails for a variety of regulatory proteins. Different patterns of modifications attract specific sets of nonhistone chromosomal proteins to a particular stretch of chromatin. Some of these proteins promote chromatin condensation, whereas others promote chromatin expansion and thus facilitate access to the DNA. Specific combinations of tail modifications, and the proteins that bind to them, have different functional outcomes for the cell: one pattern, for example, might mark a particular stretch of chromatin as newly replicated; another might indicate that the genes in that stretch of chromatin are being actively expressed; still others are associated with genes that are silenced (Figure 5–25B).

Both ATP-dependent chromatin-remodeling complexes and histone-modifying enzymes are tightly regulated. These enzymes are often brought to particular chromatin regions by interactions with proteins that bind to a specific nucleotide sequence in the DNA, or in an RNA transcribed from this DNA (a topic we return to in Chapter 8). Histone-modifying enzymes work in concert with the chromatin-remodeling complexes to condense and relax stretches of chromatin, allowing local chromatin structure to change rapidly according to the needs of the cell.

Figure 5–25 The pattern of modification of histone tails can determine how a stretch of chromatin is handled by the cell. (A) Schematic drawing showing the positions of the histone tails that extend from each nucleosome core particle. Each histone can be modified by the covalent attachment of a number of different chemical groups, mainly to the tails. The tail of histone H3, for example, can receive acetyl groups (Ac), methyl groups (M), or phosphate groups (P). The numbers denote the positions of the modified amino acids in the histone tail, with each amino acid designated by its one-letter code. Note that some amino acids, such as the lysine (K) at positions 9, 14, 23, and 27, can be modified by acetylation or methylation (but not by both at once). Lysines, in addition, can be modified with either one, two, or three methyl groups; trimethylation, for example, is shown in (B). Note that histone H3 contains 135 amino acids, most of which are in its globular portion (represented by the wedge); most modifications occur on the N-terminal tail, for which 36 amino acids are shown. (B) Different combinations of histone tail modifications can confer a specific meaning on the stretch of chromatin on which they occur, as indicated. Only a few of these functional outcomes are known. (Panel A shows the H3 tail sequence ARTKQTARKSTGGKAPRKQLATKAARKSAPATGGVK, with positions 2, 4, 9, 10, 14, 17, 18, 23, 26, 27, 28, and 36 numbered. Panel B pairs H3 tail modifications with functional outcomes: trimethylation of K9 with heterochromatin formation and gene silencing; trimethylation of K4 together with acetylation of K9 with gene expression.)

Interphase Chromosomes Contain both Highly