DNA, Chromosomes, and Genomes PDF
Summary
This document provides an overview of basic genetic mechanisms, including the structure, function, and evolution of DNA, chromosomes, and genomes. It details how genetic information is stored, retrieved, and translated, and highlights the crucial role of proteins in cellular functions. The document also touches on the historical context of genetic discovery, from the identification of DNA as the likely carrier of genetic information to the mapping of genomes.
BASIC GENETIC MECHANISMS

CHAPTER 4
DNA, Chromosomes, and Genomes

IN THIS CHAPTER
THE STRUCTURE AND FUNCTION OF DNA
CHROMOSOMAL DNA AND ITS PACKAGING IN THE CHROMATIN FIBER
CHROMATIN STRUCTURE AND FUNCTION
THE GLOBAL STRUCTURE OF CHROMOSOMES
HOW GENOMES EVOLVE

Life depends on the ability of cells to store, retrieve, and translate the genetic instructions required to make and maintain a living organism. This hereditary information is passed on from a cell to its daughter cells at cell division, and from one generation of an organism to the next through the organism’s reproductive cells. The instructions are stored within every living cell as its genes, the information-containing elements that determine the characteristics of a species as a whole and of the individuals within it.

As soon as genetics emerged as a science at the beginning of the twentieth century, scientists became intrigued by the chemical structure of genes. The information in genes is copied and transmitted from cell to daughter cell millions of times during the life of a multicellular organism, and it survives the process essentially unchanged. What form of molecule could be capable of such accurate and almost unlimited replication and also be able to exert precise control, directing multicellular development as well as the daily life of every cell? What kind of instructions does the genetic information contain? And how can the enormous amount of information required for the development and maintenance of an organism fit within the tiny space of a cell?

The answers to several of these questions began to emerge in the 1940s. At this time researchers discovered, from studies in simple fungi, that genetic information consists largely of instructions for making proteins. Proteins are phenomenally versatile macromolecules that perform most cell functions.
As we saw in Chapter 3, they serve as building blocks for cell structures and form the enzymes that catalyze most of the cell’s chemical reactions. They also regulate gene expression (Chapter 7), and they enable cells to communicate with each other (Chapter 15) and to move (Chapter 16). The properties and functions of cells and organisms are determined to a great extent by the proteins that they are able to make.

Figure 4–1 Chromosomes in cells. (A) Two adjacent plant cells photographed through a light microscope. The DNA has been stained with a fluorescent dye (DAPI) that binds to it. The DNA is present in chromosomes, which become visible as distinct structures in the light microscope only when they become compact, sausage-shaped structures in preparation for cell division, as shown on the left. The cell on the right, which is not dividing, contains identical chromosomes, but they cannot be clearly distinguished at this phase in the cell’s life cycle, because they are in a more extended conformation. (B) Schematic diagram of the outlines of the two cells along with their chromosomes. (A, courtesy of Peter Shaw.)

Painstaking observations of cells and embryos in the late nineteenth century had led to the recognition that the hereditary information is carried on chromosomes, threadlike structures in the nucleus of a eukaryotic cell that become visible by light microscopy as the cell begins to divide (Figure 4–1). Later, when biochemical analysis became possible, chromosomes were found to consist of deoxyribonucleic acid (DNA) and protein, with both being present in roughly the same amounts. For many decades, the DNA was thought to be merely a structural element. However, the other crucial advance made in the 1940s was the identification of DNA as the likely carrier of genetic information.
This breakthrough in our understanding of cells came from studies of inheritance in bacteria (Figure 4–2). But still, as the 1950s began, both how proteins could be specified by instructions in the DNA and how this information might be copied for transmission from cell to cell seemed completely mysterious. The puzzle was suddenly solved in 1953, when James Watson and Francis Crick derived the mechanism from their model of DNA structure. As outlined in Chapter 1, the determination of the double-helical structure of DNA immediately solved the problem of how the information in this molecule might be copied, or replicated. It also provided the first clues as to how a molecule of DNA might use the sequence of its subunits to encode the instructions for making proteins. Today, the fact that DNA is the genetic material is so fundamental to biological thought that it is difficult to appreciate the enormous intellectual gap that was filled by this breakthrough discovery.

We begin this chapter by describing the structure of DNA. We see how, despite its chemical simplicity, the structure and chemical properties of DNA make it ideally suited as the raw material of genes. We then consider how the many proteins in chromosomes arrange and package this DNA. The packing has to be done in an orderly fashion so that the chromosomes can be replicated and apportioned correctly between the two daughter cells at each cell division. And it must also allow access to chromosomal DNA, both for the enzymes that repair DNA damage and for the specialized proteins that direct the expression of its many genes.

In the past two decades, there has been a revolution in our ability to determine the exact order of subunits in DNA molecules. As a result, we now know the sequence of the 3.2 billion nucleotide pairs that provide the information for producing a human adult from a fertilized egg, as well as having the DNA sequences for thousands of other organisms. Detailed analyses of these sequences are providing exciting insights into the process of evolution, and it is with this subject that the chapter ends.

Figure 4–2 The first experimental demonstration that DNA is the genetic material. These experiments, carried out in the 1920s (A) and 1940s (B), showed that adding purified DNA to a bacterium changed the bacterium’s properties and that this change was faithfully passed on to subsequent generations. Two closely related strains of the bacterium Streptococcus pneumoniae differ from each other in both their appearance under the microscope and their pathogenicity. One strain appears smooth (S) and causes death when injected into mice, and the other appears rough (R) and is nonlethal. (A) An initial experiment shows that some substance present in the S strain can change (or transform) the R strain into the S strain and that this change is inherited by subsequent generations of bacteria. Conclusion shown in the panel: molecules that can carry heritable information are present in S strain cells. (B) This experiment, in which the R strain has been incubated with various classes of biological molecules purified from the S strain (RNA, protein, DNA, lipid, carbohydrate), identifies the active substance as DNA. Conclusion shown in the panel: the molecule that carries the heritable information is DNA.
This is the first of four chapters that deal with basic genetic mechanisms: the ways in which the cell maintains, replicates, and expresses the genetic information carried in its DNA. In the next chapter (Chapter 5), we shall discuss the mechanisms by which the cell accurately replicates and repairs DNA; we also describe how DNA sequences can be rearranged through the process of genetic recombination. Gene expression, the process through which the information encoded in DNA is interpreted by the cell to guide the synthesis of proteins, is the main topic of Chapter 6. In Chapter 7, we describe how this gene expression is controlled by the cell to ensure that each of the many thousands of proteins and RNA molecules encrypted in its DNA is manufactured only at the proper time and place in the life of a cell.

THE STRUCTURE AND FUNCTION OF DNA

Biologists in the 1940s had difficulty in conceiving how DNA could be the genetic material. The molecule seemed too simple: a long polymer composed of only four types of nucleotide subunits, which resemble one another chemically. Early in the 1950s, DNA was examined by x-ray diffraction analysis, a technique for determining the three-dimensional atomic structure of a molecule (discussed in Chapter 8). The early x-ray diffraction results indicated that DNA was composed of two strands of the polymer wound into a helix. The observation that DNA was double-stranded provided one of the major clues that led to the Watson–Crick model for DNA structure that, as soon as it was proposed in 1953, made DNA’s potential for replication and information storage apparent.

A DNA Molecule Consists of Two Complementary Chains of Nucleotides

A deoxyribonucleic acid (DNA) molecule consists of two long polynucleotide chains composed of four types of nucleotide subunits. Each of these chains is known as a DNA chain, or a DNA strand.
The chains run antiparallel to each other, and hydrogen bonds between the base portions of the nucleotides hold the two chains together (Figure 4–3). As we saw in Chapter 2 (Panel 2–6, pp. 100–101), nucleotides are composed of a five-carbon sugar to which are attached one or more phosphate groups and a nitrogen-containing base. In the case of the nucleotides in DNA, the sugar is deoxyribose attached to a single phosphate group (hence the name deoxyribonucleic acid), and the base may be either adenine (A), cytosine (C), guanine (G), or thymine (T). The nucleotides are covalently linked together in a chain through the sugars and phosphates, which thus form a “backbone” of alternating sugar–phosphate–sugar–phosphate. Because only the base differs in each of the four types of nucleotide subunit, each polynucleotide chain in DNA is analogous to a sugar-phosphate necklace (the backbone), from which hang the four types of beads (the bases A, C, G, and T). These same symbols (A, C, G, and T) are commonly used to denote either the four bases or the four entire nucleotides, that is, the bases with their attached sugar and phosphate groups.

The way in which the nucleotides are linked together gives a DNA strand a chemical polarity. If we think of each sugar as a block with a protruding knob (the 5ʹ phosphate) on one side and a hole (the 3ʹ hydroxyl) on the other (see Figure 4–3), each completed chain, formed by interlocking knobs with holes, will have all of its subunits lined up in the same orientation. Moreover, the two ends of the chain will be easily distinguishable, as one has a hole (the 3ʹ hydroxyl) and the other a knob (the 5ʹ phosphate) at its terminus. This polarity in a DNA chain is indicated by referring to one end as the 3ʹ end and the other as the 5ʹ end, names derived from the orientation of the deoxyribose sugar. With respect to DNA’s information-carrying capacity, the chain of nucleotides in a DNA strand, being both directional and linear, can be read in much the same way as the letters on this page.

Figure 4–3 DNA and its building blocks. DNA is made of four types of nucleotides, which are linked covalently into a polynucleotide chain (a DNA strand) with a sugar-phosphate backbone from which the bases (A, C, G, and T) extend. A DNA molecule is composed of two antiparallel DNA strands held together by hydrogen bonds between the paired bases. The arrowheads at the ends of the DNA strands indicate the polarities of the two strands. In the diagram at the bottom left of the figure, the DNA molecule is shown straightened out; in reality, it is twisted into a double helix, as shown on the right. For details, see Figure 4–5 and Movie 4.1.

The three-dimensional structure of DNA (the DNA double helix) arises from the chemical and structural features of its two polynucleotide chains. Because these two chains are held together by hydrogen-bonding between the bases on the different strands, all the bases are on the inside of the double helix, and the sugar-phosphate backbones are on the outside (see Figure 4–3). In each case, a bulkier two-ring base (a purine; see Panel 2–6, pp. 100–101) is paired with a single-ring base (a pyrimidine): A always pairs with T, and G with C (Figure 4–4). This complementary base-pairing enables the base pairs to be packed in the energetically most favorable arrangement in the interior of the double helix. In this arrangement, each base pair is of similar width, thus holding the sugar-phosphate backbones a constant distance apart along the DNA molecule.
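As a minimal sketch of the pairing rule just described, the Python below derives a partner strand from a given strand. The pairing table (A with T, G with C) and the antiparallel orientation come from the passage above; the function names and the example sequence are hypothetical and chosen only for illustration.

```python
# Watson-Crick pairing as stated in the text: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base that faces each position of `strand`, in the same order."""
    return "".join(PAIR[base] for base in strand.upper())


def partner_strand(strand_5to3: str) -> str:
    """Return the partner strand written 5' to 3'.

    The two strands are antiparallel, so reading the partner in its own
    5'-to-3' direction means reversing the position-by-position complement.
    """
    return complement(strand_5to3)[::-1]


example = "GATTACA"  # hypothetical sequence, written 5' to 3'
print(complement(example))      # CTAATGT (partner read 3' to 5')
print(partner_strand(example))  # TGTAATC (partner read 5' to 3')
```

Applying `partner_strand` twice returns the starting sequence, which is one way to see that each strand fully determines the other.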
To maximize the efficiency of base-pair packing, the two sugar-phosphate backbones wind around each other to form a right-handed double helix, with one complete turn roughly every ten base pairs (Figure 4–5). The members of each base pair can fit together within the double helix only if the two strands of the helix are antiparallel, that is, only if the polarity of one strand is oriented opposite to that of the other strand (see Figures 4–3 and 4–4). A consequence of DNA’s structure and base-pairing requirements is that each strand of a DNA molecule contains a sequence of nucleotides that is exactly complementary to the nucleotide sequence of its partner strand.

Figure 4–4 Complementary base pairs in the DNA double helix. The shapes and chemical structures of the bases allow hydrogen bonds to form efficiently only between A and T and between G and C, because atoms that are able to form hydrogen bonds (see Panel 2–3, pp. 94–95) can then be brought close together without distorting the double helix. As indicated, two hydrogen bonds form between A and T, while three form between G and C. The bases can pair in this way only if the two polynucleotide chains that contain them are antiparallel to each other.

The Structure of DNA Provides a Mechanism for Heredity

The discovery of the structure of DNA immediately suggested answers to the two most fundamental questions about heredity. First, how could the information to specify an organism be carried in a chemical form? And second, how could this information be duplicated and copied from generation to generation?
The answer to the first question came from the realization that DNA is a linear polymer of four different kinds of monomer, strung out in a defined sequence like the letters of a document written in an alphabetic script. The answer to the second question came from the double-stranded nature of the structure: because each strand of DNA contains a sequence of nucleotides that is exactly complementary to the nucleotide sequence of its partner strand, each strand can act as a template, or mold, for the synthesis of a new complementary strand. In other words, if we designate the two DNA strands as S and Sʹ, strand S can serve as a template for making a new strand Sʹ, while strand Sʹ can serve as a template for making a new strand S (Figure 4–6). Thus, the genetic information in DNA can be accurately copied by the beautifully simple process in which strand S separates from strand Sʹ, and each separated strand then serves as a template for the production of a new complementary partner strand that is identical to its former partner.

Figure 4–5 The DNA double helix. (A) A space-filling model of 1.5 turns of the DNA double helix. Each turn of DNA is made up of 10.4 nucleotide pairs, and the center-to-center distance between adjacent nucleotide pairs is 0.34 nm. The coiling of the two strands around each other creates two grooves in the double helix: the wider groove is called the major groove, and the smaller the minor groove, as indicated. (B) A short section of the double helix viewed from its side, showing four base pairs. The nucleotides are linked together covalently by phosphodiester bonds that join the 3ʹ-hydroxyl (–OH) group of one sugar to the 5ʹ-hydroxyl group of the next sugar. Thus, each polynucleotide strand has a chemical polarity; that is, its two ends are chemically different. The 5ʹ end of the DNA polymer is by convention often illustrated carrying a phosphate group, while the 3ʹ end is shown with a hydroxyl.

Figure 4–6 DNA as a template for its own duplication. Because the nucleotide A successfully pairs only with T, and G pairs with C, each strand of DNA can act as a template to specify the sequence of nucleotides in its complementary strand. In this way, double-helical DNA can be copied precisely, with each parental DNA helix producing two identical daughter DNA helices.

The ability of each strand of a DNA molecule to act as a template for producing a complementary strand enables a cell to copy, or replicate, its genome before passing it on to its descendants. We shall describe the elegant machinery that the cell uses to perform this task in Chapter 5.

Organisms differ from one another because their respective DNA molecules have different nucleotide sequences and, consequently, carry different biological messages. But how is the nucleotide alphabet used to make messages, and what do they spell out?

As discussed above, it was known well before the structure of DNA was determined that genes contain the instructions for producing proteins. If genes are made of DNA, the DNA must therefore somehow encode proteins (Figure 4–7). As discussed in Chapter 3, the properties of a protein, which are responsible for its biological function, are determined by its three-dimensional structure. This structure is determined in turn by the linear sequence of the amino acids of which it is composed.
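Returning briefly to the template mechanism of Figure 4–6, the following Python is a rough sketch of the logic only, not of the enzymatic machinery (which the text defers to Chapter 5). It assumes strands are plain strings written 5ʹ to 3ʹ; the names `synthesize_partner` and `replicate` and the example sequence are hypothetical.

```python
# Pairing rule from the text: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_partner(template_5to3: str) -> str:
    """Build the complementary strand for a template, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(template_5to3.upper()))


def replicate(s: str, s_prime: str):
    """Copy a double helix (S, S') into two daughter helices.

    Each parental strand is kept and used as the template for a newly
    made partner, so every daughter helix holds one old and one new strand.
    """
    if synthesize_partner(s) != s_prime.upper():
        raise ValueError("the two strands are not complementary")
    daughter_1 = (s.upper(), synthesize_partner(s))                    # old S, new S'
    daughter_2 = (synthesize_partner(s_prime), s_prime.upper())        # new S, old S'
    return daughter_1, daughter_2


parent_s = "GATTACA"                       # hypothetical sequence, 5' to 3'
parent_s_prime = synthesize_partner(parent_s)
d1, d2 = replicate(parent_s, parent_s_prime)
print(d1 == d2 == (parent_s, parent_s_prime))  # True: both daughters match the parent
```

The sketch treats copying as error-free; it is meant only to show why complementarity alone is enough to specify both daughter helices.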
The linear sequence of nucleotides in a gene must therefore somehow spell out the linear sequence of amino acids in a protein. The exact correspondence between the four-letter nucleotide alphabet of DNA and the twenty-letter amino acid alphabet of proteins (the genetic code) is not at all obvious from the DNA structure, and it took over a decade after the discovery of the double helix before it was worked out. In Chapter 6, we will describe this code in detail in the course of elaborating the process of gene expression, through which a cell converts the nucleotide sequence of a gene first into the nucleotide sequence of an RNA molecule, and then into the amino acid sequence of a protein.

The complete store of information in an organism’s DNA is called its genome, and it specifies all the RNA molecules and proteins that the organism will ever synthesize. (The term genome is also used to describe the DNA that carries this information.) The amount of information contained in genomes is staggering. The nucleotide sequence of a very small human gene, written out in the four-letter nucleotide alphabet, occupies a quarter of a page of text (Figure 4–8), while the complete sequence of nucleotides in the human genome would fill more than a thousand books the size of this one. In addition to other critical information, it includes roughly 21,000 protein-coding genes, which (through alternative splicing; see p. 415) give rise to a much greater number of distinct proteins.

Figure 4–7 The relationship between genetic information carried in DNA and proteins. (Discussed in Chapter 1.)

Figure 4–8 The nucleotide sequence of the human β-globin gene. By convention, a nucleotide sequence is written from its 5ʹ end to its 3ʹ end, and it should be read from left to right in successive lines down the page as though it were normal English text. This gene carries the information for the amino acid sequence of one of the two types of subunits of the hemoglobin molecule; a different gene, the α-globin gene, carries the information for the other. (Hemoglobin, the protein that carries oxygen in the blood, has four subunits, two of each type.) Only one of the two strands of the DNA double helix containing the β-globin gene is shown; the other strand has the exact complementary sequence. The DNA sequences highlighted in yellow show the three regions of the gene that specify the amino acid sequence for the β-globin protein. We shall see in Chapter 6 how the cell splices these three sequences together at the level of messenger RNA in order to synthesize a full-length β-globin protein.

In Eukaryotes, DNA Is Enclosed in a Cell Nucleus

As described in Chapter 1, nearly all the DNA in a eukaryotic cell is sequestered in a nucleus, which in many cells occupies about 10% of the total cell volume. This compartment is delimited by a nuclear envelope formed by two concentric lipid bilayer membranes (Figure 4–9). These membranes are punctured at intervals by large nuclear pores, through which molecules move between the nucleus and the cytosol. The nuclear envelope is directly connected to the extensive system of intracellular membranes called the endoplasmic reticulum, which extend out from it into the cytoplasm. And it is mechanically supported by a network of intermediate filaments called the nuclear lamina, a thin feltlike mesh just beneath the inner nuclear membrane (see Figure 4–9B).

The nuclear envelope allows the many proteins that act on DNA to be concentrated where they are needed in the cell, and, as we see in subsequent chapters, it also keeps nuclear and cytosolic enzymes separate, a feature that is crucial for the proper functioning of eukaryotic cells.

Summary

Genetic information is carried in the linear sequence of nucleotides in DNA.
Each molecule of DNA is a double helix formed from two complementary antiparallel strands of nucleotides held together by hydrogen bonds between G-C and A-T base pairs. Duplication of the genetic information occurs by the use of one DNA strand as a template for the formation of a complementary strand. The genetic information stored in an organism’s DNA contains the instructions for all the RNA molecules and proteins that the organism will ever synthesize and is said to comprise its genome. In eukaryotes, DNA is contained in the cell nucleus, a large membrane-bound compartment.

CHROMOSOMAL DNA AND ITS PACKAGING IN THE CHROMATIN FIBER

The most important function of DNA is to carry genes, the information that specifies all the RNA molecules and proteins that make up an organism, including information about when, in what types of cells, and in what quantity each RNA molecule and protein is to be made. The nuclear DNA of eukaryotes is divided up into chromosomes, and in this section we see how genes are typically arranged on each chromosome. In addition, we describe the specialized DNA sequences that are required for a chromosome to be accurately duplicated as a separate entity and passed on from one generation to the next.

We also confront the serious challenge of DNA packaging. If the double helices comprising all 46 chromosomes in a human cell could be laid end to end, they would reach approximately 2 meters; yet the nucleus, which contains the DNA, is only about 6 μm in diameter. This is geometrically equivalent to packing 40 km (24 miles) of extremely fine thread into a tennis ball. The complex task of packaging DNA is accomplished by specialized proteins that bind to the DNA and fold it, generating a series of organized coils and loops that provide increasingly higher levels of organization, and prevent the DNA from becoming an unmanageable tangle.
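The "approximately 2 meters" figure can be checked with a back-of-the-envelope calculation from numbers given elsewhere in this chapter: 3.2 billion nucleotide pairs in the human sequence and 0.34 nm between adjacent nucleotide pairs (Figure 4–5). The sketch below assumes, as a labeled assumption, that the 3.2 billion figure describes one chromosome set and that a typical cell carries two sets; the function and constant names are hypothetical.

```python
# Values quoted in the chapter text and in the Figure 4-5 caption.
NUCLEOTIDE_PAIRS_PER_SET = 3.2e9   # human sequence length
RISE_PER_PAIR_NM = 0.34            # center-to-center distance between pairs
NUCLEUS_DIAMETER_UM = 6.0          # approximate nuclear diameter

# Assumption (not stated at this point in the text): two chromosome sets per cell.
SETS_PER_CELL = 2


def total_dna_length_m(pairs=NUCLEOTIDE_PAIRS_PER_SET,
                       rise_nm=RISE_PER_PAIR_NM,
                       sets=SETS_PER_CELL) -> float:
    """End-to-end length of all the DNA in one cell, in meters."""
    return pairs * sets * rise_nm * 1e-9  # nm -> m


length_m = total_dna_length_m()
ratio = length_m / (NUCLEUS_DIAMETER_UM * 1e-6)  # DNA length / nucleus diameter

print(f"{length_m:.2f} m")   # 2.18 m
print(f"{ratio:,.0f}")       # 362,667
```

Under these assumptions the DNA is several hundred thousand times longer than the nucleus is wide, which is the scale of the packaging problem the text describes.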
Amazingly, although the DNA is very tightly compacted, it nevertheless remains accessible to the many enzymes in the cell that replicate it, repair it, and use its genes to produce RNA molecules and proteins.

Figure 4–9 A cross-sectional view of a typical cell nucleus. (A) Electron micrograph of a thin section through the nucleus of a human fibroblast. (B) Schematic drawing, showing that the nuclear envelope consists of two membranes, the outer one being continuous with the endoplasmic reticulum (ER) membrane (see also Figure 12–7). The space inside the endoplasmic reticulum (the ER lumen) is colored yellow; it is continuous with the space between the two nuclear membranes. The lipid bilayers of the inner and outer nuclear membranes are connected at each nuclear pore. A sheetlike network of intermediate filaments inside the nucleus forms the nuclear lamina (brown), providing mechanical support for the nuclear envelope (for details, see Chapter 12). The dark-staining heterochromatin contains specially condensed regions of DNA that will be discussed later. (A, courtesy of E.G. Jordan and J. McGovern.)

Eukaryotic DNA Is Packaged into a Set of Chromosomes

Each chromosome in a eukaryotic cell consists of a single, enormously long linear DNA molecule along with the proteins that fold and pack the fine DNA thread into a more compact structure. In addition to the proteins involved in packaging, chromosomes are also associated with many other proteins (as well as numerous RNA molecules). These are required for the processes of gene expression, DNA replication, and DNA repair.
The complex of DNA and tightly bound protein is called chromatin (from the Greek chroma, “color,” because of its staining properties).

Bacteria lack a special nuclear compartment, and they generally carry their genes on a single DNA molecule, which is often circular (see Figure 1–24). This DNA is also associated with proteins that package and condense it, but they are different from the proteins that perform these functions in eukaryotes. Although the bacterial DNA with its attendant proteins is often called the bacterial “chromosome,” it does not have the same structure as eukaryotic chromosomes, and less is known about how the bacterial DNA is packaged. Therefore, our discussion of chromosome structure will focus almost entirely on eukaryotic chromosomes.

With the exception of the gametes (eggs and sperm) and a few highly specialized cell types that cannot multiply and either lack DNA altogether (for example, red blood cells) or have replicated their DNA without completing cell division (for example, megakaryocytes), each human cell nucleus contains two copies of each chromosome, one inherited from the mother and one from the father. The maternal and paternal chromosomes of a pair are called homologous chromosomes (homologs). The only nonhomologous chromosome pairs are the sex chromosomes in males, where a Y chromosome is inherited from the father and an X chromosome from the mother. Thus, each human cell contains a total of 46 chromosomes: 22 pairs common to both males and females, plus two so-called sex chromosomes (X and Y in males, two Xs in females). These human chromosomes can be readily distinguished by “painting” each one a different color using a technique based on DNA hybridization (Figure 4–10). In this method (described in detail in Chapter 8), a short strand of nucleic acid tagged with a fluorescent dye serves as a “probe” that picks out its complementary DNA sequence, lighting up the target chromosome at any site where it binds.
Chromosome painting is most frequently done at the stage in the cell cycle called mitosis, when chromosomes are especially compacted and easy to visualize (see below).

Figure 4–10 The complete set of human chromosomes. These chromosomes, from a female, were isolated from a cell undergoing nuclear division (mitosis) and are therefore highly compacted. Each chromosome has been “painted” a different color to permit its unambiguous identification under the fluorescence microscope, using a technique called “spectral karyotyping.” Chromosome painting can be performed by exposing the chromosomes to a large collection of DNA molecules whose sequence matches known DNA sequences from the human genome. The set of sequences matching each chromosome is coupled to a different combination of fluorescent dyes. DNA molecules derived from chromosome 1 are labeled with one specific dye combination, those from chromosome 2 with another, and so on. Because the labeled DNA can form base pairs, or hybridize, only to the chromosome from which it was derived, each chromosome becomes labeled with a different combination of dyes. For such experiments, the chromosomes are subjected to treatments that separate the two strands of double-helical DNA in a way that permits base-pairing with the single-stranded labeled DNA, but keeps the overall chromosome structure relatively intact. (A) The chromosomes visualized as they originally spilled from the lysed cell. (B) The same chromosomes artificially lined up in their numerical order. This arrangement of the full chromosome set is called a karyotype. (Adapted from N. McNeil and T. Ried, Expert Rev. Mol. Med. 2:1–14, 2000. With permission from Cambridge University Press.)

Another more traditional way to distinguish one chromosome from another is to stain them with dyes that reveal a striking and reproducible pattern of bands along each mitotic chromosome (Figure 4–11). These banding patterns presumably reflect variations in chromatin structure, but their basis is not well understood. Nevertheless, the pattern of bands on each type of chromosome is unique, and it provided the initial means to identify and number each human chromosome reliably.

Figure 4–11 The banding patterns of human chromosomes. Chromosomes 1–22 are numbered in approximate order of size. A typical human cell contains two of each of these chromosomes, plus two sex chromosomes: two X chromosomes in a female, one X and one Y chromosome in a male. The chromosomes used to make these maps were stained at an early stage in mitosis, when the chromosomes are incompletely compacted. The horizontal red line represents the position of the centromere (see Figure 4–19), which appears as a constriction on mitotic chromosomes. The red knobs on chromosomes 13, 14, 15, 21, and 22 indicate the positions of genes that code for the large ribosomal RNAs (discussed in Chapter 6). These banding patterns are obtained by staining chromosomes with Giemsa stain, and they can be observed under the light microscope. (Adapted from U. Francke, Cytogenet. Cell Genet. 31:24–32, 1981. With permission from the author.)

Figure 4–12 Aberrant human chromosomes. (A) Two normal human chromosomes, 4 and 6. (B) In an individual carrying a balanced chromosomal translocation, the DNA double helix in one chromosome has crossed over with the DNA double helix in the other chromosome due to an abnormal recombination event.
The chromosome painting technique used on the chromosomes in each of the sets allows the identification of even short pieces of chromosomes that have become translocated, a frequent event in cancer cells. (Courtesy of Zhenya Tang and the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research: GM21880.) The display of the 46 human chromosomes at mitosis is called the human (A) chromosome 6 chromosome 4 karyotype. If parts of chromosomes are lost or are switched between chromo- somes, these changes can be detected either by changes in the banding patterns or—with greater sensitivity—by changes in the pattern of chromosome painting (Figure 4–12). Cytogeneticists use these alterations to detect inherited chromo- some abnormalities and to reveal the chromosome rearrangements that occur in cancer cells as they progress to malignancy (discussed in Chapter 20). Chromosomes Contain Long Strings of Genes Chromosomes carry genes—the functional units of heredity. A gene is often (B) reciprocal chromosomal translocation defined as a segment of DNA that contains the instructions for making a particu- lar protein (or a set of closely related proteins), but this definition is too narrow. Genes that code for protein are indeed the majority, and most of the genes with MBoC6 n4.546/4.12 clear-cut mutant phenotypes fall under this heading. In addition, however, there are many “RNA genes”—segments of DNA that generate a functionally significant RNA molecule, instead of a protein, as their final product. We shall say more about the RNA genes and their products later. As might be expected, some correlation exists between the complexity of an organism and the number of genes in its genome (see Table 1–2, p. 29). For example, some simple bacteria have only 500 genes, compared to about 30,000 for humans. Bacteria, archaea, and some single-celled eukaryotes, such as yeast, have concise genomes, consisting of little more than strings of closely packed genes. 
However, the genomes of multicellular plants and animals, as well as many other eukaryotes, contain, in addition to genes, a large quantity of interspersed DNA whose function is poorly understood (Figure 4–13). Some of this additional DNA is crucial for the proper control of gene expression, and this may in part explain why there is so much of it in multicellular organisms, whose genes have to be switched on and off according to complicated rules during development (discussed in Chapters 7 and 21).

Differences in the amount of DNA interspersed between genes, far more than differences in numbers of genes, account for the astonishing variations in genome size that we see when we compare one species with another (see Figure 1–32). For example, the human genome is 200 times larger than that of the yeast Saccharomyces cerevisiae, but 30 times smaller than that of some plants and amphibians and 200 times smaller than that of a species of amoeba. Moreover, because of differences in the amount of noncoding DNA, the genomes of closely related organisms (bony fish, for example) can vary several hundredfold in their DNA content, even though they contain roughly the same number of genes. Whatever the excess DNA may do, it seems clear that it is not a great handicap for a eukaryotic cell to carry a large amount of it.

Figure 4–13 The arrangement of genes in the genome of S. cerevisiae compared to humans. (A) S. cerevisiae is a budding yeast widely used for brewing and baking. The genome of this single-celled eukaryote is distributed over 16 chromosomes. A small region of one chromosome has been arbitrarily selected to show its high density of genes. (B) A region of the human genome of equal length to the yeast segment in (A). The human genes are much less densely packed and the amount of interspersed DNA sequence is far greater. Not shown in this sample of human DNA is the fact that most human genes are much longer than yeast genes (see Figure 4–15). Both panels span 0 to 30 kilobases; the key distinguishes genes from genome-wide repeats.

How the genome is divided into chromosomes also differs from one eukaryotic species to the next. For example, while the cells of humans have 46 chromosomes, those of some small deer have only 6, while those of the common carp contain over 100. Even closely related species with similar genome sizes can have very different numbers and sizes of chromosomes (Figure 4–14). Thus, there is no simple relationship between chromosome number, complexity of the organism, and total genome size. Rather, the genomes and chromosomes of modern-day species have each been shaped by a unique history of seemingly random genetic events, acted on by poorly understood selection pressures over long evolutionary times.

Figure 4–14 Two closely related species of deer with very different chromosome numbers. In the evolution of the Indian muntjac, initially separate chromosomes fused, without having a major effect on the animal. These two species contain a similar number of genes. (Chinese muntjac photo courtesy of Deborah Carreno, Natural Wonders Photography.)

The Nucleotide Sequence of the Human Genome Shows How Our Genes Are Arranged

With the publication of the full DNA sequence of the human genome in 2004, it became possible to see in detail how the genes are arranged along each of our chromosomes (Figure 4–15). It will be many decades before the information contained in the human genome sequence is fully analyzed, but it has already stimulated new experiments that have had major effects on the content of every chapter in this book.

Figure 4–15 The organization of genes on a human chromosome. (A) Chromosome 22, one of the smallest human chromosomes, contains 48 × 10⁶ nucleotide pairs and makes up approximately 1.5% of the human genome. Most of the left arm of chromosome 22 consists of short repeated sequences of DNA that are packaged in a particularly compact form of chromatin (heterochromatin) discussed later in this chapter. (B) A tenfold expansion of a portion of chromosome 22, with about 40 genes indicated. Those in dark brown are known genes and those in red are predicted genes. (C) An expanded portion of (B) showing four genes. (D) The intron–exon arrangement of a typical gene is shown after a further tenfold expansion. Each exon (red) codes for a portion of the protein, while the DNA sequence of the introns (gray) is relatively unimportant, as discussed in detail in Chapter 6.
The human genome (3.2 × 10⁹ nucleotide pairs) is the totality of genetic information belonging to our species. Almost all of this genome is distributed over the 22 different autosomes and 2 sex chromosomes (see Figures 4–10 and 4–11) found within the nucleus. A minute fraction of the human genome (16,569 nucleotide pairs, in multiple copies per cell) is found in the mitochondria (introduced in Chapter 1, and discussed in detail in Chapter 14). The term human genome sequence refers to the complete nucleotide sequence of DNA in the 24 nuclear chromosomes and the mitochondria. Being diploid, a human somatic cell nucleus contains roughly twice the haploid amount of DNA, or 6.4 × 10⁹ nucleotide pairs, when not duplicating its chromosomes in preparation for division. (Adapted from International Human Genome Sequencing Consortium, Nature 409:860–921, 2001. With permission from Macmillan Publishers Ltd.)
Panel labels: (A) human chromosome 22 in its mitotic conformation, composed of two double-stranded DNA molecules, each 48 × 10⁶ nucleotide pairs long; (B) 10% of chromosome arm, ~40 genes; (C) 1% of chromosome arm containing 4 genes; (D) one gene of 3.4 × 10⁴ nucleotide pairs, showing exons, introns, and regulatory DNA sequences, with gene expression proceeding from DNA to RNA to protein to folded protein.

TABLE 4–1 Some Vital Statistics for the Human Genome

Human genome
DNA length: 3.2 × 10⁹ nucleotide pairs*
Number of genes coding for proteins: Approximately 21,000
Largest gene coding for protein: 2.4 × 10⁶ nucleotide pairs
Mean size for protein-coding genes: 27,000 nucleotide pairs
Smallest number of exons per gene: 1
Largest number of exons per gene: 178
Mean number of exons per gene: 10.4
Largest exon size: 17,106 nucleotide pairs
Mean exon size: 145 nucleotide pairs
Number of noncoding RNA genes: Approximately 9000**
Number of pseudogenes***: More than 20,000
Percentage of DNA sequence in exons (protein-coding sequences): 1.5%
Percentage of DNA in other highly conserved sequences****: 3.5%
Percentage of DNA in high-copy-number repetitive elements: Approximately 50%

* The sequence of 2.85 billion nucleotides is known precisely (error rate of only about 1 in 100,000 nucleotides). The remaining DNA primarily consists of short sequences that are tandemly repeated many times over, with repeat numbers differing from one individual to the next. These highly repetitive blocks are hard to sequence accurately.
** This number is only a very rough estimate.
*** A pseudogene is a DNA sequence closely resembling that of a functional gene, but containing numerous mutations that prevent its proper expression or function. Most pseudogenes arise from the duplication of a functional gene followed by the accumulation of damaging mutations in one copy.
**** These conserved functional regions include DNA encoding 5ʹ and 3ʹ UTRs (untranslated regions of mRNA), DNA specifying structural and functional RNAs, and DNA with conserved protein-binding sites.
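The headline figures in Table 4–1 and in the caption of Figure 4–15 can be cross-checked with a little arithmetic. The Python sketch below is illustrative only: the variable names are invented here, the inputs are the rounded values quoted in the table and caption, and the scale of 1 mm per nucleotide pair is borrowed from Figure 4–16. Products of mean values are crude estimates, not measurements.

```python
# Rounded values quoted in Table 4-1 and the Figure 4-15 caption.
# All names below are illustrative and do not come from the source.
HAPLOID_BP = 3.2e9          # human genome DNA length, nucleotide pairs
CHR22_BP = 48e6             # chromosome 22, nucleotide pairs
PROTEIN_GENES = 21_000      # approximate number of protein-coding genes
MEAN_GENE_BP = 27_000       # mean size of a protein-coding gene
MEAN_EXONS_PER_GENE = 10.4
MEAN_EXON_BP = 145

# Chromosome 22 as a share of the haploid genome (caption: about 1.5%).
chr22_share = CHR22_BP / HAPLOID_BP

# A diploid nucleus carries two copies of the genome.
diploid_bp = 2 * HAPLOID_BP

# Mean exon sequence per gene, and its share of a mean-sized gene.
exon_bp_per_gene = MEAN_EXONS_PER_GENE * MEAN_EXON_BP
exon_share_of_gene = exon_bp_per_gene / MEAN_GENE_BP

# Assumed drawing scale from Figure 4-16: 1 mm per nucleotide pair.
MM_PER_BP = 1.0
genome_km = HAPLOID_BP * MM_PER_BP / 1e6            # mm -> km
gene_spacing_m = genome_km * 1000 / PROTEIN_GENES   # average spacing, m
mean_gene_m = MEAN_GENE_BP * MM_PER_BP / 1000       # mm -> m

print(f"chromosome 22 share of genome: {chr22_share:.1%}")
print(f"diploid DNA content: {diploid_bp:.1e} nucleotide pairs")
print(f"mean exon sequence per gene: {exon_bp_per_gene:.0f} nucleotide pairs")
print(f"exon share of a mean-sized gene: {exon_share_of_gene:.1%}")
print(f"genome length at 1 mm per pair: {genome_km:.0f} km")
print(f"one protein-coding gene every {gene_spacing_m:.0f} m")
print(f"mean gene length at that scale: {mean_gene_m:.0f} m")
```

Under these assumptions exons account for only a few percent of a mean-sized gene, and the scale figures land close to the 3200 km and 150 m quoted in Figure 4–16.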
The first striking feature of the human genome is how little of it (only a few percent) codes for proteins (Table 4–1 and Figure 4–16). It is also notable that nearly half of the chromosomal DNA is made up of mobile pieces of DNA that have gradually inserted themselves in the chromosomes over evolutionary time, multiplying like parasites in the genome (see Figure 4–62). We discuss these transposable elements in detail in later chapters.

A second notable feature of the human genome is the large average gene size, about 27,000 nucleotide pairs. As discussed above, a typical gene carries in its linear sequence of nucleotides the information for the linear sequence of the amino acids of a protein. Only about 1300 nucleotide pairs are required to encode a protein of average size (about 430 amino acids in humans). Most of the remaining sequence in a gene consists of long stretches of noncoding DNA that interrupt the relatively short segments of DNA that code for protein. As will be discussed in detail in Chapter 6, the coding sequences are called exons; the intervening (noncoding) sequences in genes are called introns (see Figure 4–15 and Table 4–1). The majority of human genes thus consist of a long string of alternating exons and introns, with most of the gene consisting of introns. In contrast, the majority of genes from organisms with concise genomes lack introns. This accounts for the much smaller size of their genes (about one-twentieth that of human genes), as well as for the much higher fraction of coding DNA in their chromosomes.

Figure 4–16 Scale of the human genome. If drawn with a 1 mm space between each nucleotide pair, as in (A), the human genome would extend 3200 km (approximately 2000 miles), far enough to stretch across the center of Africa, the site of our human origins (red line in B). At this scale, there would be, on average, a protein-coding gene every 150 m. An average gene would extend for 30 m, but the coding sequences in this gene would add up to only just over a meter.

In addition to introns and exons, each gene is associated with regulatory DNA sequences, which are responsible for ensuring that the gene is turned on or off at the proper time, expressed at the appropriate level, and only in the proper type of cell. In humans, the regulatory sequences for a typical gene are spread out over tens of thousands of nucleotide pairs. As would be expected, these regulatory sequences are much more compressed in organisms with concise genomes. We discuss how regulatory DNA sequences work in Chapter 7.

Research in the last decade has surprised biologists with the discovery that, in addition to 21,000 protein-coding genes, the human genome contains many thousands of genes that encode RNA molecules that do not produce proteins, but instead have a variety of other important functions. What is thus far known about these molecules will be presented in Chapters 6 and 7. Last, but not least, the nucleotide sequence of the human genome has revealed that the archive of information needed to produce a human seems to be in an alarming state of chaos. As one commentator described our genome, “In some ways it may resemble your garage/bedroom/refrigerator/life: highly individualistic, but unkempt; little evidence of organization; much accumulated clutter (referred to by the uninitiated as ‘junk’); virtually nothing ever discarded; and the few patently valuable items indiscriminately, apparently carelessly, scattered throughout.” We shall discuss how this is thought to have come about in the final sections of this chapter entitled “How Genomes Evolve.”

Each DNA Molecule That Forms a Linear Chromosome Must Contain a Centromere, Two Telomeres, and Replication Origins

To form a functional chromosome, a DNA molecule must be able to do more than simply carry genes: it must be able to replicate, and the replicated copies must be separated and reliably partitioned into daughter cells at each cell division. This process occurs through an ordered series of stages, collectively known as the cell cycle, which provides for a temporal separation between the duplication of chromosomes and their segregation into two daughter cells. The cell cycle is briefly summarized in Figure 4–17, and it is discussed in detail in Chapter 17. Briefly, during a long interphase, genes are expressed and chromosomes are replicated, with the two replicas remaining together as a pair of sister chromatids. Throughout this time, the chromosomes are extended and much of their chromatin exists as long threads in the nucleus so that individual chromosomes cannot be easily distinguished. It is only during a much briefer period of mitosis that each chromosome condenses so that its two sister chromatids can be separated and distributed to the two daughter nuclei. The highly condensed chromosomes in a dividing cell are known as mitotic chromosomes (Figure 4–18). This is the form in which chromosomes are most easily visualized; in fact, the images of chromosomes shown so far in the chapter are of chromosomes in mitosis.

Figure 4–17 A simplified view of the eukaryotic cell cycle. During interphase, the cell is actively expressing its genes and is therefore synthesizing proteins. Also, during interphase and before cell division, the DNA is replicated and each chromosome is duplicated to produce two closely paired sister DNA molecules (called sister chromatids). A cell with only one type of chromosome, present in maternal and paternal copies, is illustrated here. Once DNA replication is complete, the cell can enter M phase, when mitosis occurs and the nucleus is divided into two daughter nuclei. During this stage, the chromosomes condense, the nuclear envelope breaks down, and the mitotic spindle forms from microtubules and other proteins. The condensed mitotic chromosomes are captured by the mitotic spindle, and one complete set of chromosomes is then pulled to each end of the cell by separating the members of each sister-chromatid pair. A nuclear envelope re-forms around each chromosome set, and in the final step of M phase, the cell divides to produce two daughter cells. Most of the time in the cell cycle is spent in interphase; M phase is brief in comparison, occupying only about an hour in many mammalian cells.
Stages shown: interphase (gene expression and chromosome duplication), M phase (mitosis, then cell division), interphase.

Each chromosome operates as a distinct structural unit: for a copy to be passed on to each daughter cell at division, each chromosome must be able to replicate, and the newly replicated copies must subsequently be separated and partitioned correctly into the two daughter cells. These basic functions are controlled by three types of specialized nucleotide sequences in the DNA, each of which binds specific proteins that guide the machinery that replicates and segregates chromosomes (Figure 4–19).

Experiments in yeasts, whose chromosomes are relatively small and easy to manipulate, have identified the minimal DNA sequence elements responsible for each of these functions. One type of nucleotide sequence acts as a DNA replication origin, the location at which duplication of the DNA begins.
Eukaryotic chromosomes contain many origins of replication to ensure that the entire chromosome can be replicated rapidly, as discussed in detail in Chapter 5.

After DNA replication, the two sister chromatids that form each chromosome remain attached to one another and, as the cell cycle proceeds, are condensed further to produce mitotic chromosomes. The presence of a second specialized DNA sequence, called a centromere, allows one copy of each duplicated and condensed chromosome to be pulled into each daughter cell when a cell divides. A protein complex called a kinetochore forms at the centromere and attaches the duplicated chromosomes to the mitotic spindle, allowing them to be pulled apart (discussed in Chapter 17).

The third specialized DNA sequence forms telomeres, the ends of a chromosome. Telomeres contain repeated nucleotide sequences that enable the ends of chromosomes to be efficiently replicated. Telomeres also perform another function: the repeated telomere DNA sequences, together with the regions adjoining them, form structures that protect the end of the chromosome from being mistaken by the cell for a broken DNA molecule in need of repair. We discuss both this type of repair and the structure and function of telomeres in Chapter 5.

Figure 4–18 A mitotic chromosome. A mitotic chromosome is a condensed duplicated chromosome in which the two new chromosomes, called sister chromatids, are still linked together (see Figure 4–17). The constricted region indicates the position of the centromere. Scale bar: 1 µm. (Courtesy of Terry D. Allen.)

In yeast cells, the three types of sequences required to propagate a chromosome are relatively short (typically less than 1000 base pairs each) and therefore use only a tiny fraction of the information-carrying capacity of a chromosome.
Although telomere sequences are fairly simple and short in all eukaryotes, the DNA sequences that form centromeres and replication origins in more complex organisms are much longer than their yeast counterparts. For example, experiments suggest that a human centromere can contain up to a million nucleotide pairs and that it may not require a stretch of DNA with a defined nucleotide sequence. Instead, as we shall discuss later in this chapter, a human centromere is thought to consist of a large, regularly repeating protein–nucleic acid structure that can be inherited when a chromosome replicates.

Figure 4–19 The three DNA sequences required to produce a eukaryotic chromosome that can be replicated and then segregated accurately at mitosis. Each chromosome has multiple origins of replication, one centromere, and two telomeres. Shown here is the sequence of events that a typical chromosome follows during the cell cycle. The DNA replicates in interphase, beginning at the origins of replication and proceeding bidirectionally from the origins across the chromosome. In M phase, the centromere attaches the duplicated chromosomes to the mitotic spindle so that a copy of the entire genome is distributed to each daughter cell during mitosis; the special structure that attaches the centromere to the spindle is a protein complex called the kinetochore (dark green). The centromere also helps to hold the duplicated chromosomes together until they are ready to be moved apart. The telomeres form special caps at each chromosome end.
Diagram labels: telomere, replication origin, centromere, portion of mitotic spindle, replicated chromosome, duplicated chromosomes in separate daughter cells; stages shown are interphase, mitosis, cell division, interphase.

DNA Molecules Are Highly Condensed in Chromosomes

All eukaryotic organisms have special ways of packaging DNA into chromosomes.
For example, if the 48 million nucleotide pairs of DNA in human chromosome 22 could be laid out as one long perfect double helix, the molecule would extend for about 1.5 cm if stretched out end to end. But chromosome 22 measures only about 2 μm in length in mitosis (see Figures 4–10 and 4–11), representing an end-to-end compaction ratio of over 7000-fold. This remarkable feat of compression is performed by proteins that successively coil and fold the DNA into higher and higher levels of organization. Although much less condensed than mitotic chromosomes, the DNA of human interphase chromosomes is still tightly packed.

In reading these sections it is important to keep in mind that chromosome structure is dynamic. We have seen that each chromosome condenses to an extreme degree in the M phase of the cell cycle. Much less visible, but of enormous interest and importance, specific regions of interphase chromosomes decondense to allow access to specific DNA sequences for gene expression, DNA repair, and replication, and then recondense when these processes are completed. The packaging of chromosomes is therefore accomplished in a way that allows rapid localized, on-demand access to the DNA. In the next sections, we discuss the specialized proteins that make this type of packaging possible.

Nucleosomes Are a Basic Unit of Eukaryotic Chromosome Structure

The proteins that bind to the DNA to form eukaryotic chromosomes are traditionally divided into two classes: the histones and the non-histone chromosomal proteins, each contributing about the same mass to a chromosome as the DNA. The complex of both classes of protein with the nuclear DNA of eukaryotic cells is known as chromatin (Figure 4–20). Histones are responsible for the first and most basic level of chromosome packing, the nucleosome, a protein–DNA complex discovered in 1974.
When interphase nuclei are broken open very gently and their contents examined under the electron microscope, most of the chromatin appears to be in the form of a fiber with a diameter of about 30 nm (Figure 4–21A). If this chromatin is subjected to treatments that cause it to unfold partially, it can be seen under the electron microscope as a series of “beads on a string” (Figure 4–21B). The string is DNA, and each bead is a “nucleosome core particle” that consists of DNA wound around a histone core (Movie 4.2).

Figure 4–20 Chromatin. As illustrated, chromatin consists of DNA bound to both histone and non-histone proteins. The mass of histone protein present is about equal to the total mass of non-histone protein, but, as schematically indicated here, the latter class is composed of an enormous number of different species. In total, a chromosome is about one-third DNA and two-thirds protein by mass.

Figure 4–21 Nucleosomes as seen in the electron microscope. (A) Chromatin isolated directly from an interphase nucleus appears in the electron microscope as a thread about 30 nm thick. (B) This electron micrograph shows a length of chromatin that has been experimentally unpacked, or decondensed, after isolation to show the nucleosomes. Scale bar: 50 nm. (A, courtesy of Barbara Hamkalo; B, courtesy of Victoria Foe.)

The structural organization of nucleosomes was determined after first isolating them from unfolded chromatin by digestion with particular enzymes (called nucleases) that break down DNA by cutting between the nucleosomes. After digestion for a short period, the exposed DNA between the nucleosome core particles, the linker DNA, is degraded. Each individual nucleosome core particle consists of a complex of eight histone proteins (two molecules each of histones H2A, H2B, H3, and H4) and double-stranded DNA that is 147 nucleotide pairs long.
The histone octamer forms a protein core around which the double-stranded DNA is wound (Figure 4–22).

[Figure 4–22 diagram labels: “beads-on-a-string” form of chromatin; linker DNA; core histones of nucleosome; nucleosome includes ~200 nucleotide pairs of DNA; nuclease digests linker DNA; released nucleosome, 11 nm]

The region of linker DNA that separates each nucleosome core particle from the next can vary in length from a few nucleotide pairs up to about 80. (The term nucleosome technically refers to a nucleosome core particle plus one of its adjacent DNA linkers, but it is often used synonymously with nucleosome core particle.) On average, therefore, nucleosomes repeat at intervals of about 200 nucleotide pairs. For example, a diploid human cell with 6.4 × 10⁹ nucleotide pairs contains approximately 30 million nucleosomes. The formation of nucleosomes converts a DNA molecule into a chromatin thread about one-third of its initial length.

The Structure of the Nucleosome Core Particle Reveals How DNA Is Packaged

The high-resolution structure of a nucleosome core particle, solved in 1997, revealed a disc-shaped histone core around which the DNA was tightly wrapped in a left-handed coil of 1.7 turns (Figure 4–23). All four of the histones that make up the core of the nucleosome are relatively small proteins (102–135 amino acids), and they share a structural motif, known as the histone fold, formed from three α helices connected by two loops (Figure 4–24). In assembling a nucleosome, the histone folds first bind to each other to form H3–H4 and H2A–H2B dimers, and the H3–H4 dimers combine to form tetramers. An H3–H4 tetramer then further combines with two H2A–H2B dimers to form the compact octamer core, around which the DNA is wound.
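The packaging numbers quoted for chromosome 22 and for nucleosome spacing follow from simple ratios. The Python sketch below is a rough check, not a structural model: it assumes the standard rise of about 0.34 nm per nucleotide pair for B-form DNA (a value not stated in this passage), and every name in it is made up for illustration.

```python
# Assumption: B-form DNA rises about 0.34 nm per nucleotide pair.
NM_PER_BP = 0.34


def extended_length_nm(nucleotide_pairs):
    """Length of a fully extended double helix, in nanometers."""
    return nucleotide_pairs * NM_PER_BP


# Chromosome 22: 48 million nucleotide pairs, about 2 um long in mitosis.
chr22_nm = extended_length_nm(48e6)
chr22_cm = chr22_nm * 1e-7          # 1 nm = 1e-7 cm
compaction = chr22_nm / 2_000       # 2 um = 2000 nm

# One nucleosome per ~200 nucleotide pairs in a diploid genome of 6.4e9.
nucleosomes = 6.4e9 / 200

# Extended length of the 147 nucleotide pairs wrapped around one core.
core_dna_nm = extended_length_nm(147)

print(f"chromosome 22 extended length: {chr22_cm:.2f} cm")
print(f"end-to-end compaction in mitosis: about {compaction:.0f}-fold")
print(f"nucleosomes per diploid cell: {nucleosomes:.2e}")
print(f"DNA per core particle: {core_dna_nm:.0f} nm")
```

With the assumed 0.34 nm rise, chromosome 22 comes out near 1.6 cm, slightly above the rounded 1.5 cm in the text, and the compaction ratio stays above 7000-fold. The nucleosome count is about 32 million, which the text rounds to 30 million.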