Chapter 2: Introduction to the Human Genome
Summary
This document is an introduction to the human genome, covering the organization, variation, and transmission of genetic material. It also explores chromosome and genome analysis in clinical settings, highlighting its diagnostic and therapeutic applications.
Full Transcript
THOMPSON & THOMPSON GENETICS IN MEDICINE

CHAPTER 2
Introduction to the Human Genome

Understanding the organization, variation, and transmission of the human genome is central to appreciating the role of genetics in medicine, as well as the emerging principles of genomic and personalized medicine. With the availability of the sequence of the human genome and a growing awareness of the role of genome variation in disease, it is now possible to begin to exploit the impact of that variation on human health on a broad scale. The comparison of individual genomes underscores the first major take-home lesson of this book: every individual has his or her own unique constitution of gene products, produced in response to the combined inputs of the genome sequence and one's particular set of environmental exposures and experiences. As pointed out in the previous chapter, this realization reflects what Garrod termed chemical individuality over a century ago and provides a conceptual foundation for the practice of genomic and personalized medicine.

Advances in genome technology and the resulting explosion in knowledge and information stemming from the Human Genome Project are thus playing an increasingly transformational role in integrating and applying concepts and discoveries in genetics to the practice of medicine.

CHROMOSOME AND GENOME ANALYSIS IN CLINICAL MEDICINE

Chromosome and genome analysis has become an important diagnostic procedure in clinical medicine. As described more fully in subsequent chapters, these applications include the following:
- Clinical diagnosis. Numerous medical conditions, including some that are common, are associated with changes in chromosome number or structure and require chromosome or genome analysis for diagnosis and genetic counseling (see Chapters 5 and 6).
- Gene identification. A major goal of medical genetics and genomics today is the identification of specific genes and elucidating their roles in health and disease. This topic is referred to repeatedly but is discussed in detail in Chapter 10.
- Cancer genomics. Genomic and chromosomal changes in somatic cells are involved in the initiation and progression of many types of cancer (see Chapter 15).
- Disease treatment. Evaluation of the integrity, composition, and differentiation state of the genome is critical for the development of patient-specific pluripotent stem cells for therapeutic use (see Chapter 13).
- Prenatal diagnosis. Chromosome and genome analysis is an essential procedure in prenatal diagnosis (see Chapter 17).

THE HUMAN GENOME AND THE CHROMOSOMAL BASIS OF HEREDITY

Appreciation of the importance of genetics to medicine requires an understanding of the nature of the hereditary material, how it is packaged into the human genome, and how it is transmitted from cell to cell during cell division and from generation to generation during reproduction. The human genome consists of large amounts of the chemical deoxyribonucleic acid (DNA) that contains within its structure the genetic information needed to specify all aspects of embryogenesis, development, growth, metabolism, and reproduction, essentially all aspects of what makes a human being a functional organism. Every nucleated cell in the body carries its own copy of the human genome, which contains, depending on how one defines the term, approximately 20,000 to 50,000 genes (see Box later). Genes, which at this point we consider simply and most broadly as functional units of genetic information, are encoded in the DNA of the genome, organized into a number of rod-shaped organelles called chromosomes in the nucleus of each cell. The influence of genes and genetics on states of health and disease is profound, and its roots are found in the information encoded in the DNA that makes up the human genome.

Each species has a characteristic chromosome complement (karyotype) in terms of the number, morphology, and content of the chromosomes that make up its genome. The genes are in linear order along the chromosomes, each gene having a precise position or locus. A gene map is the map of the genomic location of the genes and is characteristic of each species and the individuals within a species.

The study of chromosomes, their structure, and their inheritance is called cytogenetics. The science of human cytogenetics dates from 1956, when it was first established that the normal human chromosome number is 46. Since that time, much has been learned about human chromosomes, their normal structure and composition, and the identity of the genes that they contain, as well as their numerous and varied abnormalities.

With the exception of cells that develop into gametes (the germline), all cells that contribute to one's body are called somatic cells (soma, body). The genome contained in the nucleus of human somatic cells consists of 46 chromosomes, made up of 24 different types and arranged in 23 pairs (Fig. 2-1). Of those 23 pairs, 22 are alike in males and females and are called autosomes, originally numbered in order of their apparent size from the largest to the smallest. The remaining pair comprises the two different types of sex chromosomes: an X and a Y chromosome in males and two X chromosomes in females. Central to the concept of the human genome, each chromosome carries a different subset of genes that are arranged linearly along its DNA. Members of a pair of chromosomes (referred to as homologous chromosomes or homologues) carry matching genetic information; that is, they typically have the same genes in the same order. At any specific locus, however, the homologues either may be identical or may vary slightly in sequence; these different forms of a gene are called alleles. One member of each pair of chromosomes is inherited from the father, the other from the mother. Normally, the members of a pair of autosomes are microscopically indistinguishable from each other. In females, the sex chromosomes, the two X chromosomes, are likewise largely indistinguishable. In males, however, the sex chromosomes differ. One is an X, identical to the Xs of the female, inherited by a male from his mother and transmitted to his daughters; the other, the Y chromosome, is inherited from his father and transmitted to his sons. In Chapter 6, as we explore the chromosomal and genomic basis of disease, we will look at some exceptions to the simple and almost universal rule that human females are XX and human males are XY.

In addition to the nuclear genome, a small but important part of the human genome resides in mitochondria in the cytoplasm (see Fig. 2-1). The mitochondrial chromosome, to be described later in this chapter, has a number of unusual features that distinguish it from the rest of the human genome.

[Figure panels: Somatic cell; Mitochondrial chromosomes; Nuclear chromosomes]
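The chromosome counts given above (46 chromosomes, 23 pairs, 24 different types) can be checked with a small model. The following is only an illustrative sketch: the names AUTOSOMES and somatic_karyotype are hypothetical, and the model encodes nothing beyond the simple XX/XY rule stated in the text.

```python
# Illustrative model only; AUTOSOMES and somatic_karyotype are hypothetical names.
AUTOSOMES = [str(n) for n in range(1, 23)]  # autosomes 1 through 22


def somatic_karyotype(sex_pair):
    """Return the chromosome labels found in one somatic (diploid) nucleus.

    sex_pair is "XX" or "XY", following the simple rule stated in the text.
    """
    if sex_pair not in ("XX", "XY"):
        raise ValueError("expected 'XX' or 'XY'")
    # Two homologues of every autosome, plus the two sex chromosomes.
    return [label for label in AUTOSOMES for _ in range(2)] + list(sex_pair)


female = somatic_karyotype("XX")
male = somatic_karyotype("XY")
chromosome_types = set(female) | set(male)

print(len(female), len(male))  # chromosomes per somatic nucleus
print(len(chromosome_types))   # distinct chromosome types across both sexes
```

The count of 24 types arises only when both sexes are considered together: 22 autosomes plus X and Y. A single XX nucleus contains 23 types.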
Figure 2-1 The human genome, encoded on both nuclear and mitochondrial chromosomes. See Sources & Acknowledgments.

BOX: GENES IN THE HUMAN GENOME

What is a gene? And how many genes do we have? These questions are more difficult to answer than it might seem.

The word gene, first introduced in 1908, has been used in many different contexts since the essential features of heritable "unit characters" were first outlined by Mendel over 150 years ago. To physicians (and indeed to Mendel and other early geneticists), a gene can be defined by its observable impact on an organism and on its statistically determined transmission from generation to generation. To medical geneticists, a gene is recognized clinically in the context of an observable variant that leads to a characteristic clinical disorder, and today we recognize approximately 5000 such conditions (see Chapter 7).

The Human Genome Project provided a more systematic basis for delineating human genes, relying on DNA sequence analysis rather than clinical acumen and family studies alone; indeed, this was one of the most compelling rationales for initiating the project in the late 1980s. However, even with the finished sequence product in 2003, it was apparent that our ability to recognize features of the sequence that point to the existence or identity of a gene was sorely lacking. Interpreting the human genome sequence and relating its variation to human biology in both health and disease is thus an ongoing challenge for biomedical research.

Although the ultimate catalogue of human genes remains an elusive target, we recognize two general types of gene, those whose product is a protein and those whose product is a functional RNA.
- The number of protein-coding genes (recognized by features in the genome that will be discussed in Chapter 3) is estimated to be somewhere between 20,000 and 25,000. In this book, we typically use approximately 20,000 as the number, and the reader should recognize that this is both imprecise and perhaps an underestimate.
- In addition, however, it has been clear for several decades that the ultimate product of some genes is not a protein at all but rather an RNA transcribed from the DNA sequence. There are many different types of such RNA genes (typically called noncoding genes to distinguish them from protein-coding genes), and it is currently estimated that there are at least another 20,000 to 25,000 noncoding RNA genes around the human genome.

Thus overall, and depending on what one means by the term, the total number of genes in the human genome is of the order of approximately 20,000 to 50,000. However, the reader will appreciate that this remains a moving target, subject to evolving definitions, increases in technological capabilities and analytical precision, advances in informatics and digital medicine, and more complete genome annotation.

DNA Structure: A Brief Review

Before the organization of the human genome and its chromosomes is considered in detail, it is necessary to review the nature of the DNA that makes up the genome. DNA is a polymeric nucleic acid macromolecule composed of three types of units: a five-carbon sugar, deoxyribose; a nitrogen-containing base; and a phosphate group (Fig. 2-2). The bases are of two types, purines and pyrimidines. In DNA, there are two purine bases, adenine (A) and guanine (G), and two pyrimidine bases, thymine (T) and cytosine (C). Nucleotides, each composed of a base, a phosphate, and a sugar moiety, polymerize into long polynucleotide chains held together by 5′-3′ phosphodiester bonds formed between adjacent deoxyribose units (Fig. 2-3A). In the human genome, these polynucleotide chains exist in the form of a double helix (Fig. 2-3B) that can be hundreds of millions of nucleotides long in the case of the largest human chromosomes.

Figure 2-2 The four bases of DNA and the general structure of a nucleotide in DNA. Each of the four bases bonds with deoxyribose (through the nitrogen shown in magenta) and a phosphate group to form the corresponding nucleotides.
[Panels: purines, adenine (A) and guanine (G); pyrimidines, thymine (T) and cytosine (C); a nucleotide, showing the phosphate, the deoxyribose with its 5′ and 3′ positions, and the base.]

The anatomical structure of DNA carries the chemical information that allows the exact transmission of genetic information from one cell to its daughter cells and from one generation to the next. At the same time, the primary structure of DNA specifies the amino acid sequences of the polypeptide chains of proteins, as described in the next chapter. DNA has elegant features that give it these properties. The native state of DNA, as elucidated by James Watson and Francis Crick in 1953, is a double helix (see Fig. 2-3B). The helical structure resembles a right-handed spiral staircase in which its two polynucleotide chains run in opposite directions, held together by hydrogen bonds between pairs of bases: T of one chain paired with A of the other, and G with C. The specific nature of the genetic information encoded in the human genome lies in the sequence of C's, A's, G's, and T's on the two strands of the double helix along each of the chromosomes, both in the nucleus and in mitochondria (see Fig. 2-1). Because of the complementary nature of the two strands of DNA, knowledge of the sequence of nucleotide bases on one strand automatically allows one to determine the sequence of bases on the other strand. The double-stranded structure of DNA molecules allows them to replicate precisely by separation of the two strands, followed by synthesis of two new complementary strands, in accordance with the sequence of the original template strands (Fig. 2-4). Similarly, when necessary, the base complementarity allows efficient and correct repair of damaged DNA molecules.

Figure 2-3 The structure of DNA. A, A portion of a DNA polynucleotide chain, showing the 3′-5′ phosphodiester bonds that link adjacent nucleotides. B, The double-helix model of DNA, as proposed by Watson and Crick. The horizontal "rungs" represent the paired bases. The helix is said to be right-handed because the strand going from lower left to upper right crosses over the opposite strand. The detailed portion of the figure illustrates the two complementary strands of DNA, showing the AT and GC base pairs. Note that the orientation of the two strands is antiparallel. See Sources & Acknowledgments.
[Labels in the figure: 5′ end, 3′ end, hydrogen bonds; dimensions 3.4 Å, 34 Å, and 20 Å.]

Structure of Human Chromosomes

The composition of genes in the human genome, as well as the determinants of their expression, is specified in the DNA of the 46 human chromosomes in the nucleus plus the mitochondrial chromosome. Each human chromosome consists of a single, continuous DNA double helix; that is, each chromosome is one long, double-stranded DNA molecule, and the nuclear genome consists, therefore, of 46 linear DNA molecules, totaling more than 6 billion nucleotide pairs (see Fig. 2-1).

Chromosomes are not naked DNA double helices, however. Within each cell, the genome is packaged as chromatin, in which genomic DNA is complexed with several classes of specialized proteins. Except during cell division, chromatin is distributed throughout the nucleus and is relatively homogeneous in appearance under the microscope. When a cell divides, however, its genome condenses to appear as microscopically visible chromosomes. Chromosomes are thus visible as discrete structures only in dividing cells, although they retain their integrity between cell divisions.

The DNA molecule of a chromosome exists in chromatin as a complex with a family of basic chromosomal proteins called histones. This fundamental unit interacts with a heterogeneous group of nonhistone proteins, which are involved in establishing a proper spatial and functional environment to ensure normal chromosome behavior and appropriate gene expression.

Five major types of histones play a critical role in the proper packaging of chromatin.
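The scale of the packaging problem can be estimated with simple arithmetic. The sketch below assumes that the 3.4 Å label in Fig. 2-3 is the rise per base pair along the helix axis (the standard value for B-form DNA) and rounds the diploid nuclear genome down to 6 billion base pairs. The constant names are illustrative, and the result is a rough estimate, not a measurement.

```python
# Back-of-the-envelope estimate; constant names are illustrative.
RISE_PER_BP_ANGSTROM = 3.4           # assumed: axial rise per base pair in B-form DNA
DIPLOID_BASE_PAIRS = 6_000_000_000   # "more than 6 billion nucleotide pairs", rounded down
CHROMOSOMES_PER_NUCLEUS = 46
ANGSTROM_PER_METER = 1e10

# Total end-to-end length of all nuclear DNA if fully extended.
total_length_m = DIPLOID_BASE_PAIRS * RISE_PER_BP_ANGSTROM / ANGSTROM_PER_METER

# Average extended length of a single chromosome's DNA molecule.
mean_length_cm = total_length_m / CHROMOSOMES_PER_NUCLEUS * 100

print(f"Total extended length: {total_length_m:.2f} m")
print(f"Mean per chromosome:   {mean_length_cm:.1f} cm")
```

Under these assumptions the DNA of one nucleus would stretch to about 2 meters if laid end to end, which is why it must be compacted to fit inside the nucleus.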
Two copies each of the four core histones H2A, H2B, H3, and H4 constitute an octamer, around which a segment of DNA double helix winds, like thread around a spool (Fig. 2-5). Approximately 140 base pairs (bp) of DNA are associated with each histone core, making just under two turns around the octamer. After a short (20- to 60-bp) "spacer" segment of DNA, the next core DNA complex forms, and so on, giving chromatin the appearance of beads on a string. Each complex of DNA with core histones is called a nucleosome (see Fig. 2-5), which is the basic structural unit of chromatin, and each of the 46 human chromosomes contains several hundred thousand to well over a million nucleosomes. A fifth histone, H1, appears to bind to DNA at the edge of each nucleosome, in the internucleosomal spacer region. The amount of DNA associated with a core nucleosome, together with the spacer region, is approximately 200 bp.

Figure 2-4 Replication of a DNA double helix, resulting in two identical daughter molecules, each composed of one parental strand and one newly synthesized strand.

Figure 2-5 Hierarchical levels of chromatin packaging in a human chromosome.
[Levels shown: double helix (2 nm); nucleosome fiber, "beads on a string" (~10 nm), with ~140 bp of DNA around each histone octamer; solenoid (~30 nm); loops, each containing ~100-200 kb of DNA; portion of an interphase chromosome; interphase nucleus; cell in early interphase.]

In addition to the major histone types, a number of specialized histones can substitute for H3 or H2A and confer specific characteristics on the genomic DNA at that location. Histones can also be modified by chemical changes, and these modifications can change the properties of nucleosomes that contain them. As discussed further in Chapter 3, the pattern of major and specialized histone types and their modifications can vary from cell type to cell type and is thought to specify how DNA is packaged and how accessible it is to regulatory molecules that determine gene expression or other genome functions.

During the cell cycle, as we will see later in this chapter, chromosomes pass through orderly stages of condensation and decondensation. However, even when chromosomes are in their most decondensed state, in a stage of the cell cycle called interphase, DNA packaged in chromatin is substantially more condensed than it would be as a native, protein-free, double helix. Further, the long strings of nucleosomes are themselves compacted into a secondary helical structure, a cylindrical "solenoid" fiber (from the Greek solenoeides, pipe-shaped) that appears to be the fundamental unit of chromatin organization (see Fig. 2-5). The solenoids are themselves packed into loops or domains attached at intervals of approximately 100,000 bp (equivalent to 100 kilobase pairs [kb], because 1 kb = 1000 bp) to a protein scaffold within the nucleus. It has been speculated that these loops are the functional units of the genome and that the attachment points of each loop are specified along the chromosomal DNA. As we shall see, one level of control of gene expression depends on how DNA and genes are packaged into chromosomes and on their association with chromatin proteins in the packaging process.

The enormous amount of genomic DNA packaged into a chromosome can be appreciated when chromosomes are treated to release the DNA from the underlying protein scaffold (see Fig. 2-1). When DNA is released in this manner, long loops of DNA can be visualized, and the residual scaffolding can be seen to reproduce the outline of a typical chromosome.

The Mitochondrial Chromosome

As mentioned earlier, a small but important subset of genes encoded in the human genome resides in the cytoplasm in the mitochondria (see Fig. 2-1). Mitochondrial genes exhibit exclusively maternal inheritance (see Chapter 7). Human cells can have hundreds to thousands of mitochondria, each containing a number of copies of a small circular molecule, the mitochondrial chromosome. The mitochondrial DNA molecule is only 16 kb in length (just a tiny fraction of the length of even the smallest nuclear chromosome) and encodes only 37 genes. The products of these genes function in mitochondria, although the vast majority of proteins within the mitochondria are, in fact, the products of nuclear genes. Mutations in mitochondrial genes have been demonstrated in several maternally inherited as well as sporadic disorders (Case 33) (see Chapters 7 and 12).

The Human Genome Sequence

With a general understanding of the structure and clinical importance of chromosomes and the genes they carry, scientists turned attention to the identification of specific genes and their location in the human genome. From this broad effort emerged the Human Genome Project, an international consortium of hundreds of laboratories around the world, formed to determine and assemble the sequence of the 3.3 billion base pairs of DNA located among the 24 types of human chromosome.

Over the course of a decade and a half, powered by major developments in DNA-sequencing technology, large sequencing centers collaborated to assemble sequences of each chromosome. The genomes actually being sequenced came from several different individuals, and the consensus sequence that resulted at the conclusion of the Human Genome Project was reported in 2003 as a "reference" sequence assembly, to be used as a basis for later comparison with sequences of individual genomes. This reference sequence is maintained in publicly accessible databases to facilitate scientific discovery and its translation into useful advances for medicine. Genome sequences are typically presented in a 5′ to 3′ direction on just one of the two strands of the double helix, because, owing to the complementary nature of DNA structure described earlier, if one knows the sequence of one strand, one can infer the sequence of the other strand (Fig. 2-6).

    Double helix        5′ ... GGATTTCTAGGTAACTCAGTCGA ... 3′
                        3′ ... CCTAAAGATCCATTGAGTCAGCT ... 5′
    Reference sequence     ... GGATTTCTAGGTAACTCAGTCGA ...
    Individual 1           ... GGATTTCTAGGTAACTCAGTCGA ...
    Individual 2           ... GGATTTCCAGGTAACTCAGTCGA ...
    Individual 3           ... GGATTTCCAGGTAACTCAGTCGA ...
    Individual 4           ... GGATTTCTAGGTAACTCAGTAGA ...
    Individual 5           ... GGAT--CTAGGTAACTCAGTCGA ...

Figure 2-6 A portion of the reference human genome sequence. By convention, sequences are presented from one strand of DNA only, because the sequence of the complementary strand can be inferred from the double-stranded nature of DNA (shown above the reference sequence). The sequence of DNA from a group of individuals is similar but not identical to the reference, with single nucleotide changes in some individuals and a small deletion of two bases in another.

Organization of the Human Genome

Chromosomes are not just a random collection of different types of genes and other DNA sequences. Regions of the genome with similar characteristics tend to be clustered together, and the functional organization of the genome reflects its structural organization and sequence. Some chromosome regions, or even whole chromosomes, are high in gene content ("gene rich"), whereas others are low ("gene poor") (Fig. 2-7). The clinical consequences of abnormalities of genome structure reflect the specific nature of the genes and sequences involved. Thus abnormalities of gene-rich chromosomes or chromosomal regions tend to be much more severe clinically than similar-sized defects involving gene-poor parts of the genome.

Figure 2-7 Size and gene content of the 24 human chromosomes. Dotted diagonal line corresponds to the average density of genes in the genome, approximately 6.7 protein-coding genes per megabase (Mb). Chromosomes that are relatively gene rich are above the diagonal and trend to the upper left. Chromosomes that are relatively gene poor are below the diagonal and trend to the lower right. See Sources & Acknowledgments.
[Scatter plot: x-axis, Chromosome Size (Mb), 0 to 250; y-axis, Number of Genes, 0 to 2000; one point per chromosome (1-22, X, Y).]

As a result of knowledge gained from the Human Genome Project, it is apparent that the organization of DNA in the human genome is both more varied and more complex than was once appreciated. Of the billions of base pairs of DNA in any genome, less than 1.5% actually encodes proteins. Regulatory elements that influence or determine patterns of gene expression during development or in tissues were believed to account for only approximately 5% of additional sequence, although more recent analyses of chromatin characteristics suggest that a much higher proportion of the genome may provide signals that are relevant to genome functions. Only approximately half of the total linear length of the genome consists of so-called single-copy or unique DNA, that is, DNA whose linear order of specific nucleotides is represented only once (or at most a few times) around the entire genome. This concept may appear surprising to some, given that there are only four different nucleotides in DNA. But, consider even a tiny stretch of the genome that is only 10 bases long; with four types of bases, there are over a million possible sequences. And, although the order of bases in the genome is not entirely random, any particular 16-base sequence would be predicted by chance alone to appear only once in any given genome.

The rest of the genome consists of several classes of repetitive DNA and includes DNA whose nucleotide sequence is repeated, either perfectly or with some variation, hundreds to millions of times in the genome. Whereas most (but not all) of the estimated 20,000 protein-coding genes in the genome (see Box earlier in this chapter) are represented in single-copy DNA, sequences in the repetitive DNA fraction contribute to maintaining chromosome structure and are an important source of variation between different individuals; some of this variation can predispose to pathological events in the genome, as we will see in Chapters 5 and 6.

Single-Copy DNA Sequences

Although single-copy DNA makes up at least half of the DNA in the genome, much of its function remains a mystery because, as mentioned, sequences actually encoding proteins (i.e., the coding portion of genes) constitute only a small proportion of all the single-copy DNA. Most single-copy DNA is found in short stretches (several kilobase pairs or less), interspersed with members of various repetitive DNA families. The organization of genes in single-copy DNA is addressed in depth in Chapter 3.

Repetitive DNA Sequences

Several different categories of repetitive DNA are recognized. A useful distinguishing feature is whether the repeated sequences ("repeats") are clustered in one or a few locations or whether they are interspersed with single-copy sequences along the chromosome. Clustered repeated sequences constitute an estimated 10% to 15% of the genome and consist of arrays of various short repeats organized in tandem in a head-to-tail fashion. The different types of such tandem repeats are collectively called satellite DNAs, so named because many of the original tandem repeat families could be separated by biochemical methods from the bulk of the genome as distinct ("satellite") fractions of DNA.

Tandem repeat families vary with regard to their location in the genome and the nature of sequences that make up the array. In general, such arrays can stretch several million base pairs or more in length and constitute up to several percent of the DNA content of an individual human chromosome. Some tandem repeat sequences are important as tools that are useful in clinical cytogenetic analysis (see Chapter 5). Long arrays of repeats based on repetitions (with some variation) of a short sequence such as a pentanucleotide are found in large genetically inert regions on chromosomes 1, 9, and 16 and make up more than half of the Y chromosome (see Chapter 6). Other tandem repeat families are based on somewhat longer basic repeats. For example, the α-satellite family of DNA is composed of tandem arrays of an approximately 171-bp unit, found at the centromere of each human chromosome, which is critical for attachment of chromosomes to microtubules of the spindle apparatus during cell division.

In addition to tandem repeat DNAs, another major class of repetitive DNA in the genome consists of related sequences that are dispersed throughout the genome rather than clustered in one or a few locations. Although many DNA families meet this general description, two in particular warrant discussion because together they make up a significant proportion of the genome and because they have been implicated in genetic diseases. Among the best-studied dispersed repetitive elements are those belonging to the so-called Alu family. The members of this family are approximately 300 bp in length and are related to each other although not identical in DNA sequence. In total, there are more than a million Alu family members in the genome, making up at least 10% of human DNA. A second major dispersed repetitive DNA family is called the long interspersed nuclear element (LINE, sometimes called L1) family. LINEs are up to 6 kb in length and are found in approximately 850,000 copies per genome, accounting for nearly 20% of the genome. Both of these families are plentiful in some regions of the genome but relatively sparse in others: regions rich in GC content tend to be enriched in Alu elements but depleted of LINE sequences, whereas the opposite is true of more AT-rich regions of the genome.

Repetitive DNA and Disease. Both Alu and LINE sequences have been implicated as the cause of mutations in hereditary disease. At least a few copies of the LINE and Alu families generate copies of themselves that can integrate elsewhere in the genome, occasionally causing insertional inactivation of a medically important gene. The frequency of such events causing genetic disease in humans is unknown, but they may account for as many as 1 in 500 mutations. In addition, aberrant recombination events between different LINE repeats or Alu repeats can also be a cause of mutation in some genetic diseases (see Chapter 12).

An important additional type of repetitive DNA found in many different locations around the genome includes sequences that are duplicated, often with extraordinarily high sequence conservation. Duplications involving substantial segments of a chromosome, called segmental duplications, can span hundreds of kilobase pairs and account for at least 5% of the genome. When the duplicated regions contain genes, genomic rearrangements involving the duplicated sequences can result in the deletion of the region (and the genes) between the copies and thus give rise to disease (see Chapters 5 and 6).

VARIATION IN THE HUMAN GENOME

With completion of the reference human genome sequence, much attention has turned to the discovery and cataloguing of variation in sequence among different individuals (including both healthy individuals and those with various diseases) and among different populations around the globe. As we will explore in much more detail in Chapter 4, there are many tens of millions of common sequence variants that are seen at significant frequency in one or more populations; any given individual carries at least 5 million of these sequence variants. In addition, there are countless very rare variants, many of which probably exist in only a single or a few individuals. In fact, given the number of individuals in [...] in future chapters, any and all of these types of variation can influence biological function and thus must be accounted for in any attempt to understand the contribution of genetics to human health.

TRANSMISSION OF THE GENOME

The chromosomal basis of heredity lies in the copying of the genome and its transmission from a cell to its progeny during typical cell division and from one generation to the next during reproduction, when single copies of the genome from each parent come together in a new embryo.

To achieve these related but distinct forms of genome inheritance, there are two kinds of cell division, mitosis and meiosis. Mitosis is ordinary somatic cell division by which the body grows, differentiates, and effects tissue regeneration. Mitotic division normally results in two daughter cells, each with chromosomes and genes identical to those of the parent cell. There may be dozens or even hundreds of successive mitoses in a lineage of somatic cells. In contrast, meiosis occurs only in cells of the germline. Meiosis results in the formation of reproductive cells (gametes), each of which has only 23 chromosomes: one of each kind of autosome and either an X or a Y. Thus, whereas somatic cells have the diploid (diploos, double) or the 2n chromosome complement (i.e., 46 chromosomes), gametes have the haploid (haploos, single) or the n complement (i.e., 23 chromosomes). Abnormalities of chromosome number or structure, which are usually clinically significant, can arise either in somatic cells or in cells of the germline by errors in cell division.
our species, essentially each and every base pair in the human genome is expected to vary in someone some- where around the globe. It is or this reason that the The Cell Cycle original human genome sequence is considered a “reer- A human being begins lie as a ertilized ovum (zygote), ence” sequence or our species, but one that is actually a diploid cell rom which all the cells o the body (esti- identical to no individual’s genome. mated to be approximately 100 trillion in number) are Early estimates were that any two randomly selected derived by a series o dozens or even hundreds o individuals would have sequences that are 99.9% iden- mitoses. Mitosis is obviously crucial or growth and tical or, put another way, that an individual genome dierentiation, but it takes up only a small part o the would carry two different versions (alleles) o the human lie cycle o a cell. The period between two successive genome sequence at some 3 to 5 million positions, with mitoses is called interphase, the state in which most o dierent bases (e.g., a T or a G) at the maternally and the lie o a cell is spent. paternally inherited copies o that particular sequence Immediately ater mitosis, the cell enters a phase, position (see Fig. 2-6). Although many o these allelic called G1, in which there is no DNA synthesis (Fig. 2-8). dierences involve simply one nucleotide, much o the Some cells pass through this stage in hours; others spend variation consists o insertions or deletions o (usually) a long time, days or years, in G1. In act, some cell types, short sequence stretches, variation in the number o such as neurons and red blood cells, do not divide at all copies o repeated elements (including genes), or inver- once they are ully dierentiated; rather, they are per- sions in the order o sequences at a particular position manently arrested in a distinct phase known as G0 (“G (locus) in the genome (see Chapter 4). zero”). 
Other cells, such as liver cells, may enter G0 but, The total amount o the genome involved in such ater organ damage, return to G1 and continue through variation is now known to be substantially more than the cell cycle. originally estimated and approaches 0.5% between any The cell cycle is governed by a series o checkpoints two randomly selected individuals. As will be addressed that determine the timing o each step in mitosis. In 12 THOMPSON & THOMPSON GENETICS IN MEDICINE integrity is illustrated by a range o clinical conditions that result rom deects in elements o the telomere or kinetochore or cell cycle machinery or rom inaccurate replication o even small portions o the genome (see Box). Some o these conditions will be presented in G1 Telomere greater detail in subsequent chapters. (10-12 hr) Centromere M G2 S CLINICAL CONSEQUENCES OF ABNORMALITIES (2-4 hr) (6-8 hr) Telomere AND VARIATION IN CHROMOSOME STRUCTURE AND MECHANICS Medically relevant conditions arising rom abnormal structure or unction o chromosomal elements during cell division include the ollowing: Sister chromatids A broad spectrum o congenital abnormalities in chil- dren with inherited deects in genes encoding key Figure 2-8 A typical mitotic cell cycle, described in the text. The components o the mitotic spindle checkpoint at the telomeres, the centromere, and sister chromatids are indicated. kinetochore A range o birth defects and developmental disorders addition, checkpoints monitor and control the accuracy due to anomalous segregation o chromosomes with o DNA synthesis as well as the assembly and attach- multiple or missing centromeres (see Chapter 6) A variety o cancers associated with overreplication ment o an elaborate network o microtubules that (amplifcation) or altered timing o replication o spe- acilitate chromosome movement. 
I damage to the cifc regions o the genome in S phase (see Chapter 15) genome is detected, these mitotic checkpoints halt cell Roberts syndrome o growth retardation, limb short- cycle progression until repairs are made or, i the damage ening, and microcephaly in children with abnormali- is excessive, until the cell is instructed to die by pro- ties o a gene required or proper sister chromatid alignment and cohesion in S phase grammed cell death (a process called apoptosis). Premature ovarian failure as a major cause o emale During G1, each cell contains one diploid copy o the inertility due to mutation in a meiosis-specifc gene genome. As the process o cell division begins, the cell required or correct sister chromatid cohesion enters S phase, the stage o programmed DNA synthesis, The so-called telomere syndromes, a number o degen- ultimately leading to the precise replication o each erative disorders presenting rom childhood to adult- hood in patients with abnormal telomere shortening chromosome’s DNA. During this stage, each chromo- due to deects in components o telomerase some, which in G1 has been a single DNA molecule, is And, at the other end o the spectrum, common gene duplicated and consists o two sister chromatids (see variants that correlate with the number o copies o Fig. 2-8), each o which contains an identical copy o the repeats at telomeres and with lie expectancy and the original linear DNA double helix. The two sister longevity chromatids are held together physically at the centro- mere, a region o DNA that associates with a number o specifc proteins to orm the kinetochore. This com- plex structure serves to attach each chromosome to By the end o S phase, the DNA content o the cell the microtubules o the mitotic spindle and to govern has doubled, and each cell now contains two copies o chromosome movement during mitosis. DNA synthesis the diploid genome. 
Ater S phase, the cell enters a brie during S phase is not synchronous throughout all chro- stage called G2. Throughout the whole cell cycle, the cell mosomes or even within a single chromosome; rather, gradually enlarges, eventually doubling its total mass along each chromosome, it begins at hundreds to beore the next mitosis. G2 is ended by mitosis, which thousands o sites, called origins of DNA replication. begins when individual chromosomes begin to condense Individual chromosome segments have their own char- and become visible under the microscope as thin, acteristic time o replication during the 6- to 8-hour S extended threads, a process that is considered in greater phase. The ends o each chromosome (or chromatid) are detail in the ollowing section. marked by telomeres, which consist o specialized repet- The G1, S, and G2 phases together constitute inter- itive DNA sequences that ensure the integrity o the phase. In typical dividing human cells, the three phases chromosome during cell division. Correct maintenance take a total o 16 to 24 hours, whereas mitosis lasts only o the ends o chromosomes requires a special enzyme 1 to 2 hours (see Fig. 2-8). There is great variation, called telomerase, which ensures that the very ends o however, in the length o the cell cycle, which ranges each chromosome are replicated. rom a ew hours in rapidly dividing cells, such as those The essential nature o these structural elements o o the dermis o the skin or the intestinal mucosa, to chromosomes and their role in ensuring genome months in other cell types. CHAPTER 2 — INTRODUCTION TO THE HUMAN GENOME 13 now become independent daughter chromosomes, Mitosis which move to opposite poles o the cell. During the mitotic phase o the cell cycle, an elaborate Telophase. Now, the chromosomes begin to decon- apparatus ensures that each o the two daughter cells dense rom their highly contracted state, and a receives a complete set o genetic inormation. 
This nuclear membrane begins to re-orm around each o result is achieved by a mechanism that distributes one the two daughter nuclei, which resume their inter- chromatid o each chromosome to each daughter cell phase appearance. To complete the process o cell (Fig. 2-9). The process o distributing a copy o each division, the cytoplasm cleaves by a process known chromosome to each daughter cell is called chromosome as cytokinesis. segregation. The importance o this process or normal There is an important dierence between a cell enter- cell growth is illustrated by the observation that many ing mitosis and one that has just completed the process. tumors are invariably characterized by a state o genetic A cell in G2 has a ully replicated genome (i.e., a 4n imbalance resulting rom mitotic errors in the distribu- complement o DNA), and each chromosome consists tion o chromosomes to daughter cells. o a pair o sister chromatids. In contrast, ater mitosis, The process o mitosis is continuous, but fve stages, the chromosomes o each daughter cell have only one illustrated in Figure 2-9, are distinguished: prophase, copy o the genome. This copy will not be duplicated prometaphase, metaphase, anaphase, and telophase. until a daughter cell in its turn reaches the S phase o Prophase. This stage is marked by gradual condensa- the next cell cycle (see Fig. 2-8). The entire process o tion o the chromosomes, ormation o the mitotic mitosis thus ensures the orderly duplication and distri- spindle, and ormation o a pair o centrosomes, rom bution o the genome through successive cell divisions. which microtubules radiate and eventually take up positions at the poles o the cell. Prometaphase. Here, the nuclear membrane dis- The Human Karyotype solves, allowing the chromosomes to disperse within The condensed chromosomes o a dividing human cell the cell and to attach, by their kinetochores, to micro- are most readily analyzed at metaphase or prometa- tubules o the mitotic spindle. phase. 
At these stages, the chromosomes are visible Metaphase. At this stage, the chromosomes are maxi- under the microscope as a so-called chromosome spread; mally condensed and line up at the equatorial plane each chromosome consists o its sister chromatids, o the cell. although in most chromosome preparations, the two Anaphase. The chromosomes separate at the centro- chromatids are held together so tightly that they are mere, and the sister chromatids o each chromosome rarely visible as separate entities. Cell in G2 Onset of mitosis S phase Centrosomes Interphase Cells in G1 Decondensed chromatin Prophase Cytokinesis Telophase Prometaphase Anaphase Metaphase Microtubules Figure 2-9 Mitosis. Only two chromosome pairs are shown. For details, see text. 14 THOMPSON & THOMPSON GENETICS IN MEDICINE (“the human karyotype”) and, as a verb, to the process o preparing such a standard fgure (“to karyotype”). Unlike the chromosomes seen in stained preparations under the microscope or in photographs, the chromo- somes o living cells are uid and dynamic structures. During mitosis, the chromatin o each interphase chro- mosome condenses substantially (Fig. 2-12). When maximally condensed at metaphase, DNA in chromo- somes is approximately 1/10,000 o its ully extended state. When chromosomes are prepared to reveal bands (as in Figs. 2-10 and 2-11), as many as 1000 or more bands can be recognized in stained preparations o all the chromosomes. Each cytogenetic band thereore con- tains as many as 50 or more genes, although the density o genes in the genome, as mentioned previously, is variable. Meiosis Meiosis, the process by which diploid cells give rise to Figure 2-10 A chromosome spread prepared from a lymphocyte haploid gametes, involves a type o cell division that is culture that has been stained by the Giemsa-banding (G-banding) unique to germ cells. In contrast to mitosis, meiosis technique. 
The darkly stained nucleus adjacent to the chromo- consists o one round o DNA replication ollowed by somes is rom a dierent cell in interphase, when chromosomal material is diuse throughout the nucleus. See Sources & two rounds o chromosome segregation and cell divi- Acknowledgments. sion (see meiosis I and meiosis II in Fig. 2-13). As out- lined here and illustrated in Figure 2-14, the overall As stated earlier, there are 24 dierent types o human sequence o events in male and emale meiosis is the chromosome, each o which can be distinguished cyto- same; however, the timing o gametogenesis is very di- logically by a combination o overall length, location o erent in the two sexes, as we will describe more ully the centromere, and sequence content, the latter reected later in this chapter. by various staining methods. The centromere is appar- Meiosis I is also known as the reduction division ent as a primary constriction, a narrowing or pinching-in because it is the division in which the chromosome o the sister chromatids due to ormation o the kineto- number is reduced by hal through the pairing o homo- chore. This is a recognizable cytogenetic landmark, logues in prophase and by their segregation to dierent dividing the chromosome into two arms, a short arm cells at anaphase o meiosis I. Meiosis I is also notable designated p (or petit) and a long arm designated q. because it is the stage at which genetic recombination Figure 2-10 shows a prometaphase cell in which the (also called meiotic crossing over) occurs. In this process, chromosomes have been stained by the Giemsa-staining as shown or one pair o chromosomes in Figure 2-14, (G-banding) method (also see Chapter 5). 
Each chromo- homologous segments o DNA are exchanged between some pair stains in a characteristic pattern o alternating nonsister chromatids o each pair o homologous chro- light and dark bands (G bands) that correlates roughly mosomes, thus ensuring that none o the gametes pro- with eatures o the underlying DNA sequence, such as duced by meiosis will be identical to another. The base composition (i.e., the percentage o base pairs that conceptual and practical consequences o recombina- are GC or AT) and the distribution o repetitive DNA tion or many aspects o human genetics and genomics elements. With such banding techniques, all o the chro- are substantial and are outlined in the Box at the end mosomes can be individually distinguished, and the o this section. nature o many structural or numerical abnormalities Prophase o meiosis I diers in a number o ways can be determined, as we examine in greater detail in rom mitotic prophase, with important genetic conse- Chapters 5 and 6. quences, because homologous chromosomes need to Although experts can oten analyze metaphase chro- pair and exchange genetic inormation. The most criti- mosomes directly under the microscope, a common pro- cal early stage is called zygotene, when homologous cedure is to cut out the chromosomes rom a digital chromosomes begin to align along their entire length. image or photomicrograph and arrange them in pairs in The process o meiotic pairing—called synapsis—is nor- a standard classifcation (Fig. 2-11). The completed mally precise, bringing corresponding DNA sequences picture is called a karyotype. The word karyotype is also into alignment along the length o the entire chromosome used to reer to the standard chromosome set o an pair. The paired homologues—now called bivalents— individual (“a normal male karyotype”) or o a species are held together by a ribbon-like proteinaceous structure Figure 2-11 A human male karyotype with Giemsa banding (G banding). 
The chromosomes are at the prometaphase stage o mitosis and are arranged in a standard classifcation, numbered 1 to 22 in order o length, with the X and Y chromosomes shown separately. See Sources & Acknowledgments. Metaphase Decondensation as cell returns to interphase Interphase nucleus Decondensed Prophase chromatin Condensation as mitosis begins Figure 2-12 Cycle o condensation and decondensation as a chromosome proceeds through the cell cycle. 16 THOMPSON & THOMPSON GENETICS IN MEDICINE Chromosome replication Interphase Prophase I Meiosis I Meiosis I Meiosis II Metaphase I Four haploid gametes Figure 2-13 A simplifed representation o the essential steps in Anaphase I meiosis, consisting o one round o DNA replication ollowed by two rounds o chromosome segregation, meiosis I and meiosis II. called the synaptonemal complex, which is essential to the process o recombination. Ater synapsis is com- plete, meiotic crossing over takes place during pachy- tene, ater which the synaptonemal complex breaks down. Interphase Metaphase I begins, as in mitosis, when the nuclear membrane disappears. A spindle orms, and the paired chromosomes align themselves on the equatorial plane with their centromeres oriented toward dierent poles (see Fig. 2-14). Anaphase o meiosis I again diers substantially rom the corresponding stage o mitosis. Here, it is the two members o each bivalent that move apart, not the sister Meiosis II chromatids (contrast Fig. 2-14 with Fig. 2-9). The homologous centromeres (with their attached sister Metaphase II Figure 2-14 Meiosis and its consequences. A single chromosome pair and a single crossover are shown, leading to ormation o our distinct gametes. The chromosomes replicate during inter- phase and begin to condense as the cell enters prophase o meiosis I. In meiosis I, the chromosomes synapse and recombine. A cross- Anaphase II over is visible as the homologues align at metaphase I, with the centromeres oriented toward opposite poles. 
In anaphase I, the exchange o DNA between the homologues is apparent as the chromosomes are pulled to opposite poles. Ater completion o meiosis I and cytokinesis, meiosis II proceeds with a mitosis-like division. The sister kinetochores separate and move to opposite Gametes poles in anaphase II, yielding our haploid products. CHAPTER 2 — INTRODUCTION TO THE HUMAN GENOME 17 chromatids) are drawn to opposite poles o the cell, a Grandpaternal Grandmaternal process termed disjunction. Thus the chromosome DNA sequences DNA sequences number is halved, and each cellular product o meiosis I has the haploid chromosome number. The 23 pairs o homologous chromosomes assort independently o one another, and as a result, the original paternal and mater- nal chromosome sets are sorted into random combina- Paternal tions. The possible number o combinations o the 23 chromosomes chromosome pairs that can be present in the gametes is 223 (more than 8 million). Owing to the process o cross- ing over, however, the variation in the genetic material that is transmitted rom parent to child is actually much GENETIC CONSEQUENCES AND MEDICAL RELEVANCE OF HOMOLOGOUS RECOMBINATION The take-home lesson o this portion o the chapter is a simple one: the genetic content of each gamete is unique, because o random assortment o the parental chromo- somes to shue the combination o sequence variants between chromosomes and because o homologous recombination to shue the combination o sequence variants within each and every chromosome. This has signifcant consequences or patterns o genomic varia- tion among and between dierent populations around the globe and or diagnosis and counseling o many common conditions with complex patterns o inheritance (see Chapters 8 and 10). 
The amounts and patterns of meiotic recombination are determined by sequence variants in specifc genes and Paternal chromosome Paternal chromosome at specifc “hot spots” and dier between individuals, inherited by Child 1 inherited by Child 2 between the sexes, between amilies, and between popula- tions (see Chapter 10). Figure 2-15 The effect of homologous recombination in meiosis. Because recombination involves the physical inter- In this example, representing the inheritance o sequences on a twining o the two homologues until the appropriate typical large chromosome, an individual has distinctive homo- point during meiosis I, it is also critical or ensuring logues, one containing sequences inherited rom his ather (blue) proper chromosome segregation during meiosis. Failure and one containing homologous sequences rom his mother to recombine properly can lead to chromosome misseg- (purple). Ater meiosis in spermatogenesis, he transmits a single regation (nondisjunction) in meiosis I and is a requent complete copy o that chromosome to his two ospring. However, cause o pregnancy loss and o chromosome abnormali- as a result o crossing over (arrows), the copy he transmits to each ties like Down syndrome (see Chapters 5 and 6). child consists o alternating segments o the two grandparental Major ongoing eorts to identify genes and their vari- sequences. Child 1 inherits a copy ater two crossovers, whereas ants responsible for various medical conditions rely on child 2 inherits a copy with three crossovers. tracking the inheritance o millions o sequence dier- ences within amilies or the sharing o variants within groups o even unrelated individuals aected with a par- ticular condition. The utility o this approach, which has uncovered thousands o gene-disease associations to date, greater than this. As a result, each chromatid typically depends on patterns o homologous recombination in contains segments derived rom each member o the meiosis (see Chapter 10). 
Although homologous recombination is normally original parental chromosome pair, as illustrated sche- precise, areas o repetitive DNA in the genome and genes matically in Figure 2-14. For example, at this stage, a o variable copy number in the population are prone to typical large human chromosome would be composed occasional unequal crossing over during meiosis, leading o three to fve segments, alternately paternal and to variations in clinically relevant traits such as drug maternal in origin, as inerred rom DNA sequence vari- response, to common disorders such as the thalassemias or autism, or to abnormalities o sexual dierentiation ants that distinguish the respective parental genomes (see Chapters 6, 8, and 11). (Fig. 2-15). Although homologous recombination is a normal and Ater telophase o meiosis I, the two haploid daugh- essential part o meiosis, it also occurs, albeit more rarely, ter cells enter meiotic interphase. In contrast to mitosis, in somatic cells. Anomalies in somatic recombination are this interphase is brie, and meiosis II begins. The one o the causes o genome instability in cancer (see Chapter 15). notable point that distinguishes meiotic and mitotic interphase is that there is no S phase (i.e., no DNA 18 THOMPSON & THOMPSON GENETICS IN MEDICINE synthesis and duplication o the genome) between the frst and second meiotic divisions. Meiosis II is similar to an ordinary mitosis, except that the chromosome number is 23 instead o 46; the chromatids o each o the 23 chromosomes separate, and one chromatid o each chromosome passes to each daughter cell (see Fig. 2-14). However, as mentioned Testis earlier, because o crossing over in meiosis I, the chro- mosomes o the resulting gametes are not identical (see Fig. 2-15). 
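
The count of chromosome combinations quoted for meiosis I (2^23, more than 8 million) follows directly from independent assortment: each of the 23 homologous pairs contributes either its maternal or its paternal member to a gamete. The short Python sketch below is illustrative only and is not taken from the text. The function names are hypothetical, each homologue is reduced to a "maternal" or "paternal" label, and crossing over is deliberately ignored, so the result is a lower bound on gamete diversity rather than a model of meiosis.

```python
import random


def assortment_combinations(n_pairs: int = 23) -> int:
    """Count maternal/paternal combinations for n_pairs homologous pairs.

    Each pair independently contributes one of two homologues,
    so the total is 2 multiplied by itself n_pairs times.
    """
    return 2 ** n_pairs


def random_gamete(n_pairs: int = 23, rng=None):
    """Pick one homologue per pair at random (crossing over is ignored)."""
    if rng is None:
        rng = random.Random()
    return [rng.choice(("maternal", "paternal")) for _ in range(n_pairs)]


if __name__ == "__main__":
    # 2**23 = 8,388,608, matching "more than 8 million" in the text.
    print(assortment_combinations())
    # One simulated gamete: a list of 23 parental-origin labels.
    print(random_gamete(rng=random.Random(0)))
```

Because recombination further subdivides each chromosome into segments of alternating parental origin, the real number of distinguishable gametes is far larger than this count, as the text notes.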
HUMAN GAMETOGENESIS AND FERTILIZATION

The cells in the germline that undergo meiosis, primary spermatocytes or primary oocytes, are derived from the zygote by a long series of mitoses before the onset of meiosis. Male and female gametes have different histories, marked by different patterns of gene expression that reflect their developmental origin as an XY or XX embryo. The human primordial germ cells are recognizable by the fourth week of development outside the embryo proper, in the endoderm of the yolk sac. From there, they migrate during the sixth week to the genital ridges and associate with somatic cells to form the primitive gonads, which soon differentiate into testes or ovaries, depending on the cells’ sex chromosome constitution (XY or XX), as we examine in greater detail in Chapter 6. Both spermatogenesis and oogenesis require meiosis but have important differences in detail and timing that may have clinical and genetic consequences for the offspring. Female meiosis is initiated once, early during fetal life, in a limited number of cells. In contrast, male meiosis is initiated continuously in many cells from a dividing cell population throughout the adult life of a male.

In the female, successive stages of meiosis take place over several decades: in the fetal ovary before the female in question is even born, in the oocyte near the time of ovulation in the sexually mature female, and after fertilization of the egg that can become that female’s offspring. Although postfertilization stages can be studied in vitro, access to the earlier stages is limited. Testicular material for the study of male meiosis is less difficult to obtain, inasmuch as testicular biopsy is included in the assessment of many men attending infertility clinics. Much remains to be learned about the cytogenetic, biochemical, and molecular mechanisms involved in normal meiosis and about the causes and consequences of meiotic irregularities.

Spermatogenesis

The stages of spermatogenesis are shown in Figure 2-16. The seminiferous tubules of the testes are lined with spermatogonia, which develop from the primordial germ cells by a long series of mitoses and which are in different stages of differentiation. Sperm (spermatozoa) are formed only after sexual maturity is reached. The last cell type in the developmental sequence is the primary spermatocyte, a diploid germ cell that undergoes meiosis I to form two haploid secondary spermatocytes. Secondary spermatocytes rapidly enter meiosis II, each forming two spermatids, which differentiate without further division into sperm. In humans, the entire process takes approximately 64 days. The enormous number of sperm produced, typically approximately 200 million per ejaculate and an estimated 10^12 in a lifetime, requires several hundred successive mitoses.

Figure 2-16 Human spermatogenesis in relation to the two meiotic divisions. The sequence of events begins at puberty and takes approximately 64 days to be completed. The chromosome number (46 or 23) and the sex chromosome constitution (X or Y) of each cell are shown. See Sources & Acknowledgments. [Figure labels: spermatogonium (46,XY); primary spermatocyte (46,XY); meiosis I; secondary spermatocytes (23,X and 23,Y); meiosis II; spermatids (23,X; 23,X; 23,Y; 23,Y).]

As discussed earlier, normal meiosis requires pairing of homologous chromosomes followed by recombination. The autosomes and the X chromosomes in females present no unusual difficulties in this regard; but what of the X and Y chromosomes during spermatogenesis? Although the X and Y chromosomes are different and are not homologues in a strict sense, they do have relatively short identical segments at the ends of their respective short arms (Xp and Yp) and long arms (Xq and Yq) (see Chapter 6).
Pairing and crossing over 1st polar body occurs in both regions during meiosis I. These homolo- gous segments are called pseudoautosomal to reect their autosome-like pairing and recombination behav- ior, despite being on dierent sex chromosomes. Ovulation Meiosis II Oogenesis Whereas spermatogenesis is initiated only at the time o puberty, oogenesis begins during a emale’s development as a etus (Fig. 2-17). The ova develop rom oogonia, cells in the ovarian cortex that have descended rom the primordial germ cells by a series o approximately 20 mitoses. Each oogonium is the central cell in a develop- ing ollicle. By approximately the third month o etal Fertilization development, the oogonia o the embryo have begun to develop into primary oocytes, most o which have already entered prophase o meiosis I. The process o 2nd polar body oogenesis is not synchronized, and both early and late stages coexist in the etal ovary. Although there are several million oocytes at the time o birth, most o these Sperm degenerate; the others remain arrested in prophase I (see Fig. 2-14) or decades. Only approximately 400 eventu- ally mature and are ovulated as part o a woman’s menstrual cycle. Ater a woman reaches sexual matur