chapters 12 & 13.docx
Document Details
Uploaded by CompactConsonance
Full Transcript
Chapter 12

Nucleic acids carry genetic information that codes for protein structure. Primary characteristics of nucleic acids, such as nucleic acid complementarity and melting temperature, form the basis of specificity for almost all nucleic acid--based tests. Nucleic acids can be detected earlier than antibodies during the course of an illness.

DNA and RNA

The two main kinds of nucleic acids are DNA and RNA. DNA carries the primary genetic information within chromosomes found in each cell. There are different types of RNA, including messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), and noncoding RNA. DNA and RNA are macromolecules of nucleotides. A nucleotide is composed of a phosphorylated deoxyribose or ribose sugar and a nitrogen base. There are five nitrogen bases that make up the majority of nucleic acids found in nature: adenine (A), cytosine (C), guanine (G), thymine (T; found in DNA), and uracil (U; found only in RNA).

In DNA, the nitrogen base of a nucleotide, either guanine, adenine, cytosine, or thymine, is attached to the 1′ carbon of the deoxyribose sugar. The deoxyribose 5′ carbon may be bound to one, two, or three phosphate groups. The deoxyribose 3′ carbon carries a hydroxyl group (OH).

Fig. 12--2 Nucleotide structure. Nitrogen base ring positions are numbered ordinally, and the ribose ring positions are numbered with prime numbers.

Nitrogen bases are attached to a ribose sugar in RNA. RNA contains adenine, cytosine, and guanine but has uracil nucleotides in place of the thymines found in DNA. Unlike deoxyribonucleotides, which are hydroxylated only on the 3′ carbon, the phosphorylated ribose sugar in RNA carries hydroxyl groups on both the 2′ and 3′ carbons.

Substituted Nucleotides

Natural modifications of the nucleotide structure include methylation, deamination, additions, substitutions, and other chemical modifications. These modifications may be enzymatically catalyzed in the cell or may occur as spontaneous reactions.
Nucleotide modifications can also result in nucleotides with new properties. Addition of methyl groups (-CH3) to DNA, and their removal, affect gene function. Nucleotide base modifications are also caused by environmental insults such as chemicals or radiation. Gram-negative bacteria use modified nucleotides in a type of immune system, the restriction modification (rm) system. The bacterium adds methyl groups to its own DNA to distinguish it from that of invaders, such as bacterial viruses. Recognizing its own DNA in this way, the bacterium can target the invader's DNA for enzymatic degradation.

The Nucleic Acid Polymer

Nucleotides are polymerized into nucleic acids by attachment of the 3′ hydroxyl group on the deoxyribose or ribose sugar of one nucleotide to the 5′ phosphate group of the adjacent nucleotide, forming a phosphodiester bond. Because each phosphodiester bond joins a 3′ hydroxyl group to a 5′ phosphate group, a DNA strand has polarity, that is, a 5′ phosphate end and a 3′ hydroxyl end. Sequences are by convention written in the 5′ to 3′ direction. In the double helix, complementary strands hydrogen-bond together in an antiparallel arrangement, with the 5′ phosphates of the two strands at opposite ends of the helix.

The two chains (strands) of the DNA double helix are held together by hydrogen bonds between their nucleotide bases. Guanine (G) and cytosine (C) are complementary; that is, they will only hydrogen-bond with each other. Adenine (A) and thymine (T) are complementary to each other as well. G pairs with C by three hydrogen bonds, and A pairs with T by two hydrogen bonds. Two bases joined together in this way are called a base pair (bp). The length of a double-stranded DNA macromolecule is measured in bp. The length of a single strand of RNA (or DNA) is measured in bases (b).
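The pairing rules above (G with C by three hydrogen bonds, A with T by two, strands antiparallel and written 5′ to 3′) can be illustrated with a minimal Python sketch. The function names `reverse_complement` and `hydrogen_bonds` are illustrative choices, not terms from the chapter, and the sketch assumes an unmodified DNA sequence containing only A, C, G, and T.

```python
# Watson-Crick pairing rules from the text: G pairs with C (3 hydrogen bonds),
# A pairs with T (2 hydrogen bonds). Names below are illustrative only.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5' to 3' by convention.

    The strands are antiparallel, so the complement is read in reverse.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds holding this strand to its full complement."""
    return sum(H_BONDS[base] for base in strand.upper())


if __name__ == "__main__":
    top = "ATGC"
    print("5'-" + top + "-3'")
    print("5'-" + reverse_complement(top) + "-3'  (complementary strand)")
    print(hydrogen_bonds(top), "hydrogen bonds in this", len(top), "bp duplex")
```

A G/C-rich duplex of the same length carries more hydrogen bonds than an A/T-rich one, which is one reason base composition influences the melting temperature mentioned at the start of the chapter.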
Metric prefixes are used to describe long strands of DNA or RNA; for example, 1,000 bp or b make up a kilobase pair (kbp) or kilobase (kb), respectively. One million bp or b make up a megabase pair (Mbp) or megabase (Mb), respectively. Most microorganisms contain one double helix, usually in circular form and a few Mbp in size. Viruses may carry a double- or single-stranded DNA or RNA molecule.

DNA Replication

Chromosomes in bacteria carry a defined sequence of nucleotides, called the origin of replication, where DNA replication begins. Replication proceeds through the chromosome, followed by binary fission of the bacterial cell. Some bacteria, such as Escherichia coli and Bacillus subtilis, have replication at two, four, or eight origins, depending on the growth rate, which allows for shorter cell doubling times in rich growth environments.

In eukaryotic cells, the cell cycle consists of four stages: G1, S, G2, and M (Fig. 12--6). DNA replication takes place during the S phase of the cell cycle. At the end of the S phase, the DNA complement of the cell is doubled; the cell is then in the G2 phase. One complement of chromosomes is divided into each of two daughter cells during the M phase. Each daughter cell will then be in the G1 phase.

Replication of DNA is semiconservative; that is, the two strands of the DNA duplex are separated, and each single strand serves as a template for a newly synthesized complementary strand. DNA replication proceeds with the formation of phosphodiester bonds between the 5′ phosphate of an incoming nucleotide and the 3′ hydroxyl group of the previously added nucleotide. This reaction is catalyzed by a DNA polymerase enzyme. The parental template strand is read in the 3′ to 5′ direction, whereas synthesis proceeds from 5′ to 3′, making the completely replicated strands antiparallel. DNA synthesis cannot begin without a preexisting 3′ hydroxyl group. To begin synthesis in vivo, a primer of RNA is synthesized by an RNA polymerase (primase) enzyme.
The requirement for DNA synthesis to read the template strand in a 3′ to 5′ direction is not consistent with copying of both strands simultaneously in the same direction. To accommodate this arrangement, one strand, termed the lagging strand, is copied discontinuously in short fragments, each synthesized in the direction opposite to the movement of the replication fork, whereas the other strand, called the leading strand, is copied continuously in the direction of replication (see Fig. 12--7).

RNA Synthesis

RNA synthesis is catalyzed by RNA polymerase, which begins polymerization of RNA by binding to its recognition start site in DNA (promoter). RNA synthesis can start de novo without a primer. RNA polymerase is a more error-prone, slower polymerase than DNA polymerase. There are more start sites for RNA polymerization than for DNA synthesis in the cell. The bulk of DNA synthesis takes place in the S phase of the cell cycle, whereas RNA synthesis occurs throughout the cell cycle and varies, depending on the cellular requirements.

Only about 2% of the RNA-coding regions are translated into protein. Some genes code for transfer RNA and ribosomal RNA, which are required for translation of protein-coding messenger RNA into protein (see section that follows). Large portions of the genome are occupied by retrotransposons, DNA elements that can move from one location to another through an RNA intermediate. The remaining noncoding RNA, initially thought to result from spontaneous and randomly initiated RNA synthesis, is now known to be composed of regulatory RNA molecules that affect both transcription and translation of the protein-coding genes. These RNAs include microRNA and long noncoding RNA. Noncoding RNA, along with methylated nucleotides and modified histone proteins associated with DNA, are considered epigenetic mechanisms.
In contrast to genetics, which is based on the order or sequence of nucleotides, epigenetics involves chemical changes in histone proteins, modification of DNA such as base methylation, and noncoding RNA activities that can influence the expression of genes independent of the nucleotide sequence.

Protein Synthesis

The central dogma of genetics states that genetic information flows from DNA to mRNA, the process of transcription, and from mRNA to protein, the process called translation (Fig. 12--8). Proteins are directly responsible for the phenotype, or observable properties, of an organism, such as eye color, height, and enzyme activity. After DNA is transcribed by RNA polymerase into mRNA, the mRNA transcripts of protein-coding genes are translated into protein. Each mRNA is marked by a guanosine nucleotide covalently attached to its 5′ end in an unusual 5′--5′ bond (cap) and by a tail of adenines at the 3′ end (polyadenylation), which in eukaryotes can reach roughly 200 nucleotides. These structures maintain the stability of the mRNA and allow its recognition by the ribosomes. Ribosomes are organelles that are composed of ribosomal proteins and ribosomal RNA (rRNA). Ribosomes assemble on the mRNA for protein synthesis.

Fig. 12--8 mRNA is transcribed from DNA using RNA polymerase. The mRNA delivers the information to ribosomes, where protein synthesis takes place. As each amino acid is added, the peptide chain continues to grow. This process, known as translation, is accomplished with the help of tRNA, which brings in individual amino acids.

Translation means converting information from one language to another. The language held in the order or sequence of the four nucleotides in the DNA chains must be translated into the order or sequence of the 20 amino acids making up a protein chain. The nucleotide sequence of mRNA contains a 3-base recognition sequence, called a codon, for each of the 20 amino acids; the set of codons is the genetic code (Table 12--1).
The codons are carried from the nucleus to the cytoplasm in mRNA to be translated into protein. Molecules of tRNA serve as adaptors between the nucleotide sequence in the RNA and the amino acid sequence in proteins. There are distinct tRNAs for each of the 20 amino acids. Each tRNA, folded into an inverted "L"-like structure, carries an amino acid at the 3′ end and a 3-base sequence (anti-codon) complementary to the codon of that amino acid. The 3′ ends of the tRNAs are covalently attached to their corresponding amino acids (charged) by aminoacyl-tRNA synthetase enzymes. The charged tRNAs assemble with the ribosome and mRNA to initiate protein synthesis. Within the ribosome, the anti-codon hydrogen-bonds to the codon on the mRNA, holding the amino acid in place to be covalently attached to the growing peptide by peptidyl transferase activity. Synthesis proceeds as the ribosome moves in three-base steps along the mRNA as each charged tRNA binds. Synthesis terminates when the ribosome encounters a stop codon in the mRNA. Multiple initiation, elongation, and termination factors participate in this process. Newly synthesized proteins are directed through the endoplasmic reticulum of the cell to their final destination.

DNA Sequence Changes

A change in the nucleotide sequence in DNA is called a mutation or variant. Depending on its frequency of occurrence, a nucleotide sequence change may also be referred to as a polymorphism. The term variant is recommended for nucleotide sequence changes that are inherited (germline), whereas mutation is a more general term for spontaneous changes in DNA (somatic). These alterations may range in size from a single base pair to millions of base pairs that result in chromosomal structural abnormalities. Point mutations involve one or a few base pairs and are classified by their effect on the amino acid sequence (Table 12--2).
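The three-base reading of mRNA and the classification of point mutations by their effect on the amino acid sequence can be illustrated with a short Python sketch. It builds the standard genetic code from a compact 64-letter string, and the names `CODON_TABLE`, `translate`, and `classify_substitution` are illustrative choices rather than anything defined in the chapter. The classifier is deliberately coarse: it reports any amino acid replacement as "missense" and does not separate conservative from nonconservative changes or handle frameshifts.

```python
from itertools import product

# Standard genetic code, codons ordered with U, C, A, G at each position.
# "*" marks a stop codon. Names in this sketch are illustrative only.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Read an mRNA in three-base steps until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: synthesis terminates
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Coarse classification of a single-codon change (assumes ref is not a stop)."""
    ref, alt = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref == alt:
        return "silent"      # same amino acid encoded
    if alt == "*":
        return "nonsense"    # premature stop codon
    return "missense"        # a different amino acid is inserted


if __name__ == "__main__":
    print(translate("AUGGAGUAA"))
    # GAG (glutamic acid) to GUG (valine) is an A-to-U change in the mRNA.
    print(classify_substitution("GAG", "GUG"))
```

The same table could be keyed on DNA codons by replacing U with T; mRNA is used here because the ribosome reads the transcript.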
Conservative and silent mutations do not affect phenotype, whereas nonconservative, nonsense, and frameshift mutations will likely affect protein structure or function, depending on their location in the protein sequence. Mutations early in the gene sequence will likely have a greater effect on protein function than mutations that occur toward the end of the protein.

There is a recommended nomenclature for naming nucleotide changes in clinical reports. Mutations are indicated by the location of the change in the nucleotide sequence, followed by the original nucleotide, an arrow, and finally, the replacement nucleotide. For example, a mutation that replaces a guanine with an adenine at position 2175 would be expressed as: 2175G→A. The term may be preceded by further notations (g., c., r.) to indicate whether the mutation is in genomic DNA, complementary DNA from the mRNA sequence, or RNA, respectively. For example, in large databases, the mutation just described might be denoted as: c.G2175A. Changes in the amino acid sequence are indicated by the original amino acid and the location in the protein, followed by the substituted amino acid. For example, a replacement of a glycine with a valine at the 339th amino acid in a protein would be expressed as G339V or p.G339V, using the single-letter code for the amino acids (see Table 12--1). These expressions are designed to avoid confusion when referring to amino acid or nucleotide changes because the letters A, C, G, and T are also used in the single-letter amino acid codes.

Polymorphisms

Structurally, mutations, variants, and polymorphisms are the same thing: changes in the reference amino acid or nucleotide sequence. Alterations in DNA or protein sequences shared by at least 2% of a natural population are considered polymorphisms. The different versions of the affected sequences are referred to as alleles.
Polymorphisms can involve a single base pair (single-nucleotide polymorphisms or SNPs) or millions of base pairs. Polymorphic changes may or may not have phenotypic effects. Deleterious phenotypic changes are usually limited so that they do not reach the required frequency in a population; however, some polymorphisms are maintained because they are also associated with a beneficial phenotypic effect. A well-known example of this is the A to T base substitution in the beta-globin gene on chromosome 11 that causes sickle cell anemia. This DNA substitution results in the replacement of glutamic acid (E) with valine (V) at position 6 in the protein sequence (E6V). The mutation results in abnormal red blood cells that do not circulate efficiently. The deleterious allele has likely been maintained in the population because it is balanced by a beneficial phenotype of resistance to Plasmodium species, which cause malaria.

A highly polymorphic region in the human genome is the major histocompatibility (MHC) locus on chromosome 6. The different nucleotide sequences result in multiple versions or alleles of the human leukocyte antigen (HLA) genes in the human population. These alleles differ by nucleotide sequence at the DNA level (polymorphisms) and by amino acid sequence. Each person will have a particular group of HLA alleles, which are inherited from his or her parents. The HLA proteins coded for by these alleles play important roles in the immune response and allow the immune system to differentiate "self" from "non-self" (Chapter 3). The recommended nomenclature for HLA alleles is discussed in Chapter 16.

Other highly polymorphic areas of the genome include the genes coding for the antibody proteins and antigen receptor proteins in B cells and T cells, respectively. Polymorphisms are introduced in each cell through cell-specific genetic events (gene rearrangements), followed by enzymatically catalyzed sequence changes (somatic hypermutation).
These sequences differ from cell to cell, allowing for the generation of a large repertoire of antibodies and antigen receptors to better match any foreign antigen (Chapters 4 and 5).

Polymorphisms are found all over the human genome. Although there are millions of SNPs, larger polymorphic differences occur less frequently. Polymorphisms that create, destroy, or otherwise affect sequences in DNA that are recognized by nuclease enzymes (restriction enzymes isolated from bacteria) are detected as restriction fragment length variations or polymorphisms (RFLPs) that differ among individuals. Repeat-sequence polymorphisms, such as short tandem repeats (STRs) and variable-number tandem repeats (VNTRs), are head-to-tail repeats of units ranging from a single base pair to more than 100 bp. STRs and VNTRs can be detected as RFLPs or by using amplification procedures. STR testing has replaced RFLP testing for human identification (DNA fingerprinting in forensics) and has replaced HLA typing for parentage testing. STRs and VNTRs are the markers commonly used to follow engraftment of donor cells into recipient blood and bone marrow after allogeneic bone marrow transplantation.

In addition to the nuclear genome, mitochondria, located in the cytoplasm of eukaryotic cells, carry their own genome. The mitochondrial genome is circular, containing about 16,500 bp. Polymorphisms are also found in two regions of mitochondrial DNA sequences (hypervariable regions). These polymorphisms are not transcribed into RNA and do not affect protein structures. They are used for maternal lineage testing, because all maternal relatives share the same mitochondria and so have the same mitochondrial polymorphisms.

Electrophoresis

Analysis of DNA for mutations and polymorphisms is performed in a variety of ways. Many of these laboratory procedures use electrophoresis to observe the sizes or amounts of nucleic acid. Electrophoresis is the movement of particles under the force of an electric current.
Particles can move through gas, liquid, or even solid phases.

Gel Electrophoresis

For nucleic acids, a semisolid matrix or gel is used to sieve the nucleic acid polymers. There are two types of gels used for nucleic acid analysis: agarose and polyacrylamide. Agarose gels are natural polymers of agarobiose, a disaccharide obtained from seaweed (red algae). Polyacrylamide gels are synthetic polymers of acrylamide and bis-acrylamide. These synthetic polymers are more precisely designed for high-resolution separation, that is, distinguishing differences in nucleic acids as small as one nucleotide. Polyacrylamide gels are also used for protein resolution by size or charge. In contrast, agarose gels do not have such high resolution but are less expensive and less toxic to use than acrylamide. Agarose gels are useful for standard laboratory separations of nucleic acids of 50 bp or more. Agarose in low concentrations is used to separate very large nucleic acids of tens of thousands of base pairs.

Under the force of an electric current, nucleic acids, which are negatively charged, will move from the negative pole (cathode) to the positive pole (anode). Smaller (shorter) nucleic acid chains will move faster through the gel matrix than larger ones. The shorter chains will appear below longer ones when the nucleic acid samples are visualized in the gel by staining. A standard molecular weight marker of nucleic acid chains of known sizes run with the test samples can be used to estimate the size in bases or base pairs of the test nucleic acids.

The proper type and concentration of gel are determined based on the expected sizes of the nucleic acids to be separated. Agarose gels are frequently prepared from powdered agarose dissolved and melted in a buffer solution that will carry the electric current (running buffer). There are a variety of buffer solutions used for different types of nucleic acids and different gel types.
The agarose suspension is heated to a clear liquid, poured into a mold, and allowed to cool and solidify. Because powdered acrylamide is toxic, most laboratories purchase predissolved acrylamide solutions or polymerized gels. Liquid acrylamide solutions require the addition of a nucleating agent and catalyst in order to solidify. For both agarose and polyacrylamide gels, a comb is placed in the liquid gel to form wells at one end of the gel for loading of the samples. After solidifying, the gels are placed in a bath of running buffer. To detect the nucleic acid, a fluorescent stain (ethidium bromide, SYBR green, or others) can be mixed with the gel solution, placed in the gel bath, or used to soak the gel after the electrophoresis is complete.

The nucleic acid sample to be separated is mixed with a loading solution that contains a density agent (glycerol or Ficoll) and a visual dye, such as bromophenol blue. The density agent allows loading of the sample into the wells of the gel submerged in the running buffer. The visual dye aids in seeing the sample while loading into the well and during electrophoresis. The gel bath is connected to a power supply that will establish a current between platinum wires at the top and bottom of the gel, the two poles of the gel bath. The nucleic acids will move under the force of the current, working their way through the gel matrix at a speed that depends on their size. After electrophoresis, the nucleic acids can be visualized through the fluorescent dye excited by ultraviolet light. Nucleic acid chains will appear as lines or bands on the gel (Fig. 12--9A). The distance from the loading well to the band is inversely related to the size of the nucleic acid; over the useful range of a gel, it decreases roughly linearly with the logarithm of the size.

Capillary Electrophoresis

Capillary electrophoresis is a more sensitive, semiautomated type of electrophoresis, separating particles in a gas, liquid, or gel.
Because nucleic acids do not resolve by size in solution, a gel or polymer is inserted into the capillary to sieve the nucleic acids (capillary gel electrophoresis). Capillary gel electrophoresis instruments range from a single capillary to 96 capillaries. Multiple samples can run through each capillary, as the instruments are capable of detecting fluorescent signals at more than one wavelength. For detection, DNA chains must carry a fluorescent label (a covalently attached molecule that emits fluorescence). A laser inside the capillary instrument excites the fluorescent labels as they move through the capillary. The dyes emit fluorescence that is detected and transferred to a computer as an electrical signal. The signals are displayed as peaks of fluorescence (Fig. 12--9B).

Fig. 12--9 After gel electrophoresis (A), nucleic acids are visualized by detection of a nucleic acid-specific fluorescent dye, such as ethidium bromide or SYBR green. The agarose gel shown has six lanes, where samples were loaded. The first lane (1) shows the molecular weight standard. Lanes 2 to 5 show sample DNA. The size of the DNA fragments can be estimated by comparing how far they migrated in the gel with the migration of the molecular weight standard. A negative (blank) control is in lane 6. In capillary electrophoresis (B), the gel image is replaced by an electropherogram. Instead of bands, peaks appear, representing the nucleic acid fragments (top four rows). The last row on the electropherogram shown is the negative control. Peaks of a molecular weight standard are shown at the top. From it, the instrument computes and displays the lengths of the nucleic acid fragments in base pairs (boxes beneath each peak). All the fragments shown and the molecular weight marker were run simultaneously through a single capillary.
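Sizing a fragment against a molecular weight standard, as described for both the gel and the electropherogram, can be sketched in Python. The sketch assumes that migration distance is roughly linear in the logarithm of fragment size between neighboring standard bands and interpolates on that scale. This is a simplification for illustration; commercial capillary instruments apply their own sizing algorithms, and the ladder values below are hypothetical.

```python
import math


def estimate_size(migration: float, ladder: list[tuple[float, float]]) -> float:
    """Estimate fragment size (bp) from its migration distance.

    ladder holds (size_bp, migration_distance) pairs for the standard bands.
    Interpolation is done on log10(size), assuming a log-linear relationship
    between size and distance within each pair of neighboring bands.
    """
    # Sort standard bands by how far they travelled (shortest distance first).
    points = sorted((dist, math.log10(size)) for size, dist in ladder)
    for (d0, log0), (d1, log1) in zip(points, points[1:]):
        if d0 <= migration <= d1:
            fraction = (migration - d0) / (d1 - d0)
            return 10 ** (log0 + fraction * (log1 - log0))
    raise ValueError("band lies outside the range of the standard")


if __name__ == "__main__":
    # Hypothetical standard: sizes in bp, distances in mm from the well.
    standard = [(1000, 10.0), (500, 20.0), (250, 30.0)]
    print(round(estimate_size(15.0, standard)), "bp (approximate)")
```

A band that migrates farther than the smallest standard band or less far than the largest one raises an error rather than being extrapolated, since the log-linear assumption is least reliable outside the range of the standard.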
The preparation of capillary gel electrophoresis involves loading premixed polymer solution and buffers into the electrophoresis instrument along with the capillary (or array of capillaries for multicapillary systems). The polymer is automatically injected into the capillary by the instrument. Nucleic acid samples are diluted in formamide to denature the DNA into single strands, and the molecular weight standard is added directly to each sample. The samples are placed into the instrument either in separate tubes or in a plate format. The samples enter the capillary by an electrokinetic process that attracts the negatively charged DNA to the end of the capillary submerged into the tube or plate well containing the sample. During electrophoresis, the nucleic acids sieve through the capillary as they would in a gel. The shorter fragments move faster than the longer ones. As each labeled nucleic acid chain passes by the detector, a peak is generated by the computer. The molecular weight markers move through the same capillary with the sample, enabling the instrument to automatically assess the size of the fragments in base pairs.

Molecular Analysis

Nucleic acid tests are designed to detect changes in the DNA sequence (mutations and polymorphisms) or to measure differences in amounts of RNA synthesized. There are four main approaches to nucleic acid analysis: strand cleavage methods, hybridization methods, amplification methods, and sequencing.

Strand Cleavage Methods

Specific Procedures

Restriction Enzymes

One of the first methods for analysis of DNA was restriction enzyme mapping. Restriction enzymes are endonucleases that cleave the phosphodiester bonds between nucleotides in DNA. These endonucleases recognize and bind to specific nucleotide sequences in the DNA so that they will only cleave the DNA at those locations.
Restriction enzymes used in clinical laboratory methods recognize palindromic sites, that is, nucleotide sequences that read the same 5′ to 3′ on both strands of the DNA, for example:

5′GAATTC3′
3′CTTAAG5′

which is the recognition site for the restriction enzyme EcoRI. Restriction enzymes are isolated from bacteria, where they serve as part of a primitive immune system that allows the bacteria to recognize their own DNA and degrade any incoming foreign DNA. Restriction enzymes are named for the organisms from which they are isolated. EcoRI was the first enzyme isolated from E. coli, strain R. HindIII is the third enzyme isolated from Haemophilus influenzae, strain d. There are hundreds of restriction enzymes with unique binding and cleavage sites.

DNA is characterized based on the pattern of fragments produced after incubation with restriction enzymes and electrophoresis. DNA with a different sequence will yield different-sized fragments characteristic of that DNA (Fig. 12--10). Early work in recombinant DNA technology relied on these types of studies. Today, RFLP analysis is applied to epidemiological studies of microorganisms and identification of resistance factors carried on extrachromosomal DNA (plasmids) in the cell.

Fig. 12--10 Restriction enzyme mapping characterizes DNA by the pattern of fragments generated when the DNA is cut with restriction enzymes. DNA sample A has two restriction sites (arrows), whereas DNA sample B has only one. When these two DNAs are digested with the restriction enzyme, DNA A will yield three fragments, and DNA B will yield two. These band patterns are characteristic of the two DNAs. (M = molecular weight standard, used for sizing the DNA fragments)

CRISPR-Cas9

DNA analysis using restriction enzymes is limited to sequences recognized by these enzymes. Another type of restriction system found in archaea, gram-negative, and gram-positive bacteria uses a common enzyme guided by RNA to specific sites.
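Returning to restriction mapping as described above, the idea of a palindromic recognition site and a characteristic fragment pattern can be sketched in Python. The sketch assumes linear DNA and a single enzyme, and it uses the fact that EcoRI cuts between the G and the first A of GAATTC (an offset of 1 from the start of the site). The function names and the two short test sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5' to 3' on both strands."""
    reverse_complement = "".join(COMPLEMENT[b] for b in reversed(site))
    return reverse_complement == site


def digest(dna: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting linear DNA at every site.

    cut_offset is the position of the cut within the recognition site,
    counted from the site's first base on the strand given.
    """
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    ECORI_SITE, ECORI_OFFSET = "GAATTC", 1  # EcoRI cuts G^AATTC
    dna_a = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"  # two sites
    dna_b = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TT"  # one site lost
    print(is_palindromic(ECORI_SITE))
    print(digest(dna_a, ECORI_SITE, ECORI_OFFSET))
    print(digest(dna_b, ECORI_SITE, ECORI_OFFSET))
```

The single base difference between the two hypothetical sequences removes one recognition site, so the first yields three fragments and the second yields two, which is the kind of pattern difference detected as an RFLP.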
Clustered regularly interspaced short palindromic repeats (CRISPRs) are classes of repeated DNA sequences found in microbial DNA. The repeated sequences are interrupted by spacer sequences matching regions extracted from invading plasmids or viruses. These spacer sequences serve as adaptive immunity with memory of the invading DNA. The locus also encodes the CRISPR-associated protein (Cas) enzyme. To fend off an invader, short RNA sequences transcribed from the CRISPR spacer regions guide the Cas enzyme to the matching invading DNA.

CRISPR/Cas9 has been used in the laboratory to alter DNA at user-defined locations by substituting synthetic RNA of a desired sequence to guide the Cas enzyme. The synthetic RNA leads the Cas9 endonuclease to the site of choice, providing the specificity of restriction enzymes with the versatility of guiding cuts to any sequence site. CRISPR RNA can also lead transcription activators, repressors, gene promoters, or reporter molecules to target sequences. CRISPR has been utilized for DNA analysis, gene therapy, and genome editing. Clinical applications of this system are under development.

Other types of cleaving enzymes, such as those that only digest single-stranded nucleic acids or those that recognize folded nucleic acids, are also used to screen for mutations and polymorphisms.

Hybridization Methods

Specific Procedures

Restriction enzyme cleavage methods are highly informative for investigating small genomes, such as those of microorganisms or plasmids. For complex genomes, such as human DNA, such analyses are not practical, as the DNA is too large and complex to generate readable fragment patterns. How does one analyze specific DNA regions in a complex genome by RFLP without first cloning the region of interest? This question was addressed by Edwin Southern in the mid-1970s.
The significance of his invention, the Southern blot, was that informative studies could be performed directly on large and complex genomes by cleaving the DNA into smaller fragments with restriction enzymes, separating the fragments by gel electrophoresis, and identifying the region of interest through hybridization with labeled probes (short nucleic acids that bind to complementary sequences). Hybridization involves the binding of two complementary strands of nucleic acids, in this case, the template strand and a probe. A variation of the Southern blot, called the northern blot, was subsequently developed to analyze RNA structure and expression. Northern blots were mostly research tools and not used routinely for diagnostic purposes.

Clinical Correlations

Western Blot

The western blot was used for many years as a confirmatory test for the presence of antibodies to HIV, the cause of AIDS, and to Borrelia burgdorferi, the cause of Lyme disease. Although the western blot has been replaced by less labor-intensive methods in the clinical laboratory, this highly specific method is still widely used in research laboratories for protein analysis (Chapters 21 and 24).

Detection of proteins and protein modifications can be done by a method known as the western blot. In the western blot procedure, serum, cell lysate, or extracted proteins are separated by gel electrophoresis and blotted to a membrane. The probes for western blot are polyclonal or monoclonal antibodies specific for the proteins of interest. Western blots may also be probed with biological fluids such as serum to detect the presence of antibodies produced in response to infection. Detection is performed with secondary antibody--enzyme conjugates and color- or light-producing substrates.

Array Methods

Southern blotting and its variations allowed assessment of one or a few molecular targets on as many samples as the gel system would allow.
As knowledge of genetic networks and pathways grew, it became apparent that informative studies should include simultaneous analysis of many genes or proteins to assess the true biological state of a cell or an organism. Thus began the study of genomics. Genomics refers to the analysis of hundreds to thousands of targets or whole genomes, rather than single genes. The first methodology to perform these studies involved reverse-dot-blot hybridization, that is, hybridization of a labeled sample to unlabeled immobilized probes spotted or arrayed on a solid support. Modern arrays can carry up to hundreds of thousands of probes.

There are three basic types of arrays: comparative genomic arrays, RNA expression arrays, and high-density oligonucleotide or SNP arrays. Comparative genomic hybridization arrays are used to detect amplifications or deletions in DNA (Fig. 12--11). Gene expression (mRNA synthesis) is measured using expression arrays, where mRNA from the test material is converted into labeled cDNA, which is hybridized to the probes. SNP arrays have single-nucleotide resolution and can even be used to determine DNA nucleotide sequence.

Generally, an array of thousands of probes bound to a very small area, such as a microscope slide, is referred to as a microarray. Microarrays use highly specific unlabeled probes attached directly to a solid support. The support can be glass slides or beads (bead arrays). The test sample (nucleic acids or proteins isolated from cultures, cells, or body fluids) is labeled and hybridized to the many immobilized probes. Microarrays are used for a variety of applications, including detection of chromosome microdeletions by virtual karyotyping and gene-expression profiling. In the former method, genomic DNA is assessed for loss or gain of genetic material at specific chromosomal locations compared with a normal reference sample.
In the latter method, the levels of mRNA transcribed from thousands of genes are compared with normal reference samples to look for up- or down-regulation of gene transcription.
Fig. 12--11. For array analysis, unlabeled probes are immobilized and hybridized to labeled sample material (green). A reference material (red) is hybridized to the same array. The results of the array are relative test:reference colors. In this example, a green color indicates amplified gene regions, and the neutral yellow colors indicate no amplification or deletion of those regions. No red color, which would indicate deletion of a gene region, is seen.
Fig. 12--12. (A) Three of 100 to 400 bead colors, each with antibody to a different analyte. The presence of multiple targets can be detected by the unique colors of the beads that have associated fluorescence from the secondary antibody. Flow cytometry is used to assay each bead separately for bound fluorescence. (B) For nucleic acid analysis, bead array antibodies are replaced with single-stranded oligonucleotides complementary to the test nucleic acid. If present, biotinylated sample DNA will hybridize to the sequences, and the biotin-specific conjugate will generate a signal.
For array analysis, sample labeling is fluorescent, allowing dual detection of the test sample and a reference sample that is hybridized to the array along with it (see Fig. 12--11). This results in measurement of increased or decreased amounts of test material relative to the normal reference. In bead array systems, beads carry fluorescent labels specific to the probe they carry so that bound sample, if present, can be detected in a flow cytometric method. Bead array assays are based on preparations of fluorescent beads of 100 to 400 different fluorescent "colors." Each color bead is attached to an antibody or a nucleic acid probe that will bind specifically to the target protein or nucleic acid sequences.
The target nucleic acid molecules may be directly labeled, or for protein targets, a secondary antibody conjugated to a fluorescent signal may be used to detect the presence of the target (Fig. 12--12). Because many beads are used, each bound to a different antibody or probe, multiple targets can simultaneously be detected in a single assay run; in other words, a multiplex assay is performed. Clinical applications include HLA typing (Chapter 16) and respiratory virus panels.
In Situ Hybridization
In situ hybridization (ISH) refers to detection of targets in place as they appear in tissues, cells, and subcellular structures. Labeled probes are used to bind or hybridize to the targets. ISH is frequently used in pathology studies of tissue and cell suspensions. Immunohistochemistry is a type of ISH using labeled antibodies to detect the presence of clinically significant protein targets, such as those expressed by tumor cells. Probes for these tests are monoclonal antibodies linked to enzymes, such as horseradish peroxidase, that produce visible signals from chromogenic substrates. Alternatively, enzyme-linked secondary antibodies recognizing the primary antibody isotype may be used. Positive and negative controls must be included in ISH testing to ensure accuracy of the results. Normal tissue that expresses the protein target should serve as the positive control. Negative controls should include an adjacent section cut from the test tissue and processed without the primary antibody, as well as tumor tissues that do not express the antigen. Ideally, the control tissues are processed with the test tissue. Control tissues processed differently from the test tissue validate reagent performance but do not verify the tissue preparation. If staining of positive control tissue is not satisfactory, or if unwanted staining occurs in negative controls, all results with the patient specimen should be considered invalid.
Clinical Correlations Programmed Cell Death Ligand (PD-L1) The programmed cell death ligand (PD-L1) is a transmembrane protein that suppresses the adaptive immune response when it is bound to a receptor on activated lymphocytes and dendritic cells called programmed cell death protein 1 (PD-1). PD-L1 is found on some tumor cells and is thought to block recognition of the abnormal cells by lymphocytes that express PD-1. Detection of PD-L1 by immunohistochemistry guides the use of immunotherapy for those tumors that express the ligand. Fluorescence in situ hybridization (FISH) methods use fluorescently labeled probes and require specialized microscopes equipped to detect the emitted fluorescent signals. FISH is commonly performed to detect specific chromosome abnormalities, such as microdeletions or gene amplifications. In these methods, probes ranging in size from a few thousand to hundreds of thousands of bases long are covalently attached to the fluorescent dye. FISH can be performed on nondividing (or interphase) cells or directly on metaphase chromosomes from dividing cells. The DNA from the sample is denatured into single strands. The probes are applied to prepared slides of the cells or chromosomes, where they hybridize to their complementary sequences. The resulting signals indicate if the targeted gene or region is abnormal. In addition to the test probes, reference probes that target the centromeres of selected chromosomes are used to identify the chromosomes of interest while assessing deletion or amplification (Fig. 12--13). ISH methods are sensitive to the buffer and temperature conditions of hybridization, a concept referred to as stringency. Protocols must be strictly followed to avoid false-positive results caused by nonspecific binding of probes or false negatives caused by failure of the probe to bind. 
Array methods (comparative genomic hybridization) complement FISH testing in cases of multiple or complex genetic abnormalities as well as deletions and amplification of genes.
Amplification Methods Specific Procedures
The most frequently used methods in molecular diagnostics involve some aspect of amplification, that is, copying of nucleic acids. Amplification was previously performed in vivo by replicating plasmids carrying cloned fragments in bacteria; the development of in vitro PCR by Kary Mullis greatly facilitated and broadened the potential applications of gene amplification. PCR was quickly followed by other target-amplification methods, such as reverse transcriptase PCR (RT-PCR), transcription-mediated amplification (TMA), and strand displacement amplification (SDA). PCR has also facilitated the development of DNA sequencing assays.
Fig. 12--13. FISH analysis of the epidermal growth factor receptor gene. The gene probe is labeled orange, whereas a probe complementary to the centromere of chromosome 7 is labeled green. Normally, there are two chromosomes, each carrying one gene in each nucleus (left). The image on the right shows gene amplification with multiple orange signals associated with single green signals.
Polymerase Chain Reaction (PCR)
PCR is an in vitro DNA replication procedure. A PCR reaction includes all the necessary components required for DNA replication: the sample containing the DNA template to be copied, oligonucleotide primers to prime the synthesis of the copies, the four deoxyribonucleotides (dNTPs), DNA polymerase to covalently join the dNTPs, and buffer containing mono- and divalent cations with a pH optimal for the polymerase activity. The oligonucleotide primers are key to the specificity of the PCR reaction. Primers are synthetic single-stranded nucleic acids, usually 18 to 30 bases in length. They are complementary to sequences flanking the region of the template DNA to be copied (Fig. 12--14).
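The way two primers flank and define the copied region can be sketched in a few lines of Python. This is only an illustration: the template and primer sequences are invented, the primers are shortened to 10 bases for readability (real primers are usually 18 to 30 bases), and the function names are hypothetical rather than taken from any primer-design tool.

```python
# Illustrative sketch only: sequences are invented and shorter than real primers.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_amplicon(template: str, forward: str, reverse: str):
    """Return the template region bounded by the two primers, or None.

    `template` is the top strand written 5'->3'. The forward primer has the
    same sequence as the top strand at its binding position. The reverse
    primer binds the top strand, so its site on the top strand is the
    primer's reverse complement.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


template = "TTGACCGTAGGCTAACGTTAGCCATGGATCCGTTAAGCTTGGCAATCG"
forward_primer = "GACCGTAGGC"   # hypothetical forward primer
reverse_primer = "ATTGCCAAGC"   # hypothetical reverse primer
amplicon = find_amplicon(template, forward_primer, reverse_primer)
print(len(amplicon), amplicon)
```

If either primer has no complementary site on the template, the function returns `None`, mirroring the point that primer complementarity determines whether a product can form at all.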
For many procedures, manufacturers supply premixed PCR reagents (deoxyribonucleotides, primers, and buffer) to which only the template DNA and, in some cases, enzyme are added. The PCR reaction mix is subjected to an amplification program consisting of a designated number of cycles. A cycle consists of a series of temperature changes. A standard three-step PCR cycle includes (1) a denaturation step to separate the double-stranded DNA into single strands (94°C--96°C), (2) an annealing step to allow binding of the primers (50°C--70°C), and (3) an extension step in which complementary nucleotides are added to the 3′ end of the primers to complete DNA synthesis (68°C--72°C). In a standard cycle, each of these temperatures will be held for 30 to 60 seconds. PCR cycles vary, depending on the target DNA and the protocol. The cycle is repeated 20 to 50 times depending on the assay. The set of repeated cycles makes up the PCR program. The program may also include an initial 5- to 15-minute incubation at the denaturation temperature to activate specialized DNA polymerases, which are designed to remain inactive at room temperature. A final 7- to 10-minute step at the extension temperature may also be included after the last cycle to allow complete copying of the products. For convenience, a hold at 4°C is usually added to the end of the amplification program.
Fig. 12--14. In the PCR reaction, short, single-stranded primers hydrogen-bond to complementary sequences flanking the region of interest. The PCR reaction will produce millions of copies of the desired sequences. Note: The image is shortened. Primers are usually 18 to 30 bases long, and the PCR products range from 50 to more than 1,000 bp.
The instrument used to carry out the amplification program is called a thermal cycler. There are a variety of thermal cyclers configured for different applications. In general, thermal cyclers are either chamber-based or block-based.
In chamber-based cyclers, the sample tubes are subjected to the amplification program temperatures through the surrounding air in the chamber. In block cyclers, sample tubes are placed in a metal block that is heated and cooled according to the amplification program. Amplification programs may include the time it takes to achieve the different temperatures (ramp speed). Ramp speed will affect the efficiency of the amplification as well as the time required to complete the program. PCR amplification produces millions of copies or amplicons of the DNA region of interest. At 100% PCR efficiency, the number of copies produced from each starting template will be 2^n, where n is the number of cycles (20--50) in the amplification program. The products of the PCR reaction are visualized by gel or capillary gel electrophoresis, as the last part of a PCR procedure (Fig. 12--15). For many tests, the presence, absence, or size of a PCR product is the test result. For other applications, PCR products are directly placed into subsequent reactions, such as restriction enzyme analysis or sequencing.
A variety of modifications have been made to the PCR reaction. RT-PCR starts with an RNA template. Complementary or copy DNA (cDNA) is synthesized from the RNA in a separate step using reverse transcriptase, an RNA-dependent DNA polymerase. The cDNA serves as the template for the PCR reaction. Alternatively, enzymes that copy both RNA and DNA are used in simultaneous RT and PCR reactions that do not require a separate RT step. RT-PCR is used to analyze cellular RNA and for qualitative detection of RNA viruses, such as HIV and hepatitis C virus (HCV).
PCR primer design introduces additional flexibility into the PCR reaction. In sequence-specific primer PCR (SSP-PCR, also called amplification refractory mutation system PCR, or ARMS-PCR), primers are designed so that they will end on a potentially mutated or polymorphic base pair. Annealing of the last base at the 3′ end of the primer is critical for polymerase activity.
If the last base of the primer is not complementary to the template, the DNA polymerase will not recognize the primer as a substrate for extension, and no PCR product will be produced. SSP-PCR is a common approach used to detect mutations and polymorphisms, such as in HLA typing (Chapter 16).
Fig. 12--15. Detection of PCR products by gel electrophoresis.
Unlike the 3′ end of primers, the 5′ end does not have to be complementary to the template. This allows attachment of noncomplementary sequences containing restriction enzyme recognition sites or RNA polymerase-binding sites to PCR products. The amplicons can then be conveniently inserted into plasmids for biological analyses or transcribed into RNA and translated in in vitro transcription or translation systems. Labels in the form of fluorescent molecules, biotin, or other molecules may also be covalently attached to the 5′ end of primers. This allows capture and immobilization of the PCR products or detection in capillary electrophoresis systems (Fig. 12--16).
Quantitative PCR (qPCR). Standard PCR results are interpreted as the presence, absence, or size of the PCR product, but the amount of starting material is not easily measured. In 1993, Higuchi et al. demonstrated that target quantification could be achieved by observing the accumulation of PCR product in real time during amplification. Both DNA and RNA targets can be measured by qPCR. For RNA, complementary DNA made from the RNA using reverse transcriptase is used as the input material. Although originally termed "real-time PCR" or "RT-PCR," the preferred term is now quantitative PCR or qPCR to avoid confusion with reverse transcriptase PCR (also RT-PCR). To be detected in real time, the qPCR product was followed initially by photography and then by measuring a fluorescent dye specific for double-stranded DNA (ethidium bromide) at intervals during the amplification program. A less toxic DNA-specific dye, SYBR green, is now used for this purpose.
While primers determine the intended target, more specific detection of a product is achieved with probes rather than DNA binding dyes. There are four types of probes in general use: fluorescence resonance energy transfer (FRET), TaqMan, molecular beacon, and scorpion probes (Fig. 12--17). These probes hybridize by sequence complementarity in order to generate fluorescent signals from the accumulating PCR products. SYBR green is not specific to the sequence, so artifacts of the PCR reaction (misprimes and primer dimers from primer self-amplification) will also produce a signal. Because the probes are sequence-specific, they provide higher specificity for the intended product than SYBR green. Accumulation of PCR product detected by TaqMan probes through 50 PCR cycles is shown in Fig. 12--18. The fluorescence depicted on the y-axis is the fluorescence signal from the dye or probe. The fluorescence plotted versus cycle number generates a curve similar to a bacterial growth curve, with a lag phase, a log phase, a linear phase, and a stationary phase. The length of the lag phase is assessed by counting the number of PCR cycles required to reach a threshold level of fluorescence. The cycle at which the sample fluorescence reaches this value is called the threshold cycle or Ct. As seen in a standard curve of dilutions of known target nucleic acid shown in Fig. 12--19, there is an inverse relationship between the amount of target and the Ct value. For test samples, target is quantified by converting its Ct to the number of DNA copies in the starting sample using the standard curve. An internal amplification control is included in each reaction. This control is a gene target that is always present at a constant level. For qPCR and RT-qPCR, internal controls confirm that negative results are true negatives and are not caused by amplification failure during the PCR. In RT-qPCR, the level of target quantified in this way is expressed relative to an internal amplification control.
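The conversion from Ct to starting copy number described above can be sketched as a straight-line fit of Ct against the log10 of the known dilutions. The dilution series and Ct values below are invented for illustration, and the function names are hypothetical. The efficiency formula, E = 10^(-1/slope) - 1, is the standard relationship between the slope of the curve and per-cycle amplification: a slope near -3.32 corresponds to doubling every cycle, the 100% efficiency case.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def ct_to_copies(ct, slope, intercept):
    """Invert the standard curve to estimate starting copies for a test Ct."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


# Invented 10-fold dilution series of a known target and its measured Ct values.
standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
standard_cts = [33.36, 30.04, 26.72, 23.40, 20.08]

slope, intercept = fit_standard_curve(standard_copies, standard_cts)
print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"efficiency={amplification_efficiency(slope):.3f}")
print(f"copies at Ct 25.06: {ct_to_copies(25.06, slope, intercept):.0f}")
```

The negative slope is the inverse relationship noted in the text: more starting target means fewer cycles to reach the threshold.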
Just as PCR stimulated a wide variety of test methods and applications, qPCR has also been modified to address a variety of clinical questions. Widely used applications of qPCR and RT-qPCR include detection of microorganisms, especially viruses and other pathogens that are difficult or dangerous to culture in the laboratory; tumor-associated gene expression; and tissue typing. Multiplex qPCR methods are performed to assess multiple targets simultaneously, such as in the analysis of expression of multiple genes. qPCR testing can also be applied to the measurement of donor bone marrow in the recipient after a bone marrow transplant.
Digital Droplet PCR. Another approach to qPCR is digital droplet PCR. Unlike the relative quantification of qPCR, digital droplet PCR provides absolute quantification; that is, there is a numerical expression of the number of molecules in the sample (in contrast to the number of molecules relative to a control). Furthermore, digital droplet PCR measurements are made after the amplification program has finished (endpoint), and therefore, they are not affected by variations in PCR efficiencies that may be encountered with different primers and targets.
Fig. 12--16. Detection of PCR products by capillary gel electrophoresis. One PCR primer (forward or reverse) is covalently attached to a fluorescent dye, such as fluorescein. The double-stranded products are denatured and diluted (left). Only the single strand of the PCR product with labeled primer will be detected (center). The output from the capillary instrument is an electropherogram (right) showing peaks of fluorescence that are analogous to band patterns on gel electrophoresis.
Digital droplet PCR is based on preparing a limiting dilution of sample template molecules into individual droplets of reaction buffer in oil, followed by separate amplification of each molecule (Fig. 12--20). When sample template DNA in an aqueous reaction mix is added to inert oil, an emulsion forms.
The droplets in oil act as individual reaction chambers. The emulsion is then subjected to a PCR amplification program. After amplification, droplets that received a template molecule will contain product, and droplets without template will not. The droplets in the oil emulsion are counted by microfluidics to determine how many of the droplets contained template and how many did not. Instrument software translates the positive and negative droplet ratio to the absolute number of target molecules.
Fig. 12--17. Probe systems for real-time quantitative PCR (qPCR). FRET probes generate a signal when target (or copies of target) sequences are present. When the donor (D) and reporter (R) probes bind, energy provided to the reporter dye by the donor dye generates a fluorescent signal. TaqMan probes generate a signal as the target DNA is copied. This results in the degradation of the probe, releasing the reporter dye (R) from the quencher (Q), allowing the reporter dye to fluoresce. Molecular beacons also contain a reporter (R) and quencher (Q), which are separated in the presence of target sequences; these hybridize to the probe and separate the reporter and quencher. The bottom part of the figure shows a scorpion probe covalently attached to a primer. With each replication of target, the scorpion probe is opened, releasing a reporter from a quencher dye and generating a signal. Unlike the other probes, the scorpion signal is covalently attached to the product.
Fig. 12--18. Real-time quantitative PCR signal generated from a TaqMan probe. The normalized fluorescence (ΔRn) is plotted against PCR cycles 1 to 50. The threshold cycle is indicated by the green line. The length of the lag phase is inversely correlated with the amount of starting material.
Fig. 12--19. Ct values (y-axis) were determined for serial 10-fold dilutions of a synthetic target of known concentration (x-axis).
The resulting standard curve is used to convert Ct values of test samples to concentration.
Fig. 12--20. Digital droplet PCR quantifies the absolute number of target molecules (left) by limiting dilution into an emulsion of individual aqueous droplets, each containing one or no template molecules. After the PCR, droplets with product are counted, and the ratio of empty droplets to product-containing droplets is calculated to determine the absolute number of molecules in the specimen. Top: high concentration of template; bottom: low concentration of template.
The linear response of digital PCR quantification allows the detection of small changes in target number that cannot be detected by qPCR. Digital PCR has been applied to analysis of gene copy number variation, detection of rare mutations, and infectious disease.
Transcription-Based Amplification
A variety of amplification methods have been developed since the introduction of PCR. An advantage of several of these methods is that temperature cycling is not required. Kwoh and colleagues developed a transcription-based amplification system in 1989. Commercial variations of this process include transcription-mediated amplification (TMA), nucleic acid sequence--based amplification (NASBA), and self-sustaining sequence replication (3SR). The methods are similar, with variations in enzyme systems. For transcription-based amplification systems, RNA is the target as well as the primary product. A complementary DNA copy (cDNA) is synthesized from the target RNA, and then transcription of the cDNA produces millions of copies of RNA products. The RNA products transcribed from the cDNA can also serve as target RNA for synthesis of more cDNA (Fig. 12--21). The RNA products are detected by chemiluminescence with acridinium ester or, in the case of NASBA, molecular beacon probes. In contrast to PCR, TMA is an isothermal process, which does not involve the repeated heating and cooling required for PCR.
Targeting RNA allows for the direct detection of RNA viruses, such as HCV and HIV. Targeting the RNA of organisms with DNA genomes, such as Mycobacterium tuberculosis, is more sensitive than targeting the DNA because each microorganism makes multiple copies of RNA, whereas it has only one copy of DNA per cell. Detection of Chlamydia trachomatis in genital specimens and cytomegalovirus (CMV) quantification in blood are additional applications for TMA. The high sensitivity of TMA also makes it suitable for screening for viral infections in donor blood.
Fig. 12--21. Transcription-mediated amplification (TMA) targets RNA. In the first step, a cDNA:RNA hybrid is synthesized by reverse transcriptase using a primer with a tail (that will ultimately form the binding site for a later enzyme). Hybridized (but not single-stranded) RNA is degraded by RNase H, leaving single-stranded DNA. This ssDNA serves as a template for reverse transcriptase to synthesize a complementary strand of DNA, including the primer tail, to complete the binding site for RNA polymerase. The RNA polymerase uses dsDNA as a template to synthesize many copies of RNA. This new RNA can cycle back to step one and repeat the process, resulting in a large amplification of product.
Probe Amplification
In probe amplification, the number of target nucleic acid sequences in a sample is not changed. Rather, primers are extended or ligated into many copies of detectable probes. Examples of probe amplification are strand displacement amplification (SDA), loop-mediated isothermal amplification (LAMP), and molecular inversion probe amplification (MIP).
Strand Displacement Amplification (SDA). SDA is an isothermal amplification process. After an initial denaturation step, the reaction proceeds at one temperature. In SDA, the amplification products are the probes rather than the target DNA. After the target DNA is denatured by heating to 95°C, two primers (an outer and an inner primer) bind close to each other (Fig. 12--22).
As the outer primer is extended by DNA polymerase, it displaces the product formed by the simultaneous extension of the inner primer (probe). The probe becomes the target DNA for the next stage of the process. The second stage of the reaction is the exponential probe amplification phase, which involves extension from a nick formed in the strand by a restriction endonuclease enzyme.
Loop-Mediated Isothermal Amplification (LAMP). The LAMP system has high specificity and sensitivity for target DNA. In this process, PCR primers carry sequences at the 5′ end that will self-hybridize, forming loops that self-prime in a cyclic manner. An advantage of this process is the shortened run time (fewer than 30 minutes). LAMP methods are used to detect HIV, cytomegalovirus, Staphylococcus aureus, and E. coli.
Molecular Inversion Probe (MIP). MIP (also called padlock probe) is another isothermal, highly sensitive detection system. In this method, the ends of the probes bind to target sequences so that, in the presence of target, the probe ends are brought together and ligated to form circles. Circularized probes accumulate based on the amount of target present and are detected by gel electrophoresis. These probes can be further PCR-amplified to increase sensitivity. MIP methods have been applied to detection of S. aureus, Streptococcus mutans, influenza virus, and to RNA typing.
Signal Amplification
In signal amplification, large amounts of signal are bound to the target sequences that are present in the sample. Branched DNA is an example of a commercially available signal amplification method.
Branched DNA. In branched DNA (bDNA) amplification, a series of short, single-stranded DNA probes are used to capture the target nucleic acid and to bind to multiple reporter molecules, loading the target nucleic acid with signal (Fig. 12--23).
Fig. 12--22. SDA showing one target strand. (1) Primers bind to single-stranded DNA at a complementary sequence.
(2) A polymerase extends the primer from the 3′ end. (3) The extended primer forms a double-stranded DNA segment containing a site for a restriction enzyme at each end. (4) The enzyme binds double-stranded DNA at the restriction site and forms a nick (cutting only one strand of the double helix). (5) The DNA polymerase recognizes the nick and extends the strand from that site, displacing the previously created strand. (6) Each strand can then anneal and continue the process.
Because multiple probes hybridize to the target sequences in bDNA, its specificity is enhanced over methods using a single probe or primer to bind to the target. This allows for multiple genotypes of the same virus to be detected by incorporating different probes that recognize slightly different sequences. The bDNA signal amplification assay has been applied to the qualitative and quantitative detection of hepatitis B virus (HBV), HCV, and HIV-1.
DNA Sequencing Specific Procedures
The function of DNA is to store genetic information. That information is stored in the form of the order or sequence of the four nucleotide bases in the DNA chain. Early in the history of recombinant DNA technology, the idea of sequencing or detecting the nucleotide order of the nucleic acids was actively pursued. Two sequencing methods emerged in the mid-1970s: the Maxam--Gilbert chain breakage method and the chain termination sequencing method developed by Dr. Fred Sanger and colleagues. Sanger or chain termination sequencing quickly gained popularity, as it did not require the toxic chemicals and complex interpretation of the Maxam--Gilbert method. Alternative sequencing methods have since been developed, including pyrosequencing and next-generation sequencing.
Fig. 12--23. In branched DNA, signal is amplified through hybridization of complementary probes to the target DNA or RNA. Branched DNA molecules carry multiple signals for each target molecule.
A second generation of the method has increased sensitivity because of the binding of additional signals.
Chain Termination (Sanger) Sequencing
Direct determination of the order, or sequence, of nucleotides in a DNA chain is the most explicit method for identifying genetic mutations or polymorphisms, especially when looking for changes affecting only one or two nucleotides. Chain termination sequencing is a modification of the DNA replication process. It uses modified nucleotide bases called dideoxynucleotide triphosphates (ddNTPs), which differ from dNTPs in that they do not have an OH group at the 3′ carbon of the deoxyribose sugar (Fig. 12--24). In a standard Sanger sequencing reaction, the DNA template to be copied is denatured into single strands, and similar to PCR, all components required for DNA synthesis are added: a primer to define where synthesis starts on the strand being sequenced, DNA polymerase enzyme, and the four dNTPs. In addition, the four ddNTPs (ddATP, ddTTP, ddCTP, and ddGTP) are added to the mixture. Each ddNTP is labeled with a different fluorescent dye (ddATP green, ddCTP blue, ddGTP black, ddTTP red) so that the products of the sequencing reaction can be distinguished by color. Synthesis will stop if a ddNTP is incorporated into the growing DNA chain (chain termination) because without the hydroxyl group at the 3′ sugar carbon, the 5′-3′ phosphodiester bond cannot be made. Each newly synthesized chain will terminate, therefore, with a ddNTP (Fig. 12--25). The result of the sequencing reaction is a collection of fragments of various sizes, or DNA ladder. The fluorescently labeled DNA ladder is resolved by gel or capillary gel electrophoresis. Gel-based resolution will result in a series of bands of different sizes. The DNA sequence is read from the bottom to the top of the gel according to which ddNTP terminated each fragment. Sequencing results from capillary gel electrophoresis are a series of fluorescent peaks, termed an electropherogram (Fig. 12--26).
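The logic of reading a chain termination ladder can be sketched in Python. This is a deliberately idealized model, not a description of any sequencing software: it assumes exactly one terminated fragment for every position, ignores the primer length, and uses an invented template. The dye colors follow the assignments given in the text.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Dye colors for each terminating ddNTP, as listed in the text.
DYE = {"A": "green", "C": "blue", "G": "black", "T": "red"}


def synthesized_strand(template_3_to_5: str) -> str:
    """New strand (5'->3') made by copying a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)


def terminated_fragments(new_strand: str):
    """Idealized ladder: one (length, terminating base) pair per position."""
    return [(i + 1, base) for i, base in enumerate(new_strand)]


def read_ladder(fragments) -> str:
    """Order fragments shortest to longest and read each terminating base."""
    return "".join(base for _, base in sorted(fragments))


template = "TACGGATC"                 # invented template, written 3'->5'
new_strand = synthesized_strand(template)
ladder = terminated_fragments(new_strand)
random.Random(0).shuffle(ladder)      # fragments are mixed together in the tube
sequence = read_ladder(ladder)
colors = [DYE[base] for base in sequence]
print(sequence, colors)
```

Sorting by fragment length plays the role of electrophoresis: the shortest fragment migrates farthest, so reading from shortest to longest recovers the synthesized strand base by base.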
Accuracy of interpretation of sequencing data from a dye terminator reaction depends on the quality of the template (free of residual PCR components), the efficiency of the sequencing reaction, and the purity of the sequencing ladder. Clear, clean sequencing ladders are read accurately by sequencing software, and a text sequence is generated. Software programs report the certainty of each nucleotide base in the sequence and compare test sequences with reference sequences to identify mutations or polymorphisms in the DNA.
Figure: A DNA ladder (left) resolved by gel electrophoresis is read from the bottom of the gel to the top of the gel (shortest fragment to the longest). The terminating base is determined by its tube (gel lane). In dye terminator sequencing, the sequence is read automatically as fluorescently labeled fragments migrate through the gel or capillary. Each fragment passes a detector that will generate an electropherogram of fluorescent peaks (right). Sequencing software will produce a text report of the DNA sequence.
Sequencing and re-sequencing of known DNA regions are routine laboratory methods used when mutations are not always in predicted locations in genes, which requires a survey of most or all of the gene sequence. Sequencing is used extensively in genetics and oncology for definitive identification of gene abnormalities. It is also used for sequence-based tissue typing. Germline or inherited variations in the DNA sequence are readily detected, usually from blood specimens. Somatic (non-inherited) mutations in clinical specimens, such as cancerous tumors, are sometimes difficult to detect, as they may be diluted by normal sequences that mask the somatic change. A Sanger sequencing reaction is performed on a single DNA strand. When a sequence change is detected, the alteration is confirmed by sequencing the complementary strand of the DNA, that is, by priming the synthesis reaction on the strand opposite the one that was first sequenced.
Alterations affecting a single base pair may be subtle on an electropherogram, especially if the alteration is in the heterozygous form or mixed with a normal reference sequence. For this reason, standard Sanger sequencing is less sensitive for the detection of DNA sequence changes than other methods. Pyrosequencing Pyrosequencing is a sequencing method first developed in the 1980s. The procedure relies on the generation of light (luminescence) when nucleotides are added to a growing strand of DNA. With this system, there are no gels, fluorescent dyes, or ddNTPs. In the pyrosequencing reaction mix, a single-stranded DNA template is mixed with a sequencing primer, enzyme, and substrate. In a predetermined order, the pyrosequencer introduces dNTPs sequentially to the reaction. If the introduced nucleotide is complementary to the base in the template strand next to the 3′ end of the primer, DNA polymerase forms a phosphodiester bond between the primer and the nucleotide, releasing pyrophosphate (PPi) (Fig. 12--27). The released PPi is converted to ATP in the presence of adenosine 5′ phosphosulfate (APS) to energize generation of a luminescent signal. This signal indicates that the introduced nucleotide is the correct base in the sequence. If the dNTP is not complementary to the template, no phosphodiester bond is formed, and no signal is produced. Unincorporated dNTPs are enzymatically removed before the introduction of the next dNTP. The pyrosequencing reaction generates a pyrogram of luminescent peaks associated with the addition of the complementary nucleotides (see Fig. 12--27, bottom panel). Because pyrosequencing produces short- to moderate-length sequence information (up to 100 bases), it is not as versatile as Sanger sequencing, which can produce reads longer than 1,000 bases, especially for sequencing long unknown regions of DNA. Two factors have kept pyrosequencing in use. 
First, with increasing identification of clinically important, frequently recurring variants in known gene locations ("hot spots"), it became necessary to sequence only targeted areas, the immediate regions around the nucleotide base change. Because pyrosequencing is less labor intensive than Sanger sequencing, it is more convenient for these types of short sequence analyses. Second, some instruments developed for genomic or next-generation sequencing (NGS) utilize the pyrosequencing chemistry.

Next-Generation Sequencing

The first human genome was sequenced by chain termination (Sanger) sequencing. The 7-year project involved hundreds of capillary electrophoresis instruments and bioinformatics experts and cost billions of dollars. With NGS technologies, a human genome can now be sequenced by a single sequencer in a few hours. NGS is designed to sequence large numbers of templates simultaneously (massively parallel sequencing), yielding hundreds of thousands of sequences in a single run. These short sequences are then assembled into a complete sequence. NGS is also a metagenomic technology, involving simultaneous sequencing of multiple small genomes, such as mixed populations of microorganisms in the environment or in body fluids.

A goal that stimulated the development of NGS technologies was to sequence the human genome for a minimal cost (less than $1,000), bringing the expense of genomic studies into the realm of clinical analysis. Production of the "$1,000 genome" has been achieved, and thousands of genomes have been sequenced as part of the 1000 Genomes Project. Not initially included in the sequencing cost were challenges of interpretation, reporting, and data storage. These issues are now included for implementation of NGS in clinical analysis.

The first mass-marketed technologies of NGS included pyrosequencing, sequencing by synthesis with reversible dyes, ion conductance, and sequencing by ligation.
Pyrosequencing, reversible dye sequencing, and ion conductance sequencing have most frequently been applied to clinical applications.

In contrast to chain termination sequencing of PCR products or long templates in large plasmids, NGS procedures begin with short DNA templates, less than 1 kb, usually less than 500 bp. Methods of template preparation include amplification with multiple primer pairs to select regions of interest in multiplex PCR reactions or using emulsion PCR. Alternatively, a set of short fragments can be prepared from enzymatically digested or sonicated genomic DNA. Fragments generated by these methods comprise a library. A library can include selected genes or gene regions (known as targeted libraries or gene panels), only coding DNA or exons (whole exome sequencing), or whole genomes (whole genome sequencing).

PCR primers or probes are used to select targeted libraries and exomes. For example, analysis of the MHC locus by NGS begins with PCR selection and amplification of Class I and Class II genes. The PCR products are templates for sequencing. Probes are used similarly to primers to select and copy targeted regions for sequencing (Fig. 12-28). Fragmented DNA is prepared from whole genomes or from larger (>1,000 bp) PCR-selected gene regions. The ends of the DNA fragments are ligated to adapters, that is, short synthetic double-stranded DNA sequences carrying PCR primer-binding sites. The primer-binding sites allow unbiased amplification to occur for all the genomic fragments using a single primer pair.

Figure 12-28. DNA libraries are prepared using fragmentation, probe, or primer selection of specific genes or gene regions. Adapters carrying primer-binding sites are added to the ends of fragmented DNA. In a second PCR reaction, primers carry short index DNA sequences to identify each sample, as all samples in a run are pooled and sequenced simultaneously.

Amplification of the adapter-ligated library is performed with indexing primers.
On the 5′ end of indexing primers are short sequences or "bar codes" to assist in organization of the sequence data. A bar code or index is a 6- to 10-base sequence assigned to each sample and gene region in the library. With multiple samples sequenced together in a run, bar codes are used in post-run analysis to associate samples and gene regions with their sequences. After library amplification and indexing, the PCR products are cleaned of residual reaction products, pooled, and introduced to the sequencer. Depending on the sequencing technology, the sequencing may be in solution or immobilized on a solid support.

For NGS by pyrosequencing, libraries are generated by emulsion PCR. One primer in each PCR reaction is bound to a solid support (bead). After the PCR reaction, the products are denatured, and the beads carry one strand of the PCR product as sequencing templates into hundreds of thousands of wells of a picoplate. Reagents and nucleotides are introduced into the plate, and independent sequencing reactions result in the release of PPi, which simultaneously generates light signals, as previously described.

In ion conductance sequencing, nucleotide order is determined through release of hydrogen ions (analogous to release of PPi in pyrosequencing) by DNA synthesis. Bead-attached library templates prepared as previously described are loaded with sequencing reagents onto a semiconductor or ion chip. When a nucleotide complementary to a template is introduced and incorporated, the release of the hydrogen ion is detected by a pH change in the reaction (Fig. 12-29). The ion chip contains sensors for more than one million wells, which allow parallel, simultaneous detection of hundreds of thousands of independent sequencing reactions. Ion conductance sequencing is faster than other methods that require optical sensing (images).

In the reversible dye terminator technology, libraries carry sequences complementary to immobilized primers on a solid support (flow cell).
After introduction of the sample to the flow cell, the libraries are amplified by bridge PCR, forming clusters of templates across the flow cell. Sequencing occurs at hundreds of thousands of cluster locations by addition, detection, and removal of each of the four nucleotide labels (Fig. 12-30).

For all NGS variations, sequence data in the form of processed images or electrical pulses are collected by instrument software. The sequence quality is assessed and then the sequence is determined. Variants, polymorphisms, or the sequence itself are identified by comparison with stored reference sequences.

After a successful NGS run, areas of interest will have been sequenced from a few to thousands of times. The number of times a region is sequenced is called the coverage. Whole human genomes will have lower coverage (30X to 40X, 3 billion bp sequenced) than whole exome sequencing (sequencing of only protein-coding regions; 50X to 100X, 30 to 50 million bp sequenced). Gene panels of a few hundred targeted genes are sequenced with higher coverage than genomes or exomes, depending on the number of genes included in the panel. At least 500X coverage is required to call rare variants, such as those in mixed tissue samples (tumor DNA containing lymphocytes or other normal tissue). Cost is the limiting factor for the degree of coverage used.

Figure 12-29. In ion chip sequencing, the pooled, indexed DNA libraries and sequencing primers are applied to an ion chip in individual micro-DNA synthesis reactions. Nucleotides are introduced sequentially to the chip, and if the introduced nucleotide is complementary to the template sequence, a phosphodiester bond is formed, releasing a hydrogen ion. A sensor detects the resulting pH change correlated with addition of each nucleotide base.

Figure 12-30. For reversible dye terminator sequencing, the pooled, indexed libraries are hybridized to immobilized primers at millions of positions on a solid support (flow cell).
Each molecule of DNA is then amplified in place by bridge PCR. The amplified templates are sequenced by synthesis with simultaneous addition of the four fluorescently labeled nucleotides. Nucleotides complementary to each amplified template are added to the end of the growing chain, and an image is made of the flow cell. The fluorescent signals from the nucleotides are removed, followed by addition of the next round of nucleotides. The sequential images are converted to sequence information for each template.

A sequence variant (difference from the reference sequence) may be present in some or all of the copies of the sequence covered. The percentage of sequences carrying the variant is the allele frequency in the sample. Somatic variants tend to have low allele frequencies, whereas germline (inherited) sequence changes will comprise 50% or 100% of the covered sequences.

In the clinical laboratory, targeted NGS has been most commonly applied to genetics (including pharmacogenomics), oncology, and HLA typing. NGS has advanced the accuracy and extent of HLA typing. NGS is used for typing of the HLA Class I HLA-A, -B, and -C and Class II HLA-DRB1/3/4/5, HLA-DQA1, HLA-DQB1, HLA-DPA1, and HLA-DPB1 genes. The increased amount of information and throughput of NGS allows resolution of ambiguities that were not possible with previous molecular methods, including Sanger sequencing. Ambiguities in HLA typing arise from polymorphisms outside of sequenced areas and from closely spaced polymorphisms that may be on one or separate chromosomes.

Clinical Correlations

The Cancer Genome Atlas (TCGA)

The Cancer Genome Atlas (TCGA) is a national program in which NGS has been used to sequence the nucleic acid isolated from tumors of thousands of patients. The large amount of data has been catalogued and analyzed by bioinformatics to identify common themes among cancer types.
The information obtained continues to increase our understanding of cancer and has had a large clinical impact by improving the ways in which cancer is being diagnosed and treated (17).

Whole exome sequencing has been used to identify variants responsible for inherited disease conditions. Whole genome sequencing is primarily a research tool. Unlike other methods, massively parallel sequencing has the capacity to investigate all known genetic loci for clinically significant alterations. Tumor mutational burden (number of mutations/Mb of sequenced DNA) has been identified as a biomarker for some types of cancer. DNA alterations increase the antigenicity of the tumor, which can affect treatment strategy decisions and disease prognosis. Genetic data have been amassed in large databases, facilitating future sequence analysis. There are also growing databases of somatic variant data used in diagnosis and treatment strategies for cancer and other diseases.

Bioinformatics

Information technology has had to accommodate the vast amount of data arising from molecular analyses such as high-throughput sequencing and arrays. Bioinformatics merges this biological data with information technology. The interpretation of data such as that generated by NGS requires massive storage space and contributes to continual renewal of previously stored data in organized databases. This database information can then be used to refine interpretation of newly collected data.

In NGS, powerful computer data assembly systems are required to organize the short sequence information generated from sequence libraries and identify variants. Several factors affect the identification of sequences and sequence variants. Adequate sample DNA is needed to provide the required coverage of regions of interest. The quality of the sequence must be sufficient for confidence in identifying each base at each position of the sequence. For clinical sequencing, a 1/1,000 probability of error is recommended.
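The recommended 1/1,000 error probability can be related to the per-base quality scores that sequencing software reports. The sketch below assumes the widely used Phred scale, which the text does not name: Q = -10 × log10(probability of error), so a 1/1,000 error probability corresponds to Q30. The function names are illustrative only.

```python
import math


def error_probability_to_phred(p_error: float) -> float:
    """Phred-scaled quality: Q = -10 * log10(probability the base call is wrong)."""
    return -10.0 * math.log10(p_error)


def phred_to_error_probability(q: float) -> float:
    """Inverse conversion: P(error) = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


def meets_error_limit(q: float, max_error: float = 1 / 1000) -> bool:
    """True if a base call with quality Q is at or below the allowed error probability.

    A tiny relative tolerance guards against floating-point rounding at the boundary.
    """
    return phred_to_error_probability(q) <= max_error * (1 + 1e-9)


if __name__ == "__main__":
    # Print the error rate implied by a few common quality scores.
    for q in (10, 20, 30, 40):
        p = phred_to_error_probability(q)
        print(f"Q{q}: about 1 error in {round(1 / p):,} base calls")
```

On this scale, each increase of 10 in Q corresponds to a tenfold drop in the probability that the base call is wrong.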
For targeted gene panels, manufacturers have developed programs to produce a finished report with variants identified automatically directly from the sequencer. Software programs independent of the sequencer have been designed to generate spreadsheet files that allow the user to apply quality, variant type, allele frequency (what percentage of the coverage contains the variant), and other parameters in a process called filtering. Nucleotide or protein sequence data at this stage may be stored as text files in the FASTA or FASTQ format (named after a DNA and protein sequence alignment software package first described in 1985). FASTA files are the text of nucleotide sequences only. FASTQ files include a quality symbol for each base. NGS tends to be more error-prone than Sanger sequencing, mostly because of the library preparation. Errors must be distinguished from true sequence variants in a sample. Variants occurring with a frequency of less than 2% are suggested to be indicators of sequence error. Errors are detected and minimized with adequate coverage, database information, and software design. Once variants are identified, further assessment is required to find their biological significance. Intronic or silent exonic mutations that do not affect protein sequence or conservative changes that do not affect protein structure are not biologically significant. Variants that change protein structure or affect protein epitopes may be, but are not always, significant. For clinical applications, the medical significance of variants and the genes in which they are found must be determined. Variants are classified based on historical sequence database information. Large databases of previously observed variants and phenotypic associations are used for annotating the variants and identifying those that are significant and reportable. 
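The FASTQ quality symbols and the filtering step described above can be sketched in a few lines. This is a minimal illustration, not any vendor's pipeline: it assumes the common Phred+33 encoding for FASTQ quality symbols, and the `VariantCall` record, the variant names, and the read counts are hypothetical. The 2% allele-frequency cutoff and the 500X coverage figure are taken from the text and used here only as example thresholds.

```python
from dataclasses import dataclass


def parse_fastq_record(lines: list[str]) -> tuple[str, str, list[int]]:
    """Split one four-line FASTQ record into (read id, bases, per-base qualities).

    Assumes Phred+33 encoding: quality = ASCII code of the symbol minus 33.
    """
    header, bases, _separator, quality_symbols = (line.strip() for line in lines)
    qualities = [ord(symbol) - 33 for symbol in quality_symbols]
    return header[1:], bases, qualities


@dataclass
class VariantCall:
    name: str           # hypothetical label for the variant
    variant_reads: int  # reads carrying the variant
    coverage: int       # total reads covering the position

    @property
    def allele_frequency(self) -> float:
        """Percentage of the coverage that contains the variant."""
        return 100.0 * self.variant_reads / self.coverage


def filter_variants(calls, min_allele_frequency=2.0, min_coverage=500):
    """Keep calls with enough coverage and an allele frequency above the error range."""
    return [
        call for call in calls
        if call.coverage >= min_coverage
        and call.allele_frequency >= min_allele_frequency
    ]


# Made-up example data.
read_id, bases, qualities = parse_fastq_record(["@read1", "GATTACA", "+", "IIII?5+"])
calls = [
    VariantCall("variant_A", variant_reads=300, coverage=600),  # 50%
    VariantCall("variant_B", variant_reads=6, coverage=600),    # 1%, likely error
    VariantCall("variant_C", variant_reads=40, coverage=100),   # coverage too low
]
kept = [call.name for call in filter_variants(calls)]
```

In practice the thresholds would be set per assay; a low-frequency call is not discarded as an error without considering coverage and database information, as the text notes.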
In some cases, such as germline genetic disease associations, standard chain termination sequencing is recommended for confirmation of critical variants. A standard nomenclature system developed by the International Union of Pure and Applied Chemistry and the International Union of Biochemistry and Molecular Biology is used to express sequence information so that clear communication and organized storage of sequence data are possible.

SUMMARY

The two main types of nucleic acids are DNA and RNA. They are polymers made up of chains of nucleotides.

Nucleotides of DNA contain a deoxyribose sugar with one of the following bases: adenine, guanine, thymine, or cytosine.

RNA is made up of nucleotides containing a ribose sugar bonded to one of four nitrogen bases: adenine, guanine, cytosine, or uracil (instead of thymine).

DNA is double stranded and arranged in a double helix, whereas RNA is typically single stranded.

In a DNA molecule, specific base pairing occurs: adenine pairs with thymine, and guanine pairs with cytosine.

When a DNA molecule replicates, the two parental strands separate; each is a template for a newly synthesized complementary strand.

The high specificity of detection of nucleic acid sequences through complementary base pairing is the basis of all molecular diagnostic testing.

The central dogma of molecular biology refers to the fact that DNA serves as the template for messenger RNA, which in turn codes for proteins.

Mutations and polymorphisms are changes in nucleotide sequences that may affect specific protein structure and function.

Gel and capillary electrophoresis are used in many molecular methods. In these techniques, the negatively charged DNA fragments are separated by size under the force of an electric current in a semisolid gel or polymer solution.

Restriction fragment length polymorphisms (RFLPs) are changes in DNA that result in different size pieces when cleaved by restriction enzymes.
CRISPR-Cas9 is a gene-editing system that can be used to alter DNA at specific locations.

Hybridization is the very specific binding of two complementary DNA strands or a DNA strand and an RNA strand. Often a probe, which has a short known nucleic acid sequence, is used to detect an unknown nucleic acid sequence in a sample.

Hybridization techniques include Southern blot analysis, microarray technology for simultaneous assessment of multiple genes, and fluorescent in situ hybridization of specific genetic regions.

Amplification involves making many copies of a specific nucleic acid sequence to obtain enough material for laboratory identification.

Amplification methods include polymerase chain reaction (PCR), reverse transcriptase PCR (RT-PCR), quantitative PCR (qPCR), and digital PCR. All these methods amplify the target DNA.

In transcription-mediated amplification (TMA), the target is RNA instead of DNA. A cDNA copy is made of the original RNA and used to produce millions of RNA copies.

Strand displacement amplification (SDA) involves amplification of a probe rather than the original target DNA. Other probe amplification methods are loop-mediated isothermal amplification (LAMP) and the molecular inversion probe (MIP) method.

Branched DNA represents a signal amplification method in which multiple probes attach to the original target sequence DNA.

DNA sequencing involves determining the order of nucleotides in a DNA chain and is the most specific way of detecting polymorphisms and mutations.

The Sanger chain termination sequencing method involves replicating a single DNA strand in the presence of fluorescent-labeled modified nucleotide bases called dideoxy nucleotide triphosphates. DNA fragments of various sizes are generated from the original template, and the sequence is determined by detecting the fluorescent labels.

Pyrosequencing is an alternate sequencing method that relies on the generation of light when nucleotides are added to a growing DNA chain.
Next-generation sequencing (NGS) methods allow for rapid sequencing of a large number of small DNA templates at one time. The short sequences are then assembled into a complete sequence. NGS can be used to determine the sequence of specific genes using targeted gene panels, the sequence of only the coding regions within DNA (whole exome sequencing), or the sequence of an entire genome (whole genome sequencing) to identify mutations associated with a variety of diseases.

Bioinformatics uses information technology to analyze the vast amount of data generated by NGS testing and interprets the data for clinical relevance by comparison with known databases.

CASE STUDIES

1. A patient with a diagnosis of stage III lung cancer underwent surgery, and the tumor tissue was submitted for PD-L1 testing by immunohistochemistry (IHC). A section of tumor was placed in 10% buffered formalin for 24 hours, embedded in paraffin, and thin (4 micron) sections were cut from the fixed tissue in the paraffin block. The tissue sections were tested by IHC in an automated immunostainer using rabbit IgG anti-human PD-L1 primary antibody. Bound primary antibody was detected with an anti-rabbit isotype secondary antibody conjugated to an enzyme-labeled polymer to generate a color stain. Upon pathology review of the stained sections, it was determined that 60% of the tumor cells expressed PD-L1.

Questions

a. Describe positive and negative controls that would be used for the detection of PD-L1.
b. What is the immunologic role of PD-L1?
c. What is the clinical significance of the test results, and should the patient in this case receive immunotherapy?

2. A blood sample from a patient under treatment for HIV infection was submitted for testing. A laboratory test for the presence of HIV by qPCR was performed. The test can detect 50 to 1,000,000 viral copies per mL of plasma.
Previous results had shown the presence of the virus at levels of 1,500, 600, 500, 220, and 100 copies per mL during the course of treatment. The results of the qPCR test for the current specimen were negative; however, the internal amplification control for the qPCR test was also negative.

Questions

a. How would you interpret these results?
   1. The results are negative, consistent with the patient history.
   2. The results are not interpretable because the amplification control is negative.
   3. The amplification control confirms that there is no contamination.
   4. The results are not correct, based on the previous positive results.
b. The test was repeated, and this time the target (HIV) amplification was negative, whereas the amplification control was positive. How would you interpret these results?
   1. The results show a true negative.
   2. The results are not interpretable because of contamination.
   3. The results are not correct, based on the previous positive results.
   4. The amplification control confirms the presence of HIV.
c. To prepare the test report, the results are entered along with the sensitivity of the test (50 copies/mL). Should these results be reported as 0 copies/mL because nothing was detected by this qPCR test?

Chapter 13 Flow Cytometry

Flow cytometry is a system in which single cells (or beads) in a fluid suspension are analyzed in terms of their intrinsic light-scattering characteristics. The cells are simultaneously evaluated for their extrinsic properties (i.e., the presence of specific surface or cytoplasmic proteins) using fluorescent-labeled antibodies or probes. Flow cytometers, originally developed in the 1960s, did not make their way into the clinical laboratory until the early 1980s. At that point, physicians started seeing patients with a new mysterious disease characterized by a decrease in circulating T helper (Th) cells.
Since that time, flow cytometry has been routinely used for monitoring HIV infection status, as well as immunophenotyping cells, or identifying their surface and cytoplasmic antigen expression.

Flow cytometers can simultaneously measure multiple cellular or bead properties by using several different fluorochromes. A fluorochrome, or fluorescent molecule, is one that absorbs light across a spectrum of wavelengths and emits light of lower energy across a spectrum of longer wavelengths. Each fluorochrome has a distinctive spectral pattern of absorption (excitation) and emission. By using laser light, different populations of cells or particles can be analyzed and identified on the basis of their size, shape, and antigenic properties.

Flow cytometry is frequently used in leukemia and lymphoma characterizations as well as in the identification of immunodeficiency diseases such as AIDS (see Chapters 18, 19, and 24). Flow cytometry is also used to enumerate hematopoietic stem cells, detect human leukocyte antigen (HLA) antibodies in transplantation, and identify cells undergoing apoptosis. In addition, flow cytometry has been applied in functional assays for chronic granulomatous disease (CGD) and leukocyte adhesion deficiency, fetal red blood cell (RBC) and F-cell identification in maternal blood, and identification of paroxysmal nocturnal hemoglobinuria (PNH), to give just a few examples. A significant advantage of flow cytometry is that because the flow rate of cells within the cytometer is so rapid, thousands of events can be analyzed in seconds, allowing for the accurate detection of cells that are present in very small numbers.

Instrumentation

The major components of a flow cytometer include the fluidics, the laser light source, and the optics and photodetectors. Data analysis and management are performed by computers.
Fluidics

For cellular parameters to be accurately measured in the flow cytometer, it is crucial that cells pass through the laser light one cell at a time. Cells are processed into a suspension; the cytometer draws up the cell suspension and injects the sample inside a carrier stream of isotonic saline (sheath fluid) to form a laminar flow. The sample stream is constrained by the carrier stream and is thus hydrodynamically focused so that the cells pass single file through the intersection of the laser light source (Fig. 13-1). Each time a cell passes in front of a laser beam, light is scattered, and the interruption of the laser signal is recorded.

Laser Light Source

Solid-state diode lasers are typically used as light sources. The wavelength of monochromatic light emitted by a laser in turn dictates which fluorochromes can be used in an assay. Not all fluorochromes can be used with all lasers because each fluorochrome has distinct spectral characteristics. Newer instruments have up to five lasers (red, blue, violet, ultraviolet [UV], and yellow-green), each of which produces different colors when exciting a particular fluorochrome. This allows for as many as 20 fluorochromes, or colors, to be analyzed in a single tube at one time.

As a cell passes through the laser beam, light is scattered in many directions. The amount and type of light scatter (LS) can provide valuable information about a cell's physical properties. Light at two specific angles is measured by the flow cytometer: (1) forward scatter (FSC) and (2) side scatter (SSC), also called right-angle LS. FSC is considered an indicator of size, whereas the SSC signal is indicative of granularity or the intracellular complexity of the cell. Thus, these two values can be used to characterize different cell types based on their inherent properties and are considered intrinsic parameters.
If one looks at a sample of whole blood on a flow cytometer where all the RBCs have been lysed, the three major populations of white blood cells (WBCs), namely lymphocytes, monocytes, and neutrophils, can be roughly differentiated from each other based solely on their intrinsic parameters (FSC and SSC) (Fig. 13-2).

Unlike FSC and SSC, which represent light-scattering properties that are intrinsic to the cell, extrinsic parameters require the addition of a fluorescent probe for their detection. Fluorescent-labeled antibodies bound to the cell can be detected with the laser. By using fluorescent molecules with various emission wavelengths, laboratory scientists can simultaneously evaluate an individual cell for several extrinsic properties. The clinical utility of such multicolor analysis is enhanced when the fluorescent data are analyzed in conjunction with FSC and SSC to enable more accurate subtyping. The combination of data allows for characterization of cells according to size, granularity, DNA and RNA content, antigens, total protein, and cell surface receptors.

Optics and Photodetectors

The various signals (light scatter and fluorescence) generated by the cells' interaction with the laser are detected by photodiodes for FSC and by photomultiplier tubes for fluorescence. The number of fluorochromes capable of being measured simultaneously depends upon the number of photomultiplier tubes in the flow cytometer. The specificity of each photomultiplier tube for a given band of wavelengths is achieved through the arrangement of a series of optical filters that are designed to maximize collection of light derived from a specific fluorochrome while minimizing interference of light from other fluorochromes. The newer flow cytometers use fiber-optic cables to direct light to the detectors. When fluorescence from labeled antibodies bound to cell surfaces reaches the photomultiplier tubes, it creates an electrical current that is converted into a voltage pulse.
The voltage pulse is then converted into a digital signal using various methods, depending on the manufacturer. The digital signals are proportional to the intensity of light detected. The intensity of these converted signals is measured on a relative scale that is generally divided into channels numbered 1 to 256, from the lowest energy level or pulse to the highest level.

Sample Preparation

Samples commonly used for analysis include whole blood, bone marrow, and fluid aspirates. Whole blood should be collected into ethylenediaminetetraacetic acid (EDTA), the anticoagulant of choice for samples processed within 30 hours of collection. Heparin can also be used for whole blood and bone marrow and can provide improved stability in samples for up to 48 hours. Blood should be stored at room temperature (20°C to 25°C) before processing and should be well mixed before being pipetted into staining tubes. Hemolyzed or clotted specimens should be rejected.

Peripheral blood, bone marrow, and other samples with large numbers of RBCs require erythrocyte removal to allow for efficient analysis of WBCs. Historically, density gradient centrifugation with Ficoll-Hypaque (Sigma, St. Louis, MO) was used to generate a cell suspension enriched for lymphocytes or lymphoblasts. However, this method results in selective loss of some cell populations and is time-consuming. Density-gradient centrifugation has mainly been replaced by erythrocyte lysis techniques, both commercial and non-commercial. Samples are treated with lysing buffers to destroy the erythrocytes while leaving the WBCs intact.

Tissue specimens such as lymph nodes should be collected and transported in tissue culture medium (RPMI 1640) at either room temperature (if analysis is imminent) or 4°C (if analysis will be delayed). The specimen is then disaggregated to form a single cell suspension, either by mechanical dissociation or enzymatic digestion.
Mechanical disaggregation, or "teasing," is preferred and is accomplished by the use of a scalpel and forceps, a needle and syringe, or a wire mesh screen. Antibodies are then added to the resulting cellular preparation and the samples are processed for analysis. The antibodies used are typically monoclonal, each labeled with a different fluorescent tag.

Data Acquisition and Analysis

Once the intrinsic and extrinsic properties of the cells have been collected, the data are digitized and ready for analysis. Typically 10,000 to 20,000 "events" are collected for each sample. Each parameter can be analyzed independently or in any combination. Graphics of the data can be represented in multiple ways. The first level of representation is the single-parameter histogram, which plots a chosen parameter (generally fluorescence) on the x-axis versus the number of events on the y-axis; thus, only a single parameter is analyzed using this type of graph (Fig. 13-3). The operator can then set a marker to differentiate cells that have low levels of fluorescence (negative) from cells that have high levels of fluorescence (positive) for a particular fluorochrome-labeled antibody. The computer will then calculate the percentage of "negative" and "positive" events from the total number of events collected.

The next level of representation is the bivariate histogram, or dual-parameter dot plot, where each dot represents an individual cell or event. Two parameters, one on each axis, are plotted against each other. Each parameter to be analyzed is determined by the operator. Using dual-parameter dot plots, the operator can draw a "gate" around a population of interest and analyze various extrinsic and intrinsic parameters of the cells contained within the gated region (Fig. 13-4). The gate allows the operator to screen out debris and isolate subpopulations of cells of interest. Gates can be thought of as a set of filtering rules for analyzing a very large database.
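The idea of a gate as a filtering rule, followed by a marker that separates negative from positive events, can be sketched as below. This is an illustration under stated assumptions, not instrument software: the `Event` record, the rectangular gate shape, and all signal values are hypothetical, and real analysis packages support arbitrary gate shapes.

```python
from dataclasses import dataclass


@dataclass
class Event:
    fsc: float           # forward scatter (relative size)
    ssc: float           # side scatter (relative granularity)
    fluorescence: float  # signal from one fluorochrome-labeled antibody


def rectangular_gate(events, fsc_range, ssc_range):
    """Keep only events whose FSC and SSC fall inside the gate boundaries."""
    return [
        event for event in events
        if fsc_range[0] <= event.fsc <= fsc_range[1]
        and ssc_range[0] <= event.ssc <= ssc_range[1]
    ]


def percent_positive(events, marker):
    """Percentage of events whose fluorescence is above the operator-set marker."""
    if not events:
        return 0.0
    positive = sum(1 for event in events if event.fluorescence > marker)
    return 100.0 * positive / len(events)


# Made-up events: four lymphocyte-like cells, one debris particle, one granular cell.
events = [
    Event(50, 20, 900), Event(55, 25, 30), Event(60, 22, 800), Event(52, 18, 10),
    Event(5, 3, 2), Event(120, 200, 40),
]
gated = rectangular_gate(events, fsc_range=(40, 70), ssc_range=(10, 40))
positive_percentage = percent_positive(gated, marker=100)
```

Gating first and then applying the marker means the percentage is calculated from the gated population rather than from all collected events, which is how debris is kept out of the result.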
The operator can filter the data in any way and set multiple or sequential filters (or gates).

When analyzing a population of cells using a dual-parameter dot plot, the operator chooses which parameters to analyze on both the x- and y-axes, divides the dot plot into four quadrants, and separates the positive events from the negative events in each axis (Fig. 13-5). Quadrant 1 consists of cells that are positive for fluorescence on the y-axis and negative for fluorescence on the x-axis. Quadrant 2 consists of cells that are positive for fluorescence on both the x- and y-axes. Quadrant 3 consists of cells that are negative for fluorescence on both the x- and y-axes. Quadrant 4 consists of cells that are positive for fluorescence on the x-axis and negative for fluorescence on the y-axis. The computer and specialized software will calculate the percentage of cells in each quadrant based on the total number of events counted.

A gate can be drawn around a population of cells based on their FSC versus SSC characteristics, and the extrinsic characteristics of the gated population can then be analyzed. For example, lymphocytes can be gated, after which the T-cell subpopulations (CD3+, CD4+ or CD3+, CD8+) and B cells (CD38+, CD3-) can be analyzed (Fig. 13-6). The absolute count of a particular cell type (for instance, CD4+ T lymphocytes) can be obtained by multiplying the absolute cell count of the population of interest (e.g., lymphocytes) derived by a hematology analyzer by the percentage of the fluorescent-positive cells in the sample (e.g., CD3+, CD4+ lymphocytes). This method is considered a dual-platform analysis. The disadvantage to this type of analysis is that it has a greater potential for added error associated with the use of two distinct methods to derive the absolute count. The single platform is now the method of choice to eliminate this type of error. Single platforms can be achieved by two types of methods: bead-based or volumetric.
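The quadrant assignment and the dual-platform calculation described above reduce to a few lines of arithmetic. The sketch below is illustrative only: the function names, marker positions, event values, and the lymphocyte count of 2,000 cells per microliter are all made up, and the quadrant numbering follows the description in the text.

```python
def quadrant(x_signal, y_signal, x_marker, y_marker):
    """Return the quadrant number (1 to 4) for one event, as numbered in the text."""
    x_positive = x_signal > x_marker
    y_positive = y_signal > y_marker
    if y_positive and not x_positive:
        return 1  # positive on y-axis only
    if x_positive and y_positive:
        return 2  # positive on both axes
    if not x_positive and not y_positive:
        return 3  # negative on both axes
    return 4      # positive on x-axis only


def quadrant_percentages(events, x_marker, y_marker):
    """Percentage of events in each quadrant, based on the total events counted."""
    counts = {1: 0, 2: 0, 3: 0, 4: 0}
    for x_signal, y_signal in events:
        counts[quadrant(x_signal, y_signal, x_marker, y_marker)] += 1
    total = len(events)
    if total == 0:
        return {q: 0.0 for q in counts}
    return {q: 100.0 * n / total for q, n in counts.items()}


def dual_platform_absolute_count(population_count_per_ul, percent_positive):
    """Absolute count = hematology analyzer count x flow cytometry percentage."""
    return population_count_per_ul * percent_positive / 100.0


# Made-up (x, y) fluorescence pairs for eight gated lymphocyte events.
events = [
    (500, 600), (700, 650), (900, 950), (620, 800),  # double positive
    (10, 700),                                        # y positive only
    (20, 15), (30, 25),                               # double negative
    (800, 5),                                         # x positive only
]
percentages = quadrant_percentages(events, x_marker=100, y_marker=100)
absolute_count = dual_platform_absolute_count(2000, percentages[2])
```

Because the result multiplies values from two instruments, an error in either the analyzer count or the flow percentage carries directly into the absolute count, which is the weakness of the dual-platform approach noted in the text.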
In the bead-based method, a known quantity of fluorescent beads is added to the flow cytometry tubes, and a simple mathematical calculation allows the absolute WBC numbers to be directly determined from the individual flow cytometry tubes. In the volumetric method, the precise volume of the sample can be used to calculate the absolute number of events.

Detailed phenotypic analysis can determine the lineage and clonality, as well as the degree of differentiation and activation, of a specific cell population. This information is useful for differential diagnosis or clarification of closely related lymphoproliferative disorders (see Chapter 18). Immunophenotyping requires careful selection of combinations of individual markers based on a given cell lineage and maturation stage. Attempts to standardize individual marker panels, especially by European laboratory groups, are ongoing; however, the markers selected for inclusion in testing panels vary from institution to institution.

Clinical Applications

Routine applications of flow cytometry in the clinical laboratory can be divided into two categories: nonmalignant immunophenotyping and malignant immunophenotyping. Nonmalignant immunophenotyping includes the characterization and enumeration of n