Genetic Testing Essentials (Molecular Basis) PDF

Document Details

Thomas Jefferson University

Charles P. Scott, Ph.D.

Tags

genetic testing prenatal diagnosis molecular biology medical genetics

Summary

This document is a set of lecture notes on genetic testing, covering the molecular basis of genetic tests used in pregnant patients. It discusses cell-free DNA, next-generation sequencing, and array comparative genomic hybridization. It also touches on single nucleotide polymorphisms and uniparental disomy.

Full Transcript

Genetic Testing Essentials (Molecular Basis of Genetic Tests Used in Pregnant Patients)

Aug. 29, 2023
Instructor: Charles P. Scott, Ph.D.
Block 5: Urology/Endocrine/Reproduction
Thread: Biochemistry
Conflict of Interest: Dr. Scott has no conflicts of interest to disclose.
Modified from notes originally prepared by Dr. Peter Ronner

Prerequisites

Before this session, you should be able to do the following:
1. Explain the major factors that affect DNA hybridization.
2. Explain how PCR amplification works.

Learning objectives

By exam time, you should be able to do the following:
1. Describe the location, sources, and approximate length of cell-free DNA in a pregnant person.
2. Summarize the method that underlies massively parallel sequencing/next generation sequencing (NGS) and explain the relationship between sequence coverage and sequence reliability.
3. Outline how array comparative genomic hybridization (aCGH) is performed, and describe typical results of aCGH.
4. Discuss single-nucleotide polymorphisms (SNPs), provide an example of a SNP and a variant of unknown significance (VUS), and explain how SNP microarrays can be used to determine SNP copy numbers.
5. Define the chromosome abnormality denoted by the term uniparental disomy and explain how copy number and B-allele frequency from a SNP microarray can be used to infer uniparental disomy.

Study Questions

1. Given the fetal fraction and the normalized chromosomal signal, predict the fetal genotype from the cfDNA data below.

   Patient  Fetal fraction  Chr. 13  Chr. 18  Chr. 21  Chr. X  Chr. Y
   A        6%              1.00     1.00     1.03     0.97    0.03
   B        8%              1.04     1.00     1.00     1.00    --
   C        4%              1.00     1.02     1.00     0.98    0.02
   D        10%             1.00     1.00     1.00     0.95    --
   E        5%              1.00     1.00     1.00     1.00    0.03

2. Explain the difference between a screening test and a diagnostic test, discuss the significance of sensitivity versus specificity for each, and classify the following: a) 17-OHP; b) cfDNA; c) karyotype.
3. Identify the copy number variation from the aCGH data provided below.
4. Using the data in problem 3 A & B, would you expect to see a B allele frequency of 0.5 for SNPs with two variants in either of the regions shaded in blue? Why or why not?

Overview

Genetic testing in utero provides critical information to parents and health-care providers to inform decision-making aimed at achieving optimal outcomes for infants and families. Adequate fetal sources of DNA are not accessible for genetic testing until approximately the 10th week of pregnancy. This lecture is focused on the biochemical methods through which genetic information is determined from fetal DNA samples. This lecture provides background for Dr. Al-Kouatly’s clinical medicine lecture on Genetic Testing in Pregnancy.

A. Cell-Free Methods

1. Cells undergoing apoptosis shed extracellular vesicles (exosomes) containing DNA fragments derived from degradation of chromosomal DNA. These DNA fragments are approximately 150 bp in length.
   a. In blood, this so-called cell-free DNA (cfDNA) has a short half-life (~15 min) but has a reasonable steady-state concentration since cells are continuously undergoing apoptosis.
2. In a pregnant person, most of the cell-free DNA derives from hematopoietic cells.
   a. By 10-20 weeks of gestation, up to 10% of the circulating cell-free DNA derives from the placenta.
      i. In most pregnancies, the placenta is genetically identical to the fetus.
5. Cell-free DNA is typically analyzed by an amplification-based sequencing technique called Next Generation Sequencing (see section B). Note that one fetal allele (the maternally inherited one) is essentially indistinguishable from the maternal background.
6. Although in principle specific sequences can be amplified and compared to paternal DNA for trait analysis, and meiotic crossover events might even be detectable through deep sequencing techniques, such analyses are not routine.
7. Currently, cell-free DNA is mostly used for the detection of trisomies 13, 18 & 21, and for determining the number of sex chromosomes.

   NOTE: cfDNA is a SCREENING test, NOT a diagnostic test! Screening tests identify disease indicators; diagnostic tests establish the presence/absence of a condition. Positive screening results must be confirmed with diagnostic tests!

8. An important parameter for successful cfDNA-based testing is the fetal fraction of cfDNA (fetal cell-free DNA divided by total cell-free DNA in the sample).
   a. Currently, a proper analysis of fetal cfDNA requires that at least 4% of all cfDNA stems from the placenta. This requirement limits fetal cfDNA analysis to pregnancies beyond about 10 weeks of gestation.

B. Next Generation Sequencing (NGS)

1. Next generation sequencing is also called massively parallel sequencing.
2. There are several competing methods for NGS. The most commonly used approach (machines from Illumina) works as follows, starting with a patient sample:
   a. If patient DNA is not already fragmented (as in cell-free DNA, see section A), cut it into random fragments of 200-500 base pairs.
   b. Ligate flanking sequences to the sample DNA to enable capture by hybridization and to provide priming sites for amplification.
   c. Hybridize the ligated patient DNA to a flow cell that is derivatized with oligonucleotides that are complementary to the ligated flanking sequences.
   d. Amplify DNA using the oligonucleotides that are attached to the flow cell as primers. This results in extended products that are attached to the flow cell.
   e. Wash away the patient DNA, and perform multiple rounds of amplification using the oligonucleotides that are attached to the flow cell as primers. This so-called “in-situ PCR” forms clusters (polonies) that contain many copies of the same DNA.
   f. Sequence the DNA by:
      i. Adding a primer corresponding to one of the ligated flanking sequences;
      ii. Extending the primer with a polymerase to incorporate a fluorescently labeled base complementary to the next base of the patient-derived template (each of the four common nucleobases is labeled with a different fluorophore);
      iii. Taking an image of the polonies in the flow cell (typically 10^6 to 10^8) to record the identity of the base in the patient-derived sequence;
      iv. Removing the fluorescent label and 3’-protecting group, then repeating steps ii and iii;
      v. Determining the nucleotide sequence with image analysis/bioinformatics software.
3. The Illumina method produces reads with a length of about 300-600 bases.
   a. The error rate is about 0.1% for each base.
4. From the length of each chromosome (in base pairs) and the number of polonies from each chromosome, the dosage of each chromosome can be calculated.
5. To account for an uneven number of reads and reduce errors in the reported sequence, pieces of DNA are sequenced many times (e.g., 50-400 times) and in both directions (forward & reverse reads).
6. Mathematical algorithms are used together with reference genome sequences to assemble sequence reads into a proband’s chromosome sequence.

   Figure: Assembly of overlapping sequences obtained from massively parallel sequencing. The sequence is given on the x-axis. Stacked on top of this sequence are fragments of various lengths that fit this sequence.

7. Coverage refers to the average number of reads that span a given position in the DNA sequence (in the above image, the average number of gray bars seen over a particular location of the genome).
   a. For whole exome sequencing, a typical goal for coverage is 100x.
8. Automated and manually assisted bioinformatics processing gives rise to a report for the health care provider.
   a. Bioinformatics processing and interpretation are a very expensive component of sequencing.
   b. Variants of unknown significance (VUS) are common, and their reporting/non-reporting presents many ethical issues.
9. NGS has reduced the cost of sequencing dramatically, making whole genome sequencing feasible. Access to patient genetic information on the genomic scale has profound implications for health care delivery.

C. Cell-based Methods

1. Any sample collected from a fetus must be small, especially early in gestation. Cell-free methods circumvent small sample size through amplification (e.g., in situ PCR). For cell-based methods such as chorionic villus sampling (CVS) or amniocentesis, small samples are expanded through cell culture methods.
2. DNA from cultured cells can be readily harvested for array-based methods (see section D).
3. In addition to providing a non-mutagenic approach for expanding sample size, cell culture enables enrichment of cellular subpopulations (e.g., mitotic cells) that are necessary for some diagnostic tests (e.g., karyotyping).

D. Chromosomal Microarrays

1. A microarray is typically a treated slide (e.g., made of glass) to which many different molecules of known structure are anchored at known positions. The description below is specific to DNA microarrays.
   a. A DNA microarray is often about 1 x 1 in. in size and is subdivided into more than a million sites (“cells”).
   b. Each site contains millions of copies of the same short DNA sequence (called a probe, reporter, or oligo; for instance 25 nucleotides long).
   c. These probes hybridize to added sample DNA (or a mixture of sample and reference DNA).
2. Recall that a single-nucleotide mismatch impairs hybridization most in an ultrashort probe and least in an ultra-long probe. Keep in mind that a short sequence of nucleotides (e.g., ATGC) can statistically occur many times in the genome, but a much longer sequence (e.g., >20 nucleotides) is unlikely to occur more than once. (Since DNA is composed of four bases, any given sequence will occur by chance about once in 4^N bases, where “N” is the number of bases in the sequence. Given that the human genome is roughly 3 billion bases in length, any sequence longer than 20 bases is likely to be unique unless enriched by positive selection.) Probes of 20-30 nucleotides are selective on the genomic scale but are also sensitive to single base substitutions (e.g., single nucleotide polymorphisms (SNPs)), since the effect of a mismatch on hybridization depends on probe length. These considerations also determine the properties of the short and long probes discussed next.
3. Probes on DNA microarrays can be selected to recognize large deletions or amplifications. Such probes are relatively long and therefore cannot detect small alterations of the genome. These probes are also non-polymorphic (nonvariant). They are well suited to detect abnormal copy numbers and are sometimes called “copy number probes”.
4. Probes on DNA microarrays can also be selected to recognize single bases at select locations. Such probes are shorter (~25 bp) and are present with the appropriate sequence diversity. These probes can test for polymorphisms (see section F) and thus are especially useful in detecting absence of heterozygosity, e.g., due to consanguinity or uniparental disomy (more on this later). Such probes are sometimes called “SNP variant probes”.
5. In practice, chromosomal microarrays often contain a mixture of short and long probes, i.e., they test for polymorphic and non-polymorphic regions.
6. Chromosomal microarrays are designed to find known microdeletions, examine subtelomeric regions, and look for alterations throughout the genome.
7. Cells used for analysis by chromosomal microarray do not have to divide.

E. Array Comparative Genomic Hybridization (aCGH)

1. For aCGH, reference DNA is cut up with restriction enzymes, and patient DNA is cut up in the same manner. Reference DNA is labeled with one fluorescent dye, and patient DNA is labeled with a different fluorescent dye. Then, a competition is set up with the intent to find deletions and duplications (or amplifications).
2. Labeled reference DNA and labeled patient DNA are allowed to hybridize to probes on a DNA microarray.
   a. The location of each probe on the microarray and the location of the matching sequence in the genome are known.
   b. Remember that each location (“cell”) contains millions of identical probe molecules. Reference DNA and patient DNA compete for binding to these probe molecules.
3. The fluorescence of each dye at each location on the array is measured. It is then possible to figure out whether there is more patient DNA or more reference DNA at a particular location.

   Figure: Array Comparative Genomic Hybridization (aCGH). In this experiment, patient DNA was labeled green, and reference DNA was labeled red. When the patient and reference DNA are present in similar amounts, the spot looks yellow, because equal amounts of red and green generate yellow. If there is an excess of patient DNA, the spot looks more green, and if there is a loss of patient DNA, the spot looks more red. From: E Karampetsou et al., J Clin Med 2014:3:663-678.

4. Sometimes, the ratio of patient to reference DNA is expressed as a logarithm to the base of 2 (i.e., log2) and displayed as such.
   a. If there are equal amounts of both DNA fragments, the ratio is 2/2 = 1, and the log2 of 1 is 0.
   b. If there is a deletion in the patient’s genome, the ratio is 1/2 = 0.5, and the log2 of 0.5 is -1.
   c. If there is a duplication in the patient DNA, the ratio is 3/2 = 1.5, and the log2 of 1.5 is about +0.6.
5. A difference in the amount of DNA between patient and reference DNA is called a copy number variation (CNV).
6. The following is a sample of aCGH output for chromosome 17, where red dots represent the sample and blue dots represent the reference.

   Figure: Log2 ratio of patient DNA to reference DNA. The blue-shaded area highlights a CNV in the form of a 7 Mb deletion at the end of the p arm. http://www.scielo.br/pdf/jped/v91n1/0021-7557-jped-91-01-00059.pdf

7. aCGH provides locations of copy number variations (CNVs).
   a. A CNV arises from deletions or insertions of DNA.
      i. The resolution of aCGH depends on the length of the probes and their distribution across the genome. CNVs can be detected if they are larger than about 50,000 base pairs.
   b. Inversions and balanced translocations (often detectable by karyotyping and FISH) cannot be detected by aCGH because they do not change the ratio of patient DNA to reference DNA at any location (i.e., sequence context is lost in aCGH because fragments of chromosomes are prepared and studied).
   c. Aneuploidies (e.g., monosomy, trisomy) can be detected by aCGH.
   d. Triploidy cannot be detected by aCGH (every chromosome is increased by the same factor, so the normalized ratio of patient to reference DNA is unchanged).

F. Single Nucleotide Polymorphism (SNP) Array

1. A polymorphism refers to something that assumes multiple forms. In biology, a single nucleotide polymorphism (SNP) refers to a base in a particular position in the genome of a population that can have an alternate nucleotide, e.g., A instead of C. The less common form must generally occur in more than 1% of the population to be reported.
2. In the simplest/usual case, a SNP has only two alleles that we shall call A and B. A person can be homozygous (AA or BB) or heterozygous (AB), missing one allele (A- or B-), missing both alleles (--), or have 3 copies (AAA, AAB, ABB, BBB), etc.
3. A SNP microarray contains probes for the A allele as well as probes for the B allele.
   a. The SNP probes are typically 25-80 bp long.
4. In contrast to aCGH, a SNP microarray does NOT use a competing reference DNA (however, a reference file is used to interpret measurements).
   a. Patient DNA is cut into shorter pieces using restriction enzymes.
   b. The fragments are labeled with a fluorescent dye.
   c. The labeled DNA is denatured (i.e., made single-stranded by heating) and allowed to hybridize with the DNA probes on the microarray.
5. The fluorescence is measured for each spot of probe DNA on the microarray. The measured intensities are compared to those found in a normal standard. The measured fluorescence intensity is thus estimated to derive from 0, 1, 2, or 3 copies of a particular DNA location.

   Figure: Single Nucleotide Polymorphism (SNP) Array. From: E Karampetsou et al., J Clin Med 2014:3:663-678.

6. The measured data are displayed against an ideogram of each chromosome.
7. The summed fluorescence intensity of many individual pairs of A and B alleles indicates the number of copies of a chromosome section in the sample (see illustration above).
8. The B allele frequency (BAF) can be calculated for each SNP.
   a. On average, the B allele frequency is close to 0.5 (this is because the alleles were assigned A and B randomly).
      i. A plot of B allele frequency for a normal karyotype shows points mostly at 0%, 50%, and 100% (keep in mind that there is some error of measurement).
   b. Uniparental disomy refers to the presence of 2 chromosomes (or a part of 2 chromosomes) that derive(s) only from one parent (how this happens is complex and tangential to a lecture on genetic testing). Uniparental disomy is a problem when the affected region contains imprinted genes or disease alleles that show recessive inheritance.
      i. In uniparental disomy, a plot of B allele frequency shows data only near 0% and 100% (not at 50%, because there is no heterozygosity), while the fluorescence intensity is indicative of 2 copies.

   Figure source: Cytogenetics into Cytogenomics: SNP Arrays Expand the Screening Capabilities of Genetics Laboratories. LK Conlin and NB Spinner. Illumina Application Note (https://www.illumina.com/documents/products/appnotes/appnote_cytogenetics.pdf)

Answers to Study Questions

1. Given the fetal fraction and the normalized chromosomal signal, predict the fetal genotype from the cfDNA data below.

   Patient  Fetal fraction  Chr. 13  Chr. 18  Chr. 21  Chr. X  Chr. Y
   A        6%              1.00     1.00     1.03     0.97    0.03
   B        8%              1.04     1.00     1.00     1.00    --
   C        4%              1.00     1.02     1.00     0.98    0.02
   D        10%             1.00     1.00     1.00     0.95    --
   E        5%              1.00     1.00     1.00     1.00    0.03

   a. Since the fetal fraction is 6%, 94% of the signal is maternal (47% from each chromosome) and 6% of the signal derives from the fetus (3% from each chromosome). A third fetal copy adds another 3%, so a normalized signal of 1.03 for chromosome 21 indicates additional dosage consistent with fetal trisomy 21. The observation of reduced signal from the X chromosome coupled with evidence for a Y chromosome suggests a fetal genotype of 47,XY,+21 (Down syndrome).
   b. This patient has excess signal for chromosome 13 consistent with fetal trisomy 13. A normalized X chromosome signal of 1.00 coupled with lack of Y chromosome signal suggests a fetal genotype of 47,XX,+13 (Patau syndrome).
   c. This patient has excess signal for chromosome 18 consistent with fetal trisomy 18. Reduced X chromosome signal coupled with detectable Y chromosome signal suggests a fetal genotype of 47,XY,+18 (Edwards syndrome).
   d. In this patient, signal from the X chromosome is reduced, but there is no evidence for a Y chromosome, which suggests a fetal genotype of 45,X (Turner syndrome).
   e. This patient has a normalized X chromosome signal of 1.00 coupled with additional signal indicating presence of a Y chromosome, which suggests a fetal genotype of 47,XXY (Klinefelter syndrome).

2. Explain the difference between a screening test and a diagnostic test, discuss the significance of sensitivity versus specificity for each, and classify the following: a) 17-OHP; b) cfDNA; c) karyotype.

   The purpose of screening tests is to identify disease indicators, usually in asymptomatic populations at low risk. As such, screening tests should have high sensitivity. Screening tests require confirmation, so false positives are tolerable, but false negatives must be minimized. In contrast, diagnostic tests are used on symptomatic individuals (or individuals with positive screening tests) to confirm or rule out the presence of a disease condition. Specificity is critical for establishing definitive diagnoses.

   a. The 17-OHP newborn screen is used in asymptomatic newborns to identify infants with congenital adrenal hyperplasia (CAH). It is a screening test.
   b. cfDNA is used early in pregnancy to detect trisomies and determine the number of sex chromosomes. It is a screening test.
   c. Karyotyping is typically performed when there is clinical suspicion of genetic anomalies. Karyotyping is the gold standard for diagnosing trisomies. It is also useful for diagnosing inversions and translocations (especially in combination with FISH), which are invisible to array-based approaches. Karyotyping is a diagnostic test.

3. Identify the copy number variation from the aCGH data provided below.
   (http://www.scielo.br/pdf/jped/v91n1/0021-7557-jped-91-01-00059.pdf)
   a. Log2 ~ 0.6 in the q-arm of chromosome 6 indicates duplication.
   b. Log2 ~ -1 in the p-arm of chromosome 4 indicates deletion.
   c. Log2 < 0 at the end of the q-arm of chromosome 1 suggests deletion.

4. Using the data in problem 3 A & B, would you expect to see a B allele frequency of 0.5 for SNPs with two variants in either of the regions shaded in blue? Why or why not?

   No. In 3A, duplication means that there are THREE alleles for any SNP in the shaded region. As such, the B allele frequency can be 1, 2/3 (0.67), 1/3 (0.33), or 0. In 3B, deletion within the shaded region means that there is only ONE allele for any SNP in the shaded region; therefore, the B allele frequency can only be 1 or 0. A B allele frequency of 0.5 is not expected in either region.
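
The arithmetic behind the answers to study questions 3 and 4 can be sketched in a few lines of Python. This is only an idealized illustration: it assumes a reference with two copies, a biallelic SNP, and no measurement noise, and the function names `expected_log2_ratio` and `possible_baf_values` are hypothetical helpers, not part of any array analysis package.

```python
from fractions import Fraction
from math import log2


def expected_log2_ratio(patient_copies: int, reference_copies: int = 2) -> float:
    """Idealized aCGH log2 ratio (patient / reference), ignoring measurement noise."""
    if patient_copies <= 0 or reference_copies <= 0:
        raise ValueError("log2 ratio is undefined for zero or negative copy numbers")
    return log2(patient_copies / reference_copies)


def possible_baf_values(copies: int) -> list[Fraction]:
    """All B allele frequencies a biallelic SNP can take at a given copy number.

    With `copies` alleles present, the number of B alleles can be 0..copies,
    so the possible frequencies are 0/copies, 1/copies, ..., copies/copies.
    """
    if copies <= 0:
        raise ValueError("BAF is undefined when no copies are present")
    return [Fraction(b, copies) for b in range(copies + 1)]


# Deletion (1 copy), normal (2 copies), duplication (3 copies).
for label, copies in [("deletion", 1), ("normal", 2), ("duplication", 3)]:
    ratio = expected_log2_ratio(copies)
    bafs = ", ".join(str(v) for v in possible_baf_values(copies))
    print(f"{label:12s} copies={copies}  log2 ratio={ratio:+.2f}  possible BAF: {bafs}")
```

Under these assumptions, 1/2 appears among the possible values only for two copies, which is why a B allele frequency of 0.5 is not expected in either the duplicated or the deleted region.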

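The dosage reasoning used in the answer to study question 1 can be written out as a small calculation. The sketch below is a simplified model, not a clinical algorithm: it assumes the signal is proportional to copy number, that the mother carries two copies of each autosome and of the X chromosome and no Y chromosome, and that signals are normalized so that two copies in both mother and fetus give 1.00. The function name `expected_signal` is hypothetical.

```python
def expected_signal(fetal_fraction: float, fetal_copies: int, maternal_copies: int = 2) -> float:
    """Expected normalized cfDNA signal for one chromosome.

    The maternal share (1 - fetal_fraction) contributes maternal_copies,
    the fetal share contributes fetal_copies, and the mixture is divided
    by 2 so that a disomic mother and fetus give a signal of 1.0.
    """
    if not 0.0 <= fetal_fraction <= 1.0:
        raise ValueError("fetal_fraction must be between 0 and 1")
    mixed_copies = (1.0 - fetal_fraction) * maternal_copies + fetal_fraction * fetal_copies
    return mixed_copies / 2.0


ff = 0.06  # fetal fraction of patient A in study question 1
print(f"Chr. 21, fetal trisomy:   {expected_signal(ff, 3):.2f}")
print(f"Chr. X, fetus with one X: {expected_signal(ff, 1):.2f}")
# The mother has no Y chromosome in this model, so maternal_copies is 0.
print(f"Chr. Y, fetus with one Y: {expected_signal(ff, 1, maternal_copies=0):.2f}")
```

In this model a trisomy shifts the signal by only half the fetal fraction, which illustrates why a minimum fetal fraction is needed before such small deviations can be read reliably.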