Bio 150: Fluorescence In-Situ Hybridization (FISH) - Midyear 2022-2023 PDF
Document Details
Uploaded by BriskAntigorite
Summary
This document provides an introduction to fluorescence in-situ hybridization (FISH), a molecular cytogenetic technique used to detect and locate specific DNA sequences or entire chromosomes within cells. It details the methodology and applications of FISH, including disease diagnosis, gene mapping, and the identification of chromosomal abnormalities. The document is part of a course on cellular and molecular biology.
Full Transcript
Bio 150 Introduction to Cellular and Molecular Biology
Date Submitted: 07/13/23
FLUORESCENCE IN-SITU HYBRIDIZATION
Score:
Midyear AY 2022-2023

I. INTRODUCTION

Fluorescence In Situ Hybridization (FISH) is a molecular cytogenetic technique that detects and locates a specific DNA sequence or an entire chromosome in a cell (Dutra, 2023). The general principle of FISH is based on fluorescent probes that bind to specific regions of a chromosome with a high degree of sequence complementarity. The probe can be either a short RNA strand or a short DNA strand, and it must be single-stranded when it hybridizes to the target. FISH has various applications, including disease diagnosis, gene mapping, the localization of mutations on chromosomes, and the identification of chromosomal abnormalities. The technique can also serve as a reference for comparing the chromosomal arrangement of genes in related species (Shakoori, 2017).

Joseph Gall and Mary Lou Pardue developed the underlying in situ hybridization technique in the 1960s. It originally used radioactive labels in hybridization probes, and the sites were detected using autoradiography. Radioactive labels were later replaced with fluorescent labels because of their greater safety, stability, and ease of detection (Shakoori, 2017).

DNA is made of two linked strands that are coiled together into a structure known as a double helix. The two strands are held together by hydrogen bonds between their complementary bases, and the sequence of bases encodes specific biological information (Bates, 2023). Two complementary sequences can therefore bind together, or hybridize. The FISH technique makes use of the ability of one DNA strand to hybridize specifically with another DNA strand. It uses small DNA or RNA strands called probes that are attached to a fluorescent reporter molecule (Shakoori, 2017). The probes are complementary to specific parts of a chromosome or to a specific target sequence of the sample DNA.

Figure 1.
Fluorescent in situ hybridization (FISH) identification of human chromosomes through chromosome painting. Image retrieved from Shakoori (2017).

Figure 1 shows a computer-generated “false color” image of a human karyotype in which the variations in fluorescence wavelength among the probes are enhanced to appear as distinct primary colors. DNA probes that are specific to regions of particular chromosomes are attached to fluorescent markers and then hybridized with a chromosome spread. The unique pattern on each chromosome allows for the detection of chromosomal variations (Shakoori, 2017).

II. METHODOLOGY

FISH uses heat to separate the strands of the double helix so that the probes can bind to their complementary sequence in the patient’s DNA. If a small deletion is present in the region complementary to the probe, the probe will not be able to hybridize. If a duplication is present, more probes will be able to hybridize (Shakoori, 2017).

Fluorescence in-situ hybridization (FISH) uses complementary probes to bind and visualize specific nucleic acid sequences in cells or tissues. The probes are labeled with fluorescent molecules and applied to fixed samples. After hybridization and washing, fluorescence microscopy is used to visualize the bound probes, providing information about the location and abundance of the target sequences.

Cell or tissue samples were obtained and fixed with 4% paraformaldehyde at room temperature for 10 minutes to preserve the cellular structures and maintain nucleic acid integrity (Dutra, 2023). The specific DNA or RNA probes were designed using bioinformatic tools (Hertoghs et al., 2003) to target complementary sequences of the nucleic acid of interest.
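The hybridization principle described above (a probe binds only where its complementary sequence is present, so a deletion gives no signal and a duplication gives more) can be illustrated with a minimal Python sketch. The function names and the toy sequences are hypothetical and chosen only for illustration; real probe design relies on dedicated bioinformatic tools and much longer sequences.

```python
# Toy illustration of probe/target complementarity.
# All names and sequences here are hypothetical, not real loci.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_binding_sites(target, probe):
    """Count positions on the target strand where the probe pairs perfectly."""
    site = reverse_complement(probe)  # sequence the probe base-pairs with
    count = 0
    start = target.find(site)
    while start != -1:
        count += 1
        start = target.find(site, start + 1)
    return count


region = "CCGTAGG"                      # region of interest on the target
probe = reverse_complement(region)      # probe complementary to that region

normal = "TTGA" + region + "CTAACG"     # one copy of the region
deleted = normal.replace(region, "")    # region lost: nothing to bind
duplicated = normal + normal            # two copies: two binding sites

for name, target in [("normal", normal), ("deleted", deleted), ("duplicated", duplicated)]:
    print(name, count_binding_sites(target, probe))
```

In this toy model, the number of binding sites plays the role of the number of fluorescent signals seen under the microscope.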
The following reagents were used for probe design and labeling: DNA or RNA probes specific to the target sequence, fluorescently labeled nucleotides (e.g., Cy3-dUTP or FITC-dUTP), and labeling enzymes (e.g., Klenow DNA polymerase or reverse transcriptase) (Heslop-Harrison et al., 1991).

After the samples were prepared, they were denatured at 75°C for 10 minutes to allow binding to the target nucleic acids. The samples were then mixed with the labeled probes. Hybridization was carried out at 37°C overnight to facilitate specific binding between the probe and the target sequences. Hybridization buffers containing formamide (Sigma-Aldrich) were used to optimize the hybridization conditions.

Following hybridization, samples underwent a series of post-hybridization washes to remove unbound or non-specifically bound probes. Stringent washing conditions were employed, including a wash with 0.2× SSC (Saline-Sodium Citrate) buffer at 60°C for 15 minutes, to ensure specific hybridization and reduce background noise (O’Connor, 2008).

Finally, to visualize cellular structures and enhance contrast, samples were counterstained with appropriate dyes or stains, such as DAPI (4',6-diamidino-2-phenylindole, Thermo Fisher Scientific) for nuclear staining. Samples were then mounted in an appropriate mounting medium to preserve the fluorescence signals and prevent photobleaching.

IV. DISCUSSION

How are probes made?

Fluorescence in situ hybridization probes are short DNA or RNA sequences that are labeled with fluorescent molecules. They are made to complement target sequences, so they bind specifically to these regions only. By hybridizing to a complementary target sequence, FISH probes allow it to be visualized under a fluorescence microscope.

GUTIERREZ, LAGRASON, LAGUNDA SANDAGA, TAGARINO

These probes are classified into different types: gene-specific, centromeric, whole-chromosome, and telomeric probes.
Each of these probe types targets a specific location and serves a variety of applications (Ford & Reid, 2000). FISH probes can be labeled with different fluorescent molecules such as fluorescein isothiocyanate, cyanine dyes, or Alexa Fluor dyes. The fluorophore used varies depending on the requirements of the procedure to be conducted (Lukumbuzya et al., 2019).

As discussed, one crucial step in fluorescence in-situ hybridization is probe design. The probe is the most critical component in ensuring the specificity of hybridization. Commercially available probes are generally the best choice for this assay. When probes are synthesized instead, pure DNA is required and proper quality control must be performed. Techniques such as gel electrophoresis or spectroscopy may be employed to evaluate the probe’s integrity and fluorescence.

It is also important to note that Cot-1 DNA is supplemented along with the probes. Cot-1 DNA is typically used in FISH procedures as a blocking agent to minimize non-specific binding of probe sequences. Supplementing the probe with Cot-1 DNA reduces “background noise”, which allows for a more specific and accurate detection of the target sequences (Wang et al., 1995).

Probes can be labeled by nick translation. First, the DNase enzyme makes random cuts in the DNA, which are called nicks. DNase is an endonuclease that catalyzes the cleavage of DNA molecules within the polynucleotide chain by targeting the phosphodiester bonds between nucleotides. Then, DNA polymerase attaches special nucleotides carrying a fluorophore (fluorescein, Cy3, or Cy5) to the DNA to enable visualization and detection. Lastly, the ligase enzyme seals the nicks, leaving the fluorescent nucleotides in place (Bartlett, 2004).

What did this tell us about chromosome organization?
As discussed, fluorescent in-situ hybridization uses fluorescently labeled probes whose nucleotide sequences are complementary to the target DNA sequence. Once hybridized to the DNA, the probes serve as colored markers for that particular sequence under a fluorescence microscope. FISH has been used together with karyotyping to observe chromosome organization by determining the location of the genes that are expressed or responsible for certain characteristics. This has enabled the diagnosis of chromosomal abnormalities such as gene translocation, deletion, and/or duplication (O’Connor, 2008).

Different probes are used to make these observations, namely locus-specific probes, alphoid or centromeric repeat probes, and whole-chromosome probes. Locus-specific probes are used to determine the location of a gene within the chromosome and the number of its copies. Alphoid or centromeric repeat probes are used to determine whether an individual has the correct number of chromosomes. Whole-chromosome probes are used to generate a spectral karyotype, which is useful for detecting chromosomal abnormalities (Shakoori, 2017).

One illustration of the application of FISH is the diagnosis of Charcot-Marie-Tooth type 1A, a neurological condition caused by a gene duplication on chromosome 17. Red probes were used to locate the duplicated gene, while a green probe was used to identify chromosome 17.

Figure 2. Fluorescent in situ hybridization (FISH) identification of human chromosomes 17 in Charcot-Marie-Tooth disease Type 1A.

As shown in Figure 2, one copy of chromosome 17 carries a duplication of the gene, indicated by the two red signals near one green signal, while the other green signal has only a single nearby red signal (O’Connor, 2008).

The changes in chromosome organization observed with FISH imply that chromosomes have a certain degree of specificity and that the localization and number of genes present in chromosomes have a significant effect on gene expression, as reflected by significant changes in an organism’s anatomy and/or physiology. Chromosome organization therefore has implications for the regulation of DNA processes (Zimmer and Fabre, 2011).

Group 4

Second Generation Sequencing - Sequencing by Ligation/ SOLiD

Supported Oligonucleotide Ligation and Detection (SOLiD) is a second-generation DNA sequencing technology developed by Applied Biosystems that introduced a new sequencing technique, sequencing-by-ligation (Applied Biosystems, 2008). Sequencing-by-ligation depends on DNA ligase joining DNA fragments in order to determine the underlying sequence of the target DNA. Its predecessors instead relied on sequencing-by-synthesis, in which the addition of nucleotides by DNA polymerase (DNAP) is used to sequence the target DNA (Nguyen, 2021).

Emulsion PCR is used to amplify, on a bead, the target sequence conjugated to an ssDNA primer-binding region known as an adapter. The beads are then deposited onto a glass surface (ATD Bio, n. d.). Once a bead is deposited, a primer of length N is hybridized to the adapter. The beads are then exposed to a number of 8-mer probes, each of which carries one of several fluorescent dyes at the 5' end and a hydroxyl group at the 3' end. Bases 1 and 2 are complementary to the nucleotides to be sequenced, bases 3-5 are degenerate, and bases 6-8 are inosine bases. The complementary probe hybridizes to the target sequence adjacent to the primer. DNA ligase is then used to join the 8-mer probe to the primer.
A phosphorothioate bond between bases 5 and 6 allows the fluorescent dye to be cleaved from the fragment using silver ions. This cleavage allows the fluorescence to be measured; four different fluorescent dyes with different emission spectra are used. The cleavage also generates a 5’-phosphate group that can undergo further ligation. The extension product is melted off after the first round of sequencing is completed. A second round of sequencing is then performed with a primer of length N-1, and each succeeding round uses a primer shorter than the one before it.

The SOLiD platform uses four (4) fluorescent dyes for detection and two-base encoding to analyze the sequence, in which the sixteen (16) possible combinations of two (2) nucleotide bases are associated with the fluorophores (Choudhori, 2014). The raw data acquired after a cycle cannot be translated directly into a sequence because a single color can correspond to any of four (4) nucleotide combinations, as seen in Figure 1 (Garrido-Cardenas et al., 2017). A successive cycle is therefore required to identify the base at a certain position by ruling out the other possible combinations using the fluorophore signal detected in the next cycle. In this way, each nucleotide base of the sequence is interrogated twice by different probes. The next base can be deduced if the previous base is known, and this continues until the whole sequence is read (Menon, 2021). The whole sequence can thus be deduced if one base is known, as seen in Figure 2 (Applied Biosystems, n.d.).

Figure 1. Association of nucleotide pairs to the fluorescent dyes used in the SOLiD platform. Retrieved from Garrido-Cardenas et al., 2017.

Figure 2. An example on reading a sequence using the 2 base encoding by Applied Biosystems.

SOLiD can generate over six gigabases of mappable data and more than two hundred forty million tags per run, which makes it advantageous for ultra-high-throughput applications.
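The two-base encoding described above can be sketched in a few lines of Python. The sketch assumes the commonly described color-space scheme in which the four colors are numbered 0 to 3 and each color stands for four of the sixteen dinucleotides; the numbering and the function names are illustrative assumptions, and the dye actually assigned to each code should be taken from Figure 1.

```python
# Minimal sketch of SOLiD-style two-base (color-space) encoding.
# Assumption: with A=0, C=1, G=2, T=3, the color of a dinucleotide is the
# XOR of its two base codes, which gives 4 dinucleotides per color.
BASES = "ACGT"


def encode_colors(seq):
    """Translate a base sequence into one color code per adjacent base pair."""
    codes = [BASES.index(base) for base in seq]
    return [a ^ b for a, b in zip(codes, codes[1:])]


def decode_colors(first_base, colors):
    """Rebuild the base sequence from a known first base and the color calls."""
    current = BASES.index(first_base)
    decoded = [first_base]
    for color in colors:
        current ^= color          # known base + color fixes the next base
        decoded.append(BASES[current])
    return "".join(decoded)


colors = encode_colors("ATGGC")
print(colors)                      # color calls for AT, TG, GG, GC
print(decode_colors("A", colors))  # recovers the sequence from one known base
```

The decoding loop shows why one known base is enough: each color narrows the next base to a single possibility once the previous base is fixed.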
SOLiD sequencing is considered one of the most accurate second-generation sequencing technologies, at 99.94%, because of its capability for intensive sequencing (Ho et al., 2011). Its ability to reduce measurement errors and its superior SNP detection contribute to this robust accuracy (Castellana et al., 2012). Furthermore, the technology is easy to implement and readily accessible because it can be performed with off-the-shelf reagents (Applied Biosystems, 2008). In addition, SOLiD allows users to track run status in real time to help ensure that runs are completed successfully. The independent flow cell configuration enables users to run two completely independent experiments on a single SOLiD Analyzer. The SOLiD System's open slide format and flexible bead densities enable increases in throughput on the current system with modest protocol and chemistry optimizations.

On the other hand, one disadvantage of SOLiD sequencing is that it is slow: a single run takes up to seven days to complete (ATD Bio, n. d.). Moreover, it has a short read length of 35 bp, which is significantly shorter than what competing sequencing technologies offer. Additionally, recent research has demonstrated that palindromic sequences are difficult to sequence effectively using sequencing-by-ligation techniques (Applied Biosystems, 2008).

SOLiD sequencing was able to make a lasting impact on a number of applications in the domains of transcriptomics, bacterial genome research, and chromatin immunoprecipitation, since the benefits of the technology vastly exceed the disadvantages.

DNA Organization and Sequencing Techniques: 3C or Hi-C

Chromatin Conformation Capture/ Hi-C

Chromatin (or chromosome) Conformation Capture, also known as the 3C technique, is a method for identifying the spatial arrangement of chromosomal DNA in fixed cells (Cope & Fraser, 2009).
The 3C technique is used to map chromatin interactions across the genome by measuring the interactions between genomic loci. The resulting maps give an idea of the spatial organization of chromosomes and of how they fold (Akgol Oksuz et al., 2021). There are two widely used 3C protocols, Hi-C and Micro-C, which differ in essential experimental factors such as the techniques and strategies involved in crosslinking chemistry and chromatin fragmentation. For this assigned topic, the focus will be on the Hi-C method.

Figure 1. Overview on how 3C is done

Guide Questions:

1) How does it work?

Hi-C is a 3C-based technology that provides an “all-vs-all” approach: it uses chromosome conformation to identify pair-wise chromatin interactions across the entire genome, and it has become a standard tool for studying genome organization (Lafontaine et al., 2021). The general procedure of Hi-C involves chemically crosslinking cells, which are then lysed, digested with a restriction enzyme, and marked with biotin. The fragments are then ligated before they are prepared for sequencing, resulting in a map of pair-wise chromatin interactions across the entire genome. Below are the details of the general process of the Hi-C protocol.

First, cells are fixed with formaldehyde, which crosslinks interacting DNA segments. The cells are then lysed and treated with a restriction enzyme, specifically a restriction endonuclease, which digests the crosslinked DNA into smaller fragments. The overhanging ends are then filled in, incorporating biotinylated nucleotides. Next, a blunt-end ligation is performed. This step is done under very dilute conditions because dilution favors ligation between crosslinked fragments.
The ligation produces a circular, hybrid DNA molecule that is biotinylated at the ligation junction. The crosslinks are then reversed and the DNA is sheared into fragments. Only the biotinylated DNA fragments formed through the ligation are pulled down with the aid of streptavidin beads. After high-throughput sequencing adaptors are attached, these fragments are sequenced, and the resulting sequences are used to map the genome.

Fig 2&3. Hi-C Method

Although Hi-C is a great method for identifying chromatin interactions genome-wide, as it can detect long-range DNA interactions, it still has limitations when it comes to the resolution of the assay. The sparsity of long-range contacts in the contact matrix and the large percentage of zero-contact counts between loci are typical flaws in high-resolution Hi-C data. Consequently, certain current modeling techniques might not be able to produce representations at a higher resolution (Oluwadare et al., 2019).

2) How do you interpret the Hi-C map?

Prior to using Hi-C data for model creation, they are typically converted to a matrix form known as a contact matrix or contact map, which is an N × N matrix generated from Hi-C data that shows the number of interactions across chromosomal regions. The size of the matrix, N, is the number of equal-sized segments into which a chromosome is divided. The length of these equal-sized segments (for example, 1 Mb) is referred to as the resolution. In a Hi-C experiment, each entry in the matrix holds a count of the read pairs that connect the two corresponding chromosome regions. As a result, the chromosomal contact matrix represents all of the observed interactions between chromosome sections (University of Colorado, 2015).

Since Hi-C data are often converted to a contact matrix in which chromosomes are plotted against each other, each read pair represents a distinct interaction between two regions.
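The construction of a contact matrix described above can be sketched as follows. This is a minimal illustration for a single chromosome, assuming each read pair has already been mapped to two genomic positions; the function name, the toy read pairs, and the 1 Mb bin size are assumptions made for the example and do not reflect any particular Hi-C software.

```python
# Minimal sketch: binning mapped Hi-C read pairs into an N x N contact matrix.
# The read-pair coordinates below are made-up toy data.


def contact_matrix(read_pairs, chrom_length, bin_size):
    """Count read pairs linking each pair of equal-sized chromosome bins."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division gives N
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in read_pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix


pairs = [
    (100_000, 150_000),      # both ends in bin 0
    (200_000, 2_500_000),    # bin 0 with bin 2
    (2_600_000, 300_000),    # bin 2 with bin 0
    (1_200_000, 1_900_000),  # both ends in bin 1
]

for row in contact_matrix(pairs, chrom_length=3_000_000, bin_size=1_000_000):
    print(row)
```

A smaller bin size gives a larger matrix and a finer resolution, but with the same number of read pairs more entries stay at zero, which is the sparsity problem noted above.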
These interactions are then plotted according to how often they occur. Regions with more interactions are plotted darker, while regions with the fewest interactions are plotted lighter. To put it all together, the contact matrix of a Hi-C dataset shows the number of interactions between pairs of chromosome regions. High interaction frequencies are seen most often within the same chromosome rather than between different chromosomes, hence the dark line running along the diagonal. Moreover, the white or gray lines correspond to non-mappable regions of the genome, which are repetitive sequences that can lead to inaccurate interpretations (Lajoie et al., 2015).

3) What did this tell us about chromatin organization (Topologically associating domains)?

Hi-C shows the presence of two levels of chromatin organization known as topologically associating domains (TADs) and compartments (see figure).

Fig 4. Hi-C maps showing topologically associating domains

A TAD is an organizational element of the genome, particularly of chromosomes, containing 100,000 to 1,000,000 base pairs that form a cluster of loops (Pollard et al., 2017). Compartments, meanwhile, are longer-range interactions that are thought to involve interactions between many TADs. Compartments are also suggested to correspond to larger domains of euchromatin and heterochromatin (Pollard et al., 2017).

TADs can be considered sub-segments of a chromosome. A single TAD consists of one continuous section of one chromosome, but TADs can assemble into larger compartments called the A/B compartments. The A compartment generally corresponds to transcriptionally active regions, and TADs belonging to the A compartment can interact among themselves. These regions also tend to have higher gene density, more chromatin accessibility, and more active histone modifications.
On the other hand, the B compartment contains TADs that tend to be transcriptionally repressed or inactive regions such as heterochromatin (Quon, 2020). The compartments are associated with distinct nuclear structures. For instance, compartment B tends to be found in close proximity to the nuclear lamina and in the interior near the nucleolus, while compartment A TADs can be observed in the interior of the nuclear space or between the nucleolus and the lamina (Quon, 2017).

Fig 5. A and B compartment structures in the nucleus

TADs are bounded by specific types of regulatory proteins such as CTCF (CCCTC-binding factor), which marks approximately 75% of TAD boundaries at binding sites called insulators. These elements were originally known as short DNA sequence elements that typically divide regions containing active genes from regions containing inactive genes. Between 50,000 and 70,000 CTCF binding sites have been mapped in mammals, and most of these were found to be located within TADs (Pollard et al., 2017).

CTCF can associate with a ring-shaped complex known as cohesin, which is a key architectural factor in chromosomes. Cohesin was first identified because it regulates the pairing of replicated sister chromatids during cell division. Since CTCF and cohesin are found to work together to bind regulatory elements and genes, it is hypothesized that this association with a mobile DNA element could be a factor that contributed to the development of complex patterns of gene regulation in humans (Pollard et al., 2017).

Assignment 1: Chromosome Banding

Chromosome banding is the process of staining chromosomes to help researchers better understand and identify their structural composition. It tags and identifies chromosomes by producing variously colored bands, or alternating dark and light bands, on the chromosome as a result of staining with dyes.
These bands play a crucial role in identifying the precise location of genes on a chromosome, since each region can be distinguished from its neighboring sections by the brightness or darkness of its bands. There are four main types of chromosome banding: (Quinacrine) Q-banding, (Centromeric) C-banding, (Giemsa) G-banding, and (Reverse) R-banding. All four types start with the same extraction and sample preparation process, as described in steps 1-6 in the diagram below.

Source: https://www.onlinebiologynotes.com/chromosome-banding-and-painting/

Steps 7-8 are technique-specific (Howe et al., 2014; Huang & Cheng, 2016):
- Q-banding: highly fluorescent quinacrine mustard is used and the preparation is exposed to UV light to reveal the banding pattern. AT-rich regions in heterochromatin fluoresce brightly (corresponding to the dark G-bands), while GC-rich regions in euchromatin fluoresce weakly because GC base pairs quench the fluorescence of the dye.
- C-banding: the DNA is denatured using alkali solutions and washed with a hypotonic solution, then Giemsa dye is applied, staining the heterochromatin regions at and near the centromere.
- G-banding: the euchromatic histones are denatured using a protease and the chromosomes are then treated with Giemsa. This results in the lightening of phosphate-rich regions due to the interaction between the thiazine and eosin components of Giemsa dye and the DNA.
- R-banding: the chromosome is incubated and denatured in hot acidic saline and then stained with Giemsa dye. GC-rich regions are darkly stained and AT-rich regions are lightly stained or not stained at all, because AT-rich regions are denatured more readily.

One of the most commonly used dyes in the chromosome banding technique is Giemsa dye. Giemsa is a differential visible-light dye, developed by Gustav Giemsa, that selectively binds to the phosphate groups of DNA through intercalation, attaching particularly to regions with high concentrations of adenine-thymine bonding.
This stain is used in G-banding, which is the most fundamental and frequently used method for staining compressed or condensed chromosomes in cytogenetics. The dye consists of a combination of cationic thiazine dyes (most importantly Azure B, an oxidation product of methylene blue) and an anionic eosin dye (such as eosin Y). Since these dyes carry different charges, they have affinities for different cellular compartments. Azure B is basic, so it reacts with acidic components such as nucleic acids, producing a deep purple color. Eosin is acidic, so it reacts with the cytoplasm and its components, staining them pink. The structures of these dyes are found in the image below from Estandarte (2012).

Source:https://www.ucl.ac.uk/~ucapikr/projects/Ana_staining_LitRev.pdf.

The formation of a thiazine-eosin precipitate in a 2:1 molar ratio is what leads to chromosomal staining. According to Estandarte (2012), two molecules of the small, fast-diffusing, positively charged thiazine dye migrate and intercalate between the base pairs of the DNA (specifically into the minor grooves), and the chromosomes stain blue as a result. Azure B binding results in a configurational change that favors and accommodates the large, slow-diffusing, negatively charged eosin molecule. The eosin molecule then forms a precipitate with the thiazine molecules, which turns the chromosomes purple, since eosin stains pink. The formation of such a precipitate is favored in a hydrophobic environment. The type of interaction between the thiazine and eosin molecules is still up for debate, but a study by Zanker (1981) suggested the formation of a charge-transfer complex in which thiazine acts as the acceptor and eosin as the donor. This is supported by Wittekind and Gehring (1958), who suggest the formation of hydrogen bonds between the thiazine and eosin molecules that facilitate electron transfer.
Two types of bands can be observed: positive (darkly stained) and negative (lightly stained) G-bands. Positive G-bands are hydrophobic patches, which allows the thiazine-eosin combination to precipitate. The hydrophobicity is caused by hydrophobic proteins. These regions are condensed and rich in protein disulfide cross-links, which keep the hydrophobic proteins in position during the pretreatment process. Since they are predominantly AT-rich areas that make up the late-replicating heterochromatin, they can be revealed by fluorochromes that are specific for AT-rich regions (Estandarte, 2012).

Negative G-bands, meanwhile, are lightly stained. These locations are less hydrophobic and less favorable to thiazine-eosin precipitation. They are early-replicating euchromatin that is less condensed and has its protein sulfur predominantly as sulfhydryls. These regions contain many GC base pairs, so negative G-bands can be revealed by fluorochromes that are specific for GC-rich regions of the chromosomes (R-banding) (Estandarte, 2012).

R-banding, also known as reverse chromosomal banding, reverses the G-band staining approach. It results in a chromosomal band pattern that is the opposite of the G- and Q-banding patterns. In the R-banding technique, the dark band (AT-rich region) seen in the G-banding method looks light, and vice versa. The R-banding process also uses Giemsa stain; however, the slide is heated in a buffer solution to ~87°C before Giemsa staining (Estandarte, 2012).

The C-banding technique stains the constitutive heterochromatin of centromeres. The centromere is positioned near the center of the chromosome at the attachment point of the two chromatids, which is crucial during mitosis and meiosis.
This technique involves denaturing the chromosomes in a saturated alkaline solution before Giemsa staining. The treatment depurinates the DNA and breaks the DNA backbone, which causes DNA to be extracted from certain regions of the chromosome (Estandarte, 2012).

Source: https://www.researchgate.net/publication/23687486/figure/fig2/AS:196113509425153@1423768512122/Conventional-cytogenetic- analysisresults-of-the-fetus-G-and-R-banding-results-showing.png

Chromosome banding is a technique used in chromosome karyotyping to identify normal and defective chromosomes for clinical and research purposes. Karyotyping is the study of the morphology of a chromosome complement in terms of size, shape, position of the centromere, and any other distinguishing traits. A karyotype can indicate how closely or distantly related species are, based on the similarity or dissimilarity of their karyotypes or observed chromosome traits. Karyotyping can therefore be used to examine the relationship between closely related species. It is also useful for detecting chromosomal aberrations or mutations, which may help explain a family history of genetic disorders, reveal chromosome abnormalities in an unborn child, identify the cause of infertility, or help diagnose specific types of gene-related illnesses (Kumar, et.al., 2021).

Source:https://www.genome.gov/genetics-glossary/Karyotype

PYROSEQUENCING

What is Pyrosequencing?
- A DNA sequencing technique that relies on the detection of luminescence when a pyrophosphate is released during DNA synthesis (Liu et al., 2015)
- A sequencing-by-synthesis method [a complementary strand is synthesized in the presence of a polymerase enzyme]
- The name is derived from the combined terms pyrophosphate and DNA sequencing
- It detects the release of pyrophosphate when nucleotides are added to the DNA chain.

Where does pyrophosphate come from?
In DNA synthesis, when a phosphodiester bond forms between the last nucleotide of the growing strand and a new complementary nucleotide, pyrophosphate is released (Kottur & Nair, 2018). Thus, every time a nucleotide is added during DNA synthesis, pyrophosphate is released.

How is pyrophosphate in DNA synthesis detected?

Pyrophosphate is detected via an enzyme cascade reaction that results in the emission of a light signal dependent on the quantity of ATP. This light signal confirms that a pyrophosphate has been released and, consequently, that a new nucleotide was added to the growing strand (Wanger et al., 2017).

Figure 1. Pyrophosphate and a substrate, in this case adenosine phosphosulfate, give ATP in the presence of ATP sulfurylase

ATP sulfurylases (ATPSs) are enzymes that catalyse the primary step of intracellular sulfate activation: the reaction of inorganic sulfate with ATP to form adenosine-5′-phosphosulfate (APS) and pyrophosphate (PPi) (Ullrich et al., 2001).

Figure 2. The ATP produced in the previous reaction is used by the enzyme luciferase to convert luciferin to oxyluciferin and produce light

Luciferin is a small-molecule substrate that reacts with oxygen in the presence of a luciferase (an enzyme) to release energy in the form of light; that is, it is oxidized in the presence of the enzyme, which produces light (Cordeau et al., 2012).

How does it work?

When a dNTP is added to a template DNA, pyrophosphate (PPi) is released. As mentioned earlier, this pyrophosphate can be converted into ATP by the enzyme ATP sulfurylase. The ATP can then be used by the enzyme luciferase to generate light (Mohsen & Kool, 2019).

DNA polymerases use dNTPs to extend a DNA primer hybridized to a complementary DNA template. As the appropriate dNTP is incorporated into the growing strand, a phosphodiester bond is formed between the 3′ hydroxyl terminus of the growing strand and the α-phosphate of the incoming dNTP.
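The incorporation-and-light cycle described above can be illustrated with a small simulation in Python. It is a simplified sketch under stated assumptions: nucleotides are dispensed one at a time in a fixed repeating order (here A, C, G, T, chosen arbitrarily), the light signal is taken to be exactly proportional to the number of nucleotides incorporated, and the template is written in the order the polymerase reads it. The function names are hypothetical.

```python
# Simplified pyrosequencing simulation (illustrative assumptions only).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def simulate_pyrogram(template_3to5, dispensation="ACGT"):
    """Return (nucleotide, signal) for each dispensation until synthesis ends.

    The signal is the number of nucleotides incorporated, standing in for
    the light produced from the released pyrophosphate.
    """
    expected = [COMPLEMENT[base] for base in template_3to5]
    pos = 0
    pyrogram = []
    while pos < len(expected):
        for nucleotide in dispensation:
            incorporated = 0
            # A run of identical bases is incorporated in one dispensation.
            while pos < len(expected) and expected[pos] == nucleotide:
                incorporated += 1
                pos += 1
            pyrogram.append((nucleotide, incorporated))
            if pos == len(expected):
                break
    return pyrogram


def read_sequence(pyrogram):
    """Reconstruct the synthesized strand from the pyrogram peaks."""
    return "".join(nucleotide * signal for nucleotide, signal in pyrogram)


pyrogram = simulate_pyrogram("TTGCA")
print(pyrogram)
print(read_sequence(pyrogram))
```

A dispensation with signal 0 corresponds to a nucleotide that was not incorporated and produced no light, while a signal of 2 corresponds to two identical bases added in succession.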
There are four types of dNTP (deoxynucleoside triphosphate), each carrying a different DNA base: adenine (dATP), cytosine (dCTP), guanine (dGTP), and thymine (dTTP).

Note: Deoxyadenosine α-thio triphosphate (dATPαS) is used instead of dATP. Luciferase uses ATP to produce light and can also act on dATP, which would give a false signal. dATPαS is used by the polymerase but not by luciferase. Along with ATP, luciferase also uses oxygen and luciferin as substrates.

- Incorporation of any dNTP generates PPi.
- Only one type of dNTP is added at a time.
- When the dNTP is incorporated, pyrophosphate is released and light is produced, which sensors can detect.
- If the dNTP is not incorporated, no light is detected.
- Some dNTP can remain unused. This occurs when the incoming nucleotide is not complementary to the nucleotide on the template strand.
- Unused dNTP is removed by apyrase, a nucleotide-degrading enzyme.

How are the machine outputs interpreted?

In the presence of ATP sulfurylase and adenosine 5′-phosphosulfate (APS), the pyrophosphate is converted into ATP. This ATP is used for the luciferase-catalyzed conversion of luciferin to oxyluciferin, which produces light that can be detected with a camera or sensors.

In DNA pyrosequencing, the amount of light generated is proportional to the number of nucleotides incorporated. The light intensity is often represented graphically in a pyrogram, where the y-axis represents the light intensity and the x-axis represents the nucleotides added. The height of each peak reflects the number of nucleotides incorporated at that dispensation: the relative intensity of light is proportional to the amount of base added (i.e., a peak of twice the intensity indicates that two identical bases have been added in succession). The sequence read from the pyrogram is complementary to the template strand sequence. In this way, pyrosequencing determines the sequence of the unknown DNA fragment.

Figure 3.
The y-axis corresponds to the height of the peak, which can also be shown as the number of nucleotides incorporated at each dNTP dispensation. The x-axis represents the nucleotide positions in the DNA sequence being analyzed, or simply the dNTP dispensation order (Ramon et al., 2003).

Advantages and disadvantages of Pyrosequencing [in comparison to the Sanger method (Shen & Qin, 2012)]

Advantages:
- Faster than Sanger sequencing
- Has a higher sensitivity
- Cost-effective

Disadvantages:
- Not a procedure for simple research labs
- Data processing and analysis can be more complex and challenging

Advantages in detail:
- Higher sensitivity: Sanger sequencing requires more than 20% tumor load in a specimen to obtain a reliable result, while pyrosequencing needs only 5% tumor load.
- Cost-effective: A major limitation of Sanger's method was that it could sequence only a small number of genes at one time, and the cost of sequencing was very high. Next-generation sequencing can produce massive volumes of data from a single run at relatively lower cost (Gupta & Verma, 2019).
- Faster sequencing: The pyrosequencing-based sequencer, Roche 454, can produce around 700 MB of data a day and still be accurate and reliable.

Disadvantages in detail:
- Complex procedure: It is profitable only for labs where a large quantity of analysis is needed, such as characterization of microorganisms, diagnosis of disease, or analysis of allele frequencies (Ahmad, 2018).
- Data processing: Because of the amount of data generated, special software is required for data analysis. Analysis of data for EGFR, KRAS, and BRAF is a manual process.
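To make the pyrogram interpretation described earlier more concrete, the following is a minimal Python sketch of an idealized pyrosequencing run. It assumes that light intensity is exactly proportional to the number of nucleotides incorporated, that apyrase fully removes unused dNTP between dispensations, and that the template is written in the order the polymerase reads it. The function names `simulate_pyrogram` and `read_sequence` are hypothetical and chosen only for illustration; they do not come from any instrument software.

```python
from typing import List, Tuple

# Watson-Crick pairing: the synthesized strand is complementary to the template.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def simulate_pyrogram(template: str, dispensation: str) -> List[Tuple[str, int]]:
    """Return (dispensed base, peak height) for each dNTP dispensation.

    Assumptions (illustrative only): `template` is given in the order the
    polymerase reads it, and peak height equals the number of nucleotides
    incorporated. "A" stands for dATP-alpha-S in a real run.
    """
    # Bases the polymerase must add, in order, to copy the template.
    to_synthesize = [COMPLEMENT[base] for base in template.upper()]
    position = 0
    peaks = []
    for base in dispensation.upper():
        incorporated = 0
        # The dispensed dNTP is incorporated for as long as it matches the
        # next required base; a run of identical bases gives a taller peak.
        while position < len(to_synthesize) and to_synthesize[position] == base:
            incorporated += 1
            position += 1
        # A height of 0 means no incorporation: no PPi, no light, and the
        # unused dNTP would be degraded by apyrase.
        peaks.append((base, incorporated))
    return peaks


def read_sequence(peaks: List[Tuple[str, int]]) -> str:
    """Rebuild the synthesized strand from the pyrogram peaks."""
    return "".join(base * height for base, height in peaks)


if __name__ == "__main__":
    # Hypothetical template and a cyclic A, C, G, T dispensation order.
    peaks = simulate_pyrogram("CCGTAAT", "ACGTACGTACGTA")
    for base, height in peaks:
        print(f"{base}: {'#' * height}")
    print("Synthesized strand:", read_sequence(peaks))
```

For the hypothetical template `CCGTAAT`, the reconstructed strand is `GGCATTA`, the complement of the template. The two `G` incorporations and the two `T` incorporations each appear as a single peak of height 2, matching the statement that a peak of twice the intensity indicates two identical bases added in succession.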