CH4306 Bioanalytical Techniques Lecture Notes PDF

CH4306 Bioanalytical Techniques
Assoc Prof TAN Meng How
N1.2-B2-33
[email protected]

Course Administration
Lectures: Tuesdays 2.30-5.20pm (LT9)
Textbook: Andreas Manz, Petra Dittrich, Nicole Pamme, and Dimitri Iossifidis. Bioanalytical Chemistry, Imperial College Press (2015, 2nd Edition).
Grading scheme: Quiz 1 (20%), Quiz 2 (20%), Final Exam (60%)
Topics: All 8 chapters in the textbook, plus functional genomics and enzymology

Lecture 1 – Biomolecules

Lecture Outline
- DNA (Hereditary Information)
- RNA (Coding and Non-Coding)
- Proteins (Traditional Workhorses)
- The Human Genome (Who We Are)

DNA (Hereditary Information)

What is DNA?
Building blocks of DNA: Deoxyribonucleotides
Each building block is composed of:
(1) Phosphoric acid/phosphate
(2) Sugar (deoxyribose)
(3) Base

The four DNA bases: adenine, thymine, cytosine, guanine
- Adenine and guanine belong to the double-ringed class of molecules called purines (abbreviated as R).
- Cytosine and thymine are single-ringed pyrimidines (abbreviated as Y).

DNA base-pairing
The most fundamental role of DNA in the cell is the storage and retrieval of biological information.
DNA in a cell is double-stranded:
- Adenine forms two hydrogen bonds with thymine (A = T)
- Cytosine forms three hydrogen bonds with guanine (C ≡ G)
The two DNA strands are reverse complements of each other. An example (the KRAS oncogene):
5’-ATGACTGAATATAAACTTGTGGTAGTTGGAGCTGGTGGCGTAGGCAAG … -3’
3’-TACTGACTTATATTTGAACACCATCAACCTCGACCACCGCATCCGTTC … -5’

Chargaff’s Rules
Erwin Chargaff was a biochemist who analysed the purine and pyrimidine base content of DNA from a variety of organisms. He observed that, although the base composition varied from species to species, in every organism he studied the percentage of A equalled that of T and the percentage of G equalled that of C.
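The base-pairing and Chargaff's rules above can be checked with a few lines of code. The following is a minimal sketch, not part of the original slides; the function names `reverse_complement` and `base_fractions` are chosen here purely for illustration. It confirms that the two KRAS strands shown above are reverse complements, and that a duplex built from them has equal fractions of A and T and of G and C.

```python
# Watson-Crick pairing partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand_5to3: str) -> str:
    """Return the partner strand, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand_5to3.upper()))


def base_fractions(strand_5to3: str) -> dict:
    """Base composition of the duplex formed by a strand and its partner."""
    duplex = strand_5to3.upper() + reverse_complement(strand_5to3)
    return {base: duplex.count(base) / len(duplex) for base in "ATCG"}


# KRAS fragment from the notes: top strand 5'->3', bottom strand 3'->5'.
kras_top = "ATGACTGAATATAAACTTGTGGTAGTTGGAGCTGGTGGCGTAGGCAAG"
kras_bottom_3to5 = "TACTGACTTATATTTGAACACCATCAACCTCGACCACCGCATCCGTTC"

# The bottom strand is written 3'->5', so reverse it before comparing.
assert reverse_complement(kras_top) == kras_bottom_3to5[::-1]

# Chargaff's rules hold automatically for any double-stranded molecule.
fractions = base_fractions(kras_top)
assert fractions["A"] == fractions["T"]
assert fractions["G"] == fractions["C"]
```

Note that Chargaff's rules apply to the duplex as a whole; a single strand on its own need not contain equal amounts of A and T.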
DNA strands have directionality
- 5’ end has a free phosphate group
- 3’ end has a free hydroxyl group
- Phosphodiester bond (in yellow)
- A nucleotide subunit (in grey)
- DNA is negatively charged

DNA double helix
The double-helix model of DNA structure was deduced from X-ray diffraction images of DNA.
James Watson, Francis Crick, and Maurice Wilkins shared the 1962 Nobel Prize in Physiology or Medicine for the discovery.

Rosalind Franklin controversy
A British molecular biologist (born 1920, died 1958).
After obtaining her doctorate in physical chemistry from the University of Cambridge, she spent three years in Paris learning X-ray diffraction techniques.
In the 1950s, she took beautiful X-ray photographs of DNA at King’s College London, where she was leading her own research group.
Maurice Wilkins mistook her for a technician and showed her photographs to James Watson and Francis Crick without her permission.

B-DNA helix
There are three types of DNA helices: A-DNA, B-DNA (most common), and Z-DNA.
B-DNA is a right-handed helix.
The bases form the core of the double helix, while the sugar/phosphate backbones are on the outside. The helical axis passes through the central bases.
The two grooves between the backbones are called the major and minor grooves based on their sizes.
The B-DNA double helix makes one complete turn about its axis every 10.5 base pairs in solution.
Although DNA is a relatively rigid polymer, it has three significant degrees of freedom: bending, twisting, and compression.

Comparison of DNA helices
A-DNA:
- Right-handed
- Planes of bases are tilted 20° relative to the axis
- 0.23nm rise between base pairs
- Broadest helix type
Z-DNA:
- Left-handed
- Base pairs are rotated
- Exists transiently in the cell, as the conformation is unstable
- Occasionally induced by biological activity (e.g. transcription) and then quickly disappears.
Landmark paper
"It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material."
A-T and G-C base pairings are also called Watson-Crick base pairings.

DNA Replication
DNA replication is the process by which a double-stranded DNA molecule is copied to produce two identical DNA molecules.
Replication is an essential process because, whenever a cell divides, the two new daughter cells must contain the same genetic information, or DNA, as the parent cell.
DNA replication occurs with extraordinarily high fidelity. The error rate or mutation rate is approximately 1 nucleotide change per 10^9 nucleotides each time the DNA is replicated.
The DNA replication machinery is highly conserved from bacteria to humans. The mutation rate is roughly the same for all organisms.
DNA replication occurs at a very fast rate. DNA is duplicated at rates as high as 1000 nucleotides per second.

Base-pairing underlies DNA replication
Since A can only pair with T, while C can only pair with G, each strand of DNA can serve as a template.
DNA replication is semi-conservative.

DNA synthesis is catalyzed by DNA polymerase
Substrates: single-stranded DNA, deoxyribonucleoside triphosphates (dNTPs).
The polymerase catalyzes the stepwise addition of a deoxyribonucleotide to one end of the primer strand.
The reaction is driven by a large favorable free-energy change, caused by the release of pyrophosphate, which is further hydrolyzed to inorganic phosphate.
The structure of DNA polymerase resembles a right hand in which the palm, fingers, and thumb grasp the DNA.
The DNA polymerase has a proof-reading capability.

Recall: The two DNA strands in a double helix are anti-parallel
5’-ATGGATTTATCTGCTCTTCG-3’
3’-TACCTAAATAGACGAGAAGC-5’
(This is part of the BRCA1 gene, which can cause breast cancer when mutated.)

Two possible models for DNA replication
1) 5’-ATGGATTTATCTGCTCTTCG-3’
3’-TACC...
Both strands grow continuously.
5’-ATGG...
3’-TACCTAAATAGACGAGAAGC-5’
2) 5’-ATGGATTTATCTGCTCTTCG-3’
...AAGC-5’
5’-ATGG...
3’-TACCTAAATAGACGAGAAGC-5’
DNA polymerization can occur in only one direction.

The incorrect model
Why is the simplest model incorrect?
No 3’-to-5’ DNA polymerase has ever been found!

DNA replication occurs only in the 5’ to 3’ direction
The replication fork has an asymmetric structure:
- The DNA daughter strand that is synthesized continuously is known as the leading strand.
- The daughter strand that is synthesized discontinuously is known as the lagging strand.
The DNA synthesized on the lagging strand must be made initially as a series of short DNA molecules called Okazaki fragments.
For the lagging strand, the direction of nucleotide polymerization is opposite to the overall direction of DNA chain growth.

More details on the lagging strand
Enzymes involved:
- DNA primase synthesizes short RNA primers, which are approximately 200 nucleotides apart.
- DNA polymerase extends from an RNA primer until it reaches another primer.
- RNase H erases RNA primers, thereby leaving gaps.
- DNA polymerase fills in the gaps.
- DNA ligase seals two consecutive fragments to produce a longer continuous DNA molecule.

DNA replication: Preserving and propagating the cellular message
- A new daughter strand is assembled on each parent strand
- Uses complementary base pairing, and requires a series of enzymes
- Highly regulated process

DNA packing in eukaryotes
A human cell's DNA totals ~3 meters in length.
All this DNA has to fit into a tiny nucleus of 5-10μm in diameter. This is like trying to stuff a piece of string 2km long into a tiny bead smaller than 1cm!
To achieve this seemingly impossible feat, cells use an ingenious packaging system: DNA is wrapped around proteins called histones. The resulting DNA-protein complex is called chromatin.

What is a gene?
A gene is the molecular unit of heredity of a living organism.
It refers to a stretch of DNA that codes for a polypeptide or for an RNA chain that has a function in the organism.
Living beings depend on genes, as they specify all proteins and functional RNA chains.
Genes hold the information to build and maintain an organism's cells and pass genetic traits to offspring.
E. coli has ~4000 genes, Saccharomyces cerevisiae has ~6000 genes, while humans have ~20,000 genes.
An operon contains a cluster of genes under the control of a single promoter. The genes are transcribed together into a single mRNA strand. Operons are commonly found in bacteria.
Example of an operon (lac operon):

Gene structures
Prokaryote:
- UTR: untranslated region
- RBS: ribosome binding site
Eukaryote (particularly in higher organisms):

Alternative splicing generates protein diversity

DNA and histones can be modified

RNA (Coding and Non-Coding)

What is RNA?
Like DNA, RNA is assembled as a chain of nucleotides. However, there are some important differences between DNA and RNA.
The four DNA bases: adenine, thymine, cytosine, guanine
The four RNA bases: adenine, uracil, cytosine, guanine

Different sugars are used in DNA and RNA
- Deoxyribonucleotide (DNA)
- Ribonucleotide (RNA)
The free 2’ hydroxyl group is highly reactive and makes RNA unstable.
No B-RNA helix is possible (steric clash).

What does RNA look like in the cell?
Unlike DNA, RNA exists as single-stranded molecules that can fold back on themselves to form complex secondary structures.
Stem-loop structures are commonly observed in RNA molecules. (Structure of an rRNA)
RNA structure is dynamic and can change depending on the biological context and on what other molecules bind to the RNA.

Riboswitches
A riboswitch is a regulatory segment of a messenger RNA molecule that binds a small molecule, resulting in a change in production of the proteins encoded by the mRNA.
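The stem-loop structures mentioned in the RNA slides form when two stretches of the same strand are reverse-complementary. The sketch below is an illustration only and not part of the original notes. It assumes strict Watson-Crick pairs (A-U, G-C), ignores G-U wobble pairs and thermodynamics, and uses a hypothetical helper name `can_form_stem_loop` with an assumed minimum loop size of 3 nucleotides.

```python
# Watson-Crick pairs allowed in this simplified model (no G-U wobble).
RNA_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def can_form_stem_loop(rna: str, stem_len: int, min_loop: int = 3) -> bool:
    """Check whether the first and last `stem_len` bases can pair as a hairpin stem.

    `min_loop` is an assumed lower bound on the unpaired loop length.
    """
    if stem_len < 1:
        raise ValueError("stem_len must be at least 1")
    rna = rna.upper()
    if len(rna) < 2 * stem_len + min_loop:
        return False
    five_prime_arm = rna[:stem_len]
    three_prime_arm = rna[-stem_len:]
    # Base i of the 5' arm pairs with base i counted from the 3' end.
    return all(
        (a, b) in RNA_PAIRS
        for a, b in zip(five_prime_arm, reversed(three_prime_arm))
    )


# A 4-bp stem closing a 4-nt loop: GCGC-UUCG-GCGC.
assert can_form_stem_loop("GCGCUUCGGCGC", stem_len=4)
# Replacing the 3' arm with AAAA removes the complementarity.
assert not can_form_stem_loop("GCGCUUCGAAAA", stem_len=4)
```

Real structure prediction tools weigh many competing pairings by free energy; this check only tests one proposed stem.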
Ribozymes
RNA molecules can form complex shapes and have reactive functional groups.
Unlike DNA, some RNA molecules can function as catalysts, just like protein enzymes. These RNA catalysts are known as ribozymes.
In 1989, Thomas Cech and Sidney Altman shared the Nobel Prize in Chemistry for their "discovery of catalytic properties of RNA".
It is now possible to make ribozymes that will specifically cleave any RNA molecule. (E.g. a ribozyme has been designed to cleave the RNA of HIV.)

Different types of RNA in the cell

Non-coding RNAs
Not all RNAs in the cell go on to produce proteins.
A non-coding RNA (ncRNA) is a functional RNA molecule that is not translated into a protein.
For decades, scientists thought that there were only two classes of ncRNAs: tRNAs and rRNAs.
Now, we know that there are other important classes of ncRNAs, such as microRNAs, which are involved in RNA silencing, and long ncRNAs (lncRNAs).
Recently, numerous unannotated RNA transcripts have been uncovered. Are they protein-coding or non-coding?
- Check coding potential
- Check conservation
- Perform further experiments

What are tRNAs?
tRNA is involved in the translation of mRNA into proteins.
One end of the tRNA contains an anticodon loop that pairs with three bases (a codon) in the mRNA to specify a certain amino acid.
The other end of the tRNA has the amino acid attached to the 3' OH group via an ester linkage.
There is at least one specific tRNA for each of the 20 amino acids.

From DNA to RNA
- Transcription
- 5’ end capping
- Splicing of pre-mRNA
- 3’ end polyadenylation
- Nuclear export of mature mRNA
Some of the steps may occur concurrently.
Transcription
Creating RNA from a DNA Template
- DNA bases are exposed
- One of the two strands of the DNA double helix, the antisense strand, acts as a template
- The nucleotide sequence of the RNA chain is determined by complementary base-pairing
- The RNA chain is elongated one nucleotide at a time
- RNA molecules produced by transcription are single strands

Key Players of Transcription
DNA template (with promoter)
The promoter is the site where the transcription machinery binds for the initiation of transcription. The two DNA strands are separated in that region, and the RNA polymerase then begins the transcription process.
RNA polymerase
The enzyme catalyzes the formation of the phosphodiester bonds that link the nucleotides together to form a linear chain.
Ribonucleoside triphosphates (NTPs)

Many RNA transcripts can be synthesized simultaneously
Once synthesized, the RNA strand does not remain hydrogen bonded to the DNA template. Instead, the RNA chain behind the RNA polymerase is displaced and the DNA double helix re-forms.
The almost immediate release of the RNA strand from the DNA means that many RNA copies can be made from the same gene in a short time.
The synthesis of additional RNA molecules starts before the first RNA is completed. Over a thousand transcripts can be synthesized in an hour from a single gene.
(Figure: Transcription of two genes, as observed under the electron microscope)

Protein isoforms
Alternative splicing is a regulated process whereby multiple proteins are produced from a single gene. In this process, particular exons of a gene may be included within or excluded from the final, processed mRNA produced from that gene.

An example of alternative splicing
α-tropomyosin is an integral component of the actin cytoskeleton.
Pink arrowheads indicate sites where cleavage and poly-A addition can occur.
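The template rule from the transcription slides (the RNA is built by complementary base-pairing against the antisense strand) can be illustrated in code. This is a minimal sketch, not part of the original slides; the function name `transcribe` is chosen here for illustration. It reuses the BRCA1 fragment from the replication slides and treats the 3’-to-5’ strand shown there as the template. The resulting RNA matches the other (sense) strand with U in place of T.

```python
# Pairing rule used during transcription: template DNA base -> RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (written 5'->3') made from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


# BRCA1 fragment from the notes.
sense_5to3 = "ATGGATTTATCTGCTCTTCG"
template_3to5 = "TACCTAAATAGACGAGAAGC"

rna = transcribe(template_3to5)
# The transcript has the same sequence as the sense strand, with U replacing T.
assert rna == sense_5to3.replace("T", "U")
```

Because the template is written 3’-to-5’ here, reading it left to right yields the RNA directly in the 5’-to-3’ direction in which it is synthesized.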
RNA can be modified in >150 ways

Proteins (Traditional Workhorses)

What are proteins?
Amino acids: basic building blocks
Synthesis:
- Transcription of DNA to mRNA by NTPs and RNA polymerase
- Translation of mRNA to protein by tRNA and ribosomes
The shape of a protein is important for its function.
Levels of structure: Primary, Secondary, Tertiary, Quaternary

Protein structures
Polypeptides can fold into two common secondary structures:
- α-helix: The polypeptide backbone follows a helical path. There are 3.6 amino acid residues per turn of the helix.
- β-sheet: Strands of protein lie adjacent to one another, interacting laterally via H bonds between backbone carbonyl oxygen and amino H atoms. The strands may be parallel or antiparallel.
A higher order structure is created by a combination of loops, α-helices, and β-sheets.

Proteins are modular
- A protein can contain several domains of known or unknown functions
- The presence of a well-studied domain in an uncharacterized protein can suggest the function of the protein
- Novel non-natural proteins can be produced by swapping domains or joining different domains together

Functions of proteins
Proteins have diverse biological functions, which can be classified into five main categories:
- Structural proteins: glycoproteins, collagen, keratin
- Catalytic proteins: enzymes
- Transport proteins: hemoglobin, serum albumin
- Regulatory proteins: hormones (insulin, growth hormones)
- Protective proteins: antibodies, thrombin

Kwashiorkor
Kwashiorkor is a severe form of malnutrition, caused by a deficiency in dietary protein.
The extreme lack of protein causes an osmotic imbalance, particularly in the gastro-intestinal system, leading to swelling of the gut that is diagnosed as edema (retention of water).
Classic symptoms include swelling of the ankles and feet as well as a distended abdomen.
Generally, the disease can be treated by adding protein to the diet; however, it can have a long-term impact on a child's physical and mental development.

Amino Acids
General structure: α-carbon connected to four groups: amino group, carboxylic group, hydrogen atom, and a substituent group (R group)
R = side chain (varies between different amino acids)
20 amino acids are found in living organisms.
The names of amino acids are often abbreviated to either a three-letter or a one-letter short form (e.g. Glycine, Gly, G).

Classifications of amino acids
Amino acids can be sorted into six main groups, on the basis of their structure and the general chemical characteristics of their R groups.
Knowing the class of an amino acid is useful for predicting the impact of a particular mutation (e.g. a mutation from Asp to Glu is likely to have less impact than a mutation from Gly to His).

Formation of polypeptides
Two amino acids can join to form a dipeptide. Polypeptides are chains of ≥3 amino acids.

Translation
Translation is the process whereby an mRNA molecule is decoded by a ribosome to produce a specific amino acid chain or polypeptide, which later folds into an active protein.
An mRNA sequence is decoded in sets of three nucleotides.

Translating an mRNA
Amino acids are added to the C-terminal end.
The amino acid to be added is determined by complementary base-pairing between the anticodon of the tRNA and the next codon on the mRNA chain.
The following cycle is repeated:
- A tRNA carrying the growing polypeptide sits in the P-site of the ribosome
- A new tRNA binds to the adjacent vacant A-site on the ribosome
- A peptide bond is formed between the existing polypeptide chain and the amino acid brought in by the new tRNA
- The polypeptide is transferred to the new tRNA and the old tRNA leaves the P-site
- The ribosome moves, so that the originally new tRNA is now at the P-site instead, leaving the A-site vacant

The Genetic Code
1. The code is a triplet code
2.
No gaps or overlaps between codons
3. The code is degenerate (≥1 codon per aa)
4. The 3rd base is often flexible (WOBBLE)
5. Some codons encode "stop"
6. The code is universal

Reading frames in protein translation
In principle, three reading frames are possible from any mRNA sequence. In reality, only one polypeptide chain will generally be produced.
A ribosomal frameshift allows alternative translation of an mRNA sequence by changing the open reading frame. This mechanism is commonly found in viruses, as it allows the virus to encode multiple types of proteins from the same mRNA.

Binding of a ribosome to mRNA
A ribosome binding site (RBS) is an mRNA sequence to which ribosomes can bind and initiate translation.
In prokaryotes, it is a region 6-8 nucleotides upstream of the AUG start codon called the Shine-Dalgarno sequence. The consensus sequence is AGGAGG; in E. coli, for example, the sequence is AGGAGGU.
In eukaryotes, there is no Shine-Dalgarno sequence. Instead, the ribosomes recognize the 5’ cap of mature mRNAs.
The 5’ cap consists of a guanine nucleotide connected to the mRNA via an unusual 5’-to-5’ triphosphate linkage. This guanosine is methylated on the 7 position directly after capping in vivo by a methyltransferase. It is referred to as a 7-methylguanylate cap, abbreviated m7G.
For viruses, ribosomes recognize a nucleotide sequence known as the internal ribosome entry site (IRES). IRESes allow translation in a cap-independent manner.

Post-translational modifications
Many proteins undergo post-translational modifications, which can alter their functions or regulate their activities.
Different moieties can be attached to the proteins, such as:
- acetate
- phosphate
- lipids
- carbohydrates
- other peptides
Glycosylation refers to the enzymatic process that attaches glycans to proteins (and also lipids). Protein glycosylation is an important research area in the biopharmaceutical industry.
Production of human therapeutics with incorrect glycosylation patterns can trigger undesirable side reactions or immune responses in patients.
Phosphorylation, a very common modification, is performed by a class of enzymes known as kinases, while the removal of the phosphate group is performed by phosphatases. Humans have ~500 distinct kinases.

Protein glycosylation
Glycosylation refers to the attachment of sugar moieties to proteins.
Protein glycosylation has multiple functions in the cell:
- The glycosylation pattern can serve to target the protein to a particular compartment.
- The sugars can also act as ligands for receptors on the cell surface to mediate cell attachment or stimulate signal transduction pathways.
- Because they can be very large and bulky, oligosaccharides can affect protein-protein interactions by either facilitating or preventing proteins from binding to cognate interaction domains.
Glycosylation is thought to be the most complex post-translational modification due to the large number of enzymes involved.
Glycosylated proteins (glycoproteins) are found in almost all living organisms that have been studied.

Diversity of glycosylation
Glycosylation increases the diversity of the proteome to a level unmatched by any other post-translational modification.
The cell is able to generate this diversity because almost every aspect of glycosylation can be modified, including:
- Glycosidic linkage: the site of glycan (oligosaccharide) binding
- Glycan composition: the types of sugars that are linked to a particular protein
- Glycan structure: branched or unbranched chains
- Glycan length: short- or long-chain oligosaccharides

Many cell signaling pathways rely on protein phosphorylation
Regulation by phosphorylation is one of the most common modes of regulating protein function.
The targeted protein can be in a phosphorylated form or a dephosphorylated form. One of these two is an active form, while the other one is an inactive form.
Protein kinases and phosphatases act independently, and the balance between them regulates the function of the targeted protein.
In bacteria, the phosphorylated residues are histidines, aspartates, and (to a smaller extent) tyrosines.
In eukaryotes, the phosphorylated residues are serines, threonines, or tyrosines.

Two component signal transduction systems for environmental sensing
A typical system consists of (at least) two proteins, a sensor histidine kinase and a response regulator.
The histidine kinase detects a specific environmental stimulus through its (often periplasmic) sensor domain. This leads to a conformational change, resulting in ATP-dependent autophosphorylation of an invariant His residue.
His~P serves as the phosphate donor for the receiver domain of the cognate response regulator, resulting in phosphorylation of a conserved Asp residue.
Frequently, the response regulator dimerizes after being phosphorylated. This leads to an activation of the effector domain of the response regulator, mediating the cellular response, usually through differential expression of specific target genes. In other words, response regulators are often transcription factors.
Reset of the system to the pre-stimulus state is achieved by dephosphorylation, either by the intrinsic phosphatase activity exhibited by the sensor kinase or by other phosphatases.

An example of a two component system
Activity of the FixL histidine kinase is inhibited by oxygen.
In the absence of oxygen, FixL autophosphorylates at the membrane and initiates the signaling cascade via transphosphorylation of the FixJ response regulator.
FixJ activates the transcription of fixK, and FixK then activates the expression of high affinity terminal oxidases.
A negative feedback loop exists in the network design: FixK turns on FixT, which acts as an inhibitor of FixL by mimicking a response regulator.
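The FixL/FixJ cascade described above can be written out as simple Boolean logic to make the order of events explicit. This is a deliberately crude sketch and not part of the original slides. It assumes each component is either fully on or fully off and ignores kinetics, and the function name `fix_pathway_state` and its parameters are hypothetical.

```python
def fix_pathway_state(oxygen_present: bool, fixt_present: bool = False) -> dict:
    """One pass through the FixL/FixJ cascade using on/off logic only."""
    # FixL autophosphorylates only when neither oxygen nor FixT inhibits it.
    fixl_phosphorylated = (not oxygen_present) and (not fixt_present)
    # FixL~P transfers its phosphate to the response regulator FixJ.
    fixj_phosphorylated = fixl_phosphorylated
    # FixJ~P activates transcription of fixK.
    fixk_expressed = fixj_phosphorylated
    return {
        "FixL~P": fixl_phosphorylated,
        "FixJ~P": fixj_phosphorylated,
        "FixK": fixk_expressed,
        # FixK switches on the terminal oxidases and the inhibitor FixT.
        "oxidases": fixk_expressed,
        "FixT induced": fixk_expressed,
    }


# With oxygen, the kinase is inhibited and the pathway stays off.
assert fix_pathway_state(oxygen_present=True)["oxidases"] is False
# Without oxygen, the cascade runs and also induces its own inhibitor FixT.
low_oxygen = fix_pathway_state(oxygen_present=False)
assert low_oxygen["oxidases"] and low_oxygen["FixT induced"]
# Once FixT is present, it shuts FixL down again (negative feedback).
assert fix_pathway_state(oxygen_present=False, fixt_present=True)["FixL~P"] is False
```

In a real cell the feedback dampens the response rather than switching it fully off; capturing that would need rate equations rather than Booleans.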
The Human Genome (Who We Are)

What is a genome?
A genome is a cell’s or an organism's complete set of hereditary information.
The genome is typically encoded in DNA, except for some viruses where the genome is encoded in RNA instead.
The genome includes all the genes and the non-coding sequences of the DNA.
Each genome contains all of the information needed to build and maintain that organism.
Characteristics of an organism’s genome include number of chromosomes, genome size, gene order, codon usage bias, GC-content, number of repetitive elements, etc.

Parts of a Genome
Structural Genes
DNA segments that code for some specific RNAs or proteins (e.g. mRNAs and tRNAs)
Functional Sequences
Regulatory elements, including promoters, operators, and insulators
Non-Functional Sequences
Introns and repetitive sequences. Used to be thought of as mostly “junk”, but evidence suggests that they might in fact be functional.

Timeline of key genome projects
1977 First DNA genome: Bacteriophage Φ-X174
1980 First mitochondrion genome
1982 First shotgun sequenced genome: Bacteriophage lambda
1995 First prokaryotic genome: Haemophilus influenzae
1996 First unicellular eukaryotic genome: Yeast
1998 First multicellular eukaryotic genome: Caenorhabditis elegans
2000 First insect genome: Drosophila melanogaster
2000 First plant genome: Arabidopsis thaliana
2001 Draft human genome published
2002 Draft mouse genome published
See Genome OnLine Database (https://gold.jgi.doe.gov/) for completed and ongoing genome sequencing projects.

Living organisms have a wide range of genome sizes

The Human Genome
The Human Genome Project is an international scientific research project with the goal of determining the sequence of chemical basepairs that make up human DNA, and of identifying all the genes (and other functional elements) in the human genome.
The project was declared complete in 2003.
An analogy to the human genome stored on DNA is that of instructions stored in a book:
- The book (genome) would contain 23 chapters (chromosomes);
- Each chapter contains 48 to 250 million letters (A,C,G,T) without spaces;
- Hence, the book contains over 3.2 billion letters in total;
- The book fits into a cell nucleus the size of a pinpoint;
- At least one copy of the book (all 23 chapters) is contained in most cells of our body. The only exception in humans is found in mature red blood cells, which become enucleated during development and therefore lack a genome.

Public vs. Private Approaches
Public:
- Project formally launched in 1990.
- World's largest collaborative biological project, which was performed in twenty universities and research centers in the United States, the United Kingdom, Japan, France, Germany, and China.
- Cost $3 billion.
- First draft announced in 2000 by Bill Clinton (U.S.) and Tony Blair (U.K.) and published in 2001.
Private:
- Launched in 1998.
- Funded by Craig Venter and his firm Celera Genomics.
- $300 million.
- Relied upon data made available by the publicly funded project.
Celera’s view of International Consortium:
Unfair competition: IC delivering the same goods but with state funding.
International Consortium’s view of Celera:
Unfair competition: Celera delivering the same goods but can use IC data, while IC cannot use Celera data.

What are transposons?
A transposon is a small piece of DNA that inserts itself into another place in the genome. It was first observed in maize.
Barbara McClintock was awarded a Nobel Prize in Physiology or Medicine in 1983 for her discovery of transposons.

Two subclasses of transposons
1) Retrotransposons are genetic elements that can amplify themselves in a genome and are ubiquitous components of the DNA of many eukaryotic organisms.
These DNA sequences use a "copy-and-paste" mechanism, whereby they are first transcribed into RNA, then converted back into identical DNA sequences using reverse transcription, and these sequences are then inserted into the genome at target sites.
Retrotransposons are particularly abundant in plants. For example, in maize, 49–78% of the genome is made up of retrotransposons.
2) DNA transposons move in the genome of an organism via a single- or double-stranded DNA intermediate (no RNA involvement).
The DNA transposons in the human genome today are no longer active and are thus called “fossils”.

Summary of the human genome
Retrotransposons: the human genome is full of transposon-based repetitive elements!
Much of the human genome is unexplored!

Types of repeats in the human genome

Satellite DNA
Satellite DNA consists of very large arrays of tandemly repeating, non-coding DNA.
Most satellite DNA is localized to the centromeric or telomeric region of the chromosome.
Satellite DNA is also a key constituent of heterochromatin.

Centromere
A region of DNA that helps to ensure that the replicated chromosomes are moved correctly into the two daughter cells.
It is easily recognized as the most constricted part of the mitotic chromosomes (indicated by white arrows).
Spindle fibers (“ropes”) are attached to the centromere via the kinetochore (a multiprotein complex) during cell division.

Telomere
Telomeres (labeled in red) are the caps at the end of each strand of DNA that protect our chromosomes, like the plastic tips at the end of shoelaces.
Telomeres get shorter each time a cell copies itself, but the important DNA stays intact (recall the Okazaki fragments).
Eventually, telomeres get too short to do their job, causing our cells to age and stop functioning properly. Therefore, telomeres act as the aging clock in every cell.
Pseudogenes
Pseudogenes are genomic DNA sequences that are similar to normal genes but are considered to be (generally) non-functional; they are regarded as defunct relatives of functional genes.
There are about 12,000 pseudogenes in the human genome!

Some facts about the human genome
- Presently estimated gene number: 20,000
- Average gene size: 27 kb
- The largest gene: Dystrophin, 2.4 Mb, 0.6% coding, 16 hours to transcribe
- The shortest gene: tRNA(Tyr), 100% coding
- Largest exon: ApoB exon 26 is 7.6 kb
- Smallest exon:

... 200 nm depending on the concentration of agarose used
Pore sizes of agarose gels are much larger than those of polyacrylamide gels.
On standing, agarose gels are prone to syneresis (extrusion of water through the gel surface).
The melting and gelling temperature of agarose can be modified by chemical modifications, most commonly by hydroxyethylation. With low melt agarose, one can make gels of even higher concentrations.

An overview of agarose gel electrophoresis
Gel appears translucent after it solidifies.
Remove comb before using gel.
Be careful of orientation when plugging in the power supply! (Black-to-black, red-to-red) (DNA is negatively charged)

Casting and running an agarose gel
With a greater gel concentration, porosity and average pore size decrease. 1% = 1g agarose in 100ml buffer. (Low-concentration gels (0.1–0.2%) are fragile and hard to handle.)
When the agarose has dissolved on heating, add a reagent that allows you to readily visualize the DNA, swirl, and then pour into a casting mold with a comb. Allow the agarose gel to solidify.
Mix each DNA sample with DNA loading dye. The DNA loading dye contains glycerol, which weighs the sample down to the bottom of the well, as well as two different dyes (typically bromophenol blue and xylene cyanol) for visual tracking of DNA migration during electrophoresis.
A different DNA sample is loaded in each well. A known DNA ladder is also loaded.
You will see bubbles moving inside the gel box due to the current passing through. Shorter DNA strands will move along the gel faster than the longer ones.

DNA Visualization
Ethidium Bromide (EtBr)
Ethidium bromide is an intercalating agent commonly used as a fluorescent tag (nucleic acid stain) in molecular biology laboratories for techniques such as agarose gel electrophoresis.
Absorption maxima of EtBr in aqueous solution are at 210 nm and 285 nm, which correspond to ultraviolet (UV) light. Hence, to visualize DNA using EtBr stain, we need to expose the gel to UV light.
Upon exposure to UV light, EtBr fluoresces with an orange colour, intensifying almost 20-fold after binding to the DNA.
EtBr intercalates between the bases of DNA. By moving into the hydrophobic environment found between the base pairs, the ethidium cation is forced to shed any water molecules that are associated with it. As water is a highly efficient fluorescence quencher, removal of these water molecules allows the ethidium to fluoresce.
EtBr is a mutagen. Safer alternatives (such as SYBR Safe or GelRed), which are cell impermeable, have been developed.

Types of Buffer
TAE Buffer
- Contains (i) Tris buffer (ii) Acetic acid (iii) EDTA
- Most commonly used for routine DNA agarose gel electrophoresis
TBE Buffer
- Contains (i) Tris buffer (ii) Boric acid (iii) EDTA
- Also commonly used
SB Buffer
- Contains (i) Sodium borate (ii) Boric acid
- Has low conductivity and allows for less heat buildup and thus higher voltage and faster runs

Joule Heating
Joule heating, also known as ohmic heating and resistive heating, is the process by which the passage of a current through the conductive buffer releases heat.
The amount of heat released is given by: $Q = I^2 R t$, where $I$ is the current, $R$ is the resistance of the buffer and gel, and $t$ is the run time.
When the charged particles making up the current collide with ions in the buffer, the particles are scattered and so their motion becomes random and therefore thermal, increasing the temperature of the system.
Due to heat transfer, a temperature gradient is formed across the gel cross-section, resulting in band broadening and loss of separation resolution. There are several ways to minimize Joule heating: - apply a low electric field - decrease the conductivity of the buffer - improve the dissipation of heat by using a thin gel - run experiment in a temperature controlled environment 33 Video: agarose gel electrophoresis 34 Pulsed-field gel electrophoresis Developed in 1983 at Columbia University (USA) Pulsed-field gel electrophoresis (PFGE) is a technique used for the separation of large DNA molecules by applying to a gel matrix an electric field that periodically changes direction. Small DNA fragments can find their way through the gel matrix more easily than larger DNA fragments. Additionally, above a threshold length of about 30-50kb, all large fragments will run at the same rate and appear in the gel as a single diffuse band. However, with periodic changing of field direction, various lengths of DNA react to the change at different rates. Hence, over the course of time with consistent changing of electric field directions, the DNA fragments will begin to separate more and more even for the very large ones. This procedure takes longer than normal gel electrophoresis due to the size of the fragments being resolved and the fact that the DNA does not move in a straight line through the gel. 35 Varying electric fields The pulse times are equal for each direction, resulting in a net forward migration of the DNA. 36 37 What is SDS-PAGE? The separation principle of SDS-PAGE is based solely on the difference in protein size (molecular weight). Proteins are denatured in the presence of the anionic detergent SDS (sodium dodecyl sulfate) with binding ratio of 1.4g SDS to 1g protein. (The detergent solubilizes or “dissolves” hydrophobic molecules.) 
Since each protein chain will be coated by many SDS molecules, the large negative charges of SDS will mask the intrinsic charges of the protein. Hence, separation depends entirely on the molecular sieving effect of the gel. The larger the molecular weight, the slower the protein migrates. 38 Effect of SDS on proteins A protein sample is boiled in SDS and β-mercaptoethanol. The β-mercaptoethanol reduces disulfide bonds. The end result has two important features: (1) All proteins contain only primary structure (are linear) (2) All proteins have a large negative charge, which mean they will all migrate towards the positive pole when placed in an electric field. 39 How to prepare a gel for SDS-PAGE Demonstration: https://www.youtube.com/watch?v=pnBZeL8nFEo In the reaction mix, there are: (1) Acrylamide (gel matrix) (2) Bisacrylamide (cross-linker) (3) SDS (maintains denaturation and negative charge of proteins) (4) Ammonium persulfate (5)TEMED (Tetramethylethylenediamine) TEMED, which is always added last, catalyses the decomposition of the persulfate ion to give a free radical, which is required to initiate the polymerization of acrylamide into chains (with bisacrylamide linking the chains together into a network). 40 Acrylamide Acrylamide is a potent neurotoxin, so handle with care! Porosity is controlled by the proportions of acrylamide and bisacrylamide. 41 Why are there two layers of gel in SDS-PAGE? Sample well Stacking gel Resolving or separating gel In order to accurately size fractionate, the proteins all have to be at the starting line at the same time. The stacking gel has large sized pores that allow the proteins to migrate freely and be lined up in a row at the interface of the two layers (we want all the proteins to start migrating from the same level). In the stacking gel, the migrating proteins are sandwiched tightly between the negatively charged glycinate ions (from the buffer) and chloride ions (in the gel), which are migrating towards the anode. 
This causes the proteins to align in a sharp band. The resolving gel is the actual track where proteins run according to their molecular weight.

Visualization of proteins
Once protein bands have been separated by electrophoresis, they can be visualized using different methods of in-gel detection. Demands for improved sensitivity for small sample sizes and compatibility with downstream applications have driven the development of visualization methods. Typically, the gel is suspended in a tray filled with the necessary reagents. Most protein staining methods involve the same general incubation steps:
(1) An initial wash (e.g. with water) to remove electrophoresis buffers and residual SDS from the gel matrix
(2) An acid or alcohol wash to condition or fix the gel, limiting diffusion of protein bands from the matrix
(3) Treatment with the stain reagent to allow the dye or chemical to diffuse into the gel and bind to (or react with) the proteins
(4) Washing to remove excess dye from the background gel matrix

Coomassie Blue
Coomassie blue dyes are a family of dyes commonly used to stain proteins in SDS-PAGE gels. The gels are soaked in dye, and excess stain is then eluted with a solvent. This treatment allows the visualization of proteins as blue bands on a clear background. In acidic buffer conditions, Coomassie dye binds to basic and hydrophobic residues of proteins. Coomassie dye reagents detect some proteins better than others due to differences in protein composition. Thus, they can detect as little as 8-10 nanograms of some proteins, but more typically 25-100 nanograms for most proteins. Coomassie staining does not permanently chemically modify the target proteins. Because no chemical modification occurs, excised protein bands can be completely destained and the proteins recovered for analysis by mass spectrometry or sequencing.

Silver staining
Silver staining is the most sensitive colorimetric method for detecting total protein.
Silver ions (from silver nitrate in the stain reagent) interact and bind with certain protein functional groups. Strongest interactions occur with carboxylic acid groups (Asp and Glu), imidazole (His), sulfhydryls (Cys), and amines (Lys). The bound silver ions are reduced to metallic silver, resulting in brown-black color. The development process is essentially the same as for photographic film. Silver staining can detect less than 0.5 nanograms of protein in typical gels. In some protocols, glutaraldehyde or formaldehyde is used (as a reducing agent). These aldehydes can cause chemical crosslinking of the proteins in the gel matrix, limiting compatibility with downstream analysis by mass spectrometry. 45 Amino acids are zwitterionic Isoelectric point (pI) is the pH where the amino acid exhibits no net charge. Basic proteins have a higher pI, while acidic proteins have a lower pI. At the isoelectric point, the amino acid remains stationary under an applied electric field. Isoelectric focusing Isoelectric point (pI) is the pH value, where the overall protein charge equals to zero. This can be obtained by creating a pH gradient in the gel where protein is loaded and applying an electric current. Proteins migrate towards the cathode or anode according to their total charge up to the point where the gel pH equals pI of a given protein. Isoelectric focusing has a high resolution. Bands as narrow as 0.001 pH units can be obtained. 47 How to obtain a gel with a pH gradient? Isoelectric focusing has been successfully used in research, clinical, and agricultural fields to separate proteins in (for example) blood, muscle extracts, and seed extracts. In isoelectric focusing, a stable pH gradient with constant conductivity is very important. This can be achieved by: (1) Carrier ampholytes (2) Immobilized pH gradients 48 What are carrier ampholytes? An amphoteric compound is a molecule or ion that can react both as an acid as well as a base. 
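To make the idea of an isoelectric point concrete, the sketch below treats the simplest amphoteric case, an amino acid with one acidic and one basic group, and finds the pH at which its average net charge is zero. It is a minimal illustration under stated assumptions: the pKa values (2.34 and 9.60, standard textbook values for glycine) and all function names are not taken from the lecture, and each group is assumed to follow the Henderson-Hasselbalch equation independently.

```python
GLY_PKA_COOH = 2.34  # assumed textbook pKa of the glycine carboxyl group
GLY_PKA_NH3 = 9.60   # assumed textbook pKa of the glycine amino group


def net_charge(ph: float, pka_acid: float = GLY_PKA_COOH,
               pka_base: float = GLY_PKA_NH3) -> float:
    """Average net charge of a molecule with one acidic and one basic group."""
    # Fraction of carboxyl groups deprotonated (each contributes -1).
    frac_negative = 1.0 / (1.0 + 10 ** (pka_acid - ph))
    # Fraction of amino groups protonated (each contributes +1).
    frac_positive = 1.0 / (1.0 + 10 ** (ph - pka_base))
    return frac_positive - frac_negative


def isoelectric_point(pka_acid: float = GLY_PKA_COOH,
                      pka_base: float = GLY_PKA_NH3,
                      tol: float = 1e-6) -> float:
    """Find the pH of zero net charge by bisection between pH 0 and 14."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(mid, pka_acid, pka_base) > 0:
            low = mid   # still positive: the pI lies at a higher pH
        else:
            high = mid  # negative or zero: the pI lies at a lower pH
    return (low + high) / 2


for ph in (2.0, 6.0, 11.0):
    print(f"pH {ph:4.1f}: net charge {net_charge(ph):+.3f}")
print(f"estimated pI of glycine: {isoelectric_point():.2f}")
```

For this two-group case the result coincides with the average of the two pKa values, (2.34 + 9.60) / 2 = 5.97. Below the pI the molecule carries a net positive charge and moves towards the cathode; above it, the net charge is negative and it moves towards the anode.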
Carrier ampholytes are amphoteric molecules that contain both acidic and basic groups and exist mostly as zwitterions with a high buffering capacity near their pI. Commercial carrier ampholyte mixtures comprise hundreds of individual polymeric species with pIs spanning a specific pH range. When a voltage is applied across a carrier ampholyte mixture, the carrier ampholyte with the lowest pI (and the most negative charge) migrates towards the anode (+). The carrier ampholyte with the highest pI (and the most positive charge) migrates towards the cathode (-). The other carrier ampholytes align themselves between the extremes, according to their pIs, and buffer their environment to the corresponding pH.

Illustration of carrier ampholytes
Each ampholyte reaches an equilibrium position along the separation medium. A pH gradient with increasing pH over the gel length is formed by a mixture of hundreds of ampholytes, each with a different pI. The concentration of each ampholyte is the same to ensure a homogeneous conductivity.

Limitations of carrier ampholytes
Since the carrier ampholyte-generated gradient depends on an electric field, it breaks down when the field is removed. The pH gradient is susceptible to drift towards the ends of the gel, especially towards the cathode (-). Over time, this leads to a plateau in the middle of the gradient with gaps in the conductivity. Hence, we need to restrict the focusing time of the experiment. There can be significant batch-to-batch and company-to-company variations in the properties of carrier ampholytes, which limits the reproducibility of focusing experiments. Carrier ampholytes have a tendency to bind to the sample proteins, which may alter the migration of the proteins.

Immobilized pH gradients (IPG)
An immobiline is a weak acid or base that can be incorporated into an acrylamide gel matrix for isoelectric focusing.
Immobilines are not zwitterionic; they are either acidic or basic. (In the immobiline structure, A is a weakly acidic or basic buffering group.) Gels with immobilized pH gradients (IPG) are made using a cassette system and a gradient maker to mix two kinds of acrylamide solutions, one containing an immobiline with acidic buffering properties and the other an immobiline with basic buffering properties. The immobilines co-polymerize with the acrylamide gel matrix. The pH depends on the ratio of the two immobilines. The gels can be dried and stored. To use them, a simple rehydration step is needed.

Advantages of IPG
The protein sample can be applied immediately (no prefocusing is needed). The pH gradient is stable and does not drift in an electric field because the buffers that form the pH gradient are immobilized within the gel matrix. Isoelectric focusing experiments using IPG are more reproducible. The resolution possible with immobilized pH gradient gels is 10-100 times greater than that obtained with carrier ampholytes.

What is 2D Gel Electrophoresis?
Two modes of electrophoresis are combined in a single gel. Usually, proteins are separated by isoelectric focusing in the first dimension, based on pI. This is followed by SDS-PAGE in a perpendicular direction, based on size. Mixtures of thousands of proteins can be separated with high resolution. The result can be compared to electronic databases.

Application of 2D Gel Electrophoresis
The resulting "maps" of proteins can be compared, for example, between experimental and control samples, or between samples from patients with a specific disease and their healthy controls. In this way, differentially produced proteins can be identified and linked with the pathogenesis of the studied disease.

Capillary Electrophoresis
Capillary electrophoresis is a separation method carried out in a buffer-filled capillary tube. The tube extends between two buffer reservoirs. The sample is introduced into one end of the tube.
A dc potential is applied between the two electrodes throughout the separation. Charged analytes migrate at different rates in the presence of the electric field. The separated analytes are observed by a detector at the opposite end of the capillary. 56 Sample Injection - Hydrodynamic Hydrodynamic injection can be achieved by using pressure to force the sample into the capillary. Alternatively, it can be achieved by natural gravity flow. In gravity flow injection (also known as siphoning injection), the inlet end of the capillary is raised so that the liquid level in the sample vial is at a height h above the level of the cathodic buffer, and is held in this position for a fixed time t. 57 Sample Injection - Electrokinetic Electrokinetic injection involves drawing sample ions into the capillary interior with an applied potential. A high voltage is applied over the capillary between the sample vial and the destination vial for a given time. This causes the sample to move into the capillary according to its apparent mobility, µ. 58 DNA Capillary Electrophoresis DNA sequencers separate strands by size (or length) using capillary electrophoresis. The fluorescently labeled products of the cycle sequencing reaction are injected electrokinetically into capillaries filled with polymer. High voltage is applied so that the negatively charged DNA fragments move through the polymer in the capillaries towards the positive electrode. In capillary sequencing of DNA, the sieving polymer (typically polydimethylacrylamide) suppresses electroosmotic flow to very low levels. 59 Micellar Electrokinetic Chromatography (MEKC) Micellar electrokinetic chromatography (MEKC) can be thought of as a hybrid of electrophoresis and chromatography. The solution contains a surfactant (most commonly SDS) at a concentration that is greater than the critical micelle concentration. MEKC is performed under alkaline conditions to generate a strong electroosmotic flow. 
Since SDS is negatively charged, the micelles have an electrophoretic mobility that is counter to the electroosmotic flow. Hence, the micelles migrate quite slowly, though their net movement is still toward the cathode.

Migration of analytes in MEKC
During a MEKC separation, analytes distribute themselves between the hydrophobic interior of the micelle and the hydrophilic buffer solution. The analyte migration velocity depends on the partition coefficient between the micelle and the aqueous buffer:
(1) Uncharged analytes that are insoluble in the interior of micelles should migrate at the electroosmotic flow velocity.
(2) Analytes that solubilize completely within the micelles (analytes that are highly hydrophobic) should migrate at the micelle velocity.
(3) For electrically charged solutes, the migration velocity depends not only on the partition coefficient and the electroosmotic flow, but also on the electrophoretic mobility, µep, of the solute in the absence of the micelle.

Lecture 3: Optical Spectroscopy & Molecular Recognition

Today’s Outline
1) Optical Spectroscopy
- Absorption of UV and visible light
- Fluorescence spectroscopy
2) Molecular Recognition
- Modes of detection
- Recognition of nucleic acids
- Recognition of proteins
- Biosensors

Light – an introduction
Light is electromagnetic radiation that can be described either as a wave with a particular wavelength or as particles (photons) with a particular energy. (This concept is known as wave-particle duality in quantum mechanics.) The wavelength or its corresponding photon energy is the most important property of light.
In bioanalysis, optical methods (such as absorption spectroscopy and fluorescence spectroscopy) are typically applied at UV and visible wavelengths (200–800 nm). In a vacuum, light travels in a straight line at a speed of $3 \times 10^{8}$ m s$^{-1}$. However, light can interact with matter. When a material is exposed to light, we can observe interactions such as transmission, absorption, absorption and re-emission, scattering, reflection, refraction, interference, and diffraction. Detecting the light after these interactions reveals information about the sample or the properties of its surface.

Light absorption
Absorption of light results in the transition of a molecule from the ground state to an excited state. The absorption reduces the light intensity from $I_0$ (its initial value) to $I$ (the intensity of the transmitted light).

Absorption spectrum
The absorbance and transmittance of light depend on the wavelength. Therefore, the wavelength should be given together with the measured values. A scan over wavelengths is called an absorption spectrum. Every substance has a typical absorption spectrum, with peak absorbance at particular wavelengths. Absorption measurements at different wavelengths can be used to assess the purity of a solution, e.g. of DNA. To determine the protein contamination in a DNA solution, the absorbance at 260 nm is compared to the absorbance at 280 nm. DNA absorbs most strongly at 260 nm, while the aromatic amino acids tyrosine and tryptophan absorb at longer wavelengths than DNA, namely at 274 nm and 280 nm respectively. The ratio A260/A280 equals 1.8 for a pure DNA solution; smaller values indicate the presence of proteins. For a pure RNA solution, one can expect A260/A280 = 2.
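The purity check described above amounts to a single ratio and a comparison. The sketch below shows one way to write it down; the function names and the tolerance of 0.1 around the expected ratio are illustrative assumptions rather than values from the lecture, and the readings in the example are made up.

```python
EXPECTED_RATIO = {"DNA": 1.8, "RNA": 2.0}  # pure-sample values quoted in the notes


def a260_a280_ratio(a260: float, a280: float) -> float:
    """Ratio of the absorbances measured at 260 nm and 280 nm."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def describe_purity(ratio: float, sample_type: str = "DNA",
                    tolerance: float = 0.1) -> str:
    """Flag ratios clearly below the expected value as possible protein contamination.

    The tolerance is an arbitrary illustrative threshold (assumption).
    """
    expected = EXPECTED_RATIO[sample_type]
    if ratio < expected - tolerance:
        return "below expected ratio: possible protein contamination"
    return "consistent with the expected ratio"


# Hypothetical readings for two DNA preparations.
for a260, a280 in ((0.90, 0.50), (0.90, 0.65)):
    ratio = a260_a280_ratio(a260, a280)
    print(f"A260/A280 = {ratio:.2f}: {describe_purity(ratio)}")
```

Only low ratios are flagged here because the notes link smaller values to protein; the sketch makes no claim about what an unusually high ratio means.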
Beer-Lambert law
The Beer-Lambert law relates the attenuation of light to the properties of the material through which the light is travelling:
$$A = \log_{10}\left(\frac{I_0}{I}\right) = \varepsilon \, c \, l$$
where $A$ is the absorbance, $I_0$ and $I$ are the incident and transmitted light intensities, $c$ is the concentration, and $l$ is the path length. The extinction coefficient (also called absorptivity or proportionality constant), $\varepsilon$, is specific to a molecular species at a given wavelength. It is often given in L mol$^{-1}$ cm$^{-1}$ (note that 1 M = 1 mol L$^{-1}$), but can also be given in L g$^{-1}$ cm$^{-1}$.

Measuring nucleic acid concentrations
The linear relation between concentration and absorbance described by the Beer–Lambert law is frequently used to determine the concentration of a chemical species. To obtain good sensitivity, these measurements are preferably done at the maximum absorption wavelength of the species. For example, DNA concentration is estimated by measuring the absorbance at 260 nm, using an extinction coefficient of 0.020 L mg$^{-1}$ cm$^{-1}$ for double-stranded DNA and 0.027 L mg$^{-1}$ cm$^{-1}$ for single-stranded DNA. A double-stranded DNA sample with A260 = 0.50 in a 1 cm cuvette therefore has a concentration of 0.50 / (0.020 × 1) = 25 mg L$^{-1}$ (25 µg mL$^{-1}$). Often, the absorbance of a sample like DNA is measured using a cuvette. Here, a portion of the initial light intensity is lost due to reflection and scattering of the light at the cuvette. In particular, these losses occur at the boundaries between the different materials (air–cuvette and cuvette–solvent). In addition, the solvent may absorb light or may contain absorbing contaminants or scattering particles. These losses are accounted for by measuring both the absorbance of the sample solution and the absorbance of the sample-free solvent (the “blank”). We then subtract the blank measurement from the sample measurement.

NanoDrop
A NanoDrop is an increasingly common lab spectrophotometer (from Thermo Fisher) that can measure DNA, RNA, and protein concentrations with only ~1 µL of sample. Unlike most traditional UV-Vis spectrophotometers, the NanoDrop instrument does not use a cuvette.
Instead, it uses the surface tension of aqueous solutions to form a column of sample between two pedestals and directs the light through it. This enables the instrument to measure the concentrations of tiny volumes of sample (~1 µL; even the smallest 1 cm path length cuvette requires ~50 µL). Hence, the concentration of precious samples can be quantified. Just like traditional cuvette-based UV-Vis spectrophotometers, the NanoDrop works on the principle of the Beer-Lambert law.

What is luminescence?
Emission of light from a chemical species or a material is called luminescence. Photon emission as a result of a chemical or biochemical reaction is referred to as chemiluminescence or bioluminescence respectively. A prominent example is luciferase, a class of oxidative enzymes that produce bioluminescence. Many organisms regulate their light production using different luciferases in a variety of light-emitting reactions. The majority of studied luciferases have been found in animals, including fireflies and many marine animals such as jellyfish. Luciferases are widely used in biotechnology, for example as reporter genes. Although luciferases do not require an external light source, they require the addition of a consumable substrate (e.g. luciferin).

What is photoluminescence?
Photon emission after photon absorption is termed photoluminescence. There are two types of photoluminescence: fluorescence and phosphorescence. Fluorescence is the emission of light (within nanoseconds) by a substance that has absorbed light or other electromagnetic radiation. The emitted light usually has a longer wavelength than the absorbed radiation (Stokes shift).
Phosphorescence is a process in which energy absorbed by a substance is released relatively slowly in the form of light. This is because the electron that absorbed the photon undergoes an unusual “intersystem crossing” and gets trapped in a higher energy state with only rare, kinetically unfavored transitions available to return to its original lower energy state. Fluorescent materials cease to glow nearly immediately when the radiation source stops, unlike phosphorescent materials, which continue to emit light for some time after.

Examples of photoluminescence
A striking example of fluorescence occurs when the absorbed radiation is in the ultraviolet region of the spectrum, and thus invisible to the human eye, while the emitted light is in the visible region, which gives the fluorescent substance a distinct color that can be seen only when exposed to UV light. (Figure: fluorescent minerals emit visible light when exposed to ultraviolet light.) Everyday examples of phosphorescent materials are the glow-in-the-dark toys, stickers, paint, and clock dials that glow after being charged with a bright light such as a normal reading or room light. (Figure: glow-in-the-dark body paint.)

Jablonski Diagram
Professor Alexander Jablonski (1898-1980) was a Polish physicist who, in 1933, first illustrated the absorption and emission of light by fluorophores in his now famous diagram, which shows the activation from the ground state to an excited state and the emission of a photon on return to the ground state. There is no direct return to the ground state, as the fluorophore can pass through alternative energy states. After an electron absorbs a high-energy photon, the system is excited electronically and vibrationally. The system relaxes vibrationally, and eventually fluoresces at a longer wavelength.

What is a fluorochrome?
A fluorochrome is a chemical that fluoresces, especially one used as a label in biological research.
Each fluorochrome has unique and characteristic spectra for absorption (usually similar to excitation) and emission. These absorption and emission spectra show relative fluorescence intensities. For a given fluorochrome, the manufacturer indicates the wavelength of the peak excitation intensity and the wavelength of the peak fluorescence emission intensity.

Stokes Shift
When a fluorochrome absorbs a photon, its electrons are excited to a higher energy state. The electrons of the fluorochrome remain in the excited state for about $10^{-8}$ seconds before returning to the ground state, with the concomitant emission of another photon. This emitted photon usually has less energy than the absorbed photon for two reasons:
(1) Some of the energy is dissipated as heat.
(2) A fluorophore is surrounded by water molecules. Part of the excess energy of the excited vibrational mode can be transferred to the surrounding water molecules (in a process known as vibrational relaxation).
The energy difference between the absorbed photon and the emitted photon is known as the Stokes shift.

Why is there an increase in wavelength?
$$E = hf = \frac{hc}{\lambda}$$
where $h$ is the Planck constant ($6.63 \times 10^{-34}$ J s), $f$ is the frequency, $c$ is the speed of light in vacuum ($3.00 \times 10^{8}$ m s$^{-1}$), and $\lambda$ is the wavelength. When electrons drop from the excited state to the ground state, there is some loss of vibrational energy. From the Planck-Einstein equation, we can see that the photon energy varies inversely with wavelength. Hence, the emission spectrum is shifted to longer wavelengths than the excitation spectrum. The emission intensity peak is usually lower than the excitation peak. The emission curve is often a mirror image of the excitation curve, but shifted to longer wavelengths.

Maximising fluorescence
The greater the Stokes shift, the easier it is to separate excitation light from emission light.
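To put numbers on the Stokes shift, the sketch below converts excitation and emission peak wavelengths into photon energies using the Planck-Einstein relation E = hc/λ and reports the difference. The constants are the rounded values quoted in these notes, and the dye wavelengths are the approximate peak values given elsewhere in the notes for fluorescein, Cy3, and Cy5; the function names are illustrative assumptions.

```python
H = 6.63e-34  # Planck constant in J s (rounded value used in the notes)
C = 3.00e8    # speed of light in vacuum in m/s (rounded value used in the notes)


def photon_energy_j(wavelength_nm: float) -> float:
    """Photon energy in joules for a wavelength given in nanometres (E = hc/lambda)."""
    return H * C / (wavelength_nm * 1e-9)


def stokes_shift(excitation_nm: float, emission_nm: float) -> tuple[float, float]:
    """Return the Stokes shift as (wavelength increase in nm, energy loss in J)."""
    shift_nm = emission_nm - excitation_nm
    energy_loss_j = photon_energy_j(excitation_nm) - photon_energy_j(emission_nm)
    return shift_nm, energy_loss_j


# Approximate excitation/emission peaks (nm) as quoted in the notes.
dyes = {"fluorescein": (494, 512), "Cy3": (550, 570), "Cy5": (650, 670)}

for name, (ex_nm, em_nm) in dyes.items():
    shift_nm, loss_j = stokes_shift(ex_nm, em_nm)
    print(f"{name}: shift {shift_nm} nm, energy lost per photon {loss_j:.2e} J")
```

Because the energy is inversely proportional to the wavelength, the same shift in nanometres corresponds to a smaller energy loss for a red dye such as Cy5 than for a green dye such as fluorescein.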
Any spectral overlap must be eliminated, in fluorescence microscopy, by means of the appropriate selection of an excitation filter. Otherwise, the much brighter excitation light will overwhelm the weaker emitted fluorescence light, significantly diminishing specimen contrast. In order to achieve maximum fluorescence intensity, the fluorochrome is usually excited at the wavelength at the peak of the excitation curve, and the emission detection is selected at the peak wavelength of the emission curve. The selections of excitation wavelengths and emission wavelengths are controlled by appropriate filters.

Some excitation light sources
Xenon arc lamp
- Produces light by passing electricity through ionized xenon gas at high pressure.
- Produces a bright white light that closely mimics natural sunlight.
- Used in movie projectors in theaters, in searchlights, and for specialized uses in industry and research to simulate sunlight.
Mercury vapor lamp
- Uses an electric arc through vaporized mercury to produce light.
- Longer bulb lifetime than incandescent light bulbs (around 24,000 hours).
- The mercury in the lamp is a liquid at room temperature. It takes 4-7 minutes to heat up and become ionized (so mercury vapor lamps are considered slow-starting).
Metal halide lamp
- Produces light by an electric arc through a gaseous mixture of vaporized mercury and metal halides (usually compounds of metals with bromine or iodine).
- Similar to mercury vapor lamps, but contains additional metal halide compounds in the quartz arc tube, which improve the efficiency and color rendition of the light.
High power light-emitting diode (LED)
- Lower energy consumption and much longer lifetime than a mercury lamp or metal halide lamp. (Currently very popular in fluorescence microscopes.)
Lasers - Lasers are most widely used for more complex fluorescence microscopy techniques like confocal microscopy and total internal reflection fluorescence microscopy, while xenon lamps, mercury lamps, and LEDs are commonly used for widefield epifluorescence microscopes. 21 Semiconductors An (intrinsic) semiconductor like silicon has low electrical conductivity at room temperature. Doping (by adding a small amount of impurity to silicon) can improve its electrical conductivity. The new semiconductor formed is called an extrinsic semiconductor. There are two types of extrinsic semiconductors: (1) A p-type semiconductor is formed by doping silicon with a trivalent (number of valence electrons=3) element like indium, boron or aluminium (2) An n-type semiconductor is formed by doping silicon with a pentavalent (number of valence electrons=5) element like arsenic or antimony. Since silicon is tetravalent (number of valence electrons=4), an n-type semiconductor will have an excess of electrons or negative charge carriers (surplus of electrons that can be donated to other elements), whereas a p-type semiconductor will have a surplus of holes or positive charge carriers. 22 A p-n junction To make a p-n junction, we dope a wafer of silicon with a trivalent impurity on one side and a pentavalent impurity on the other side. Three phenomena occur at a p-n junction: (1) Diffusion: A p-type semiconductor can accept electrons from an n-type semiconductor. When an electron leaves the n-side region, it leaves behind an ionised donor (a positive charge) at the n-side. Similarly when a hole is diffused to n-side, it leaves behind an ionised acceptor (a negative charge) at the p-side. This movement of electrons from n-side to p-side and the movement of holes from p-side to n-side is called diffusion. 
(2) Formation of space charge: As more and more electrons leave the n-region and more and more holes leave the p-region, positive charges accumulate near the n-side of the junction and negative charges accumulate near the p-side of the junction, giving rise to a depletion region.
(3) Drift: An electric field directed from the positive charge to the negative charge is formed in the depletion region. This electric field causes electrons to move from the p-side to the n-side and holes to move from the n-side to the p-side. This motion of charge carriers due to the electric field is known as drift. The drift current is opposite in direction to the diffusion current. At equilibrium, the diffusion current is exactly equal and opposite to the drift current.

What is a diode?
A diode is a piece of semiconductor material with a p–n junction connected to two electrical terminals. It is a circuit element that allows a flow of electricity in one direction but not in the opposite direction. Bias is the application of a voltage across a p–n junction; forward bias is in the direction of easy current flow, and reverse bias is in the direction of little or no current flow.
Forward bias: The p-type region is connected to the positive terminal of the power supply and the n-type region is connected to the negative terminal. The holes in the p-type region and the electrons in the n-type region are pushed toward the junction and start to neutralize the depletion zone, reducing its width.
Reverse bias: The p-type region is connected to the negative terminal of the power supply, causing the holes in the p-side to be pulled away from the junction and widening the depletion region. Likewise, because the n-type region is connected to the positive terminal, the electrons are also pulled away from the junction, with similar effect. This increases the voltage barrier, causing a high resistance to the flow of charge carriers and thus allowing minimal electric current to cross the p–n junction.
Illustration of bias (figure panels: forward bias; reverse bias)

Detecting fluorescence emission: photodiode
- A photodiode is a semiconductor device that converts light into current.
- The common, traditional solar cell used to generate electric solar power is a large-area photodiode.
- A photodiode is basically a p–n junction (or a variant of it).
- It is usually operated with no bias or reverse bias.
- When a photon of sufficient energy strikes the diode, it creates an electron-hole pair.
- If the absorption occurs in the junction's depletion region, these carriers are immediately separated by the built-in electric field of the depletion region. Holes move toward the anode (p-type) and electrons toward the cathode (n-type), and a photocurrent is produced.

Detecting fluorescence emission: photomultiplier tube (PMT)
- A PMT contains a photocathode, several dynodes, and an anode.
- When the photocathode is struck by a photon, the absorbed energy causes an electron to be emitted (photoelectric effect).
- The electrons emitted from the cathode are accelerated toward the first dynode, which is maintained 90 to 100 V positive with respect to the cathode. Each accelerated photoelectron that strikes the dynode surface produces several electrons. These electrons are then accelerated toward the second dynode, held 90 to 100 V more positive than the first dynode, and each electron that strikes the surface of the second dynode produces several more electrons, which are then accelerated toward the third dynode, and so on.
- The current produced by incident light is multiplied by as much as 100 million times due to the secondary emissions from the dynodes.

Fluorescein
Fluorescein is a common dye for labelling DNA and protein. It was first synthesized in 1871. Fluorescein has an absorption maximum at 494 nm and an emission maximum at 512 nm (in water).

Cyanine
Cyanines are a synthetic dye family belonging to the polymethine group.
(Polymethines are compounds made up of an odd number of methine groups (CH) bound together by alternating single and double bonds.) Cyanines have many uses as fluorescent dyes, particularly in biomedical imaging. Depending on the structure, they cover the spectrum from infrared (IR) to ultraviolet (UV). Cy3 and Cy5 are the most popular. Cy3 fluoresces greenish yellow (~550 nm excitation, ~570 nm emission), while Cy5 is fluorescent in the red region (~650 nm excitation, ~670 nm emission). 29 Some terminology The quantum yield of a molecule is defined as the ratio of emitted photons to absorbed photons. It is usually below 1, because the relaxation from the first excited state to the ground state can occur via non-radiative processes. A good fluorophore is a molecule that has a high quantum yield, for example fluorescein with a quantum yield of 0.93 (in 0.1 M NaOH). Fluorescence intensity is the number of detected photons per unit time, collected from a given sample. The spectrum of the fluorophore shows the wavelength-resolved fluorescence intensity. The spectral region and the shape of the fluorescence are characteristic of a fluorophore. Fluorescence lifetime (τ) is the time required for the molecule to return from the excited state to the ground state. This characteristic decay time is very sensitive to the environment, e.g. the solvent or the presence of quenchers. Note that the relaxation of a molecule is a random process and hence the lifetime refers to a statistical value (some fluorophores emit earlier and others later than τ). For fluorescence lifetime measurements, a pulsed excitation light source is required and the time between excitation and photon arrival at the detector is determined. This process occurs within nanoseconds and the detector must be able to measure in this fast time range. 30 Bleaching and saturation The more photons are detected from a sample, the better the signal and hence the sensitivity of the measurement.
In principle, we can measure with stronger excitation light intensities to obtain a higher fluorescence signal. However, there are limitations: (i) Photo bleaching. With continuous excitation, the fluorescence signal will decrease over time. The reason for this is a potential photochemical reaction of the fluorophore when it is in the excited state. The reaction destroys the fluorophore irreversibly. Photostable dyes are able to undergo more excitation-emission cycles before bleaching. The higher the light intensities, the more probable the reaction will occur and thus the faster the bleaching will be. Therefore, there is a compromise between increasing the excitation light intensity (getting more signal) and reducing it to minimize photo bleaching. The phenomenon of photo bleaching has to be taken into consideration for data acquisition and analysis. (ii) Photo saturation. At very high excitation light intensities it may happen in rare cases that all dyes are already in the excited state or in the “triplet state”. As a result, the incoming light is not absorbed anymore. 31 Quenching The fluorescence of a solution can be reduced or annihilated by a molecular process called quenching, of which there are two main types: dynamic quenching and static quenching. (i) Dynamic quenching occurs when the fluorophore is in the excited state. Collisions with other species (ions, molecules) in solution result in relaxation of the fluorophore without photon emission, i.e. the energy is transferred to the so-called quencher. Neither the fluorophore nor the quencher is chemically altered. The reduction in fluorescence depends on the probability of collisions and is therefore a function of the quencher concentration. (ii) Static quenching. A quencher can also form a non-fluorescent complex with the fluorophore in the ground state. 
Here, the reduction of the fluorescence intensity depends on the binding constant of the quencher to the fluorophore. [Figure: static quenching vs. dynamic quenching of a fluorophore by a quencher] 32 Fluorescence resonance energy transfer (FRET) Fluorescence (or Förster) resonance energy transfer (FRET) is the transfer of energy from a fluorophore in the excited state to a fluorescent or non-fluorescent acceptor molecule. This acceptor molecule is typically in close proximity to the donor fluorophore, e.g. bound to the same molecule such as DNA. FRET occurs efficiently when the donor’s emission spectrum strongly overlaps with the acceptor’s absorption spectrum. The process does not involve the emission of a photon by the donor and reabsorption by the acceptor. Instead, the energy is transferred via dipole–dipole interactions. The orientation of the dipoles of the donor and acceptor molecules therefore determines the efficiency of the energy transfer. Since the energy transfer also strongly depends on the distance between the donor and acceptor molecules, FRET measurements can be used as a distance indicator. 33 Some FRET equations The rate of energy transfer ($k_{et}$) is given by the Förster equation: $k_{et} = \frac{1}{\tau_d}\left(\frac{R_0}{R}\right)^6$ where $\tau_d$ is the fluorescence lifetime of the donor in the absence of the acceptor molecule, $R$ is the distance between the donor and acceptor molecules, and $R_0$ is the Förster distance, at which the fluorescence energy transfer is 50 % efficient. The efficiency of the energy transfer ($E$) is the fraction of photon energy that is absorbed by the donor and transferred to the acceptor. It can be expressed as: $E = \frac{R_0^6}{R_0^6 + R^6}$ In FRET experiments, the donor is excited at a wavelength at which the acceptor absorbs only weakly or not at all. 34 Green fluorescent protein (GFP) & its mutants GFP was found in the jellyfish Aequorea victoria in 1961 by Osamu Shimomura. It has been genetically modified since to improve photostability and quantum yield. The chromophoric group of GFP is formed autocatalytically during protein folding.
The chromophore is surrounded by the protein backbone forming a β-barrel structure. The importance of GFP lies in the possibility of it attaching to other proteins in the cell. This allows visualisation of target proteins in living cells and organisms, e.g. to observe protein movements or accumulations in specific parts of the cell or organism. Developed mutants of GFP are fluorescent in other wavelength regions, e.g. the yellow fluorescent protein, YFP, or the cyan fluorescent protein, CFP. 35 Application of GFP & its variants The major application of fluorescence spectroscopy in biology is the visualisation and tracking of tagged molecules in complex environments such as cells, smaller organisms and tissue. By means of fluorescence microscopy, it is possible to observe distinct parts of the cells, e.g. the nucleus, the cytosol, the cell membrane and organelles. One can see the presence of target molecules in distinct areas and can visualise changes or movements of these molecules. 36 Limitation of conventional fluorescence microscopes Bright-field or wide-field fluorescence microscopes are commonly used to obtain images of individual cells (e.g. bacterial cells or mammalian cells). Individual cells are thin. Additionally, mammalian cells are typically grown in the lab as an adherent two-dimensional culture in a dish or multi-well plate. Issues arise when one tries to take images of thick specimens, such as tissue samples or three-dimensional organoids. 37 Confocal microscopy Confocal microscopy was developed to overcome some limitations of traditional wide-field fluorescence microscopes. In this technique, the sample is illuminated by the light of a tightly focused (monochromatic) laser (point illumination). No excitation occurs at all other positions outside of the focus. By use of a pinhole, emitted light from above and below the focus is blocked and does not reach the detector. 
Hence, in confocal microscopy, the background signals from regions away from the laser focus are low. However, as much of the light from sample fluorescence is blocked at the pinhole, this increased resolution is at the cost of decreased signal intensity. To offset this drop in signal after the pinhole, long exposures are often required and the light intensity is detected by a very sensitive detector. As only one point in the sample is illuminated at a time, 2D or 3D imaging requires scanning over a regular raster (i.e., a rectangular pattern of parallel scanning lines) in the specimen. In confocal laser scanning microscopes, the laser focus is moved quickly all over the sample. At every point, the fluorescence intensity is measured and finally put together to a full image. This process requires a few hundred microseconds for a sample size of a few tens of micrometers. Images of the specimen can be taken at various heights of the sample (“z-stack”) and combined to obtain a 3D image at high resolution. 38 Microscope schematics (a) In a conventional wide-field fluorescence microscope, the entire specimen is flooded evenly in light from a light source. All parts of the specimen in the optical path are excited at the same time and the resulting fluorescence is detected by the microscope's photodetector or camera, including a large unfocused background part. (b) A confocal microscope uses point illumination and a pinhole in an optically conjugate plane in front of the detector to eliminate out-of-focus signal. As only light produced by fluorescence very close to the focal plane can be detected, the image's optical resolution, particularly in the sample depth direction, is much better than that of wide-field microscopes. 39 Image Comparison These are a series of images that compare traditional widefield and laser scanning confocal fluorescence microscopy. 
(a) A thick section of fluorescently stained human medulla in widefield fluorescence exhibits a large amount of glare from fluorescent structures above and below the focal plane. (d) When imaged with a confocal microscope, the medulla thick section reveals a significant degree of structural detail. (b) Widefield fluorescence imaging of whole rabbit muscle fibers stained with fluorescein produces blurred images lacking in detail. (e) Confocal microscopy reveals a highly striated topography in the same specimen. (c) Autofluorescence in a sunflower pollen grain produces an indistinct outline of the basic external morphology, but yields no indication of the internal structure. (f) A thin optical section of the same grain acquired with confocal techniques displays a dramatic difference between the particle core and the surrounding envelope. 40 Today’s Outline 1) Optical Spectroscopy Absorption of UV and visible light Fluorescence spectroscopy 2) Molecular Recognition Modes of detection Recognition of nucleic acids Recognition of proteins Biosensors 41 Modes of Detection Many methods have been developed to probe for the presence of certain DNA or RNA sequences or certain proteins. Ideally, we should be able to “see” the final results, although DNA, RNA, and proteins are far too small to be seen by the naked eye. There are two modes of detection that are commonly used: (1) Fluorescence (2) Radioactivity 42 Radioactivity A radioactive isotope (also called a radioisotope) is any of several species of the same chemical element with different masses whose nuclei are unstable and dissipate excess energy by spontaneously emitting radiation. The isotopes have the same number of protons but different numbers of neutrons. Radioisotopes are commonly used to detect very small amounts of DNA, RNA, or protein (at the femtogram [1 × 10⁻¹⁵ gram] to picogram [1 × 10⁻¹² gram] level).
A radioactive isotope is introduced into DNA, RNA, or protein for quantification purposes and its presence can be detected by: - sensitive radiation detectors such as Geiger counters and liquid scintillation counters - exposure to X-ray films (autoradiography) or phosphor storage screens 43 Common radioisotopes used 32P is frequently used for detecting DNA and RNA and has the highest emission energy of all common research radioisotopes. This is a major advantage in experiments for which sensitivity is a primary consideration. Its maximum specific activity is 9,131 Ci/mmol. 35S is used to label proteins and nucleic acids. Cysteine is an amino acid containing a thiol group (-SH), which can be labelled with 35S. Since nucleotides do not contain a sulfur group, the oxygen on one of the phosphate groups can be substituted with a sulfur. This thiophosphate acts the same as a normal phosphate group, although there is a slight bias against it by most polymerases. The maximum theoretical specific activity is 1,494 Ci/mmol. 3H is used to detect DNA and RNA. It is a very low energy emitter, with a maximum theoretical specific activity of 28.8 Ci/mmol. However, there is often more than one tritium atom per molecule: for example, tritiated UTP is sold by most suppliers with carbons 5 and 6 each bonded to a tritium atom. 125I is used to radiolabel proteins, usually at tyrosine residues. Unbound iodine is volatile and must be handled in a fume hood. Its maximum specific activity is 2,176 Ci/mmol. 44 How long do radioisotopes last? The term half-life is defined as the time it takes for one-half of the atoms of a radioactive material to decay. It is independent of the original quantity. Different radioisotopes have different half-lives. For example, P-32 has a half-life of 14.29 days, S-35 has a half-life of 87 days, while C-14 has a long half-life of 5,730 years.
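Because the half-life is independent of the starting quantity, the fraction of a label that remains after a given time follows directly from the definition: it halves once per half-life. The sketch below works this out for the P-32 and S-35 half-lives quoted above. The function name `fraction_remaining` and the 30-day storage period are illustrative assumptions, not part of the lecture.

```python
# Half-lives quoted in the notes, in days.
HALF_LIFE_DAYS = {"P-32": 14.29, "S-35": 87.0}


def fraction_remaining(elapsed_days: float, half_life_days: float) -> float:
    """Fraction of the original radioactive atoms left after elapsed_days.

    One half-life halves the amount, two half-lives quarter it, and so on,
    so the remaining fraction is 0.5 raised to the number of half-lives.
    """
    return 0.5 ** (elapsed_days / half_life_days)


if __name__ == "__main__":
    storage_days = 30.0  # hypothetical time a labelled probe sits in the freezer
    for isotope, half_life in HALF_LIFE_DAYS.items():
        left = fraction_remaining(storage_days, half_life)
        print(f"{isotope}: {left:.1%} of the activity remains after {storage_days:.0f} days")
```

A calculation like this is one way to judge whether an older batch of 32P-labelled probe is still worth using, since its activity drops much faster than that of a 35S-labelled one.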
45 Radiation sickness Radiation sickness is a condition in which the body is damaged by large doses of radiation received over a short period of time. The radiation causes cellular degradation due to damage to DNA and other key molecular structures within the cells. Eyes: High doses can trigger cataracts months later. Thyroid: Hormone glands vulnerable to cancer. Radioactive iodine builds up in the thyroid. Children are most at risk. Lungs: Vulnerable to DNA damage when radioactive material is breathed in. Stomach: Vulnerable if radioactive material is swallowed. Reproductive organs: High doses can cause sterility. Skin: High doses cause redness and burning. Bone marrow: Site of production of red and white blood cells. Radiation can lead to leukemia and other immune system diseases. 46 Safety in using radioactivity We need to be careful when using radioactivity in the lab. For example, the high-energy beta emissions from 32P can present a substantial skin and eye dose hazard. Common safety measures include the following: - Designate a special area for handling radioisotopes and clearly label all containers. - Store radioisotopes like 32P behind lead shielding. - Work behind acrylic glass shields and wear safety goggles to protect eyes from radiation. - Practise routine operations to improve dexterity and speed before using radioisotopes like 32P. - Handle potentially volatile chemical forms in ventilated enclosures. - Isolate radioactive waste in clearly labelled shielded containers and hold for decay. - Always check the work area and work clothes (e.g. lab coat) for accidental spills after every experiment. - Go for regular health checks to detect symptoms of radiation sickness. 47 Today’s Outline 1) Optical Spectroscopy Absorption of UV and visible light Fluorescence spectroscopy 2) Molecular Recognition Modes of detection Recognition of nucleic acids Recognition of proteins Biosensors 48 What are restriction endonucleases?
Restriction endonucleases (or restriction enzymes) recognize and cleave specific DNA sequences, i.e. they are DNA-cutting enzymes that only cleave at particular positions. The sequence of nucleotides that is recognized by each restriction enzyme is known as its restriction site. For example, BamHI recognizes GGATCC, while EcoRI recognizes GAATTC. The enzymes can cut their DNA substrate within the recognition site or outside the recognition site (i.e. the recognition and cleavage sites may be separate from each other). 49 Where do restriction enzymes come from? They are usually isolated from bacteria. Restriction enzymes are usually named after the bacteria they are isolated from. For example: EcoRI – isolated from E. coli strain R HindIII – isolated from Haemophilus influenzae strain Rd The Nobel Prize in 1978 was awarded for the discovery of restriction enzymes. 50 If bacteria produce these enzymes, why isn’t the bacterial DNA digested by them? - Probability of finding a target of: 4 bases – 1/256 5 bases – 1/1024 6 bases – 1/4096 8 bases – 1/65,536 - Restriction endonucleases protect bacteria from invasion by foreign DNA (like phage), i.e. they restrict the host range of the virus. - Bacteria use methylation to protect their chromosomal DNA; invading viral DNA is not methylated. - Many restriction enzymes cannot cleave methylated DNA. 51 Restriction enzymes often function as homodimers [Figure: cartoon view of EcoRI bound to its recognition site on double-stranded DNA] Hence, restriction sites are often (but not always) palindromic. 52 Sticky ends vs. blunt ends Upon cleavage, restriction enzymes often leave a single-stranded overhang (“sticky” end). The overhang can be either 5’ or 3’. Sometimes restriction enzymes leave a “blunt” end.
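The points above about recognition sites can be checked with a few lines of Python. This is a minimal sketch, not a digestion tool: the helper names are hypothetical, the 1/4^n figure assumes that all four bases are equally likely and independent at every position, and the test sequence is simply a short duplex strand containing one GAATTC site.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(site: str) -> bool:
    """A site is palindromic if it reads the same 5' to 3' on both strands."""
    return site == reverse_complement(site)


def site_probability(length: int) -> float:
    """Chance that a random position matches a site of this length,
    assuming equal and independent base frequencies (an idealisation)."""
    return 1 / 4 ** length


def find_sites(dna: str, site: str) -> list[int]:
    """0-based start positions of every (possibly overlapping) occurrence."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


if __name__ == "__main__":
    for name, site in {"EcoRI": "GAATTC", "BamHI": "GGATCC"}.items():
        print(name, site, "palindromic:", is_palindromic(site),
              "expected once per", int(1 / site_probability(len(site))), "bp")
    print(find_sites("TCAGATCGTACTTGAGAATTCGGGCT", "GAATTC"))
```

Only the top strand needs to be searched for a palindromic site, because the same sequence then appears at the same location on the opposite strand.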
(5’ overhang) 53
5’-TCAGATCGTACTTGAGAATTCGGGCT-3’
3’-AGTCTAGCATGAACTCTTAAGCCCGA-5’
EcoRI
5’-TCAGATCGTACTTGAG AATTCGGGCT-3’
3’-AGTCTAGCATGAACTCTTAA GCCCGA-5’
Sticky ends 54
Fragment 1
5’-TCAGATCGTACTTGAGAATTCGGGCT-3’
3’-AGTCTAGCATGAACTCTTAAGCCCGA-5’
Fragment 2
5’-CTAGGACCGAATTCAAGTACGGACC-3’
3’-GATCCTGGCTTAAGTTCATGCCTGG-5’
EcoRI
5’-TCAGATCGTACTTGAG AATTCGGGCT-3’
3’-AGTCTAGCATGAACTCTTAA GCCCGA-5’
5’-CTAGGACCG AATTCAAGTACGGACC-3’
3’-GATCCTGGCTTAA GTTCATGCCTGG-5’
55 A new recombinant DNA molecule
5’-TCAGATCGTACTTGAGAATTCAAGTACGGACC-3’
3’-AGTCTAGCATGAACTCTTAAGTTCATGCCTGG-5’
DNA ligase seals the nicks between the two strands, reforming covalent phosphodiester bonds. 56 What is happening in the DNA backbone? Restriction Enzyme DNA Ligase 57 Analogies Restriction Enzyme DNA Ligase Summary of DNA cut-and-paste Examples of restriction enzymes EcoRI SmaI BamHI XmaI KpnI NotI Hundreds of enzymes are available (for example, see: http://www.neb.sg/products/restriction-endonucleases) 60 Type II vs type IIs restriction enzymes Type II enzymes cut within their recognition sequences. EcoRI: BamHI: Type IIs enzymes cut outside of their recognition sequences. BsmBI: BbsI: BsaI: The “N”s are useful for designing unique overhangs. Overview of Southern Blot A Southern blot is a method for detection of a specific DNA sequence in DNA samples. It was invented by Sir Edwin Southern, Professor Emeritus at University of Oxford. 62 Details of South
