Principles and Techniques of Biochemistry and Molecular Biology

Seventh edition
Edited by KEITH WILSON AND JOHN WALKER

This new edition of the bestselling textbook integrates the theoretical principles and experimental techniques common to all undergraduate courses in the bio- and medical sciences. Three of the 16 chapters have new authors and have been totally rewritten. The others have been updated and extended to reflect developments in their field, exemplified by a new section on stem cells. Two new chapters have been added. One, on clinical biochemistry, discusses the principles underlying the diagnosis and management of common biochemical disorders. The second, on drug discovery and development, illustrates how the principles and techniques covered in the book are fundamental to the design and development of new drugs. In-text worked examples are again used to enhance student understanding of each topic and case studies are selectively used to illustrate important examples. Experimental design, quality assurance and the statistical analysis of quantitative data are emphasised throughout the book.

• Motivates students by including cutting-edge topics and techniques, such as drug discovery, as well as the methods they will encounter in their own lab classes
• Promotes problem solving by setting students a challenge and then guiding them through the solution
• Integrates theory and practice to ensure students understand why and how each technique is used.

KEITH WILSON is Professor Emeritus of Pharmacological Biochemistry and former Head of the Department of Biosciences, Dean of the Faculty of Natural Sciences, and Director of Research at the University of Hertfordshire.

JOHN WALKER is Professor Emeritus and former Head of the School of Life Sciences at the University of Hertfordshire.

Cover illustration
Main image: Electrophoresis gel showing recombinant protein. Photographer: J.C. Revy. Courtesy of Science Photo Library.
Top inset: Transcription factor and DNA molecule. Courtesy of Laguna Design/Science Photo Library.
Second inset: Microtubes, pipettor (pipette) tip and DNA sequence. Courtesy of Tek Image/Science Photo Library.
Third inset: Stem cell culture, light micrograph. Photographer: Philippe Plailly. Courtesy of Science Photo Library.
Fourth inset: Embryonic stem cells. Courtesy of Science Photo Library.
Bottom inset: Herceptin breast cancer drug, molecular model. Photographer: Tim Evans. Courtesy of Science Photo Library.

Principles and Techniques of Biochemistry and Molecular Biology
Seventh edition
Edited by KEITH WILSON AND JOHN WALKER

CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Dubai, Tokyo

Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521516358

First and second editions © Bryan Williams and Keith Wilson 1975, 1981
Third edition © Keith Wilson and Kenneth H. Goulding 1986
Fourth edition © Cambridge University Press 1993
Fifth edition © Cambridge University Press 2000
Sixth edition © Cambridge University Press 2005
Seventh edition © Cambridge University Press 2010

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.
First published by Edward Arnold 1975 as A Biologist’s Guide to Principles and Techniques of Practical Biochemistry
Second edition 1981; Third edition 1986
Third edition first published by Cambridge University Press 1992; Reprinted 1993
Fourth edition published by Cambridge University Press 1994 as Principles and Techniques of Practical Biochemistry; Reprinted 1995, 1997; Fifth edition 2000
Sixth edition first published by Cambridge University Press 2005 as Principles and Techniques of Biochemistry and Molecular Biology; Reprinted 2006, 2007
Seventh edition first published by Cambridge University Press 2010

Printed in the United Kingdom at the University Press, Cambridge

A catalogue record for this publication is available from the British Library

Library of Congress Cataloging-in-Publication Data
Principles and techniques of biochemistry and molecular biology / edited by Keith Wilson, John Walker. – 7th ed.
p. cm.
ISBN 978-0-521-51635-8 (hardback) – ISBN 978-0-521-73167-6 (pbk.)
1. Biochemistry–Textbooks. 2. Molecular biology–Textbooks. I. Wilson, Keith, 1936– II. Walker, John M., 1948– III. Title.
QP519.7.P75 2009
612′.015–dc22 2009043277

ISBN 978-0-521-51635-8 Hardback
ISBN 978-0-521-73167-6 Paperback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

CONTENTS

Preface to the seventh edition
List of contributors
List of abbreviations

1 Basic principles
K. WILSON
    1.1 Biochemical and molecular biology studies
    1.2 Units of measurement
    1.3 Weak electrolytes
    1.4 Quantitative biochemical measurements
    1.5 Safety in the laboratory
    1.6 Suggestions for further reading

2 Cell culture techniques
A. R. BAYDOUN
    2.1 Introduction
    2.2 The cell culture laboratory and equipment
    2.3 Safety considerations in cell culture
    2.4 Aseptic techniques and good cell culture practice
    2.5 Types of animal cell, characteristics and maintenance in culture
    2.6 Stem cell culture
    2.7 Bacterial cell culture
    2.8 Potential use of cell cultures
    2.9 Suggestions for further reading

3 Centrifugation
K. OHLENDIECK
    3.1 Introduction
    3.2 Basic principles of sedimentation
    3.3 Types, care and safety aspects of centrifuges
    3.4 Preparative centrifugation
    3.5 Analytical centrifugation
    3.6 Suggestions for further reading

4 Microscopy
S. W. PADDOCK
    4.1 Introduction
    4.2 The light microscope
    4.3 Optical sectioning
    4.4 Imaging living cells and tissues
    4.5 Measuring cellular dynamics
    4.6 The electron microscope (EM)
    4.7 Image archiving
    4.8 Suggestions for further reading

5 Molecular biology, bioinformatics and basic techniques
R. RAPLEY
    5.1 Introduction
    5.2 Structure of nucleic acids
    5.3 Genes and genome complexity
    5.4 Location and packaging of nucleic acids
    5.5 Functions of nucleic acids
    5.6 The manipulation of nucleic acids – basic tools and techniques
    5.7 Isolation and separation of nucleic acids
    5.8 Molecular biology and bioinformatics
    5.9 Molecular analysis of nucleic acid sequences
    5.10 The polymerase chain reaction (PCR)
    5.11 Nucleotide sequencing of DNA
    5.12 Suggestions for further reading

6 Recombinant DNA and genetic analysis
R. RAPLEY
    6.1 Introduction
    6.2 Constructing gene libraries
    6.3 Cloning vectors
    6.4 Hybridisation and gene probes
    6.5 Screening gene libraries
    6.6 Applications of gene cloning
    6.7 Expression of foreign genes
    6.8 Analysing genes and gene expression
    6.9 Analysing whole genomes
    6.10 Pharmacogenomics
    6.11 Molecular biotechnology and applications
    6.12 Suggestions for further reading

7 Immunochemical techniques
R. BURNS
    7.1 Introduction
    7.2 Making antibodies
    7.3 Immunoassay formats
    7.4 Immuno microscopy
    7.5 Lateral flow devices
    7.6 Epitope mapping
    7.7 Immunoblotting
    7.8 Fluorescent activated cell sorting (FACS)
    7.9 Cell and tissue staining techniques
    7.10 Immunocapture polymerase chain reaction (PCR)
    7.11 Immunoaffinity chromatography (IAC)
    7.12 Antibody-based biosensors
    7.13 Therapeutic antibodies
    7.14 The future uses of antibody technology
    7.15 Suggestions for further reading

8 Protein structure, purification, characterisation and function analysis
J. WALKER
    8.1 Ionic properties of amino acids and proteins
    8.2 Protein structure
    8.3 Protein purification
    8.4 Protein structure determination
    8.5 Proteomics and protein function
    8.6 Suggestions for further reading

9 Mass spectrometric techniques
A. AITKEN
    9.1 Introduction
    9.2 Ionisation
    9.3 Mass analysers
    9.4 Detectors
    9.5 Structural information by tandem mass spectrometry
    9.6 Analysing protein complexes
    9.7 Computing and database analysis
    9.8 Suggestions for further reading

10 Electrophoretic techniques
J. WALKER
    10.1 General principles
    10.2 Support media
    10.3 Electrophoresis of proteins
    10.4 Electrophoresis of nucleic acids
    10.5 Capillary electrophoresis
    10.6 Microchip electrophoresis
    10.7 Suggestions for further reading

11 Chromatographic techniques
K. WILSON
    11.1 Principles of chromatography
    11.2 Chromatographic performance parameters
    11.3 High-performance liquid chromatography
    11.4 Adsorption chromatography
    11.5 Partition chromatography
    11.6 Ion-exchange chromatography
    11.7 Molecular (size) exclusion chromatography
    11.8 Affinity chromatography
    11.9 Gas chromatography
    11.10 Suggestions for further reading

12 Spectroscopic techniques: I Spectrophotometric techniques
A. HOFMANN
    12.1 Introduction
    12.2 Ultraviolet and visible light spectroscopy
    12.3 Fluorescence spectroscopy
    12.4 Luminometry
    12.5 Circular dichroism spectroscopy
    12.6 Light scattering
    12.7 Atomic spectroscopy
    12.8 Suggestions for further reading

13 Spectroscopic techniques: II Structure and interactions
A. HOFMANN
    13.1 Introduction
    13.2 Infrared and Raman spectroscopy
    13.3 Surface plasmon resonance
    13.4 Electron paramagnetic resonance
    13.5 Nuclear magnetic resonance
    13.6 X-ray diffraction
    13.7 Small-angle scattering
    13.8 Suggestions for further reading

14 Radioisotope techniques
R. J. SLATER
    14.1 Why use a radioisotope?
    14.2 The nature of radioactivity
    14.3 Detection and measurement of radioactivity
    14.4 Other practical aspects of counting of radioactivity and analysis of data
    14.5 Safety aspects
    14.6 Suggestions for further reading

15 Enzymes
K. WILSON
    15.1 Characteristics and nomenclature
    15.2 Enzyme steady-state kinetics
    15.3 Analytical methods for the study of enzyme reactions
    15.4 Enzyme active sites and catalytic mechanisms
    15.5 Control of enzyme activity
    15.6 Suggestions for further reading

16 Principles of clinical biochemistry
J. FYFFE AND K. WILSON
    16.1 Principles of clinical biochemical analysis
    16.2 Clinical measurements and quality control
    16.3 Examples of biochemical aids to clinical diagnosis
    16.4 Suggestions for further reading
    16.5 Acknowledgements

17 Cell membrane receptors and cell signalling
K. WILSON
    17.1 Receptors for cell signalling
    17.2 Quantitative aspects of receptor–ligand binding
    17.3 Ligand-binding and cell-signalling studies
    17.4 Mechanisms of signal transduction
    17.5 Receptor trafficking
    17.6 Suggestions for further reading

18 Drug discovery and development
K. WILSON
    18.1 Human disease and drug therapy
    18.2 Drug discovery
    18.3 Drug development
    18.4 Suggestions for further reading

Index

The colour figure section is between pages 128 and 129

PREFACE TO THE SEVENTH EDITION

In designing the content of this latest edition we continued our previous policy of placing emphasis on the recommendations we have received from colleagues and academics outside our university. Above all, we have attempted to respond to the invaluable feedback from student users of our book both in the UK and abroad. In this seventh edition we have retained all 16 chapters from the previous edition. All have been appropriately updated to reflect recent developments in their fields, as exemplified by the inclusion of a section on stem cells in the cell culture chapter. Three of these chapters have new authors and have been completely rewritten. Robert Burns, Scottish Agricultural Science Agency, Edinburgh has written the chapter on immunochemical techniques, and Andreas Hofmann, Eskitis Institute of Molecular Therapies, Griffith University, Brisbane, Australia has written the two chapters on spectroscopic techniques. We are delighted to welcome both authors to our team of contributors.

In addition to these changes of authors, two new chapters have been added to the book.
Our decision taken for the sixth edition to include a section on the biochemical principles underlying clinical biochemistry has been well received and so we have extended our coverage of the subject and have devoted a whole chapter (16) to it. Written in collaboration with Dr John Fyffe, Consultant Biochemist, Royal Hospital for Sick Children, Yorkhill, Glasgow, new topics that are discussed in the chapter include the diagnosis and management of kidney disease, diabetes, endocrine disorders including thyroid dysfunction, conditions of the hypothalamus–pituitary–adrenal axis such as pregnancy, and pathologies of plasma proteins such as myeloma. Case studies are included to illustrate how the principles discussed apply to the diagnosis and treatment of individual patients with the conditions.

Our second major innovation for this new edition is the introduction of a new chapter on drug discovery and development. The strategic approaches to the discovery of new drugs have been revolutionised by developments in molecular biology. Pharmaceutical companies now rely on many of the principles and experimental techniques discussed in the chapters throughout the book to identify potential drug targets, to screen chemical libraries and to evaluate the safety and efficacy of selected candidate drugs. The new chapter illustrates the principles of target selection by reference to current drugs used in the treatment of atherosclerosis and HIV/AIDS, emphasises the strategic decisions to be taken during the various stages of drug discovery and development and discusses the issues involved in clinical trials and the registration of new drugs.

We continue to welcome constructive comments from all students who use our book as part of their studies and academics who adopt the book to complement their teaching.
Finally, we wish to express our gratitude to the authors and publishers who have granted us permission to reproduce their copyright figures and our thanks to Katrina Halliday and her colleagues at Cambridge University Press who have been so supportive in the production of this new edition.

KEITH WILSON AND JOHN WALKER

CONTRIBUTORS

PROFESSOR A. AITKEN
Division of Biomedical & Clinical Laboratory Sciences
University of Edinburgh
George Square
Edinburgh EH8 9XD
Scotland, UK

DR A. R. BAYDOUN
School of Life Sciences
University of Hertfordshire
College Lane
Hatfield
Herts AL10 9AB, UK

DR R. BURNS
Scottish Agricultural Science Agency
1 Roddinglaw Road
Edinburgh EH12 9FJ
Scotland, UK

DR J. FYFFE
Consultant Clinical Biochemist
Department of Clinical Biochemistry
Royal Hospital for Sick Children
Yorkhill
Glasgow G3 8SF
Scotland, UK

PROFESSOR ANDREAS HOFMANN
Structural Chemistry
Eskitis Institute for Cell & Molecular Therapeutics
Griffith University
Nathan
Brisbane, Qld 4111
Australia

PROFESSOR K. OHLENDIECK
Department of Biology
National University of Ireland
Maynooth
Co. Kildare
Ireland

DR S. W. PADDOCK
Howard Hughes Medical Institute
Department of Molecular Biology
University of Wisconsin
1525 Linden Drive
Madison, WI 53706
USA

DR R. RAPLEY
School of Life Sciences
University of Hertfordshire
College Lane
Hatfield
Herts AL10 9AB, UK

PROFESSOR R. J. SLATER
School of Life Sciences
University of Hertfordshire
College Lane
Hatfield
Herts AL10 9AB, UK

PROFESSOR J. M. WALKER
School of Life Sciences
University of Hertfordshire
College Lane
Hatfield
Herts AL10 9AB, UK

PROFESSOR K. WILSON
Emeritus Professor of Pharmacological Biochemistry
School of Life Sciences
University of Hertfordshire
College Lane
Hatfield
Herts AL10 9AB, UK

ABBREVIATIONS

The following abbreviations have been used throughout this book.
AMP       adenosine 5′-monophosphate
ADP       adenosine 5′-diphosphate
ATP       adenosine 5′-triphosphate
bp        base-pairs
cAMP      cyclic AMP
CHAPS     3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulphonate
c.p.m.    counts per minute
CTP       cytidine triphosphate
DDT       2,2-bis-(p-chlorophenyl)-1,1,1-trichloroethane
DMSO      dimethylsulphoxide
DNA       deoxyribonucleic acid
e⁻        electron
EDTA      ethylenediaminetetra-acetate
ELISA     enzyme-linked immunosorbent assay
FAD       flavin adenine dinucleotide (oxidised)
FADH₂     flavin adenine dinucleotide (reduced)
FMN       flavin mononucleotide (oxidised)
FMNH₂     flavin mononucleotide (reduced)
GC        gas chromatography
GTP       guanosine triphosphate
HAT       hypoxanthine, aminopterin, thymidine medium
Hepes     4-(2-hydroxyethyl)-1-piperazine-ethanesulphonic acid
HPLC      high-performance liquid chromatography
kb        kilobase-pairs
Mr        relative molecular mass
min       minute
NAD⁺      nicotinamide adenine dinucleotide (oxidised)
NADH      nicotinamide adenine dinucleotide (reduced)
NADP⁺     nicotinamide adenine dinucleotide phosphate (oxidised)
NADPH     nicotinamide adenine dinucleotide phosphate (reduced)
Pipes     1,4-piperazinebis(ethanesulphonic acid)
Pi        inorganic phosphate
p.p.m.    parts per million
p.p.b.    parts per billion
PPi       inorganic pyrophosphate
RNA       ribonucleic acid
r.p.m.    revolutions per minute
SDS       sodium dodecyl sulphate
Tris      2-amino-2-hydroxymethylpropane-1,3-diol

1 Basic principles
K. WILSON

1.1 Biochemical and molecular biology studies
1.2 Units of measurement
1.3 Weak electrolytes
1.4 Quantitative biochemical measurements
1.5 Safety in the laboratory
1.6 Suggestions for further reading

1.1 BIOCHEMICAL AND MOLECULAR BIOLOGY STUDIES

1.1.1 Aims of laboratory investigations

Biochemistry involves the study of the chemical processes that occur in living organisms with the ultimate aim of understanding the nature of life in molecular terms.
Biochemical studies rely on the availability of appropriate analytical techniques and on the application of these techniques to the advancement of knowledge of the nature of, and relationships between, biological molecules, especially proteins and nucleic acids, and cellular function. In recent years huge advances have been made in our understanding of gene structure and expression and in the application of techniques such as mass spectrometry to the study of protein structure and function. The Human Genome Project in particular has been the stimulus for major developments in our understanding of many human diseases, especially cancer, and for the identification of strategies that might be used to combat these diseases.

The discipline of molecular biology overlaps with that of biochemistry and in many respects the aims of the two disciplines complement each other. Molecular biology is focussed on the molecular understanding of the processes of replication, transcription and translation of genetic material, whereas biochemistry exploits the techniques and findings of molecular biology to advance our understanding of such cellular processes as cell signalling and apoptosis.
The result is that the two disciplines now have the opportunity to address issues such as:

• the structure and function of the total protein component of the cell (proteomics) and of all the small molecules in the cell (metabolomics);
• the mechanisms involved in the control of gene expression;
• the identification of genes associated with a wide range of human diseases;
• the development of gene therapy strategies for the treatment of human diseases;
• the characterisation of the large number of ‘orphan’ receptors, whose physiological role and natural agonist are currently unknown, present in the human genome and their exploitation for the development of new therapeutic agents;
• the identification of novel disease-specific markers for the improvement of clinical diagnosis;
• the engineering of cells, especially stem cells, to treat human diseases;
• the understanding of the functioning of the immune system in order to develop strategies for protection against invading pathogens;
• the development of our knowledge of the molecular biology of plants in order to engineer crop improvements, pathogen resistance and stress tolerance;
• the application of molecular biology techniques to the study of the nature and treatment of bacterial, fungal and viral diseases.

The remaining chapters in this book address the major experimental strategies and analytical techniques that are routinely used to address issues such as these.

1.1.2 Experimental design

Advances in biochemistry and molecular biology, as in all the sciences, are based on the careful design, execution and data analysis of experiments designed to address specific questions or hypotheses.
Such experimental design involves a discrete number of compulsory stages:

• the identification of the subject for experimental investigation;
• the critical evaluation of the current state of knowledge (the ‘literature’) of the chosen subject area, noting the strengths and weaknesses of the methodologies previously applied and the new hypotheses which emerged from the studies;
• the formulation of the question or hypothesis to be addressed by the planned experiment;
• the careful selection of the biological system (species, in vivo or in vitro) to be used for the study;
• the identification of the variable that is to be studied;
• the consideration of the other variables that will need to be controlled so that the selected variable is the only factor that will determine the experimental outcome;
• the design of the experiment including the statistical analysis of the results, careful evaluation of the materials and apparatus to be used and the consequential potential safety aspects of the study;
• the execution of the experiment including appropriate calibrations and controls, with a carefully written record of the outcomes;
• the replication of the experiment as necessary for the unambiguous analysis of the outcomes;
• the evaluation of the outcomes including the application of appropriate statistical tests to quantitative data where applicable;
• the formulation of the main conclusions that can be drawn from the results;
• the formulation of new hypotheses and of future experiments that emerge from the study.

The results of well-designed and analysed studies are finally published in the scientific literature after being subject to independent peer review, and one of the major challenges facing professional biochemists and molecular biologists is to keep abreast of current advances in the literature. Fortunately, the advent of the web has made access to the literature easier than it once was.
1.2 UNITS OF MEASUREMENT

1.2.1 SI units

The French Système International d’Unités (the SI system) is the accepted convention for all units of measurement. Table 1.1 lists basic and derived SI units. Table 1.2 lists numerical values for some physical constants in SI units, together with conversion factors for non-SI units. Table 1.3 lists the commonly used prefixes associated with quantitative terms. Table 1.4 gives the interconversion of non-SI units of volume.

1.2.2 Molarity – the expression of concentration

In practical terms one mole of a substance is equal to its molecular mass expressed in grams, where the molecular mass is the sum of the atomic masses of the constituent atoms. Note that the term molecular mass is preferred to the older term molecular weight. The SI unit of concentration is expressed in terms of moles per cubic metre (mol m⁻³) (see Table 1.1). In practice the cubic metre is far too large a volume for normal laboratory purposes and a unit based on a cubic decimetre (dm³, 10⁻³ m³) is preferred. However, some textbooks and journals, especially those of North American origin, tend to use the older unit of volume, namely the litre and its subunits (see Table 1.4), rather than cubic decimetres. In this book, volumes will be expressed in cubic decimetres or its smaller counterparts (Table 1.4). The molarity of a solution of a substance expresses the number of moles of the substance in one cubic decimetre of solution. It is expressed by the symbol M.

It should be noted that atomic and molecular masses are both expressed in daltons (Da) or kilodaltons (kDa), where one dalton is an atomic mass unit equal to one-twelfth of the mass of one atom of the ¹²C isotope. However, biochemists prefer to use the term relative molecular mass (Mr). This is defined as the molecular mass of a substance relative to one-twelfth of the atomic mass of the ¹²C isotope. Mr therefore has no units.
Thus the relative molecular mass of sodium chloride is 23 (Na) plus 35.5 (Cl), i.e. 58.5, so that one mole is 58.5 grams. If this was dissolved in water and adjusted to a total volume of 1 dm³ the solution would be one molar (1 M).

Table 1.1 SI units – basic and derived units

                                                          Symbol            Definition      Equivalent
Quantity                      SI unit                     (basic SI units)  of SI unit      in SI units

Basic units
Length                        metre                       m
Mass                          kilogram                    kg
Time                          second                      s
Electric current              ampere                      A
Temperature                   kelvin                      K
Luminous intensity            candela                     cd
Amount of substance           mole                        mol

Derived units
Force                         newton                      N                 kg m s⁻²        J m⁻¹
Energy, work, heat            joule                       J                 kg m² s⁻²       N m
Power, radiant flux           watt                        W                 kg m² s⁻³       J s⁻¹
Electric charge, quantity     coulomb                     C                 A s             J V⁻¹
Electric potential difference volt                        V                 kg m² s⁻³ A⁻¹   J C⁻¹
Electric resistance           ohm                         Ω                 kg m² s⁻³ A⁻²   V A⁻¹
Pressure                      pascal                      Pa                kg m⁻¹ s⁻²      N m⁻²
Frequency                     hertz                       Hz                s⁻¹
Magnetic flux density         tesla                       T                 kg s⁻² A⁻¹      V s m⁻²

Other units based on SI
Area                          square metre                m²
Volume                        cubic metre                 m³
Density                       kilogram per cubic metre    kg m⁻³
Concentration                 mole per cubic metre        mol m⁻³

Table 1.2 SI units – conversion factors for non-SI units

Unit                                      Symbol     SI equivalent

Avogadro constant                         L or NA    6.022 × 10²³ mol⁻¹
Faraday constant                          F          9.648 × 10⁴ C mol⁻¹
Planck constant                           h          6.626 × 10⁻³⁴ J s
Universal or molar gas constant           R          8.314 J K⁻¹ mol⁻¹
Molar volume of an ideal gas at s.t.p.               22.41 dm³ mol⁻¹
Velocity of light in a vacuum             c          2.998 × 10⁸ m s⁻¹

Energy
  calorie                                 cal        4.184 J
  erg                                     erg        10⁻⁷ J
  electron volt                           eV         1.602 × 10⁻¹⁹ J
Pressure
  atmosphere                              atm        101 325 Pa
  bar                                     bar        10⁵ Pa
  millimetres of Hg                       mm Hg      133.322 Pa
Temperature
  centigrade                              °C         (t °C + 273.15) K
  Fahrenheit                              °F         (t °F − 32) × 5/9 + 273.15 K
Length
  Ångström                                Å          10⁻¹⁰ m
  inch                                    in         0.0254 m
Mass
  pound                                   lb         0.4536 kg

Note: s.t.p., standard temperature and pressure.

Biological substances are most frequently found at relatively low concentrations and in in vitro model systems the volumes of stock solutions regularly used for experimental purposes are also small.
The consequence is that experimental solutions are usually in the mM, µM and nM range rather than molar. Table 1.5 shows the interconversion of these units.

Table 1.3 Common unit prefixes associated with quantitative terms

Multiple   Prefix   Symbol      Multiple   Prefix   Symbol
10²⁴       yotta    Y           10⁻¹       deci     d
10²¹       zetta    Z           10⁻²       centi    c
10¹⁸       exa      E           10⁻³       milli    m
10¹⁵       peta     P           10⁻⁶       micro    µ
10¹²       tera     T           10⁻⁹       nano     n
10⁹        giga     G           10⁻¹²      pico     p
10⁶        mega     M           10⁻¹⁵      femto    f
10³        kilo     k           10⁻¹⁸      atto     a
10²        hecto    h           10⁻²¹      zepto    z
10¹        deca     da          10⁻²⁴      yocto    y

Table 1.4 Interconversion of non-SI and SI units of volume

Non-SI unit          Non-SI subunit   SI subunit      SI unit
1 litre (l)          10³ ml           = 1 dm³         = 10⁻³ m³
1 millilitre (ml)    1 ml             = 1 cm³         = 10⁻⁶ m³
1 microlitre (µl)    10⁻³ ml          = 1 mm³         = 10⁻⁹ m³
1 nanolitre (nl)     10⁻⁶ ml          = 10⁻³ mm³      = 10⁻¹² m³

Table 1.5 Interconversion of mol, mmol and µmol in different volumes to give different concentrations

Molar (M)        Millimolar (mM)   Micromolar (µM)
1 mol dm⁻³       1 mmol dm⁻³       1 µmol dm⁻³
1 mmol cm⁻³      1 µmol cm⁻³       1 nmol cm⁻³
1 µmol mm⁻³      1 nmol mm⁻³       1 pmol mm⁻³

1.3 WEAK ELECTROLYTES

1.3.1 The biochemical importance of weak electrolytes

Many molecules of biochemical importance are weak electrolytes in that they are acids or bases that are only partially ionised in aqueous solution. Examples include the amino acids, peptides, proteins, nucleosides, nucleotides and nucleic acids. They also include the reagents used in the preparation of buffers such as ethanoic (acetic) acid and phosphoric acid. The biochemical function of many of these molecules is dependent upon their precise state of ionisation at the prevailing cellular or extracellular pH. The catalytic sites of enzymes, for example, contain functional carboxyl and amino groups, from the side chains of constituent amino acids in the protein chain, which need to be in a specific ionised state to enable the catalytic function of the enzyme to be realised.
Before the ionisation of these compounds is discussed in detail, it is necessary to appreciate the importance of the ionisation of water.

1.3.2 Ionisation of weak acids and bases

One of the most important weak electrolytes is water since it ionises to a small extent to give hydrogen ions and hydroxyl ions. In fact there is no such species as a free hydrogen ion in aqueous solution as it reacts with water to give a hydronium ion (H₃O⁺):

$$\mathrm{H_2O \rightleftharpoons H^+ + OH^-}$$
$$\mathrm{H^+ + H_2O \rightleftharpoons H_3O^+}$$

Even though free hydrogen ions do not exist it is conventional to refer to them rather than hydronium ions. The equilibrium constant (Keq) for the ionisation of water has a value of 1.8 × 10⁻¹⁶ at 24 °C:

$$K_{eq} = \frac{[\mathrm{H^+}][\mathrm{OH^-}]}{[\mathrm{H_2O}]} = 1.8 \times 10^{-16} \tag{1.1}$$

The molarity of pure water is 55.6 M. This can be incorporated into a new constant, Kw:

$$1.8 \times 10^{-16} \times 55.6 = [\mathrm{H^+}][\mathrm{OH^-}] = 1.0 \times 10^{-14} = K_w \tag{1.2}$$

Kw is known as the autoprotolysis constant of water and does not include an expression for the concentration of water. Its numerical value of exactly 10⁻¹⁴ relates specifically to 24 °C. At 0 °C Kw has a value of 1.14 × 10⁻¹⁵ and at 100 °C a value of 5.45 × 10⁻¹³. The stoichiometry in equation 1.2 shows that hydrogen ions and hydroxyl ions are produced in a 1 : 1 ratio, hence both of them must be present at a concentration of 1.0 × 10⁻⁷ M. Since the Sörensen definition of pH is that it is equal to the negative logarithm of the hydrogen ion concentration, it follows that the pH of pure water is 7.0. This is the definition of neutrality.

Ionisation of carboxylic acids and amines

As previously stressed, many biochemically important compounds contain a carboxyl group (-COOH) or a primary (RNH₂), secondary (R₂NH) or tertiary (R₃N) amine which can donate or accept a hydrogen ion on ionisation.
The tendency of a weak acid, generically represented as HA, to ionise is expressed by the equilibrium reaction:

$$\underset{\text{weak acid}}{\mathrm{HA}} \rightleftharpoons \mathrm{H^+} + \underset{\text{conjugate base (anion)}}{\mathrm{A^-}}$$

This reversible reaction can be represented by an equilibrium constant, Ka, known as the acid dissociation constant (equation 1.3). Numerically, it is very small.

$$K_a = \frac{[\mathrm{H^+}][\mathrm{A^-}]}{[\mathrm{HA}]} \tag{1.3}$$

Note that the ionisation of a weak acid results in the release of a hydrogen ion and the conjugate base of the acid, both of which are ionic in nature.

Similarly, amino groups (primary, secondary and tertiary) as weak bases can exist in ionised and unionised forms and the concomitant ionisation process is represented by an equilibrium constant, Kb (equation 1.4):

$$\underset{\text{weak base (primary amine)}}{\mathrm{RNH_2}} + \mathrm{H_2O} \rightleftharpoons \underset{\text{conjugate acid (substituted ammonium ion)}}{\mathrm{RNH_3^+}} + \mathrm{OH^-}$$

$$K_b = \frac{[\mathrm{RNH_3^+}][\mathrm{OH^-}]}{[\mathrm{RNH_2}][\mathrm{H_2O}]} \tag{1.4}$$

In this case, the non-ionised form of the base abstracts a hydrogen ion from water to produce the conjugate acid that is ionised. If this equation is viewed from the reverse direction it is of a similar format to that of equation 1.3. Equally, equation 1.3 viewed in reverse is similar in format to equation 1.4.

A specific and simple example of the ionisation of a weak acid is that of acetic (ethanoic) acid, CH₃COOH:

$$\underset{\text{acetic acid}}{\mathrm{CH_3COOH}} \rightleftharpoons \underset{\text{acetate anion}}{\mathrm{CH_3COO^-}} + \mathrm{H^+}$$

Acetic acid and its conjugate base, the acetate anion, are known as a conjugate acid–base pair. The acid dissociation constant can be written in the following way:

$$K_a = \frac{[\mathrm{CH_3COO^-}][\mathrm{H^+}]}{[\mathrm{CH_3COOH}]} = \frac{[\text{conjugate base}][\mathrm{H^+}]}{[\text{weak acid}]} \tag{1.5a}$$

Ka has a value of 1.75 × 10⁻⁵ M. In practice it is far more common to express the Ka value in terms of its negative logarithm (i.e. −log Ka), referred to as pKa. Thus in this case pKa is equal to 4.75. It can be seen from equation 1.3 that pKa is numerically equal to the pH at which 50% of the acid is protonated (unionised) and 50% is deprotonated (ionised), since when [A⁻] = [HA] the equation reduces to Ka = [H⁺].
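As a minimal sketch of the relationship between Ka and pKa described above, the short Python example below converts the quoted Ka of acetic acid into a pKa and checks that the ratio of conjugate base to weak acid is 1 when pH equals pKa. The function names are illustrative assumptions, not part of the original text; the only input taken from the text is the Ka value of 1.75 × 10⁻⁵ M. The computed pKa is about 4.757, which the text quotes as 4.75.

```python
import math


def pka_from_ka(ka: float) -> float:
    """Return pKa, the negative base-10 logarithm of Ka."""
    return -math.log10(ka)


def ratio_base_to_acid(ph: float, pka: float) -> float:
    """Return [A-]/[HA] at a given pH.

    Rearranging equation 1.3 gives [A-]/[HA] = Ka/[H+] = 10**(pH - pKa).
    """
    return 10 ** (ph - pka)


KA_ACETIC = 1.75e-5  # M, value quoted in the text for acetic acid

pka = pka_from_ka(KA_ACETIC)
print(round(pka, 2))                 # 4.76
print(ratio_base_to_acid(pka, pka))  # 1.0, i.e. 50% ionised when pH = pKa
```

The ratio changes tenfold for every unit change in pH, which is why a weak acid moves from mostly unionised to mostly ionised over a fairly narrow pH range around its pKa.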
It is possible to write an expression for the $K_b$ of the acetate anion as a conjugate base:

$$\mathrm{CH_3COO^- + H_2O \rightleftharpoons CH_3COOH + OH^-}$$

$$K_b = \frac{[\mathrm{CH_3COOH}][\mathrm{OH^-}]}{[\mathrm{CH_3COO^-}]} = \frac{[\text{weak acid}][\mathrm{OH^-}]}{[\text{conjugate base}]} \tag{1.5b}$$

$K_b$ has a value of $5.6 \times 10^{-10}$ M, hence its $\mathrm{p}K_b$ (i.e. $-\log K_b$) = 9.25. Multiplying these two expressions together results in the important relationship:

$$K_a \times K_b = [\mathrm{H^+}][\mathrm{OH^-}] = K_w = 1.0 \times 10^{-14} \text{ at 24 °C}$$

hence

$$\mathrm{p}K_a + \mathrm{p}K_b = \mathrm{p}K_w = 14 \tag{1.6}$$

This relationship holds for all acid–base pairs and enables one pK value to be calculated from knowledge of the other. Biologically important examples of conjugate acid–base pairs are lactic acid/lactate, pyruvic acid/pyruvate, carbonic acid/bicarbonate and ammonium/ammonia.

In the case of the ionisation of weak bases the most common convention is to quote the $K_a$ or the $\mathrm{p}K_a$ of the conjugate acid rather than the $K_b$ or $\mathrm{p}K_b$ of the weak base itself. Examples of the $\mathrm{p}K_a$ values of some weak acids and bases are given in Table 1.6.

Table 1.6 pKa values of some acids and bases that are commonly used as buffer solutions

Acid or base        pKa
Acetic acid         4.75
Barbituric acid     3.98
Carbonic acid       6.10, 10.22
Citric acid         3.10, 4.76, 5.40
Glycylglycine       3.06, 8.13
Hepes (a)           7.50
Phosphoric acid     1.96, 6.70, 12.30
Phthalic acid       2.90, 5.51
Pipes (a)           6.80
Succinic acid       4.18, 5.56
Tartaric acid       2.96, 4.16
Tris (a)            8.14

Note: (a) See list of abbreviations at the front of the book.

Remember that the smaller the numerical value of $\mathrm{p}K_a$ the stronger the acid (more ionised) and the weaker its conjugate base. Weak acids will be predominantly unionised at low pH values and ionised at high values. In contrast, weak bases will be predominantly ionised at low pH values and unionised at high values. This sensitivity to pH of the state of ionisation of weak electrolytes is important both physiologically and in in vitro biochemical studies employing such analytical techniques as electrophoresis and ion-exchange chromatography.
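The pH dependence of ionisation described above can be sketched numerically. The following is a minimal illustration, not part of the original text: it assumes ideal behaviour (concentrations rather than activities) and $\mathrm{p}K_w = 14$, and the function names are hypothetical. The fraction ionised follows directly from equation 1.3, since $[\mathrm{A^-}]/([\mathrm{A^-}] + [\mathrm{HA}]) = K_a/(K_a + [\mathrm{H^+}])$.

```python
import math

PKW = 14.0  # pKw of water at 24 °C, as quoted in the text (assumed constant here)


def pka_from_ka(ka):
    """Convert an acid dissociation constant (M) to pKa = -log10(Ka)."""
    return -math.log10(ka)


def pkb_from_pka(pka, pkw=PKW):
    """Equation 1.6: pKa + pKb = pKw."""
    return pkw - pka


def fraction_ionised_acid(ph, pka):
    """Fraction of a weak acid HA present as A-: Ka / (Ka + [H+])."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


def fraction_ionised_base(ph, pka_conjugate_acid):
    """Fraction of a weak base present as its protonated (ionised) conjugate acid."""
    return 1.0 / (1.0 + 10 ** (ph - pka_conjugate_acid))


# Acetic acid (pKa 4.75) two pH units below, at, and two units above its pKa
for ph in (2.75, 4.75, 6.75):
    print(ph, round(fraction_ionised_acid(ph, 4.75), 3))
```

For acetic acid this gives roughly 1%, 50% and 99% ionised at the three pH values, in line with the statement that weak acids are predominantly unionised at low pH and ionised at high pH; `fraction_ionised_base` shows the opposite trend for a weak base.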
Ionisation of polyprotic weak acids and bases

Polyprotic weak acids and bases are capable of donating or accepting more than one hydrogen ion. Each ionisation stage can be represented by a $K_a$ value using the convention that $K_{a1}$ refers to the acid with the most ionisable hydrogen atoms and $K_{an}$ the acid with the least number of ionisable hydrogen atoms. One of the most important biochemical examples is phosphoric acid, $\mathrm{H_3PO_4}$, as it is widely used as the basis of a buffer in the pH region of 6.70 (see below):

$$\mathrm{H_3PO_4 \rightleftharpoons H^+ + H_2PO_4^-} \qquad \mathrm{p}K_{a1} = 1.96$$
$$\mathrm{H_2PO_4^- \rightleftharpoons H^+ + HPO_4^{2-}} \qquad \mathrm{p}K_{a2} = 6.70$$
$$\mathrm{HPO_4^{2-} \rightleftharpoons H^+ + PO_4^{3-}} \qquad \mathrm{p}K_{a3} = 12.30$$

Example 1 CALCULATION OF pH AND THE EXTENT OF IONISATION OF A WEAK ELECTROLYTE

Question: Calculate the pH of a 0.01 M solution of acetic acid and its fractional ionisation given that its $K_a$ is $1.75 \times 10^{-5}$.

Answer: To calculate the pH we can write:

$$K_a = \frac{[\text{acetate}^-][\mathrm{H^+}]}{[\text{acetic acid}]} = 1.75 \times 10^{-5}$$

Since acetate and hydrogen ions are produced in equal quantities, if $x$ = the concentration of each then the concentration of unionised acetic acid remaining will be $0.01 - x$. Hence:

$$1.75 \times 10^{-5} = \frac{(x)(x)}{0.01 - x}$$
$$1.75 \times 10^{-7} - 1.75 \times 10^{-5}x = x^2$$

This can now be solved either by use of the quadratic formula or, more easily, by neglecting the term in $x$ on the left-hand side since it is so small. Adopting the latter alternative gives:

$$x^2 = 1.75 \times 10^{-7}$$

hence $x = 4.18 \times 10^{-4}$ M, hence pH = 3.38.

The fractional ionisation ($\alpha$) of the acetic acid is defined as the fraction of the acetic acid that is in the form of acetate and is therefore given by the equation:

$$\alpha = \frac{[\text{acetate}]}{[\text{acetate}] + [\text{acetic acid}]} = \frac{4.18 \times 10^{-4}}{4.18 \times 10^{-4} + 0.01 - 4.18 \times 10^{-4}} = \frac{4.18 \times 10^{-4}}{0.01} = 4.18 \times 10^{-2} \text{ or } 4.18\%$$

Thus the majority of the acetic acid is present as the unionised form. If the pH is increased above 3.38 the proportion of acetate present will increase in accordance with the Henderson–Hasselbalch equation.

1.3.3 Buffer solutions

A buffer solution is one that resists a change in pH on the addition of either acid or base.
Buffers are of enormous importance in practical biochemical work because so many biochemical molecules are weak electrolytes whose ionic status varies with pH, so there is a need to stabilise this ionic status during the course of a practical experiment. In practice, a buffer solution consists of an aqueous mixture of a weak acid and its conjugate base. The conjugate base component would neutralise any hydrogen ions generated during an experiment whilst the unionised acid would neutralise any base generated. The Henderson–Hasselbalch equation is of central importance in the preparation of buffer solutions. It can be expressed in a variety of forms. For a buffer based on a weak acid:

$$\mathrm{pH} = \mathrm{p}K_a + \log\frac{[\text{conjugate base}]}{[\text{weak acid}]} \tag{1.7}$$

or

$$\mathrm{pH} = \mathrm{p}K_a + \log\frac{[\text{ionised form}]}{[\text{unionised form}]}$$

For a buffer based on the conjugate acid of a weak base:

$$\mathrm{pH} = \mathrm{p}K_a + \log\frac{[\text{weak base}]}{[\text{conjugate acid}]} \tag{1.8}$$

or

$$\mathrm{pH} = \mathrm{p}K_a + \log\frac{[\text{unionised form}]}{[\text{ionised form}]}$$

Table 1.6 lists some weak acids and bases commonly used in the preparation of buffer solutions. Phosphate, Hepes and Pipes are commonly used because their optimum buffering pH is close to 7.4. The buffer action and pH of blood is illustrated in Example 2 and the preparation of a phosphate buffer is given in Example 3.

Buffer capacity

It can be seen from the Henderson–Hasselbalch equations that when the concentrations (or more strictly the activities) of the weak acid and its conjugate base are equal, their ratio is one and the logarithm of the ratio is zero, so that pH = $\mathrm{p}K_a$. The ability of a buffer solution to resist a change in pH on the addition of strong acid or alkali is expressed by its buffer capacity ($\beta$). This is defined as the amount (moles) of acid or base required to change the pH by one unit, i.e.

$$\beta = \frac{\mathrm{d}b}{\mathrm{d}\,\mathrm{pH}} = -\frac{\mathrm{d}a}{\mathrm{d}\,\mathrm{pH}} \tag{1.9}$$

where d$b$ and d$a$ are the amount of base and acid respectively and d pH is the resulting change in pH. In practice, $\beta$ is largest within the pH range $\mathrm{p}K_a \pm 1$.
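The use of equation 1.7 in buffer preparation can be sketched in a few lines of code. This is an illustrative sketch only: it assumes ideal behaviour (concentrations in place of activities), the function names are hypothetical, and the sample inputs simply mirror the phosphate system ($\mathrm{p}K_{a2}$ taken as 6.8, target pH 7.1, total concentration 0.1 M).

```python
import math


def base_to_acid_ratio(ph, pka):
    """Rearranged equation 1.7: [conjugate base]/[weak acid] = 10**(pH - pKa)."""
    return 10 ** (ph - pka)


def buffer_composition(ph, pka, total_molar):
    """Split a total buffer concentration into conjugate base and weak acid (both in M)."""
    ratio = base_to_acid_ratio(ph, pka)
    base = total_molar * ratio / (1.0 + ratio)
    acid = total_molar - base
    return base, acid


def ph_from_composition(pka, base_molar, acid_molar):
    """Equation 1.7 in its forward form."""
    return pka + math.log10(base_molar / acid_molar)


base, acid = buffer_composition(ph=7.1, pka=6.8, total_molar=0.1)
print(f"[base] = {base:.4f} M, [acid] = {acid:.4f} M")
print(f"check pH = {ph_from_composition(6.8, base, acid):.2f}")
```

The ratio changes tenfold for every pH unit away from the $\mathrm{p}K_a$, which is why one component is almost exhausted, and the buffer capacity poor, outside the range $\mathrm{p}K_a \pm 1$.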
Example 2 BUFFER ACTION AND pH OF BLOOD

The normal pH of blood is 7.4 and is maintained at this value by buffer action, in particular by the action of $\mathrm{HCO_3^-}$ and $\mathrm{CO_2}$ resulting from gaseous $\mathrm{CO_2}$ dissolved in blood and the resulting ionisation of carbonic acid:

$$\mathrm{CO_2 + H_2O \rightleftharpoons H_2CO_3}$$
$$\mathrm{H_2CO_3 \rightleftharpoons H^+ + HCO_3^-}$$

It is possible to calculate an overall equilibrium constant ($K_{eq}$) for these two consecutive reactions and to incorporate the concentration of water (55.6 M) into the value:

$$K_{eq} = \frac{[\mathrm{H^+}][\mathrm{HCO_3^-}]}{[\mathrm{CO_2}]} = 7.95 \times 10^{-7} \quad \text{hence } \mathrm{p}K_{eq} = 6.1$$

Rearranging:

$$\mathrm{pH} = \mathrm{p}K_{eq} + \log\frac{[\mathrm{HCO_3^-}]}{[\mathrm{CO_2}]}$$

When the pH of blood falls due to the metabolic production of $\mathrm{H^+}$, these equilibria shift in favour of increased production of $\mathrm{H_2CO_3}$ that in turn breaks down to give increased $\mathrm{CO_2}$ that is then expired. When the pH of blood rises, more $\mathrm{HCO_3^-}$ is produced and breathing is adjusted to retain more $\mathrm{CO_2}$ in the blood, thus maintaining blood pH. Some disease states may change this pH causing either acidosis or alkalosis and this may cause serious problems and, in extreme cases, death. For example, obstructive lung disease may cause acidosis and hyperventilation may cause alkalosis. Clinical biochemists routinely monitor patients' acid–base balance in blood, in particular the ratio of $\mathrm{HCO_3^-}$ and $\mathrm{CO_2}$. Reference ranges for these at pH 7.4 are $[\mathrm{HCO_3^-}]$ = 18.0–26.0 mM and $p\mathrm{CO_2}$ = 4.6–6.9 kPa, which gives $[\mathrm{CO_2}]$ in the region of 1.20 mM.

Question: A patient suffering from acidosis had a blood pH of 7.15 and $[\mathrm{CO_2}]$ of 1.15 mM. What was the patient's $[\mathrm{HCO_3^-}]$ and what are the implications of its value to the buffer capacity of the blood?

Answer: Applying the above equation we get:

$$\mathrm{pH} = \mathrm{p}K_{eq} + \log\frac{[\mathrm{HCO_3^-}]}{[\mathrm{CO_2}]}$$
$$7.15 = 6.10 + \log\frac{[\mathrm{HCO_3^-}]}{1.15}$$
$$1.05 = \log\frac{[\mathrm{HCO_3^-}]}{1.15}$$

Taking the antilog of this equation we get $11.22 = [\mathrm{HCO_3^-}]/1.15$. Therefore $[\mathrm{HCO_3^-}]$ = 12.90 mM, indicating that the bicarbonate concentration in the patient's blood had decreased by 11.1 mM, i.e.
47%, thereby severely reducing the buffer capacity of the patient's blood so that any further significant production of acid would have serious implications for the patient.

Example 3 PREPARATION OF A PHOSPHATE BUFFER

Question: How would you prepare 1 dm³ of 0.1 M phosphate buffer, pH 7.1, given that $\mathrm{p}K_{a2}$ for phosphoric acid is 6.8 and that the atomic masses for Na, P and O are 23, 31 and 16 daltons respectively?

Answer: The buffer will be based on the ionisation:

$$\mathrm{H_2PO_4^- \rightleftharpoons HPO_4^{2-} + H^+} \qquad \mathrm{p}K_{a2} = 6.8$$

and will therefore involve the use of solid sodium dihydrogen phosphate ($\mathrm{NaH_2PO_4}$) and disodium hydrogen phosphate ($\mathrm{Na_2HPO_4}$). Applying the appropriate Henderson–Hasselbalch equation (equation 1.7) gives:

$$7.1 = 6.8 + \log\frac{[\mathrm{HPO_4^{2-}}]}{[\mathrm{H_2PO_4^-}]}$$
$$0.3 = \log\frac{[\mathrm{HPO_4^{2-}}]}{[\mathrm{H_2PO_4^-}]}$$
$$2.0 = \frac{[\mathrm{HPO_4^{2-}}]}{[\mathrm{H_2PO_4^-}]}$$

Since the total concentration of the two species needs to be 0.1 M it follows that $[\mathrm{HPO_4^{2-}}]$ must be 0.067 M and $[\mathrm{H_2PO_4^-}]$ 0.033 M. The molecular masses of $\mathrm{Na_2HPO_4}$ and $\mathrm{NaH_2PO_4}$ are 142 and 120 daltons respectively; hence the weight of each required, calculated from the unrounded concentrations, is 0.067 × 142 ≈ 9.46 g ($\mathrm{Na_2HPO_4}$) and 0.033 × 120 ≈ 4.00 g ($\mathrm{NaH_2PO_4}$). These weights would be dissolved in approximately 800 cm³ pure water, the pH measured and adjusted as necessary, and the volume finally made up to 1 dm³.

Selection of a buffer

When selecting a buffer for a particular experimental study, several factors should be taken into account:
• select the one with a $\mathrm{p}K_a$ as near as possible to the required experimental pH and within the range $\mathrm{p}K_a \pm 1$, as outside this range there will be too little weak acid or weak base present to maintain an effective buffer capacity;
• select an appropriate concentration of buffer to have adequate buffer capacity for the particular experiment.
Buffers are most commonly used in the range 0.05–0.5 M;
• ensure that the selected buffer does not form insoluble complexes with any anions or cations essential to the reaction being studied (phosphate buffers tend to precipitate polyvalent cations, for example, and phosphate may be a metabolite or inhibitor of the reaction);
• ensure that the proposed buffer has other desirable properties such as being non-toxic, being able to penetrate membranes, and not absorbing in the visible or ultraviolet region.

1.3.4 Measurement of pH – the pH electrode

The pH electrode is an example of an ion-selective electrode (ISE) that responds to one specific ion in solution, in this case the hydrogen ion. The electrode consists of a thin glass porous membrane sealed at the end of a hard glass tube containing 0.1 M hydrochloric acid into which is immersed a silver wire coated with silver chloride. This silver/silver chloride electrode acts as an internal reference that generates a constant potential. The porous membrane is typically 0.1 mm thick, the outer and inner 10 nm consisting of a hydrated gel layer containing exchange-binding sites for hydrogen or sodium ions. On the inside of the membrane the exchange sites are predominantly occupied by hydrogen ions from the hydrochloric acid whilst on the outside the exchange sites are occupied by sodium and hydrogen ions. The bulk of the membrane is a dry silicate layer in which all exchange sites are occupied by sodium ions. Most of the coordinated ions in both hydrated layers are free to diffuse into the surrounding solution whilst hydrogen ions in the test solution can diffuse in the opposite direction replacing bound sodium ions in a process called ion-exchange equilibrium. Any other types of cations present in the test solution are unable to bind to the exchange sites, thus ensuring the high specificity of the electrode. Note that hydrogen ions do not diffuse across the dry glass layer but sodium ions can.
Thus effectively the membrane consists of two hydrated layers containing different hydrogen ion activities separated by a sodium ion transport system. The principle of operation of the pH electrode is based upon the fact that if there is a gradient of hydrogen ion activity across the membrane this will generate a potential, the size of which is determined by the hydrogen ion gradient across the membrane. Moreover, since the hydrogen ion concentration on the inside is constant (due to the use of 0.1 M hydrochloric acid) the observed potential is directly dependent upon the hydrogen ion concentration of the test solution. In practice a small junction or asymmetry potential ($E^*$) is also created, in part as a result of linking the glass electrode to a reference electrode. The observed potential across the membrane is therefore given by the equation:

$$E = E^* + 0.059\,\mathrm{pH}$$

Since the precise composition of the porous membrane varies with time so too does the asymmetry potential. This contributes to the need for the frequent recalibration of the electrode, commonly using two standard buffers of known pH. For each 10-fold change in the hydrogen ion concentration across the membrane (equivalent to a pH change of 1 in the test solution) there will be a potential difference change of 59.2 mV across the membrane. The sensitivity of pH measurements is influenced by the prevailing absolute temperature. The most common forms of pH electrode are the glass electrode (Fig. 1.1a) and the combination electrode (Fig. 1.1b) which contains an in-built calomel reference electrode.

Fig. 1.1 Common pH electrodes: (a) glass electrode; (b) combination electrode. Labelled components: shielded insulated cable, glass stem, Ag/AgCl internal electrode (Ag/AgCl wire), HCl solution (0.1 M), thin-walled glass bulb (glass membrane), salt bridge solution (usually KCl), 'external' reference electrode, porous plug.
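The 59.2 mV response per pH unit, its dependence on absolute temperature, and the two-buffer calibration described above can be illustrated with a short sketch. The expression 2.303RT/F for the ideal slope is the standard Nernst factor and is supplied here as an assumption, as it is not written out in the text; the function names and the calibration readings are hypothetical, and the sign convention follows the equation $E = E^* + 0.059\,\mathrm{pH}$ used above.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def ideal_slope_mv(temp_celsius):
    """Ideal electrode response in mV per pH unit, 2.303RT/F (Nernst factor)."""
    temp_kelvin = temp_celsius + 273.15
    return 1000.0 * math.log(10) * R * temp_kelvin / F


def two_point_calibration(ph_1, mv_1, ph_2, mv_2):
    """Fit E = intercept + slope * pH through two standard buffers.

    The intercept absorbs the asymmetry potential E*.
    """
    slope = (mv_2 - mv_1) / (ph_2 - ph_1)
    intercept = mv_1 - slope * ph_1
    return slope, intercept


def ph_from_reading(mv, slope, intercept):
    """Convert a measured potential (mV) to pH using the calibration line."""
    return (mv - intercept) / slope


print(f"ideal slope at 25 °C: {ideal_slope_mv(25.0):.1f} mV per pH unit")
print(f"ideal slope at 37 °C: {ideal_slope_mv(37.0):.1f} mV per pH unit")

# Hypothetical readings for pH 4.00 and pH 7.00 standard buffers
slope, intercept = two_point_calibration(4.00, 100.0, 7.00, 277.5)
print(f"sample pH: {ph_from_reading(200.0, slope, intercept):.2f}")
```

Because the calibration line is fitted to the actual readings, it corrects for both the drifting asymmetry potential and any departure of the electrode from the ideal slope, which is the practical reason for recalibrating frequently with two standard buffers.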
1.3.5 Other electrodes

Electrodes exist for the measurement of many other ions such as $\mathrm{Li^+}$, $\mathrm{K^+}$, $\mathrm{Na^+}$, $\mathrm{Ca^{2+}}$, $\mathrm{Cl^-}$ and $\mathrm{NO_3^-}$ in addition to $\mathrm{H^+}$. The principle of operation of these ion-selective electrodes (ISEs) is very similar to that of the pH electrode in that permeable membranes specific for the ion to be measured are used. They lack absolute specificity and their selectivity is expressed by a selectivity coefficient that expresses the ratio of the response to the competing ions relative to that for the desired ion. Most ISEs have a good linear response to the desired ion and a fast response time. Biosensors are derived from ISEs by incorporating an immobilised enzyme onto the surface of the electrode. An important example is the glucose electrode that utilises glucose oxidase to oxidise glucose (Section 15.3.5) in the test sample to generate hydrogen peroxide that is oxidised at the anode, causing a current to flow that is then measured amperometrically. Microsensor versions of these electrodes are of great importance in clinical biochemistry laboratories (Section 16.2.2). The oxygen electrode measures molecular oxygen in solution rather than an ion. It works by reducing the oxygen at the platinum cathode that is separated from the test solution by an oxygen-permeable membrane. The electrons consumed in the process are compensated by the generation of electrons at the silver anode, hence the oxygen tension in the test sample is directly proportional to the current flow between the two electrodes. Optical sensors use the enzyme luciferase (Section 15.3.2) to measure ATP by generating light and detecting it with a photomultiplier.

1.4 QUANTITATIVE BIOCHEMICAL MEASUREMENTS

1.4.1 Analytical considerations and experimental error

Many biochemical investigations involve the quantitative determination of the concentration and/or amount of a particular component (the analyte) present in a test sample.
For example, in studies of the mode of action of enzymes, trans-membrane transport and cell signalling, a particular reactant or product is measured as a function of a range of experimental conditions and the data used to calculate kinetic or thermodynamic constants. These in turn are used to deduce details of the mechanism of the biological process taking place. Irrespective of the experimental rationale for undertaking such quantitative studies, all quantitative experimental data must first be questioned and validated in order to give credibility to the derived data and the conclusions that can be drawn from them. This is particularly important in the field of clinical biochemistry in which quantitative measurements on a patient's blood and urine samples are used to aid a clinical diagnosis and monitor the patient's recovery from a particular disease. This requires that the experimental data be assessed and confirmed as an acceptable estimate of the 'true' values by the application of one or more standard statistical tests. Evidence of the validation of quantitative data by the application of such tests is required by the editors of refereed journals for the acceptance for publication of draft research papers. The following sections will address the theoretical and practical considerations behind these statistical tests.

Selecting an analytical method

The nature of the quantitative analysis to be carried out will require a decision to be taken on the analytical technique to be employed. A variety of methods may be capable of achieving the desired analysis and the decision to select one may depend on a variety of issues.
These include:
• the availability of specific pieces of apparatus;
• the precision, accuracy and detection limits of the competing methods;
• the precision, accuracy and detection limit acceptable for the particular analysis;
• the number of other compounds present in the sample that may interfere with the analysis;
• the potential cost of the method (particularly important for repetitive analysis);
• the possible hazards inherent in the method and the appropriate precautions needed to minimise risk;
• the published literature method of choice;
• personal preference.

The most common biochemical quantitative analytical methods are visible, ultraviolet and fluorimetric spectrophotometry, chromatographic techniques such as HPLC and GC coupled to spectrophotometry or mass spectrometry, ion-selective electrodes and immunological methods such as ELISA. Once a method has been selected it must be developed and/or validated using the approaches discussed in the following sections. If it is to be used over a prolonged period of time, measures will need to be put in place to ensure that there is no drift in response. This normally entails an internal quality control approach using reference test samples covering the analytical range that are measured each time the method is applied to test samples. Any deviation from the known values for these reference samples will require the whole batch of test samples to be re-assayed.

The nature of experimental errors

Every quantitative measurement has some uncertainty associated with it. This uncertainty is referred to as the experimental error, which is a measure of the difference between the 'true' value and the experimental value. The 'true' value normally remains unknown except in cases where a standard sample (i.e. one of known composition) is being analysed. In other cases it has to be estimated from the analytical data by the methods that will be discussed later.
The consequence of the existence of experimental errors is that the measurements recorded can be accepted with a high, medium or low degree of confidence depending upon the sophistication of the technique employed, but seldom, if ever, with absolute certainty. Experimental error may be of two kinds: systematic error and random error.

Systematic error (also called determinate error)

Systematic errors are consistent errors that can be identified and either eliminated or reduced. They are most commonly caused by a fault or inherent limitation in the apparatus being used but may also be influenced by poor experimental design. Common causes include the misuse of manual or automatic pipettes, the incorrect preparation of stock solutions, and the incorrect calibration and use of pH meters. They may be constant (i.e. have a fixed value irrespective of the amount of test analyte present in the test sample under investigation) or proportional (i.e. the size of the error is dependent upon the amount of test analyte present). Thus the overall effect of the two types in a given experimental result will differ. Both of these types of systematic error have three common causes:
• Analyst error: This is best minimised by good training and/or by the automation of the method.
• Instrument error: This may not be eliminable and hence alternative methods should be considered. Instrument error may be electronic in origin or may be linked to the matrix of the sample.
• Method error: This can be identified by comparison of the experimental data with that obtained by the use of alternative methods.

Identification of systematic errors

Systematic errors are always reproducible and may be positive or negative, i.e. they increase or decrease the experimental value relative to the 'true' value. The crucial characteristic, however, is that their cause can be identified and corrected.
There are four common means of identifying this type of error:
• Use of a 'blank' sample: This is a sample that you know contains none of the analyte under test so that if the method gives a non-zero answer then it must be responding in some unintended way. The use of blank samples is difficult in cases where the matrix of the test sample is complex, for example, serum.
• Use of a standard reference sample: This is a sample of the test analyte of known composition so the method under evaluation must reproduce the known answer.
• Use of an alternative method: If the test and alternative methods give different results for a given test sample then at least one of the methods must have an inbuilt flaw.
• Use of an external quality assessment sample: This is a standard reference sample that is analysed by other investigators based in different laboratories employing the same or different methods. Their results are compared and any differences in excess of random errors (see below) identify the systematic error for each analyst. The use of external quality assessment schemes is standard practice in clinical biochemistry laboratories (see Section 16.2.3).

Random error (also called indeterminate error)

Random errors are caused by unpredictable and often uncontrollable inaccuracies in the various manipulations involved in the method. Such errors may be variably positive or negative and are caused by such factors as difficulty in the process of sampling, random electrical 'noise' in an instrument or by the analyst being inconsistent in the operation of the instrument or in recording readings from it.

Standard operating procedures

The minimisation of both systematic and random errors is essential in cases where the analytical data are used as the basis for a crucial diagnostic or prognostic decision as is common, for example, in routine clinical biochemical investigations and in the development of new drugs.
In such cases it is normal for the analyses to be conducted in accordance with standard operating procedures (SOPs) that define in full detail the quality of the reagents, the preparation of standard solutions, the calibration of instruments and the methodology of the actual analytical procedure which must be followed.

1.4.2 Assessment of the performance of an analytical method

All analytical methods can be characterised by a number of performance indicators that define how the selected method performs under specified conditions. Knowledge of these performance indicators allows the analyst to decide whether or not the method is acceptable for the particular application. The major performance indicators are:
• Precision (also called imprecision and variability): This is a measure of the reproducibility of a particular set of analytical measurements on the same sample of test analyte. If the replicated values agree closely with each other, the measurements are said to be of high precision (or low imprecision). In contrast, if the values diverge, the measurements are said to be of poor or low precision (or high imprecision). In analytical biochemical work the normal aim is to develop a method that has as high a precision as possible within the general objectives of the investigation. However, precision commonly varies over the analytical range (see below) and over periods of time. As a consequence, precision may be expressed as either within-batch or between-batch. Within-batch precision is the variability when the same test sample is analysed repeatedly during the same batch of analyses on the same day. Between-batch precision is the variability when the same test sample is analysed repeatedly during different batches of analyses over a period of time. Since there is more opportunity for the analytical conditions to change for the assessment of between-batch precision, between-batch variability is normally the higher of the two.
Results that are of high precision may nevertheless be a poor estimate of the 'true' value (i.e. of low accuracy or high bias) because of the presence of unidentified errors. Methods for the assessment of precision of a data set are discussed below. The term imprecision is preferred in particular by clinical biochemists since they believe that it best describes the variability that occurs in replicated analyses.
• Accuracy (also called trueness, bias and inaccuracy): This is the difference between the mean of a set of analytical measurements on the same sample of test analyte and the 'true' value for the test sample. As previously pointed out, the 'true' value is normally unknown except in the case of standard measurements. In other cases accuracy has to be assessed indirectly by use of an internationally agreed reference method and/or by the use of external quality assessment schemes (see above) and/or by the use of population statistics that are discussed below.
• Detection limit (also called sensitivity): This is the smallest concentration of the test analyte that can be distinguished from zero with a defined degree of confidence. Concentrations below this limit should simply be reported as 'less than the detection limit'. All methods have their individual detection limits for a given analyte and this may be one of the factors that influence the choice of a specific analytical method for a given study. For example, the Bradford, Lowry and bicinchoninic acid methods for the measurement of proteins have detection limits of 20, 10 and 0.5 µg protein cm⁻³ respectively. In clinical biochemical measurements, sensitivity is often defined as the ability of the method to detect the analyte without giving false negatives (see Section 16.1.2).
• Analytical range: This is the range of concentrations of the test analyte that can be measured reproducibly, the lower end of the range being the detection limit.
In most cases the analytical range is defined by an appropriate calibration curve (see Section 1.4.6). As previously pointed out, the precision of the method may vary across the range.
• Analytical specificity (also called selectivity): This is a measure of the extent to which other substances that may be present in the sample of test analyte may interfere with the analysis and therefore lead to a falsely high or low value. A simple example is the ability of a method to measure glucose in the presence of other hexoses such as mannose and galactose. In clinical biochemical measurements, selectivity is an index of the ability of the method to give a consistent negative result for known negatives (see Section 16.1.2).
• Analytical sensitivity: This is a measure of the change in response of the method to a defined change in the quantity of analyte present. In many cases analytical sensitivity is expressed as the slope of a linear calibration curve.
• Robustness: This is a measure of the ability of the method to give a consistent result in spite of small changes in experimental parameters such as pH, temperature and amount of reagents added. For routine analysis, the robustness of a method is an important practical consideration.

These performance indicators are established by the use of well-characterised test and reference analyte samples. The order in which they are evaluated will depend on the immediate analytical priorities, but initially the three most important may be specificity, detection limit and analytical range. Once a method is in routine use, the question of assuring the quality of analytical data by the implementation of quality assessment procedures comes into play.
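As noted in the list above, analytical sensitivity is often expressed as the slope of a linear calibration curve. The sketch below shows one way this slope might be obtained by an ordinary least-squares fit. It is an illustration only: the function names are hypothetical, and the calibration standards are invented values chosen to lie on a straight line, not experimental data.

```python
def linear_calibration(concentrations, responses):
    """Least-squares fit of response = intercept + slope * concentration.

    The slope is the analytical sensitivity (response per unit concentration).
    """
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, responses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_response(response, slope, intercept):
    """Read an unknown off the calibration line (valid only within the analytical range)."""
    return (response - intercept) / slope


# Hypothetical standards: concentration (mM) against absorbance
standards_mm = [0.0, 2.0, 4.0, 6.0, 8.0]
absorbance = [0.02, 0.22, 0.42, 0.62, 0.82]

slope, intercept = linear_calibration(standards_mm, absorbance)
print(f"analytical sensitivity: {slope:.3f} absorbance units per mM")
print(f"blank response (intercept): {intercept:.3f}")
print(f"unknown at A = 0.50: {concentration_from_response(0.50, slope, intercept):.2f} mM")
```

A non-zero intercept of this kind corresponds to the response of a 'blank' sample, and an unknown should only be interpolated between the lowest and highest standards, i.e. within the analytical range.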
1.4.3 Assessment of precision

After a quantitative study has been completed and an experimental value for the amount and/or concentration of the test analyte in the test sample obtained, the experimenter must ask the question 'How confident can I be that my result is an acceptable estimate of the "true" value?' (i.e. is it accurate?). An additional question may be 'Is the quality of my analytical data comparable with that in the published scientific literature for the particular analytical method?' (i.e. is it precise?). Once the answers to such questions are known, a result that has a high probability of being correct can be accepted and used as a basis for the design of further studies whilst a result that is subject to unacceptable error can be rejected. Unfortunately it is not possible to assess the precision of a single quantitative determination. Rather, it is necessary to carry out analyses in replicate (i.e. the experiment is repeated several times on the same sample of test analyte) and to subject the resulting data set to some basic statistical tests.

If a particular experimental determination is repeated numerous times and a graph constructed of the number of times a particular result occurs against its value, it is normally bell-shaped with the results clustering symmetrically about a mean value. This type of distribution is called a Gaussian or normal distribution. In such cases the precision of the data set is a reflection of random error. However, if the plot is skewed to one side of the mean value, then systematic errors have not been eliminated. Assuming that the data set is of the normal distribution type, there are three statistical parameters that can be used to quantify precision.
Standard deviation, coefficient of variation and variance – measures of precision

These three statistical terms are alternative ways of expressing the scatter of the values within a data set about the mean, $\bar{x}$, calculated by summing their total value and dividing by the number of individual values. Each term has its individual merit. In all three cases the term is actually measuring the width of the normal distribution curve such that the narrower the curve the smaller the value of the term and the higher the precision of the analytical data set.

The standard deviation ($s$) of a data set is a measure of the variability of the population from which the data set was drawn. It is calculated by use of equation 1.10 or 1.11:

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} \tag{1.10}$$

$$s = \sqrt{\frac{\sum x_i^2 - \left(\sum x_i\right)^2 / n}{n - 1}} \tag{1.11}$$

$(x_i - \bar{x})$ is the difference between an individual experimental value ($x_i$) and the calculated mean $\bar{x}$ of the individual values. Since these differences may be positive or negative, and since the distribution of experimental values about the mean is symmetrical, if they were simply added together they would cancel out each other. The differences are therefore squared to give consistent positive values. To compensate for this, the square root of the resulting calculation has to be taken to obtain the standard deviation. Standard deviation has the same units as the actual measurements and this is one of its attractions. The mathematical nature of a normal distribution curve is such that 68.3% of the area under the curve (and hence 68.3% of the individual values within the data set) is within one standard deviation either side of the mean, 95.4% of the area under the curve is within two standard deviations and 99.7% within three standard deviations. Exactly 95% of the area under the curve falls within 1.96 standard deviations either side of the mean.
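Equations 1.10 and 1.11 are algebraically equivalent, and the quoted areas under the normal curve can be checked directly from the error function. The sketch below illustrates both points; the function names are hypothetical and the replicate values are invented for illustration, not experimental data.

```python
import math


def mean(values):
    """Arithmetic mean: the sum of the values divided by their number."""
    return sum(values) / len(values)


def sd_eq_1_10(values):
    """Equation 1.10: square root of the summed squared deviations over (n - 1)."""
    n = len(values)
    x_bar = mean(values)
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


def sd_eq_1_11(values):
    """Equation 1.11: the same quantity from the sum of x and the sum of x squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))


def normal_coverage(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


# Hypothetical replicate measurements (mM)
replicates = [2.0, 2.1, 1.9, 2.2, 1.8]
print(f"mean = {mean(replicates):.2f}")
print(f"s (eq. 1.10) = {sd_eq_1_10(replicates):.4f}")
print(f"s (eq. 1.11) = {sd_eq_1_11(replicates):.4f}")

for k in (1.0, 1.96, 2.0, 3.0):
    print(f"within {k} SD of the mean: {100 * normal_coverage(k):.1f}%")
```

Equation 1.11 is convenient for hand calculation because it needs only running totals of $x_i$ and $x_i^2$, but with values that are large relative to their spread it is more prone to rounding error than equation 1.10, so the first form is usually the safer choice in code.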
The precision (or imprecision) of a data set is commonly expressed as ±1 SD of the mean. The term $(n - 1)$ is called the degrees of freedom of the data set and is an important variable. The initial number of degrees of freedom possessed by a data set is equal to the number of results ($n$) in the set. However, when another quantity characterising the data set, such as the mean or standard deviation, is calculated, the number of degrees of freedom of the set is reduced by 1, and by 1 again for each new derivation made.

Many modern calculators and computers include programs for the calculation of standard deviation. However, some use a variant of equation 1.10 in which $n$ rather than $n - 1$ is the denominator. If $n$ is large, greater than 30 for example, then the difference between the two calculations is small, but if $n$ is small, and certainly if it is less than 10, the use of $n$ rather than $n - 1$ will significantly underestimate the standard deviation. This may lead to false conclusions being drawn about the precision of the data set. Thus for most analytical biochemical studies it is imperative that the calculation of standard deviation is based on the use of $n - 1$.

The coefficient of variation (CV) (also known as relative standard deviation) of a data set is the standard deviation expressed as a percentage of the mean, as shown in equation 1.12:

$$ CV = \frac{s \times 100\%}{\bar{x}} \qquad (1.12) $$

Since the mean and standard deviation have the same units, coefficient of variation is simply a percentage. This independence of the unit of measurement allows methods based on different units to be compared.

The variance of a data set is the mean of the squares of the differences between each value and the mean of the values (calculated with $n - 1$ as the divisor). It is also the square of the standard deviation, hence the symbol $s^2$.
Variance has units that are the square of the original units, which makes it rather cumbersome and explains why standard deviation and coefficient of variation are the preferred ways of expressing the variability of data sets. The importance of variance will be evident in later discussions of the ways of making a statistical comparison of two data sets.

To appreciate the relative merits of standard deviation and coefficient of variation as measures of precision, consider the following scenario. Suppose that two serum samples, A and B, were each analysed 20 times for serum glucose by the glucose oxidase method (see Section 15.3.5) such that sample A gave a mean value of 2.00 mM with a standard deviation of 0.10 mM and sample B a mean of 8.00 mM and a standard deviation of 0.41 mM. On the basis of the standard deviation values it might be concluded that the method had given a better precision for sample A than for B. However, this ignores the magnitude of the two mean values. If this is taken into account by calculating the coefficient of variation, the two values are 5.0% and 5.1% respectively, showing that the method had achieved essentially the same precision for both samples. This illustrates the fact that standard deviation is an acceptable assessment of precision for a given data set, but if it is necessary to compare the precision of two or more data sets, particularly ones with different mean values, then coefficient of variation should be used. The majority of well-developed analytical methods have a coefficient of variation within the analytical range of less than 5% and many, especially automated methods, of less than 2%.

1.4.4 Assessment of accuracy

Population statistics

Whilst standard deviation and coefficient of variation give a measure of the variability of the data set, they do not quantify how well the mean of the data set approaches the ‘true’ value.
To address this issue it is necessary to introduce the concepts of population statistics, confidence limits and confidence intervals. If a data set is made up of a very large number of individual values, so that $n$ is a large number, then the mean of the set would be equal to the population mean mu ($\mu$) and the standard deviation would equal the population standard deviation sigma ($\sigma$). Note that Greek letters represent the population parameters and Roman letters the sample parameters. These two population parameters are the best estimates of the ‘true’ values since they are based on the largest number of individual measurements, so that the influence of random errors is minimised. In practice the population parameters are seldom measured, for obvious reasons of practicality, and the sample parameters have a larger uncertainty associated with them. The uncertainty with which the sample mean estimates the population mean decreases in proportion to the reciprocal of the square root of the number of values in the data set, i.e. $1/\sqrt{n}$. Thus to decrease the uncertainty by a factor of two the number of experimental values would have to be increased four-fold, and for a factor of 10 the number of measurements would need to be increased 100-fold. The nature of this relationship again emphasises the importance of evaluating the acceptable degree of uncertainty of the experimental result before the design of the experiment is completed and the practical analysis begun. Modern automated analytical instruments recognise the importance of multiple results by facilitating repeat analyses at maximum speed.

Example 4 ASSESSMENT OF THE PRECISION OF AN ANALYTICAL DATA SET

Question
Five measurements of the fasting serum glucose concentration were made on the same sample taken from a diabetic patient. The values obtained were 2.3, 2.5, 2.2, 2.6 and 2.5 mM. Calculate the precision of the data set.

Answer
Precision is normally expressed either as one standard deviation of the mean or as the coefficient of variation of the mean. These statistical parameters therefore need to be calculated.

Mean

$$ \bar{x} = \frac{2.2 + 2.3 + 2.5 + 2.5 + 2.6}{5} = 2.42 \text{ mM} $$

Standard deviation
Using both equations (1.10) and (1.11) to calculate the value of $s$:

x_i        x_i − x̄     (x_i − x̄)²    x_i²
2.2        −0.22        0.0484         4.84
2.3        −0.12        0.0144         5.29
2.5        +0.08        0.0064         6.25
2.5        +0.08        0.0064         6.25
2.6        +0.18        0.0324         6.76
Σ 12.1     Σ 0.00       Σ 0.1080       Σ 29.39

Using equation 1.10

$$ s = \sqrt{0.108 / 4} = 0.164 \text{ mM} $$

Using equation 1.11

$$ s = \sqrt{\frac{29.39 - (12.1)^2 / 5}{4}} = \sqrt{\frac{29.39 - 29.28}{4}} = 0.166 \text{ mM} $$

The two equations are algebraically equivalent; the small difference between the two results arises because $(12.1)^2/5 = 29.282$ has been rounded to 29.28 before the subtraction.

Coefficient of variation
Using equation 1.12 with $s = 0.165$ mM, the value midway between the two results:

$$ CV = \frac{0.165 \times 100\%}{2.42} = 6.82\% $$

Discussion
In this case it is easier to appreciate the precision of the data set by considering the coefficient of variation. The value 6.82% is moderately high for this type of analysis. Automation of the method would certainly reduce it by at least half. Note that it is legitimate to quote the answers to these calculations to one more digit than was present in the original data set. In practice, it is advisable to carry out the statistical analysis on a far larger data set than that presented in this example.
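The arithmetic of Example 4 and the $1/\sqrt{n}$ relationship described above can be checked with a short script. This is a sketch using only Python's standard library; the function names `summarise` and `uncertainty_ratio` are invented here for illustration and do not come from the text. It also computes the standard deviation with $n$ as the divisor, to show how far that variant underestimates the scatter in a set of only five values.

```python
import math
import statistics

# Fasting serum glucose replicates from Example 4 (mM).
glucose_mM = [2.3, 2.5, 2.2, 2.6, 2.5]


def summarise(values):
    """Return mean, SD with n - 1 divisor, SD with n divisor, and CV in per cent."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)     # n - 1 divisor, as in equation 1.10
    s_n = statistics.pstdev(values)  # n divisor, the variant the text warns against
    cv = 100 * s / mean              # equation 1.12
    return mean, s, s_n, cv


def uncertainty_ratio(n_old, n_new):
    """Factor by which the uncertainty of the mean changes when n_old becomes n_new."""
    return math.sqrt(n_old / n_new)


mean, s, s_n, cv = summarise(glucose_mM)
print(f"mean = {mean:.2f} mM, s = {s:.3f} mM, s with n divisor = {s_n:.3f} mM")
print(f"CV = {cv:.2f}%")

# Four times as many values halves the uncertainty; 100 times as many cuts it ten-fold.
print(uncertainty_ratio(5, 20), uncertainty_ratio(5, 500))
```

Because no intermediate value is rounded, the script gives a coefficient of variation of about 6.8%, marginally below the 6.82% obtained in the example from the rounded standard deviation of 0.165 mM.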
It is good practice to report the number of measurements on which the mean and standard deviation are based, as this gives a clear indication of the quality of the calculated data.

Confidence intervals, confidence limits and the Student’s t factor

Accepting that the population mean is the best estimate of the ‘true’ value, the question arises ‘How can I relate my experimental sample mean to the population mean?’ The answer is by using the concept of confidence. Confidence level expresses, as a percentage, the degree of confidence that can be attached to the data. Its value has to be set by the experimenter to achieve the objectives of the study. Confidence interval is a mathematical statement relating the sample mean to the population mean. A confidence interval gives a range of values about the sample mean within which there is a given probability (determined by the confidence level) that the population mean lies. The relationship between the two means is expressed in terms of the standard deviation of the data set, the square root of the number of values in the data set and a factor known as Student’s t (equation 1.13):

$$ \mu = \bar{x} \pm \frac{t s}{\sqrt{n}} \qquad (1.13) $$

where $\bar{x}$ is the measured mean, $\mu$ is the population mean, $s$ is the measured standard deviation, $n$ is the number of measurements and $t$ is the Student’s t factor. The term $s/\sqrt{n}$ is known as the standard error of the mean and is a measure of the precision of the sample mean. Unlike standard deviation, standard error depends on the sample size and will fall as the sample size increases. The two measurements are sometimes confused, but in essence, standard deviation should be used if we want to know how widely scattered the measurements are, and standard error should be used if we want to indicate the uncertainty around a mean measurement.

Confidence level can be set at any value up to 100%. For example, it may be that a confidence level of only 50% would be acceptable for a particular experiment.
However, a 50% level means that there is a one in two chance that the sample mean is not an acceptable estimate of the population mean. In contrast, the choice of a 95% or 99% confidence level would mean that there was only a one in 20 or a one in 100 chance respectively that the best estimate had not been achieved. In practice, most analytical biochemists choose a confidence level in the range 90–99% and most commonly 95%.

Student’s t is a way of linking probability with the size of the data set and is used in a number of statistical tests. Student’s t values for varying numbers in a data set (and hence with the varying degrees of freedom) at selected confidence levels are available in statistical tables. Some values are shown in Table 1.7.

Table 1.7 Values of Student’s t

Degrees of     Confidence level (%)
freedom        50       90       95       98       99       99.9
2              0.816    2.920    4.303    6.965    9.925    31.598
3              0.765    2.353    3.182    4.541    5.841    12.924
4              0.741    2.132    2.776    3.747    4.604    8.610
5              0.727    2.015    2.571    3.365    4.032    6.869
6              0.718    1.943    2.447    3.143    3.707    5.959
7              0.711    1.895    2.365    2.998    3.500    5.408
8              0.706    1.860    2.306    2.896    3.355    5.041
9              0.703    1.833    2.262    2.821    3.250    4.798
10             0.700    1.812    2.228    2.764    3.169    4.587
15             0.691    1.753    2.131    2.602    2.947    4.073
20             0.687    1.725    2.086    2.528    2.845    3.850
30             0.683    1.697    2.042    2.457    2.750    3.646

The numerical value of t is equal to the number of standard errors of the mean that must be added to and subtracted from the mean to give the confidence interval at a given confidence level. Note that as the sample size (and hence the degrees of freedom) increases, the value of t at each confidence level falls towards a limiting value. When $n$ is large and we wish to calculate the 95% confidence interval, the value of t approximates to 1.96, and some texts quote equation 1.13 in this form.

The term Student’s t factor may give the impression that it was devised specifically with students’ needs in mind. In fact ‘Student’ was the pseudonym of a statistician, by the name of W. S.
Gosset, who in 1908 first devised the term and who was not permitted by his employer to publish his work under his own name.

Criteria for the rejection of outlier experimental data – Q-test

A very common problem in quantitative biochemical analysis is the need to decide whether or not a particular result is an outlier and should therefore be rejected before the remainder of the data set is subjected to statistical analysis. It is important to identify such data as they have a disproportionate effect on the calculation of the mean and standard deviation of the data set. When faced with this probl
