Genetics and Evolutionary Bio PDF

After Week 2 you should:
Understand how a sex-linked trait ended up corroborating the Chromosomal Theory of Heredity
Understand the implications of genetic linkage on autosomes and sex chromosomes for Mendelian inheritance
Understand the mechanism underlying non-disjunction of chromosomes, and its implications
Appreciate the diversity of sex determination mechanisms in nature
Understand the importance of sex determination genes, and the implications of their mutation
Know the reasons for, and mechanisms of, X inactivation, and how this results in mosaicism in mammalian females
Understand why X chromosome aneuploidies are less severe than autosome aneuploidies
Understand some of the alternative dosage compensation mechanisms that occur in non-mammalian animals
Be able to draw and interpret pedigrees to infer the mode of inheritance of human Mendelian traits

Module 2: Wks 2-3

Lecture 1: Central Dogma
DNA, RNA and protein
Replication, transcription and translation
Central dogma = the sequence of events (DNA → RNA → protein)
Gene expression = these steps need to happen for a gene to be expressed
Genetic material must perform three essential functions:
1. The genotypic function/replication → the genetic material must be copied and passed from each cell to its daughter cells
2. The phenotypic function/gene expression → genes control the phenotype, e.g. shape and colour
3. The evolutionary function/mutation
What's a gene?
The definition has changed through time as different scientists developed new theories
Modern definition = a region of DNA that encodes at least one transcript and/or at least one polypeptide
Chromosome composition
○ Contains proteins and nucleic acids (DNA and RNA)

Lecture 2: Which component is the genetic material?
Four major experiments:
1. Griffith, Sia, Dawson - the transforming principle
The principle was able to transform a non-pathogenic strain of bacteria (Streptococcus pneumoniae) into a pathogenic strain
2. Avery, MacLeod, McCarty - DNA is the transforming agent
Also used Streptococcus pneumoniae
DNA is the substance that causes bacterial transformation
3. Hershey and Chase - DNA is the genetic material
Bacteriophage - viruses that infect bacteria
Concluded that DNA, not protein, is the genetic material
4. Fraenkel-Conrat - RNA can also act as genetic material
[Figures: prokaryotic gene; eukaryotic gene]

Lecture 3: Structure of DNA
Nucleotides: phosphate group, 5-carbon sugar, nitrogenous base
DNA bases → pyrimidines C and T, purines G and A
RNA bases → pyrimidines C and U, purines G and A
If an organism's genome contains 27% adenine:
○ A = T, therefore A = 27% and T = 27%
○ 27 + 27 = 54% A + T
○ 100 - 54 = 46% G + C
○ 46/2 = 23% G and 23% C
DNA structure:
Base pairs are stacked
Two grooves (major and minor)
Right-handed double helix (B-DNA)
DNA supercoiling - introduced when one or both strands are cleaved and the complementary strands at one end are rotated or twisted around each other while the other end is held fixed in space, and thus not allowed to spin
Supercoiling causes the DNA molecule to collapse into a tightly coiled structure
○ Supercoils are introduced into and removed from DNA molecules by enzymes that play essential roles in DNA replication
The DNA double helix has a hydrophobic core and a hydrophilic exterior
Most DNA-binding proteins interact with DNA at the major groove

Lecture 4: Eukaryotic chromosome structure
Eukaryote chromosome structure:
1000 mm DNA per haploid chromosome set = 2000 mm per diploid cell
Chromosomes contain: DNA, histones and non-histone proteins
Mass of histones ≈ mass of DNA
Histones:
5 major histone proteins → H1, H2a, H2b, H3, H4
Basic proteins with a positive charge; DNA is negatively charged
Lysine (H1) and arginine (H3, H4) are abundant
Nucleosomes:
Linker DNA between nucleosomes
Complete nucleosome = core + H1
2 full turns of DNA superhelix
Structure of the nucleosome core:
The structure of the helix is important to the packing of the DNA
The position of the major and minor grooves also helps in the proper coiling of DNA around the core
Core nucleosome - octamer of histones with 146 bp DNA
Complete nucleosome = core + histone H1; 166 bp DNA; 11 nm diameter
'Beads on a string' - nucleosomes forming 11 nm fibres
DNA structure - 30 nm fibres:
30 nm diameter chromatin fibres
Nucleosomes packed together
Structure is quite variable and depends on the procedures used to isolate the fibres
The 2 most popular models are the solenoid and zigzag models
Scaffolding - non-histone proteins:
Packages the 30 nm fibres
Made up of non-histone proteins
Absence of any apparent ends of DNA molecules → one giant DNA molecule per chromosome
Summary of chromosome structure and packaging:
At least three levels of condensation are required to package the ~2 m of DNA in a diploid eukaryotic nucleus into metaphase chromosomes a few μm long
Stage 1: 11 nm diameter nucleosomes formed. This involves negative supercoiling of DNA around an octamer of histone molecules (two each of histones H2a, H2b, H3 and H4)
Stage 2: Core nucleosomes further packaged with histone H1 and condensed into 30 nm chromatin fibres
Stage 3: Scaffolding with non-histone proteins and further condensation leads to the metaphase chromosome structure
Centromeres:
Centre of the metaphase chromosome
Functional centromeres are key to the transition from metaphase to anaphase
Attachment points for spindle fibres
Non-disjunction happens if centromeres do not function properly
Kinetochore - large protein complex at the centromere that attaches to spindle fibres
Long tandem arrays - common DNA sequences found in centromeres, e.g. centromeres of human chromosomes contain 5,000 to 15,000 copies of a 171 bp sequence called the alpha satellite sequence
Telomeres: ends of the chromosome
Functions:
○ Protect the ends of the chromosomes
○ Prevent fusion of chromosomes
○ Facilitate replication of the ends of DNA
Contain repetitive sequences, e.g. TTAGGG in many vertebrates
Repeat tracts in somatic cells get shorter with age
Telomeres of germ-line cells and cancer cells do not shorten
Structure:
○ Form a t-loop structure, which fulfils those functions
○ Complementary repeat sequences base-pair
○ Strand displacement forms the t-loop; the single-stranded DNA is protected by POT-1 protein
○ Additional complexes (TRF) form 'clamps'

Lecture 5a: DNA replication
DNA replication (basic concepts):
DNA replication is semi-conservative
Replication starts at the origin of replication
DNA polymerases catalyse DNA replication
Replication occurs at the replication fork
DNA replication is semi-conservative:
○ Each strand of the original DNA molecule is used as the template to synthesise a new strand of DNA
○ Each new strand is complementary to the original parent strand
○ New strand = daughter strand
○ This mechanism of DNA replication was first shown in E. coli by Meselson and Stahl
DNA replication - DNA polymerases:
Catalysed by enzymes called DNA polymerases
Requires:
○ Substrate = dNTPs (deoxyribonucleoside triphosphates - dTTP, dATP, dCTP and dGTP)
○ Template DNA to specify the sequence of the new strand
○ Primer with a free 3'-OH
○ Mg2+ = cofactor
Can proofread its own work:
○ Proofreading occurs in the 3' → 5' direction (opposite to DNA synthesis)
○ Removes incorrectly incorporated nucleotides

Lecture 5b: Eukaryotic DNA replication
Features that differ from prokaryotes:
Shorter RNA primers and Okazaki fragments
Replication only during S phase
Multiple origins of replication
Nucleosomes
Telomeres
Multiple origins: [figure]
Replication proteins (E. coli equivalent in brackets):
DNA polymerase α - DNA primase complex: initiation of replication; priming of Okazaki fragments (primase)
DNA polymerase δ, PCNA (proliferating cell nuclear antigen) and replication factor C: DNA synthesis (DNA Pol III)
Ribonuclease H1 and ribonuclease FEN-1: removal of RNA primers (DNA Pol I)
DNA polymerase ε: involved in DNA repair
Nucleosomes:
Nucleosomes appear to have the same structure and spacing immediately behind a replication fork (post-replicative DNA) as they do in front of a replication fork (pre-replicative DNA)
This suggests that nucleosomes must be disassembled to let the replisome duplicate the DNA packaged in them, and then be quickly reassembled
DNA replication and nucleosome assembly must be tightly coupled
Telomeres:
No free 3'-OH remains once the RNA primer at the end of the chromosome is removed → a unique structure or a special enzyme is needed to solve this problem
Telomerases:
Facilitate the replication of the ends of eukaryotic chromosomes
Telomeres contain a unique DNA repeat, TTAGGG, which is recognised by telomerase
The enzyme contains a bound RNA template with the complementary sequence (CCCUAA) → it can then synthesise an extra stretch of DNA (TTAGGG), which allows an RNA primer to bind and DNA polymerase to come in and synthesise DNA
Telomeres - aging:
Most human somatic cells lack telomerase activity
Short telomeres are associated with cellular senescence and death
Diseases with premature aging, and many age-related diseases, are associated with short telomeres
Telomeres - cancer:
In many cancer cells the genes encoding telomerase are over-expressed, therefore the telomeres do not shorten with age

Lecture 6a: Transcription
Makes an RNA molecule using a DNA template
Enzymes that make RNA = RNA polymerases (DNA-dependent RNA polymerases)
Transcription and translation - prokaryotes:
The primary transcript is equivalent to the mRNA molecule
Occurs in the cytoplasm
Transcription and translation are often coupled
Transcription and translation - eukaryotes:
Transcription in the nucleus
Translation in the cytoplasm
Genes contain coding exon and non-coding intron regions
The introns need to be removed (splicing)
Transcription:
General features of RNA synthesis:
Synthesis is in the 5' → 3' direction
RNA polymerase forms phosphodiester bonds between the 5' phosphate and the 3' OH
The precursors are ribonucleoside triphosphates (rNTPs); sugar = ribose
Uracil replaces thymine
Only one strand of DNA is used as a template
RNA synthesis can be initiated de novo (no primer required)
Coding vs non-coding strands: [figure]
Transcription bubble:
RNA polymerase unwinds a small section of DNA to allow RNA synthesis
Allows a few nucleotides in the template strand to base-pair with the growing end of the RNA chain
RNA synthesis is governed by the same base-pairing rules as DNA synthesis, except that A pairs with U instead of T
Transcription in prokaryotes:
Initiation:
The start of transcription
Occurs at specific sequences called promoters
Promoters contain specific sequences (the -35 and -10 elements) → these are upstream of the transcription start point (tsp)
Coding region is downstream (3') of the tsp
-35 is the initial binding site for a protein called a sigma (σ) factor
-10 is involved in unwinding
Positions are numbered relative to the base at which RNA synthesis actually starts
Elongation:
Begins at the tsp
Comprises the coding region
Transcription bubble formed
RNA polymerase synthesises RNA 5' → 3'
RNA polymerase moves along the template (non-coding) strand; the base-paired RNA-DNA hybrid region is short
Termination:
Stopping of transcription
Occurs when RNA polymerase encounters a specific sequence
Two types: Rho-dependent and Rho-independent
Rho-dependent termination:
Signal in the nascent mRNA called the rut sequence
Binds the Rho hexamer
Results in a stem loop forming
'Knocks' RNA polymerase off and terminates transcription
Rho-independent termination:
Stem loop forms in the nascent mRNA
G-C rich sequence followed by six A-T bp
'Knocks' RNA polymerase off and terminates transcription
Transcription and translation coupled in prokaryotes:
Since mRNA molecules are synthesized, translated and degraded in the 5' to 3' direction, all three processes can occur simultaneously on the same RNA molecule

Lecture 6b: Eukaryotic Transcription
Multiple RNA polymerases:
RNA polymerase I: Synthesizes all but one rRNA.
Located in the nucleolus (a structure within the nucleus)
RNA polymerase II: Responsible for the expression of genes that encode long transcripts, many of which are translated into protein
RNA polymerase III: Synthesizes many small RNAs such as the tRNAs, 5S rRNA and siRNA (a more recent discovery), important for gene regulation
Elongation:
Same as in prokaryotes, except for a few differences in the transcript that is produced
7-MG cap:
A guanosine that is methylated (CH3) at the 7 position
Added to the 5' end of the transcript when the RNA is ~30 bases long
Protects mRNA transcripts from degradation by ribonucleases
Recognised by factors that initiate translation
polyA tail:
Long run of adenine residues
Added to the 3' end of the transcript
Signalled by AAUAAA and a downstream GU-rich sequence
The transcript is cleaved by an endonuclease
Poly(A) polymerase adds up to 200 A residues
Protects the mRNA transcript
Involved in transport to the cytoplasm
Introns and splicing:
Most eukaryotic genes contain noncoding sequences called introns
These interrupt the coding sequences, called exons
Introns need to be excised from the RNA transcripts prior to their transport to the cytoplasm for translation (splicing); while introns are present the transcript is called pre-mRNA
Once the introns have been spliced out, this is the final mRNA transcript for translation
Exons are composed of the sequences that remain in the mature mRNA after splicing - these can include noncoding sequences, e.g. initiation and termination signals
Introns differ in size

Lecture 7a: Translation - the basics
Translation:
From RNA to protein - going from a nucleic acid language (RNA) to an amino acid language (protein) - one language translated to another
Codons:
Codon - a three-nucleotide sequence (triplet) that specifies a certain amino acid
The genetic code contains start and stop codons
Types of RNA:
Messenger RNA (mRNA) - the molecule that is translated into protein
Transfer RNA (tRNA) - recognises a specific codon; brings a specific amino acid to the ribosome.
1-4 tRNA molecules for each amino acid
Ribosomal RNA (rRNA) - part of the ribosome
The ribosome:
Large, highly complex multisubunit molecular machine
Made up of RNA and proteins
50S and 30S subunits make up the complete 70S prokaryotic ribosome
rRNA is encoded as a single transcript
Processed by nucleases into functional units
Initiation:
Formation of the translation complex occurs in a specific order: the 30S subunit binds the mRNA, then the 50S subunit binds
Formation of the 30S subunit/mRNA complex depends on base-pairing between a specific nucleotide sequence at the 3' end of the 16S rRNA and a sequence near the 5' end of the mRNA molecule to be translated
The mRNA sequence is called the Shine-Dalgarno sequence (AGGAGG)
About seven nucleotides upstream of the AUG start codon
Once the 16S rRNA and small ribosomal subunit have bound, the initiator tRNA and the large ribosomal subunit bind
This forms the 'translation complex' (the complete ribosome)
Protein synthesis (translation) can then begin
The translation complex:
Ribosomes have three tRNA binding sites
1. Aminoacyl (A) site binds the incoming aminoacyl-tRNA
2. Peptidyl (P) site binds the tRNA to which the growing polypeptide is attached
3. Exit (E) site binds the departing uncharged tRNA
The mRNA molecule is attached to the 30S subunit
The tRNA-binding sites are located largely on the 50S subunit
In the structure shown in the lecture, tRNAs occupy the P and A sites and the E site is unoccupied
Protein synthesis:
1. tRNA enters at the A site (codon 2); previous tRNA present at the P site (codon 1)
2. Ribosome 'ratchets' one codon down the mRNA to codon 3
3. tRNA moves A to P; P to E
4. tRNA in the E site dissociates; the A site is free for the tRNA for codon 3
5. The peptide bond forms when the growing polypeptide is transferred from the tRNA in the P site to the amino acid on the tRNA in the A site, before the A to P translocation
Termination:
Occurs when the ribosome encounters a stop codon
The stop codons are UAA, UAG and UGA
When a stop codon is encountered, a release factor binds to the A site
A water molecule is added to the carboxyl terminus of the nascent polypeptide, causing termination

Lecture 7b: Translation - the genetic code
The genetic code/codons:
The genetic code is:
○ Degenerate - more than one codon can specify the same amino acid
○ Ordered - codons that specify the same amino acid are more often similar (similar amino acids are encoded by similar codons)
Degeneracy - 2 types:
○ Partial degeneracy: the third base may be either of the two pyrimidines (U or C) or either of the two purines (A or G); changing the third base from a purine to a pyrimidine, or vice versa, will change the amino acid specified by the codon
○ Complete degeneracy: any of the four bases may be present at the third position in the codon and the codon will still specify the same amino acid
Wobble base pairing:
Hydrogen bonding between bases in the anticodons of tRNAs and the codons of mRNAs follows strict base-pairing rules only for the first two bases of the codon
The base-pairing involving the third base of the codon is less stringent, allowing what is called wobble base pairing at this site
Mutations at this site have less impact than at the first two positions - decreasing the effect of mutations
Often amino acids with similar physical properties have similar codon sequences - they differ by a single base
Thus if a coding mutation occurs, the substituted amino acid is more likely to share similar physical properties
This ultimately minimises the effect on protein sequence/structure/function
Several tRNAs contain the base inosine
Inosine is produced by a post-transcriptional modification of adenosine
The wobble hypothesis predicted that when inosine is present at the 5' end of an anticodon it would base-pair with cytosine, uracil or adenine at the third position of the codon

Lecture 8: Mutations
Source of all genetic variation
Refers to:
○ A change in genetic material
○ The process by which the change occurs
Types:
○ Changes in chromosome number and structure
○ Point mutations - changes at specific sites in a gene (substitution, insertion or deletion)
A mutant is an organism that exhibits a novel phenotype
Mutations are heritable changes in the genetic material that provide the raw material for evolution
Recombination mechanisms rearrange genetic variability into new combinations
Natural selection preserves the combinations best adapted to the existing environment
4 key ideas:
1. Mutations can be somatic or germinal
2. Mutations are spontaneous or induced
3. Mutations are usually random and non-adaptive
4. Mutations are reversible
Germinal mutations - occur in germ-line cells and will be transmitted through the gametes to the progeny
Somatic mutations - occur in somatic cells; the mutant phenotype will occur only in the descendants of that cell and will not be transmitted to the progeny
Spontaneous mutations:
Occur without a known cause, due to inherent metabolic errors or unknown agents in the environment
They are infrequent
Bacteria and phage: 10^-8 to 10^-10 per nucleotide pair per generation
Eukaryotes: 10^-7 to 10^-9 per nucleotide pair per generation, or 10^-4 to 10^-7 per gene per generation
Occur in the absence of exogenous agents
Are mistakes made during DNA replication, recombination and repair
Influenced by the accuracy of the DNA replication and repair machinery
Induced mutations:
Result from exposure of organisms to mutagens: physical and chemical agents that cause changes in DNA, such as ionizing irradiation, ultraviolet light or certain chemicals
○ Treatment of bacteria with mutagens can increase the mutation frequency to >1% per gene
Mutations can be induced by chemicals and radiation
Frequency is influenced by the degree of exposure to mutagenic agents in the environment and by the efficiency of the mechanisms for the repair of damaged DNA
Random and non-adaptive:
Mutation is usually a non-adaptive process - environmental stress simply selects organisms with pre-existing, randomly occurring mutations
Reversible:
Forward mutation - mutation of a wild-type allele to a mutant allele
Reverse mutation (reversion) - a second mutation that restores the original phenotype
1. Back mutation - a second mutation at the same site
2. Suppressor mutation - a second mutation at a different location in the genome
Some mutants can revert to wild type by both back and suppressor mutations
To distinguish between the two, backcross the phenotypic revertant with the wild type
Back mutation - all progeny will have the wild-type phenotype
Suppressor mutation - some of the progeny will have the mutant phenotype
Effects on phenotype:
The effects of mutations on phenotype range from no observable change to lethality
Isoalleles have no effect on phenotype, or small effects that can be recognized only by special techniques
Null alleles result in no gene product or totally nonfunctional gene products
Recessive lethal mutations affect genes required for growth; they are lethal in the homozygous state
Mutations may be dominant or recessive
In diploids most recessive mutations will not be recognised, because they are masked in heterozygotes
X-linked recessive mutations are an exception
Because of the degeneracy and order in the genetic code, many mutations have no effect on the phenotype of the organism. These are called neutral or silent mutations

Lecture 9: DNA repair and recombination
DNA repair mechanisms in E. coli:
Living organisms contain many enzymes that scan their DNA for damage and initiate repair processes when damage is detected
These processes are well described in E. coli
All DNA repair mechanisms have 4 basic steps
1. Detect DNA damage
Methods for detecting damaged DNA vary depending on how the DNA is damaged/mutated:
○ Light-dependent repair (photoreactivation)
○ Excision repair
○ Post-replication (including mismatch) repair
○ Error-prone repair system (SOS response)
2. Excise mutated/damaged DNA
A DNA repair endonuclease or endonuclease-containing complex recognizes, binds to and excises the damaged region
3. Fill in the gap in the DNA
A DNA polymerase fills in the gap, using the undamaged complementary strand of DNA as a template
4. Stick the DNA back together
DNA ligase seals the break left by DNA polymerase
Post-replication mismatch repair:
Provides a backup to the replicative proofreading activity of DNA polymerase by correcting mismatched nucleotides remaining in DNA after replication
The system must be able to distinguish between the template (parental) strand and the newly synthesised strand (which contains the mismatched base)
Dam (DNA adenine methyltransferase) adds a methyl group (CH3) to A residues in DNA
When DNA is newly replicated, the parental strand is methylated but the nascent strand is not. This difference allows the mismatch repair system to distinguish the new strand from the old strand
The parent strand is used as the template to repair the mismatch
Induction of the SOS response:
In the absence of DNA damage, LexA binds to DNA regions that regulate transcription of SOS response genes and keeps their expression levels low
When extensive DNA damage occurs, RecA binds to single-stranded regions of DNA in damaged regions
This activates RecA, which stimulates LexA to inactivate itself.
When LexA is inactivated, the SOS response genes are expressed
DNA recombination mechanisms:
Recombination between homologous DNA molecules involves the activity of numerous enzymes that cleave, unwind, stimulate single-strand invasions of double helices, repair, and join strands of DNA
In eukaryotes, crossing over occurs during prophase of meiosis I
Crossing over involves the breakage of parental chromosomes and rejoining of the parts in new combinations
The Holliday model and the double-strand break model (won't look at this model) are two explanations of the molecular basis of recombination
Holliday model:
An endonuclease cleaves a single strand of each parental DNA molecule
Segments of the single strand on one side of each cut are displaced from their complementary strand by the action of helicase
Helicase unwinds the two strands of DNA in the region adjacent to the single-strand incision

Module 3

Lecture 1: Recombinant DNA technology
To manipulate DNA with control we need to be able to make the piece we want, cut it up, and glue it back together
Cloning is the isolation and amplification of a DNA fragment
A recombinant DNA molecule is a DNA molecule made by joining two or more different DNA molecules
Accomplished using:
○ DNA "copier": DNA polymerase (PCR)
○ Molecular "scissors": restriction endonucleases
○ Molecular "glue": DNA ligase
Cloning:
1. Polymerase chain reaction - how we make the DNA we want
2. Plasmid vectors - what we clone into
3. Restriction enzymes - how we cut our DNA
4. DNA ligase - how we stick our DNA back together to make our recombinant DNA
5. Transformation and screening - how we make sure we have the correct recombinant DNA
Polymerase Chain Reaction (PCR):
DNA polymerase synthesises DNA in a 5' to 3' direction
Needs a primer:
○ RNA in DNA replication
○ DNA oligo in PCR
dNTPs = building blocks
What do we need to make DNA using PCR:
○ Target DNA that contains the DNA of interest (template DNA)
○ DNA polymerase
○ Magnesium (Mg2+) ions (important cofactor for DNA polymerase)
○ Primers - in vivo, the primer is RNA made by primase; in an in vitro PCR reaction, the primers are DNA oligonucleotides
○ Nucleotides - specifically deoxyribonucleoside triphosphates (dNTPs; the N means all four bases - A, T, G, C)
Primers - oligonucleotide primers ('oligos'):
Short pieces of DNA (20-30 bases long) that bind, or anneal, either side of the piece of DNA of interest
Provide a free 3'-OH group for DNA polymerase to start DNA synthesis
Visualising DNA in the lab:
○ Agarose gel - molecular 'sieve' to separate fragments according to size using an electrical current (DNA is -ve; therefore it runs to the +ve electrode)
○ Stain - usually fluorescent, and can be visualised under UV light; ethidium bromide, SYBR green; bind DNA non-specifically
Step 1: Mix the DNA sample, e.g. a PCR reaction, with loading dye
Loading dye allows us to see where we are loading
Load the sample into a well on the agarose gel
Step 2: Apply current to the gel
Negative (-ve) electrode at the top
DNA is -ve charged - repelled from the -ve electrode
Separated according to size - smaller fragments run further
Step 3: After a suitable time (30-60 minutes), the DNA will be separated according to size
Can then visualise
Step 4: Visualise with UV light and a camera or gel viewer

Lecture 2: Restriction endonucleases/restriction enzymes
Restriction enzymes have evolved to protect bacterial cells from incoming foreign DNA, typically bacteriophage
1. They cut at specific DNA sequences - usually a specific 6 bp sequence
2. Their recognition sequence is palindromic
3. They can make staggered cuts
4. They are named after the species in which the enzyme is produced
Plasmid vectors:
Plasmids:
1. MCS - multiple cloning site; unique restriction enzyme sites
2. Must be able to replicate
3. Need a means of selection, usually an antibiotic resistance gene
Therefore, if you plate on media WITH this antibiotic, only cells WITH the plasmid will survive
Ligations:
The last step to generate our recombinant DNA
Complementary bases anneal
DNA ligase joins the 5'-phosphate and 3'-OH to reform the phosphodiester bond between adjacent nucleotides (e.g. A and T)
Recombinant DNA molecule formed following ligation
Plasmid vectors:
All plasmid vectors need 3 basic features:
1. Unique restriction enzyme sites (MCS)
2. Origin of replication
3. Selectable marker

Lecture 3: Transformations
Standard Tool: Transformation is now commonly used in laboratories to propagate recombinant DNA. Bacteria, such as Escherichia coli (E. coli), are used as hosts to maintain and replicate plasmid vectors containing foreign DNA.
Plasmid Vectors: These plasmids need an origin of replication (Ori) to ensure they are copied within bacterial cells. However, plasmid maintenance is energetically costly for the bacteria, and the plasmid would be lost unless it provides an advantage, such as antibiotic resistance.
Positive Selection in Transformation
Selectable Marker: To ensure the maintenance of plasmids, a positive selection marker (usually antibiotic resistance) is included. This allows only bacteria with the plasmid to survive when grown on selective media containing antibiotics.
E. coli as a Model Organism
Usage in Transformation: E. coli, especially strains like DH5α, is a standard organism used for recombinant DNA production due to its high transformation efficiency and well-characterized genome.
Competence: Before transformation, E. coli must be made "competent" to take up DNA.
This is typically done by washing the cells in cold buffers with calcium chloride and then introducing the DNA either by heat shock at 42°C or by electroporation.
Transformation Efficiency and Outcomes
Efficiency: Transformations and ligations (where DNA fragments are joined) are not 100% efficient. Sometimes vectors do not cut as intended, or they may re-ligate without incorporating the desired insert.
Three Possible Outcomes in Transformation:
1. No Vector: Cells without the plasmid.
2. Vector Only: Cells containing the plasmid without the desired DNA insert.
3. Vector with Insert: Cells containing the recombinant plasmid with the inserted DNA.
Selection Using Antibiotic Resistance and Blue/White Screening
Antibiotic Resistance: Both vectors with and without inserts can provide antibiotic resistance, allowing growth on selective media.
Colony Identification:
○ No vector: No growth on selective media.
○ Vector-only and vector+insert: Both grow, but need further differentiation.
Blue/White Screening:
○ Special Vector with lacZ Gene: The lacZ gene encodes β-galactosidase (LacZ protein), which metabolizes lactose into glucose and galactose. It can also act on lactose analogs, like X-gal.
○ X-gal: A lactose analog that is colorless but turns blue when cleaved by LacZ.
○ LacZ Gene and MCS: The multiple cloning site (MCS) is placed within the lacZ gene. If a DNA insert is successfully ligated into the MCS, the lacZ gene is disrupted and β-galactosidase is not produced, leading to white colonies. If no insert is present, the lacZ gene remains intact, producing β-galactosidase and resulting in blue colonies.
Outcome:
○ Blue Colonies: Vector without an insert; lacZ gene is intact.
○ White Colonies: Recombinant vector with an insert; lacZ gene is disrupted.
Both types of colonies are resistant to the antibiotic, but only the white colonies contain the desired recombinant DNA.
Summary of Transformation Outcomes
1. No Vector: No growth on selective media.
2. Vector Only (Blue Colonies): Growth with antibiotic resistance; lacZ gene intact.
3. Vector + Insert (White Colonies): Growth with antibiotic resistance; lacZ gene disrupted.

Lecture 4: just lab things

Lecture 5: DNA tech
Recombinant DNA Analysis:
Objective: Analyze recombinant DNA (plasmid constructs) to verify that it contains the desired DNA and produces the correct protein.
Key Question: How do we ensure that the DNA and the protein produced are correct?
Separation of Biomolecules by Size:
General Concept: Biomolecules are separated based on size using a molecular 'sieve.'
Biomolecule Charge:
○ Nucleic Acids (DNA/RNA): Net negative charge, linear structure (easier to separate).
○ Proteins: Complex 3D structures, variable charge (harder to separate).
Gel Electrophoresis:
Principle: Use of a gel matrix to separate molecules by size under an electric current. Smaller molecules move faster through the gel.
Types of Gel Electrophoresis:
Agarose Gel: Used for nucleic acids (DNA and RNA).
Acrylamide Gel (PAGE): Used for proteins.
Nucleic Acids - Agarose Gel Electrophoresis:
1. Preparation:
○ DNA sample (e.g., PCR product) mixed with loading dye to visualize placement.
○ Loaded into wells in the agarose gel.
2. Running the Gel:
○ Electric current applied: negative electrode at the top; DNA is repelled from it due to its negative charge.
○ DNA separates by size: smaller fragments travel further.
3. Visualisation:
○ Stain the gel with a fluorescent dye (e.g., ethidium bromide, SYBR green) to visualize under UV light.
○ A marker with known DNA fragment sizes is used to compare and determine the size of separated fragments.
Process: DNA/RNA is separated by size using agarose gel electrophoresis.
Result: Smaller fragments travel faster and further.
Visualization: Stain the gel and visualize under UV light using dyes like EtBr.
Proteins - Acrylamide Gel Electrophoresis (PAGE):
Challenge: Proteins have a more complex structure and variable charges.
Solution:
○ Denaturation: Proteins are unfolded into linear forms and given a net negative charge to facilitate separation based on size.
○ Method:
1. Reducing Agents: β-mercaptoethanol or dithiothreitol break disulfide bonds.
2. Detergent (SDS): Binds to proteins, giving them a uniform negative charge.
3. Heat Treatment: Heat at 95°C for ~20 minutes to complete denaturation.
Steps for PAGE:
1. Denaturation: Treat proteins with reducing agents, detergent (SDS), and heat to unfold them.
2. Electrophoresis:
○ Apply current to the acrylamide gel.
○ Proteins, now linear and negatively charged, move through the gel. Smaller proteins travel faster.
3. Staining:
○ Stain proteins with a dye such as Coomassie Brilliant Blue, which binds non-specifically to proteins (bromophenol blue is normally the tracking dye in the loading buffer).
○ More protein = more stain binding = darker bands on the gel.
Process: Proteins are denatured, charged, and separated by size in acrylamide gels.
Staining: Visualized using a protein stain such as Coomassie Brilliant Blue.
Key Comparisons:
Nucleic Acids:
○ Linear, negatively charged, separated easily by size in agarose gels.
○ Use fluorescent staining (e.g., EtBr).
Proteins:
○ Require denaturation and charge equalization with SDS.
○ Separated in acrylamide gels (PAGE).
○ Visualized with protein-specific stains like Coomassie Brilliant Blue.
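Worked example (sketch): the agarose gel notes in this lecture say a marker with known fragment sizes is used to determine the size of separated fragments. One common way to do this numerically is to assume that log10(fragment size) falls roughly linearly with migration distance over the useful range of the gel, fit a straight line to the marker bands, and read the unknown band off that line. That linearity assumption, the function names and the marker distances below are all illustrative and are not taken from the lecture.

```python
import math

def fit_ladder(ladder):
    """Least-squares fit of log10(size_bp) = slope * distance_mm + intercept.

    ladder: list of (distance_mm, size_bp) pairs measured from the marker lane.
    """
    xs = [distance for distance, _ in ladder]
    ys = [math.log10(size) for _, size in ladder]
    n = len(ladder)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_size_bp(distance_mm, slope, intercept):
    """Estimate the fragment size (bp) of a band from its migration distance."""
    return 10 ** (slope * distance_mm + intercept)

# Hypothetical marker lane: (distance migrated in mm, fragment size in bp).
# These numbers are made up for illustration only.
LADDER = [(10.0, 10000), (20.0, 3000), (30.0, 1000), (40.0, 300), (50.0, 100)]

if __name__ == "__main__":
    slope, intercept = fit_ladder(LADDER)
    # An unknown band that ran 25 mm sits between the 3000 bp and 1000 bp markers.
    print(round(estimate_size_bp(25.0, slope, intercept)))
```

The slope comes out negative, which matches the notes: smaller fragments travel further. On a real gel the relationship bends at the extremes, so estimates are only reasonable for bands that fall between marker bands.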
Lecture 6: molecular hybridisation, blotting, RT-qPCR

Molecular Hybridisation:
The process of annealing or binding a probe to a biomolecule of interest in a sequence-specific way
Nucleic acids are probed with nucleic acid probes
Proteins are probed with antibodies
Short DNA probes (up to 50 bp) are chemically synthesised - oligonucleotide probes (oligos)
Probes are complementary to the sequence of interest
An oligo probe that is 25 bp long is called a 25-mer
During chemical synthesis of the probe, the nucleotide bases are added in a controlled, specific order
Probes can be labelled - radioactive or fluorescent

Fluorescence in situ hybridisation (FISH):
Cells and tissues are subjected to hybridization conditions, and single-stranded DNA or RNA (the probe) is added
Hybridization is monitored using fluorescence microscopy
A refinement of the molecular hybridization technique
DNA present in cytological (cell) preparations is the “target” for hybrid formation
Uses fluorescent probes

Antibodies - probing for proteins:
Antibodies recognise specific short protein sequences - epitopes
Antibodies are typically raised against a purified protein of interest - an antigen
Routinely raised in mice and rabbits
Primary antibody – specific for our protein of interest
Secondary antibody – labelled, and recognises the primary antibody, so we can detect our protein of interest

Blotting:
The process of actively transferring biomolecules from a gel to a membrane
Membrane - a material optimised for your biomolecule

Western blotting:
This technique relies on a combination of the following:
○ The denaturation and separation of proteins by sodium dodecyl-sulfate polyacrylamide gel electrophoresis (SDS-PAGE)
○ Transfer using a current
○ Probing using antibodies

Southern blotting:
If genomic DNA from a cell is cut by restriction enzymes, thousands of fragments of different sizes are produced.
Separating the large set of fragments by electrophoresis creates a continuous smear.
How do we locate a desired fragment or target?
○ Use a probe created from DNA or RNA complementary to a sequence within our target
Key steps
1. The DNA of interest is subjected to restriction enzyme digestion to generate smaller DNA fragments
2. Digested DNA is separated by agarose gel electrophoresis
3. Separated DNA fragments are transferred onto a solid support (nylon or PVDF membrane)
4. The transfer of DNA is done under denaturing conditions
5. Hybridise the single-stranded DNA with a labelled DNA probe containing the sequence of interest

Northern blotting:
Southern Blot: locates a gene in a mixed population of DNA (e.g. an organism’s genomic DNA)
Northern Blot: locates mRNA that corresponds to the gene of interest (e.g. used to determine the level of gene expression in a tissue)

RT-qPCR: Reverse Transcriptase quantitative Polymerase Chain Reaction
Isolate transcripts, i.e., genes currently being expressed
Make complementary DNA (cDNA) using REVERSE TRANSCRIPTASE
Then use gene-specific primers to amplify a region of the gene you are interested in AND COUNT, i.e., QUANTITATE, the amount of product during the PCR
By comparing to a standard (e.g., a housekeeping gene or a standard curve) you can determine the level of gene expression
Steps
1. Reverse transcription
If eukaryotic mRNA – can use a poly-T (oligo(dT)) primer for RT, which anneals to the poly-A tail
If prokaryotic – use ‘random’ oligos
RT then makes a cDNA copy of the mRNA
The amount of cDNA is proportional to the amount of mRNA present
2. Quantitative PCR
Standard PCR, but you include a fluorescent dye – SYBR Green
This only fluoresces when bound to double-stranded (ds) DNA
Therefore, after every cycle of PCR, you can measure how much dsDNA you have
Run in a PCR machine with a fluorescence detector in the lid
Measure fluorescence after every cycle of PCR
The more template you start with, the more dsDNA there is at each cycle, so the fluorescent signal rises sooner
To compare samples - set a threshold and read the Ct value
Ct = CYCLE THRESHOLD; the number of PCR cycles needed for fluorescence to cross an arbitrary threshold: ‘how many cycles of PCR does it take to hit a certain amount of dsDNA?’ (more starting template = lower Ct)

Lecture 7: DNA libraries and Sequencing

DNA libraries:
A DNA library is a collection of DNA fragments that have been cloned using vectors
There are two main types of DNA libraries:
○ Genomic libraries
○ Complementary DNA (cDNA) libraries
Clone DNA fragments into plasmids, then transform into bacteria and screen using the method of choice
A genomic DNA library is a set of DNA clones that collectively contain the entire genome of a given organism
Preparation:
1. Isolation of multiple copies of genomic DNA, followed by a partial digestion
2. Fragments are joined into vectors and transformed into bacteria
3. Clone first and search later = “shotgun cloning”

cDNA library:
A cDNA library is a set of DNA clones of all the mRNAs expressed in the tissue at the specific time point at which the mRNA was originally prepared.
Advantages:
○ Removes all non-coding DNA
○ Enriches actively transcribed genes
○ No introns
Disadvantages:
○ Contains only sequences that are present in mature RNA
○ Regulatory elements, such as promoters, are not present
○ If a gene was not being expressed (or only at a very low level) it may be absent from the cDNA library

Screening of genomic and cDNA libraries:
Direct selection:
○ Looks for the protein product created by transformed plasmids
○ Uses selection pressure
Molecular hybridisation:
○ Uses probes specifically designed to base pair with the DNA of interest

DNA sequencing:
dNTP - building blocks for synthesis of DNA
The OH group on the 3’ carbon is needed to form a phosphodiester bond with the 5’ phosphate on the incoming nucleotide
ddNTP - dideoxynucleotides; these lack the 3’ OH, so no further nucleotide can be added after one is incorporated

Sanger sequencing:
ddNTPs are the basis of Sanger sequencing
If you include ddNTPs in your PCR mix, they ‘kill’ synthesis of the complementary strand when they are incorporated
Across the many copies in the reaction, this leads to complementary strands ‘terminated’ at every position
If you label the ddNTP, and separate the complementary strands on the basis of size, you can read the DNA sequence
In a Sanger sequencing reaction, ddNTPs are included in a ratio of 1 ddNTP:100 dNTPs
ddNTPs were originally labelled with radioactive P32; this meant that four different reactions needed to be carried out – one per nucleotide (A, G, C, T)
Products were separated on the basis of size on a gel with single-base resolution (a polyacrylamide gel rather than agarose)
Now - label each ddNTP with a different coloured fluorescent label
All four in one pot
Separate products on the basis of size (gene scanner)
Sanger sequencing is good on the small scale, but quite labour intensive (that’s one of the reasons why the HGP cost so much!), and can only sequence a short stretch (~1000 bases) in a single read

Next generation sequencing - NGS
Now we use Next Generation Sequencing (NGS) technology for large scale projects
Most still use the ‘sequencing by synthesis’ approach of the original Sanger sequencing methodology BUT – the way the sequence is read varies
Still make a library – genomic DNA library for whole genome sequencing; cDNA library for RNA sequencing
NGS adds ‘adapters’ to the ends of dsDNA molecules rather than subcloning into a vector during library prep
Adapters = short DNA oligos of known sequence
Knowing the adapter sequence allows use of a common primer for sequencing, and ‘barcoding’

Lecture 8: Third generation sequencing: Long read methodologies
NGS – includes an amplification step when making the library, and still sequences by synthesis
Third generation – directly sequences the nucleic acids you have prepared – no amplification
Third generation technology also allows much longer reads to be carried out
Why is read length important?
○ What if the piece of DNA contained a highly repetitive element, i.e., a microsatellite – the same short DNA sequence repeated over and over again?
○ Long read Third Gen sequencing also directly sequences the nucleic acid – no amplification step
Pacific Biosciences Single-Molecule, Real Time (SMRT)
Oxford Nanopore technology
○ Measures ‘electrical current intensity’ as the nucleic acid passes through a nanopore
○ No synthesis
○ Resequence each piece of the library hundreds of times

Lecture 9: Analysis of sequence data:
Thousands of reads then need to be put back together to give the final sequence
Library prep – breaks up your sample (e.g., a whole genome, a whole transcriptome) into much smaller fragments
Sequencing (NGS, 3rdGS)
Then you need to put everything back together - assembly

Shotgun assembly:
If you break up your DNA into lots of smaller fragments, you then need to ‘put it all back together’ – a bit like ripping up a book into individual sentences, then having to put them back together in the right order
So you slowly build up the final, complete sequence of your sample (a complete genome for example)
If you ‘barcode’ with different DNA sequences during library prep, you can also pool samples for assembly
This means you can sequence multiple samples during the same sequencing reaction, then sort ‘sample 1’ from ‘sample 2’ by reading which barcode is on each read during assembly

Metagenomics:
What if your sample is, e.g., an environmental sample, or a sample of bacteria from the gut – there are going to be lots of different organisms there
You can identify different species by just sequencing a highly conserved gene, e.g., the gene encoding 16S rRNA = AMPLICON SEQUENCING
Or you can sequence the entire genome of every organism - METAGENOMICS
It helps if you have a REFERENCE GENOME for some of the species, as this is a ‘known unknown’, i.e., you can identify known species from your new sample. This is also called ‘map based alignments’
This will also allow you to identify new species – ‘unknown unknowns’ from mixed samples, as these WILL NOT align to any current reference samples

Read mapping:
If you are sequencing a sample with a reference, then you can ‘map’ these reads back to this reference
This is good for metagenomics – map reads back to your ‘reference’ genome – your ‘known unknowns’ from metagenomic samples
Mapping is also used for ‘transcriptomics’ – analysis of the transcriptome, i.e., RNA
The more of a particular sequence you have, the more RNA template was there = you can study gene expression differences by counting how much of a particular transcript is there
So you can compare expression of genes between different samples and study different splice variants

Bioinformatic analysis of sequencing data:
It is possible to, e.g., identify different individuals, or different species, from sequence data using a variety of techniques
RFLP relies on polymorphisms – differences in single bases in restriction sites
What if there are NO restriction sites, or the polymorphisms do not occur in a restriction site?
Sequencing allows us to get around these problems
Can look for all polymorphisms in an entire genome, or just in a small region of the genome that is highly conserved
Analyse SINGLE NUCLEOTIDE POLYMORPHISMS - SNPs
The further apart two SNPs are, the less likely they are to be ‘linked’
Haplotype: a specific set of SNPs occurring in a particular region of the genome

Short tandem repeats:
Very short DNA sequences repeated in tandem (adjacent). The number of repeats, and so the length of the tract, varies between individuals.
The power of STR analysis comes from simultaneously looking at multiple STR loci which assort independently. The combination of repeat lengths is unique for each person.
Multiple regions of our genome have these tandem repeat regions; each region is a variable size
Therefore, if enough are included, you will get a unique combination of repeats to identify an individual - identification
They are also heritable – the length of your repeats will be very similar to the length of the same repeat tract from your parents – useful in paternity cases
Nowadays use fluorescently labelled oligos – different coloured label for each tract length

Module 4: Evolution and Population Genetics

Lecture 1: Darwin’s Five Theories
Darwin's theories form the foundation of evolutionary biology and are evidenced through various scientific observations:
Perpetual Change: The idea that the world is in a constant state of change, supported by fossil records and other evolutionary data.
Common Descent: All life forms derive from a single ancestor through a branching pattern of descent. Evidence includes fossil records and molecular data.
Multiplication of Species: New species evolve through the splitting and transformation of older species, driven by variation and adaptive divergence.
Gradualism: Evolution occurs slowly and continuously over time. However, this has been debated, and some evidence supports rapid changes (punctuated equilibrium) such as antibiotic resistance.
Natural Selection: Organisms accumulate favorable traits over long periods, helping them adapt to their environments.

Evidence for Darwin’s Theories
Fossil records, molecular biology, homologous anatomy, and geographic data all support Darwin’s theories.
Homologous structures (e.g., limbs in humans, dogs, birds, and whales) suggest descent from a common ancestor.

2. Weismann’s Separation of the Germ and the Soma
Weismann proposed the concept of the separation between germ cells (which pass on genetic information) and somatic cells (which do not).
This theory supported the understanding of hereditary mechanisms and was critical for neo-Darwinian evolution theory.

3. Modern Theory of Evolution (Neo-Darwinism)
The modern synthesis of Darwin’s theories with genetics introduced key concepts such as:
Theory of Allele Frequency: Evolution is driven by changes in allele frequencies in populations.
Genotype Frequency: The proportion of different genotypes in a population.
Hardy-Weinberg Principle: This principle provides a mathematical model to predict allele and genotype frequencies in a population under certain conditions (no mutation, random mating, etc.).
Hardy-Weinberg Equilibrium: Populations in equilibrium will maintain constant allele frequencies over time unless acted upon by evolutionary forces.

Simple Measures of Genetic Diversity
Allele and genotype frequencies are used to assess genetic diversity within populations, critical for understanding evolution and adaptation.

Sameness and Difference: A Fundamental Question
Sameness (Homology): Similar structures across species that are conserved over time indicate descent from a common ancestor (e.g., limb structures in various animals).
Difference (Adaptation): Species exhibit differences based on environmental challenges, with some changes driven by adaptation, while others may arise without adaptive significance.

Microevolution vs. Macroevolution
Microevolution: Refers to small-scale changes within populations, often related to genetic diversity and allele frequency.
Macroevolution: Involves large-scale changes, including speciation and divergence over long periods.

Evolution Evidence: Fossils, Speciation, and Homology
Fossil evidence, homologous structures, and molecular data all support the concept of evolution.
Speciation: Darwin’s finches, for example, show how environmental factors and variation within species can lead to the formation of new species through adaptive divergence.

Natural Selection: The Core Mechanism of Evolution
Natural selection explains how organisms accumulate traits suited to their environments:
Variation and Inheritance: Genetic variation is the basis for natural selection, and mutations provide the raw material for evolution.
Non-Random Survival and Reproduction: While mutations are random, the survival and reproduction of organisms are non-random and shaped by environmental pressures.

Misconceptions About Natural Selection
Individuals do not evolve. Selection acts through the survival and reproduction of individuals, but it is populations that change: over many generations, populations evolve to become better adapted to their environments.
Example: Insect populations exposed to insecticides evolve resistance over time through non-random survival of resistant variants.

Alcohol Metabolism in Humans and Drosophila: Evolutionary Insights
Human Alcohol Metabolism: The enzymes alcohol dehydrogenase (ADH) and aldehyde dehydrogenase (ALDH) play a role in alcohol metabolism, with genetic variations (e.g., the ALDH2*2 allele) causing different physiological responses, such as alcohol flushing in eastern Asian populations.
Drosophila Alcohol Metabolism: Variations in ADH activity (Fast and Slow alleles) show how balancing selection maintains genetic diversity in response to competing environmental factors (ethanol concentration vs. temperature).
Key Concepts Summary
Darwin’s Five Theories: Perpetual change, common descent, multiplication of species, gradualism, and natural selection.
Separation of Germ and Somatic Cells: Critical to understanding heredity.
Natural Selection and Adaptation: Populations, not individuals, evolve through changes in allele frequencies. Adaptation is a result of environmental pressures on genetic variation.
Alcohol Metabolism and Genetic Evolution: Examples of how genetic variation contributes to evolutionary processes in both humans and Drosophila.

Lecture 2: populations, gene pools, and equilibriums

Population Genetics:
Study of alleles within a population and the mechanisms that can cause allele frequencies to change over time
Mendelian populations
A population evolves through changes in its gene pool, therefore population genetics is also the study of evolution

Alleles and their frequency within populations:
Frequency, proportions and percentages
0.25 = 1 in 4 = 25%
Allele frequencies (diploid)
Refers to the frequency of an allele (e.g. “A” or “a”)
Symbols p and q are usually used for the two alleles.
○ p = frequency of the dominant allele, q = frequency of the recessive allele
○ p + q = 1
○ p = 1 – q and q = 1 – p
Q: A population has two alleles, and the dominant allele has a frequency of 0.7. What is the frequency of the recessive allele?
A: q = 1 – p = 1 – 0.7 = 0.3

Population and gene pools:
The gene pool is the sum of all the alleles of all genes of all individuals in the population.
If only one allele exists at a particular locus or gene in a population, the allele is said to be fixed.
But if there are two or more alleles for a gene in a population, individuals will be either homozygous or heterozygous

Genotype frequencies (diploid):
If A and a are at frequencies p and q respectively:
the frequency of the AA genotype is p × p or p²
the frequency of the aa genotype is q × q or q²
and the frequency of the heterozygote genotype (Aa) is 2pq – why? A heterozygote can form in two ways (A from one parent and a from the other, or the reverse), each with probability pq.
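The genotype frequency rules above can be checked with a short calculation. This is a minimal sketch that assumes a single locus with two alleles and random mating; the function name genotype_frequencies is hypothetical, and the value p = 0.7 is taken from the question earlier in these notes.

```python
def genotype_frequencies(p):
    """Return expected (AA, Aa, aa) frequencies given p, the frequency of allele A.

    Assumes one locus, two alleles, and random mating.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    q = 1.0 - p  # only two alleles, so p + q = 1
    # Aa can arise as (A from one parent, a from the other) or the reverse:
    # pq + qp = 2pq.
    return p * p, 2.0 * p * q, q * q


freq_AA, freq_Aa, freq_aa = genotype_frequencies(0.7)
print(f"AA = {freq_AA:.2f}, Aa = {freq_Aa:.2f}, aa = {freq_aa:.2f}")
print(f"total = {freq_AA + freq_Aa + freq_aa:.2f}")
```

With p = 0.7 the three frequencies are 0.49, 0.42 and 0.09. They sum to 1 because p² + 2pq + q² = (p + q)².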
The Hardy-Weinberg Principle:
Population genetics defines evolution as changes in allele frequencies.
In a population that is not evolving, allele and genotype frequencies will remain constant from one generation to the next.
A stable population is said to be in Hardy-Weinberg equilibrium.
The Hardy-Weinberg equation allows us to calculate the expected genotype frequencies given the observed allele frequencies.
To determine if a population is in Hardy-Weinberg equilibrium we need to know the genotypes of all individuals.
If the Hardy-Weinberg principles (no mutation, random mating, etc.) hold:
○ Allele and genotype frequencies do not change across generations.
○ The population is stable (Hardy-Weinberg equilibrium)
If they do not:
○ Allele and genotype frequencies will change across generations.
○ The population will be evolving
○ Note: a departure from equilibrium does not tell us which principle has been violated.
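Comparing observed genotypes with Hardy-Weinberg expectations can be sketched as follows. The notes do not name a statistical test, so the chi-square statistic used here is an assumption, as are the function name hardy_weinberg_chi_square and the genotype counts, which are invented for illustration.

```python
def hardy_weinberg_chi_square(n_AA, n_Aa, n_aa):
    """Compare observed genotype counts with Hardy-Weinberg expectations.

    Returns (p, expected_counts, chi_square). The chi-square statistic is an
    assumed choice of test; it is not specified in the notes.
    """
    total = n_AA + n_Aa + n_aa
    # Observed frequency of allele A: each AA carries two copies, each Aa one.
    p = (2 * n_AA + n_Aa) / (2 * total)
    q = 1.0 - p
    # Expected genotype counts from p^2, 2pq, q^2 scaled by the sample size.
    expected = (p * p * total, 2 * p * q * total, q * q * total)
    observed = (n_AA, n_Aa, n_aa)
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, expected, chi_square


# Hypothetical sample of 100 genotyped individuals.
p, expected, chi_square = hardy_weinberg_chi_square(n_AA=49, n_Aa=42, n_aa=9)
print(f"p = {p:.2f}, expected counts = {[round(e, 1) for e in expected]}")
print(f"chi-square = {chi_square:.3f}")
```

For these counts the observed and expected values coincide, so the statistic is essentially zero and the sample is consistent with equilibrium. With two alleles the test has one degree of freedom, so values above about 3.84 would suggest a departure at the 5% level. The sketch divides by the expected counts, so it fails if an allele is fixed (p = 0 or p = 1).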
