Week 2 Lectures PDF
Summary
This document details lecture notes on the structure and evolution of genes, covering prokaryotic and eukaryotic examples. It explains concepts like introns, exons, and alternative splicing. The lecture notes also touch upon how alternative splicing can lead to diverse gene products.
Full Transcript
The gene

What constitutes a gene?
- A DNA sequence that produces a functional product.
- Product: RNA (mRNA, rRNA, tRNA, siRNA, snRNA), which can be translated into protein if mRNA is produced.
- Usually also includes the regulatory sequences that the transcription machinery recognizes in order to transcribe the gene, and may include sequences that control transcription.

Prokaryotic gene
- Promoter region (note: AT-rich).
- Open reading frame (ORF).

Functional units of genes
- A gene can operate alone, independently of others, or together with other genes (more than 1 gene) as a unit called an operon, under the control of one promoter.
- Some operons can be co-regulated by the same regulatory protein (a regulon). Even genes transcribed in different directions can be controlled by the same transcription factor.

Eukaryotic gene
- Interrupted genes are expressed via a precursor RNA.
- Introns are removed when the exons are spliced together.
- The mature mRNA has only the sequences of the exons.
- Note: the exon order in the mRNA is the same as in the pre-mRNA; the introns are gone. Removal of introns is regulated so that the exons can join.
- Exons remain in the same order in mRNA as in DNA, but distances along the gene do not correspond to distances along the mRNA or polypeptide products.

How did they find out that mRNA is not co-linear with the DNA sequence?
They compared the DNA and RNA of the same gene:
1. using DNA-RNA hybridization
2. using restriction endonuclease maps and electron microscopy experiments

Typical eukaryotic gene: β-globin gene
- All functional globin genes have an interrupted structure with three exons. The lengths indicated in the figure apply to the mammalian β-globin genes.
- Variation in gene length for the same trait is usually due to variation in intron lengths and not exon lengths. WHY?
- Note: because there is less selection pressure on introns during evolution. Introns are not part of the functional DNA sequence.
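The point about intron length can be illustrated with a small sketch. It assumes two made-up toy genes (the sequences, the helper names `build_gene` and `splice`, and the intron lengths are all hypothetical) that share the same three exons but differ in intron length. Splicing keeps the exons in their original order and drops everything else, so both genes give the same mature mRNA even though the genes differ greatly in length.

```python
def build_gene(exon_seqs, intron_seqs):
    """Interleave exons and introns; return the pre-mRNA and the exon coordinates."""
    seq, coords = "", []
    for i, exon in enumerate(exon_seqs):
        coords.append((len(seq), len(seq) + len(exon)))  # half-open (start, end)
        seq += exon
        if i < len(intron_seqs):
            seq += intron_seqs[i]
    return seq, coords


def splice(pre_mrna, exon_coords):
    """Join the exon segments in their original order; introns are dropped."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exon_coords))


# Hypothetical toy sequences: three short exons shared by both genes.
exon_seqs = ["AUGGCC", "UUUGGA", "CCAUAA"]
short_introns = ["GUAAG", "GUCAG"]
long_introns = ["GU" + "A" * 40 + "AG", "GU" + "C" * 90 + "AG"]

gene_a, coords_a = build_gene(exon_seqs, short_introns)
gene_b, coords_b = build_gene(exon_seqs, long_introns)

mrna_a = splice(gene_a, coords_a)
mrna_b = splice(gene_b, coords_b)

print(len(gene_a), len(gene_b))  # gene lengths differ
print(mrna_a == mrna_b)          # mature mRNAs are identical
```

The exon coordinates also show why distances along the gene do not match distances along the mRNA: the third exon starts at a very different position in the two genes but at the same position in both mRNAs.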
Note: a change in an exon means a change in function, or no protein being made.

EX: Mammalian genes from different species for dihydrofolate reductase (DHFR) have the same relative organization of rather short exons and very long introns, but vary extensively in the lengths of introns.

Almost all eukaryotic genes contain introns
- One exception: the genes that code for histone proteins.
- Note: all of the DNA is wrapped on histones, so lots of them are needed and they always need to be around; splicing them all would be too much work compared to other proteins.
- Advantage? Introns are possible vestiges of ancient molecular parasites, or they greatly increase the number of protein-coding sequences that can be produced.
- Note: introns are there because they can bring useful sequences.

Typical mammalian gene
- Contains 8 introns, about 5 to 10 times the length of the exons.
- Ex: introns are 50 to 20,000 nucleotides long and exons are usually 100 to 200 nucleotides long; humans can have hundreds of introns in a single gene.
- Evolution of multicellular organisms comes with longer genes with more introns but shorter exons!
- Most genes are uninterrupted in yeast, but most genes are interrupted in flies and mammals.

Increasing gene products without increasing the number of genes:

1) Alternative splicing
- Order of exons does not change.
- About 90% of human genes go through alternative splicing.
- Increases the number of functions from the same DNA sequence.
- Note: either splice out both introns and join all the exons, OR splice out introns 1 and 2 together with the exon between them to make a smaller protein with the middle piece gone. More introns/exons means more chance of making multiple proteins, each splicing variant with a different function.
- EX: alternative splicing generates the α and β variants of troponin T.
- Note: only in eukaryotes?

2) Alternative starts and stops
- Note: this has to do with where translation starts and how transcription ends. Start at a different AUG and still be in the reading frame; a different stop site can be used too. You get two different proteins.
- Two proteins can be generated from a single gene by starting at different points.
- Some genes also have alternative stop points and alternate poly(A) sites.
- Note: both make diversity in the primary RNA.

3) Alternative reading frame
- Note: gives a completely different protein. Not in eukaryotes? Found in viruses and some bacteria.
- Two genes might overlap by reading the same DNA sequence in different frames.

Exon/intron organization tells us something about evolution
Ex: gene conservation. Globin genes are found in all globins.
- Note: different introns allow different splicing. With an intron in the middle of an exon, over time just half of the exon can be used instead of the entire exon.
- The exon structure of globin genes corresponds to protein function, but leghemoglobin (from plants) has an extra intron in the central domain.

Exon/intron organization tells us something about evolution
Ex: gene duplication. Insulin gene in chickens and rats.
- Note: the chick/rat ancestor had one insulin gene; evolution caused an extra copy in rats, and in the second rat version an intron was spliced out.
- The rat insulin gene with one intron evolved by loss of an intron from an ancestor with two introns.

Ex: Has evolution been the result of insertions or deletions of introns? Actin genes.
- Note: as species evolve, introns are gained or lost. Introns allow part swapping, so more introns means more proteins with different functions.
- Actin genes vary widely in their organization. The sites of introns are indicated by dark boxes. The bar at the top summarizes all the intron positions among the different orthologs (genes that are homologous in different species, usually related genes).

Summary: what does a eukaryotic gene look like?
- Note: an intron used for processing makes a small change in the exons; a few more amino acids can give a different function.

An excellent example of alternative splicing: immunoglobulin genes in humans
- Note: here the rearrangement happens at the DNA level. (The rest of this note, about immunity in the gut, is illegible.)
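The three mechanisms listed above (alternative splicing, alternative starts and stops, alternative reading frames) can be sketched with toy sequences. This is an illustrative sketch only: the exon sequences and the function names `exon_skipping_isoforms` and `codons_in_frame` are made up, and the code assumes that every internal exon may be skipped freely, which real splicing does not allow in general.

```python
from itertools import combinations


def exon_skipping_isoforms(exons):
    """Enumerate mRNAs that keep the first and last exon plus any subset of
    internal exons. Exon order never changes (needs at least two exons)."""
    first, *internal, last = exons
    isoforms = []
    for k in range(len(internal), -1, -1):
        for kept in combinations(internal, k):  # combinations preserve order
            isoforms.append("".join((first, *kept, last)))
    return isoforms


def codons_in_frame(seq, frame):
    """Split seq into codons starting at offset `frame` (0, 1 or 2);
    a trailing partial codon is dropped."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


# Hypothetical three-exon gene.
exons = ["AUGGCU", "GAAUUC", "CCGUAA"]

isoforms = exon_skipping_isoforms(exons)
for mrna in isoforms:
    print(mrna)

# The same short mRNA read in two different frames gives different codons.
short = isoforms[-1]
print(codons_in_frame(short, 0))
print(codons_in_frame(short, 1))
```

With one internal exon there are two isoforms: the full-length mRNA and a shorter one with the middle piece gone. Reading the shorter mRNA from offset 1 instead of 0 yields an entirely different codon series, which is the idea behind overlapping genes read in different frames.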
Notes: antibodies are released after infection. Not all cells have the same genome, hence the variety of antibodies.

Antibody structure
- The protein consists of a tetramer of 2 heavy chains and 2 light chains (2 large polypeptides and 2 small polypeptides).
- Note: there are 5 different heavy chains, expressed at different times in life, in different places in the body, or at different stages of infection (for example in allergic reactions, or membrane-bound, which makes the cell antigen-presenting).

Variability of antibody due to alternative splicing
- Notes: the constant (C) domains bind complement; the heavy chain has 3 constant domains, and the light chain has 1 constant and 1 variable domain. One antibody can bind 2 antigens; antibodies that came from the same B cell bind the same epitope.
- The variable region is needed to produce variable shapes that interact with many differently shaped antigens.

Each protein domain corresponds to an exon
- Notes: the leader lets the protein get into the ER co-translationally. The variable region has two parts in the light chain (VJ) and three in the heavy chain (VDJ). The hinge is for the bend.
- Immunoglobulin light chains and heavy chains are encoded by genes whose structures (in their expressed forms) correspond to the distinct domains in the protein. Introns are numbered I1 to I5.

Immunoglobulin G gene splicing
- Notes: the constant region has a number of exons. The V, D and J gene segments are rearranged to make receptors that recognize antigens. The variation from VDJ rearrangement is at the DNA level; joining to the constant exons is done at the RNA splicing level.
- Evolutionary consequence of alternative splicing?
- Ig gene segments in mammals are arranged in groups of variable (V), diversity (D), joining (J), and constant (C) exons.
- A leader sequence (L) at the beginning of each VH segment encodes a signal sequence which is used to transport the newly synthesized chains into the endoplasmic reticulum; it is not present in the final chain.

Combinatorial diversity is
- generated by the random formation of many different VJ (light chain) and VDJ (heavy chain) combinations.
- increased by the ability of any VH region to pair with any VL region to bind antigen.
- Note: random pairing of 320 different VL (light) with almost 11,000 different VH (heavy) results in about 3.5 × 10^6 different possible antibody specificities.

The genome
All the genes of an organism.

New terminology
- Transcriptome: all the transcribed genes expressed under a certain condition. There may be a larger number of different RNA molecules than genes, due to alternative splicing etc.
- Proteome: all the polypeptides produced in a certain cell or tissue. It may be larger than the transcriptome if more than 1 protein is produced from a single mRNA.
- Interactome: all the protein-protein interactions.

Genomes
- 1st completed genome was Haemophilus influenzae (1995).
- 1st eukaryotic genome was Saccharomyces cerevisiae (1996), a yeast.
- E. coli: 1997.

How did they sequence the human genome?
- Watson and Collins headed up the project.
- 20 sequencing centres from across the world.
- STS: sequence tagged sites.
- EST: expressed sequence tag.
- Both are used to help order cloned segments into contigs. Note: the same is done with BAC clones.
- Note: contig = contiguous, meaning uninterrupted. The clones overlap and are all put together into one long stretch, like bricks that are joined but are not one long brick.

Sequencing the human genome: summary (check out video)
- The entire genome is a composite.
- Different parts of the human genome were sequenced in different facilities.
- Sequenced from many different donors.
- A commercial effort to sequence the genome was also started by Craig Venter, who used his own DNA for most of the sequencing.

Genomic annotation
Gives us a listing of information about the location/function of genes or critical sequences.
Can be compared: HOW?
- Phenotypic function: effect of the gene product on the organism
- Cellular function: the metabolic processes or interactions of the gene product
- Molecular function: the activity of the gene product

Assigning gene function
- generating a linkage map
- comparative genomics: compare to other genes that have been sequenced
- if the sequence is similar, the genes are called homologs (function may not be the same)
- if sequence and function are similar, they are called orthologs (common ancestors)
- if the sequence and function are similar and in the same species, they are called paralogs (gene duplication)
- conserved gene order: synteny

What about individual genome variations?
- Polymorphisms: many versions of the wild-type allele (saw that last week)
- SNP: single nucleotide polymorphism
- Repetitive sequences: repeats in sequences

SNPs can be used for:
A. Forensics/paternity suits
B. Constructing phylogenetic trees (evolution)
C. Mapping disease genes

- SNPs: single nucleotide polymorphisms
- Haplotypes: SNPs inherited together
- tag SNPs: subset of SNPs that define an entire haplotype
- Which base was original? Look at an outgroup.

B) Genomic differences can be used to construct a phylogenetic tree
- Progesterone receptor protein changes over time.
- Note: human and chimp genomes only differ by 1.23%; human-to-human variation is 0.1%.
- Genome comparisons show evolution. Why have so many bases changed in humans? Accelerated evolution?

Genome comparisons can be used to locate a disease gene
- Linkage analysis: link the gene to a well-known SNP.
- In a genome-wide association study, both patients and non-patient controls for a particular disorder (such as heart disease, schizophrenia, or a single-gene disorder) are screened for SNPs across their genomes. Those SNPs that are statistically more frequently found in patients than in non-patients can be identified.
- Example: early-onset Alzheimer's. Marker 8/2.

Repetitive DNA sequences
The proportions of different sequence components vary in eukaryotic genomes.
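Returning to the genome-wide association study described above, the idea of "statistically more frequently found in patients" can be sketched with a simple allele-count comparison. Everything here is a toy assumption: the SNP names, the genotype counts, and the choice of a plain Pearson chi-square test on a 2×2 allele table (real studies test very many SNPs and therefore need a much stricter significance threshold than the single-test value used below).

```python
def allele_counts(genotypes, allele):
    """Count copies of `allele` versus all other alleles in diploid genotypes
    written as two-letter strings such as 'AG'."""
    hits = sum(g.count(allele) for g in genotypes)
    return hits, 2 * len(genotypes) - hits


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical genotype data for two SNPs: (tested allele, patients, controls).
snps = {
    "snp_toy_1": ("A",
                  ["AA"] * 30 + ["AG"] * 15 + ["GG"] * 5,
                  ["AA"] * 10 + ["AG"] * 20 + ["GG"] * 20),
    "snp_toy_2": ("C",
                  ["CC"] * 20 + ["CT"] * 20 + ["TT"] * 10,
                  ["CC"] * 19 + ["CT"] * 22 + ["TT"] * 9),
}

CRITICAL_1DF = 3.84  # chi-square cutoff for p < 0.05 with 1 df (single test)

results = {}
for name, (allele, patients, controls) in snps.items():
    a, b = allele_counts(patients, allele)
    c, d = allele_counts(controls, allele)
    results[name] = chi_square_2x2(a, b, c, d)
    flag = "associated" if results[name] > CRITICAL_1DF else "not associated"
    print(name, round(results[name], 2), flag)
```

In this toy data the first SNP's allele is clearly enriched in patients, while the second has the same allele frequency in both groups and is not flagged.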
The absolute content of nonrepetitive DNA increases with genome size but reaches a plateau at about 2 × 10^9 bp.

Genomic alterations occur by several mechanisms, not just SNPs
- Fusion
- Pseudogenes:
  - use related orthologous sequences in other species to identify function
  - a gene without an ORF or in a non-syntenic location is most likely a pseudogene
  - Note: a pseudogene arises when part of a gene is duplicated and then gets mutations.
- Mouse chromosome 1 has 21 segments between 1 and 25 Mb in length that are syntenic with regions corresponding to parts of six human chromosomes.

Organelles with their own DNA: mitochondria and chloroplasts
- Humans inherit mitochondrial DNA only from their mother (all mitochondrial genes come from mom): an example of non-Mendelian genetics.
- Mutations in mitochondrial DNA happen approx. 10 times faster than in nuclear DNA. (A handwritten note on what this can be used for is illegible.)
- In animals, DNA from the sperm enters the oocyte to form the male pronucleus in the fertilized egg, but all the mitochondria are provided by the oocyte.

What genes are coded on mitochondrial DNA?
- Note: not in prokaryotes?
- Mitochondrial genomes have genes encoding (mostly complex I-IV) proteins, rRNAs, and tRNAs.
- Notes: more energy demand means more copies of mitochondria. Not all proteins for the mitochondrion are encoded in mitochondrial DNA; some are encoded in the nucleus, translated in the cytoplasm, and transferred to the mitochondrion.

What genes are coded on chloroplast DNA?
- The chloroplast genome in land plants encodes 4 rRNAs, 30 tRNAs, and about 60 proteins.
- Similar to mitochondria but usually more genes.

Endosymbiosis
- Mitochondria and chloroplasts originated by an endosymbiotic event when a bacterium was captured by a eukaryotic cell.
- Note: chloroplasts evolved when a cyanobacterium was engulfed.

Centromeric DNA
- Notes: the DNA has a different density because of the ratio of GC vs. AT; more GC in centromeres means more dense. Short repeats = microsatellites.

Types of sequences in DNA
- Note: a larger part of the DNA is introns than exons.

Types of genes

In the end...
Humans have only about 20,000 protein-encoding genes
- Less than twice the number in a fruit fly
- Not many more than a nematode
- Less than a rice plant
- Note: less than 1.5% of human DNA is coding "exons". HOWEVER, many single-base variations in the human population can be used in forensics.

Human genome project
- Started in 1990
- Goal: determination of the complete nucleotide sequence for every chromosome in the human genome
- Cost: 3 billion dollars
- Two strategies used:
  - Hierarchical shotgun sequencing
  - Whole genome shotgun sequencing

Sequencing the first human genome (4.43 mins)
https://www.youtube.com/watch?v=ZQifx8BpaqE
Taken from http://www.ornl.gov/sci/techresources/Human_Genome/project/privatesector.shtml

Physical maps
- A constellation of overlapping DNA fragments that are ordered and oriented and span each of the chromosomes in the genome.
- Make chromosome-specific libraries by isolating chromosomes.
- Randomly shear or partially cut DNA with restriction enzymes and clone into vectors (bacterial artificial chromosomes (BACs); inserts average 200 kb).
- Overlapping clones are assembled into contigs, ideally one per chromosome.

Mapping regions of the genome in BAC genomic libraries
If carefully chosen to minimize overlap, it takes about 20,000 different BAC clones to contain the 3 billion pairs of bases of the human genome. In the BAC-based method, each BAC clone is "mapped" to determine where the DNA in BAC clones comes from in the human genome.
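The clone count quoted above can be checked with simple arithmetic. The sketch below uses only the figures given in these notes (3 billion base pairs, 200 kb average insert, about 20,000 clones); the variable names are made up, and treating every insert as exactly 200 kb is a simplifying assumption.

```python
import math

GENOME_BP = 3_000_000_000   # human genome size used in the notes
BAC_INSERT_BP = 200_000     # average BAC insert size (assumed uniform here)
QUOTED_CLONES = 20_000      # clone count quoted for a minimal-overlap set

# With no overlap at all, this many clones would tile the genome end to end.
min_clones = math.ceil(GENOME_BP / BAC_INSERT_BP)

# The quoted set clones more DNA than the genome holds; the excess is overlap.
total_cloned_bp = QUOTED_CLONES * BAC_INSERT_BP
redundancy = total_cloned_bp / GENOME_BP

# Average new (non-overlapping) sequence contributed by each clone.
unique_bp_per_clone = GENOME_BP // QUOTED_CLONES
overlap_bp_per_clone = BAC_INSERT_BP - unique_bp_per_clone

print(min_clones)                      # clones needed with zero overlap
print(round(redundancy, 2))            # fold-coverage of the quoted set
print(unique_bp_per_clone, overlap_bp_per_clone)
```

Under these assumptions, 15,000 clones would be the bare minimum, so a 20,000-clone set implies that on average about a quarter of each insert overlaps its neighbours. That overlap is what lets the clones be ordered into contigs.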
Using this approach ensures that scientists know both the precise location of the DNA letters that are sequenced from each clone and their spatial relation to sequenced human DNA in other BAC clones.
https://www.genome.gov/11006943/human-genome-project-completion-frequently-asked-questions/

Alignment of the genome with the genetic and physical map
Genes have positions on chromosomes, and researchers need to align the genes with the physical map containing pieces of genomic DNA.
http://www.yourgenome.org/facts/how-do-you-map-a-genome

Hierarchical shotgun sequencing
Figure: Hierarchical shotgun sequencing. Griffiths (2005), Introduction to Genetic Analysis.

Subcloning of the BAC library
For sequencing, each BAC clone is cut into still smaller fragments that are about 2,000 bases in length. These pieces are called "subclones." A "sequencing reaction" is carried out on these subclones. The products of the sequencing reaction are then loaded into the sequencing machine (sequencer). The sequencer generates about 500 to 800 base pairs of A, T, C and G from each sequencing reaction, so that each base is sequenced about 10 times. A computer then assembles these short sequences into contiguous stretches of sequence representing the human DNA in the BAC clone.
www.genome.gov/11006943/human-genome-project-completion-frequently-asked-questions/

Subcloning for sequencing
Figure: pUC18/19 vector map showing the universal primer sites, the DNA of interest, and the ampicillin resistance gene.
pUC18 and pUC19 are universal vectors used to take inserts that are subcloned from the BAC or YAC libraries. Selection depends on the presence of the ampicillin resistance gene.

Whole genome shotgun sequencing
Figure: Whole genome shotgun sequencing. Griffiths (2005), Introduction to Genetic Analysis.

The story of the human genome project includes interviews with Francis Collins and Dr. Craig Venter.

Video overview of lessons from the Human Genome project (7.26 mins.)
https://www.youtube.com/watch?v=qOW5e4BgEa4

The Human Race by PBS, Episode 3 (one hour).
https://www.youtube.com/watch?v=8YJjEtZX-r4

Cracking the Code of Life is a video about the original human genome sequencing project (2 hours, from PBS, 2001).
https://www.youtube.com/watch?v=ObGUes6c1eU
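The subcloning step described earlier in these notes (reads of about 500 to 800 bases, each base sequenced about 10 times, then computer assembly into contiguous stretches) can be sketched as follows. This is a toy illustration and not the software used by the project: the 650-base average read length is an assumed midpoint, the read sequences are invented, and the greedy merge strategy and the names `overlap` and `greedy_assemble` are hypothetical choices made for the example.

```python
import math

# Rough read count for one BAC at about 10-fold coverage.
BAC_INSERT_BP = 200_000
COVERAGE = 10
AVG_READ_BP = 650  # assumed midpoint of the 500 to 800 base range
reads_needed = math.ceil(BAC_INSERT_BP * COVERAGE / AVG_READ_BP)
print(reads_needed)


def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`
    (at least `min_len` bases), or 0 if there is none."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap until
    no overlaps remain; returns the resulting contigs."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join: remaining reads are separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads


# Toy reads cut from one short made-up sequence, given out of order.
toy_reads = ["TTAGCCTAGGA", "ATGGCGTACG", "GTACGTTAGC"]
contigs = greedy_assemble(toy_reads)
print(contigs)
```

A greedy merge is easy to follow but can be misled by repeated sequences, which is one reason the hierarchical approach first maps each BAC to a known location before its reads are assembled.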