Summary

This document is a set of lecture slides by Zsolt Fábián giving an overview of genes and genomes. It covers the identification of the genetic material, protein-coding and non-coding genes, the organization of prokaryotic, human nuclear, and human mitochondrial genomes, repetitive DNA sequences (satellites and interspersed repeats), gene families, and human genetic variation.

Full Transcript

Genes & genomes
Zsolt Fábián M.D., Ph.D., Dr. Habil.

Identification of the genetic material – Griffith & Avery experiments
Craig et al., Molecular Biology - Principles of Genome Function, 2e, Oxford University Press, (2014)

Identification of the genetic material – Hershey-Chase experiment
Craig et al. (2014)

Genes in the genome
The genome is the entire DNA content of a cell.
Genes code for one or more specific proteins or RNAs that have structural, catalytic, or regulatory activities.
Protein-coding genes
– Escherichia coli: ~4300 genes in a 4.6 Mbp genome
– Homo sapiens: only about 20,000 protein-coding genes (each coding for an mRNA that is translated into a protein) in a 3547 Mbp genome
In eukaryotes, genes make up only a fraction of the genome.
Non-protein-coding transcribed genes
– Escherichia coli: only ~90 small, non-coding genes
– Humans: 23,429 genes known (so far) that code for RNAs that are not translated into proteins; these genes encode functional RNAs
Repeated sequences are common (about 60-73% of the genome).
Some of these genomic regions seem structural, others are more “random”.

Genes in the genome: other “non-protein coding” DNA sequences
1. Regulatory DNA base sequences
– Signals defining the start or end of a gene
– Signals influencing transcription & translation
– Initiation points for DNA replication
2. Introns (“intervening sequences”): typical of modular use of eukaryotic genes
3. DNA with unknown function
– Repetitive DNA (not found in bacteria)
– Unique sequences of different lengths

Eukaryotic vs. prokaryotic protein-coding genes
Prokaryotic genes are arranged in closely spaced clusters.
Eukaryotic genes have introns; prokaryotic genes do NOT.
Figure (panels 1-3): regulatory regions are shown in green.

Prokaryotic genome
Prokaryotic genomes are about 1/1000 the size of the human nuclear genome.
Genes are densely arranged.
Smaller circular prokaryotic DNAs are called plasmids
– Circular DNA, only 4000-24,000 bp in size
– Replicate autonomously
– Horizontal gene transfer in bacteria

Human nuclear genome
3.1 × 10^9 bp per genome
19,000-20,000 protein-coding genes
Genes are sparse (1 gene/100,000 bp)
DNA is packaged up with histone proteins
Linear DNA organized into discrete chromosomes
23 homologous pairs of chromosomes, for a total of 46 chromosomes, in a diploid (2n) cell (most cells of the body): 22 autosomal pairs plus a sex pair that is XX or XY

Homologous pair of chromosome 8
Chromosomes only “condense” and become visible right before a nucleus divides; all packaged up
Each chromosome has been replicated = each chromosome has 2 arms/legs
Figure labels: paternal chromosome, maternal chromosome

Chromosome territories
Craig et al. (2014)

Human mitochondrial genome
Circular DNA
> 10 copies per mitochondrion
> 1000 copies per cell
16,500 bp per genome
37 genes encoded
Gene-rich (1 gene/445 bp)
Inherited maternally
Lacks histones

Genes and genomes
Craig et al. (2014)

Satellite DNA
Britten & Kohne, Repeated Sequences in DNA. Science, (1968) 161(3841), 529–540

Satellite DNA
Classification of satellite DNA based on length of repeat
– Satellite: 171 to 68 bp repeats extending over millions of bp. Found in centromeres where spindles attach (kinetochore); seem structural
– Minisatellite: 6-64 bp repeats, with a highly variable total repeat size (polymorphic). Used as DNA markers in DNA fingerprinting & allele tracking. Telomeric repeats fall into this category; these are structural
– Microsatellite: also known as Short Tandem Repeats (STRs) or Simple Sequence Repeats (SSRs). 2-, 3- or 4-bp units, with a highly variable total repeat size (polymorphic)
– Usually a million copies scattered around the genome; 13% of the nuclear genome
Alu elements:
– Most abundant sequence in the human genome
– Often play a role in unequal crossing over, leading to chromosomal abnormalities
Figure: Alu elements labeled green with a DNA probe

Satellite DNA - Interspersed repeats
Long Interspersed Nuclear Elements (LINE)
– Most are ~6000 bp long, = 21% of the nuclear genome; 10X longer than the SINEs
– Only 0.1% to 0.5% of these remain active and are the source of the famous “jumping genes”
– Class I uses a "copy and paste" mechanism
– LINEs include genes for reverse transcriptase and an endonuclease (needed for reintegration), hence their large size
Class II transposons (~3% of genome) are smaller
– Use a "cut and paste" mechanism
– Require only “transposase”

Repetitive sequences
Doggett, Overview of Human Repetitive DNA Sequences. Current Protocols in Human Genetics (2001) A.1B.1-A.1B.5

Repetitive sequences – tandem genes
Figure: gene, spacer, gene, spacer, gene along dsDNA
A few genes are repeated verbatim in the genome.
These comprise only a very small fraction of all the repetitive DNA.
Many other genes are present in multiple copies that differ slightly from each other, and so are considered to be gene families, e.g. the olfactory receptor gene family (>900 related genes).
Doggett (2001)

Gene families
Figure: α globin family (chr 16) and β globin family (chr 11), with genes grouped by stage (embryo, fetus, adult). Gene labels in the figure: ζ, α1, α2, ε, Gγ, Aγ, δ, β, and ψ pseudogenes.
Doggett (2001)

Human genetic variation
Human genomic sequences average ~0.1% variation
– ~3 million base pairs differ out of 3 billion base pairs
– Individuals carry ~10 unique ‘sequence variants’
– Variations in sequence arise by mutation and then spread throughout the population during evolution, causing genetic “polymorphism”
– Alleles are different versions of an identified gene; they normally differ by only a small number of nucleotides
– “Wild-type allele” = the most common allele for a gene within a population
– “A polymorphism” = an allele that is less common than the wild-type allele, but occurs more than 1% of the time
– “Variant”: generally just refers to a change in DNA sequence (that may or may not produce changes in observed features)
– “Rare variant”: found at a frequency of < 1%

Detection of nucleic acids
Craig et al. (2014)

DNA fingerprinting using minisatellites
Figure: fragment length for Mother, Father, Child 1, Child 2, Child 3, Child 4

Key points
Genome sizes vary
There is no close correlation between genome size and complexity of species
Prokaryotic genomes are more genic
Genomes have repetitive sequences
– Based on the size of the repeats they could be: microsatellites, minisatellites
– Based on their location they could be: tandem repeats, interspersed repeats
Satellite DNA is used for DNA profiling
Coding sequences can be arranged as tandem repeats or interspersed genes
– Based on their sequences they could belong to families
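
The gene-density figures quoted on the genome slides (1 gene/100,000 bp for the human nuclear genome, 1 gene/445 bp for the mitochondrial genome) can be checked with simple division. The following is a minimal Python sketch and not part of the lecture: the names GENOMES, bp_per_gene, and density_table are hypothetical, and the inputs are the genome sizes and gene counts stated on the slides (assuming 3.1 × 10^9 bp and 20,000 protein-coding genes for the human nuclear genome).

```python
# Genome sizes and gene counts as quoted on the slides.
# All names here are illustrative and not taken from the lecture.
GENOMES = {
    "E. coli": {"size_bp": 4_600_000, "genes": 4300},
    "Human nuclear": {"size_bp": 3_100_000_000, "genes": 20_000},
    "Human mitochondrial": {"size_bp": 16_500, "genes": 37},
}


def bp_per_gene(size_bp: int, genes: int) -> float:
    """Average number of base pairs per gene (lower means gene-denser)."""
    if genes <= 0:
        raise ValueError("genes must be positive")
    return size_bp / genes


def density_table(genomes: dict) -> dict:
    """Map each genome name to its rounded base pairs per gene."""
    return {
        name: round(bp_per_gene(g["size_bp"], g["genes"]))
        for name, g in genomes.items()
    }


if __name__ == "__main__":
    for name, value in density_table(GENOMES).items():
        print(f"{name}: ~1 gene per {value:,} bp")
```

With these inputs the sketch gives roughly 1 gene per 1,070 bp for E. coli, 155,000 bp for the human nuclear genome, and 446 bp for the mitochondrial genome. These agree with the slides' rounded figures in order of magnitude and illustrate the key point that prokaryotic genomes are more gene-dense than the human nuclear genome.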
