Human Genetic Diversity & Genomics Lecture Notes - PDF

Summary

These lecture notes cover various topics in human genetic diversity and genomics, including the types of genetic variation, single nucleotide polymorphisms, and copy number variants. They explore mutations, disease penetrance, gene expression, and genetic diversity between humans. The notes also discuss genomics, genetic testing, personalized medicine, and therapeutic approaches such as recombinant DNA and genome editing.

Full Transcript

301 Human Genetic Diversity
I. Human Diversity
  a. Genetically speaking, humans are very similar
    i. ~0.1% different
II. Polymorphisms
  a. Any common genetic variation within a population
    i. Occurs in at least 1% of the population
    ii. This is the arbitrary threshold that we use to determine what is common
  b. Explains a very large portion of human phenotypic variation
  c. Types of Polymorphisms (3 types)
    i. Single nucleotide polymorphisms (SNP)
      1. Difference of 1 nucleotide amongst individuals
      2. Most common human genetic variant
      3. Contributes to major human traits
    ii. Insertion-Deletion (indel)
      a. The insertion or deletion of nucleotides in DNA
      b. Some single nucleotide indels are referred to as SNPs
    iii. Copy number variants (CNV)
      1. Repeating sequences of DNA
        a. Caused by duplication (expansion) or deletion (contraction) of regions of DNA
        b. Most repeat units are between 2 and 7 bp in length
          i. But can be repeated hundreds of times
        c. E.g. microsatellites, trinucleotide repeats
      2. Structural variant
        a. Affects a considerable number of DNA bases
        b. Roughly 5-10% of human genetic variations are classified as CNV
        c. May or may not play a role in phenotype expression
III. Mutations
  a. Uncommon or rare genetic variants within a population
    i. Less than 1% of a population
  b. Types of mutations
    i. Point mutation
      1. Difference of 1 nucleotide amongst individuals
      2. Include synonymous, non-synonymous, and nonsense mutations
        a. Synonymous → A DNA mutation that changes a codon but does not change the amino acid it codes for (aka silent mutation)
        b. Non-synonymous → A DNA mutation that changes the codon and does change the amino acid coded for. Includes missense (one amino acid changes) and nonsense (codon becomes a stop codon) mutations
        c. Nonsense → A mutation that changes a codon into a stop codon, causing the protein to be cut short (truncated)
      3. Many diseases result from point mutations
        a. Not all point mutations result in disease
          i. Some point mutations are harmless
          ii. 
Point mutation → A change in a single nucleotide in the DNA sequence. Can be silent, missense, or nonsense
    ii. Indels
      1. Include frameshift mutations
    iii. Copy number variants (CNV)
      1. Trinucleotide repeat expansion (Huntington’s)
      2. Gene duplications (increased copy number)
IV. Polymorphism vs Mutations
  a. The difference between what is considered a polymorphism and what is considered a mutation is the frequency at which the variant occurs within a population
    i. More than 1% = Polymorphism; common DNA variation
    ii. Less than 1% = Mutation; uncommon DNA variation
  b. May vary amongst different populations
    i. Sickle Cell Allele
      1. Polymorphism in Black American populations
        a. 9% of population are carriers
      2. Mutation in Caucasian American populations
        a. 0.2% of population are carriers
    ii. Sometimes disease-causing point mutations are referred to as SNPs within human populations to avoid the stigma of labeling an affected individual as a mutant
      1. “Mutant” is a term that should only be applied to those with blue eyes
  c. Polymorphisms and mutations likely had similar origins
    i. They started as mutations
      1. Mutations that conferred an advantage, or at least were not detrimental, were held within the population and eventually became polymorphisms
      2. Mutations are rarely beneficial
V. Single Nucleotide Polymorphisms and Gene Expression
  a. Coding SNPs
    i. Occur within the coding sequence of a gene (the protein-coding region)
      1. Non-synonymous coding SNPs
        a. Genetic variant that occurs within the reading frame and affects protein sequence
          i. Includes variants within an intron that affect splicing
      2. Synonymous coding SNPs
        a. Genetic variant that occurs within the reading frame, but does not affect protein sequence
  b. Non-coding SNPs
    i. Do NOT occur within the coding sequence of a gene, so they do not affect protein sequence
      1. May affect protein expression
        a. i.e. occur in a functional region of DNA other than the coding region
          i. e.g. promoter, regulatory region, ribosomal binding site
    ii. 
May occur in a functionless region of DNA
    iii. May alter transcription factor binding or promoter activity
VI. Disease Penetrance
  a. Proportion of individuals within a population who carry a disease-causing allele and express the disease phenotype
    i. i.e. Likelihood of showing disease symptoms based on DNA sequence
    ii. Note: Multiple genes/alleles may contribute to a disease. All of these can affect disease penetrance (more in 304)
  b. Complete Penetrance
    i. If you have the disease-causing allele, you WILL show symptoms of the disease
    ii. Many (not all) single gene disorders have complete penetrance
  c. Incomplete Penetrance
    i. If you have the disease-causing allele, you might show symptoms of the disease
      1. High penetrance: Most people with the disease-causing allele will exhibit symptoms of the disease
      2. Low penetrance: Most people with the disease-causing allele will not exhibit symptoms of the disease
    ii. Additional factors, including additional disease-causing alleles and the environment, may play a role in expressing disease symptoms (more in 304)
  d. Some disorders, notably trinucleotide repeat expansion disorders, can have incomplete penetrance
    i. E.g. Huntington’s disease
      1. ≥ 40 CAG repeats in PolyQ tract results in complete penetrance
        a. i.e. you will show symptoms of the disease
      2. Between 36 and 39 CAG repeats in PolyQ tract results in incomplete penetrance (27 to 35 repeats is usually classed as an intermediate range that rarely causes symptoms but can expand in later generations)
        a. i.e. you may or may not show symptoms of the disease
          i. The more repeats, the more likely you will show symptoms
VII. Disease Expressivity
  a. Extent of clinical symptoms (i.e. disease phenotypes) based on genotype
    i. i.e. How severely the disease symptoms will manifest based on DNA sequence
    ii. Note: Multiple genes / alleles may contribute to a disease. All of these can affect disease expressivity
  b. High expressivity: Allele is associated with severe symptoms of the disease
  c. Low expressivity: Allele is associated with mild symptoms of the disease
  d. 
Other factors, including additional disease-causing alleles and the environment, may play a role in the severity of disease symptoms (more in 304)
  e. Some disorders that exhibit complete penetrance can still show some variable expressivity
    i. E.g. Cystic Fibrosis
      1. Many Cystic Fibrosis-causing alleles exist, but they all result in showing symptoms of the disease
      2. Individuals with different disease-causing variants may show variable expressivity
        a. i.e. some disease-causing variants result in less severe symptoms
VIII. Disease Penetrance and Expressivity
  a. Both affected by
    i. Quantity of disease-causing variants in genome
      1. The more disease-causing variants in the genome, the higher the likelihood of exhibiting more severe symptoms
    ii. Quality of disease-causing variants in genome
      1. Some disease-causing variants will have a greater effect on the individual than other disease-causing variants
    iii. Environment and lifestyle
    iv. Epigenetics
IX. Inherited cancers will always exhibit incomplete penetrance and variable expressivity
  a. Familial Retinoblastoma
    i. Incomplete Penetrance
      1. 90% of carriers will develop ocular cancer (high penetrance)
    ii. Variable Expressivity
      1. Symptoms depend on the quality and quantity of cancer-causing mutations
      2. Some disease-causing variants result in reduced expressivity
  b. Lynch Syndrome (inherited cancer that results from defective mismatch repair)
    i. Incomplete Penetrance
      1. 80% of patients with a disease-causing allele will develop colon cancer
    ii. Variable Expressivity
      1. Symptoms depend on the quality and quantity of cancer-causing mutations
  c. BRCA1 and BRCA2-type Breast Cancer
    i. Incomplete Penetrance
      1. Sex-dependent Penetrance
        a. Females: 65% for BRCA1; 50% for BRCA2
        b. Males: 1.2% for BRCA1; 9% for BRCA2
    ii. Variable Expressivity
      1. Symptoms depend on the quality and quantity of cancer-causing mutations
X. Evolution and Disease
  a. Hardy-Weinberg Equilibrium
    i. 
Predicts that both genotype and allele frequencies will remain constant from generation to generation because they are in equilibrium (for two alleles with frequencies p and q, the genotype frequencies are p² + 2pq + q² = 1)
    ii. Applies if mating is random in a large population and there are no disruptive circumstances
      1. Unfortunately for Hardy and Weinberg, there are a lot of disruptive circumstances
  b. Inherited Recessive Disorders
    i. Need two alleles to exhibit symptoms, so a disease-causing allele can be held within a population due to little to no selective pressure against it
    ii. May take many generations before disease manifestation
  c. Inherited Dominant Disorders
    i. Many are late onset (e.g. Huntington’s disease)
      1. Symptoms do not appear until after the individual has offspring
      2. Results in higher likelihood of disease transmission
  d. Disruptive Circumstances
    i. Random mutations, genetic drift, gene flow, bottlenecking, inbreeding, environment, heterozygote advantage
  e. Heterozygote Advantage
    i. Potential benefit to being carriers of recessive disorders
      1. Only very rare situations result in a heterozygote advantage
    ii. E.g. Sickle-Cell carriers are resistant to malaria and do not exhibit symptoms of sickle cell disease
      1. Affected individuals show symptoms associated with sickle cell disease
      2. Healthy non-carriers are susceptible to malaria
    iii. E.g. Cystic Fibrosis carriers are thought to be resistant to cholera and typhoid
      1. These are no longer issues in the modern world, so now genetic drift (i.e. random chance) is the likely cause of the prevalence of this disorder
XI. Genetic Diversity between Humans and Non-Humans
  a. A lot of genes are conserved across all forms of life
    i. Due to overlap of biological and biochemical functions in nature
  b. Most genetic variations between species are large structural changes
    i. E.g. Number of chromosomes, number of genes, DNA sequence, translocations
  c. Within the same species, most genetic variation results from smaller changes in DNA sequences (e.g. SNPs)
    i. Typically established after speciation (e.g. after we branched off from the great apes)
XII. 
SNPs and Disease Risks at the population level
  a. Some SNPs associated with diseases are more prevalent in some populations
    i. E.g. Sickle Cell allele is much more common in sub-Saharan African populations than anywhere else
    ii. E.g. Hereditary thrombophilia alleles are much more common in Caucasians than in any other population
    iii. E.g. Fumarase Deficiency
      1. Some Mormon populations are at an astoundingly high risk of fumarase deficiency
    iv. E.g. Tay-Sachs
      1. Ashkenazi Jewish populations are over 10-fold more likely to be carriers
    v. E.g. BRCA1 / BRCA2 Cancers
      1. Ashkenazi Jewish populations are over 10-fold more likely to be carriers
    vi. E.g. Type 2 Diabetes
      1. Pima Indians are at a higher risk for T2D than other populations
        a. 38% of Pima Indians will develop Type 2 Diabetes
        b. Note: All non-Indigenous American populations are at a similar risk for T2D (8% – 11%)

302 Genomics I - Introduction and Genetic Maps
I. Genomics
  a. Branch of science focused on aspects of the genome
    i. Structure
    ii. Function
    iii. Evolution
    iv. Mapping
    v. Editing
  b. Interdisciplinary Field of Science
    i. Many branches of science are interested in genomics
      1. Including biochemistry (BCH), Genetics, Medicine
II. Genomics vs. Genetics
  a. Genetics: Study of individual genes and their role in inheritance
  b. Genomics: Study of all genes simultaneously
    i. Requires sequencing of genomes
    ii. Data analyzed using bioinformatics
    iii. Can be used to study genetics
III. Intragenomic
  a. Study of a smaller, specific fraction of the genome
    i. Smaller fractions of the genome (e.g. SNPs, Exons)
IV. Types of Genomics (4 types)
  a. Functional genomics (Aspects of gene expression, but not epigenetics)
  b. Epigenomics (Epigenetics)
    i. E.g. ATAC-seq, MethylC-seq
  c. Metagenomics (Collective genomes from an ecosystem)
  d. Structural Genomics (Protein Structure)
V. Applications of genomics include improving medicine (e.g. Pharmacogenomics)
VI. Studying Genomics
  a. Requires a sequenced genome
    i. 
Sanger Sequencing (or Chain Termination Method)
      1. Note: This is typically used to sequence a relatively small fragment of DNA
      2. Applied PCR technique
        a. Only one strand is synthesized per reaction
      3. Fluorescent dideoxynucleotide triphosphates (ddNTPs)
        a. ddNTPs lack a 3’ OH group
          i. Cannot form a 3’ to 5’ phosphodiester bond
          ii. Results in chain termination (i.e. no more DNA synthesis for that strand of DNA)
        b. Fluorescent ddNTPs emit a different color that is specific to each base
          i. E.g. ddATP is green
        c. Together, this results in DNA fragments shorter than full length that emit a color dependent on which ddNTP was added
      4. DNA fragment sizes range from 1 nucleotide up to roughly 700 nucleotides
      5. DNA fragments are sorted by size (via capillary gel electrophoresis)
        a. While being run on the gel, a laser causes the ddNTP to fluoresce, a detector reads the color, and a computer decodes the color to the corresponding ddNTP
        b. The order in which the ddNTPs are read by the detector corresponds to the sequence of the DNA bases in the sequenced strand
      6. If you see chain-terminating nucleotides, it is Sanger sequencing
    ii. Next Generation Sequencing (high-throughput sequencing; e.g. sequencing by Illumina)
      1. Sequencing by synthesis
        a. PCR-based technique, but does not rely on permanent chain termination
      2. Can sequence many DNA templates simultaneously
      3. Relatively low cost per sequenced fragment
      4. Can be used to sequence RNA (after conversion to cDNA, of course)
      5. Requires construction of a genomic library
      6. May show bias against DNA fragments in low abundance
      7. Can read, but not replicate, modified bases
        a. i.e. can’t be used to study the epigenome without additional sample prep (e.g. bisulfite conversion to study 5-methylcytosine)
      8. Can be used to study the entire genome
      9. Can be used to study a sub-section of the genome
        a. E.g. DNA bound to proteins
      10. Can be used to study RNA
        a. All RNA in an organism
        b. Subset of RNA
          i. E.g. 
What is currently being transcribed, what is currently being translated, RNA bound to proteins
  b. If we can isolate a sub-fraction of the genome, we can analyze it on a global basis
    i. E.g. We can use ChIP-seq to study DNA bound to a specific protein
    ii. E.g. We can use ATAC-seq to study chromatin structure
VII. What can we do with a sequenced genome
  a. Predict genes
    i. There are several functional DNA elements that can predict the presence of a gene
      1. E.g. Promoters, transcription and translation start and stop sites, ribosome binding sites, splice sites, and more
    ii. Once a gene is predicted, we can predict the sequence of the encoded protein
      1. All it takes is a codon chart
    iii. Once the sequence of the protein is predicted, we can predict both protein structure and function
    iv. While predictions are useful, we still must verify them using genetic, biochemical, and molecular biological techniques
  b. Alignment
    i. We can locate sequence homology (i.e. regions of DNA that are very similar or even identical between two genomes)
      1. Plays a major role in identifying potential disease-causing variants
  c. Assembly
    i. We can build entire genomes from sequenced DNA
    ii. We can build partial genomes from sequenced DNA
      1. Such as genes too large to be sequenced via Sanger sequencing
VIII. Sequenced genomes are often stored in databases so that they are easily accessible for studies
  a. Thousands of human genomes are stored in these databases
    i. Differences between genomes can tell us a lot about genetics, and allow us to focus our efforts on those differences
IX. Mapping the Genome
  a. Cytogenetic Maps
    i. Genes are mapped relative to band location on chromosomes
      1. Requires staining (e.g. Giemsa staining)
    ii. Chromosomes are differentiated based on banding pattern, size, and centromere location
  b. Linkage Map
    i. Genes are mapped relative to each other
      1. Relies on genetic linkage, so actual distance is not measured
    ii. Uses genetic markers
      1. E.g. 
DNA segments that are found at a specific site and can be uniquely recognized (e.g. known genes, SNPs, CNV, etc.)
    iii. Lower recombination frequency correlates with closer genetic markers
    iv. Quantitative Trait Locus (QTL) Mapping
      1. Region of DNA associated with a particular trait
        a. Since many genes often contribute to one trait, multiple QTL may be, and often are, found on multiple chromosomes
      2. Often used to find regions of DNA associated with a disease
        a. Cannot identify the specific disease-causing gene
          i. Other methods are necessary (e.g. DNA sequencing, molecular cloning)
      3. Helps find which regions of the genome are tied to a specific physical trait (height, eye color, disease risk)
  c. Physical Map
    i. Genes are mapped relative to each other by physical distance
      1. Measured in base pairs
      2. Does not rely on genetic linkage, but may improve linkage maps by identifying new genetic variants
    ii. Genomic Sequences
      1. Shotgun Sequencing
        a. Known template is not required
        b. Sequenced DNA fragments are aligned to each other to form a larger DNA sequence called a contig (contiguous sequence)
          i. Contigs can also be aligned to other contigs to form even larger sequences. This continues until the genome or partial genome is put together
        c. Significantly more difficult and work intensive than aligning to a known genome
      2. Alignment
        a. Known template required
        b. DNA sequences are aligned to the known template
        c. Decreases time and resources needed to construct a full genome
        d. Can determine genetic variants
X. Transcriptome
  a. All RNA in the cell / tissue
    i. Several important applications
      1. Comparing different cell types (e.g. liver vs. skeletal muscle)
      2. Healthy vs. diseased cells
        a. Including tumor profiling
      3. Development
      4. Studying metabolic pathways
      5. Changes in response to environment (e.g. hormones, toxins, etc.)
XI. Proteome
  a. All protein in a cell
  b. Proteins are more complex than DNA and RNA
    i. E.g. proteins consist of 20 amino acids whereas DNA consists of 4 nucleotides
    ii. E.g. 
several amino acids will be modified on a given protein, which will affect analysis, whereas modified DNA does not interfere with analysis
    iii. E.g. a lot of different proteins are expressed with different sequences and at different levels, whereas the genome remains constant (barring rare mutations)
  c. Requires mass spectrometry (Mass Spec)
  d. Proteome can tell us things that DNA sequencing cannot
    i. E.g. Alternative splicing and RNA editing (Note: These processes can be determined by RNA sequencing)
  e. Proteome can tell us things that DNA and RNA sequencing can’t
    i. E.g. Post-Translational Modification

303 Genomics II - Analyses and Genetic Testing
I. Why analyze genomes
  a. To help identify disease-causing variants
    i. May lead to more effective treatment and prevention
  b. All diseases have a genetic component to them
    i. Not all genetic diseases are heritable
    ii. Not all genetic diseases have known causative alleles
    iii. Some known genetic components are poorly understood
  c. The more we know about the causative agents of genetic diseases, the better we can predict whether an individual will develop a disease and predict the course of disease symptoms
II. Background info for Linkage Analysis
  a. Linkage Equilibrium
    i. Alleles are randomly assorted
      1. Follows the law of independent assortment
    ii. Most human genes fall under linkage equilibrium
  b. Linkage Disequilibrium
    i. Non-random association of alleles
      1. Does not follow the law of independent assortment
    ii. Influenced by genetic linkage and other factors (e.g. population structure)
    iii. Nearby genetic variants tend to be inherited together
  c. QTL Mapping: Usage of linkage maps to identify relative locations of a gene that contributes to a disease
III. Linkage Analysis
  a. Aimed at establishing linkage between genes and rare disorders
    i. Focuses on families with a history of the disorder
  b. Gene hunting: Identify QTL related to phenotypes
    i. Additional experiments required to determine the actual disease-causing gene
  c. Typically used in familial studies
    i. 
Compare diseased family members to non-diseased family members
    ii. Good for identifying QTL of rare inheritable diseases
  d. Pros:
    i. Can be used to help identify disease-causing variants
    ii. Does not require a known biochemical (BCH) defect
    iii. Does not require a known gene
    iv. Does not require knowing how many genes / alleles are involved in disease
    v. Helped identify the causative gene in both Cystic Fibrosis and Huntington’s Disease
  e. Cons:
    i. Additional experiments required to find the actual disease-causing gene
    ii. Not very good for common disorders
IV. Background info for HapMap and GWAS
V. Haplotypes
  a. DNA polymorphisms that tend to be inherited together from a single parent
  b. Familial Haplotypes
    i. Survive several generations of reproduction with little to no change
      1. E.g. Most of the Y chromosome in men rarely undergoes genetic changes
  c. Non-Familial Haplotypes
    i. Polymorphisms that are statistically associated within a non-familial population
    ii. Analysis of non-familial haplotypes can be used to study common diseases
VI. Using non-familial haplotypes to find disease-causing variants
  a. HapMap Project
    i. Looked at pre-specified regions of the genome
      1. Notably those with known SNPs
    ii. SNPs present in patients with a disease compared to healthy patients were marked as potential disease-causing variants
    iii. Good study, but not great
      1. Technology made HapMap obsolete
  b. Genome Wide Association Studies (GWAS)
    i. Compare SNPs of diseased and non-diseased individuals to determine disease-causing variants
    ii. Focused on known SNPs, similar to HapMap
      1. Used SNP arrays to analyze results
        a. Allows us to focus on all known SNPs instead of those in prespecified areas
      2. If a SNP was statistically associated with the disease, the SNP was said to be associated with disease (i.e. annotated as disease risk)
    iii. Identified several disease-associated alleles (including those involved in type 2 diabetes and schizophrenia)
      1. 
Also identified many false positives due to poor experimental design and poor analyses
        a. i.e. many non-disease-causing SNPs were linked to diseases
    iv. Early population-based studies had very little, if any, diversity
    v. Most studies are better now, but some still have issues
    vi. Finds common variants with small effects, not rare ones
VII. Exome Aggregation Consortium (ExAC)
  a. Sequence and compare protein-coding regions of diseased and non-diseased individuals to determine disease-causing alleles
  b. Identified many disease-causing alleles
    i. Also “fixed” many false positives identified by GWAS
  c. Since this focuses on exons, these studies are not very useful for disorders caused by non-coding SNPs
VIII. Issue with Genome-Wide Analyses
  a. Additional work needed to differentiate germline and somatic mutations
IX. Genetic Testing
  a. Diagnostic Tests
    i. Determines presence or absence of disease based on disease symptoms
      1. Typically, yes or no
    ii. Can be used to plan treatment and monitor treatment efficacy
  b. Prognostic Tests
    i. Determines likelihood of a patient developing a disease
  c. Types of Genetic Tests
    i. Cytogenetic testing (diagnostic)
      1. Number and appearance of chromosomes
      2. Big picture
        a. E.g. used to diagnose aneuploidy (e.g. trisomy disorders)
    ii. Molecular Testing
      1. PCR or DNA sequencing to identify presence of disease-causing variants
    iii. Biochemical Testing
      1. Looks at protein translation
        a. Synthesis, concentration, structure, and/or function of protein
        b. Looks at actual protein levels and function → trying to see the result of a gene being silenced (if less protein is made, this is the test that shows it)
        c. Checks the product
    iv. Prenatal Screening and Newborn Screening
      1. Several cytogenetic, molecular, and biochemical tests
      2. Prenatal screening carried out during 1st and 2nd trimesters
      3. Newborn screening typically occurs within hours after birth
    v. Carrier testing: Identifies if a person is a carrier of a recessive disorder
      1. 
Used to determine if a person is a carrier for a specific autosomal recessive disorder
      2. Typically used by parents who have a family history of a recessive disorder
        a. Explores likelihood of offspring inheriting the disease

304 Non-Mendelian Genetics I - Polygenic Inheritance and Multifactorial Traits
I. Polygenic Inheritance
  a. Range of phenotypes
    i. Phenotype can be quantified in some way
  b. Multiple genes contribute to one phenotype
    i. Gene products have an additive effect towards phenotype
    ii. Effects of the genes are cumulative
  c. Non-Mendelian Inheritance
    i. No dominant or recessive
    ii. Parents’ phenotype may or may not be passed down to offspring
  d. Studied at population level
    i. No two individuals can account for all genetic variants that contribute to phenotype
      1. Multiple genes and multiple alleles contribute to phenotype
  e. E.g. Skin Tone
    i. Estimated 378 genetic loci are involved in determining skin tone
      1. Almost 170 known genes play a role in skin tone
      2. Some have major effects; others have minor effects on skin tone
    ii. Skin tone is a result of melanin
      1. Combination of eumelanin (black / brown pigment) and pheomelanin (pink / red pigment)
      2. MC1R partially dictates the ratio of melanin produced
        a. Different alleles result in “more active” or “less active” MC1R
        b. Highly active MC1R protein results in more eumelanin
        c. Less active MC1R protein results in more pheomelanin
      3. Skin tone determined by both concentrations and ratios of each type of melanin
        a. Several additional proteins are involved in melanin synthesis
          i. Including signaling molecules, transcription factors, and melanin biosynthesis genes
        b. Each of the proteins involved in melanin synthesis plays a role in both relative and absolute concentrations
      4. Non-genetic elements also affect skin tone
        a. Sunlight exposure
      5. Albinism is caused by lack of melanin synthesis
        a. i.e. neither eumelanin nor pheomelanin is produced
        b. 
Usually results from a defect in melanin synthesis genes or melanosome upkeep genes
II. Multifactorial traits
  a. Multiple factors contribute to phenotype
    i. Multiple genes, alleles, environments, and epigenetics work together to create a spectrum of phenotypes
    ii. Majority of human phenotypes are multifactorial
      1. Including most common disorders
  b. Very difficult to categorize many multifactorial traits
    i. As the number of factors involved in a phenotype increases, the range of phenotypes grows and the trait becomes more difficult to categorize
  c. Cannot use Punnett squares to predict inheritance
    i. Too many genes and other factors involved
    ii. Must use genome-wide studies (e.g. GWAS)
  d. Continuous traits
    i. No obvious boundaries between phenotypes
      1. E.g. height, weight, blood pressure, skin tone
  e. Categorical traits
    i. Obvious boundaries between phenotypes
      1. E.g. number of ridges in fingerprint, eye color
  f. Threshold traits
    i. Few phenotypes, but a sharp contrast between them
    ii. Multiple genes, environments, and epigenetics work together to reach a threshold
      1. Phenotype is only displayed at or after this threshold is reached
        a. May have some gray area between affected and unaffected
          i. E.g. Pre-diabetic and Type 2 diabetes; Neurodivergent and Autism
    iii. Many human diseases fall into this category
      1. Including cancer, Type 2 diabetes, and Autism
  g. Non-Mendelian
  h. Usually, no obvious pattern of inheritance or phenotypic development
III. Phenotypic Variance (Vp)
  a. Variance in a phenotype within populations
  b. Dependent on genetic variation (Vg) and environmental variation (Ve); in the simplest model, Vp = Vg + Ve
    i. Genotypic variance (Vg): Amount of phenotypic variance due to genetic variation
    ii. Environmental variance (Ve): Amount of phenotypic variance due to environmental factors
    iii. High Vg and low Ve mean genetics plays a much larger role in phenotypic variance
    iv. Low Vg and high Ve mean the environment plays a much larger role in phenotypic variance
    v. 
Phenotypic variance can depend solely on environment
IV. Heritability
  a. Amount of phenotypic variation within a population due to genetics (i.e. the fraction Vg / Vp)
  b. Scale 0-1
    i. 1 = All genetic
    ii. 0 = All environmental
    iii. Almost all human traits fall in between these extremes
V. Genotype-Environment Association
  a. Genotypes not randomly distributed in all environments
    1. i.e. populations within the same environment tend to have similar genotypes
  b. May contribute to lower health risks associated with the environment
VI. Studying Multifactorial Traits
  a. Require a genome-wide approach (e.g. genomic sequencing)
    i. Difficult to validate
      1. Due to the multiple contributing factors
        a. Each variant typically shows low penetrance
    ii. Require large numbers of people to get statistically relevant results
      1. May be able to get data from databases/other studies
    iii. Some disease diagnostics depend solely on phenotype
      1. In some cases, this may vary from doctor to doctor
      2. Overlapping phenotypes with other diseases may hinder studies
      3. E.g. Autism
VII. Autism and the Troubles of Finding Genetic Variants
  a. Several phenotypes associated with Autism
    i. Each phenotype has its own spectrum ranging from severe to typical
      1. Some phenotypes (e.g. intelligence) can range from low to very high
    ii. Not all phenotypes are present in every Autism case
  b. Several genetic factors contribute to phenotype
    i. Many of these genetic factors are likely found at relatively high frequencies within non-autistic individuals
  c. Environmental factors may play a role in Autism cases
    i. E.g. Diet of mother during pregnancy, infection of mother during pregnancy
    ii. Note: Vaccines are an environmental factor that is NOT linked to Autism
  d. Epigenetics may contribute to autism phenotypes
  e. Overlap of phenotypes between autism and other disorders associated with learning disabilities
    i. E.g. Fragile X, Rett syndrome, Angelman Syndrome, and more
  f. 
Small sample size, so genomic studies are less powerful

305 Non-Mendelian Genetics II - Heritability of Diabetes Mellitus
I. Insulin
  a. Peptide Hormone
  b. Synthesized and secreted by β-cells in the pancreas
    i. Found in the Islets of Langerhans
    ii. β-cells are typically the only cells that express insulin
  c. Signals high blood glucose and fed state
II. Diabetes
  a. Chronic metabolic disorder
    i. Characterized by high blood glucose
      1. High blood glucose is an environmental factor that can contribute to phenotypes of other disorders
  b. Results from issues with insulin
    i. Insulin deficiency
    ii. Insulin resistance
  c. Affects 9% of world population
  d. Complex disease
  e. Some complications of diabetes result in other complex diseases
  f. Major types of diabetes
    i. Type 1 (10% of diabetics)
    ii. Type 2 (90% of diabetics)
    iii. Gestational Diabetes (2-20% of pregnant mothers)
      1. Temporary, but leads to increased risk of type 2 diabetes diagnosis for both mother and offspring
  g. Population level
    i. All human populations are at considerable risk for diabetes (notably type 2 diabetes)
III. Type 1 Diabetes
  a. Characterized by extremely low or no insulin production
    i. Only treatment is insulin injection
  b. Phenotypes
    i. Early and sudden onset
    ii. Thin or normal body size
    iii. Ketoacidosis (excessive production of ketone bodies) is common
    iv. Autoantibodies against β-cells usually present
  c. Genetic causes
    i. Human leukocyte antigen (HLA) region
      1. Associated with autoimmune disorders
        a. β-cell destruction
      2. Mutations in different HLA genes associated with different populations
    ii. Several other genetic factors can contribute to T1D
      1. Many are involved in autoimmune response
      2. Some are involved in response to RNA viruses
        a. Contributes to environmental factors
      3. Insulin, and genes that regulate insulin, may also play a role
    iii. Inheritance of T1D is affected by
      1. Whether father and/or mother is affected
        a. More likely to inherit from an afflicted father than mother
      2. 
Age of mother at the time of birth a. More likely to inherit from mother if she was below the age of 25 at birth d. Environmental causes i. Viral (notably coxsackievirus B) 1. Virus may directly destroy beta cells 2. Immune system may destroy beta cells as a result of viral infection a. E.g. interferon expression results in cell death, as well as death of surrounding cells ii. Important note: Diet does NOT play a role in the development of Type 1 diabetes e. Epigenetics i. Likely plays a role, but no concrete evidence f. May be caused by other diseases i. E.g. Type 2 polyglandular autoimmune disorder and Carpenter syndrome IV. Type 2 Diabetes a. Genetic causes i. Polygenic trait 1. Multiple different genetic variances linked to type 2 diabetes a. More type 2 diabetes-causing variants present in genome results in increased risk of being affected with T2D compared to T1D ii. Multiple population-level studies 1. 250 potential genetic variances identified via various genomic studies a. Some looked at specific populations b. Some looked at more diverse populations 2. Some identified genetic variances do not have an obvious link to diabetic phenotype a. But they are still possibly linked to the disorder b. Environmental causes i. Diet 1. High sugar intake increases risk of type 2 diabetes ii. Exercise 1. Sedentary lifestyle increases risk of type 2 diabetes iii. Body weight (major environmental factor) 1. Obesity has strong link to development of type 2 diabetes a. Most genomic studies control for this iv. Smoking 1. Smoking increases risk of type 2 diabetes c. Epigenetics i. Likely plays a role; nothing concrete yet 306 Developmental Genetics I – Overview and Stem Cells I. Human developmental genetics a. Study of gene expression at different time points of development and between different cell types i. Includes which genes are expressed and how they are expressed (e.g. growth factors, activators, etc.) b. All cells have same genotype (barring rare mutations) i.
Each organ and the cell types within organs have their own phenotype ii. These phenotypes are dependent on differential gene expression patterns c. Differential Gene Expression i. Expression of a different subset of genes in different cell types 1. Every cell in our body has the same genome (barring mutations) a. But only a small subset of the genome is expressed in each cell type 2. Controlled by several factors, including, but not limited to a. Transcription Factors i. E.g. activators, repressors b. Epigenetics i. E.g. Histone modification, DNA methylation c. RNA processing i. E.g. Alternative Splicing d. Protein translation i. E.g. RNAi e. Protein modifications i. E.g. phosphorylation, methylation, acetylation d. Cell differentiation i. Development of cells from less developed to more developed cell ii. As gene expression pattern further differentiates, so will cell function 1. Cells within the same tissue may have different gene expression patterns a. Different cell types in an organ play a role within the organ i. E.g. exocrine and endocrine cells in pancreas ii. E.g. hepatocytes and cholangiocytes in liver e. Non-Specialized cells (e.g. stem cells) do not express developmental genes i. Need to be de-repressed 1. E.g. a repressor needs to be repressed 2. Chromatin needs to be relaxed ii. Growth factors tell cells how to develop 1. Growth factors are naturally occurring substances (e.g. some hormones) 2. Stimulate a. Cell proliferation b. Wound healing c. Cellular differentiation 3. Start a signal transduction pathway a. Three parts to signal transduction i. Reception 1. E.g. binding of growth factor to receptor ii. Transduction 1. How signal is relayed inside cell a. Typically requires several signaling proteins iii. Response 1. The ultimate effect of growth factor 4. May be concentration and time dependent a. E.g. Concentration of a growth factor, NODAL, initiates differentiation of embryoblasts to gastrula 5.
Many different growth factors are required for correct development a. How much is expressed and at what time they are expressed are important i. Too much or too little may result in developmental defects ii. Too early or too late may result in developmental defects iii. If cell is not told to express a gene, it won’t II. Human development and stem cells a. Zygote (single cell) → Human (a lot more than one cell) i. Development requires a lot of cross talk between developing tissues at different times b. Zygote: Result of fertilization (ovum + DNA from spermatozoa = zygote) c. Zygote (1 cell) → Morula (16-32 cells; solid ball) i. Zygote and Morula are examples of totipotent stem cells 1. Can develop into any cell type d. Morula → Blastocyst (200-300 cells; fluid filled cavity; inner cell mass becomes baby and trophoblast becomes placenta) i. Beginning stages of development 1. Results from growth factors released from endometrium ii. Blastocyst 1. Contains two types of stem cells: Pluripotent and Trophoblast 2. Embryoblasts (Pluripotent stem cells) a. Can develop into any cell type except placenta i. Required, but not sufficient for placenta development 3. Trophoblast stem cells a. Required, but not sufficient for placenta development b. Only develop into placental cells e. Blastocyst → Gastrula (early embryo stage where cells start organizing into 3 layers) i. Induced by growth factor (NODAL) released by a small number of cells in embryoblast 1. Cell development depends on concentration of NODAL that cells are exposed to ii. Gastrula 1. Endoderm a. Cells that release NODAL, or are in close vicinity to NODAL, develop into endoderm; NODAL is for specific signals but not first overall thing (bigger picture later on) i. E.g. Liver, pancreas 2. Mesoderm a. Cells that are exposed to a low concentration of NODAL develop into mesoderm i. Eventually develop into muscles, cells, bones, connective tissues, cardiovascular system, cartilage 3. Ectoderm a.
Cells that are not exposed to NODAL develop into ectoderm i. Eventually develop into epidermis, nervous system, and neural crest b. Ectoderm, mesoderm, and endoderm are all examples of multipotent stem cells i. Can develop into a subset of cells ii. As development continues, multipotent stem cells become more specialized 1. i.e. the cell types they can develop into become more restricted f. Adult stem cells i. Small caches of specialized stem cells (e.g. ones that can develop into all blood cell types, but no other cells) ii. Found in many tissues, notably those involved in blood cell development (bone marrow), epidermal regeneration, and regeneration of endothelial cells of digestive system 1. But not the liver g. Liver can regenerate up to 49% of mass without an issue i. Can respond to external or internal signals ii. No stem cells are involved 1. However, the two cell types in liver can act as facultative stem cells to replace the other type if needed 307 – Genetic Basis of Cancer I I. Cell cycle revisited a. Most cells are in quiescent state (G0 phase) i. Do not enter cell cycle unless told to by a growth factor b. Cells in cell cycle must pass cell cycle checkpoints to finish proliferating i. These checkpoints monitor conditions of the cell, including DNA damage ii. Failing a checkpoint means no proliferation 1. Unless the problem can be fixed 2. Extensive cellular damage results in programmed cell death (apoptosis) II. Cell Cycle and Cancer a. Cancer i. Uncontrolled cell growth ii. Genetic and Epigenetic disorders b. Cancer must be able to bypass several elements of the cell cycle, including i. Bypassing the need of a growth factor (e.g. via oncogenes) L307 ii. Bypassing cell cycle checkpoints (e.g. mutation in tumor suppressor) L308 III. Cancer Hallmarks a. The 10 phenotypes associated with cancer i. E.g. sustained signaling of cellular proliferation (oncogenic stress) b. All hallmarks must be met before a cell is considered cancerous i.
Otherwise, it is classified as a non-cancerous tumor 1. Can still acquire additional mutations to become cancerous, so it should be monitored and treated ASAP IV. Origin of Cancers a. Somatic Mutation Theory of Cancer i. Cancer is a disease caused by unwanted cell growth b. Tissue Organization Field Theory i. Cancer results from disruptions between adjacent tissues c. Viral Oncogene Hypothesis i. Genes specifically programmed to trigger cancer had been deposited into vertebrate genomes by viruses V. Proto-Oncogene and Oncogene a. Proto-oncogenes i. Healthy proteins that promote cell growth and proliferation ii. Many components of the growth factor signal transduction pathways are proto-oncogenes b. Oncogenes i. Altered gene product which promotes cell growth and proliferation 1. I.e. does not go through the proper protocols for cell proliferation ii. Many components of the growth factor signal transduction pathways can become oncogenes iii. Can result from genetic or epigenetic mistakes 1. Gain of function 2. Extremely common in human cancers c. Classes of Oncogenes i. Growth Factors 1. Overexpression of growth factors may promote unwanted cell growth and proliferation ii. Receptors 1. Overexpression may result in increased signal strength even in absence of growth factor 2. Some common human oncogenes are receptors a. E.g. EGFR, HER2 iii. Intracellular transducers 1. Initiate cell signaling pathway in absence of activated receptor 2. Some common human oncogenes are transcription factors d. Viral causes i. Acute transforming virus 1. Oncogene expressed from viral genome ii. Non-acute transforming virus 1. Insertion of viral genome in front of oncogene 2. Oncogene expressed via insertional mutagenesis 308 – Genetic Basis of Cancer II I. Tumor Suppressor Genes a. Protein-encoding genes that inhibit tumor formation i. Stop cell growth ii. Signal apoptosis if problem is not fixable b. Decreased expression or decreased activity promotes tumor formation i. i.e.
loss of function c. Common tumor suppressors include p53, BRCA1, BRCA2, retinoblastoma d. Retinoblastoma (Rb) i. Regulates transcription of genes involved in growth by sequestering E2F transcription factors ii. Ocular Cancers and Retinoblastoma 1. Hereditary a. High penetrance (90% of heterozygotes will develop ocular cancers) 2. Loss of heterozygosity → losing the last functional copy of a gene that was keeping the phenotype in check 3. Loss of homozygosity → You had a good copy + bad copy (heterozygous) but then you lose the good one and now you’re left with two bad ones (disease shows up); doesn’t really make sense a. An acquired mutation in a dominant allele results in homozygosity of the recessive trait e. P53 i. Multiple roles in prevention of cancer development 1. Including DNA repair, stopping cell cycle, initiation of apoptosis, senescence 2. Requires high concentrations of p53 to function a. P53 is stabilized by oncogenic stress and DNA damage 3. Loss of function of p53 results in inability to perform many aspects of cancer prevention f. HPV i. Can promote the degradation of retinoblastoma and p53 1. Prevents us from stopping the cell cycle g. Non-viral mutations are extremely common amongst cancers i. Genetic or epigenetic mutation that results in inactive protein or no expression of protein 1. Mutator Hypothesis ii. E.g. retinoblastoma (and loss of heterozygosity) iii. E.g. Lynch syndrome II. Genomic Instability a. Ever increasing mutation frequency due to inability to effectively repair genome i. i.e. mutations continuously occur in the cell ii. Can be caused by other mutations or epigenetic defects b. Preventing genomic instability i. Cell cycle checkpoints ii. G1 Checkpoint: Stops unwanted proliferation of cells (e.g. p53) 1. Unwanted entry into cell cycle results from failure to undergo G1 / S checkpoint iii. G2 Checkpoint: DNA damage repair checkpoint; occurs post-DNA replication 1. DNA mutagenesis – results from a failure to undergo G2 / M checkpoint iv.
M checkpoint: Stops nuclear division when not ready 1. Aneuploidy results from a failure to undergo M checkpoint III. Hereditary Cancers a. Usually involve defects in DNA repair pathways b. Inability to repair DNA results in increased rates of mutagenesis (Mutator Hypothesis) i. i.e. genomic instability IV. Acquired Cancers a. Mutations are acquired through a person’s lifetime b. Majority of cancers c. Mutator hypothesis not always clearly seen at a genetic level i. But epigenetic disorders may result in lowered expression of DNA repair genes V. Mutator Hypothesis a. Cancers must acquire mutations that allow them to bypass barriers to proliferation i. Rare mutations within a tumor allow some cells to bypass these barriers b. Driver mutations i. Mutations present in cancer cells that contribute to the development of cancer ii. These are cancer-causing mutations c. Passenger mutations i. Mutations present in cancer cells that do not contribute to the development of cancer ii. May be important to some therapies VI. Not all cancers are the same a. Same type of cancer may be due to different mutations i. E.g. not all breast cancer cells look like other breast cancer cells on a molecular level. There are even different types of inheritable breast cancers (e.g. BRCA1, BRCA2, and HER2) b. But some pathways are commonly found in cancer, and serve as potential targets for therapies 309 Personalized Medicine – Pharmacogenomics I. Adverse drug events a. Injury resulting from medical intervention related to a drug i. E.g. severe side effects b. Nearly 5% of hospitalized patients experience an adverse drug event i. Preventing this will help treat patients II. Pharmacogenomics a. Study of how genotype affects a person’s response to drugs i. Link adverse drug events to genetics to help limit the events ii. Link ineffective medications to genetics to help guide treatment iii. Use the genotype of cancer cells to help guide treatment b. Pharmacology + Genomics i.
Part of personalized medicine 1. Develop and design medication to a person’s genetic makeup 2. Identify the best treatment plan a. Which drug and dosage ii. Not a common application yet, as genetic tests are often time consuming and we do not know, or always care about, genetic links to every drug c. Offers benefits compared to Trial and Error approach i. Trial and error more beneficial if patient shows a response to first drug 1. Trial and error becomes less beneficial if the first drug does not work or is toxic to the patient a. Where use of pharmacogenomics becomes helpful III. Identifying polymorphisms involved in drug metabolism a. Identified at population level i. Sort individuals within population by effect of drug 1. E.g. beneficial, ineffective, toxic ii. Sequence genomes of individuals in each group 1. Whole genome sequencing → look at sequence, not if it’s being read properly 2. Exome 3. SNP Array → look at sequence, not if it’s being read properly i. Scans genome for known SNPs ii. Detects chromosomal gains / losses iii. Can be used to find regions of homozygosity, uniparental disomy iv. Can help detect a large chromosomal deletion in a patient with developmental delay b. Manhattan plot used to identify genetic variants more frequently found in patients with adverse drug reactions or resistance to drug 4. Molecular tests look at DNA/RNA but look at its function, not just its code iii. Compare genotype between populations iv. Find genetic variance(s) linked to drug metabolism v. Confirm that identified genetic variances are linked to drug metabolism vi. Genetic tests on patient to determine if they have these genetic variances 1. SNP Arrays a. More data, but more time consuming 2. Non-Sequencing based technique (e.g. PCR) a. Quicker, cheaper, and easier than DNA sequencing IV. Use of pharmacogenomics a. Warfarin (anticoagulant) targets vitamin K epoxide reductase (VKORC1) i. Two known SNPs affect drug metabolism 1. Both SNPs associated with resistance to drug a.
VKORC1 SNP is resistant to both enantiomers b. CYP2C9 hyper-metabolizes S-warfarin, reducing efficacy b. Abacavir (HIV medication) targets nucleoside reverse transcriptase i. One known SNP that affects drug metabolism 1. This SNP likely elicits an immune response a. Results in toxic effect of drug i. More serious toxicity upon additional usage c. Cancer i. Pharmacogenomics can become a very powerful tool in guiding cancer treatments 1. Look at cancer genotype to help guide cancer treatment a. E.g. target specific mutations found in patient’s cancer ii. HER2 (Human epidermal growth factor receptor 2) 1. Induces cellular proliferation 2. Overexpression of HER2 leads to oncogenic stress a. I.e. constant signal to proliferate b. One of the more common mutations found in breast cancers c. Herceptin (antibody treatment) will decrease HER2 receptors on cells overexpressing this protein i. Decreases cellular proliferation signals ii. Does not target cells that contain typical number of HER2 1. Healthy cells 2. Cancers not caused by overexpression of HER2 iii. Pharmacogenomics can be used to determine if patient has a mutation that is susceptible to Herceptin prior to treatment iii. EGFR (Epidermal growth factor receptor) TK (tyrosine kinase) region 1. Induces cellular proliferation a. Tyrosine kinase domain phosphorylates intracellular signaling protein b. If constitutively active, it becomes oncogenic 2. Several drugs can target the tyrosine kinase domain of EGFR a. Permanently inhibit kinase activity of EGFR, which prevents cell signaling from occurring i. No signal = no oncogenic stress b. Some genetic variants are hyper-affected by these drugs i. Some genetic variants are resistant to these drugs c. These drugs have no effect on oncogenic intracellular signaling proteins i. E.g. KRAS 1. Part of EGFR signaling pathway, but since the oncogenic stress is occurring after EGFR signaling, preventing the kinase activity of EGFR will have no effect on these oncogenes d.
Pharmacogenomics can determine if patient has a mutation that is susceptible to drugs that target a specific protein within a signaling pathway iv. 6-Mercaptopurine 1. Used to treat childhood acute lymphoblastic leukemia a. Inhibits guanine biosynthesis 2. Genetic testing related to this drug is recommended by FDA a. Due to severity of potential side effects i. Very cytotoxic 3. 6-Mercaptopurine is very toxic to all cells a. Inactivated by thiopurine-S-methyltransferase i. Low activity of thiopurine-S-methyltransferase leads to higher-than-expected concentrations of 6-Mercaptopurine 1. Several genetic variances linked to toxicity 2. Leads to severe toxicity of drug b. Many individuals will need their dosage adjusted i. All patients who are homozygous for alleles that result in extreme toxicity will need their dosage adjusted v. Irinotecan 1. Topoisomerase inhibitor 2. Part of chemotherapeutic regimen 3. Drug is inactivated in liver then transported to intestine for disposal a. Intestine will reactivate the drug i. Which is fine because this drug targets colon cancer b. Drug has been designed so that the liver prevents toxicity to any cell, except the colon / intestine which will re-activate it c. One known SNP results in decreased inactivation of drug in liver i. i.e. the drug remains active much longer in liver 1. Can cause liver damage a. May be life threatening in individuals that are homozygous for this SNP d. Pharmacogenomics to choose Drugs i. Pharmacogenomics may signify need to design drugs that target individuals that have SNPs that result in ineffective medication or toxic side effects ii. May find niche uses for some drugs 1. E.g. drugs abandoned during development 2. E.g. drugs that show superior benefits to subset of population iii. May be able to target drugs to subset of population 1. Depression a. Citalopram treatment may result in remission of depression in individuals with specific SNPs i.
These SNPs delay the metabolism of the drug, resulting in the drug being around longer 1. This drug is not very toxic, so sticking around longer is not a bad thing for patients 310 Therapeutic Approaches I – Recombinant DNA I. Recombinant DNA a. 2 or more strands of DNA that have been artificially combined b. Gene cloning i. Isolating and making many copies of a gene ii. Genetic selection 1. Only cells with selected phenotype will survive, or cells with selected phenotype will die 2. Most common example is antibiotic / toxin resistance iii. Genetic screens 1. Cells of interest will have different morphological phenotype than other cells c. Uses i. Express a foreign gene in a host organism ii. Express a modified gene in a host organism iii. Express a protein in a test tube iv. Applications in disease treatment 1. E.g. insulin synthesized in E. coli 2. E.g. human hormones expressed in animal milk II. Vector DNA a. Carrier of insert DNA i. E.g. plasmid ii. E.g. viral iii. Can be replicated independently of chromosome 1. Contain an origin of replication iv. Derived from natural source 1. Usually modified to fit our needs in research / medicine v. Usually have selectable marker (e.g. antibiotic-resistance gene) 1. Allows selection of cells with vector vi. Some have color selection 1. Allows screening of cells with vector or cells with vector and insert vii. Not compatible with all hosts 1. Origin of replication dictates if vector is compatible with host a. Can add multiple origins of replication to be compatible with multiple hosts b. Transferred into host via horizontal gene transfer i. Transformation, conjugation, or transduction III. Insert DNA a. Segment of DNA of interest i. E.g. Gene we want to express ii. E.g. DNA we want to transfer to a new organism iii. E.g. DNA we want to study b. Sources i. Cut / sheared genome 1. Mechanically sheared (e.g. sonication, French press) 2. Cut with restriction enzyme ii. PCR 1.
Can be used to generate specific fragments or random genomic library iii. RNA (requires reverse transcriptase) iv. Synthetic DNA IV. Restriction Enzymes a. Specific category of endonucleases b. Recognize and cut restriction sites i. Usually palindromic ii. Most common restriction sites are 6 nt long c. Bacterial in origin i. Part of bacterial immune system used to cut foreign DNA d. Experimentally, some common uses include i. Fragmenting the genome ii. Cutting the insert iii. Cutting the plasmid DNA e. Usually create 5’ or 3’ overhangs (sticky ends) i. Complementary overhangs can interact 1. Often used to stick insert DNA into vector DNA f. May NOT have overhangs (blunt ends) i. Can also be used to stick insert DNA into vector DNA 1. Less efficient than sticky ends 2. Blunt cut DNA can be artificially joined to any other blunt cut DNA g. DNA ligase i. Synthesizes 3’ to 5’ phosphodiester bonds between two DNA sequences ii. Can be used to join two unrelated pieces of DNA in test tube 1. i.e. recombinant DNA V. Uses of recombinant DNA technology a. Insulin production in bacteria i. High yield made cheaply b. Human hormone production in sheep i. Recombinant DNA expressed in mammary glands via beta-lactoglobulin promoter 1. Recombinant protein (hormone in this case) can be isolated from milk VI. Transgenic Animals a. Genetically modified animals used to study gene function and disease i. Helps understand gene function ii. Create disease models to study human diseases iii. Must remove animal copy of genes if only interested in effect of human gene 1. Otherwise run the risk of genetic redundancy a. Two or more genes have overlapping functions b. May mask phenotypic effects of disease-causing gene b. Plasmid based technology i. Utilizes homologous recombination to add, remove, or modify segments of DNA from genome ii. Knock-out: Deletion of gene from genome 1. Gene may be replaced by selectable marker iii. Knock-ins: Insertion of gene into genome 1.
Gene may be knocked-in along with a selectable marker iv. Conditional Knock-out: Controlled expression of gene 1. When and/or where the recombination event occurs can be controlled a. Knock-in gene of interest with lox sites on both sides of gene of interest b. Knock out gene of interest when and/or where desired i. Requires Cre Recombinase 1. Catalyzes recombination event between two lox sites c. Artificial restriction enzymes and CRISPR are becoming more common d. Studying human diseases in animal model i. May require multiple modifications to prevent genetic redundancy 1. E.g. knocking out any relevant animal genes and knocking in any relevant human genes ii. Make animals more human-like 1. Not making pigs with human ears or anything 2. More of modifying a pig’s organ to be less immunogenic to humans so that we can harvest it and use it for organ transplants in humans 311 Therapeutic Approaches II – GMO, Stem Cells, and Reproductive Cloning I. Genetically Modified Organisms (GMO) a. Organism that has had its genome modified through genetic engineering i. E.g. transgenic animals b. Plants i. Selective breeding 1. Use of plants with phenotype of interest to produce more plants with phenotype of interest 2. Has been happening since before recorded history ii. Genetically modified 1. Most common method uses recombinant DNA technology and a bacterium to produce a plant with the phenotype of interest 2. Agrobacterium tumefaciens a. Bacterium that naturally infects some plants i. Injects DNA into plant via conjugative plasmid called Ti plasmid 1. Typically causes tumor formation on infected plants b. Modified A. tumefaciens can alter the genome of some plants i. We can modify the Ti plasmid so that it inserts a gene of interest into the plant genome ii. Used this technique to create golden rice iii. Rice strain that is high in Vitamin A 3. Biolistic gene transfer (gun shoots DNA into nucleus) 4. Microinjection (DNA injected into nucleus) 5.
Electroporation (DNA transported into cell with use of electric current) II. Stem Cell Therapy a. Stem Cells i. Self-renewing non-differentiated cells b. Pluripotent or Multipotent used in therapies i. Pluripotent → Multipotent (→ multipotent) → Tissue specific cell type c. Pluripotent stem cells i. Sometimes referred to as embryonic stem cells ii. Can develop into almost all cell types (placenta is exception) iii. Derived from blastocyst d. Multipotent stem cells i. Partially developed stem cells ii. Can differentiate into select cell types e. Adult Stem cells i. Specialized multipotent stem cells found in some tissues 1. Used for cell replacement and wound repair 2. Usually not easily found in tissues a. Hematopoietic stem cells are easily found in bone marrow ii. Can only develop into a very select few cell types 1. E.g. hematopoietic stem cells can only differentiate into blood cells f. Ex vivo Gene Therapy i. Tissue specific stem cells removed from body 1. Donor’s cells modified in test tube (e.g. gene deleted) 2. Cells screened for inserted gene and to ensure no deleterious mutation occurred 3. Cells placed back into donor’s body a. Little to no fear of immunodetection because cells originate from donor and are placed back into donor g. Induced Pluripotent Stem (iPS) cells i. Tissue specific cells converted to pluripotent stem cells 1. Engineered in lab ii. Used in pharmaceuticals 1. Screen drugs in human cells 2. Preclinical drug trials a. Including testing for toxicity and efficacy iii. Used in disease modeling 1. Allows us to study human disorders in human cells iv. May have some therapeutic usage v. Issues 1. Unpredictable variability between cell lines 2. Requires a diseased and non-diseased cell line to study a. Can only vary at gene(s) of interest b. Complex diseases require multiple genes of interest to differ i. Increases difficulty of studying these diseases h. Chimeric Antigen Receptor T cells (CAR T cells) i. Ex vivo modified T cells 1.
Can be derived from patient (if healthy) or donor (if patient is not healthy enough) ii. Chimeric 1. Antigen binding and T-cell activation function on same cell 2. Allows same cell to recognize and destroy target cancer cells a. Works well against some blood cancers (Leukemia) i. Destruction of cancer cells releases toxic molecules 1. May be detrimental to organism a. E.g. neurological toxicity, anaphylaxis, death 3. Does not work well against solid tumors a. Do not express proteins at high enough levels to be recognized by CAR T cells b. Tumor environment is hostile to immune system III. Cloning a. Reproductive Cloning i. Creation of genetically identical animal to donor animal ii. Modified egg cell is implanted into donor animal b. Therapeutic cloning i. Creation of genetically identical cells to donor animal ii. Modified egg cell is grown in vitro (in test tube / petri dish) c. Somatic Cell Nuclear Transfer (SCNT) i. Most common cloning technique ii. Does not work to clone humans 1. Can be used to derive embryonic stem cells d. Human Embryonic Stem Cells i. Generated via SCNT from differentiated somatic cells ii. Histone methylation typically prevents SCNT reprogramming 1. Requires removal of H3K9me3 by mouse demethylase iii. Used to treat age-related macular degeneration e. Embryonic Twinning i. Artificial splitting of embryonic stem cells ii. Placed into surrogate mother 312 Therapeutic Approaches III – Genome Editing I. Gene Therapy a. Introduction of cloned genes in an attempt to treat / cure diseases i. Gene can be integrated into genome 1. E.g. genome editing ii. Exist outside of genome 1. E.g. expressed on vector (plasmid DNA or viral DNA) b. Best used to treat disorders caused by a single allele i. Not good with complex diseases c. Non-Viral Delivery (e.g. Liposome) i. Place macromolecule inside liposome 1. Liposome also used to carry some medicines d. Viral Delivery i. Place gene of interest inside genome of virus 1. Can be retrovirus or DNA virus ii.
Several factors need to be worked out in advance to determine which viral vector should be used 1. Size of gene 2. Immune response 3. Long-term or short-term gene expression 4. Expression level 5. Probably more II. Genome Editing a. Changing the DNA code of living organism i. Methods 1. Selective breeding (not directly discussed) 2. Recombinant DNA technology (310) 3. Artificial restriction enzyme 4. CRISPR/Cas9 system ii. How? 1. Cut DNA 2. Repair DNA a. Homology directed repair with external piece of DNA b. Non-homologous end joining (NHEJ) i. Results in deletion ii. Therapeutic uses b. Prevalent in research c. Emerging in medicine i. Some ethical concerns in humans 1. E.g. off-target effects may cause more issues than the disorder ii. Not possible in most tissues due to inability to target all cells d. Artificial Restriction Enzymes i. In vivo genome editing technique 1. DNA recognizing molecule fused to endonuclease ii. DNA-binding protein used to recognize DNA 1. Zn-finger (Zn-finger nucleases (ZFN)) 2. TALE (TALE nucleases (TALEN)) iii. DNA Cleavage 1. FokI a. Endonuclease b. Dimer will cleave DNA indiscriminately i. To prevent this during genome editing, FokI monomers are separated so that they only cleave when both DNA-recognition molecules are bound in the correct locations iv. Used to treat some diseases 1. ZFN – possibly used to treat HIV infection 2. TALEN – used to treat acute lymphoblastic leukemia e. CRISPR i. In vivo genome editing technique ii. RNA used to recognize DNA (via complementary base pairing) 1. Guide RNA iii. DNA Cleavage 1. Cas9 protein a. Endonuclease i. Requires PAM sequence (5’ NGG) to interact with DNA f. Genetic modification uses i. Deletion of small DNA fragments 1. Via NHEJ ii. Gene Replacement 1. Requires Cas9 nickase a. Only capable of nicking DNA (i.e. cuts one strand of DNA) b. Requires nicking DNA on both sides of desired modification site i. Each site requires its own guide RNA c. Insert gene of interest via homologous recombination g.
Non-genetic Modification uses of CRISPR i. Control gene expression 1. Requires inactivated Cas9 fused to specific transcription factor ii. Sequence location 1. Requires inactivated Cas9 fused to specific transcription factor h. Basis of some gene drives i. Ensure a specific allele will be passed down to all subsequent generations i. Limited medical purposes in humans (so far) Extra Notes - Genomic Imprinting → Only use one copy of the gene. If that one copy has a mutation, deletion, or methylation error (broken), you get the disease. The other copy is just vibing in silence - Northern blot → measures RNA → tells you how much transcription is happening - Western blot → measures protein → tells you how much translation is happening - SNP array → just finds DNA mutations (variations). Doesn’t tell you if they affect transcription or translation - RNA-seq → gives you RNA levels, but no protein and no info on translation - PCR + Southern blot → Only tells you about DNA level (no mRNA or protein info) - FISH + CRISPR → FISH is DNA/RNA location-based, CRISPR is editing – not used for measuring transcription / translation - Codon bias only applies to coding regions - Two-hit hypothesis → Hit #1 → get a “broken” copy. Hit #2 → second copy mutates. No functional copies left = cancer risk increases - Monoallelic Expression → Only one allele (maternal or paternal) is expressed, while the other is turned off. Includes: Genomic Imprinting (only mom’s or dad’s gene is used); X-Inactivation (in females); Random monoallelic expression (happens in some immune and olfactory genes)
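
Worked example (CRISPR target sites): the 312 notes say Cas9 needs a PAM sequence (5’ NGG) next to the DNA that the guide RNA recognizes. The minimal Python sketch below scans a DNA string for candidate target sites under that rule. The function names (`reverse_complement`, `find_pam_sites`) are made up for illustration, and the 20-nt guide length is an assumption that does not come from the notes. It only shows the sequence logic and is not a guide design tool, since it ignores off-target effects.

```python
import re

# Hypothetical helper names; the 20-nt guide length is an assumption,
# not something stated in the lecture notes.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement so the opposite strand can be scanned."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_pam_sites(seq: str, guide_len: int = 20):
    """List (start, target, pam) for each guide_len-nt target that is
    immediately followed by an NGG PAM on the strand given in `seq`."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    demo = "ACGT" * 5 + "AGG" + "CC"  # toy sequence, not a real gene
    for start, target, pam in find_pam_sites(demo):
        print(start, target, pam)
    # Scan the opposite strand by passing the reverse complement.
    for start, target, pam in find_pam_sites(reverse_complement(demo)):
        print("reverse strand:", start, target, pam)
```

Passing the reverse complement covers the other strand, because a PAM on that strand appears as CCN on the strand you typed in.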