Global Change Mod 1 PDF
Summary
This document discusses evolutionary processes, focusing on natural selection, consequences of mutation and recombination, and the impact of demographic processes on populations. It also explores the principles of evolutionary biology with regards to genetic variation, contemporary and historical considerations, and the relationship between macro and micro-evolution. The document provides an overview of the topic with details.
Full Transcript
GLOBAL CHANGE - MOD. 1
11/11/2024 1
Evolutionary processes that shaped human biology → focus on Natural Selection
- Impact of demographic processes in populations
- Consequences of mutation and recombination, migration, genetic drift
- Selection → how populations evolve to adapt to a different environment
- Human biological adaptability to climate, environment and dietary contexts
+ Maladaptive processes due to environmental changes, cultural transitions, global migration → disappearance, abandonment, modification = ‘evolutionary mismatch’

PRINCIPLES OF EVOLUTIONARY BIOLOGY
Most genetic variation is inherited (rather than arising independently in individuals) and its history is shaped by evolutionary forces → this enables us to make inferences about our species' past (i.e. genomes as records of evolution).
Genetic variation: most of the time it does not produce a phenotypic effect.
- About 30 million genetic variants are observed in African genomes (most ancient);
- about 6 million genetic variants are observed in American genomes (most contemporary).
Mutations are shared by the entire human population with respect to their ancestry. Individual genetic variants that are very recent aren’t evolutionarily relevant.
MACRO-EVOLUTION VS MICRO-EVOLUTION
Macro = evolution of species over geological time.
Micro = processes operating on genetic diversity within a species over generations.
+ happens at the population level → process that takes thousands of years
+ processes that lead to macro-evolution
+ assumption that species-level evolution is just an extrapolation of population-level evolution
+ we have the tools to predict what will happen at the species level → may lead to species transformation overall → single population as the unit of analysis
+ we analyze allele frequencies (= changes in the frequency of genetic variants) through time and space → genetic variants can have different frequencies in different populations (effect of the population's own evolutionary history and of its coping with environmental stresses).
Shuffling of variability due to admixture → intermixing of populations and exchange of alleles between them.
The more recent the species, the lower the genetic variability → the adaptation of the human population is an ongoing process.
The gene pool of the entire species is reduced if the populations are mixed → isolation of a species in a single pop reduces variation.
Genetic variations of the single individuals define the population’s genetic variability → structured at different levels. Geographical clustering at the continental level is used to make comparisons.
➔ A biological trait can be adaptive or maladaptive.
Both phenotypic & molecular variation are influenced by many aspects of our evolutionary past:
1. How our species originated (effects of this upon our genetic diversity)
2. Early differentiation & spread of human populations
3. Recent admixture of populations with different histories
The genetic differences between populations, even if small, in most cases aren’t due to the environment but to demographic differences between populations → most of our genome evolves neutrally: change in genetic variants is a stochastic process, change by chance → genetic drift → based on demographic processes and migration ⇒ demographic processes have an impact on differentiation.
STOCHASTIC = random.
GENETIC DRIFT = the change in frequency of an existing gene variant in the population due to random chance.
Differentiation is also due to adaptations to diverse environments → biological adaptations for survival.
The exit from Africa triggered adaptation in humans ⇒ they modified their own environment: demographic expansion, dietary shifts, close contact with animals, climate, pathogen landscape, new cultural environment ⇒ selective pressure to adapt (decreased in the modern era)
1. adaptation to the natural environment
2. adaptation to the social and cultural environment
^ongoing and parallel processes that impact population biology.
Adaptive responses to similar environments likely result in similar phenotypes. Phenotypic similarities are unlikely to be representative of the rest of the genome (= unreliable indicators of biological origins) → since these characteristics are responsive to the environment.
- “Selected” alleles to infer pop adaptations → regions of the genome where selection worked
- “Neutral” alleles to infer pop demographic histories → accumulated genetic variants that do not entail pop variance; stochastic effects that impact allele frequency → here very specific regions of the genome are targeted.
*pop = population
DEMOGRAPHIC HISTORY = the study of population-level phenomena such as adaptation, extinction risk, and the historical changes in population size and structure over time.
*genetic variants that do not entail pop variance = variants that do not imply the change in frequency of an existing gene variant in the population due to random chance.
NEUTRAL ALLELES = if a population carries several different alleles of a particular gene, odds are that each of those alleles is equally good at performing its job → that variation is neutral: whether you carry allele A or allele B does not affect your fitness.
We can’t prove causality between events → we must rely on evidence (fossils and ancient DNA), but the survival of this evidence is selective!
Humans are a species with very low genetic diversity even though there’s a huge number of individuals. Risk alleles for a disease are the same across populations; what changes is the frequency. An evolutionary perspective allows us to make predictions about the future.
Modeling trees of individual alleles/haplotypes in a pop to infer genetic similarities or distances (coalescent tree).
★ a huge number of humans does NOT correspond to high genetic diversity → phenotypic variation is due to adaptation to similar environments → genetic variability is a stochastic process.
★ clustered distribution of phenotypic diversity doesn’t correspond to similar genetic structuring → variations are too recent and are an outcome of adaptation to a similar environment.
★ disease alleles are not specific to continental groups → risk alleles are the same in a species, what changes is the frequency!
★ Information on human populations: demographic history and ancestry + environmental challenges faced.
★ Recombination and chromosomal assortment divide the genome into segments with independent histories → ancestry of a single segment of the genome converges on a single common ancestor → each segment has a separate ancestor (older than the pop).
★ The genetic record of life is contained in the genomes of species and it reveals evolutionary processes and relationships back to the last common ancestor ⇒ intra-species comparisons provide evidence on the more recent evolutionary processes.
★ Variation among modern individuals is shaped by cumulative past processes.
An evolutionary perspective allows us to make predictions about the future of our species.
Most phenotypic traits are controlled by a combination of inherited (i.e. multiple genes), environmental, stochastic developmental & molecular processes. Disentangling the interplay of these factors will help to relieve the considerable burden of complex diseases on societies.
The Evolutionary Genetics approach:
1. Rationale: the genetic record of life is contained in the genomes of species and it reveals evolutionary processes and relationships back to the last common ancestor
2. Methods: compare distantly related branches on the tree of life, with intra-species comparisons providing evidence on the more recent evolutionary processes
Genetic evidence comes from two main sources:
- Genomes of living individuals (a subset of those of the ancestors)
- Ancient DNA (may or may not be represented in living descendants)
Variation among modern individuals is shaped by cumulative past processes → investigation of different times in human prehistory.
Different layers of the past accessible through genetic diversity assays:
- Phylogenetic relationship to other species
- Origins of our species
- Prehistorical migrations
- Historical migrations
- Genealogical studies
- Paternity testing
- Individual identification
-_-_-_-_-_-_-_-_-_-_-_-_-_-_
18/11/2024 2
Most methods to infer evolutionary processes in humans are SNP-based.
Single nucleotide variation:
- SNP (= single nucleotide polymorphism): the allele has to be present in at least 1% of the population to be considered
- single nucleotide variant in general: irrespective of the frequency → the allele could be frequent or rare.
- They are genetic markers with a very slow evolution
- once they are introduced in the gene pool of the pop through mutation they spread slowly
- we infer information only in the recent timeframe of the pop
- from single-locus analysis to exploration of genome-scale data.
How do we understand which genetic variant is present in the pop we want to study?
DNA microarray = tool used to determine whether the DNA from a particular individual contains a mutation in a gene. The chip consists of a small glass plate encased in plastic. On the surface, each chip contains thousands of short, synthetic, single-stranded DNA sequences, which together add up to the normal gene in question and to the variants (mutations) of that gene that have been found in the human population. To investigate whether an individual has a specific mutation, their DNA is extracted and denatured; the mutation is identified based on which strand (mutated or control) the subject’s ssDNA pairs with.
Sequencing approaches = process of determining the nucleic acid sequence.
^genomic data can be used to make different inferences → but there can be practical constraints.
^we obtain a picture of general genome variability → we have to understand whether a changed gene is the outcome of adaptation (natural selection) or of demographic history.
Most of our genome evolves neutrally = according to stochastic processes related to the demographic history of the population. In addition, SNPs could mediate a natural adaptation.
^Both act together, contributing to variation.
APPROACHES TO CHARACTERIZE GENOMIC DATA
SNP chip approach
Pros:
★ advantageous for typing a lot of genetic information in a single experiment
- produces data about a lot of genetic variants across the genome
- it’s not expensive → a lot of individuals in a single experiment
- it’s the most used in the field of evolutionary studies
Cons:
★ these aren’t custom experiments → I can’t choose which genetic variants to survey → SNP chips are produced by the company for a specific set of variants → limitation!! = I cannot select the variants myself
★ SNP chips are very different between companies → it’s a problem when it comes to comparisons between pops → only a smaller subset of variants overlaps when comparing pop histories. We have to check whether the data are compatible
★ SNP chips are predominantly based on European ancestry
★ SNPs on the chip have high or moderate frequency → we have no idea about the rare variants present in the gene pool of the pop → we aren’t able to find recently introduced variations
★ linkage disequilibrium → genetic variants can be associated if they are inherited together → they are located on the same portion of the chromosome in very close proximity ⇒ recombination does NOT (or only rarely) act in this portion of the genome → when two variants are linked together, they are in linkage disequilibrium.
★ Tag SNPs are informative about other variants and are useful to obtain a wider range of information at the same cost → BUT two tag SNPs are not linked with each other
To study haplotypes (= combinations of alleles at different positions of the genome), loci must be tightly related to each other = linkage disequilibrium.
^but to reconstruct haplotypes we cannot use tag SNPs because they are not related to each other!
The SNP chip approach is used only for the preliminary stages of a study → general overview of the genetic variability of the pop and to understand which individuals are not useful to study the history of the pop.
How does this approach work from a technical perspective?
The experiment is performed on a chip on which probes are fixed (= very short sequences of nucleic acids).
- Each probe is complementary to the sequence on which the genetic variant is present
- A limitation is that I cannot infer information on new genetic variants!
- probes = oligonucleotides
- for each genetic variant there are two different probes:
1. one oligonucleotide is complementary to the reference allele
2. the other covers the same portion of the genome, but is complementary to the variant allele
Experiment:
- The DNA is fragmented (with enzymes that cut DNA or another fragmentation method)
- A fluorochrome is added → emits a fluorescent signal
- I load the DNA onto the chip and the fragments I want to study will match with the compatible probes
- With some buffer I wash away all the parts not attached to the chip, so I can study the sequences of interest (the DNA which matched with the probes)
- I need a platform that is able to scan the chip
→ matched to the reference-allele probe = the reference allele is in the genome
→ matched to the variant-allele probe = the variant allele is in the genome
^allelic state of genetic variants (both necessary for inference)
I obtain genotype data → variants annotated based on position, biological relevance etc.
- PLINK = software to perform quality control on the generated data
- R packages implementing functions → for analyzing data from microarrays

Mass Spectrometry approach
To infer the genetic state of the variant under study.
^multiplex PCR requirements → combine regions of the genome into a single experiment
Complementary strategy for validation/replication or to type well-established informative markers (variants I already know). Useful to validate already obtained results → the SNPs that can be typed in a single experiment like this are few, meaning higher accuracy.
Pros: cost-effective typing of a few tens of genetic markers (distinguished by mass/charge ratio) simultaneously on 384 samples
Cons: limited number of SNPs, constrained by multiplex PCR requirements; candidate pathway approach
Experiment:
- I design a primer that is complementary to the region of interest
- I perform the multiplex PCR
- I perform the single base extension, complementary to the position where the variant is present
- I obtain information about the allele that has been incorporated (which can be the reference or the variant allele), based on the mass of the extended fragment

Massive parallel sequencing ⇒ technology used to sequence the entire genome.
★ Possibility to modulate the sequencing effort in order to read the genome as many times as we need for reliability. I can sequence a portion of the genome many times → it depends on how I design the experiment.
★ I obtain an unbiased description of both common and rare variants → high coverage result. Reading the same position many times is useful to tell technical sequencing errors apart from real variants → I find the variant or reference allele specific to an individual or to a single ethnic group/pop.
*Fragments are amplified and sequenced in a single experiment. Reliable results, good cost compared to Sanger, doesn’t take much time.
★ Limitation: computational resources to manage the data → programming is difficult + huge amount of data → I need infrastructure and skills. A lot of time is needed with limited computational power. Variant call format files (.vcf) have millions of rows → impossible to import into R
Experiment:
- I extract the DNA from a biological sample and fragment it
- I select the right fragments according to the company’s requirements (ex: how many bases long they should be)
- I add adapters so that the fragments we want to study stick → as for the probes in the SNP chip method.
I add adapters at the ends of the DNA fragments
- adapters will pair with oligonucleotides on the flow cell *The adapter on the free side can bind to other adapters
- polymerase performs an amplification as in a PCR → the adapter works as a template
- When bridge amplification is completed and the process has been repeated many times, I obtain clusters of fragments that are complementary to the original one (all representative of the same portion of the genome). Their number depends on how many rounds of bridge amplification were performed
- Fragments are read during sequencing (…)X times (ex: 30X = the sequence was read 30 times)
Sequencing .vcf data format:
What I observe in the columns:
1. chromosome number
2. nucleotide position
3. code of the variant according to the database
On a row I observe: a SNP that is already known or a completely new variant.

PROCESSES SHAPING DIVERSITY OF HUMAN POPULATIONS
- environmental factors, dietary changes, cultural influence…
With migration, frequencies started to change among populations → hints at the peculiar history of a pop. We compare pops to define the genetic variability.
★ Allele frequency describes a pop’s genetic variability, not the single individual → models apply to pops, not to single individuals.
Synthetic theory of evolution → POPULATION paradigm
- By describing micro-evolutionary processes it also explains macro-evolution
Union of two independent models:
1) Mendel’s idea of independent inheritance
2) Darwinian natural selection of phenotypic traits → traits are selected and transmitted across generations because they give an advantage to the individual
United by showing that discrete alleles can underlie continuous traits.
Breeding pop = group of individuals within a species reproductively isolated from other groups.
^oversimplification of reality, in humans at least → in the past, geographical factors played a main role in separating groups; nowadays human pops cannot be considered fully isolated.
Ideal pop = pop on which an ideal model works well; I oversimplify evolutionary mechanisms to make predictions.
- non-overlapping generations
- identical allele frequencies in both sexes
- panmictic (= randomly mating)
- very large size (infinite)
- no migration
- mutations can be ignored
- natural selection does not affect allele frequency → true for most of our genome (evolves stochastically, neutrally)
Why are these concepts important to make assumptions about a real pop? To make comparisons: in an ideal pop → random union of individuals = random union of gametes → fixed expected frequency. From the frequency of the allele, I can predict the frequency in the next generation. There is a mathematical model ⇒ observed and expected frequencies are compared.
Hardy-Weinberg equilibrium (model) = no changes in allele frequency = no evolution.
The first major application of pop genetics was to explain how allele freq. in one generation could be used to predict genotype proportions in the next one.
★ The equilibrium test must be applied at every locus to see if there are deviations. Hardy-Weinberg test for each variant, to understand if the variant is impacted by natural selection or any other kind of evolutionary effect
★ In ideal populations: random union of individuals = random union of gametes. If we know the allele freq. we can predict the proportions of genotypes in the succeeding generation by combining gametes (each with 1 allele) at random
To be able to make predictions using HWE we need an ideal pop:
- infinite, randomly mating, sexually reproducing diploid individuals
- no selection, no migration, no mutation
Evolution is occurring if the pop is not in H-W equilibrium (i.e. allele & genotype freq. not stable over time) → one or a combination of the above factors is operating.
Application examples:
- If I have an idea of the frequency of the recessive trait, I can calculate the frequency of the heterozygous individuals → I can see how many carriers there are and calculate the risk of carrying a recessive disease
- Kuru disease = neurodegenerative disease transmitted by consuming the brains of infected individuals. Some genetic variants are observed at a higher frequency; being heterozygous was protective with respect to the disease = heterozygosity at a PRNP SNP confers resistance. The frequency of the protective allele increases in the female component because it was the more exposed part of the pop.

MUTATIONS generate diversity. They increase genetic variability through time → by adding alleles at each mutational event.
*SNPs are useful to investigate these genetic variants
Mutation models → there are lots of models; we have to make sure that the model is consistent with the data.
Infinite-allele model = each mutation creates a new allele
Infinite-site model = each mutation occurs at a different site, so that a nucleotide cannot be subjected to more than one mutation
+ It discounts the possibility of back mutations and recurrent mutations
+ Recurrent mutations = happen multiple times at the same nucleotide → high mutation rate
+ ^Back mutations are a subset ⇒ reverse the mutational state at the site → restore the original allelic state: for example, from reference to variant allele and back to reference with a back mutation
Stepwise mutation model = a mutation increases or decreases, with equal probability, the sequence length by one unit
More complex models are needed to consider the possibility that several changes might have occurred at the same site.
Finite-site model = real DNA sequences have a finite length and present mutational hot-spots... so mutations can occur several times at the same site
- Not considering that aspect leads to underestimating the actual n° of mutations that occurred
We use SNPs because we know that mutations at a given site are rare (low probability). We need models that account for the mentioned mutational possibilities in order to obtain a robust hypothesis.
INDELs = insertions and deletions; freq. dependent on the repetitive nature of the surrounding sequences
CpG dinucleotides: nucleotide mutability depends on the neighboring nucleotide
^might be useful to describe pop history but at the moment we have NO models able to describe how they happen through the generations!!

RECOMBINATION → as for mutations, increases genetic variability and as a consequence biological variability in little time (higher variability level). Enhances a population's ability to adapt to the environment by combining advantageous alleles at different loci
★ Does NOT create new alleles like mutations; instead it creates NEW COMBINATIONS
★ Generates new haplotypes (new combinations of alleles on the same DNA molecule)
★ Alleles are combined in parallel, so less time is needed to assemble the fittest haplotype
How can we observe haplotypes in the gene pool of a pop?
- Physical separation = typing SNPs in individually amplified alleles.
- Pedigree analysis = child haplotypes resolved or not by parental genotypes.
- Statistical inference = to reconstruct the haplotype (need for large reference panels because the reconstruction made by the algorithm starts from homozygous sites; after that we look at heterozygous sites → where recombination is possible).
^problem: we need a panel of other pops with similar ancestry to make a general inference on the structure of the haplotype.
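Going back to the Hardy-Weinberg application examples above, the calculations can be sketched in a few lines of Python. This is only an illustrative sketch under the ideal-pop assumptions listed in the notes; the function names and the example numbers are hypothetical and not from the lecture, and the chi-square statistic is just one common way to compare observed and expected genotype counts.

```python
import math


def hardy_weinberg_genotypes(p):
    """Expected genotype frequencies (AA, Aa, aa) given the frequency p of allele A."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    q = 1.0 - p
    return p * p, 2 * p * q, q * q


def carrier_frequency(affected_freq):
    """Heterozygote (carrier) frequency for a recessive trait.

    Assumes Hardy-Weinberg equilibrium: affected individuals are aa,
    so their frequency is q^2 and carriers are 2pq.
    """
    q = math.sqrt(affected_freq)
    p = 1.0 - q
    return 2 * p * q


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing observed genotype counts with HWE expectations."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # observed frequency of allele A
    expected = [f * n for f in hardy_weinberg_genotypes(p)]
    observed = [n_AA, n_Aa, n_aa]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    # Hypothetical recessive disease affecting 1 in 10,000 individuals
    print(carrier_frequency(1 / 10_000))
    # Hypothetical genotype counts at one locus
    print(hwe_chi_square(25, 50, 25))
```

A large chi-square value at a locus suggests that at least one ideal-pop assumption is violated there, but on its own it does not say which one (selection, drift, migration, non-random mating or a genotyping problem).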
25/11/2024 3
Recombination can also be studied at the population level.
Investigating whether alleles are associated with one another more or less than expected by chance: if there is no association between them (linkage equilibrium), the freq. of haplotype AB should be equal to the product of the freq. of A and the freq. of B
- association ⇒ disequilibrium, non-random association of alleles at different loci
- no association ⇒ equilibrium
LINKAGE DISEQUILIBRIUM = non-random association of alleles at different loci. Alleles on different chromosomes segregate randomly during meiosis, while loci closely linked on a chromosome do not, as recombination between them occurs infrequently
- Linked loci share a common evolutionary heritage (e.g. hitchhiking effect)
Genetic variants are more associated than expected under linkage equilibrium. Haplotype frequency can be predicted → the combination of both variants can be predicted if they are either independent or strictly associated.
- useful to say something about pop history
- compare patterns between pops
- tend to evolve in terms of allele frequency ⇒ genetic variants mediate adaptation to an environment:
★ hitchhiking effect = when an allele changes frequency not because it is itself under natural selection, but because it is near another gene that is undergoing a selective sweep and that is on the same DNA molecule
LINKAGE EQUILIBRIUM = genetic variants are not associated → located far apart
*Recombination between loci closely linked on a chromosome is less frequent
*Linked loci share a common evolutionary heritage
*Evolution and recombination can lead to haplotype breakdown (into small pieces) and shifts in disequilibrium.
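The comparison between the observed haplotype frequency and the product of the allele frequencies can be written as a small sketch. The coefficient D = freq(AB) − freq(A) × freq(B) follows directly from the definition above; the normalized measures D' and r² are standard summaries that are not named in these notes, and the function name and example values are hypothetical.

```python
def linkage_disequilibrium(p_AB, p_A, p_B):
    """Return (D, D_prime, r_squared) for two biallelic loci.

    p_AB : observed frequency of the haplotype carrying alleles A and B
    p_A  : frequency of allele A at the first locus
    p_B  : frequency of allele B at the second locus
    """
    # D = 0 under linkage equilibrium (haplotype freq. = product of allele freq.)
    D = p_AB - p_A * p_B

    # D' rescales D by its maximum possible value given the allele frequencies
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = D / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the alleles at the two loci
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = D * D / denom if denom > 0 else 0.0
    return D, d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical values: independent loci
    print(linkage_disequilibrium(0.25, 0.5, 0.5))
    # Hypothetical values: alleles A and B always found together
    print(linkage_disequilibrium(0.5, 0.5, 0.5))
```

Values of r² close to 1 mean that one variant is highly informative about the other, which is the property exploited by tag SNPs.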
GENETIC DRIFT → stochastic process in the gene pool of the pop
- statistical effect that results from the influence of chance on the survival of alleles
- related to pop history
- eliminates diversity
- No pop is infinitely large as assumed under HWE; each generation is a finite sample from the previous one
Variation in allele frequency over generations occurs solely through stochastic sampling.
WRIGHT-FISHER MODEL: the magnitude of genetic drift is related to pop size.
Small pop = an allele is rapidly fixed or lost
Large pop = alleles persist with more subtle frequency variations
★ The smaller the pop size, the greater the drift
This model contains many unrealistic assumptions. In reality:
- generations overlap (especially for large pops)
- pop size is rarely constant
- mating is not random (especially for large pops)
These factors vary among pops:
- To compare the amount of drift experienced by different pops, Wright introduced the concept of effective population size (Ne)
Effective pop size Ne = size of an idealized pop that experiences the same amount of drift as the studied pop → because of the many unrealistic assumptions, Ne is actually smaller than the census size
Influences causing the mismatch between Ne and census size:
- number of breeding individuals
- fluctuations of absolute pop size over time
- sex ratio
- variance in offspring number
- inbreeding
The probability of fixation for a new allele can be calculated if mutation and selection are negligible → fixation is more rapid in a small pop
★ Drifting alleles have a finite lifetime (average time in generations)
Fixation by drift is slower than by selection and rare for new alleles
Census size fluctuations → few pops are constant in size for many generations. Long-term Ne is equal to the harmonic mean of pop size over time.
- Ne is close to the smallest N → the low Ne estimated from neutral variation is due to the smaller ancestral pop size
Processes that shaped the early stages of our history as a species → demographic processes that shape the genetic diversity of a pop ⇒ decreasing genetic diversity!
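The Wright-Fisher sampling and the harmonic-mean rule for long-term Ne described above can be illustrated with a minimal simulation. This is a sketch under the ideal-pop assumptions (constant size, random mating, non-overlapping generations, no selection or mutation); the function names, parameters and example sizes are hypothetical.

```python
import random


def wright_fisher(n_individuals, p0, generations, seed=None):
    """Simulate drift of one allele in a diploid Wright-Fisher pop.

    Each generation, 2N allele copies are drawn at random from the
    previous generation. Returns the allele-frequency trajectory,
    stopping early if the allele is lost (0.0) or fixed (1.0).
    """
    rng = random.Random(seed)
    n_copies = 2 * n_individuals
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Binomial sampling of 2N gametes with success probability p
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        trajectory.append(p)
        if p in (0.0, 1.0):
            break
    return trajectory


def harmonic_mean_ne(sizes):
    """Long-term Ne as the harmonic mean of per-generation pop sizes."""
    return len(sizes) / sum(1.0 / n for n in sizes)


if __name__ == "__main__":
    small = wright_fisher(n_individuals=10, p0=0.5, generations=200, seed=1)
    large = wright_fisher(n_individuals=1000, p0=0.5, generations=200, seed=1)
    print("small pop: stopped after", len(small) - 1, "generations at", small[-1])
    print("large pop: stopped after", len(large) - 1, "generations at", large[-1])
    # One bottleneck generation dominates the long-term Ne
    print(harmonic_mean_ne([1000, 1000, 10]))
```

In the last call a single generation of size 10 pulls the long-term Ne down to about 29, far below the arithmetic mean of 670, which is why Ne stays close to the smallest N.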
+ reduced ancestral Ne
Importance of two pop processes which shape genetic diversity:
BOTTLENECK EFFECT = related to dramatic events (ex: epidemics, eruptions, climate change) that greatly reduce the pop size → increases genetic drift and inbreeding.
- > 50% of a pop is killed or prevented from reproducing
- reduced average heterozygosity
- reduced number of alleles
FOUNDER EFFECT = related to the colonization process → separation of a subset of diversity from the source pop
- movement into previously unoccupied land
- intense drift
- sudden random change in allele frequency
- drop in genetic diversity

POP SUBDIVISION: models assume that mating is random, but natural pops are actually subdivided.
Metapopulation = single pop that is structured within itself into subgroups (= demes) → distinguishable because of small differences (due to natural selection or genetic drift). A metapop may not be randomly mating because it is made up of partially isolated pops (demes).
Isolation leads to genetic differentiation of demes because drift acts independently in each of them:
- members of the same deme are more closely related than are those of different demes.
- The same genetic variant may change differently in different groups because it is a random process!
- Differences accumulate due to the action of genetic drift.
- demes are not completely isolated! → rate of gene flow or Ne of the metapop as a measure of pop structure
Fst ⇒ used as a measure of genetic distance between pops. Apportionment of variation between demes: compares the mean genetic diversity of the demes to the diversity of the metapop, Fst = (HT − HS) / HT
HT = expected heterozygosity of the metapop.
HS = mean expected heterozygosity across demes
^ ~ 0 → little differentiation, high gene flow
^ ~ 1 → high differentiation
^ = 0 → demes have identical allele freq.
^ = 1 → demes have fixed different alleles

MATING CHOICES
- non-random
- selection based on phenotype
- choice via assortative mating
- inbreeding
ASSORTATIVE MATING = choice according to a biological trait common to the individual and their partner
- increases genetic drift (decreased Ne)
- generates a lower heterozygote freq. than expected under random mating
DISASSORTATIVE MATING = choice on the basis of divergent phenotypic features
★ sexual selection increases some biological traits, just because they are attractive → acts similarly to natural selection → regulates allele freq.
INBREEDING = mating of relatives → similar genomes
- does not directly change allele freq. / can affect allele freq. by changing how selection operates
- effects: increase of homozygous genotypes with respect to what we expect under random mating
- it affects the whole genome
Coefficient of consanguinity = probability that two homologous alleles are identical by descent. Two alleles can be identical by:
- descent (IBD) = alleles inherited from a common ancestor
- state (IBS) = alleles are the same due to independent mutation
→ the pop presents a high level of inbreeding (isolated pops)

GENE FLOW = exchange of alleles between pops
- alters allele freq. only in the pop that receives the migration
- the number of individuals moving is relevant in order to know whether the freq. varies
Colonization = movement into previously unoccupied land → founder effect
Migration = movement into occupied land
Gene flow: migrants contribute to the next generation of the recipient pop
One-way migration ⇒ produces very rapid changes in allele freq. → the higher the number of migrants, the faster the changes in allele freq.
ISLAND MODEL = pop structured within itself into subpopulations, not completely reproductively isolated but with gene flow maintained among them.
A metapop is split into islands of equal size (N) exchanging migrants at a rate of gene flow per generation (m). Nm = migrants received per generation.
Assumptions:
- no geographical substructure
- each pop persists
- no mutation and no selection
- migrants are random samples from the source
*The rate of migrant exchange can be directly related to the measure of pop subdivision (Fst).
STEPPING-STONE MODEL = gene flow is established only between pops that are geographically close. Accounts for geographical distances by allowing gene flow only between adjacent demes. Equilibrium regulated by the action of genetic drift and gene flow.
*Even a low migration rate retards the differentiation of demes
★ Equilibrium between loss of alleles by drift and gain by gene flow
ISOLATION BY DISTANCE = discards the concepts of metapop and subpop → instead, the distribution is geographically continuous!
- The distribution may present a gradient of variation
- Mating choices are limited by distance
- I.e. individuals are distributed over a large area rather than in discrete demes
- genetic similarities develop in neighborhoods as a function of dispersal distances.
*used for human migration, and the most used one
MIGRATION MATRIX MODEL = constant migration rate. More complex because it is applicable only with detailed data on migration rates and distances.
- different migration rates
- asymmetric migration
^ more realistic relationship between distance and migration → due to economic, social and cultural factors.
★ high migration rate ⇒ more gene flow ⇒ reduces the effects of genetic drift
★ low migration rate ⇒ more genetic drift ⇒ allele freq. change across generations, accumulating differentiation independently (isolated pops)
^ all these processes are happening at the same time!
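The link between pop subdivision (Fst), heterozygosity and migration in the island model can be sketched as follows. The first function uses the standard definition Fst = (HT − HS) / HT for one biallelic locus and assumes demes of equal size; the second uses Wright's classic approximation Fst ≈ 1 / (4Nm + 1), which is not written out in these notes and holds only under the island-model assumptions listed above. Function names and example frequencies are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2pq at a biallelic locus with allele freq. p."""
    return 2 * p * (1 - p)


def fst(deme_freqs):
    """Fst = (HT - HS) / HT from the allele frequency in each deme.

    Assumes demes of equal size, so the metapop allele frequency is
    the simple mean of the deme frequencies.
    """
    p_bar = sum(deme_freqs) / len(deme_freqs)
    h_t = expected_heterozygosity(p_bar)  # metapop as one random-mating unit
    h_s = sum(expected_heterozygosity(p) for p in deme_freqs) / len(deme_freqs)
    if h_t == 0:
        return 0.0  # allele fixed or absent everywhere: no variation to apportion
    return (h_t - h_s) / h_t


def island_model_fst(n, m):
    """Approximate equilibrium Fst in the island model: 1 / (4Nm + 1)."""
    return 1.0 / (4 * n * m + 1)


if __name__ == "__main__":
    print(fst([0.5, 0.5]))   # identical allele freq. in both demes
    print(fst([0.0, 1.0]))   # demes fixed for different alleles
    print(fst([0.3, 0.7]))   # intermediate differentiation
    # One migrant per generation (Nm = 1) already keeps Fst moderate
    print(island_model_fst(n=100, m=0.01))
```

The last line illustrates the point that even a low migration rate retards the differentiation of demes: with Nm = 1 the approximation gives Fst of about 0.2, regardless of how large each island is.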
METAPOPULATION MODELS = more realistic, more complex
- long-distance movements
- asymmetric migration
- migrants are not a random sample (sex- or age-biased, and related)
Migration processes are far more complex than the available models.
SEX-BIASED MIGRATION
Long-range migration is male-biased:
- inter-continental migrants
- explorers
- traders
- soldiers
- slave trades
02/12/2024 4 MECHANISMS UNDERLYING ADAPTATION
Sometimes biological adaptation can be so effective that it decreases the chance of disease. Mostly mediated by genetic mechanisms. Environmental stresses can trigger adaptation.
Adaptation can be: biological and/or cultural
PHYSIOLOGICAL MECHANISMS ⇒ short-term and reversible responses to environmental changes (they disappear as soon as the stress ends). Not inherited; limited to the individual!!
- acclimatization → adaptation to living at different altitudes
- tanning → rapid physiological adaptation, reversible!
epigenetic mechanism
DEVELOPMENTAL PLASTICITY ⇒ environmental influences on long-term development, rapid but not reversible. Acts at an individual's early stages of development, when plasticity is present. Works at the individual level.
- stresses influencing growth → mother's nutrition during pregnancy = an individual born to a badly fed mother develops metabolic issues lifelong
- influence of early nutrition on metabolism
epigenetic mechanism → inherited for some generations, not too many; that's why we cannot compare it to genetic mechanisms.
FULLY GENETIC MECHANISMS ⇒ the distribution of alleles underlying some phenotypes is governed by evolutionary forces. They impact a small fraction of the genome. Most observed variation behaves neutrally → under selective pressure, gene expression may change due to natural selection ⇒ it doesn't have the goal of adaptation!! No finalism, because only the individuals that survive can reproduce.
ADAPTATION BY NATURAL SELECTION
Evolutionary paradigm of adaptation by natural selection in the framework of the neutral theory of evolution
↓
Success/failure of variants against a background of neutral variation shaped by genome instability, stochastic mutation processes & demographic events
↓
Demographic events leave genomic signatures that are hard to distinguish from those left by natural selection
★ Present-day patterns of variation can be explained by random genetic drift, selective pressures, or the interplay of both
Genomic footprints of selection are also considerably variable, mainly according to the type, age and strength of selection ⇒ recombination can reshuffle the changes made by natural selection (so it is difficult to infer changes in ancient pops)
CHANGE OF ALLELE FREQUENCY ⇒ different responses to specific selective pressures → changes in different directions.
1. purifying selection = negative/background selection; works independently of cultural aspects. Maintains the genomic integrity of functional regions by eliminating deleterious mutations.
- Some diseases are at low freq. in humans because of the action of negative selection.
- Different freq. spectrum between missense/nonsense SNPs & silent SNPs + lower diversity in exons → enrichment of nsSNPs at low freq.
- Some evolutionary constraints act on the genome → they prevent it from accumulating variability (some variants could be detrimental/nonfunctional) + individuals carrying such mutations often die before having the chance to reproduce.
- Nonsense mutations are the variants most likely to be found at the lowest allele frequencies; this overrepresentation decreases as allele frequency increases.
- Quantitative comparisons show how different pops are exposed to the effects of negative selection.
- Mutational processes randomly introduce these variants into the gene pool of the pop.
2. 
balancing selection = when a polymorphism is present as two haplotypes with similar frequency → two different alleles at more or less the same frequency. Maintains high polymorphism in the selected region (a specific locus responds to a specific selective pressure). It takes thousands of years to reach this kind of equilibrium.
Ex: carrying a second genetic disease (balancing action) in order not to develop a severe form of Malaria (main selective pressure). Balancing selection favours heterozygotes → advantageous condition because the second disease is not so severe and, at the same time, the main disease is milder.
- AA = no Malaria, severe second disease
- aa = no second disease, severe Malaria
- Aa = balance between the two diseases: increased resistance to infection, milder symptoms
High incidence of recessive diseases → higher than expected under negative selection.
Ex (wheat example): innate immune genes involved in inflammation. Two haplotypes are noticeably more frequent in the pop → only in European ancestry (red part of the graph) → caused by an endemic pathogen in the European continent.
H1 (aggressive inflammatory responses to pathogens) VS H10 (prevention of inflammatory by-products)
- Disrupted by the recent introduction of diet-related immune-stimulatory epitopes
- A recent cultural shift turned this adaptation into a risk factor ⇒ increased risk of non-celiac wheat sensitivity
Higher H1 freq. in Europeans than in other populations makes them more prone to the side effects of consuming modern wheat varieties
3. positive selection: hard sweep model = favors increase in frequency of advantageous alleles; leads to an outstanding reduction in genetic (and biological) diversity of the selected genomic region.
A new selective pressure is established by an environmental change ⇒ a new allele is introduced by stochastic mutational processes ⇒ the allele is useful: it impacts a biological trait, and the changed trait acts positively against the external stress.
- it is rare for the new allele to spread → it takes time, and humans are a young species
- the change arises in the genome of a single individual and is then transmitted
- most tested model, but not the most realistic one given H. sapiens demography
4. positive selection: soft sweep model = favors increase in frequency of advantageous alleles; a single gene is involved in the adaptation process; selection on different de novo mutations; selection on standing variation (variants already present that become advantageous in the new environment).
More realistic model in humans, in which new mutations are not so likely. Different haplotypes according to the different genetic variants present in the individual
- Difficult to test due to the moderate reduction in diversity (often requires functional data)
5. Positive selection: polygenic adaptation model = favors increase in frequency of advantageous alleles. Difficult to test because of the moderate reduction in diversity and the small effect of single genes/variants on the phenotype. Selection also acts on standing variation
6. Positive selection: adaptive introgression = favors increase in frequency of advantageous alleles; it is difficult to test selection in archaic species & to distinguish introgressed loci from ancestral polymorphisms. Specific alleles of one pop are introduced into the gene pool of another specific pop through gene flow.
Emerging evidence of human adaptations/maladaptations
Different responses of different pops to the same disease may be due to different susceptibility.
- Positive/balancing selection if: a simple disease is more frequent than expected given mutation-selection balance, or has a particular geographical distribution.
- Unusual distributions of Mendelian diseases can almost always be explained by genetic drift.
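The AA / Aa / aa malaria example under balancing selection and the rise of an advantageous allele under positive selection can both be followed with the standard one-locus selection recursion. The sketch below is illustrative only: the fitness values (w_AA, w_Aa, w_aa), the selection coefficients s and t, and the function names are hypothetical and do not come from the slides. It assumes an infinite, randomly mating pop with no drift, mutation or migration.

```python
def next_freq(p: float, w_AA: float, w_Aa: float, w_aa: float) -> float:
    """Frequency of allele A after one generation of viability selection."""
    q = 1.0 - p
    mean_fitness = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / mean_fitness


def iterate(p0: float, w_AA: float, w_Aa: float, w_aa: float,
            generations: int) -> list:
    """Trajectory of the frequency of A over a number of generations."""
    trajectory = [p0]
    p = p0
    for _ in range(generations):
        p = next_freq(p, w_AA, w_Aa, w_aa)
        trajectory.append(p)
    return trajectory


def overdominance_equilibrium(s: float, t: float) -> float:
    """Stable frequency of A when w_AA = 1 - s, w_Aa = 1, w_aa = 1 - t."""
    return t / (s + t)


if __name__ == "__main__":
    # Heterozygote advantage: AA pays a cost s (severe second disease),
    # aa pays a cost t (severe malaria); values are hypothetical.
    s, t = 0.8, 0.2
    balanced = iterate(0.01, 1.0 - s, 1.0, 1.0 - t, 500)
    print("balancing selection:", balanced[-1], overdominance_equilibrium(s, t))
    # Directional selection: a rare advantageous allele rises in frequency.
    sweep = iterate(0.001, 1.10, 1.05, 1.00, 1000)
    print("positive selection:", sweep[-1])
```

With heterozygote advantage the disease allele settles at an intermediate frequency instead of being removed, which is why recessive diseases can be more frequent than negative selection alone would predict.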
see slide 3-9
09/12/2024 5
Complex diseases' risk alleles at high frequency
Paradoxically, some alleles that substantially increase disease risk are found at high freq.
- Selective benefits of malaria resistance outweigh the disadvantage of thalassemia or sickle-cell disease
- Similar scenarios may account for some of the differences in allelic architecture of different common diseases
T2D (Type 2 Diabetes) ⇒ a series of genetic variants that regulate energy metabolism accumulated in order to handle food shortage. When humans were hunter-gatherers and could not eat frequently, natural selection favoured genetic variants promoting rapid insulin release after eating → glucose is quickly taken up from the blood and is more likely to be stored as fat.
Has genetic variation played a role in the prevalence of T2D in westernized Pacific Islanders, Native Americans & Australian Aborigines?
+ Island of Nauru in Micronesia: 40% of people over 15 have T2D
+ High rate of diabetes in Pima Indians of Arizona, but not the same risk alleles as in other pops
Several possible explanations have been proposed, but all remain speculative
1) Thrifty genotype hypothesis
Optimized to store energy as fat, to help in famine periods
A thrifty genotype rendered detrimental by progress ⇒ modern pops don't need this condition because we are more sedentary and have access to more food → now it increases the chance of cardiovascular disease → insulin resistance is the first stage towards obesity and cardiovascular disease.
The thrifty genotype could have been favorable in a wider group of pops until very recently (nutrition among farmers might have been even poorer than among hunter-gatherers) → since agriculture didn't resolve food shortage, the allele should be present in different pops → the advantageous trait should be widespread across pops (high allele frequency).
Individuals who store fat more efficiently should survive and be better able to reproduce → being thrifty is associated with the tendency to be overweight → not so right!
Lack of evidence for differential survival of obese & lean individuals in famines, which is necessary for the spread of thrifty alleles
○ food shortage = selective pressure that kills the most fragile components of the pop (the oldest and the youngest, who aren't in the reproductive stage) ⇒ NOT THE REAL SELECTIVE PRESSURE!!
○ Problem of post- rather than pre-agricultural societies
○ instead, PATHOGENS are the real selective pressure → during famine, people die from disease and not from starvation → data is not reliable and the thrifty hypothesis doesn't work
^ If true, high frequency of the genotype because of natural selection → positive selection
^ Different incidence of diabetes in pops, different allele frequencies → so the selective event wasn't in early Homo (the risk haplotype would be fixed)
^ If recent, risk haplotypes are combinations of genetic variants in which we observe the derived allele instead of the ancestral one
2) Drifty genotype hypothesis
❖ genetic variants that increase the body's capability to store fat (as for the thrifty genotype);
❖ in the ancestral pop, metabolism did not have to cope with excessive fat intake
Lipid storage genes accumulated variation that was essentially neutral (i.e. 
influenced mainly by drift)
These variants revealed their functional consequences under the pressure of a high-fat diet
Modern variation in T2D prevalence is unsurprising, whereas long-term selection for thrifty alleles would have led to the fixation of risk alleles in all pops
❖ increases reproductive capability
❖ thrifty BUT evolves neutrally → lipid storage genes accumulated neutral variation → acts stochastically on the gene pool
❖ the environmental setting does not trigger disease this time → variants revealed their functional consequences under the pressure of a high-fat diet
❖ increase of allele frequency is just due to genetic drift
^ If true, high frequency of the genotype because of genetic drift
T2D: evolutionary genetics testing
If the thrifty genotype hypothesis is correct, haplotypes associated with increased T2D risk should have undergone positive selection
+ If the selective event was in early Homo (~1 MYA), risk haplotypes would be fixed in humans
+ If the selective event was recent (5-50 KYA), risk haplotypes should be derived + at higher freq. than expected for their age
Examples:
T2D: selection on protective alleles in East Asians → the protective derived allele shows a high degree of pop differentiation → striking increase in frequency that may be mediated by natural selection.
⇒ the result is OPPOSITE to the one predicted by the thrifty gene hypothesis! Not yet a biological explanation. Natural selection kept the protective allele at high frequency, not the risk allele!
^ Drifty genotype hypothesis is more likely.
^ Previous adaptations evolved to decrease diabetes susceptibility → because of a rice-based diet maintained for many years → they modulate susceptibility to disease (reduced insulin resistance).
- Low freq. in Africa
- Intermediate freq. 
in Europe
- 95% in East Asians (with significant EHH scores)
Introduction of rice consumption
- High glycemic index
- Selection on standing variation that increased cell sensitivity to insulin
- Reduced insulin resistance & T2D risk
T2D: selection on risk alleles in East Asians → result of natural selection = increased insulin levels and basal metabolic rate (increased capability for the cell to burn energy = insulin resistance = metabolic diseases) → heat dissipation + T2D risk → adaptation to colder climate (energy used to maintain body temperature)
^ higher risk, but not thrifty, because the selective pressure is cold climate
T2D in the Italian pop: different demographic history alters the frequency of the genetic variants → no difference in biological traits though. Genetic differences are very small and may be due to exposure to very different selective pressures.
- polygenic adaptation was tested for northern Italy ⇒ experienced selective pressure from cold climate → a high-calorie diet is needed for thermogenesis.
- Climate-mediated adaptive evolution at the insulin secretion pathway in N_ITA
- Thermogenesis/adiposity: optimization of energy metabolism in a temperate climate with cold winter seasons
- Climate & fat-rich diets as selective pressures → modulation of lipid metabolism → reduced cardiovascular & T2D risks
Differential distribution of nutritional resources and pathogens → complex genomic background despite restricted geographical extension.
Other risk alleles at high frequency in the Italian pop → targets of natural selection.
^ Pathogen-driven selective pressures ⇒ increased risk of autoimmune disease + increased resistance to infections by mycobacteria.
^ Diet-related inflammatory risk ⇒ divergent adaptive evolution made some Italian groups more exposed to modern metabolic and immune challenges triggered by dietary shifts.
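The drifty genotype hypothesis discussed above for T2D attributes differences in risk-allele frequency among pops to genetic drift alone. A minimal Wright-Fisher sketch of that idea follows. It assumes neutral alleles, diploid demes of constant size N and no migration. The function names, deme sizes and seed are hypothetical and purely illustrative.

```python
import random


def wright_fisher(p0: float, N: int, generations: int,
                  rng: random.Random) -> list:
    """Neutral drift: each generation resamples 2N allele copies at random."""
    trajectory = [p0]
    p = p0
    for _ in range(generations):
        # Binomial sampling of 2N copies with success probability p.
        copies = sum(1 for _ in range(2 * N) if rng.random() < p)
        p = copies / (2 * N)
        trajectory.append(p)
    return trajectory


def simulate_demes(p0: float, N: int, generations: int,
                   n_demes: int, seed: int) -> list:
    """Final allele frequency in several isolated demes with the same start."""
    rng = random.Random(seed)
    return [wright_fisher(p0, N, generations, rng)[-1] for _ in range(n_demes)]


if __name__ == "__main__":
    # Same starting frequency, no selection: the final frequencies differ
    # between demes purely by chance.
    print(simulate_demes(p0=0.2, N=100, generations=200, n_demes=5, seed=1))
```

In such a simulation the final frequencies vary from deme to deme with no selective advantage, which is the pattern the drifty hypothesis expects for lipid-storage variants.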
Addiction in Native Mexican people: psychoactive plants and alcoholic beverages used in religious practices/traditional medicine → adaptations that increased tolerance to their harmful effects → high risk of developing addictions due to the disruption of traditional diets (ex: increased consumption of alcohol and junk food)
16/12/2024 ppt 6
Risk alleles at high frequency: KIDNEY DISEASES
- Kidney failure is 4 times more frequent in African-Americans than in other Americans
- APOL1 = apolipoprotein found to be related to susceptibility to kidney disease → in the Yoruba pop in Africa. African ancestry was studied, but the same results can be applied to American people with African ancestors.
- The two APOL1 high-risk haplotypes are at ~33% combined freq. in African-American controls
○ 2 nonsynonymous SNPs in perfect LD (rs73885319 & rs60910145) at 52% in cases → positive selection in YRI (iHS scores)
○ rs71785313, in-frame 6-bp deletion, at 23% in cases → frequency too low to test
- Risk haplotypes are also present in healthy individuals → combination of genetic variants that is well distributed across people with African ancestry → positive selection maintains the combination of variants (meaning they are advantageous even though they might be a risk for the individual).
- Might have developed as a response to a pathogen → disease-associated APOL1 variants were able to lyse & thus protect against Trypanosoma brucei (sleeping sickness), an endemic disease.
Sleeping sickness applied more selective pressure than kidney disease (because kidney failure appears later in life = less impact and less intense)
Selection for resistance to sleeping sickness resulted in high freq. of kidney disease risk variants
- Increased life expectancy in the American pop ⇒ increased probability that the risk alleles cause impairment, and thus kidney disease. Haplotypes are conserved even though the trait isn't adaptive → maintained because it is neutral (it would take thousands of years to remove it).
T1D (type 1 diabetes) & CELIAC DISEASE
- A 1.6 Mb region with extended LD on chr 12q24, associated with platelet number & volume in Europeans
- Typical of European ancestry
- large region of the genome that is targeted and strictly associated with the trait and with the modulation of the immune response
Haplotype risen to high freq. due to positive selection beginning ~3.4 KYA
Derived allele associated with increased risks of CAD, T1D & celiac disease
- increased frequency of the allele in the region → a selective pressure triggered this adaptation → the derived allele might be the target of positive selection → associated with risk of CAD, T1D and celiac disease
- T-cell functionality is affected
Among the 15 genes in the region, SH2B3 showed an amino acid difference (rs3184504) between selected/non-selected haplotypes
Implicated in T-cell-mediated immune response
- we don't know which pathogen triggered this adaptation
Selection for a highly active immune system in past pathogen-rich environments was advantageous, but is detrimental in current Western societies
AUTOIMMUNE DISEASES
A study of 107 risk SNPs in 7 autoimmune diseases found that 44% were shared between them:
- Celiac disease
- Crohn's disease
- Multiple sclerosis
- Psoriasis
- Rheumatoid arthritis
- Systemic lupus erythematosus
- Type 1 diabetes
They share the same functional basis for susceptibility (e.g. SNPs within or near functional elements that are active in T cells)
Part of the pathological condition is the reaction of the immune system against the organism itself. These diseases have a common genetic base → same functional basis for susceptibility, related to the evolutionary history of European ancestry, due to SNPs within or near functional elements that are active in T cells
Advantageous in the past (pathogen-rich environments) → but detrimental in current Western societies.
Nowadays our immune system is exposed to new antigens contained in industrially produced food or present in the environment (pollution, chemicals, the soil). → increased incidence of autoimmune diseases in the last century (interplay between our cultural life and habits)
Mechanisms underlying human adaptation
PHYSIOLOGICAL MECHANISMS = rapid changes, short-term & reversible responses to environmental changes (e.g. acclimatization, tanning)
DEVELOPMENTAL PLASTICITY = long-term changes, environmental influences on long-term development (e.g. influence of early nutrition on metabolism)
○ Change in the biological trait expressed by the individual → epigenetic mechanisms, no change in the genome sequence; the changes are rapid compared to genetic ones → that's why they can't be considered from an evolutionary perspective (they aren't transmitted through generations).
^ Neither plays a significant role in the light of evolution because they are limited to the individual.
Epigenetic variations are due to transitions during the life span and happen more frequently; useful when there is an environmental change or a change in the cultural setting of the pop.
Adaptation mediated by epigenetic mechanisms?
Epigenetic changes create phenotypic diversity → reservoir of variation useful for rapid adaptation
Epigenetic inheritance
- Intergenerational (indirect exposure to stimuli)
- Transgenerational (epialleles maintained also in the non-exposed generation)
When natural selection acts on epigenetic variation, adaptive phenotypes arise before genetic changes
Genetic mutation rate: 10^-6 to 10^-9 substitutions per nucleotide per generation
Methylation/demethylation rate: 10^-4 per cytosine per CG pair
Strong selection is needed to fix fast-evolving epimutations → their transmission fidelity is very low, so they are difficult to inherit.
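The gap between the two rates quoted above can be made explicit with a short calculation. This is only an order-of-magnitude sketch: it treats the methylation/demethylation rate and the nucleotide substitution rate as directly comparable per-site, per-generation rates, which is a simplifying assumption. The constant and function names are hypothetical.

```python
# Exponents of the rates quoted in the notes (powers of ten).
MUTATION_RATE_EXPONENTS = (-6, -9)   # substitutions per nucleotide per generation
EPIMUTATION_RATE_EXPONENT = -4       # methylation/demethylation per CG pair


def fold_difference(epi_exponent: int, mut_exponent: int) -> int:
    """How many times more frequent epimutations are than mutations."""
    return 10 ** (epi_exponent - mut_exponent)


if __name__ == "__main__":
    for mut_exponent in MUTATION_RATE_EXPONENTS:
        fold = fold_difference(EPIMUTATION_RATE_EXPONENT, mut_exponent)
        print("mutation rate 1e%d -> epimutations %d times more frequent"
              % (mut_exponent, fold))
```

Under this assumption epimutations arise, and revert, between a hundred and a hundred thousand times more often than sequence mutations, which is why only strong and continuous selection can keep them across generations.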
Inheritance occurs if the selective pressure is intense and continuous in the next generation (ex: living at high altitudes → strong selective pressure: immediate adaptation is essential for survival)
*T2D is associated with an epigenetic background ⇒ short-term response to environmental variation via developmental plasticity can modulate T2D risk
- Poor nutrition in utero & in early infancy may be a major cause of increased T2D risk
- The fetus adapts to maternal malnutrition by becoming thrifty (i.e. decreased growth, hormonal/metabolic adaptations & changes in insulin secretion)
- Transition to over-nutrition later in life makes this disadvantageous
Weak selection is sufficient to fix highly heritable genetic variants → they pass to the next generation easily even if the selective pressure is not so intense.
★ Most of the differences that we observe between human pops are differences at the epigenetic level that are regulated by genetic differences! Epigenetic differences are strictly related to genetic variants ⇒ therefore, even the epigenetic profile may be useful to reconstruct the evolutionary history of the pop! → but it provides no further information on adaptation mechanisms.
Local pathogens shaped epigenetic variation in genes involved in the immune reaction or in the xenobiotic response of the organism to environmental stresses.
↓
Case study of African rainforest hunter-gatherers and farmers:
- First to be conducted to prove the correlation between epimutations and pop adaptive history.
- Recent ecological shifts in Africa (such as the introduction of agriculture) led to variation in immune and cellular functions.
- Developmental processes + association with adaptive SNPs
- Cultural improvement (agriculture) allowed genetic differences to accumulate throughout the pop (farmers reproductively isolated from hunter-gatherers) → random change in allele frequency that affected the gene pool of the pop independently.
- First an epigenetic response to environmental challenges, then the adaptive phenotype was fully achieved via genetic changes.
^ Results:
1. Differences in DNA methylation are due to developmental (basic) processes → differences due to genetic variants and adaptations (to a different diet) mediated by natural selection on genetic variants, in turn able to modulate epigenetic variation.
2. DNA methylation levels observed in the groups are related to genetic differences.
^ Another experiment on:
- Pop that adopted agriculture in the forest
- Pop that adopted agriculture in a rural environment
- Pop that adopted agriculture in an urban environment
^ we are not able to appreciate relevant differences between these groups BUT we observe methylation differences related to immune processes. We see that pops start to adapt and differentiate from an epigenetic perspective (because these changes are more rapid).
★ GENETIC MECHANISMS = the pop is adapted because individuals inherit a genetic adaptation to live in that specific environment, generation after generation.
★ EPIGENETIC MECHANISMS = the single individual develops its own epigenetic adaptation → first attempt to develop a permanent adaptation. Evolutionary advantage that keeps the pop alive while it is genetically evolving!!
Case study: The genetic architecture of adaptation to high altitude in Ethiopia
^ Difference in the time frame of the adaptations
- Tibetan pops (Himalaya): had a lot of time to evolve at high altitudes = effective genetic adaptations
- Nepal pops: developed the most advantageous adaptations because of the country's morphology
- Indian pops: a lot of time to evolve and adapt to high altitudes, but a slightly less effective adaptation.
- Ethiopian pops: moved to high altitudes very recently → are evolving an adaptation that is still not effective because the selective pressure isn't so strong.
➔ Amhara pop: moved to high altitudes 5k ya → developed genetic adaptation to hypoxia and UV
^ no significant differences between low- and high-altitude pops at the epigenetic level, just a small genetic difference due to adaptation.
➔ Oromo pop: moved to high altitudes 0.5k ya → still adapting via epigenetic changes
^ no significant differences between low- and high-altitude pops at the genetic level, but observable epigenetic differences.
★ DNA methylation as a medium-term response to the environment.