Nutrigenomics Book Material PDF
2022
Cristian Taccioli
Summary
This book provides a concise introduction to nutrigenomics. It covers cellular biology, genomics, and molecular biology concepts, intended for those studying nutrigenomics with or without prior knowledge of these subjects. The author emphasizes the importance of reading scientific publications.
NutriGenΩics

PREFACE

This text is intended to be a valuable tool as an introduction to Nutrigenomics, so I have tried to write as simply and concisely as possible. The book is divided into three parts. The first part deals with the fundamental topics of Cellular Biology, the second with the main notions of Genomics, and the third describes the basics of Nutrigenomics. I believe that this didactic material is particularly suitable for those who want to study Nutrigenomics even if they have no experience in Cellular Biology and Genomics. In fact, in most modern university texts, the chapters on Genomics and Molecular Biology are reduced to a minimum or even omitted. Genomics is, instead, the discipline of the future since, in 2001, the sequencing of the human genome opened the doors to the complete reading of our DNA. Most human and animal diseases have, indeed, a relationship with DNA and food and, therefore, Genomics is very important for understanding Nutrigenomics. I therefore invite readers and students to read chapters II and III of this text with particular attention. As a last note, I would like to emphasize that I have kept the number of monographs and scientific publications in the bibliography to a minimum, so as to stimulate the student to read them, since they represent the basics of this field.

The Author

Table of Contents

PREFACE
Cellular Biology
    Life
    The cell
    Prokaryotes and Eukaryotes
    The tree of life
    Evolution
Genomics
    Genomics: a new science
    DNA
    Various types of DNA structures
    RNA
    Mechanisms that increase or decrease the size of a genome
    GC content in genomes
    Chargaff's second law, Szybalski's and Sinclair's rule
    The human genome
Molecular Biology
    Replication in Prokaryotes
    Replication in Eukaryotes
    Transcription in Prokaryotes
    Transcription in Eukaryotes
    Splicing
    Translation in Prokaryotes
    Translation in Eukaryotes
    Regulation of transcription
    Notes on microRNA
    Notes on Transposons
    Notes on epigenetics
Nutrigenomics
    What is Nutrigenomics all about?
    Origin of human beings
    From vegetarian to an omnivorous species
    Early Neolithic and Agriculture
    Out of Africa
    Sapiens and Neanderthal diet
    General adaptations to Neolithic diet
    Breeding and Lactase persistence in human populations
    Agriculture and AMY1
    Grain consumption and autoimmune conditions
    ADH in human populations
    Summary on LCT, AMY1 and ADH
Technologies used in Genomics Nutrition studies
    PCR
    RT-PCR and qRT-PCR
    Microarray
    Next-Generation-Sequencing (NGS)
    Western Blot
    Northern Blot
    Southern Blot
Bibliography
    Books
    Articles
INDEX

Distribution and copying of this material by any means is not permitted. The images in this text have been downloaded from the "Google Image Repository ©" and are not covered by copyright. Where information has been retrieved from other sources, these have been included in the bibliography.

Version II, 2022
Cristian Taccioli 2022 ©

Cellular Biology

Cell biology is the discipline that studies the structure and function of the cell. In this chapter we will deal with this subject very briefly, so as to provide a basic tool or a review before delving into the topics of the following chapters. In addition, the most famous evolutionary theories will be explained in light of new scientific discoveries.

Life

It is not easy to establish what life is. Suffice it to say that most scientists do not consider viruses to be living beings, because they cannot replicate or carry out any metabolism outside a host cell, while a small minority of scientists consider algorithms created in artificial environments (computers) to be living beings.
In many texts the following definition of life is given: "An entity is defined as living when it is able to be born, to grow, to interact with the inside and the outside, to repair itself, to evolve, to reproduce and to die". In my opinion, however, it is very difficult to draw a line between living and non-living physical systems. Life has to do with the concepts of entropy, complexity and information. Life is an open physical system, not at equilibrium, and in mathematical terms it is difficult to calculate its properties quantitatively. In some environmental situations, if sufficient energy is provided to specific molecules or molecular systems, it is possible to create "semi-replicators", i.e. physical systems with semi-replicative properties. The problem is therefore not easy to solve but, in this text, I will use the so-called "Occam's razor", that is, I will choose the simplest way: life (on our planet) is defined as everything that is provided with nucleic acids (DNA and RNA) that allow it to replicate. For more information on this topic I refer to the works of Erwin Schrödinger (Schrödinger, 1944), Arieh Ben-Naim (Ben-Naim, 2015), Addy Pross (Pross, 2017), Jeremy England (England, 2020), Sean Carroll (Carroll, 2017), John Campbell (Campbell, 1982), and Paul Davies (Davies, 2019).

The cell

The cell is the fundamental unit of all living organisms present on earth. We do not know if there are other forms of life in our solar system, but on our planet the cell has, with different characteristics, a generally homogeneous structure. It is formed by an external envelope, while inside there are structures that keep it intact and alive. Some organisms consist of only one cell, like bacteria, archaebacteria and protozoa, while others have managed to evolve thanks to mechanisms that make millions (sometimes trillions) of cells interact at the same time (multicellular organisms).

Prokaryotes and Eukaryotes

There are essentially two types of cells, and therefore of organisms derived from cells: prokaryotic cells and eukaryotic cells. Prokaryotes (bacteria and archaea) are always unicellular, while eukaryotes can be either unicellular (protozoa) or multicellular (algae, sponges, plants, fungi, amphibians, reptiles, birds and mammals). Prokaryotic cells or prokaryotic organisms (Figure 1) are apparently very simple structures and appeared about 3.5-3.8 billion years ago. Bacteria and archaebacteria are the only groups of prokaryotic organisms that exist today. Viruses are not prokaryotic organisms because they are just nucleic acids (DNA or RNA) covered with proteins.

Figure 1. Prokaryotic cell.

Prokaryotes range in size from 1 to 5 micrometers, with a shape that can vary greatly. Their DNA is circular and is located within the cytoplasm. The cytoplasm is the area of the cell enclosed by the membrane. In the cytoplasm of prokaryotic cells there are no complex structures that transform energy or store chemicals, and there is no real compartmentalization either. There are, however, ribosomes, which are structures dedicated to the synthesis of proteins, and plasmids, which are small accessory circular DNA molecules, distinct from the much larger chromosomal DNA. Furthermore, prokaryotes are often equipped with pili, used mainly for adhesion to surfaces and to other cells, or with flagella, which allow motility in both liquid and terrestrial environments. As we have already said, in prokaryotes there is an external envelope that encloses the cytoplasm. This envelope is extremely complex if we compare it with the apparent simplicity of prokaryotes. It consists of a capsule, a cell wall and a plasma membrane:

The capsule is a polysaccharide structure (a polymer of sugars). It is the outermost structure of the so-called Gram-negative bacteria, although some Gram-positive bacteria (e.g. Streptococci) are also equipped with capsules.
An easy way to remember which organisms have capsules is to memorize the following English mnemonic: "Even Some Super Killers Have Pretty Nice Big Capsules". The initial of each word in this phrase stands for a different encapsulated organism (Escherichia coli, Streptococcus pneumoniae, Salmonella, Klebsiella pneumoniae, Haemophilus influenzae, Pseudomonas aeruginosa, Neisseria meningitidis, Bacteroides fragilis and Cryptococcus neoformans, the last of which is a yeast rather than a bacterium). The capsule, when present, has several functions: it adheres firmly to the cell wall, protects the cell from external threats such as phagocytosis, promotes adherence to surfaces, prevents drying and facilitates the exchange of nutrients with the outside;

The cell wall lies beneath the capsule. It is present in both Gram-negative and Gram-positive bacteria, but with some differences (Figure 2). The cell wall of Gram-negative bacteria is formed by a thin layer of peptidoglycan (a polymer of sugars and amino acids) surrounded by an outer membrane containing lipopolysaccharide (LPS). The cell wall of Gram-positive bacteria lacks an outer membrane but has numerous layers of peptidoglycan. Gram-positive and Gram-negative bacteria take their name from the fact that a scientist named Christian Gram (1853-1938) was able to stain the bacteria he was studying differently. Some of these bacteria, because of their thick layer of peptidoglycan, retained the dark purple color of a dye called crystal violet, while others bound it more loosely and lost it on washing. The latter, however, could be counterstained with a light pink dye (such as fuchsin or safranin) and could therefore also be visualized under a light microscope (Figure 3);

Figure 2. Cell wall in Gram positives and negatives.

Figure 3. Staining of Gram-positive (purple) and Gram-negative (pink) bacteria.

The plasma membrane is formed, on the other hand, by a phospholipid bilayer in which proteins with different functions move. The "heads" of the phospholipids (hydrophilic, i.e. they bind to water) are in contact with the aqueous environment outside and inside the cell, while the tails (hydrophobic, i.e.
they do not bind to water) face each other inside the membrane itself, which does not contain water.

Among prokaryotes there are not only bacteria but also another group of unicellular living beings called archaea. The domain of the archaea probably includes the first prokaryotes to have appeared (or arrived) on earth, although this topic is still debated among scientists. A proper phylogenetic analysis of the DNA of archaebacteria, bacteria and eukaryotes shows that eukaryotes are more similar to archaebacteria than to any other group of prokaryotes. Many archaebacteria live in extreme environments, in terms of temperature as well as salinity, pH, etc. The plasma membrane of archaebacteria is also formed by phospholipids, but their lipid chains are joined to glycerol by ether bonds, which are chemically more stable than the ester bonds found in bacteria and eukaryotes, perhaps to maintain a more solid structure in environments that we can consider extreme.

Eukaryotic cells (Figure 4) are instead more complex. They have a high degree of compartmentalization of the cytoplasm, which is organized by a protein structure called the cytoskeleton and contains numerous organelles (small structures, many of which have their own outer membrane). They are surrounded only by a plasma membrane, although some of them also have a wall formed by polysaccharides (for example, species belonging to the kingdom of plants).

Figure 4. Structure of the eukaryotic cell.

Let's start by seeing which organelles (Figure 5) are present in the cytoplasm of eukaryotic cells:

The nucleus is the organelle that contains the chromosomal DNA. It is the largest organelle of the eukaryotic cell and, as we will see later, it is the parameter that more than any other determines the size of a cell. It has a porous membrane that separates it from the cytoplasm. Its shape varies from cell to cell and from species to species. Inside the nucleus the DNA is grouped into a number of chromosomes specific to each taxonomic group.
Chromosomal DNA is bound to proteins called histones, which are involved in the regulation of DNA structure and of replicative and transcriptional activity (see the chapter on "Nutrigenomics"). The nuclear envelope consists of an inner and an outer membrane. The inner membrane is bound to the chromosomal DNA through a rich network of proteins called lamins (see next paragraph), while the outer membrane is in direct contact with the cytoplasm;

The cytoskeleton is a system of cytoplasmic filaments composed of proteins and is present in every eukaryotic cell. Its function is to support and shape the cell. It also creates specific compartments in the cytoplasm and facilitates the transport of molecules and macromolecules. The cytoskeleton consists of microfilaments, intermediate filaments and microtubules. Microfilaments are the thinnest filaments of the cytoskeleton; they are formed by actin and have a diameter ranging from 5 to 7 nm. These thin structures are polar, in the sense that their two ends are structurally different (a "plus" end and a "minus" end) and grow at different rates. They also have "ATPase" activity (they bind and hydrolyze ATP, a molecule capable of supplying chemical energy) and can bind to other types of proteins in order to obtain a greater transport capacity. Intermediate filaments, on the other hand, are about 10 nm in diameter and are much stronger than microfilaments, even though they are not polar (their two ends are equivalent). They are composed of vimentin, desmin, keratin and lamin. In particular, lamin is a fibrous protein that forms the nuclear lamina or nucleoskeleton, which has the same function as the cytoskeleton in the cytoplasm but is present inside the nucleus. Microtubules, finally, are formed by the alpha and beta tubulin proteins and, in addition to being part of the cytoskeleton, also form cilia and flagella (Figure 6);

Figure 5. List of organelles in eukaryotic cells

Figure 6.
Structure of flagella in Gram-negative prokaryotes (left) and eukaryotes (right)

Flagella are structures that are used by cells for movement. In eukaryotes they are made up of microtubules and use ATP as an energy source. They are anchored to the membrane through the basal bodies. In prokaryotes, however, flagella are composed of flagellin and are anchored by one pair (in Gram-positive bacteria) or two pairs (in Gram-negative bacteria) of basal rings;

Mitochondria are cytoplasmic structures dedicated to the transformation of energy through the formation of molecules called ATP. Many cellular functions would not take place without ATP. ATP, in fact, donates a phosphate group to some proteins, which are thus activated. During this process ATP loses a phosphate group and turns into ADP. It is in the mitochondria that ADP is recharged and transformed back into ATP. Inside the cell there are numerous mitochondria, and each of them contains its own DNA, called mitochondrial DNA, which is transcribed into messenger RNAs that are in turn translated into various mitochondrial proteins. Structurally, the mitochondrion is found in the cytoplasm and has the shape of a bean. It is surrounded by an inner and an outer membrane (Figure 7). Mitochondria are present in eukaryotic cells but not in prokaryotic cells.

Figure 7. Structure of the Mitochondrion

The inner membrane folds into the mitochondrial cristae (ridges) and encloses the matrix. Oxidative phosphorylation, the biochemical process that leads to the production of most ATP, occurs on the mitochondrial cristae after the processes called glycolysis and Krebs cycle have taken place. Oxidative phosphorylation is, therefore, the final phase of cellular respiration. It also exists in prokaryotes, where it takes place at the plasma membrane. From evolutionary studies it seems that, about 1.5 billion years ago, a host cell incorporated a smaller bacterium, probably an alpha-proteobacterium (Thrash, 2011).
The latter would then have transformed into what we now call the mitochondrion. This is why there are similarities between mitochondrial and prokaryotic DNA. The counterpart of the mitochondrion in plants is the chloroplast, where photosynthesis takes place. In the chloroplast, light energy is captured by chlorophyll, which allows the transformation of CO2 (carbon dioxide) and H2O (water) into glucose (C6H12O6), with the release of oxygen. The mitochondrion is nevertheless also present in the plant cell, even if its function there seems to go beyond the task it performs in animal cells. It seems, in fact, that the DNA of the chloroplast, which is larger than animal mitochondrial DNA (more than ten times), contains important information for the development and differentiation of the plant (Liberatore, 2016);

Ribosomes (Figure 8) are organelles located in the cytoplasm and formed by ribosomal RNA (rRNA) and proteins, assembled into two subunits. The ribosome is therefore not a simple protein but a ribonucleoprotein complex made of two subunits. In prokaryotes the sedimentation coefficient of the ribosome (measured in Svedberg units, S, which describe how fast a particle sediments in a centrifuge and depend on both its mass and its shape) is 70, so the ribosome is called 70S and is formed by a small 30S subunit and a larger 50S subunit. Sedimentation coefficients are not additive: the small and large subunits of eukaryotic ribosomes, 40S and 60S respectively, form a complex called 80S, not 100S. Translation takes place in the ribosomes, i.e. the mRNA (messenger RNA) that is transcribed from DNA is translated into proteins. This function of reading the mRNA and subsequently assembling amino acids to form proteins is carried out both by eukaryotic ribosomes and by the ribosomes of prokaryotic cells;

Figure 8. Structure of the ribosome

The endoplasmic reticulum (Figure 9) is a system of membranes found in the cytoplasm. It is very complex and is structurally divided into the Rough Endoplasmic Reticulum (RER) and the Smooth Endoplasmic Reticulum (SER).
On the former are found the ribosomes that synthesize proteins; in the latter the synthesis of lipids, phospholipids and steroids takes place. The presence of ribosomes on the RER does not imply that they cannot perform their function in other cellular sites as well;

Figure 9. Structure of the Endoplasmic Reticulum

The Golgi apparatus (Figure 10) is a cytoplasmic organelle whose function is to modify proteins (for example by adding sugars), sort them and direct them, through vesicles, to specific compartments;

Lysosomes are small cytoplasmic organelles that contain enzymes responsible for digesting numerous molecules that are useless or harmful to the cell;

Peroxisomes are vesicles located in the cytoplasm, and their function is to break down harmful molecules such as hydrogen peroxide and related compounds;

Vacuoles are organelles usually found in all plant and fungal cells, as well as in some protozoa. Typically, their function is to store water and waste products, balance the pH (acidity and basicity), and help maintain pressure within the cell itself;

Centrioles (Figure 11) are protein complexes present in most animal cells (higher plants lack them). Their purpose is to organize the mitotic spindle. The mitotic spindle is a cytoplasmic structure composed of microtubules that binds the chromosomes, thus allowing an equal distribution of the chromosomes between the two daughter cells during the division of somatic cells (mitosis) or sex cells (meiosis). During mitosis and meiosis, one pair of centrioles is present at each pole of the cell; the two centrioles of each pair lie orthogonal to each other, and from them the microtubules of the mitotic spindle originate;

Proteasomes are protein complexes that degrade proteins into short peptides (catabolism).
They differ from lysosomes in that the latter have an outer membrane that separates them from their surroundings, so that they can use lytic enzymes (degradation enzymes) to degrade a wide variety of organic molecules.

Figure 10. Structure of the Golgi apparatus

Figure 11. Structure of the centrosome and centrioles

In conclusion, the major differences between eukaryotic and prokaryotic cells, also summarized in Figure 12, are mainly these:

Prokaryotes have sizes ranging from 0.3 to 2 micrometers, while in eukaryotes the size ranges from 5 to 100 micrometers, except for spermatozoa;
Prokaryotic DNA is circular, whereas eukaryotic DNA is linear;
In prokaryotes transcription takes place in the cytoplasm (precisely because they do not have a nucleus), whereas in eukaryotes it happens in the nucleus;
Prokaryotes have smaller ribosomes than eukaryotes;
Prokaryotes lack the many organelles found in eukaryotes;
Prokaryotes move by means of flagella composed of flagellin, while eukaryotes move by means of flagella and cilia formed by tubulin;
Prokaryotes divide by binary fission, eukaryotes by mitosis (in somatic cells) and meiosis (in sex cells).

In particular, reproduction in prokaryotes generally corresponds to a simple cell division that leads to two daughter cells identical to the original mother cell. It is called binary fission and is preceded by DNA replication and by duplication of the structures that make up the prokaryotic cell (plasmids and ribosomes). There are other forms of reproduction in prokaryotes, such as budding (a bud forms on the mother cell and detaches to become the daughter cell), sporulation (the mother cell divides into a large number of daughter cells) and other less common types of cell division. Reproduction in eukaryotes is, instead, more complex. In unicellular eukaryotes (protozoa) reproduction can be both asexual and sexual.
In the first case it takes place by mitosis, which is more complex than cell division in prokaryotes. In the second case (sexual reproduction) it takes place by meiosis and involves the formation of gametes, that is, of sex cells that carry half of the chromosome set. From the meeting of two gametes of the same species the zygote is formed, which represents the new organism. In higher eukaryotes (sponges, fungi, animals and plants) there is both mitosis in the somatic cells (blood, muscle, brain, etc.) and meiosis in the cells that give rise to gametes (egg cells and spermatozoa). The various phases of mitosis and meiosis are described in detail in Figure 13 and Figure 14. The various phases of the cell cycle are shown instead in Figure 15.

Figure 12. Differences between prokaryotes and eukaryotes

Figure 13. Mitosis. Prophase: the (duplicated) DNA condenses into chromosomes; Prometaphase: the microtubules of the centrosome attach to the chromosomes; Metaphase: the chromosomes align in the centre of the cell; Anaphase: the sister chromatids separate; Telophase: the two nuclear membranes appear around the new chromosome sets. During mitosis, therefore, from a cell with chromosome set 2n, two other cells with chromosome set 2n are obtained.

Figure 14. Meiosis.
Prophase I: the homologous (duplicated) chromosomes (paternal and maternal) pair up, the nuclear membrane disappears, recombination by crossing-over takes place (mutual exchange of DNA between homologous chromosomes) and the microtubules of the centrosomes subsequently attach to the chromosomes; Metaphase I: the pairs of homologous chromosomes align along the equatorial plate; Anaphase I: the homologous chromosomes separate and each moves towards its own pole; Telophase I: the chromosomes de-condense, the nucleus is reformed and the organelles are distributed between the two cells that are forming; Prophase II: the chromosomes become visible again and the nucleus disappears, while the microtubules of the centrosomes attach to the chromosomes; Metaphase II: the chromosomes align along the equatorial plate; Anaphase II: the chromosomes divide into chromatids, which move to opposite poles; Telophase II: the chromosomes de-condense and the nucleus reappears. During meiosis, therefore, from a cell with chromosome set 2n we arrive at four cells with chromosome set n.

Figure 15. Cell cycle. G1 = cell growth; S = DNA replication; G2 = cell growth termination; I = interphase (G1+S+G2); M = mitosis; G0 = quiescent state.

The tree of life

As we have seen, there are two cell types, which also give their names to two groups of organisms: prokaryotes and eukaryotes. Prokaryotes are unicellular and obviously made up of prokaryotic cells. They are the simplest and most numerous organisms on earth and are divided into bacteria (Gram-positive and Gram-negative) and archaea. Eukaryotic organisms, on the other hand, can be both unicellular (protozoa) and multicellular (algae, fungi, plants, fish, amphibians, reptiles, birds and mammals). By studying the genomes of living organisms it is now possible to demonstrate that we all derive from prokaryotes, following an evolutionary line which has not yet stopped (Figure 16).

Figure 16.
Tree of life

Evolution

The first idea of evolution came from Anaximander of Miletus (Censorinus, ~3rd century AD), a Greek pre-Socratic philosopher who lived around the 6th century BC. Anaximander elaborated a theory according to which all vertebrates derive from fish which, abandoning their natural environment, moved to the land during the succession of geological eras. However, the theory of evolution was forgotten for the duration of the classical period and the Middle Ages. It took the French Enlightenment to bring the idea of evolution back into vogue, with naturalists such as Georges-Louis Leclerc de Buffon (Buffon, 1749 - 1767), Jean-Baptiste Lamarck (Lamarck, 1809) and Geoffroy Saint-Hilaire (1829), until Charles Darwin (Darwin, 1859) formulated a clearer and more convincing hypothesis in 1859.

Jean-Baptiste Lamarck was a botanist but won the chair of Zoology at the National Museum of Natural History in Paris. He was among the first to use the term "biology" and the first to propose a solid theory of evolution. His hypothesis was published in 1809 under the title "Philosophie Zoologique" (Lamarck, 1809). The theory held that the characteristics acquired during the life of an organism (e.g. muscle tone, weight, etc.) could be inherited by the offspring and transmitted to the following generations. Snubbed and mocked throughout his life, he was buried in a mass grave and his remains have been lost. His theory was not successful even in the periods to come, but it has recently been re-evaluated after the advent of a new discipline: epigenetics (see next paragraphs).

Charles Darwin was the scion of a family of English doctors.
He abandoned his medical studies and, in 1831, embarked on a ship named HMS Beagle, which took him to the Galapagos Islands (an archipelago of thirteen volcanic islands located in the Pacific Ocean, about 1,000 kilometers from the western coast of South America). There, observing finches and other animals, he elaborated an evolutionary theory according to which the individuals with characters that better predispose them to live in their environment will transmit these characters to their offspring, while the others will succumb to the selection that nature carries out on every living organism. His famous book "The Origin of Species" (Darwin, 1859), published in 1859, was, from the very beginning, what we today would call a "best-seller". His theory is still accepted, even though the advent of epigenetics poses increasingly frequent and insistent questions. The history of evolution in human thought is very rich and there are entire volumes on this subject. A very interesting and comprehensive book on evolution is "The Structure of Evolutionary Theory", published in 2002 by Stephen Jay Gould (Gould, 2002).

Genomics

Molecular biology originated in 1953 with the discovery of the structure of DNA (Watson & Crick, 1953; Wilkins, 1953; Franklin, 2003) and is concerned with the study of the molecular mechanisms that regulate the cell. The difference between genetics and molecular biology lies in the fact that molecular biology goes into the details of the mechanisms that regulate the function of DNA, RNA and proteins, whereas genetics deals with the heritability of the phenotypic and molecular characters passed on to the offspring. It can, therefore, be said that Molecular biology is the foundation of all biological and medical disciplines. From Molecular biology a new discipline has recently emerged that goes by the name of Genomics. Genomics is a branch of molecular biology that deals with the study of the genome of living organisms.
In particular, it deals with the structure, content, function and evolution of the genome (the content of DNA and RNA within the cell). Notably, it is a science that relies on bioinformatics to process and visualize the enormous amount of data it produces. An even newer science is Nutrigenomics, which concerns the study of the human genome in relation to food. In this chapter we will first give a brief overview of genomics and then of the basics of molecular biology. In the next chapter we will deal in depth with Nutrigenomics.

Genomics: a new science

Johann Gregor Mendel (Mendel, 1865) was an Augustinian monk who lived at the time of the Habsburg Empire and is today considered the father of Genetics. By Genetics we mean today the set of all those laws that describe heredity, while Molecular Biology is considered the branch of the life sciences that focuses on the molecular mechanisms underlying all cellular processes. Mendel, however, was not the first to notice that the phenotypic characters (morphological characteristics) of parents are transmitted to their children; we already have evidence of this in "De Rerum Natura", written by Lucretius in the first century BC. What makes Mendel a genius is the fact that he was the first to create a mathematical model that could describe hereditary transmission. Naturally, he used genetic characters on which it was possible to make simple probabilistic calculations, such as the color of peas and their roughness. In reality, the genetic traits that today we call "Mendelian", and to which the famous "Mendel's laws" (Mendel, 1865) apply, are not many; in fact, the good Gregor chose those characteristics that best suited the mathematical calculation. Unfortunately, no one understood the scope of his work and, in the darkness of his monastic cell, he died of acute nephritis in 1884.
Not even Charles Darwin, the naturalist who devised the hypothesis of evolution by natural selection (Darwin, 1859), had probably ever read Mendel's work, even though they both lived in the same historical period. Thirty-five years after the publication of the monk's work, the Dutchman Hugo de Vries, the German Carl Correns, the American William Jasper Spillman and the Austrian Erich von Tschermak came to the same conclusions as Mendel and brought his writings to light. From that moment on it was clear that, within every living organism, there was a substance that transmitted genetic characters in a methodical and organized way. In 1909 the botanist Wilhelm Johannsen (Johannsen, 1909) called this substance a gene.

Today we know that every living unit (the cell) is like a small self-sufficient universe capable of growing, repairing itself, exchanging information with the environment, evolving, replicating and dying. Inside almost every cell (some cells lack it) there is a molecule called DNA that carries genetic information and transmits genetic characters to future generations in an almost perfect way (mutations, sometimes positive and sometimes negative, occur with a frequency that varies from species to species). In addition, DNA leads to the formation of a molecule called RNA, which in turn directs the synthesis of proteins, the main structures that form the cell.

The term genome was first coined by Hans Winkler in 1920, for reasons we do not know. Lederberg and McCray in 2001 proposed the hypothesis that Winkler combined the word gene with that of chromosome or with the Greek suffix "ome" meaning unity (Stencel & Crespi, 2013). Today, by genome we mean haploid DNA (i.e., the chromosomal makeup of sex cells) that resides in a cellular compartment called the nucleus and is packaged into self-contained structures called chromosomes. In higher organisms, half of the chromosomes come from the mother and the other half from the father.
In humans, for example, there are 46 chromosomes in the nucleus of each cell, of which 23 come from the father and 23 from the mother (Fig. 2.1). Each cell in an organism, therefore, has the same DNA content. However, in each tissue (e.g. muscle) there is a selective activation of a group of genes, so that a gene can be active in one cell type but not in another. The mechanisms underlying the activation and deactivation of genomic elements, which go by the name of gene regulation, are very complex and much remains to be clarified.

Going back to the history of the life sciences, there are some fundamental stages that we must remember in order to understand how the structure of the genome came to be known. In 1869 Friedrich Miescher discovered that the nucleus contained a substance rich in phosphorus that could not be a protein because of its size, and called it nuclein (Gregory, 2002). In 1879 Walther Flemming was able to stain it and called it chromatin (Gregory, 2002). In 1888 Wilhelm Waldeyer was able to stain some banded nuclear structures and called them chromosomes (Gregory, 2002). In 1915 Thomas Hunt Morgan published a book (Morgan, 1905) in which he demonstrated the presence of genes within chromosomes through a study of crosses in the species Drosophila melanogaster (fruit fly). This work earned him the Nobel Prize for medicine in 1933. Other genetic studies, carried out by many other scientists, followed until, in 1930, the name nuclein became deoxyribonucleic acid (DNA). In the following years Erwin Chargaff (Chargaff, 1950) discovered that the nitrogenous bases that form DNA occur in fixed proportions: number of Adenines (A) = number of Thymines (T) and number of Cytosines (C) = number of Guanines (G). This regularity goes by the name of Chargaff's first law. The famous biochemist did not understand that it was caused by the double helix structure of DNA (Fig. 2.2).
This molecule is built like a train track, in which the rails are made of sugar-phosphate, while the inner bars are the nitrogenous bases that pair up with each other in twos (A with T and C with G). Because he did not make this connection, Chargaff did not receive the Nobel Prize. Meanwhile, Oswald Avery, Maclyn McCarty and Colin MacLeod in 1944, and Alfred Hershey and Martha Chase in 1952, demonstrated that it was DNA that transmitted genetic information (Gregory, 2002). It remained to be understood what structure DNA had, which, in higher organisms, was packaged with various proteins and formed nuclear chromosomes. Proteins are structures that result from the translation of RNA (which is derived from the transcription of DNA) and are made up of amino acids. They are the constituent units of the cell. Today the study of DNA and RNA (Genomics), using Next Generation Sequencing (NGS) technologies, is one of the most promising fields of science, both in the biomedical field and from the point of view of biotechnology and food science.

DNA

The structure of DNA was worked out by three young researchers: James Watson, Francis Crick and Maurice Wilkins. In 1953, they published their results (Watson & Crick, 1953; Wilkins, 1953; Franklin, 1953) in the journal "Nature", changing the course of the history of biology and medicine. Their contribution was supported by the data obtained by the researcher Rosalind Franklin, who had analyzed DNA with X-rays (Franklin, 1953). Unfortunately, Rosalind Franklin did not receive the Nobel Prize, unlike Watson, Crick and Wilkins (Watson & Crick, 1953; Wilkins, 1953). She died of cancer in 1958 at the age of 37 (Maddox, 2003), a few years before the Nobel Prize was awarded (1962).

Various types of DNA structures

There are various types of DNA molecules classified according to their structure (Fig.
2.4):
B-DNA is the most typical and common right-handed form;
A-DNA has a right-handed spiral structure and is present in non-physiological conditions, such as in situations of dehydration;
Z-DNA has a left-handed spiral structure that is formed when chemical modifications such as methylation occur. The presence of the Z-form has been demonstrated almost exclusively in vitro.
Other forms obtained in vitro are C-, D-, E-, H-, L-, and P-type DNA.

RNA

RNA (ribonucleic acid) is a molecule very similar to DNA, but it differs from it because it is single stranded and thymine (T) is replaced by uracil (U). RNA is synthesized from DNA (transcription). RNA then serves as a template for protein synthesis (translation). RNA can be of various types: mRNA or messenger RNA, rRNA or ribosomal RNA, and tRNA or transfer RNA. The mRNA is the RNA that acts as a template for all proteins, the rRNA is the RNA that makes up the ribosomes, while the tRNA is the RNA that carries the amino acids used for the synthesis of proteins. The rRNA is synthesized by the RNA polymerase I protein, the mRNA by the RNA polymerase II protein, and the tRNA by the RNA polymerase III protein. We will discuss RNA in detail in the chapter on "Molecular Biology".

If we analyze the size of mammalian genomes we can see that, in general, they do not vary significantly between species. In amphibians and fish, however, the situation is different. The nuclear DNA of the frog Limnodynastes ornatus, for example, is 120 times smaller than that of the salamander Necturus lewisii, while the African fish Protopterus aethiopicus has a genome that is 350 times larger than that of the puffer fish Tetraodon nigroviridis. There are other such examples even within the same taxonomic group. A very effective mechanism by which a species amplifies its chromosomal complement is polyploidy. In practice, the entire genome can be duplicated, triplicated, etc.
It can even happen that the genome of a different species is incorporated into a species' own genome (this occurs in plants). For example, two diploid wild plant species, Aegilops speltoides and Triticum urartu, formed in the course of evolution a tetraploid hybrid (Triticum durum), a subspecies of which later joined with another wild plant (Aegilops tauschii), thus becoming hexaploid soft wheat (six copies of each chromosome). Among animals, polyploidy is common only in fish and amphibians, while it is rare in other taxonomic groups. Other biological mechanisms that can change the size of a genome are segmental duplications of repeated sequences, gene duplications, and LTR retrotransposon activity. LTR retrotransposons are genomic elements that are able to copy themselves and move within chromosomes.

The first studies on the amount of DNA present in a cell date back to the early '50s. Hewson Swift (Gregory, 2002), in an attempt to show that all cells had the same amount of DNA, found that the amount of genetic material in somatic cells was twice as much as in sex cells. The value he called C, which expressed the amount of haploid DNA (DNA from sex cells), is now used to describe the size of a particular organism's genome. For example, the value of C in humans is 3.3, meaning that the haploid human genome contains approximately 3.3 billion base pairs (bp). In 1951, when the structure of DNA was not yet understood, the researchers Alfred Mirsky and Hans Ris (Gregory, 2002) discovered that something strange had happened during evolution. They observed that the amount of DNA present in the nucleus of the cells of the salamander (an amphibian) was 70 times greater than that of much more evolved organisms. How was this possible? Did the salamander have 70 times as many genes as humans? Wasn't the biological complexity of an organism therefore proportional to the number of genes?
The paradox was not understood until the 1990s, when the sequencing of the genomes of organisms began. In fact, the situation was only better understood after 2001, when the first sequence of the human genome was published. The percentage of a genome occupied by protein-coding genes is, in fact, very low (about 1.5%), and more or less all mammals, if grouped by taxa (taxonomic units), have the same number of genes (20,000-30,000). All the rest of the genome probably has regulatory activity and, although most of the DNA is transcribed into RNA (non-coding RNA), only a small percentage of it gives rise to protein synthesis. In 2013, Ganqiang Liu, John S. Mattick, and Ryan J. Taft (Liu, 2013) showed that the ratio of non-coding RNA to total genome size correlates positively with the biological complexity of an organism or taxonomic group (Fig. 2.5). Thus, in the evolutionary ladder built on this ratio we see archaea and bacteria (both prokaryotes) appear first, then unicellular eukaryotes, plants, reptiles, birds and mammals. C-value data for each species can be downloaded from the "Animal Genome Size Database" and "Plant DNA C-values Database" websites. The size of a genome, for historical reasons, has always been measured in eukaryotic cells (cells that have various organelles and a nucleus that contains DNA) as mass in picograms (pg). One reason some species increase the size of their genome is probably to resist mutations when the environment becomes mutagenic. This theory was first hypothesized by Patrushev and Minkevich in 2009 (Gregory, 2002). Richard Dawkins (Dawkins, 2003) hypothesized at the end of the '70s, however, that genes are passed from one generation to another only in order to continue to be replicated, and not necessarily to make the organism that hosts them survive (selfish gene theory). This biological process would lead, therefore, to an increase in the size of the genome.
One picogram is equal to one trillionth of a gram (1 pg = 10^-12 g) and corresponds, approximately, to one billion bases. Doležel et al. in 2003 (Gregory, 2002) calculated that the number of bases is equal to the mass in picograms multiplied by 0.978 × 10^9.

Mechanisms that increase or decrease the size of a genome

Only recently has it been possible to understand the mechanisms that cause DNA to accumulate or decrease within the genome. The first DNA genome was sequenced in 1977 (bacteriophage ΦX174) with a technique called Sanger sequencing (named after its inventor, a double Nobel Prize winner), while the first prokaryotic genome was sequenced in 1995 (Haemophilus influenzae). In 1996 it was the turn of the eukaryote Saccharomyces cerevisiae (yeast), and in 1998 the first multicellular organism, Caenorhabditis elegans (a nematode), was sequenced. Since 2001, when the human genome was sequenced, thousands of bacterial and viral genomes and a few hundred eukaryotic genomes, including those of mammals, have been sequenced.

Returning to the mechanisms that allow the C value to vary, the first chromosomal elements discovered to be capable of changing the size of a DNA molecule are mobile sequences present in eukaryotic genomes, called transposons or transposable elements (TEs). Transposons give chromosomes a certain plasticity. Their presence in prokaryotes, however, is not fully elucidated. As we have already said, in 1976 Richard Dawkins (Dawkins, 2003) hypothesized that the genome contained parasitic DNA that was transmitted from generation to generation, from species to species, increasing the size of the chromosomes. In 1980 Doolittle and Sapienza and, later, Orgel and Crick (Gregory, 2002) developed a theory claiming that duplication events were mainly responsible for genomic variations. This theory hypothesized that all genomes tend to increase their number of base pairs as long as selective pressure allows them to do so.
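The conversion between DNA mass and number of bases quoted earlier in this section (number of bases = mass in picograms × 0.978 × 10^9) can be written as a short Python sketch. The function names and the sample values are illustrative assumptions and do not come from the text; only the conversion factor does.

```python
# Conversion factor quoted in the text: bases = picograms x 0.978 x 10^9.
BASES_PER_PICOGRAM = 0.978e9

def picograms_to_bases(mass_pg: float) -> float:
    """Convert a DNA mass in picograms into an approximate number of bases."""
    return mass_pg * BASES_PER_PICOGRAM

def bases_to_picograms(n_bases: float) -> float:
    """Convert a number of bases into an approximate DNA mass in picograms."""
    return n_bases / BASES_PER_PICOGRAM

# Illustrative use: a haploid genome of 3.3 billion bases (the human value
# given in the text) expressed as a mass in picograms.
human_mass_pg = bases_to_picograms(3.3e9)
print(f"{human_mass_pg:.2f} pg")
```

Keeping the factor in a single named constant makes it easy to convert C-values downloaded in picograms from the genome size databases into base counts, and back.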
We must not forget the pseudogenes and their role in the variation of the C value. Until recently, they were considered junk DNA. This term was coined by Ohno in 1972, who envisioned pseudogenes as the remnants of ancient genes (garbage DNA) encoding proteins no longer used by the eukaryotic cell. Unfortunately, for a long time, the term "junk or garbage DNA" was associated with all the non-coding regions of the genome, in an attempt to downplay their role. Shabalina and Spiridonov in 2004, however, discovered that at least 45% of DNA is transcribed into RNA even though it does not code for proteins. We now know that these sequences have regulatory activity (microRNA or miRNA, piRNA, long non-coding RNA, circRNA, etc.). In 2012, the ENCODE consortium stated that at least 80% of human DNA is transcribed into RNA and has functional activity in the cell.

Introns also contribute to variations in genome size. Introns are those DNA sequences that, after transcription (RNA formation), are removed from the mRNA, the final product that acts as a template for proteins. In humans, introns make up 26% of the total genome (see the website of the International Human Genome Sequencing Consortium, 2001).

There are also mechanisms at the level of whole chromosomes that can increase or decrease the size of a genome. In particular, the duplication or loss of individual chromosomes is called aneuploidy and is often associated with detrimental phenotypic effects. In addition, segments of chromosomes may join with others and, during subsequent cell division, certain areas may be included in or deleted from the genome. Fusion of entire chromosomes is also possible, but this does not in itself lead to an increase in the genome unless duplication events or various chromosome rearrangements occur. A classic example is the fusion of two ancestral chromosomes into human chromosome 2 that occurred during primate evolution.
This hypothesis arose from the observation that the chimpanzee genome has 48 chromosomes (as do those of many primates) compared to 46 in humans (Yunis & Prakash, 1982). Other differences between the human and chimpanzee genomes include the presence of more transposable elements in humans, although the chimpanzee has a larger genome (C=3.75).

During the division of sex cells and of somatic (asexual) cells there can be exchanges of DNA between homologous chromosomes (e.g. the maternal chromosome 2 with the paternal chromosome 2, etc.). This mechanism, called "crossing-over" or "recombination", can be unequal between chromosomes, leading to an increase or decrease in the size of the genome. It may happen, in fact, that the "crossing-over" involves the repeated terminal regions of some transposons, called LTRs, which are usually deleted. DNA repair events also usually lead to losses of genomic material.

GC content in genomes

Vinogradov and colleagues studied the genomes of 154 fish, amphibians, reptiles, birds and mammals in the late 1990s and found that the total content of Guanines and Cytosines correlated positively with total genome size (Gregory, 2002). The term GC refers to the percentage of Guanines and Cytosines. In reptiles and amphibians, the GC content is higher than in mammals, while in bony fish the correlation with genome size is even negative. In bacterial genomes the percentage of GC is very variable but is on average 46%. In the human genome, however, it is 42%.

Chargaff's second law, Szybalski's and Sinclair's rule

As we have seen, Chargaff's first law (Chargaff, 1950) reflects the fact that Adenines pair up with Thymines and Cytosines pair up with Guanines on the opposite strand of a DNA molecule. Thus, the number of Adenines on one strand of DNA is equal to the number of Thymines on the opposite strand, and the same is true for Cytosines and Guanines.
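Both the GC percentage and Chargaff's first law lend themselves to a small worked example. The following Python sketch builds the opposite strand of a short sequence through the A-T and C-G pairing rules and counts the bases in the resulting double helix; the toy sequence and the function names are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Base pairing rules of the double helix: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand: str) -> str:
    """Return the opposite strand, read in its own 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))

def gc_content(strand: str) -> float:
    """Percentage of Guanines and Cytosines in a sequence."""
    s = strand.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)

def duplex_base_counts(strand: str) -> Counter:
    """Count the bases over both strands of the double helix."""
    return Counter(strand.upper()) + Counter(reverse_complement(strand))

strand = "ATGCGCGTTAAC"  # hypothetical toy sequence, not real genomic data
counts = duplex_base_counts(strand)
# In the double helix the counts satisfy A = T and C = G (first law).
print(counts["A"] == counts["T"], counts["C"] == counts["G"])
print(gc_content(strand))
```

On a single strand the equalities need not hold exactly; they become exact only when both strands are counted, which is why the first law follows directly from base pairing.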
Chargaff, however, discovered that even on single strands the number of Adenines is almost identical to the number of Thymines (A ≈ T) and the number of Cytosines is almost equal to the number of Guanines (C ≈ G). This correspondence is called Chargaff's second rule (Rudner_a, 1968; Rudner_b, 1968; Karkas, 1968). Recently (Fariselli & Taccioli, 2020), some researchers have suggested that this nucleotide composition is the most probable one (maximum entropy) and the one that gives DNA the greatest structural stability, since it is characterized by lower energy (Gibbs free energy). Their mathematical model accurately predicts the observations made on the genomes of all living organisms. Chargaff's second law holds both for single chromosomes and for whole genomes, but it is not valid for animal mitochondria or for some single-stranded DNA or RNA viruses.

There are other rules that could derive from, or be related in some way to, Chargaff's second law. The chemist Szybalski (Szybalski, 1966), for example, noticed in 1966 that in viruses the coding regions of the genome contained, on the single strand, more purines than pyrimidines, and this biological rule seems to be valid for all living organisms. However, one must consider the fact that genomic sequences have intrinsic mathematical properties that do not depend on evolutionary or mutational processes. For example, the total number of dinucleotides in a circular genomic sequence is equal to the sum of the counts of the individual types of dinucleotides (of which there are 16); this is independent of any evolutionary event and is intrinsic to the mathematical rules governing circular sequences. These rules, and many others, were expounded in 2015 by Sinclair (Sinclair, 2015), who further demonstrated how in circular genomes the frequencies of di-, tri- and tetra-nucleotides can be calculated as simple sums and differences.
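The purely mathematical nature of these properties can be checked on any sequence. The Python sketch below counts the dinucleotides of a circular sequence and verifies two identities that follow from circularity alone: the dinucleotide counts sum to the sequence length, and the number of dinucleotides starting with a given base equals the number ending with it, which in turn equals the count of that base. The second identity is offered as one simple example of such a sum rule, not as a restatement of Sinclair's derivations; the toy sequence and function names are hypothetical.

```python
from collections import Counter

def circular_dinucleotide_counts(seq: str) -> Counter:
    """Count overlapping dinucleotides, treating the sequence as circular."""
    s = seq.upper()
    n = len(s)
    # The last base is paired with the first one, closing the circle.
    return Counter(s[i] + s[(i + 1) % n] for i in range(n))

def check_circular_identities(seq: str) -> bool:
    """Verify sum rules that hold for every circular sequence."""
    s = seq.upper()
    di = circular_dinucleotide_counts(s)
    # A circular sequence of n bases contains exactly n dinucleotides.
    if sum(di.values()) != len(s):
        return False
    for base in "ACGT":
        starting = sum(c for d, c in di.items() if d[0] == base)
        ending = sum(c for d, c in di.items() if d[1] == base)
        # Each occurrence of a base opens one dinucleotide and closes another.
        if not (starting == ending == s.count(base)):
            return False
    return True

genome = "ATGCGCGTTAAC"  # hypothetical toy circular sequence
print(check_circular_identities(genome))
```

Because these identities hold for random sequences as well as for real genomes, they cannot be attributed to selection or mutation, which is the point made in the text.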
The human genome

The human genome is undoubtedly the most studied genome and is very similar to those of most mammals in terms of its composition. In Figure 33 we can see that the percentage of DNA coding for proteins is very low (about 1.5%) while 45% is composed of transposons (LINEs, SINEs, LTR transposons and DNA transposons). Transposons are mobile elements of DNA and are a possible cause of disease but also an engine of evolution. Recently, transposons have been associated with special functions in the central nervous system and in embryonic development (Robertson, 2002; Egger, 2004).

Figure 17. Composition of the Human Genome

Molecular Biology

Replication in Prokaryotes

Replication in prokaryotes is the mechanism by which a bacterium or archaeon duplicates its DNA before the organism divides and creates two identical daughter cells. Most of the studies on prokaryotic replication have been carried out on the bacterium E. coli (Escherichia coli), which is the model organism for the study of prokaryotes in basic biological research. In general, in E. coli the initiation of DNA replication occurs at a specific genomic region called "oriC". OriC is a DNA sequence of approximately 245 base pairs that contains two repeated sequences of 9 bp and 13 bp, respectively. The first step in initiating replication is the binding of the DnaA-ATP protein to oriC. This protein is active only in the presence of ATP and only when it is associated with the inner membrane of the bacterium. The second step is the binding of the DnaB-DnaC protein complex. DnaB is a protein called a helicase because it unwinds the DNA double helix, while the DnaC protein is a chaperonin whose function is to inhibit the helicase activity of DnaB until it is ready to be activated. Note that DnaB can only perform its function if DnaA has denatured the DNA at the origin. Denaturation is also assisted by a variant of topoisomerase II called DNA gyrase.
Unlike prokaryotic class I topoisomerases and eukaryotic class II topoisomerases, which remove supercoils by relaxing and stabilizing the DNA molecule, DNA gyrase acts as a pivot pin, allowing the DNA to rotate on its own axis. In the absence of DNA gyrase, the helicase would fail to unwind the DNA, which would become resistant to its action. The stretch of double helix opened at the origin in E. coli is generally less than 60 bp long, and a protein called HU appears to be involved in bubble formation, although in "in-vitro" experiments its activity is not absolutely required. The DnaB protein then activates a primase (DnaG) that synthesizes small RNA fragments that will serve as primers for the synthesis of the new strands. ATP is required for the unwinding of DNA by DnaB and also for the functioning of primases. In addition, ATP is used to assemble on the DNA the subcomplexes linked to DNA polymerase III, which is the enzyme that synthesizes the new strands. When these polymerases, which advance in opposite directions, meet at a termination point, the two new circular DNAs are separated by specific topoisomerase proteins.

In detail, the steps of replication in prokaryotes are as follows:
A protein called DnaA is positioned on prokaryotic DNA in an area called oriC and opens the two DNA strands, preparing the arrival of another protein called DnaB, often associated with a chaperonin (DnaC). In order to function, the DnaB protein must be released by this chaperonin, which normally inhibits its activation. The release of the DnaC protein promotes the activation of the DnaB protein and the consequent start of replication;
The DnaB protein, which belongs to the helicase family, binds to the DNA and starts to move to the right. At the same time another DnaB binds to the same point of the DNA but starts to move to the left. In this way, what is called a replication bubble begins to form and it gets larger and larger.
These replication mechanisms occur in each strand;
Proteins called SSBs bind to the two DNA strands and prevent them from rejoining (they are only released at the end of replication);
At this point the DNA polymerase III protein binds to each DNA strand and starts to synthesize a new strand on the template of the existing one. Two polymerases are bound on the right side of the bubble (Tau protein complex) and two on the left side. Replication is called semiconservative because it always leads to the formation of a new DNA helix from the template of an old strand. There is, however, one problem. The polymerases only move in the 5'->3' direction of the new strand. In fact, each DNA strand has a direction. Each end is called either 3' or 5'. This nomenclature is derived from the orientation of deoxyribose within the DNA molecule. Because the polymerases move in one direction only, one polymerase will smoothly follow the opening bubble, while the other will move in the direction opposite to the opening of the bubble. For this reason, the DNA polymerase III that moves in the direction opposite to the bubble will form a hooked loop so that it too moves in the 5'->3' direction, but its movement will be slower and more discontinuous. It will therefore form DNA fragments (Okazaki fragments) that must be joined together by a protein called ligase. It should be noted that, to start, the polymerases (both those that move to the right and those that move to the left) need a fragment of RNA (primer) that is synthesized by a protein called DnaG primase (Figure 17).
The replication bubble continues to move, both left and right, with the two polymerases following the bubble. The strand not formed by the Okazaki fragments is called leading while the other is called lagging (Figure 18). A common mistake among students is thinking that one strand is called leading while the other is lagging. This is not correct.
Each strand of DNA is either leading or lagging depending on whether it is to the right or to the left of the start of replication. The leading strand is located in the area where the polymerase follows the movement of the replication bubble, while the lagging strand is located in the area where the polymerase moves in the direction opposite to the movement of the bubble. For this reason a loop is formed in this chromosomal area and, for the same reason, the Okazaki fragments are formed. The termination of replication is depicted in Figure 19.

Figure 18. Replication in prokaryotes. The hooked loop in the lagging strand is not shown.

Figure 19. Elongation phase during replication in prokaryotes.

During replication, the two ends of the bubble advance in opposite directions until they meet at the point of termination. At this point, some topoisomerases intervene to separate the circular genomes, which are concatenated at the moment of replication termination. In E. coli it is Topoisomerase IV that performs this function, through a "cut and sew" process, thus freeing the two newly duplicated circular DNAs.

Figure 20. Replication termination step in prokaryotes. The concatenated circular genomes require the action of some topoisomerases to separate them. In E. coli, Topoisomerase IV plays the most important role in this process.

Often in bacteria, several different genes are transcribed together into a single RNA. The region of DNA that acts as a template for such an RNA is called an operon, while the regulatory region upstream of the operon is called an operator.

Replication in Eukaryotes

Replication in eukaryotes is similar to that in prokaryotes, but many more different enzymes are involved, some of which are described in Figure 20.
Here is a detailed description of the steps of replication in eukaryotes (Figure 21):
The origins of replication are located in many different regions of eukaryotic DNA, which is not circular but subdivided into chromosomes; therefore, numerous replication bubbles are observed on each chromosome. These origin regions are called "ORIs". The polymerases move to the right and left on each strand until they meet another bubble and stop. The "gaps" that are formed are then joined by ligase proteins;
The ORC protein binds to the ORI sequence. Some experiments have shown that the ORC protein can remain bound to the ORI throughout the replication phase;
The CDC6 protein binds to the ORC protein;
CDT1 proteins are then bound to this CDC6+ORC complex, stabilizing the entire complex;
MCM helicases then bind to the ORC+CDC6+CDT1 complex;
The CDK and DDK proteins then bind and phosphorylate the entire complex so that only the MCM proteins remain attached to the DNA;
GINS and CDC45 proteins then bind to the MCM proteins, activating them and advancing the replication bubble.

Figure 21. Some proteins involved in prokaryotic, eukaryotic and viral replication

Figure 22. Replication initiation phase in eukaryotes. Boxes read from top to bottom, starting from the left

Again, Okazaki fragments are formed on the lagging strand while the leading strand continues to be synthesized in a fast, linear fashion. The SSB/RPA proteins then prevent the re-pairing of the DNA opened by the MCM proteins. Specifically:
DNA polymerase Alpha encounters the primer and begins synthesizing the leading strand. Subsequently, it detaches and DNA polymerase Epsilon continues the synthesis started by the Alpha protein. Another protein called PCNA binds to polymerase Epsilon and prevents the detachment of the polymerase from the DNA.
The latter is in some cases also used by prokaryotes;
On the lagging strand (the one that goes in the direction opposite to the replication bubble and on which the Okazaki fragments are formed) replication begins through the intervention of polymerase Alpha, which recognizes the RNA primer and begins replication. At this point the Alpha protein detaches and the process is continued by DNA polymerase Delta, assisted by the PCNA protein, which allows the polymerase to remain attached to the DNA. Again, the Okazaki fragments are joined by a ligase (Figure 22). As hypothesized for E. coli, in this case too it is possible that a hooked loop is formed to allow the polymerases to move in the 5'->3' direction of the new strand and in the same direction as the replication bubble;

Figure 23. Replication elongation phase in eukaryotes

When the polymerase protein arrives at the end of a chromosome, a protein called telomerase, which has reverse transcriptase activity (the ability to synthesize DNA from an RNA fragment), elongates the template of the lagging strand with a few repeated sequences, using its own RNA as a template. This process is completed by polymerase Alpha. In this way the whole chromosome is replicated. However, it should be noted that during aging telomerases lose some of this enzymatic capacity and telomeres shorten replication after replication. Telomerases (Figure 23) are particularly active in cancer cells and in stem cells. A detailed video can be viewed at: https://www.youtube.com/watch?v=AJNoTmWsE0s

Figure 24. Retrotranscriptional activity of telomerases in eukaryotes

Transcription in Prokaryotes

In molecular biology, the term transcription means the synthesis of RNA from a DNA template. This is true for both prokaryotes and eukaryotes. The enzyme that synthesizes RNA is called RNA polymerase and it also moves in the 5'->3' direction of the newly synthesized RNA strand.
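The relationship between the DNA template and the RNA produced from it can be illustrated with a minimal Python sketch. It assumes that the template strand is written in its own 5'->3' direction, so that the RNA, which is synthesized 5'->3' and is antiparallel to the template, is obtained by reading the template backwards and pairing each base (with U in place of T). The function name and the six-base example are hypothetical.

```python
# Pairing used during transcription: the RNA carries U where DNA would carry T.
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_5_to_3: str) -> str:
    """Return the RNA (written 5'->3') synthesized from a DNA template strand.

    The template is given in its own 5'->3' direction. Because the RNA is
    antiparallel to it, the template is read from its 3' end.
    """
    return "".join(DNA_TO_RNA_PAIR[base] for base in reversed(template_5_to_3.upper()))

# Hypothetical example: for the coding strand 5'-ATGGCC-3' the template
# strand is 5'-GGCCAT-3'.
print(transcribe("GGCCAT"))
```

The resulting RNA has the same sequence as the non-template (coding) strand, except that every T is replaced by U.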
The initiation of transcription occurs through the following steps (Figure 25):

RNA polymerase, along with a protein called Sigma, binds to a DNA sequence that usually has the bases TTGACA, in a zone called "-35" because it is located 35 bases before (in scientific language we say "upstream" of) a point called the start of transcription. Usually, in a zone 10 bases upstream of the start (called "-10"), there is a region called the TATA box (in prokaryotes also known as the Pribnow box) that is characterized by the consensus sequence TATAAT. The TATA box is very important as an additional binding site for RNA polymerase;
The Sigma protein detaches from the DNA, leaving space for RNA polymerase which, covering about 60-80 bp of DNA, bends to enter the area between the "-35" region and the start;
RNA polymerase begins synthesis from the specific start zone identified on the DNA molecule.

Figure 25. Transcription in Prokaryotes. Boxes read from top to bottom, starting from the left.

Termination in prokaryotes is said to be intrinsic when the RNA being synthesized reaches the transcription termination signal sequence and forms a loop (hairpin). At this point the RNA polymerase stops. Termination is instead said to be Rho-dependent when a helicase protein, called Rho, recognizes the termination sequence and stops transcription.

Transcription in Eukaryotes

Transcription in eukaryotes is a more complicated process than the one already seen in prokaryotes. In eukaryotes, in fact, transcription takes place in the nucleus, and the messenger RNA (mRNA) is also processed through a molecular process called splicing. In addition, transcription in eukaryotes involves a much larger number of proteins than in prokaryotes. For each gene, then, a single precursor mRNA is usually formed that, through alternative splicing, can be translated into different proteins (see next paragraphs).
The initiation of transcription in Eukaryotes occurs through the following steps:

The transcription factor TFIID binds to the TATA box region, which is located near the transcription start site called Ini. The DNA region that comprises three important regulatory zones (the Ini, the TATA box and another zone even further upstream called UPS), located around and upstream of the transcription start point, is called the promoter;
TBP protein binds to TFIID, while TFIIA binds to TBP;
The TFIIB protein binds to the start site on the DNA called Ini;
RNA polymerase II, to which the TFIIF protein is bound, joins the TFIIB protein that is still localized at the Ini region;
A protein complex called STF, which is made up of transcription factors specific for that particular gene, binds to RNA polymerase II;
The TFIIH protein activates RNA polymerase II, which initiates transcription;
A cap is then added to the 5' end of the RNA. It is composed of a molecule called 7-methylguanosine, bound to the mRNA by a 5'-5' triphosphate bridge (Figure 26). In addition, approximately 200-2000 adenine nucleotides (poly-A) are added to the 3' terminal portion of the mRNA. The function of this adenine tail is to stabilize the mRNA.

Figure 26. Transcription in Eukaryotes. Boxes read from top to bottom, starting from the left.

During RNA strand synthesis, RNA polymerase II reaches a termination sequence and stops. Termination of transcription in eukaryotes is assisted by transcription factors, which are proteins that bind to RNA polymerase II and then to the DNA in order to regulate transcription.

Splicing

RNA, as we saw in the first chapter, can be messenger RNA (mRNA), ribosomal RNA (rRNA) or transfer RNA (tRNA). In the case of mRNA, after the processing that leads to its maturation (cap + poly-A), there is also another phase called splicing.

Figure 27.
Structure of eukaryotic mRNA and spliceosome

During splicing, some regions of the RNA are eliminated while the remaining regions are joined together to form a shorter RNA (the mature mRNA) that will then be transported into the cytoplasm to be translated into protein. The regions of RNA that are eliminated are called introns, those that remain are called exons (Figure 27). Exons are the regions within the mRNA that will be translated into proteins. An immature eukaryotic mRNA is formed by exons and by introns that will be excised later. At the ends we have the cap (5') and the poly-A tail (3'). The introns contain sequences such as "GU", the "A branch site", the "pyrimidine-rich region" and the "AG region". In addition, the protein complex responsible for splicing is called the spliceosome and consists of the subunits U1, U2, U4, U5 and U6. The U3 subunit does not take part in the splicing process and has, instead, a regulatory function in the synthesis of ribosomal RNA. The stages of splicing are:

The U1 and U2 subunits bind to the RNA in the "GU" region and fold it;
The U4, U5 and U6 subunits bind to the "AG" region;
The interaction between the U1+U2 and U4+U5+U6 subunits causes the end of the intron where the "GU" zone is located to be cut and joined to the zone called the "A branch site";
A cut is also made at the other end, where the "AG" region is located;
Introns are excised while exons are joined (Figure 28).

Figure 28. Stages of splicing. Boxes read from top to bottom, starting from the left

A general figure describing transcription and translation in prokaryotes and eukaryotes can be viewed in Figure 29.

Figure 29. Difference between transcription and translation in prokaryotes and eukaryotes

The "GU" and "AG" regions just studied also offer another possibility. The exons can be used in different combinations in order to create mRNAs of different types.
In this way, the same eukaryotic gene can give rise to dozens of mRNA molecules that can be translated into as many proteins. In the human genome, in fact, there are about 20,000 genes but more than 2,000,000 different proteins.

Translation in prokaryotes

In molecular biology, translation means protein synthesis on an RNA template and occurs in the cytoplasm. The protein complex that translates mRNA into protein is the ribosome. The ribosome always moves in the 5'->3' direction, as do the enzymes DNA polymerase and RNA polymerase. The ribosome consists of 2 subunits: a large subunit, called 50S, and a small subunit, called 30S. The complete ribosome is called 70S, and not 80S, because sedimentation coefficients (sedimentation is a laboratory technique) are not additive. The 30S subunit of the ribosome contains 21 "r" proteins and one rRNA molecule (16S), while the 50S subunit contains two types of rRNA (23S and 5S) and 31 "r" proteins. Both subunits, then, contribute to 3 sites: E, P and A. Molecules called tRNAs are positioned on these sites. The tRNA molecules are RNAs that carry the amino acids, which are the fundamental units of proteins. Almost all biological structures within cells are formed by proteins, which have structural or regulatory functions (enzymes). In addition, multiple proteins can form protein complexes such as, for example, the ribosome.

One important thing to keep in mind is that mRNA molecules are read in triplets by the ribosome. That is, each nucleotide triplet corresponds to a single amino acid (although the reverse is not true), which is carried by the tRNA. The correspondence between nucleotide triplets and amino acids is called the genetic code, which is almost universal. In prokaryotes, the biological process of splicing does not exist because there are no introns in the genome of bacteria or archaebacteria.
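The triplet reading described above can be sketched in a few lines of Python. The codon table below is the standard genetic code written with one-letter amino acid symbols ("*" marks a stop codon); the function name `translate` and the example mRNA are illustrative assumptions. As a simplification, translation is assumed to start at the first "AUG" found, and the letter M is used for the first amino acid whether it is Methionine or formyl-Methionine.

```python
from itertools import product

# Standard genetic code, codons enumerated in the order UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA (5'->3') from the first AUG up to the first stop codon.

    Hypothetical helper for illustration only.
    """
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA: translation stops
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("GGAUGCCGUUUUAAGC"))  # MPF
    # One amino acid can have several codons, but each codon has one amino acid:
    print(sorted(c for c, aa in CODON_TABLE.items() if aa == "L"))
```

The second `print` lists the six codons for Leucine, which illustrates why the correspondence works in one direction only: a triplet identifies a single amino acid, but an amino acid does not identify a single triplet.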
In general, the tRNA carries the amino acid on one end, while on the other it displays the anticodon, that is, the triplet that recognizes the codon (the triplet on the mRNA). The first tRNA that arrives always carries the amino acid formyl-Methionine (fMet) and binds to the "AUG" codon (the translation start sequence) on the mRNA. tRNAs carrying amino acids are usually referred to as "aa-tRNAs". This first tRNA carrying fMet positions itself on the P site of the 30S small subunit of the ribosome. When the second tRNA arrives with its amino acid, it is positioned on the A site of the small subunit. At this point, the amino acid of the first tRNA (on the P site) binds to the one on the A site, while the first tRNA itself remains on the P pocket. Only the first amino acid shifts: it moves to the right, above the second amino acid. At this point, the ribosome moves forward (5'->3' direction), so that the first tRNA is now positioned on the E pocket, the second tRNA with the two amino acids is now on the P pocket, and the A pocket is empty. A third tRNA with its own amino acid positions itself on the empty A pocket, and the cycle continues until the ribosome encounters one of the three stop codons (UAA, UAG or UGA) and the translation process stops. The amino acid chain is released from the P pocket and the protein is ready, subject to targeting to the appropriate cellular compartments. Figure 30 shows a graphical summary of how translation occurs in prokaryotes.

Figure 30. Translation in prokaryotes in detail. The boxes are read from top to bottom, starting from the left, by turning the page horizontally

There are three initiation factors in prokaryotic translation:

IF1: This protein binds to the A site and does not allow amino acids to bind before fMet arrives at the P site;
IF2: Its sole function is to carry the first amino acid fMet to the P site.
It does not carry any other amino acids;
IF3: This protein has many functions, including stabilization of the 30S subunit and proper positioning of the mRNA on the E, A and P sites.

These factors are arranged on the mRNA-bound 30S subunit, which is positioned with the P pocket on the start codon (AUG, GUG or UUG). A purine-rich sequence called Shine-Dalgarno is present about 10 bases upstream of the start codon and binds almost perfectly to a complementary region of the 16S rRNA of the 30S subunit. Specifically:

IF1 and IF3 are located on the 30S subunit at the A site;
IF2 transports fMet-tRNA to the P site of the 30S subunit;
The aa-tRNA carrying fMet contains an anticodon that is complementary to the start codon (if the start codon is AUG, the anticodon is UAC, read in the 3'->5' direction). By anticodon, we mean the sequence on the aa-tRNA that is complementary to a given triplet on the mRNA;
After the arrival of fMet, IF1, IF2 and IF3 detach from the 30S ribosomal subunit. The 50S subunit joins the 30S subunit;
The EF-Tu protein transports the second aa-tRNA to the A pocket;
The 50S subunit contains an rRNA called 23S that has peptidyl transferase activity (Figure 31). That is, it is able to bind the amino acid of the aa-tRNA on the P pocket to the one that has just arrived on the A pocket;

Figure 31. Peptidyl transferase activity. Peptide bond formation occurs through a reaction between the polypeptide of the peptidyl-tRNA at the P site and the amino acid of the aminoacyl-tRNA at the A site. This activity is carried out by the 50S subunit in prokaryotes and by the 60S subunit in eukaryotes. Thus, the first amino acid always remains further "up" than the newly loaded amino acids.
At this point the ribosome shifts (translocates) by one triplet in the 5'->3' direction thanks to the EF-G protein, so that the unloaded tRNA moves to the E pocket and then exits, the peptidyl-tRNA (loaded tRNA) remains in the P pocket, and the A pocket is left empty, waiting for a new amino acid transported by EF-Tu;
The cycle continues until the ribosome encounters a stop codon (UAG, UAA or UGA);
The release factors RF1 and RF2 then intervene to stop translation, while a third factor, RRF, releases the 50S subunit from the 30S subunit.

Translation in Eukaryotes

Translation in eukaryotes is very similar to that of prokaryotes, but the ribosomes are larger (60S + 40S = 80S) and many more enzymatic factors are involved. In addition, the E pocket is not present in the ribosomes of these higher organisms. Again, translation moves in the 5'->3' direction, and the mechanism by which amino acids enter and exit the P and A pockets is the same as in prokaryotes. Since there is no E pocket, amino acids exit through the P site. Let's see how eukaryotic translation occurs in detail (Figure 32 and Figure 33):

The pre-initiation complex is called 43S and is formed by the 40S ribosomal subunit + eIF2, eIF3, Met-tRNA (first aa-tRNA), eIF1 and eIF1A. This complex has not yet bound to the mRNA. The protein factor eIF2 binds the first aa-tRNA, which in this case is Met-tRNA (without the formyl group);
Another complex consisting of eIF4A, eIF4B, eIF4E and eIF4G binds to the mRNA. The complex eIF4A+eIF4E+eIF4G is called eIF4F. In addition, a protein called PABP binds to the poly-A tail;
At this point the 43S complex joins the mRNA-binding complex, and together they form the 48S complex (excluding PABP);
The 48S complex moves along the mRNA until it reaches the "AUG" codon;
At this point the 60S subunit arrives.
All the remaining factors detach and the complete 80S ribosomal complex is formed;
As in bacteria, in eukaryotes too there is a protein factor called EF-Tu that transports the aa-tRNAs after the first aa-tRNA, Met-tRNA, has bound to the P pocket. The E pocket in Eukaryotes, as pointed out in previous paragraphs, does not exist. In the next step, EF-Tu positions the various aa-tRNAs on the ribosome using the GTP molecule as an energy source, which is hydrolysed into GDP. The factor that replaces the GDP with new GTP is EF-Ts, which forms the EF-Tu-Ts complex;
The peptidyl transferase activity is instead carried out by the 28S rRNA of the 60S subunit, assisted by some ribosomal proteins;
As mentioned above, subsequent amino acids bind by the same mechanism known in prokaryotes. The same is true for the exit of "unloaded" tRNAs from the ribosome, the difference being the absence of the E pocket from the ribosomal structure (Figure 32);
In Eukaryotes too, the EF-G protein is used for translocation;
Stop codons in Eukaryotes are recognized by a release factor, eRF1, assisted by another factor, eRF3. The release factor eRF1 terminates translation by releasing the protein chain. RRF (ribosome recycling factor), aided by the EF-G protein, releases the 60S subunit from the ribosome.

In eukaryotes, ribosomes are synthesized in the nucleolus and transported to the cytoplasm, where they can be either free or bound to the Rough Endoplasmic Reticulum (RER). In the RER, newly synthesized polypeptides are bound to chemical molecules so that they are "sorted" into the most appropriate cytoplasmic compartments.

Figure 32. General scheme of translation start, elongation and termination. The basic mechanism of translation is the same as that already seen in Prokaryotes, although in Eukaryotes the E site is missing and there is no Shine-Dalgarno sequence for the recognition of the messenger RNA molecule by the ribosome.
Moreover, in Eukaryotes, the first amino acid is Methionine and not Formyl-Methionine (prokaryotes).

Figure 33. Translation in Eukaryotes. The beginning of translation in Eukaryotes takes place with the binding of the 40S subunit to other proteins to form the 43S complex. Other proteins subsequently bind to the messenger RNA molecule. When the 43S complex binds the mRNA, this new complex is called 48S (a more voluminous complex) and moves in search of the "AUG" codon. On eukaryotic ribosomal subunits the E site is not present, but the pattern in Figure 3.18 is also valid in eukaryotes, if we exclude the fact that the first amino acid transported to the P site is in this case Methionine and not formyl-Methionine as in prokaryotes.

Regulation of transcription

The regulation of gene expression is very important because it makes it possible to modulate the action of genes through their translation into proteins. In some tissues some genes must be very active, in others the same genes must even be deactivated. This fine regulation is not yet fully understood but probably occurs through the following biological processes:

Regulation by microRNAs: miRNAs are small RNAs that do not code for any protein but have a predominantly regulatory function. They are able to block the translation of specific target mRNAs, which therefore cannot be translated into proteins;
Epigenetics: through DNA methylation it is possible to block the transcription of certain genes.
In addition, through chemical modification of so-called histone proteins (proteins constitutively bound to DNA) it is possible to activate or inhibit the specific genes to which these histones are bound;
Post-transcriptional editing: it is not yet completely clear how a cell is able to modify the nucleotide sequence of specific, already processed mRNAs, thus regulating their action after transcription, but this process has been observed in some eukaryotic cell types;
Intervention of other non-coding RNAs: in addition to microRNAs (miRNAs) there are other classes of RNAs that do not code for any protein but are able to block the translation of specific mRNAs or regulate their expression.

Notes on microRNA

MicroRNAs (miRNAs) are RNAs that do not code for any protein but have regulatory activity. This activity takes the form of blocking the translation of specific target mRNAs. The miRNAs are transcribed in the nucleus as pri-miRNAs (thousands of bases) and are then cut into pre-miRNAs (hundreds of bases) by the Drosha/Pasha protein complex. They then fold up, forming a hairpin structure. The pre-miRNAs reach the cytosol through a protein called Exportin 5 and are converted by the Dicer protein into the mature form (a linear structure, no longer a hairpin, with a length of about 20 bases). Each microRNA can block entire mRNA families (up to a few dozen) through two processes: cutting the mRNA or simply blocking its translation. For these reasons, miRNAs are very important in the control of gene expression and, when they do not function normally, are related to many diseases (Vision & Croce, 2009).

Notes on Transposons

Transposons are DNA sequences that move within the genome. They were discovered in the 1950s by Barbara McClintock (McClintock, 1950), who was awarded the Nobel Prize in Physiology or Medicine only in 1983 for identifying transposable elements.
Transposons are classified according to the method by which they move within genomes:

Class 1 or retrotransposons: retrotransposons are copied to a new region of the genome through reverse transcription from RNA to DNA and subsequent insertion of the DNA copy at a new chromosomal location (e.g. LINEs and SINEs). LINEs (non-LTR retrotransposons) carry the information for the transcription and translation of their own reverse transcriptase, while SINEs (also non-LTR retrotransposons) use the LINE enzyme for replication or, in some cases, cellular RNA polymerase III.
Class 2 or DNA transposons: DNA transposons are not copied to another location, but cut and pasted into a new genomic region. The enzymes involved are transposases (proteins that cut nucleotide sequences). However, it should be noted that some DNA transposons are also copied and pasted in the same way as retrotransposons.

Notes on epigenetics

The external environment can influence the function of genes, and the characteristics thus acquired can be inherited. Unfortunately, studies investigating how environmental factors can influence an individual's genetics are difficult to plan in the laboratory. However, a great deal of research has been carried out in this area. For example, Swedish scientists have recently investigated whether nutrition can influence the mortality rate associated with cardiovascular disease and diabetes and whether these effects can be passed on from parents to their children and grandchildren (Kaat, 2002; Bastian 2008). Using data from three generations of families from the 1890s onwards, these researchers found that food deprivation in parents during a critical period of development, just before puberty, was able to protect offspring from cardiovascular disease but to increase the risk of other diseases such as diabetes.
These results show how diet can cause changes in gene expression that can be inherited by subsequent generations and, in general, how epigenetics can modify organisms by adapting them to their environment in the short term. These changes are usually heritable over a few generations. The name epigenetics, therefore, covers all those chemical modifications of the DNA that do not change its sequence. Epigenetics is implemented through the direct methylation of specific DNA bases (cytosines adjacent to guanines) or through histone modifications (methylation, acetylation, ubiquitination, etc.). Histones are proteins that are constantly bound to DNA and have both a structural and a regulatory function (see later sections on 'Direct DNA methylation' and 'Histone modifications').

DNA methylation is a chemical process by which a methyl group is bound to certain nitrogenous bases, or nucleotides, in DNA. DNA is composed of four nucleotides: Adenine (A), Thymine (T), Cytosine (C) and Guanine (G). Methylation is highly specific and occurs in regions where a cytosine nucleotide is immediately followed by a guanine nucleotide (4-6). These sites, known as CpG sites (regions rich in them are called CpG islands), are methylated by one of three enzymes called DNA methyltransferases (DNMTs). The insertion of methyl groups changes the structure of DNA, altering the interactions of a gene with the transcriptional machinery. DNA methylation can be used to distinguish which copy of the gene is inherited from the father and which from the mother. This biological phenomenon, known as imprinting, is used by cells to silence certain genes or entire chromosome regions. In general, methylation inhibits the expression of genes within the DNA molecule (Lewin, 2012).

Histone modifications, on the other hand, are chemical changes that do not directly affect DNA but the proteins that surround it (histones). Depending on the type of histones modified, DNA may be more or less expressed during the cell cycle and development.
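As a small illustration of the CpG sites mentioned above, the following Python sketch locates every position where a cytosine is immediately followed by a guanine and computes the observed/expected CpG ratio, a commonly used heuristic for spotting CpG-rich regions. The function names and the example sequence are hypothetical, and the sketch says nothing about whether a given site is actually methylated, which can only be determined experimentally.

```python
def cpg_sites(seq: str) -> list[int]:
    """Return the 0-based positions of every C that is immediately followed by G."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def cpg_observed_expected(seq: str) -> float:
    """Observed/expected CpG ratio: (CpG count * length) / (C count * G count)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0  # no CpG is possible without both bases
    return len(cpg_sites(seq)) * len(seq) / (c_count * g_count)


if __name__ == "__main__":
    fragment = "ACGTCGCGTA"  # illustrative sequence, not real genomic data
    print(cpg_sites(fragment))                        # [1, 4, 6]
    print(round(cpg_observed_expected(fragment), 2))  # 3.33
```

A ratio well above what base composition alone would predict indicates that CpG dinucleotides are concentrated in the fragment.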
The first human disease identified as being related to epigenetics was cancer. Early studies showed that DNA from diseased tissues obtained from patients with colorectal cancer had a lower degree of methylation than DNA from normal tissues of the same patients (Feinberg, 1983; Jones, 2002; Grønbaek, 2007). As methylated genes are generally deactivated, the loss of DNA methylation can cause some genes to be activated. On the other hand, excessive methylation can suppress the expression of protective genes called onco-suppressors.

Regarding evolution, Neo-Darwinists regard the gene, which is the target of evolution by natural selection, as the fundamental unit of heredity. They combine in a single theory the ideas of Charles Darwin with those of his contemporary Gregor Mendel (the father of genetics) and other contemporary authors. According to this hypothesis, evolution occurs because of mutations that lead to changes in the frequency of alleles (variants of the same gene carried on the maternal or paternal chromosome) due to genetic drift (variation in gene frequencies in a population due solely to chance) or natural selection.

Nutrigenomics

What is Nutrigenomics all about?

Nutrigenomics represents one of the most fascinating frontiers of modern food science. It aims, in fact, to investigate how much the genome of an individual can influence the diet and vice versa. Nutrigenomics is the discipline that correlates genomics with the study of nutrition. In particular, it studies how each of us, having DNA that differs from that of others, reacts to molecules in food, and how diet can influence gene transcription, protein expression and metabolism. In addition, nutrigenomics studies nutrition at the level of entire human populations, thus combining the knowledge of molecular anthropology with that of nutrition.

Origin of human beings

How did humans become meat eaters?

Our earliest ancestors fed on plants, seeds, and nuts.
In the rainforests that existed in vast areas of the African continent, among the trees, our next ancestor had just evolved twenty million years ago. It is the earliest known primate, which researchers have named Purgatorius. It resembled a cross between a mouse and a squirrel. Our ancestor was a skilled tree-climbing primate and a vegan. It abandoned the insect-based diet of its ancestors in favor of the abundant new fruits and flowers, carving out a comfortable niche high up in the branches. For tens of millions of years, Purgatorius' descendants remained committed to their plant-based diet. From tiny apes to gorilla-sized monkeys, they survived primarily on tropical fruits, probably seasoning their meals with occasional annelids or nematodes (often by accident). About 15 million years ago, they diversified a bit, adding hard seeds and nuts to their diet, but stayed true to their vegan roots.

Then, about 6 million years ago, Sahelanthropus tchadensis entered the African primate scene. With the advent of Sahelanthropus, our lineage likely separated from that of our closest cousins, the chimpanzees and bonobos. The word Homo denotes modern humans and all extinct species closely related to us, and Sahelanthropus was probably the first primate from which we are derived (Figure 1). It was a short, flat-faced creature with a small brain, and we do not know whether it walked upright on two legs. It had smaller canine teeth than its ancestors and thicker enamel, suggesting that its diet required more chewing and grinding than Purgatorius-type fruit and flower meals. However, meat eating had not yet taken hold among our ancestors. Sahelanthropus probably ate tough, fibrous plants supplemented with seeds and nuts. Later, the various species of Australopithecus (the earliest find is called Lucy) that lived between 4 and 3 million years ago in the forests, riverine forests, and seasonal floodplains of Africa were not meat-eaters either.
Their dental microwear (the pattern of microscopic pits and scratches left on the surface of their teeth by the food they ate) suggests a diet similar to that of modern chimpanzees: some leaves and shoots, lots of fruit, flowers, a few insects here and there, and even the bark of trees. Did Australopithecines ever eat meat? It's possible. Just as modern chimpanzees occasionally hunt colobus monkeys, our ancestors may also have occasionally dined on the raw meat of small apes. But the guts of early hominins would not have allowed them to have a meat-rich diet like the one Americans eat today. Their intestines were characteristic of fruit and leaf eaters, with a large caecum, a pouch full of bacteria at the beginning of the large intestine.

Figure 1. Tree of life of humans

If an Australopithecus (Figure 3) decided to eat meat (for example, a few zebra steaks), it would likely suffer from colonic torsion, experiencing stomach pain, nausea, and bloating, perhaps resulting in death. Yet despite these dangers, by 2.5 million years ago, our ancestors had become meat eaters. It seems that our bodies had to adapt gradually, starting with a diet of seeds and nuts, which are high in fat but low in fiber. If our ancestors ate a lot of these, such a diet would have promoted the growth of the small intestine (where lipid digestion takes place) and the narrowing of the caecum (where fiber is digested). This would have made our intestines better for processing meat. A diet of seeds and nuts may have prepared our ancestors for a carnivorous lifestyle in another way as well: it may have given them the tools to carve carcasses. Some researchers suggest that the simple stone tools used to pound seeds and nuts could easily have been repurposed to break animal bones and cut pieces of meat. And so, by 2.5 million years ago, our ancestors were ready for meat.
They had the tools to get it and the bodies to digest it. But being capable is one thing; having the will and ability to go and get meat is another. What, at some point, inspired our ancestors to look at antelopes and hippos as potential dinners? The answer, or at least part of it, may lie in a change in climate about 2.5 million years ago in Africa.

The African rift valley system (Figure 2) represents a unique environment for understanding the origin and evolution of mankind; because of the important paleoanthropological discoveries in Ethiopia, Kenya, Tanzania, Uganda and Zaire, the rift valley is in fact considered the cradle of mankind, i.e., the place where our species has evolved and diversified over the last million years.

Figure 2. Rift Valleys

The association between paleoanthropological findings and African rift valleys is not accidental, since the volcanic and tectonic activity responsible for the formation of these tectonic depressions and the simultaneous sedimentation have created ideal conditions for the proliferation of life. In parallel, lava flows, volcanoclastic sediments and volcanic ash quickly covered animal and plant remains, thus allowing the preservation of fossils. Many findings of hominid fossils are associated with the Ethiopian rift valley, and in particular with the Afar depression, suggesting that this area has represented a crucial zone for the process of hominization in the last million years.

Figure 3. Tree of life of modern humans

The rift valley probably divided a primate population into two subpopulations. One of these found itself in the middle of the savanna. In this region, much of the rainforest became sparsely wooded grassland, with few high-quality plants to eat but with more and more grazing animals. During the long dry period, our ancestors would have had trouble getting enough food, and to find their usual food, they would have had to spend more time and calories. Early hominins were at an evolutionary crossroads.
Our ancestors therefore ceased to be vegetarians and turned to eating meat as well. They also became more aggressive and more social, in order to fight other carnivores in groups. Hominins thus became omnivorous and opportunistic. If something was edible and it was there, they ate it. By 2.6 million years ago, there was a lot of meat around. Just as Purgatorius took advantage of climate change and a new abundance of fruits, its descendants, the early Homo, successfully adapted their diets to changes in their environment. But this time, it meant hunting for meat [Sci].

From vegetarian to an omnivorous species

From that moment, man evolved in a direction that led him to be omnivorous. This starts with our teeth, which are complete and multifunctional, very different from those of herbivores. We do not have continuously growing teeth, and we do not have mouths suitable for collecting food directly from the ground; therefore we cannot grasp and cut grass with our incisors and graze as herbivores do. We do not have molars suitable for shredding vegetables, and we cannot easily swallow foods that are too fibrous and coarse. Our molars are also much more similar to those of carnivores and omnivores than to those of herbivorous animals. Moreover, the area between the canine and the first premolar is very sharp and pointed, similar to the teeth of carnivores and therefore suitable for cutting a piece of meat. The opposable thumb, suitable for grasping prey and for building weapons to make up for the lack of claws, is another characteristic in favor of the omnivorous diet. So are the frontally placed eyes, a typical characteristic of predators because they allow binocular vision, whereas prey animals have laterally placed eyes that widen the visual field as much as possible, so that they can spot predators and run away from them.
Other primates are omnivorous as well, including chimpanzees and bonobos, the species closest to Homo sapiens (Figure 4).

Figure 4. Appearance of Homo sapiens

These also feed on meat, actively hunting other animals in the forest, plundering eggs and hatchlings from birds' nests, and habitually feeding on insects, using twigs to extract ants and termites. The gorilla, instead, is a herbivore, and the only animal food sources it feeds on are some insects, which make up less than 1% of its diet. Even the enzymes involved in the digestive process are clear evidence of our being omnivorous. In fact, we are equipped with lipases, enzymes that break down fats such as triglycerides, which are abundant in foods of animal origin, and with pepsin, an enzyme that attacks proteins by breaking them down into smaller fragments. So nature has genetically programmed us to perfectly digest meat, which has been present in our diet for hundreds of thousands of years, as shown