
Full Transcript


Lecture 5: Genomics Data Analysis, Part 1

‘The important thing is not to stop questioning’ Albert Einstein

Mr. Nabras Al-Mahrami
Nabras.almahr...

Refreshing

- DNA is a polymer of nucleotides, each made of a sugar, a phosphate, and one of four nitrogenous bases (A, T, G, C).
- DNA is double stranded, with complementary bases pairing.
- Size/length units: base pairs (bp), kilobases (kb), megabases (Mb).
- The order of bases along one strand is referred to as the DNA sequence.

Human genetic variation

- Genetic variation is the difference in DNA sequence between individuals within a population (i.e. changes in the sequence).

Understanding the role of genetic variation (harmful vs. beneficial impact)

- The genetic mutations that cause sickle cell anemia have also been found to have a protective effect: individuals with sickle cell trait (i.e. carriers of the recessive allele) are less likely to die from malaria, which is caused by parasites and spread by mosquitoes.

Luzzatto, L., 2012. Sickle cell anaemia and malaria. Mediterranean Journal of Hematology and Infectious Diseases, 4(1).

Understanding the role of genetic variation (disease causing)

Some variants lead to disorders, in the form of:

- Single-gene disorders
  - Follow Mendelian inheritance patterns
  - Generally rare
  - Caused by a variant in one gene
  - Examples: cystic fibrosis and Huntington’s disease
- Complex diseases
  - Common
  - Caused by the combined effects of multiple genetic and environmental factors
  - Examples: heart disease, cancer, diabetes, and psychiatric disorders

Genetic variant classification: based on location

Genic variants occur within genes and may directly affect gene function.

1. Exonic: within the coding region of a gene.
2. Intronic: within the non-coding, intron regions of a gene.

Genetic variant classification: based on scale

1. Single Nucleotide Variants (SNVs): changes in a single nucleotide.
   - Synonymous: do not change the amino acid sequence of the resulting protein.
   - Non-synonymous: change the amino acid sequence (missense and nonsense).
2. Insertions and Deletions (Indels): addition or removal of nucleotides.
3. Copy Number Variants (CNVs): variations in the number of copies of a particular gene or region.
4. Structural Variants: larger-scale changes such as duplications, inversions, and translocations.

Genetic variant classification: Single Nucleotide Variants (SNVs)

Single Nucleotide Variants (SNVs): non-synonymous

Insertions and Deletions (Indels)

Copy Number Variants (CNVs)

For more information on these variant types: https://medlineplus.gov/genetics/understanding/mutationsanddisorders/possiblemutations/

Genetic variant classification: based on origin

- Germline variants
- Somatic variants

Genetic variant classification: based on frequency

1. Common variants: frequency > 5% in the population. These are often called single nucleotide polymorphisms (SNPs). Because they are so common in the general population, they are typically considered benign or non-pathogenic unless there is specific evidence to the contrary.
2. Low-frequency variants: frequency 1% - 5% in the population. These variants are less common, and their significance varies. Some are benign, while others have a more pronounced effect on phenotype or disease risk.
3. Rare variants: frequency < 1% in the population. Rare variants are more likely to be pathogenic, especially if they disrupt the normal function of a gene. However, being rare does not automatically mean a variant is harmful. Each rare variant requires careful evaluation in the context of its potential impact on protein function, disease association studies, and other lines of evidence.
4. Very rare variants: frequency extremely low, sometimes seen in only one family or individual. These are often the most challenging to interpret, because little data is available about their effects. They might be recent mutations or simply variants that have not been widely studied.

National Organization for Rare Disorders: https://rarediseases.org/

HGVS nomenclature

Why do we need a standardized mutation nomenclature system?

- To ensure efficient and accurate reporting.
- To ensure that the growing number of detected sequence variants is quality controlled, documented, and stored correctly, so that it remains available in the future.

Practical use of HGVS nomenclature (general rules)

- All variants should be described at the most basic level, i.e. the DNA level.
- Descriptions should always be given in relation to a reference sequence, either a genomic or a coding DNA reference sequence.
- The main components of an HGVS variant description are the reference and the description. A gene symbol and a protein description are sometimes added.

General form: Gene symbol Reference:Basic level description (Protein description)

Practical use of HGVS nomenclature (Basic 1)

The description of a variant should be preceded by a letter indicating the type of reference sequence used:

- c. for a coding DNA sequence (like c.76A>T)
- g. for a linear genomic sequence (like g.476A>T)
- m. for a mitochondrial sequence (like m.8993T>C)
- r. for an RNA sequence (like r.76a>u)
- p. for a protein sequence (like p.Lys76Asn)

MUST READ: https://hgvs-nomenclature.org/stable/background/simple/

Practical use of HGVS nomenclature (Basic 2)

- > indicates a substitution at the DNA level (e.g. c.76A>T)
- _ (underscore) indicates the range of affected residues, separating the first and last residue affected (e.g. c.76_78delACT)
- del indicates a deletion (e.g. c.76delA)
- dup indicates a duplication (e.g. c.76dupA)
- ins indicates an insertion (e.g. c.76_77insG)

Practical use of HGVS nomenclature (Basic 3)

Nucleotide numbering (coding DNA)

- There is no nucleotide 0.
- Nucleotide 1 is the A of the ATG translation start codon.
- Numbering continues from one exon into the next, so the count runs across exons and skips the introns.

Nucleotide numbering (before the start codon and after the stop codon)

- Nucleotides before the start codon are counted with a minus sign: -1 is the first nucleotide that comes before the ATG, -2 the next one upstream, etc. (e.g. c.-12G>A).
- Nucleotides after the stop codon are counted with an asterisk: *1 is the first nucleotide 3' of the stop codon, *2 the next, etc. (e.g. c.*70T>A).

Nucleotide numbering (introns)

- Beginning of the intron: the number of the last nucleotide of the preceding exon, a plus sign, and the position in the intron (e.g. c.88+2T>G).
- End of the intron: the number of the first nucleotide of the following exon, a minus sign, and the position upstream in the intron (e.g. c.89-1G>T).
- Middle of the intron: numbering changes from "c.77+.." to "c.78-..". For introns with an uneven number of nucleotides, the central nucleotide is the last one described with a "+".

Describe the following variants:

- HBB NM_000518.5:c.*112A>G
- HBB NM_000518.5:c.436T>C (p.Tyr146His)
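The HGVS rules above (reference-type prefix, position numbering, and change symbols) can be tried out on the two HBB exercise variants with a small parser. This is only a minimal sketch, not an official HGVS implementation: the regular expression, `parse_hgvs`, `describe_region`, `describe_change`, and `codon_number` are hypothetical names introduced here for illustration. It assumes plain ASCII `-` signs and handles only the simple DNA-level forms shown in the lecture (substitution, del, dup, ins).

```python
import re
from typing import NamedTuple, Optional

# DNA-level prefix letters from the lecture and the reference type each denotes.
REFERENCE_TYPES = {"c": "coding DNA", "g": "linear genomic", "m": "mitochondrial"}

# Hypothetical pattern: covers only the simple DNA-level forms from the slides.
_POSITION = r"[-*]?\d+(?:[+-]\d+)?"
_HGVS = re.compile(
    rf"^(?:(?P<ref>[A-Za-z0-9_.]+):)?(?P<prefix>[cgm])\."
    rf"(?P<start>{_POSITION})(?:_(?P<end>{_POSITION}))?"
    r"(?P<edit>[ACGT]>[ACGT]|del[ACGT]*|dup[ACGT]*|ins[ACGT]+)$"
)


class ParsedVariant(NamedTuple):
    reference: Optional[str]  # e.g. "NM_000518.5", or None if not given
    reference_type: str
    region: str
    change: str


def describe_region(position: str) -> str:
    """Classify a coding-DNA (c.) position using the numbering rules above."""
    if position.startswith("*"):
        return "3' of the stop codon"
    if position.startswith("-"):
        return "5' of the start codon"
    if "+" in position or "-" in position:
        return "intron"
    return "coding sequence"


def describe_change(edit: str) -> str:
    """Map the change symbol (>, del, dup, ins) to its name."""
    if ">" in edit:
        return "substitution"
    return {"del": "deletion", "dup": "duplication", "ins": "insertion"}[edit[:3]]


def parse_hgvs(description: str) -> ParsedVariant:
    """Parse 'Reference:description' or a bare description such as 'c.76A>T'."""
    match = _HGVS.match(description)
    if match is None:
        raise ValueError(f"unsupported description: {description!r}")
    for position in (match["start"], match["end"]):
        # There is no nucleotide 0 in HGVS numbering.
        if position is not None and int(re.search(r"\d+", position).group()) == 0:
            raise ValueError("there is no nucleotide 0")
    prefix = match["prefix"]
    # The region rules apply to c. numbering only.
    region = describe_region(match["start"]) if prefix == "c" else "not classified"
    return ParsedVariant(match["ref"], REFERENCE_TYPES[prefix], region,
                         describe_change(match["edit"]))


def codon_number(coding_position: int) -> int:
    """Codon containing a c. position (nucleotide 1 is the A of the ATG)."""
    return (coding_position - 1) // 3 + 1


print(parse_hgvs("NM_000518.5:c.*112A>G"))
print(parse_hgvs("NM_000518.5:c.436T>C"))
print(codon_number(436))  # 146, consistent with p.Tyr146His
```

The gene symbol and the protein description in parentheses are left out of the input on purpose, since the sketch only reads the reference and the DNA-level description.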

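Returning to the frequency-based classification from earlier in the lecture, the thresholds can be written as a short function. This is a sketch under stated assumptions: `classify_by_frequency` is a hypothetical helper, the frequency is given as a fraction between 0 and 1, and the boundary values of exactly 1% and 5% are assigned to the low-frequency class. The "very rare" class is left out because the lecture gives no numeric cutoff for it.

```python
def classify_by_frequency(allele_frequency: float) -> str:
    """Return the lecture's frequency class for a population frequency (0 to 1)."""
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele_frequency must be between 0 and 1")
    if allele_frequency > 0.05:       # > 5%
        return "common"
    if allele_frequency >= 0.01:      # 1% - 5% (boundaries assumed inclusive)
        return "low frequency"
    return "rare"                     # < 1%


for frequency in (0.20, 0.03, 0.001):
    print(frequency, classify_by_frequency(frequency))
```

As the lecture stresses, the class is only a starting point: a rare variant is not automatically harmful, and a common one is benign only in the absence of evidence to the contrary.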