Questions and Answers
Which of the following best describes genomics?
- The analysis of protein structure and function.
- The application of computational tools to analyze large biological datasets.
- The study of all the information within the entire genome. (correct)
- The study of individual genes and their functions.
Bioinformatics is primarily focused on experimental laboratory techniques rather than computational analysis.
False (B)
Which of the following is a common application of bioinformatics?
- Comparing gene sequences between different species. (correct)
- Synthesizing new DNA molecules in a laboratory.
- Performing surgical procedures.
- Developing new antibiotics.
The process of 'stitching together' genomic sequences to create a complete genome sequence is facilitated by the field of ___________.
Match the sequencing method with its description:
What is the primary goal of performing a BLAST search?
The ENCODE project revealed that all RNA transcripts encode proteins.
Which of the following databases houses a large collection of genomic DNA sequences, identified genes, and proteins?
__________ splicing allows a single gene to code for multiple proteins.
Match the 'omics' field with its description:
What percentage of the human genome is estimated to consist of protein-coding sequences?
Personalized medicine involves treating patients based on a population average rather than their individual DNA sequence.
Which of the following is a primary application of the 1000 Genomes Project?
Sequences of DNA that are similar across species are said to be __________.
Match the sequencing technology with its generation.
What is a key challenge in predicting protein structure from its amino acid sequence?
ENCODE project primarily focuses on identifying protein-coding genes within the genome.
Which of the following represents a significant advancement in protein structure prediction?
__________ are variations in a single nucleotide that are spread across the genome and contribute to individual differences.
Match the application with the genomic technology.
Which of the following is a key characteristic of third-generation sequencing technologies?
The human microbiome consists of only bacteria.
In the context of genomics, what is a 'contig'?
The area of genomics that allows the study of how protein sequences encoded by conserved genes have changed during evolution is known as __________.
Match the genomic terms with their definitions.
What is the primary purpose of microbiome transplants in medicine?
Synthetic genomes have enabled scientists to create entirely new organisms with no ancestral relationship to existing life forms.
Which of the following is the correct order of steps in shotgun sequencing?
A major aim of genomics is to identify the __________ coding genes that are present in the genome.
Match the type of sequencing with its description.
What is a common application of analyzing Neanderthal DNA?
What does the acronym SNP stand for in the context of genomics, and why are they important?
Describe the process of identifying protein-coding genes using cDNA sequencing and its advantages over genomic DNA sequencing.
Distinguish between genomics and transcriptomics, and briefly explain how they complement each other in systems biology. Genomics studies the entire genome, while transcriptomics focuses on the __________.
Describe how advances in sequencing technology have enabled rapid genome sequencing and led to the development of the new area of genetics called _________.
Flashcards
What is Genomics?
The study of all the information within an organism's DNA.
What is Bioinformatics?
A field using computational techniques to organize, share, and analyze biological data.
Genome Sequencing
The process of assembling many DNA fragments to determine the entire genome sequence.
What is Whole Genome Shotgun Sequencing?
What is Gene Identification?
What is BLAST?
What are regulatory elements?
What are genome databases?
What does the NCBI Genome Browser do?
What is the ENCODE project?
What are SNPs (Single Nucleotide Polymorphisms)?
What is the 100,000 Genome Project?
What is the Microbiome?
What is Transcriptomics?
What is Proteomics?
What are Microbiome Transplants?
What are Synthetic Genomes?
What is Personalized Genomics?
Ancient DNA Analysis
What is Metagenomics?
What does 'synthetic biology' mean?
What is CLUSTAL-W?
Genes changing during evolution
Study Notes
Genomics and Bioinformatics Overview
- Genomics involves studying all information within a genome.
- Bioinformatics is a new area of genetics developed to analyze sequence information from genomics.
- Advances in sequencing technology enable the rapid sequencing of genomes.
- Bioinformatics uses computational techniques to organize, share, and analyze genomic information.
Genomics Focus
- Seeks to identify genome organization, including the number and arrangement of genes, and the role of non-coding DNA.
- Aims to identify similarities and differences between genomes of various species and individual humans.
- Genomics led to the development of computational tools for analyzing large amounts of information, i.e., bioinformatics.
Bioinformatics Applications
- Involves the compilation and stitching together of genomic sequences to create complete genome sequences.
- Used for comparing gene sequences between species and identifying genes in genomic sequences.
- Aids in predicting amino acid sequences of potential proteins encoded by genes.
- Enables analysis of protein structure and prediction of protein functions.
- Helps in finding gene regulatory regions such as promoters and enhancers.
- Used to deduce evolutionary relationships between genes and organisms, and to identify where and when genes are expressed.
Genome Sequencing
- Genomic DNA is cut with different restriction enzymes to create overlapping fragments.
- Computer programs align overlapping sequenced fragments to assemble an entire chromosome.
- Alignment of fragments based on identical DNA sequences creates contigs.
- Software is used to find sequence overlaps in the fragments, and this is used to generate a full sequence from all fragments.
Shotgun Sequencing
- The whole genome shotgun sequencing method has four steps:
- Genomic DNA is fragmented.
- Each fragment is sequenced.
- Overlapping fragment sequences are aligned into contiguous sequences (contigs).
- A finished sequence is generated.
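The alignment step can be illustrated with a minimal sketch. It assumes error-free reads from a single strand and uses a simple greedy merge; the function names `overlap` and `greedy_assemble` are hypothetical, and real assemblers use far more robust methods.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the two fragments with the largest overlap.

    Returns the list of contigs left when no overlaps remain.
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no more overlaps: the remaining fragments are separate contigs
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Three made-up overlapping reads covering a 20-base sequence
reads = ["ATGCGTACG", "TACGTTAGCC", "AGCCTAGGA"]
contigs = greedy_assemble(reads)
```

With these invented reads, the fragments merge into a single contig because each read shares a four-base overlap with the next.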
Genome Analysis
- A major aim of genomics is to identify the protein coding genes present in the genome.
- Genes can be identified by comparing sequences between species.
- BLAST (Basic Local Alignment Search Tool) can be used to perform this comparison.
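As a rough illustration of what a local alignment comparison measures, the sketch below scores the best local alignment between two sequences with Smith-Waterman dynamic programming. This is not BLAST itself: BLAST uses a faster heuristic to approximate this kind of search against large databases. The scoring values, the function name and the toy database are assumptions chosen for the example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (Smith-Waterman)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Rank a made-up two-entry "database" by similarity to a query
database = {"seq_a": "GGACGTGG", "seq_b": "TTTTTTTT"}
query = "ACGT"
ranked = sorted(
    database,
    key=lambda name: local_alignment_score(query, database[name]),
    reverse=True,
)
```

Sequences that contain a close match to the query score highest, which is the basic idea behind finding related genes in other species.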
Annotation of the Genome
- Programs have been developed to locate protein coding genes within genomes.
- Specific DNA sequences associated with genes are: TATA box (TATA(A/T)A(A/T)), CAAT box, translation initiation sites (ATG), splice sites, exons, introns, stop codons (TAA, TAG, TGA) and poly A addition (AATAAA) sites.
- Sequencing of cDNAs helps identify protein-coding genes and the location of exons.
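A minimal sketch of how a program might scan for such signals is shown below. It searches one strand only, uses the TATA box and poly A consensus sequences given above, and treats ATG as the start codon and TAA, TAG and TGA as stop codons (the standard genetic code). The function names and the test sequence are invented for illustration; real gene finders combine many more signals and statistical models.

```python
import re

MOTIFS = {
    "TATA box": re.compile(r"TATA[AT]A[AT]"),
    "poly A signal": re.compile(r"AATAAA"),
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_motifs(seq):
    """Return sorted (position, motif name, matched text) tuples."""
    seq = seq.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        for m in pattern.finditer(seq):
            hits.append((m.start(), name, m.group()))
    return sorted(hits)


def first_orf(seq):
    """Return (start, end) of the first ATG followed by an in-frame stop codon."""
    seq = seq.upper()
    start = seq.find("ATG")
    while start != -1:
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return start, i + 3
        start = seq.find("ATG", start + 1)
    return None


# Invented sequence: TATA box, a short open reading frame, then a poly A signal
example = "GG" + "TATAAAA" + "GGCC" + "ATGGCTTGGTAA" + "GG" + "AATAAA" + "GG"
hits = find_motifs(example)
orf = first_orf(example)
```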
Genomic Sequences and Comparisons
- A wide variety of genomes have been sequenced, and the sequence information is available in public databases.
- This allows for comparisons of genome size, number of genes and similarity to human genomes.
- Conserved genes can be identified and their evolution examined.
- Homologues of human genetic disease genes can be identified in other species.
- Roughly 50,000 species have been sequenced (source: https://www.ensembl.org/info/about/species.html).
Protein Sequence Prediction
- The amino acid sequence of proteins can be predicted when protein coding genes have been identified, because the triplet code is known.
- For example, it is possible to predict the order of amino acids in human growth hormone.
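Because the triplet code is known, translating a coding sequence is a table lookup. The sketch below builds the standard genetic code table and translates a short invented coding sequence; the function name `translate` and the example sequence are assumptions, not taken from any real gene.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop codon
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_seq):
    """Translate a DNA coding sequence codon by codon until a stop codon."""
    coding_seq = coding_seq.upper()
    protein = []
    for i in range(0, len(coding_seq) - 2, 3):
        aa = CODON_TABLE[coding_seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


protein = translate("ATGGCTTGGTAA")  # ATG GCT TGG TAA
```

Here the four codons read Met, Ala, Trp and stop, so the predicted protein is `MAW` in one-letter code.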
Prediction of Protein Function
- Predicting the function of proteins is possible once related protein-coding genes have been identified.
- Previous work has identified proteins with particular functions, e.g., kinases, transcription factors.
- Knowing the sequence of these genes has allowed for the identification of amino acids characteristic of protein function.
- Other proteins can then be searched for these characteristic features.
Gene Families
- CLUSTAL-W allows the identification of predicted proteins that contain similar sequences, identifying gene families within species that have similar functions.
- Some genes are present in multiple copies within a species.
- Multiple sodium channel (SCN) proteins are encoded in the human genome.
Conserved Genes
- Investigating whether protein sequences are conserved between species can be performed once genes and predicted protein sequences have been identified.
- This comparison can be performed using CLUSTAL-W, a multiple sequence alignment program, which can also identify functional regions in proteins.
- It also allows the study of how protein sequences encoded by conserved genes have changed during evolution.
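Once sequences have been aligned, conserved positions can be read off column by column. The sketch below assumes the alignment has already been produced (for example by a multiple alignment program) and marks fully conserved columns with `*`, similar to the consensus line in Clustal output. The three aligned sequences are made up for illustration.

```python
def conservation_line(aligned):
    """Return a string with '*' under columns where all sequences agree.

    Expects pre-aligned sequences of equal length, with '-' for gaps.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        identical = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if identical else " ")
    return "".join(marks)


def conserved_fraction(aligned):
    """Fraction of alignment columns that are fully conserved."""
    line = conservation_line(aligned)
    return line.count("*") / len(line)


# Hypothetical aligned protein fragments from three species
alignment = ["MKTAYIAK", "MKSAYIAR", "MKTAY-AK"]
line = conservation_line(alignment)
fraction = conserved_fraction(alignment)
```

Runs of `*` suggest regions under evolutionary constraint, which are candidates for functional regions of the protein.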
Proteins Encoded by Genes
- Comparison of sequences allows prediction of the function for the majority of the proteins encoded by human genes.
- However, the function of just over 40% of human genes is still unknown.
Mapping Genes
- Characterization of the human genome sequence allows the mapping of genes to each of the chromosomes.
- It is possible to locate the position of genes coding for specific protein sequences and of genes associated with human genetic disease.
Protein Structure
- Protein structure can be identified using an X-ray diffraction pattern generated from protein crystals or using prediction software.
- Prediction software aims to predict a protein's 3D structure from its amino acid sequence.
- Sequence -> secondary structure -> 3D structure -> function
DeepMind - AlphaFold
- Computational scientists at DeepMind, an artificial intelligence company owned by Google, worked on improving protein structure prediction.
- They used AI and machine learning to develop a program, AlphaFold, that can predict protein structures.
- This is a huge advancement, as it helps predict sites in proteins where drugs could bind.
Human Genome Sequence
- The human genome sequence was completed in 2003.
- The original human genome was sequenced from samples combined from a number of individuals.
- It was sequenced both by a publicly funded consortium and by a private company, Celera Genomics.
- The human genome contains approximately 3 billion nucleotides.
- About 2% is protein-coding sequence, while the other 98% is non-coding.
- The human genome contains 20,000 - 30,000 protein coding genes.
- A gene can often undergo alternative splicing and therefore produce more than one protein.
Databases
- A massive amount of sequence information is held in public databases.
- The largest set of databases is held at the National Center for Biotechnology Information (NCBI).
- NCBI holds databases of genomic DNA sequences, identified genes and proteins, and genes associated with human disease.
- NCBI also holds a database of cDNA sequences.
- By comparing cDNA sequences with the genomic sequence it is possible to identify the location of exons and introns.
- These cDNA sequences contain only exons, which allows identification of gene regions showing alternative splicing.
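The cDNA-to-genome comparison can be sketched as follows. This toy version assumes exact matches and places each stretch of the cDNA greedily, left to right, on the genomic sequence; the gaps between placed stretches are the introns. The function name `locate_exons`, the `min_exon` parameter and the sequences are hypothetical. Real spliced aligners tolerate mismatches and use splice-site signals to resolve ambiguous exon boundaries, which this sketch does not.

```python
def locate_exons(genomic, cdna, min_exon=4):
    """Return (start, end) genomic coordinates for each exon found in the cDNA."""
    exons = []
    g_pos = 0  # where to resume searching in the genomic sequence
    c_pos = 0  # how much of the cDNA has been placed so far
    while c_pos < len(cdna):
        placed = False
        # Try the longest remaining cDNA prefix first, then shorter ones
        for length in range(len(cdna) - c_pos, min_exon - 1, -1):
            start = genomic.find(cdna[c_pos:c_pos + length], g_pos)
            if start != -1:
                exons.append((start, start + length))
                g_pos = start + length
                c_pos += length
                placed = True
                break
        if not placed:
            raise ValueError("cDNA could not be placed on the genomic sequence")
    return exons


def introns_between(exons):
    """Intron coordinates are the gaps between consecutive exons."""
    return [(a_end, b_start) for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


# Invented gene: two exons separated by one intron, with flanking sequence
exon1, intron, exon2 = "ATGGCCAAG", "GTAAGTCTTTTCAG", "CTGGAGTGA"
genomic = "TTTT" + exon1 + intron + exon2 + "CCCC"
cdna = exon1 + exon2

exons = locate_exons(genomic, cdna)
introns = introns_between(exons)
```

Comparing several cDNAs from the same gene in this way would show which exons are included or skipped in each transcript, which is how alternative splicing shows up in the data.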
Databases - NCBI
- NCBI’s Genome Browser enables viewing gene organization and identifying alternative splicing.
- Example: human growth hormone.
ENCODE Project
- ENCODE (Encyclopedia of DNA Elements) was set up to identify all functional elements in the genome beyond protein-coding genes.
- ENCODE identified promoter and enhancer sequences.
- The project found that the genome contains regions of repetitive DNA that can vary between individuals.
- ENCODE also analysed all transcribed regions, finding that most regions are transcribed into RNA, even if they do not encode proteins.
- This led to recognition of the functional activities of RNAs and to the field of transcriptomics. Transcriptomics studies all genes expressed in different cell types and how expression changes in disease conditions.
1000 Genomes Project
- It followed the original human genome sequence.
- The genomes of 1,092 humans from different populations were sequenced to identify the small number of genetic differences between individuals.
- Individuals' genomes are 99.9% the same, but individuals have differences at single nucleotides spread across the genome known as SNPs (single nucleotide polymorphisms).
- Individuals also have variations in their repetitive DNA.
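A minimal sketch of SNP detection is shown below. It assumes two sequences that are already aligned and of equal length, with substitutions only; the function names and the short example sequences are invented. Real variant callers work from many sequencing reads and account for sequencing error.

```python
def find_snps(reference, sample):
    """Return (position, reference base, sample base) for every mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


def percent_identity(reference, sample):
    """Percentage of positions at which the two sequences agree."""
    differences = len(find_snps(reference, sample))
    return 100 * (len(reference) - differences) / len(reference)


reference = "ATGGCTTGGTAA"
sample = "ATGGCTTGATAA"  # differs from the reference at one position

snps = find_snps(reference, sample)
identity = percent_identity(reference, sample)

# Scale check: 0.1% of a ~3 billion nucleotide genome
expected_differences = 3_000_000_000 // 1000
```

The scale check shows why 99.9% identity still leaves room for variation: 0.1% of roughly 3 billion nucleotides is about 3 million positions.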
100,000 Genomes Project
- It followed the 1000 Genomes Project. It was begun by Genomics England in 2013 and was funded by the NHS.
- The project aimed to sequence 100,000 genomes of patients affected by rare disease to find the genetic differences that lead to disease.
- Completed in 2018, the project has allowed identification of genetic differences in these patients.
- Genomics England expects the NHS will be the first healthcare system to diagnose human disease by examining the genomic sequence of the affected patient.
- This approach is known as personalized medicine, since patients are treated based on their DNA sequence.
Cost of Sequencing
- The cost of sequencing a human genome has decreased dramatically, from roughly $100 million in 2001 to under $1,000 today (NHGRI estimates).
- The drop in cost has been due to automated techniques.
Personal Genomics Services
- The cost of genome sequencing and analysis has decreased, leading many companies to offer personalized genome services.
- Companies like 23andMe and AncestryDNA analyze genomes, compare sequences to populations, and report on ancestry.
- 23andMe also offers reports on risk of disease.
Ancient DNA
- Researchers are also investigating the genomes of extinct species to learn more about evolution.
- DNA can be extracted from the bone and hair of extinct species that lived tens or hundreds of thousands of years ago, especially from frozen samples, and can then be sequenced.
- The genomes of mammoths, cave bears, and ancient fish have already been sequenced.
- Ancient DNA can also be recovered from mummified remains and from Neanderthals.
- Neanderthals are our closest extinct human relative.
- By analyzing Neanderthal DNA, we can study how modern humans evolved, including adaptation.
- Some of the genes studied in this way can be linked to the acquisition of language.
- Small regions of Neanderthal DNA remain in the genomes of modern humans, and 23andMe can provide a report on the amount of Neanderthal DNA that you have in your genome.
Microbiome
- 600 to 1000 species of microorganisms are estimated to live on humans, primarily in the digestive tract.
- Microbiomes are the specific sets of microorganisms found on individuals.
- An individual's microbiome can be identified by sequencing.
- Microbiomes differ between individuals, although each individual has a generally constant personal microbiome.
Changes to Microbiome
- Changes can occur in the microbiomes of individuals suffering from illnesses, e.g. IBS or acne.
- New therapies involve microbiome transplants from healthy donors.
Creating Genomes
- Synthetic genomes can address two questions: what is the minimum number of genes necessary for a cell, and can genomes be synthesized?
- The J. Craig Venter Institute demonstrated that 473 genes are sufficient in a microorganism.
- A genome with 473 genes was synthesized for a bacterial cell; the cell survived and grew, making it a base organism for adding new genes in synthetic biology.
- Genes added to synthetic genomes could allow certain microbes to degrade pollutants, express proteins or synthesize biofuels.
Omics
- Genomics is the study of all genes in a genome. It is an advance on classical genetics because it considers the entire genome rather than individual genes.
- Genomics has led to the development of:
- Transcriptomics - study of all expressed genes in a cell.
- Proteomics - study of all proteins in a cell.
- Metabolomics - study of all metabolites (the small molecules produced and used in metabolism) in a cell.
- Glycomics - study of all carbohydrates in a cell.
- Metagenomics - analysis of genomes from an entire environmental community.