Structural Bioinformatics Lecture 8 PDF

Bioinformatics BIO417
Lecture 8: Structural Bioinformatics
Prepared by Dr. Rami Elshazli, Associate Professor of Biochemistry and Molecular Genetics

The building blocks of proteins
• Proteins often mediate the essential structure and function of cells, maintaining the integrity of all molecular and biological functions.
• A protein sequence is built from 20 different naturally occurring amino acids that serve as the building blocks of proteins.
• Each amino acid contains a central alpha carbon (Cα) that is attached to an amino group (NH2), a hydrogen atom (H), and a carboxyl group (COOH).
• Amino acids differ from one another in their side chain, which is represented by the R group attached to Cα.
• Glycine is an exception, with only a single H atom as its side chain.
• The peptide bond is formed through a condensation reaction that eliminates water between the carboxyl group (COOH) of one amino acid and the amino group (NH2) of the next.

Hierarchical Representation of Proteins
• Protein molecules and the complexity of their arrangement are conventionally described by four levels of structure: primary, secondary, tertiary, and quaternary.
• Primary structure
• The linear sequence of amino acids in the protein is referred to as its primary structure, which includes all the covalent bonds between the amino acids.
• Proteins are linear polymers of the 20 different amino acids, joined by peptide bonds between consecutive amino acids.

• Secondary structure
• The secondary structure of a protein refers to regular spatial arrangements of adjacent amino acids of the polypeptide chain.
• The major secondary structural elements identified in protein structures are the alpha (α) helix and beta (β) sheets.
• Helices and sheets are the only regular secondary structural components present in proteins, because H-bond interactions between the backbone atoms of the amino acids let them adopt highly favorable and stable conformations.
• Irregular structural components such as turns and loops are also observed in proteins, mainly in globular proteins.

• Secondary structure
• The α-helical conformation is generated by curving the polypeptide backbone to produce a regular coil.
• In this helical structure, the backbone of the polypeptide chain is coiled around the axis of the molecule in such a way that the side-chain R groups of the residues project outward from the helical backbone.
• In the α-helix, H-bonds are formed between nearby residues within a single chain.
• β-sheets are formed by H-bonds between adjacent stretches of polypeptide backbone.
• These sections of adjacent polypeptide chains are known as β-strands.
• The β-sheets comprise H-bonds formed between carbonyl oxygens and amide hydrogens on adjacent β-strands.
• The H-bonds are almost perpendicular to the extended β-strands.

• Tertiary structure
• The tertiary structure (3°) of a protein is the overall spatial 3D arrangement of all the amino acids in a polypeptide chain.
• It describes a molecule consisting of one polypeptide chain that is folded into a unique conformation.
• Quaternary structure
• The tertiary structure describes the organization of a single polypeptide chain.
• The quaternary structure is an association of two or more independently folded polypeptides within the protein through noncovalent interactions.
• Most proteins do not function as a monomer but rather as multi-subunit proteins.
• The stabilizing forces and interactions in the quaternary structure are of the same types as those observed in secondary and tertiary structure stabilization.

Protein Structure Predictions
• The goal of protein structure prediction is to elucidate a structure from its primary sequence, with accuracy comparable to results achieved experimentally using X-ray crystallography.
• Why do we need to predict the structures of proteins?
• The answer lies in the fact that protein structural attributes lead to biological functions, and computational prediction methods are the only option when experimental techniques fail.
• Predicting the 3D structure from the primary structure is a much-debated area of structural bioinformatics.
• Critical Assessment of Techniques for Protein Structure Prediction (CASP) provides a global benchmark for computational prediction methods.
• Homology modeling methods are also known as comparative modeling.

Homology Modeling
• Comparative modeling rests on the observation that when the amino acid sequence of the query is homologous to that of one or more experimentally known structures, the resulting structural fold will be similar.
• If the percentage identity between the query sequence and the known structures falls in the “safe” zone, the two sequences may have a practically identical fold.
• This “safe” zone requires at least 30–50% identical amino acids in an optimal sequence alignment.
• The process starts with a sequence similarity search for the target sequence against proteins of known structure, using BLAST and the PDB structure database (www.rcsb.org).
• Swiss Model, Phyre, and MODELLER are currently widely used free homology modeling tools.
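The “safe” zone criterion above can be illustrated with a short Python sketch. It is not taken from BLAST, Swiss Model, or any other tool named here. The function names `percent_identity` and `in_safe_zone`, the toy aligned sequences, and the choice to count identity only over columns where neither sequence has a gap are assumptions made for illustration (other definitions of percent identity use a different denominator). The 30% default is the lower bound of the range quoted above.

```python
def percent_identity(aligned_query: str, aligned_template: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs are rows of an existing pairwise alignment, with '-' for gaps.
    The denominator used here is an assumption; conventions differ between tools.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    columns = [
        (q, t)
        for q, t in zip(aligned_query.upper(), aligned_template.upper())
        if q != "-" and t != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for q, t in columns if q == t)
    return 100.0 * matches / len(columns)


def in_safe_zone(identity_pct: float, threshold_pct: float = 30.0) -> bool:
    """True if the identity reaches the lower bound of the 30-50% safe zone."""
    return identity_pct >= threshold_pct


# Toy alignment, invented for illustration only.
query_row = "MKT-AYIAKQR"
template_row = "MKTLAYLAK-R"

identity = percent_identity(query_row, template_row)
print(f"identity = {identity:.1f}%, safe zone: {in_safe_zone(identity)}")
```

In practice the aligned rows would come from the alignment produced during the template search, and a template below the threshold would call for more caution rather than automatic rejection.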
• The Swiss Model server (http://swissmodel.expasy.org/) uses multiple templates to create an optimum backbone, compensating for missing residues that are not aligned when a single template is selected.

Protein Structure Validations
• Proteins are the workhorses of all the biological processes in an organism, and the key to their functions is the 3D structure and dynamics of the protein.
• To get a better understanding of these functions, we need a correctly predicted native fold for the protein model.
• Different packages and online servers are available for model quality assessment.
• Ramachandran Plot
• In a polypeptide chain, the likely conformations are quite restricted because rotational freedom at the φ (Cα−N) and ψ (Cα−C) angles is limited by steric hindrance between the peptide backbone and the side chains of the residues.
• The Ramachandran plot maps the entire conformational space for a polypeptide (a plot of ψ vs φ) and illustrates the allowed and disallowed regions of this conformational space.

• Ramachandran Plot
• One can check the Ramachandran statistics to assess which residues of the protein model fall in allowed and disallowed regions, and select those folds in which more than 90% of residues fall in the allowed region.
• As a rule of thumb, the >90% allowed-region criterion should be followed, or at least the residues critical for the function of the protein or residues in the active site should be in the allowed region.
• CASP: Benchmarking Validation Test
• Combining multiple methods, the “metaserver” approach, gives more reliable predictions than individual methods.
• An ideal way to select the best prediction server is to model the protein structure using the validation benchmark.
• CASP, the Critical Assessment of Protein Structure Prediction, is a method for model quality assessment and validation.
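The Ramachandran rule of thumb from the slides above can be sketched in Python as follows. This sketch assumes that each residue has already been labelled "allowed" or "disallowed" by a validation tool. It does not compute φ/ψ angles or define the region boundaries itself. The function name `ramachandran_check`, the label strings, and the example residue numbers are hypothetical.

```python
def ramachandran_check(region_by_residue, critical_residues=(), min_allowed_pct=90.0):
    """Summarize Ramachandran labels for one model.

    region_by_residue: dict mapping residue number -> "allowed" or "disallowed"
                       (labels assumed to come from an external validation tool).
    critical_residues: residue numbers important for function or the active site.
    Returns (percent allowed, passes the >90% rule, all critical residues allowed).
    """
    if not region_by_residue:
        raise ValueError("no residues supplied")
    allowed = sum(1 for label in region_by_residue.values() if label == "allowed")
    allowed_pct = 100.0 * allowed / len(region_by_residue)
    # The slides ask for MORE than 90% of residues in the allowed region.
    passes_rule = allowed_pct > min_allowed_pct
    # A critical residue missing from the labels counts as not allowed.
    critical_ok = all(region_by_residue.get(r) == "allowed" for r in critical_residues)
    return allowed_pct, passes_rule, critical_ok


# Invented example: 20 residues, residue 7 falls in a disallowed region.
labels = {i: "allowed" for i in range(1, 21)}
labels[7] = "disallowed"

allowed_pct, passes_rule, critical_ok = ramachandran_check(labels, critical_residues=(5, 12))
print(allowed_pct, passes_rule, critical_ok)
```

Reporting the critical-residue check separately mirrors the fallback in the rule of thumb: a model that misses the 90% mark may still be usable if the functionally important residues are in the allowed region.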
• CASP serves as a benchmark for all the available servers and programs.

• Swiss Model Validation Server
• The Swiss Model output page describes model quality in two ways: GMQE (Global Model Quality Estimation) and QMEAN.
• GMQE is a model quality estimate based on the target-template alignment and the template identification method.
• The GMQE score lies between 0 and 1, where a number close to 1 indicates higher reliability of the predicted model.
• The QMEAN score (Qualitative Model Energy ANalysis) is an estimator based on geometrical properties that provides both global and local scores.
• The QMEAN scores are transformed into a Z-score whose value indicates how the model compares with experimentally determined X-ray structures.
• A QMEAN Z-score around 0 indicates good agreement between the model structure and known experimental structures of related attributes.
• This QMEAN Z-score is an estimation of the “degree of
• The prediction where the Z-score is
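The two readings described above (GMQE closer to 1 is better, QMEAN Z-score closer to 0 is better) can be combined in a small Python sketch for comparing candidate models. This is not how Swiss Model itself ranks models. The function name `rank_models`, the model names and scores, and the choice to sort by GMQE first and use the Z-score only to break ties are assumptions made for illustration. No Z-score cutoff is applied, since none is stated above.

```python
def rank_models(models):
    """Order candidate models from most to least promising.

    models: list of (name, gmqe, qmean_z) tuples.
    Sort key (an assumption of this sketch): higher GMQE first,
    then QMEAN Z-score closest to 0.
    """
    for name, gmqe, _ in models:
        # GMQE is defined on the interval [0, 1].
        if not 0.0 <= gmqe <= 1.0:
            raise ValueError(f"GMQE for {name} must lie between 0 and 1")
    return sorted(models, key=lambda m: (-m[1], abs(m[2])))


# Invented scores for three hypothetical models of the same target.
candidates = [
    ("model_A", 0.62, -3.1),
    ("model_B", 0.78, -0.4),
    ("model_C", 0.78, -1.9),
]

ranked = rank_models(candidates)
for name, gmqe, qmean_z in ranked:
    print(f"{name}: GMQE={gmqe:.2f}, QMEAN Z={qmean_z:+.1f}")
```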
