Bio16 Computational Biology: Predictive Modelling using Response Surface Method PDF

Summary

This document provides an overview of predictive modeling using response surface methodology (RSM) in computational biology. It covers how to choose an experimental design, the RSM workflow (screening, characterization, optimization, verification), the central composite and Box-Behnken designs, and the analysis and diagnostics of the fitted model, with a sample problem set up in Design Expert.

Full Transcript


Bio16 Computational Biology
Predictive Modelling using Response Surface Method
Prepared by: Joseph Martin Q. Paet, Biology Department, College of Science, Bicol University

Choosing Your Experimental Design

The choice depends on the objective of the experiment.

| Number of Factors | Comparative | Screening | Response Surface |
|---|---|---|---|
| 1 | 1-factor completely randomized design | _ | _ |
| 2-4 | Randomized block design | Full or fractional factorial | Central composite or Box-Behnken |
| 5 or more | Randomized block design | Fractional factorial or Plackett-Burman | Screen first to reduce number of factors |

National Institute of Standards and Technology. (n.d.). How do you select an experimental design? https://www.itl.nist.gov/div898/handbook/pri/section3/pri33.htm

Response Surface Methodology (RSM)

- A collection of mathematical and statistical methods for modeling and analyzing a process in which the response of interest is affected by several variables.
- The main goal is to optimize the process.

Aydar, A.Y. (2018). Utilization of Response Surface Methodology in Optimization of Extraction of Plant Materials. http://dx.doi.org/10.5772/intechopen.73690

Considerations in RSM

- Requires a quantitative response affected by continuous factors.
- Works best with only a handful of critical factors (hence screening first).
- Produces an empirical polynomial model which approximates the true response.
- Seeks the optimal settings of the factors (to maximize, minimize, or stabilize the response).

[Figure: Levels of Measurement]

Myers, R. H., Montgomery, D. C., and Anderson-Cook, C. M. (2009). Response surface methodology: Process and product optimization using designed experiments (3rd ed.). New York, NY: Wiley.

RSM Workflow

[Flowchart. Screening: unknown factors pass through screening, which separates the trivial from the vital; known factors skip it. Characterization: factor effects and interactions, followed by a curvature check (yes/no). Optimization: RSM. Verification: confirm? (no: back up; yes: Eureka!)]

Source: Myers et al. (2009).

RSM Workflow: Screening

Example: anticancer compound from fungi.
Factors: time of incubation ($x_1$) and temperature of incubation ($x_2$). Response: yield ($y$).

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + e$

Here $\beta_0$ is the intercept, $\beta_1 x_1$ the time effect, $\beta_2 x_2$ the temperature effect, $\beta_{12} x_1 x_2$ the time + temperature interaction, and $e$ the error.

[Figure: three plots of Yield against Time, Temp, and Time+Temp]

Source: Myers et al. (2009).

RSM Workflow: Characterization

First-order model with interaction:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + e$

First-order model, main effects only:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + e$

[Figure: contours of constant response, showing the current operating conditions, the path of improvement, and the region of the optimum]

RSM Workflow: Optimization

First-order model:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + e$

Second-order model:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + e$

The terms $\beta_1 x_1 + \beta_2 x_2$ are linear, $\beta_{12} x_1 x_2$ is the interaction, and $\beta_{11} x_1^2 + \beta_{22} x_2^2$ are quadratic.

[Figure: contours of constant response, showing the current operating conditions, the path of improvement, and the region of the optimum]

Full Factorial Design (FFD)

- An experimental design that consists of two or more factors, with each factor having multiple discrete possible values or "levels".
- Number of runs $= n^k$, for $k$ factors at $n$ levels each.
- $2^k$ factorial: 2 levels per factor, at high (+) or low (-) values.

Consider an experiment with 2 factors at 3 levels each. How many trials would an FFD have if the set-up is done in triplicate? At 3 factors? At 4 levels?

- 3 levels, 2 factors, 3 replicates: $3^2 \times 3 = 27$ runs
- 3 levels, 3 factors, 3 replicates: $3^3 \times 3 = 81$ runs!
- 4 levels, 3 factors, 3 replicates: $4^3 \times 3 = 192$ runs!!

Source: Myers et al. (2009).

FFD vs RSM Designs: CCD and BBD

[Figure: design-point layouts for the Full Factorial Design, the Central Composite Design, and the Box-Behnken Design]

Source: Myers et al. (2009).

Central Composite Design (CCD)

- Contains an embedded factorial or fractional factorial design with center points, augmented with a group of "star points" that allow estimation of curvature.
- If the distance from the center of the design space to a factorial point is ±1 unit for each factor, the distance from the center of the design space to a star point is $|\alpha| > 1$.
- Five levels are preferred for each factor.

[Figure: Central Composite Design. An array with all the corner points (red dots), the center points in the center (blue dot), and extra "star points" circumscribed from the sides (green dots).]

Source: Myers et al. (2009).

Box-Behnken Design (BBD)

- An independent quadratic design, in that it does not contain an embedded factorial or fractional factorial design.
- The treatment combinations are at the midpoints of the edges of the process space and at the center.
- These designs are rotatable (or nearly rotatable) and require 3 levels of each factor.

[Figure: Box-Behnken Design. The design points are at the midpoints of the edges (red dots), with the center point in the center (blue dot).]

Source: Myers et al. (2009).
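The run counts and the central composite layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular package: the function names are hypothetical, and the default star distance $\alpha = (2^k)^{1/4}$ is the usual rotatable choice, which is an assumption here (the slides only require $|\alpha| > 1$).

```python
import itertools


def full_factorial_runs(levels, factors, replicates=1):
    """Runs in a full factorial design: levels**factors, times the replicates."""
    return levels ** factors * replicates


def central_composite(k, n_center=1, alpha=None):
    """Coded design points of a CCD for k factors.

    Corner (factorial) points sit at +/-1, star points at +/-alpha on each
    axis, and the center point is repeated n_center times.
    """
    if alpha is None:
        # Assumption: rotatable design, alpha = (number of corner points)**(1/4).
        alpha = (2 ** k) ** 0.25
    corners = [list(p) for p in itertools.product((-1.0, 1.0), repeat=k)]
    stars = []
    for axis in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[axis] = sign * alpha
            stars.append(point)
    centers = [[0.0] * k for _ in range(n_center)]
    return corners + stars + centers


def decode(point, centers, half_ranges):
    """Map coded units back to actual units: actual = center + coded * half_range."""
    return [c + x * h for x, c, h in zip(point, centers, half_ranges)]


if __name__ == "__main__":
    # Two factors, five center points: 4 corners + 4 stars + 5 centers.
    design = central_composite(2, n_center=5)
    for coded_point in design:
        # Example decoding with time centered at 85 (+/-5) and temperature at 175 (+/-5).
        print(coded_point, decode(coded_point, centers=(85, 175), half_ranges=(5, 5)))
```

For two factors this gives 13 runs, against 27 for a triplicated 3-level full factorial, which is the economy that motivates the RSM designs.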
RSM Designs: CCD vs BBD

| Feature | Central Composite Design | Box-Behnken Design |
|---|---|---|
| Corner points | Extreme | No extreme |
| Levels | 5 | 3 |
| Number of tests | More | Fewer |
| Star points | + | - |
| Rotatable | Yes | Yes |

Source: Myers et al. (2009).

Sample Problems using Design Expert

Problem: Optimize the extraction process for metabolites
Design: CCD
Number of Factors: 2
Number of Runs: 13
Factorial Points Replicates: 1
Star Points Replicates: 1
Center Points: 5
X1 = Time (min): -1 is 80, +1 is 90
X2 = Temperature (°F): -1 is 170, +1 is 180
Response = Yield (µg/mL)

| Std Order | Run Order | Time | Temp | Yield |
|---|---|---|---|---|
| 1 | 12 | 80 | 170 | 76 |
| 2 | 9 | 90 | 170 | 78 |
| 3 | 7 | 80 | 180 | 76 |
| 4 | 8 | 90 | 180 | 78 |
| 5 | 6 | 77.9 | 175 | 76 |
| 6 | 3 | 92.1 | 175 | 78 |
| 7 | 2 | 85 | 167.9 | 77 |
| 8 | 11 | 85 | 182.1 | 78 |
| 9 | 4 | 85 | 175 | 80 |
| 10 | 10 | 85 | 175 | 81 |
| 11 | 5 | 85 | 175 | 80 |
| 12 | 1 | 85 | 175 | 80 |
| 13 | 13 | 85 | 175 | 79 |

Analysis Procedure

1. Configure/Transform: Start with no transformation.
2. Fit Summary: Comparative statistics on polynomial models.
3. Model: Choose the best (suggested) model for in-depth analysis. Return here for model reduction.
4. ANOVA: Check the model, lack-of-fit values, and R-squared values.
5. Diagnostics: Examine diagnostic graphs to validate the model.
6. Model Graphs: If the model adequately represents the response, generate contour and 3D plots.
7. Confirmation: Verify the model predictions with confirmation runs.

Source: Myers et al. (2009).

Fit Summary Guidelines

| Terminology | Limit |
|---|---|
| Coefficient of determination, $R^2$ | Near or close to 1 |
| $R^2_{adjusted}$ | Near or close to 1 |
| $R^2_{predicted}$ | $R^2_{adjusted} - R^2_{predicted}$ is less than 0.2 |
| Lack of fit | p-value greater than 0.05 (not significant) |
| Model | p-value less than 0.05 (significant) |

Source: Myers et al. (2009).

Lack of Fit Test

- Linear model: significant lack of fit.
- Quadratic model: insignificant lack of fit.

The test compares the variation between the actual data and the predicted values with the variation among the replicates.

Diagnostics

[Diagram: Data (observed values $y_i$; signal + noise) → Analysis (filter the signal) → Model (predicted values $\hat{y}_i$; signal). Residuals (observed - predicted), $e_i = y_i - \hat{y}_i$, carry the noise.]

Source: Myers et al. (2009).

Diagnostics

- Normal Plot of Residuals: checks whether the residuals follow a normal distribution; they should fall approximately along a straight line.
- Residuals vs. Predicted: plots the residuals against the values predicted by the model; points should be randomly scattered around zero.
- Residuals vs. Run: identifies trends or patterns in the residuals over run order; there should be no systematic pattern.
- Predicted vs. Actual: compares predicted values with observed values; points should fall along a diagonal line.
- Box-Cox Plot: assesses the need for a power transformation of the response; a lambda of 1 suggests no transformation is needed.

Source: Myers et al. (2009).

Features of a Good DOE using RSM

- Provides a reasonable distribution of data points throughout the region of interest.
- Allows testing of model adequacy (lack of fit).
- Allows experiments to be performed in blocks.
- Allows higher-order designs to be built up sequentially.
- Provides an internal estimate of the error.
- Does not require a large number of runs.
- Provides reasonable robustness against outliers or missing values.

All statistical analysis should be guided by subject matter knowledge.

Source: Myers et al. (2009).

References

Aydar, A.Y. (2018). Utilization of Response Surface Methodology in Optimization of Extraction of Plant Materials. http://dx.doi.org/10.5772/intechopen.73690

Myers, R. H., Montgomery, D. C., and Anderson-Cook, C. M. (2009). Response surface methodology: Process and product optimization using designed experiments (3rd ed.). New York, NY: Wiley.

National Institute of Standards and Technology. (n.d.). How do you select an experimental design? https://www.itl.nist.gov/div898/handbook/pri/section3/pri33.htm
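Worked sketch for the sample problem

The 13-run CCD from the sample problem can be fitted with the second-order model outside Design Expert as well. The sketch below is a rough stand-in, assuming ordinary least squares on coded units (time centered at 85 with half-range 5, temperature centered at 175 with half-range 5, taken from the -1/+1 settings on the slide). The helper names are hypothetical, the small linear solver is written out only to keep the example dependency-free, and the PRESS-based predicted R-squared is the standard textbook definition rather than something stated in the slides.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# (time in min, temperature, yield) in standard order, copied from the run table.
runs = [
    (80, 170, 76), (90, 170, 78), (80, 180, 76), (90, 180, 78),
    (77.9, 175, 76), (92.1, 175, 78), (85, 167.9, 77), (85, 182.1, 78),
    (85, 175, 80), (85, 175, 81), (85, 175, 80), (85, 175, 80), (85, 175, 79),
]


def model_row(time, temp):
    """Second-order model terms in coded units: 1, x1, x2, x1*x2, x1^2, x2^2."""
    x1 = (time - 85.0) / 5.0
    x2 = (temp - 175.0) / 5.0
    return [1.0, x1, x2, x1 * x2, x1 * x1, x2 * x2]


X = [model_row(t, temp) for t, temp, _ in runs]
y = [float(v) for _, _, v in runs]
n, p = len(X), len(X[0])

# Least squares through the normal equations (X'X) beta = X'y.
xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
beta = solve(xtx, xty)
b0, b1, b2, b12, b11, b22 = beta

# Residuals (observed - predicted) and the fit-summary statistics.
pred = [sum(b * v for b, v in zip(beta, row)) for row in X]
resid = [yi - pi for yi, pi in zip(y, pred)]
mean_y = sum(y) / n
ss_res = sum(e * e for e in resid)
ss_tot = sum((yi - mean_y) ** 2 for yi in y)
r2 = 1 - ss_res / ss_tot
r2_adj = 1 - (ss_res / (n - p)) / (ss_tot / (n - 1))

# Predicted R^2 from PRESS, using leverages h_i = x_i' (X'X)^-1 x_i.
leverage = [sum(v * z for v, z in zip(row, solve(xtx, row))) for row in X]
press = sum((e / (1 - h)) ** 2 for e, h in zip(resid, leverage))
r2_pred = 1 - press / ss_tot

# Stationary point: set both partial derivatives of the fitted surface to zero.
x1_s, x2_s = solve([[2 * b11, b12], [b12, 2 * b22]], [-b1, -b2])
time_opt = 85.0 + 5.0 * x1_s
temp_opt = 175.0 + 5.0 * x2_s

if __name__ == "__main__":
    print("coefficients:", [round(b, 3) for b in beta])
    print("R2, adjusted, predicted:", round(r2, 3), round(r2_adj, 3), round(r2_pred, 3))
    print("stationary point (time, temp):", round(time_opt, 1), round(temp_opt, 1))
```

Both quadratic coefficients come out negative for these data, so the stationary point is a maximum, and the residuals computed here are what the diagnostic plots would display. A confirmation run at the suggested settings would still be needed, as the analysis procedure notes.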
