2024 BIOL391 Research - #3 Model (Ch9-11) PDF
Burman University
Summary
This document discusses the philosophy of experimentation, focusing on the question/model-building framework and inductive reasoning. It examines how scientists ask questions, gather data, and build models. Furthermore, it explores practical examples like using Google Maps to find the fastest route.
Full Transcript
9/22/24

Philosophy of Experimentation
Ch. 9 - 11, Glass (2014)
BIOL 391 – Intro to Biological Research
Burman, Fall 2024

Chapter 9
The Question and the Model. Forming an Inductive Framework for Scientific Projects (e.g., by getting to Carnegie Hall).
We gather useful information by asking questions.

Scientists also Question.
When they do not have enough information.
1. Ask a question and then, after performing an experiment, they determine an answer.
2. They also seek to know if this answer is “stable”, i.e., whether it will be repeatedly similar in the future.

Carnegie Hall

How do I get to Carnegie Hall? The fastest route?
Ask the first New Yorker you encounter.
– Route given may be limited by the New Yorker’s knowledge of the possible routes.
– The New Yorker’s prior experience may determine his/her ability to provide this information.
This limitation of individual experience was one of the early criticisms of inductive reasoning.
– A small number of prior observations is an inaccurate basis by which to determine probability.

How do I get to Carnegie Hall? The fastest route?
If the New Yorker is “new” to this city, then there is doubt cast on the accuracy of his/her advice.
– This doubt causes skepticism about the validity of his/her advice.
– As a skeptic, you discard his/her advice.
What next?
– Ask another, older New Yorker, one that has lived all of his/her life in New York.
– Google Maps!

Google Maps, what is the fastest route to Carnegie Hall from Columbia University?

Google Maps – a Large Data Set
The ability to access very large, comprehensive samples significantly reduces the problem of limited experience,
– thereby minimizing the chance that the data set is unrepresentative
– and/or that the probability derived from a sample size is inaccurate.

Google Maps – the fastest route
Computer program
– Measures the distance between the two points (i.e., Columbia University and Carnegie Hall) based on various possible routes, and prior inputted data regarding the speed it took to traverse various parts of potential routes.
– May include current/contemporaneous information such as traffic alerts, and eliminating impassable routes.

Google Maps – query output
A calculation is made and an output produced.
Two possible interpretations:
1. Output is a determination of routes that were determined to be fastest, based on parameters defined by Google Maps (output is backward looking, based on prior data).
2. Output is a prediction of what the fastest route will be for the tourist, based on the parameters defined by Google Maps, and a calculated probability that the recommended route will be the fastest.

Google Maps – inductive output
The 2nd output is desired.
– Predictive with a high probability that the prediction is accurate.
– The higher the probability, the better the model.
– Inductive output with a statement of what will happen in the future based on prior data.

Google Maps – testing the model
“Does the probability derived from the model accurately predict my future experience?”
Subject model to experimentation to validate the model.
– Travel suggested fastest route versus alternate route, timing the trips.
– If the Google Maps prediction is found to be consistently accurate, the program’s methodology would be validated.

Google Maps – testing the model
Subject model to experimentation to validate the model.
– If inaccurate, the program’s methodology would need to be revised/reprogrammed.
– If we add more data, the current model would be more accurate.
– After making adjustments to the model after further experimentation, we would repeat the validation experiment until we had an accurate model – one that predicts the fastest route.

Inductive Model
A research project is designed to answer a question.
A data set is gathered, and a potential answer is induced, using the data set.
The answer is validated according to its ability to predict the future, which is a further act of induction.
Accuracy means that the model should be able to give the probability that a conclusion is best and that we can confirm this probability.

Hypothesis-falsification (critical rationalism framework).
1. Decide on an experimental project.
2. Make a hypothesis.
3. Subject the hypothesis to falsification.
4. Get a result.
5. Determine if the result is reproducible by repeating the hypothesis-test in Step 3.

Question/Model-validation Framework.
1. Decide on an experimental project.
2. Ask a question.
3. Get an answer.
4. Use the answer to build a model.
5. Determine whether the model is accurate by asking whether the model predicts the answer to the question asked in Step 2, when it is asked again.
6. Modify the model based on data obtained in Step 5.
7. Reevaluate the modified model to see if it is better at predicting the answer when the question from Step 2 is asked again.

Chapter “take home”.
Question – Inductive approach – Model-validation framework.
Large data set significantly reduces the problem of limited experience, minimizing the chance that the data set is unrepresentative and that the probability derived from a sample size is inaccurate.
Google Maps – Awesome!

Chapter 10
Advantages of the Question/Model-Building Inductive Framework.

Question as a framework.
The question has a different grammatical structure from the conclusion and thus reminds the scientist that s/he does not know the answer in advance of experimentation.

Question as a framework.
In the case of the question approach, there is a clear difference between the state of not knowing (because the experiment has not yet been performed and there is a lack of prior knowledge) and the state of knowing (because an experiment has been performed to address the lack of prior knowledge).
The question framework forces the scientist to confront the fact that, before experimentation, the result of the experiment is unknown – the question emphasizes the scientist is operating in question mode.

Question as a framework.
Hypothesis falsification framework
– Untested hypothesis: The sky is red.
– Conclusion after the experiment: The sky is [result of experiment].
Question/Model framework
– Question: Is the sky red?
– Conclusion after experimentation: Yes, the sky is red. No, the sky is not red. The sky is red at dawn and at dusk.

Psychological component of hypothesis making.
To stoke a scientist’s ego.
– The experiment confirms the scientist’s brilliance, because the outcome has already been deduced, due to the scientist’s insight.
In the question mode, the scientist has to go in the opposite direction – admitting that s/he does not know and therefore needs to ask for information from an experiment.

Avoiding positive data bias.
The question/model-verification inductive framework does not create a bias for positive data, because models derived from all data/observations can be modified rather than simply rejected by falsification.
Both the open-ended question performed in advance of any data collection and the model tested after the initial data set is validated have the virtue that they do not create any tension between the scientist and the result.

Positive filter – Fig 1.
Hypothesis may induce scientist to filter for “the positive response”.
In the desire to prove a hypothesis correct, as opposed to falsifying it, the scientist will wait until the hypothesis “It is dark outside” is verified, even if it did not describe the situation for which it was originally constructed.

What color is the sky?
No need to filter the answer through a preestablished hypothesis.
Any data is successful.
Model: The sky is blue (determined from initial experiment).
– When tested for validation, the scientist will achieve validation at certain times (during the day; no clouds in the sky) but not always (during the night or at dawn or sunset).
– Failure to validate does not cause rejection of model, but rather the need for modification and further testing.

Why is the Model-verification Framework not being used more broadly?
Hypothesis framework allows scientists to focus on a limited scope – e.g., Yes-No answer.
Limited resources may warrant a hypothetical-falsification framework.
Systems biology approach, with a broad open-ended question, may require equipment/reagents that the scientist does not possess.
Could still be useful to use questions, in an inductive mode, when experiments that can be performed are limited.

The A Priori Model
Def’n: relating to what can be known through an understanding of how certain things work rather than by observation ("knowing without experience").
If data already exist when a project is being started, this prior data can be immediately tested for its probability of predicting future data by the construction of an “a priori model”.
A priori model --> data --> modification of model --> model validation (vs question --> data --> model --> model validation).

Chapter 11
A biological example of the question/model-building framework.
“What is the function of MuRF1?”

What is the function of MuRF1?
Scientist wants to understand the function of a protein called MuRF1.
– MuRF1 is a protein (Fig. 1).
– No other information about type of protein, or function is implied and scientist is open to any possibility (entire set of proteins equally available as context for MuRF1 function).
– This situation is referred to as the “privileged state”.

Does MuRF1 resemble any proteins of known function?
By comparing MuRF1 to other proteins whose functions are known, the scientist is attempting to access relevant data on related subjects to gain an understanding of the current subject (i.e., MuRF1).
Contextualizing a specific protein MuRF1 to the broad group of all proteins.
This is called accessing the “inductive space”, or prior information relevant to the project.

Open-ended vs Close-ended Questions
Open-ended: What is the function of MuRF1?
Close-ended: Does MuRF1 do X?
The close-ended question does not allow for the unbiased question because the jump to X without any background data biases the scientist to start the project by examining X, and may bias the scientist to finish the project with this examination, because the only requirement is to study X.

Literature
Inductive space is the prior knowledge available in the literature.
By going back to relevant published literature and experimental findings, the scientist is asking what might be generalized from the past so that a relevant question or prior model might be applied to the present unknown protein.
Even if the induced model derived from the known subject turns out wrong when applied to the new subject, discovering that will be helpful.

Use of Prior knowledge
The hypothesis-falsification framework uses prior knowledge to frame a hypothesis.
The question/answer process in the inductive model uses the inductive space as a basis on which to ask a question, not to deduce the answer.
The difference between the question/answer methodology and the hypothesis-falsification framework is that the former method uses prior knowledge as a basis on which to ask a question about the unknown rather than draw a conclusion about that unknown.

Inductive space informs the experimental questions.
Often scientists are not starting from the “true” beginning but are operating in a background of accumulated knowledge that is used inductively.
e.g., Fig. 3. The unknown subject vs the alien subject.
1. What is the function of an elephant’s trunk?
2. What is the function of a trunk-like structure on an alien from outer space?

Inductive reasoning
Past findings --> questions --> answer questions to obtain data --> model for future prediction/testing.
It took several decades to gather information from answering broad, descriptive questions to define the initial inductive space for proteins.
– For a protein, this starts with structural information (i.e., composition/structure of protein) and proceeds to a mechanistic understanding of how the protein functions (i.e., function and structure-function studies).

Fig. 3
With the elephant, one does not need to do an extensive study on the nature of the elephant, because that is known.
– Elephant has a heart, blood, mouth, nasal passages, etc.
– The location of the trunk close to the mouth and nasal passages may lead one to ask if it functions as a nose.
– Induction is used to ask the question to be researched.
However, in the 2nd question, one needs to do all the work on the elephant and other animals before it can be answered.
– A model cannot be built until many more basic questions are answered to give a knowledge base as to what the trunk-like structure might do.

Fig. 2
Once the broad set of proteins has been subdivided into different categories of proteins by understanding structure/function relationships, a more focused question can be asked.
Prior information is used inductively to determine where MuRF1 fits in the broader scheme of proteins.

More focused question: What type of protein is MuRF1?
“What is the function of MuRF1?” seeks to categorize MuRF1 in the context of previously studied proteins.
Alternately, “Does MuRF1 resemble proteins of known function?”
– Bioinformatics query using BLAST (to compare MuRF1 sequences to other protein sequences).
– Hundreds of proteins that share a domain with MuRF1.
– Proteins belong to class of proteins called “E3 ubiquitin ligases”.

Does MuRF1 resemble proteins of known function?
Yes, MuRF1 resembles certain E3 ubiquitin ligases.
This realization allows the scientist to make faster progress, because s/he can focus on the smaller subset of proteins.
Further detailed structural questions can be addressed about MuRF1.
– X-ray diffraction/crystallography
– Studies to compare structural features in related proteins.

Initial structural questions to functional analysis.
After obtaining the structural information, the next step is to determine whether MuRF1 actually functions as an E3 ubiquitin ligase.
“Does MuRF1 function as an E3 ubiquitin ligase?”
– Binary answer “yes” or “no” (typical of a hypothesis framework that can introduce bias).
– However, we are working in a greater framework “What is the function of MuRF1?”
Therefore, determining whether MuRF1 functions as an E3 ubiquitin ligase should not impede further explorations in other areas, as it is allowed in the model-verification framework.
The answer “no” to “Does MuRF1 function as an E3 ubiquitin ligase?” is a possibility and highly acceptable under this framework.

Build the model
From the “inductive space” information, many E3 ubiquitin ligases, in addition to ubiquitinating other molecules, can also ubiquitinate themselves.
– Useful information as the substrates have not been defined.
– Maybe a self-ubiquitination assay is performed.
A model is built to accommodate the results.

Fig. 4
Once it is determined that MuRF1 belongs to the subcategory of proteins called “E3 ubiquitin ligases”, a much smaller set of proteins is deemed to be relevant “inductive space” – i.e., E3 ubiquitin ligases.
Smaller inductive spaces are opportunities for rapid experimental progress and potential filters that may cause the scientist to miss important findings.

Model
Conclusions --> model:
1. MuRF1 shares homology with other known E3 ubiquitin ligases.
2. MuRF1 functions as an E3 ubiquitin ligase.
Each new conclusion allows the scientist to focus on particular inductive space that allows for new questions to be asked.

Fig. 5
As one refines the inductive space, the amount of information that needs to be immediately accessed becomes smaller and thus is more accessible.
Once established that MuRF1 belongs to the set of E3 ligases, data on other E3 ligases immediately become available to aid in the work on MuRF1 (e.g., from other known E3 ligases - MDM2 and Skp2).

More focused inductive space.
Allows questions to be more specific and focused as model is constructed from more experimentations:
1. Framework question – “What is the function of MuRF1?”
2. Subset Questions:
- “Does MuRF1 resemble other proteins of known function?”
- “Does MuRF1 function as an E3 ubiquitin ligase?”
- Which leads to the more pertinent question “What are the protein substrates of MuRF1?”

Potential problem:
Focusing the model can also create a problem for the question model-building method.
– Once MuRF1’s function was determined to be an E3 ubiquitin ligase (i.e., the model), one might forget to determine other functions of MuRF1 that are distinct from ligase activity.
The importance of the framework question “What are the functions of MuRF1?” may prevent the scientist from allowing the model to act as a filter.

Question/Model Building
1. Decide on an experimental project.
2. Ask a broad question that frames the experimental project. This question helps to define the “inductive space”, which includes the literature and prior data that can be used to ask questions concerning the issue at hand.
3. Ask a subset question to frame a specific experiment that will generate data to be used to answer the question in Step 2.
4. Get an answer to subset question.
5. Determine whether the answer is accurate by asking how well (i.e., probability) the derived answer can predict the same answer to the question when it is asked again.
6. Use the answer to build the model; in this case, a model of how MuRF1 might function in the future.
7. Retest this model in particular settings and refine the model based on the results.
8. Ask a new question querying the predictive power of the model.
9. Based on the data derived from answering this question, either validate the model or, if it is found wanting in some respect, refine it.

Summary
Queries are used to generate useful data in an experimental project; the data are used to build a model, and the model is tested for its predictive ability.
The critical feature is that this approach is an explicitly inductive, iterative method that allows for falsification without rejection and recognizes that falsification or lack of predictive ability forces limits or modifications to be placed on the model.
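
Worked sketch (not from Glass or the slides): the Google Maps "testing the model" loop from Chapter 9 can be written out as a few lines of Python. This is only an illustration under stated assumptions. The class name RouteModel, the function validate, the route labels "route A" and "route B", and all trip times are hypothetical. The "model" here is simply the route that has most often been fastest so far, with the fraction of past wins standing in for the probability the slides describe. A failed prediction does not reject the model; the new observation is added and the model is modified.

```python
from collections import Counter


class RouteModel:
    """Induced model: which route has most often been fastest so far (hypothetical)."""

    def __init__(self, prior_winners):
        # prior_winners: names of the routes that were fastest on earlier trips
        self.observations = list(prior_winners)

    def predict(self):
        """Return (route, probability): the most frequent past winner and its share of wins."""
        if not self.observations:
            raise ValueError("no prior data: ask the question first")
        route, wins = Counter(self.observations).most_common(1)[0]
        return route, wins / len(self.observations)

    def update(self, fastest_route):
        """Modify the model with a new observation instead of rejecting it."""
        self.observations.append(fastest_route)


def validate(model, timed_trips):
    """Time each trip, compare against the prediction, then fold the result back in.

    timed_trips: list of dicts mapping route name to minutes taken on that trip.
    Returns the fraction of trips on which the prediction was correct.
    """
    hits = 0
    for times in timed_trips:
        predicted, _ = model.predict()
        actual = min(times, key=times.get)  # route with the shortest time
        if predicted == actual:
            hits += 1
        model.update(actual)  # falsification without rejection
    return hits / len(timed_trips)


# Illustrative data only: three prior observations, then three timed trips.
model = RouteModel(["route A", "route A", "route B"])
trips = [
    {"route A": 25, "route B": 30},
    {"route A": 35, "route B": 28},
    {"route A": 24, "route B": 29},
]
hit_rate = validate(model, trips)
print(hit_rate, model.predict())
```

With only three prior observations the stated probability rests on very little experience, which is the small-sample criticism raised in Chapter 9; each validation round enlarges the data set that the next prediction is induced from.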