PSYC 5120 - COMP: Statistics Practice Questions PDF

Summary

This document contains practice questions for PSYC 5120 - COMP, covering topics in statistics such as variable types, measures of central tendency, standard deviation, linear regression assumptions, and ANOVA. It includes questions on both descriptive and inferential statistics, research designs, and sampling methods. The questions are designed to test understanding of statistical concepts and their application in research.

Full Transcript

What are the 4 variable types? nominal, ordinal, interval, ratio

What are the three variable types that SPSS has? nominal, ordinal, and scale (ratio/interval)

interval: data measured on a scale along the whole of which intervals are equal (does not have a true zero); the change in scores is equal in each interval (ex: the distance between first and second place is the same as the distance between third and fourth place)

ordinal: data that tell us not only that things have occurred but also the order in which they occurred; tell us nothing about the differences between values (ex: gold, silver, and bronze medals)

nominal: when numbers merely represent names (ex: the number on a sports player's back); the numbers do not have any meaning except noting the type of player

ratio: must have a true zero; similar to interval, but the numbers should be meaningful relative to other numbers (ex: 2 is twice as much as 1)

Name and distinguish between the three measures of central tendency. mean, median, and mode

mean: a simple statistical model of the centre of a distribution of scores (a hypothetical estimate of the typical score)

median: the middle score of a set of ordered observations (when there is an even number of observations, it is the average of the two scores in the middle)

mode: the most frequently occurring score in a set of data

standard deviation: an estimate of the average variability of a set of data, measured in the same units of measurement as the original data

normal distribution: a probability distribution of a random variable that is known to have certain properties (ex: perfectly symmetrical, kurtosis of 3, etc.)

standard deviation: helps us visualize how unusual or average a data point is; it is a visual representation of normative scores and gives us a clear idea of where most of the data falls

Briefly describe how the z-score "normalizes" or "standardizes" a variable. z-scores help us take raw scores and transform them into more objective (standardized) scores that we can generalize from and use to get a better picture of the data

We use what to help us calculate the z-score? standard deviation

What tells us how many SD away from the mean our new score is? z-score

Distinguish between the standard deviation and the standard error (standard deviation of the mean).
- standard deviation quantifies the amount of variation or dispersion in a set of values
- standard error assesses how much the sample mean will vary from the actual population mean
- standard error is calculated by dividing the standard deviation by the square root of the sample size

Briefly describe the three general properties of a data set of ordered pairs (x,y) that are explained by the Pearson correlation coefficient, r.
- strength: how closely the data points fit a straight line (this lets us know how strong the correlation is vs. loose random dots)
- form: describes the type of relationship (linear relationship)
- direction: describes the relationship between the two variables (is it negative or positive?)

Briefly describe the 95% confidence interval of the mean.
- this basically says that we are 95% sure, with a 5% chance of error, that the population mean falls within the given confidence interval
- as the confidence level gets smaller, we are less confident, but we normally use 95%
- "I am 95% confident that the mean score falls between 60 and 65."

Explain the four main assumptions of linear regression.
- linearity
- homoskedasticity
- independence of errors
- normality of error distribution

homoskedasticity: the spread of residuals (errors) should be roughly the same across all levels of the independent variable

independence of errors: observations and their errors should be independent of one another

normality of error distribution: the residuals should be normally distributed

When is the best time to use a chi square goodness of fit?
- when both the IV and DV are nominal
- when the research question is whether the distribution of the DV is similar to or different from a known distribution

When should you use t-test: paired and independent samples?
- this test should be used when we have a small sample size and want to compare the means of two groups or conditions
- independent t-tests: used when observations are assumed to come from different entities
- paired samples: used when we want to compare two means that come from conditions consisting of the same or related entities

Give example of a research question that could be used for a paired samples t-test. "Does a mindfulness meditation program reduce stress levels in college students after 8 weeks compared to their baseline levels?"

Give example of a research question that could be used for an independent samples t-test. "Do college students who participate in a mindfulness meditation program have lower stress levels after 8 weeks compared to students who do not participate?"

What are the assumptions of an independent samples t-test?
- normality: the sample should be normally distributed (try to have a bigger sample size)
- independence: the two groups and the participants in those groups must be separate and different from each other
- homogeneity of variance: the variances in the two groups should be roughly equal (test with Levene's test)

homogeneity of variance: we can assume equal variances between groups (test with Levene's test); the variances in the two groups should be roughly equal

assumption of independence: the two groups and the participants in those groups must be separate and different from each other

What does it mean when Levene's test is significant? if it is significant (p less than .05), the homogeneity assumption is not met; if the p-value is higher than .05, the assumption of homogeneity is met

When should you use one-way ANOVA?
- when we are comparing three or more levels (IV is nominal and DV is interval or ratio)
- we need one IV and one DV, but there are different levels or groups for the study
- we conduct it when we want to compare the means of three or more independent groups, but some have more than one level

ANOVA: a statistical technique used when you have one dependent variable (outcome) and you want to see if there are differences in the means of this variable across different groups defined by one or more categorical independent variables

MANOVA: used when you have several dependent variables (outcomes) that are being considered simultaneously, and you want to see if there are overall differences between groups on this set of dependent variables

What are the assumptions of a one-way ANOVA? independence of observations, normality, and homogeneity of variances

State the null and alternate hypotheses for a one-way ANOVA.
- Ex of alternative: "There is a significant difference in academic self-efficacy scores among at least one of the college prep groups."
- Ex of null: "There is no significant difference in academic self-efficacy scores among students based on the type of college prep program they participated in."

When should you use factorial ANOVA? a statistical test used to analyze the effects of two or more independent variables (factors) on a dependent variable, while also examining how these factors interact with each other

State the 3 null and alternate hypothesis pairs associated with a two-way factorial ANOVA. To make this easier, suppose that we are comparing the weight loss of males and females on three diet plans.
- Alternative:
1. There is a difference in mean weight loss between males and females.
2. There is a difference in mean weight loss for at least one diet plan.
3. There is an interaction between gender and diet plan on weight loss.
- Null:
1. There is no difference in mean weight loss between males and females.
2. There is no difference in mean weight loss across the three diet plans.
3. There is no interaction between gender and diet plan on weight loss.
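As a rough illustration of the one-way ANOVA cards above, the sketch below computes the F statistic by comparing between-group variability to within-group variability. The function name `one_way_anova_f` and the three small groups of scores are made up for this example; in the course the test would normally be run in SPSS, which also reports the p-value.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores.

    Hypothetical helper for illustration only; it does not compute a p-value.
    """
    k = len(groups)                                # number of groups (levels of the IV)
    n_total = sum(len(g) for g in groups)          # total number of observations
    grand_mean = mean(x for g in groups for x in g)

    # Between-groups sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: how far each score is from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Invented scores for three college prep groups (not real data).
prep_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova_f(prep_groups)
print(f_stat, df_between, df_within)
```

A large F means the group means differ more than would be expected from the spread of scores inside each group, which is the evidence used to reject the null hypothesis of no difference.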
two-way factorial ANOVA: a type of Analysis of Variance that is used when you have two independent categorical variables (factors) and you want to examine their effect on one dependent variable

When should you use simple and multiple linear regression?
- which one we use depends on how many predictors we have
- we use multiple if we have two or more predictors (ex: predicting GPA based on study hours, attendance, and sleep)
- we use simple if we have one predictor (ex: predicting a student's GPA based on their number of study hours per week)

What are the assumptions of multiple linear regression analyses?
- independence
- normality: the residuals (differences between predicted and observed values) should be normally distributed, with a mean close to zero
- homoscedasticity
- linearity
- no multicollinearity

multicollinearity: occurs when two or more independent variables (predictor variables) are highly correlated, making it difficult to isolate the individual effect of each variable on the dependent variable

homoscedasticity: the errors (the differences between the predicted and actual values) have the same variance (or spread) across all values of the independent variable(s)

State the null and alternate hypotheses for linear regression analyses.
- Alternate: Participation in an ACT prep program does predict academic self-efficacy.
- Null: Participation in an ACT prep program does not predict academic self-efficacy.

When should you use Chi Square?
- if we are examining whether there is a relationship between two categorical variables
- the statistic is based on the idea of comparing the frequencies you observe in certain categories to the frequencies you might expect to get in those categories by chance

What should you examine when looking at the SPSS output of Chi Square Analyses?
- the chi square test statistic
- the degrees of freedom
- the significance value
- standardized residuals
- an effect size

degrees of freedom: related to the size of your contingency table (number of rows and columns) and influence the shape of the chi-square distribution
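The comparison of observed and expected frequencies described in the chi-square cards above can be sketched in a few lines. This is a minimal illustration, not SPSS output: the function name `chi_square_independence` and the 2x2 table of counts are invented, and the p-value and effect size are left out.

```python
def chi_square_independence(observed):
    """Return (chi2, df, expected, std_residuals) for a table of counts.

    `observed` is a list of rows; each row is a list of cell counts.
    Hypothetical helper for illustration only.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Expected count in each cell if the two variables were unrelated.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Standardized residual: (observed - expected) / sqrt(expected).
    std_residuals = [
        [(o - e) / e ** 0.5 for o, e in zip(o_row, e_row)]
        for o_row, e_row in zip(observed, expected)
    ]

    # Chi-square statistic: sum of (observed - expected)^2 / expected.
    chi2 = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )

    # Degrees of freedom: (rows - 1) * (columns - 1).
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, df, expected, std_residuals

# Invented counts: rows are two groups, columns are two categories.
table = [[10, 20], [20, 10]]
chi2, df, expected, std_residuals = chi_square_independence(table)
print(round(chi2, 3), df)
```

Cells with large standardized residuals (in absolute value) are the ones contributing most to the overall chi-square value.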
chi-square test statistic: this is denoted as χ² in the textbook's list of symbols. It is a measure of the discrepancy between the observed frequencies in your data and the frequencies you would expect if there were no association between the categorical variables.

p-value: this value tells you the probability of obtaining a chi-square statistic as extreme as, or more extreme than, the one calculated if there were no true association between your variables (i.e., if the null hypothesis were true)

effect size: provides an indication of the strength or magnitude of the association between the variables for a chi-square test

State the null and alternate hypotheses for Chi Square.
- Alt: There is a relationship between gender and sports goals.
- Null: There is no relationship between the variables (gender and sports goals).

When should you use MANOVA?
- when we want to consider the relationship between different DVs
- it allows for more than one outcome variable
- when we have multiple DVs
- Ex: Independent Variable (IV): type of after-school program (Tutoring, Sports, Arts); Dependent Variables (DVs): math test score, reading test score, self-esteem rating

What are the various tests? Hotelling's Trace, Wilks' Lambda, Roy's largest root, Pillai's Trace

State null and alternate hypotheses for MANOVA.
- Independent Variable: Study Technique (with three levels: Rote memorization, Concept mapping, Practice problems)
- Dependent Variables: Maths Exam Score and English Exam Score
- Null Hypothesis: There is no statistically significant difference between the mean scores of the three study technique groups (Rote memorization, Concept mapping, Practice problems) on the combination of the two dependent variables (Maths Exam Score and English Exam Score) in the population
- Alternative Hypothesis: There is a statistically significant difference between the mean scores of the three study technique groups (Rote memorization, Concept mapping, Practice problems) on the combination of the two dependent variables (Maths Exam Score and English Exam Score) in the population

mixed-methods design:
- uses both quantitative and qualitative methods in the same study
- can include things like conducting interviews but also giving questionnaires
- an approach that combines both quantitative and qualitative methodologies within a single study or a series of related studies

What is this an example of: to assess the impact of a community development program, researchers used surveys to measure changes in participants' well-being and community engagement. They also conducted focus groups and observations to understand participants' experiences and perceptions of empowerment. mixed-method design

What is this an example of: a researcher conducts a large-scale survey about attitudes on a controversial topic using rating-scale items (quantitative data) but also includes several open-ended items where participants explain their ratings (qualitative data). The qualitative data helps the researcher better understand the numerical findings. convergent research design

multiphase iterative: mixed-methods design that includes three or more phases; earlier phases inform the conceptualization and implementation of subsequent phases

convergent: mixed methods design in which a researcher collects both quantitative and qualitative data in parallel, usually at the same time and with respect to the same general research problem; similar weight is given to the two types of data, with the hope that they will yield consistent or complementary findings

embedded: in this design, one type of data (qualitative or quantitative) is embedded within a larger study that primarily uses the other type; the embedded data provide supportive information to enhance the overall study

What is this an example of: design used in drug prevention research. This involves several steps, including:
- Assessing the prevalence of drug use through a survey (quantitative).
- Conducting focus groups to explore reasons for drug use (qualitative).
- Developing an intervention program based on the findings.
- Implementing the program.
- Collecting follow-up quantitative data on drug use and qualitative data on opinions about the program.
- Repeating the modification and data collection steps as needed.
multiphase-iterative

exploratory: two-phase mixed methods design in which qualitative data are collected in an effort to inform the planning and implementation of subsequent quantitative data collection

explanatory: two-phase mixed methods design in which quantitative data collection is followed by the collection of qualitative information that can help clarify the meanings of the quantitative findings

What is this an example of: where qualitative observations of a phenomenon in a real-world setting (Phase 1) are used to develop hypotheses to be systematically tested in an experimental study (Phase 2). exploratory

What type of design is this an example of: The Conceptual Analysis Exercise presents a scenario where an initial questionnaire yields quantitative data about popular and unpopular individuals, and subsequent interviews yield qualitative data that might explain why certain individuals are well-liked or disliked. explanatory design

What are the advantages and disadvantages of mixed method designs?
Advantages:
- completeness (researchers can more fully address the research problem)
- triangulation (when both quantitative and qualitative data lead to the same conclusion, researchers can make a more convincing case)
- mixed methods allow researchers to address different subproblems or subquestions related to the major research problem
Disadvantages:
- complexity (this type of design is inherently more complex), requires more time and energy, requires a bigger skill set, integration of data, etc.

purposive sampling: sample selection process in which participants or other units of study are chosen on the basis of how much and what types of information they can yield about the topic under investigation

What type of sampling is this an example of? In a study of social and political processes involved in implementing a new school policy, a researcher would almost certainly want to talk to the school principal as well as teachers closely involved in or affected by the policy. purposive sampling

probability sampling: sample selection process in which each member of a population has an equal chance of being chosen

probability sampling: every member of the population has an equal chance of being selected

random sampling: selected in such a way that each member of the population has an equal chance of being chosen

What is this an example of: assigning each person in the population a unique number and then using an arbitrary method of picking certain numbers, perhaps by using a roulette wheel (for small populations) or drawing numbers out of a hat. random sampling

stratified random sampling: the researcher samples equally from each of the layers in the overall population

What is this an example of: sampling from grades 4, 5, and 6 in a public school, which is considered a stratified population with three different layers. stratified random sampling

Action research and participatory action research aim to do research for what? to use research not just to describe or understand an issue or problem but also to take action and make concrete changes and transformations in conditions, resources, practices, and/or policy

What is Mills' (2018) four-step process?
1. Determine a research question that, when answered, can yield concrete strategies in the here and now.
2. Collect data that might help to answer that question.
3. Analyze and interpret the data relative to the research question.
4. Develop a plan for research.

design-based research: aims to produce theories and knowledge about learning and teaching that are grounded in real-life contexts. Involves manipulating real-life contexts in particular ways and conducting and refining experiments and interventions so that researchers can make evidence-based claims about learning.
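The random sampling and stratified random sampling cards above can be illustrated with a short sketch. It assumes Python's standard `random` module stands in for the roulette wheel or hat; the function names, student IDs, and sample sizes are all invented for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Pick n members so that every member has an equal chance of being chosen."""
    rng = random.Random(seed)      # a seed only makes the example repeatable
    return rng.sample(population, n)

def stratified_random_sample(strata, n_per_stratum, seed=None):
    """Pick the same number of members at random from each layer (stratum)."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Invented population: each student gets a unique number, as in the card.
population = list(range(1, 31))
sample = simple_random_sample(population, 5, seed=1)

# Invented strata: grades 4, 5, and 6 as three layers of the same school.
strata = {
    "grade 4": list(range(1, 11)),
    "grade 5": list(range(11, 21)),
    "grade 6": list(range(21, 31)),
}
stratified = stratified_random_sample(strata, 2, seed=1)
print(sample)
print(stratified)
```

In the simple random sample a grade could end up over- or under-represented by chance, while the stratified version guarantees the same number of students from each grade.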
participatory action research: an action-oriented research approach involving reciprocal collaboration with individuals typically viewed as "subjects" of research, aiming to investigate important issues and challenge status-quo explanations

community based research: involves developing partnerships within a community setting with community organizations and groups to investigate issues

course based action research: this integrates community-based research and participatory action research, bringing together undergraduate students, faculty, and local community partners in collaborative research projects that benefit the community

YPAR: a type of PAR involving young people as co-researchers

descriptive stats: used to summarize and describe the main features of a data set; includes measures of central tendency (mean, median, and mode) and variability (SD, range)

inferential stats: used to draw conclusions about a larger population based on data from a sample; allow researchers to estimate population parameters and test hypotheses

statistic: a numerical value that describes a characteristic of a sample. Examples include the sample mean (M) and sample standard deviation (s)

parameter: a numerical value that describes a characteristic of a population (ex: population SD)

IV: the variable that is manipulated by the researcher to see its effect on another variable

DV: the variable that is measured to see if it is affected by the independent variable

nominal data: involve categories with no inherent order, such as different flavors of ice cream or political affiliations. Statistical possibilities include determining the mode and calculating percentages

ordinal data: involve categories with a meaningful order or ranking, but the intervals between categories are not necessarily equal. An example is ranking students from highest to lowest test scores

interval data: have ordered values with consistent intervals between them, but there is no true zero point. Temperature in Celsius is an example. You can calculate the mean and standard deviation

ratio data: have ordered values with consistent intervals and a true zero point, indicating the complete absence of the quantity. Examples include age and blood pressure. Allow for virtually any inferential statistical analysis

measures of central tendency: describe a typical or representative value in a data set. These include the mean (arithmetic average), median (middle score), and mode (most frequent score)

measures of dispersion: indicate the spread or scatter of scores in a data set. These include the range (difference between highest and lowest), variance (average of squared deviations from the mean), standard deviation (square root of variance), and interquartile range

small SD: indicates that the data points are clustered closely around the mean, suggesting low variability

large SD: indicates that the data points are more spread out from the mean, suggesting high variability

purpose of z-scores (standard scores): to provide context to raw scores by indicating how many standard deviations a particular raw score falls above or below the mean of its distribution. Allows for comparison across different scales

positive correlation: indicates that as one variable increases, the other variable tends to increase as well

negative correlation: indicates that as one variable increases, the other variable tends to decrease

Type 1 error: occurs when we believe that there is a genuine effect in our population, when in fact there isn't

Type 2 error: occurs when we believe that there is no effect in the population when, in reality, there is

What is a weak correlation? a coefficient close to 0, from 0 to 0.3

What is a moderate correlation? a coefficient from 0.3 to 0.5

What is a strong correlation? a coefficient closer to +1 or -1, typically 0.7 or higher

open coding: the initial phase where data are broken into segments and analyzed for emerging categories or themes; it is used to explore various possible meanings and begin grouping them into meaningful categories or codes

axial coding: this follows open coding and involves organizing and connecting the categories around a central (or "core") category. It examines relationships such as causes, contexts, strategies, and consequences related to that core category

selective coding: this is the final phase in grounded theory analysis. A single core category is selected, and a theory is developed around it by relating it to other categories and forming a conceptual framework or storyline

pattern coding: (mentioned as part of second-cycle coding) involves identifying recurring themes or explanations across different data segments

in vivo coding: these are codes derived directly from the participants' own words. This approach helps maintain the authenticity of the participants' experiences

collective coding: used in action-oriented and participatory research. Multiple team members independently code data and then come together to create a common code list

What is the purpose of interrater reliability in qualitative research? to ensure consistency and credibility in how multiple researchers code and interpret data

How does interrater reliability improve the quality of a qualitative study? it reduces individual bias, strengthens rigor, and confirms that findings are not solely one person's interpretation

What strategy is used to measure interrater reliability? multiple coders analyze the same data and compare coding decisions to ensure agreement

What should you do if one participant's interview results are unexpected compared to others'? explore the unexpected results further rather than ignore them

Why is it important to examine unexpected results in qualitative research? they may offer new insights or lead to refined research questions

What is the main goal of analyzing ethnographic research? to understand cultural patterns and practices of a group through immersive observation and interaction

What is the primary purpose of a research proposal? to clearly define and plan the research study before it begins

action research and participatory designs: this type of design involves collaboration and change; useful for real world settings like classrooms or communities

mixed-methods design: this type of design combines quantitative and qualitative approaches to capitalize on the strengths of both

convergent: type of mixed methods design that collects both types of data simultaneously

explanatory: type of mixed methods design that collects quantitative data first, then qualitative data to explain the results

exploratory sequential: type of research design that collects qualitative data first, then quantitative data to test findings

multiphase iterative: type of research design that has multiple phases with alternating methods

experimental, quasi-experimental, and ex post facto design: designed to test cause and effect relationships; the strategies depend on the level of control and manipulation

quasi experimental: this type of design lacks random assignment but includes pre/posttesting
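As a closing worked example for the descriptive statistics cards (central tendency, standard deviation, standard error, z-scores, and Pearson's r), here is a small sketch using only Python's standard library. The scores, the study-hours data, and the helper name `pearson_r` are invented for illustration; SPSS would report the same quantities from its own menus.

```python
import math
import statistics

# Invented raw scores for one variable.
scores = [2, 4, 4, 5, 7, 8]

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # average of the two middle scores here
mode_score = statistics.mode(scores)      # most frequently occurring score
sd = statistics.stdev(scores)             # sample standard deviation (n - 1)

# Standard error of the mean: SD divided by the square root of the sample size.
se = sd / math.sqrt(len(scores))

# z-score: how many SDs each raw score falls above or below the mean.
z_scores = [(x - mean_score) / sd for x in scores]

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired observations (x, y)."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sum_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sum_xx = sum((x - mx) ** 2 for x in xs)
    sum_yy = sum((y - my) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Invented paired data: weekly study hours and a quiz score.
hours = [1, 2, 3, 4, 5]
quiz = [2, 4, 5, 4, 5]
r = pearson_r(hours, quiz)

print(mean_score, median_score, mode_score)
print(round(sd, 3), round(se, 3))
print([round(z, 2) for z in z_scores])
print(round(r, 3))
```

The sign of r gives the direction of the relationship and its size gives the strength, which can then be read against the weak, moderate, and strong ranges listed in the cards.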