A Statistics Refresher PDF
Mapúa University
Summary
This document provides a refresher on statistical concepts, including key terms, functions, and scales of measurement. It is focused on psychological research and may be suitable for postgraduate studies in psychology or education.
II. THE SCIENCE OF PSYCHOLOGICAL MEASUREMENT
Chapter 3: A Statistics Refresher
Key Terms and Concepts
Measurement – the act of assigning numbers or symbols to characteristics of things; the rules for assigning numbers are guidelines for representing the magnitude of the object being measured.
Scale – a set of numbers whose properties model empirical properties of the objects to which the numbers are assigned.
Error – the collective influence of all factors on a test score or measurement beyond those specifically measured by the test or measurement.
Psychological Trait – covers a wide range of possible characteristics.
Construct – an informed, scientific concept developed or constructed to describe or explain behavior. Constructs cannot be seen, heard, or touched, but their existence can be inferred from overt behavior.
Overt Behavior – an observable action or the product of an observable action, including test- or assessment-related responses.
Norm-referenced testing and assessment – a method of evaluation and a way of deriving meaning from test scores by evaluating an individual testtaker's score and comparing it to the scores of a group of testtakers.
Normative Sample – the group of people whose performance on a particular test is analyzed for reference in evaluating the performance of individual testtakers.
Frequency – the number of times the event occurred in an experiment or study
Proportion – the number of cases compared to the total size of the distribution
Percentage – the frequency of occurrence of a category per 100 cases
Ratio – compares the frequency of one category to another
Rate – compares actual cases to potential cases
Deviations – the distance and direction of any raw score from the mean
Functions of Statistics
Description
o Frequency and grouped-frequency distributions
o Graphs and tables
o Arithmetic averages
Decisions
o Inferences and generalizations from a sample to a population
o Testing hypotheses regarding the nature of social reality
Developing Norms
o Administer the test with a standard set of instructions
o Recommend a setting for test administration
o Collect and analyze data
o Summarize data using descriptive statistics, including measures of central tendency and variability
o Provide a detailed description of the standardization
In selecting a test, research the test's available norms to check how appropriate they are for use with the targeted testtaker population
When interpreting test results, it helps to know about the culture and era of the testtaker
It is important to conduct culturally informed assessment
Scales of Measurement (Stevens, 1946)
A Statistics Refresher, Psychological Research, Advanced Statistics | 1
Two Categories of a Scale
Continuous scales – can be any real number in the scale's sample space; it is theoretically possible to divide any of the values of the scale.
o Typically have a wide range of possible values (e.g., height or a depression scale).
o Most scales used in psychological and educational assessment are continuous and are subject to error
o Continuous scores are not true scores, but approximations of the real score
Discrete scales – categorical values (e.g., male or female); can be counted
Sample Space – the values that a variable can take on
Predicted | Statistical Language
Categorical | Discriminant Analysis; Logistic Regression
Continuous | Factor Analysis; Linear/Multiple Regression
The Four Scales of Measurement (NOIR)
Nominal Scales (= ≠) – involve classification or categorization based on one or more distinguishing characteristics; simplest form of measurement; no defined order
o All things measured must be placed into mutually exclusive and exhaustive categories (e.g., apples and oranges, DSM-IV diagnoses, etc.).
o Naming or labelling (e.g., Are you currently in pain? Yes or No; How would you characterize the type of pain? Sharp, Dull, Throbbing)
o Can also be counted to determine how many cases fall into each category, and from that the proportion or percentage in each
Ordinal Scales – involve classification, like nominal scales, but also allow rank ordering (e.g., Olympic medalists); have a clear and uncontroversial order
o Ordering of categories. Can be treated as interval/ratio variables if the distances between response categories are assumed to be equal (e.g., How bad is the pain right now? None, Mild, Moderate, Severe; Compared with yesterday, is the pain less severe, about the same, or more severe?)
o The numbers do not indicate units of measurement
o Binet: the IQ test was not meant to measure people, but merely to classify them based on their performance
❖ Most psychological measures are truly ordinal but are treated as interval measures for statistical purposes.
❖ Thus, such psychological tests are not true interval scales
❖ Ordinal scales are for developmental periods (Anastasi & Urbina)
Interval/Ratio – ordering and exact distances (e.g., on a scale of 1–10, how is your pain?)
Interval Scales – contain equal intervals between numbers.
o Have meaningful distances between numbers
o Allow for calculating means and standard deviations
o Equal interval: each unit on the scale is exactly equal to any other unit on the scale (e.g., IQ scores and most other psychological measures); no unit
o No absolute zero – zero is an arbitrary point on the scale and does not indicate total absence of the attribute; values are ESTIMATES
o Most typical scale of measurements
Ratio Scales – interval scales with a true zero point (e.g., height or reaction time); there is a unit (degrees Celsius is the exception: it has a unit but no true zero, so it is an interval scale)
o Absolute zero – a value of zero indicates total absence of the attribute; values are EXACT
o For countable quantities, negative numbers are meaningless; for some quantities, negative numbers are possible
o Mostly used in the assessment of neurological functioning
In a Likert Scale
o Interval – when you assign numbers
o Ordinal – when you rank; no equivalent numbers
Properties of Scales of Measurement:
Scale of Measurement | Magnitude | Equal Interval | Absolute Zero
Nominal | No | No | No
Ordinal | Yes | No | No
Interval | Yes | Yes | No
Ratio | Yes | Yes | Yes
Describing Data
Distributions – a set of test scores arrayed for recording or study.
Raw Score – a straightforward, unmodified accounting of performance that is usually numerical.
Frequency Distribution – all scores are listed alongside the number of times each score occurred. When the scores have not been grouped, it is a simple frequency distribution.
Grouped frequency distributions have class intervals rather than actual test scores
Class Interval – smaller categories or groups containing more than one score
o Class interval size is determined by the number of score values it contains
Class Limits – the point halfway between adjacent intervals
o Upper and lower limits
o The distance between the upper and lower limits determines the size of the class interval
The Midpoint – the middlemost score value in a class interval
Cumulative Frequencies (cf) – the total number of cases having a given score or a score that is lower
Cumulative Percentage – the percentage of cases having a given score or a score that is lower
Percentiles – the percentage of cases falling at or below a given score
Deciles – points that divide a distribution into 10 equally sized portions
Quartiles – points that divide a distribution into quarters
Histogram – a graph with vertical lines drawn at the true limits of each test score (or class interval), forming a series of contiguous rectangles
Bar graph – numbers indicative of frequency appear on the Y-axis, and reference to some categorization (e.g., yes/no/maybe, male/female) appears on the X-axis
Frequency polygon – test scores or class intervals (as indicated on the X-axis) meet frequencies (as indicated on the Y-axis)
Types of Distributions
Measures of Central Tendency
Central tendency – a statistic that indicates the average or midmost score between the extreme scores in a distribution.
Mean – the sum of the observations (or test scores) divided by the number of observations.
o The "center of gravity" of a distribution
o The Weighted Mean – the overall mean for a number of groups
o Arithmetic Mean – typically the most appropriate measure of central tendency for interval or ratio data
o Not recommended if there are extreme scores (too high or too low)
o The most stable and useful measure of central tendency
Median – the middle score in a distribution. Particularly useful when there are outliers, or extreme scores, in a distribution.
o The point that divides a distribution in two, half above it and half below it
o It is determined by ordering the scores in a list by magnitude
o The median is an appropriate measure of central tendency for ordinal, interval, and ratio data
o Useful when scores are at extremes
Mode – the most frequently occurring score in a distribution. When two scores occur with the highest frequency, the distribution is said to be bimodal.
o The modal score may be totally atypical, for instance one at an extreme end of the distribution
o Useful in analyses of a qualitative or verbal nature
Three factors in choosing a measure of central tendency
Level of Measurement | Mode | Median | Mean
Nominal | Yes | No | No
Ordinal | Yes | Yes | No
Interval | Yes | Yes | Yes
Measures of Variability
Measures of variability – statistics that describe the amount of variation in a distribution.
Variability is an indication of the degree to which scores are scattered or dispersed in a distribution.
o Allows for a comparison of a given raw score in a set against a standardized measure
o Distributions A and B have the same mean score, but one of them has greater variability in scores (its scores are more spread out).
Range – the difference between the highest and the lowest scores.
o The simplest measure of variability, but its potential use is limited; one extreme score can radically alter the value of the range
Quartiles (3) – dividing points between the four quarters
o Q2 – the same as the median
o Q1 and Q3 – quarter points
Interquartile range – the difference between the third and first quartiles of a distribution.
Semi-interquartile range – the interquartile range divided by 2
o In a perfectly symmetrical distribution, Q1 and Q3 will be exactly the same distance from the median
Mean Absolute Deviation (MAD) – the average deviation of scores in a distribution from the mean.
o Rarely used, because the deletion of algebraic signs renders it of little use for further operations
Variance – the arithmetic mean of the squares of the differences between the scores in a distribution and their mean
o We need a measure of variability that considers every score
o Mean of the squared deviations: how far? How dispersed?
Standard deviation – represents the average variability in a distribution. It is the square root of the variance, which expresses the typical deviation from the mean in the original score units.
o The greater the variability, the larger the standard deviation
$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$
σ – population standard deviation
N – the size of the population
x_i – each value from the population
μ – the population mean
Skewness – the nature and extent to which symmetry is absent in a distribution.
Positive skew – relatively few of the scores fall at the high end of the distribution.
Negative skew – relatively few of the scores fall at the low end of the distribution.
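The measures of variability defined above (range, interquartile range, mean absolute deviation, variance, and standard deviation) can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the `scores` list is hypothetical, the function names are made up for this example, and the quartiles use the "inclusive" interpolation method of the standard library, which is only one of several conventions and may differ from hand-computed textbook quartiles.

```python
import math
from statistics import quantiles

def population_variance(xs):
    """Mean of the squared deviations from the mean (divides by N)."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def population_sd(xs):
    """Square root of the population variance."""
    return math.sqrt(population_variance(xs))

def mean_absolute_deviation(xs):
    """Average of the absolute deviations from the mean."""
    mu = sum(xs) / len(xs)
    return sum(abs(x - mu) for x in xs) / len(xs)

# Hypothetical test scores, chosen only so the arithmetic is easy to follow
scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_range = max(scores) - min(scores)
q1, q2, q3 = quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1
semi_iqr = iqr / 2

print("range:", score_range)
print("Q1, Q2, Q3:", q1, q2, q3)
print("IQR:", iqr, "semi-IQR:", semi_iqr)
print("MAD:", mean_absolute_deviation(scores))
print("variance:", population_variance(scores))
print("SD:", population_sd(scores))
```

Note that the variance and standard deviation here divide by N, matching the population formula given above; a sample estimate would divide by N - 1 instead.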
❖ In a positively skewed distribution, Q3 – Q2 will be greater than Q2 – Q1
o Mostly low scorers (the test may be too difficult); the tail is to the right
o Mean > median > mode
❖ In a negatively skewed distribution, Q3 – Q2 will be less than Q2 – Q1
o Mostly high scorers (the test may be too easy); the tail is to the left
o Mean < median < mode
❖ In a symmetrical distribution, the distances from Q1 and Q3 to the median (Q2) are the same
Kurtosis – the steepness of a distribution in its center; indicates how prone the distribution is to outliers
o Platykurtic – relatively flat; lesser errors
o Leptokurtic – relatively peaked; many average scorers
o Mesokurtic – somewhere in the middle; few outliers
Shape of the Distributions
Symmetrical Distribution – the mode, median, and mean have identical values
Skewed Distributions
o The mode is the peak of the curve
o The mean is closer to the tail
o The median falls between the two
Bimodal Distribution – both modes should be used to describe the data
Parametric vs. Non-parametric
Parametric – makes assumptions about the population
o Normal distribution of data
o Means and standard deviation
o May either be the sampling distribution or errors in the model
▪ Check skewness and kurtosis (should be close to 0)
▪ Check using the Kolmogorov-Smirnov Test or Shapiro-Wilk (should not be significant)
Tests of Normality
Kolmogorov-Smirnov – large number of participants
Shapiro-Wilk – small number of participants
o Homogeneity of Variance
▪ Variances of the data should be the same throughout the data.
▪ Check using Levene's Test (should not be significant)
o Data at least at interval level
▪ Interval up to ratio data are used
o Independence
▪ Data from different participants are independent (the behavior of one participant does not influence another)
Non-Parametric – makes no assumption about the population
o Skewed distribution of data
o Ordinal data and frequencies
Research Objective → Measure of Central Tendency
Fast and Simple Research → Mode
Skewed Distribution → Median
Advanced Statistics Analysis → Mean
The Normal Curve
The normal curve is a bell-shaped, smooth, mathematically defined curve that is highest at its center. It is perfectly symmetrical.
Probability
The relative likelihood of occurrence of any given outcome
o The outcome is impossible: p = 0
o The outcome is as likely to happen as not happen: p = .5
o The outcome is certain: p = 1
Area Under the Normal Curve
The normal curve can be conveniently divided into areas defined by units of standard deviations.
o 50% of the scores occur above the mean and 50% of the scores occur below the mean
o Approximately 34% of all scores occur between the mean and 1 standard deviation above (or below) the mean
o Approximately 68% of all scores occur within ±1 standard deviation of the mean
o Approximately 95% of all scores occur within ±2 standard deviations of the mean
o Approximately 99.7% of all scores occur within ±3 standard deviations of the mean
Standard Scores
A standard score is a raw score that has been converted from one scale to another scale, where the latter scale has some arbitrarily set mean and standard deviation.
Linear transformation – the standard score retains a direct numerical relationship to the original raw score
Non-linear transformation – may be required when the data under consideration are not normally distributed yet comparisons with normal distributions need to be made.
As a result, the original distribution is normalized.
Normalizing a distribution – involves "stretching" the skewed curve into the shape of a normal curve and creating a corresponding scale of standard scores, a scale that is referred to as a normalized standard score scale
o One of the primary advantages of a standard score on one test is that it can readily be compared with a standard score on another test
o It is generally preferable to fine-tune the test according to difficulty or other relevant variables so that the resulting distribution will approximate the normal curve, rather than attempting to normalize a skewed distribution
o There are technical cautions to be observed before attempting normalization. For example, transformations should be made only when there is good reason to believe that the test sample was large enough and representative enough and that the failure to obtain normally distributed scores was due to the measuring instrument
Z-score – the conversion of a raw score into a number indicating how many standard deviation units the raw score is below or above the mean of the distribution.
o Raw scores are meaningless, z-scores are golden
T scores – can be called a fifty plus or minus ten scale; that is, a scale with a mean set at 50 and a standard deviation set at 10
Stanine – a standard score with a mean of 5 and a standard deviation of approximately 2. Divided into nine units.
o The 5th stanine runs from ¼ standard deviation below the mean to ¼ standard deviation above it and contains the middle 20% of scores
o The 4th and 6th stanines are each ½ standard deviation wide
o Whole numbers only
Correlation and Inference
A coefficient of correlation (or correlation coefficient) is a number that provides us with an index of the strength of the relationship between two things.
Correlation coefficients vary in magnitude between -1 and +1. A correlation of 0 indicates no relationship between two variables.
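The standard-score conversions described above (z score, T score, and stanine) are simple arithmetic and can be sketched as follows. This is an illustrative sketch only: the function names and the example values (a raw score of 65 on a test with mean 50 and standard deviation 15) are hypothetical, and the stanine rule assumes half-standard-deviation bands centered on the mean, with a score exactly on a boundary assigned to the higher stanine.

```python
import math

def z_score(raw, mean, sd):
    """Number of standard deviation units the raw score lies from the mean."""
    return (raw - mean) / sd

def t_score(z):
    """Linear transformation to a scale with mean 50 and SD 10."""
    return 50 + 10 * z

def stanine(z):
    """Whole-number stanine (1 to 9) for a z score.

    Assumes each stanine is half an SD wide, with the 5th stanine
    spanning z = -0.25 to z = +0.25; the end stanines are open-ended.
    """
    return max(1, min(9, math.floor(2 * z + 5.5)))

# Hypothetical example: raw score 65, test mean 50, test SD 15
z = z_score(65, 50, 15)
print("z =", z)
print("T =", t_score(z))
print("stanine =", stanine(z))
```

Because z and T scores are linear transformations, they preserve the shape of the raw-score distribution; only a non-linear (normalizing) transformation changes it.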
Correlation between variables does not imply causation, but it does aid in prediction.
o A double-headed arrow is used in a conceptual framework
Strength of Correlation (significant at p < .05)
o This can be visualized using a scatter plot
o Strength increases as the points more closely form an imaginary diagonal line across the center
Correlation Coefficient Value | Strength
.80 – 1.00 | Very High
.60 – .79 | High
.40 – .59 | Substantial
.20 – .39 | Low
.00 – .19 | Negligible
Note: a coefficient closer to 1.00 indicates a very high relationship; a coefficient farther from 1.00 indicates a very low or negligible relationship
Magnitude/Direction of Correlation
o Positive correlation: both variables move in the same direction (directly proportional)
o Negative correlation: the variables move in opposite directions (inversely proportional)
Pearson r – a method of computing correlation when both variables are linearly related and continuous.
o Once a correlation coefficient is obtained, it needs to be checked for statistical significance (typically a probability level below .05).
o Significance at the .01 level or the .05 level provides a basis for concluding that a correlation does indeed exist; a correlation of that size would occur by chance alone 1 time or 5 times in 100, respectively
o Also called the Pearson correlation coefficient or the Pearson product-moment coefficient of correlation
o By squaring r, one obtains the coefficient of determination, the proportion of variance that the variables share with one another.
Spearman Rho – a method for computing correlation, used primarily when sample sizes are small or the variables are ordinal in nature.
Also called rank-order correlation coefficient, rank-difference correlation coefficient
Scatterplot – involves simply plotting one variable on the X (horizontal) axis and the other on the Y (vertical) axis
o Also called bivariate distribution, scatter diagram, scattergram
o Quick indication of the direction and magnitude of the relationship
o Useful in revealing the presence of curvilinearity, which refers to an eyeball gauge of how curved a graph is
o Pearson r should be used only if the relationship between the variables is linear
o If not linear, other statistical tools and techniques may be employed
Scatterplots of strong correlations feature points tightly clustered together in a diagonal line. For positive correlations the line goes from bottom left to top right. Strong negative correlations form a tightly clustered diagonal line from top left to bottom right.
Outlier – an extremely atypical point (case), lying relatively far away from the other points in a scatterplot
o Outliers can sometimes provide a hint of some deficiency
Restriction of range leads to weaker correlations
Meta-Analysis
Meta-analysis allows researchers to look at the relationship between variables across many separate studies.
Meta-analysis is a family of techniques to statistically combine information across studies to produce single estimates of the data under study.
The estimates are in the form of effect size, which is often expressed as a correlation coefficient.
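The correlation coefficients discussed above (Pearson r, the coefficient of determination r², and Spearman rho) can be sketched directly from their definitions. This is a minimal illustration under stated assumptions: the function names and the small data sets are hypothetical, Spearman rho is computed as Pearson r applied to ranks (with tied scores given the average of their ranks), and no significance test is included.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (lowest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical paired scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print("Pearson r:", round(r, 4))
print("coefficient of determination r^2:", round(r ** 2, 4))
print("Spearman rho:", round(spearman_rho(x, y), 4))

# A perfectly monotonic but curved relationship: rho is 1, r is below 1
curved_x = [1, 2, 3, 4]
curved_y = [1, 8, 27, 64]
print("curved r:", round(pearson_r(curved_x, curved_y), 4))
print("curved rho:", round(spearman_rho(curved_x, curved_y), 4))
```

The second data set illustrates why Pearson r is recommended only for linear relationships: the ranks agree perfectly, yet r understates the association because the points do not fall on a straight line.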
Advantage: more weight can be given to studies that have larger numbers of subjects, which results in more accurate estimates
Some advantages of meta-analyses
o Can be replicated
o The conclusions tend to be more reliable and precise
o More focus on effect size rather than statistical significance alone
o Promotes evidence-based practice
Psychological Research
Population – a group or set of individuals that share at least one characteristic
Sample – a small number of individuals from the population
Note: Social researchers generally are not able to measure an entire population due to limited time and resources, but sampling allows researchers to generalize
Sampling Error – error in a statistical analysis arising from the unrepresentativeness of the sample taken
Sampling Distribution of Means
o A frequency distribution of a large number of random sample means that have been drawn from the same population
o The sampling distribution of means approximates a normal curve
o The mean of the sampling distribution of means (the mean of means) is equal to the true population mean
o The standard deviation of a sampling distribution of means is smaller than the standard deviation of the population
Standard Error of the Mean – the standard deviation of the sampling distribution of means
Confidence Intervals – the range of mean values within which the population mean is likely to fall
o 68% = Score ± SEM
o 95% = Score ± (1.96*SEM)
o 99% = Score ± (2.58*SEM)
Testing Differences, Relationships, and Advanced Statistics
Determining Statistical Tool
1. Identify the goal
a. Describe (central tendency, variability, or locations)
b. Correlate (Pearson, point biserial, etc.)
c. Compare (t-tests, ANOVA, non-parametric tests)
d. Predict (Regressions)
2. Sampling Technique
a. Randomized/parametric – interval data, normal curve
b. Non-Randomized/Nonparametric – skewed, median, ordinal (Spearman)
Conditions
Parametric Test | Non-Parametric Test
Random selection of subjects from a normal population with equal variances | Whether the sign or rank was used
Samples of groups are independent or correlated | Whether the groups to be compared are independent samples or correlated
Data being analyzed must be interval | Whether the number of groups to be compared is ≥2
Comparison
Parametric Test | Non-Parametric Test
More power; higher power-efficacy | Power often not far from parametric equivalent. May need higher N to match power of parametric test
More sensitive to features of data collected | Simpler and quicker to calculate
Robust – data can depart somewhat from assumptions | No need to meet data requirements of parametric tests at all
3. Identify the variables
a. Categorical – nominal/ordinal
b. Continuous – interval/ratio
4. Statistical goal
Testing Significant Differences and Relationships among Groups and Variables
T-Test
o An analysis of two population means
o A t-test with two samples is commonly used with small sample sizes, testing the difference between the samples when the variances of two normal distributions are not known
Alpha – the level of probability where decisions can be made with confidence
Effect Size – the magnitude of the difference between two population means
Non-parametric Measures of Differences
Used if the requirements for parametric tests, such as normality and interval-level data, are not met
The One-Way Chi-Square Test
o Observed frequency: the set of frequencies obtained in an actual frequency distribution
o Expected frequency: the frequencies that are expected to occur under the terms of the null hypothesis
o Chi-square allows us to test the significance of differences between observed and expected frequencies
o Goodness of Fit – how well an observed distribution fits a theoretical distribution (e.g., Gregor Mendel's)
o Test of homogeneity – how different the observed frequencies are in comparison to the expected frequencies
o Example: the agreement (Yes/No) of Democrats and liberals with a given law
The Two-Way Chi-Square Test
o Comparing observed and expected frequencies for more than one variable
o Involves cross-tabulations
The Median Test
o Used when dealing with ordinal data
o Determines the likelihood that two or more random samples have been taken from populations with the same median
o Ignores the specific rank-order of cases
The Mann-Whitney U Test
o This test examines the rank-ordering of all cases
o Determines whether the rank values for a variable are equally distributed throughout two samples
The Kruskal-Wallis Test
o Can be used to compare several independent samples
o Requires only ordinal-level data
ANOVA (Analysis of Variance)
o A special case of the general linear regression model, most often used to analyze data collected using experimentation
o 3 controlled variables
Significant ANOVA
o Bonferroni – a post hoc test (a test used after ANOVA is significant); where is the difference?
o Scheffe
o Tukey
Types of ANOVA | Variables
One-Way ANOVA | 1 Categorical IV (with at least 2 categories) & 1 Continuous DV
Main Effects ANOVA | 2 or more Categorical IVs & 2 Continuous DV
Factorial ANOVA | 2 Categorical IVs (interaction of 2 IVs) & 1 Continuous DV
Two-Way ANOVA – 2 independent variables with at least 2 levels each
o Main effects – effect of one IV on a single DV
o Interaction effects – combined effect of the 2 IVs on a single DV
Non-parametric Measures of Differences (ordinal)
Statistical Tool | Correlated/uncorrelated | No. of Groups
Median Test | Uncorrelated | 2
Mann-Whitney (U) Test | Uncorrelated | 2
Wilcoxon Rank Sum Test | Uncorrelated | 2
Kruskal-Wallis H Test | Uncorrelated | ≥3
Friedman Rank | Correlated | ≥3
Fisher's Sign Test | Correlated | 1
Wilcoxon Signed-Rank Test (T) | Correlated | 1
Comparative – different, effect
Statistical Tool | No. of Times DV is Measured | Number of Groups | Scale of Measurement of DV
t-test independent means | 1 | 2 | Interval/Ratio
t-test dependent means | 2 | 1 | Interval/Ratio
ANOVA 1 way | 1 | >2 | Interval/Ratio
ANOVA Repeated Measures | >2 | 1 | Interval/Ratio
Mann Whitney U Test | 1 | 2 | Ordinal
Wilcoxon Signed Rank Test | 2 | 1 | Ordinal
Kruskal Wallis H Test | 1 | >2 | Ordinal
Friedman Test | >2 | 1 | Ordinal
Remaining rows of the table (only the entries that survive in the source are shown):
z-test of one sample – 1; Interval
Proportion/percentages – 1; Interval
Variances – Interval
2 correlation coefficients – 2; Interval
Two-Way ANOVA – 1 (2 IV); Interval
Three-Way ANOVA – 1 (3 IV); Interval
Goodness of fit (chi-square) – 1; Nominal
Categorical Data/Measure | Continuous Data/Measure
1 IV – 1 DV = T-test/Chi-Square | 1 IV – 1 DV = Pearson r
≥ 2 IV – 1 DV = ANOVA | ≥ 2 IV – 1 DV = Multiple Regression
1 IV – ≥ DV = MANOVA | ≥ 2 IV – ≥ DV = Structural Equation Modeling
Multivariate Analysis of Variance (MANOVA) – many measures of the dependent variable
Analysis of Covariance (ANCOVA) – covariance (control variables)
o Variables that systematically affect the IV in its influence on the DV
Example: Therapy (IV), Depression level (DV)
o Pre-therapy depression (covariate, can influence the DV)
Levels of IV | Design | Interval/Ratio DV | Ordinal DV
One IV, 2 treatments | Two Independent Groups | T-test (independent groups) | Mann-Whitney U Test
One IV, 2 treatments | Two Matched Groups | T-test (matched groups) | Wilcoxon Test
One IV, >2 treatments | Multiple Independent Groups | One-way ANOVA | Kruskal-Wallis Test
One IV, >2 treatments | Multiple Matched Groups | One-way ANOVA (repeated) | Friedman Test
Two IVs, Factorial Design | Independent Groups | Two-way ANOVA |
Two IVs, Factorial Design | Matched Groups (within-subjects) | Two-way ANOVA (repeated) |
Two IVs, Factorial Design | Independent and Matched Groups | Two-way ANOVA (mixed) |
Nominal DV – Chi-Square
Non-parametric Measures of Correlation
Used when researchers are using nominal or ordinal data or when normality cannot be assumed
Spearman's Rank-Order Correlation Coefficient
o Used with ordinal data
o Random sampling
Goodman's and Kruskal's Gamma
o Tied ranks at the ordinal level of measurement are common
o Especially with large data sets
o Cross-tabulated ordinal variables
Phi Coefficient
o A simple extension of the chi-square test
o The degree of association between variables measured at the nominal level of measurement can be determined
o The significance of phi is tested by examining the significance of chi-square
Contingency Coefficient
o The contingency coefficient has an important disadvantage: the number of rows and columns in a chi-square table will influence the maximum size taken by C
o To avoid this disadvantage, use Cramér's V
Correlational – link, association, relationship
Statistical Tool | Variable 1 | Variable 2
Pearson Product Moment Correlation | Continuous | Continuous
Spearman Rho/Rank Correlation | Rank | Rank
Point Biserial Correlation | True Dichotomy | Continuous
Biserial Correlation | Artificial Dichotomy | Continuous
Phi-coefficient | True Dichotomy | True/Artificial Dichotomy
Tetrachoric Correlation | Artificial Dichotomy | Artificial Dichotomy
Kendall Coefficient of Concordance – agreement among 3 or more raters
Lambda/Guttman's Coefficient of Predictability – groups/categories; nominal (dependent and independent, ≥2 groups)
Chi-square test of independence – nominal data (2 groups)
Regression Analysis
o A statistical process for estimating the relationship among variables
o It includes many techniques for modeling and analyzing several variables
o The focus is on the relationship between a dependent variable and one or more independent variables (or predictors)
o Helps one understand how the typical value of the dependent variable (or criterion variable) changes when any one of the independent variables is varied, while the other independent variables are held fixed
F-test
If the F-value is statistically significant (typically p < .05), this signifies that the model (the predictors) did a good job of predicting the outcome variable.
There is a significant relationship between the set of predictors and the dependent variable
R-Square
"Of all of the reasons why the outcome variable can vary, what percent of those reasons can be accounted for by the predictors?"
Adjusted R-Square
o If the researcher used this model on a new data set, this would be the amount of variability accounted for in the new data set
o Important reference for goodness of fit in multiple regression
Simple Linear Regression
o A single independent variable is used to predict the value of a dependent variable
o To predict the value of the dependent variable based upon the values of one or more independent variables
R coefficient
o The degree to which two or more predictors are related to the dependent variable
o Can assume values between 0 and 1
B coefficient
o Used to interpret the direction of the relationship between variables
o If positive, the relationship of the variable with the dependent variable is positive (e.g., the greater the IQ, the better the grade point average)
o If negative, the relationship of the variable with the dependent variable is negative (e.g., the lower the class size, the better the average test scores)
Beta Coefficient
The beta coefficients can be negative or positive, and each has a t-value and a significance of that t-value associated with it. Think of the regression beta coefficient as the slope of a line: the t-value and significance assess the extent to which the slope is significantly different from that of a line lying on the X-axis. If the beta coefficient is not statistically significant (i.e., the t-value is not significant), no statistical significance can be interpreted from that predictor. If the beta coefficient is significant, examine the sign of the beta.
If the regression beta coefficient is positive, the interpretation is that for every 1-unit increase in the predictor variable, the dependent variable will increase by the unstandardized beta coefficient value. For example, if the beta coefficient is .80 and statistically significant, then for each unit increase in the predictor variable, the outcome variable will increase by .80 units
Multiple Regression
o Two or more independent variables are used to predict the value of a dependent variable
o Can be used to determine the overall fit of the model and the relative contribution of each of the predictors to the total variance explained
Quantitative Research – based on the assumption of standards, normality, and commonalities of sample behaviors
Bivariate – assumes that one variable affects one other variable
Multivariate – assumes that more than one variable affects other variables
Statistics Refresher
Goodness of fit – not significant
Inferential – significant
Multicollinearity – same concepts, different theory
o Variable
o Items (redundancy)
Predictive validity
One-tailed – specific direction
Two-tailed – either direction
Correlation – relation (parametric, nonparametric), regression
Differences – paired (parametric, nonparametric), unpaired
External validity – very small sample
Graphic rating – most common scale in measuring performance appraisal
Job knowledge
ANOVA (F-test)
Fail to reject the null hypothesis – there is no significant difference
Taylor-Russell
- Criterion validity
- Selection ratio
- Base rate
- Determine the overall impact of a testing procedure
Lawshe
- Validity coefficient
- Base rate
- Applicant's test scores
- Probability of successful applicants
- "Did the person score in top 20%, next 20%, etc."
Simple frequency distribution
X | TALLY | f | % | fx
4 | I | 1 | 5% | 4
5 | | 0 | 0% | 0
6 | II | 2 | 10% | 12
7 | III | 3 | 15% | 21
8 | IIIII-II | 7 | 35% | 56
9 | IIIII | 5 | 25% | 45
10 | II | 2 | 10% | 20
Σf = 20 | 100% | Σfx = 158
Grouped frequency distribution (m = midpoint of the class interval)
X | TALLY | f | m | % | fm
4-5 | I | 1 | 4.5 | 5% | 4.5
6-7 | IIIII | 5 | 6.5 | 25% | 32.5
8-9 | IIIII-IIIII-II | 12 | 8.5 | 60% | 102
10-11 | II | 2 | 10.5 | 10% | 21
Σf = 20 | 100% | Σfm = 160
Get the mean after fx: M = Σfx / Σf = 158 / 20 = 7.9
Find the mode
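The frequency tables above can be checked with a short script. This is a minimal sketch rather than part of the original notes: the `freq` dictionary simply re-enters the f column of the simple frequency table, the variable names are chosen for this example, and the class intervals are the four intervals of width 2 used in the grouped table.

```python
from statistics import multimode

# Score -> frequency, copied from the simple frequency distribution
freq = {4: 1, 5: 0, 6: 2, 7: 3, 8: 7, 9: 5, 10: 2}

# Expand the table back into the list of individual scores
scores = [x for x, f in freq.items() for _ in range(f)]
n = len(scores)

# Ungrouped mean: sum of f*x divided by the number of scores
sum_fx = sum(x * f for x, f in freq.items())
ungrouped_mean = sum_fx / n

# Mode: the most frequently occurring score(s)
modes = multimode(scores)

# Grouped distribution: frequency, midpoint, and f*m per class interval
intervals = [(4, 5), (6, 7), (8, 9), (10, 11)]
grouped = []
for low, high in intervals:
    f = sum(freq.get(x, 0) for x in range(low, high + 1))
    midpoint = (low + high) / 2
    grouped.append((low, high, f, midpoint, f * midpoint))

sum_fm = sum(row[4] for row in grouped)
grouped_mean = sum_fm / n

print("N =", n, " sum fx =", sum_fx, " mean =", ungrouped_mean)
print("mode(s):", modes)
for low, high, f, midpoint, fm in grouped:
    print(f"{low}-{high}: f={f}, m={midpoint}, fm={fm}")
print("sum fm =", sum_fm, " grouped mean =", grouped_mean)
```

The grouped mean (8.0) differs slightly from the ungrouped mean (7.9) because every score in a class interval is treated as if it fell at the interval's midpoint.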