Psychological Statistics Prelims Reviewer PDF (A.Y. 2024-2025)

FIRST SEMESTER | BACHELOR OF SCIENCE IN PSYCHOLOGY | BACHELOR OF ARTS IN PSYCHOLOGY
PSYCHOLOGICAL STATISTICS PRELIMS REVIEWER

Note: The reviewers created by the HAU Psychology Society ensure consistency and quality during your review process. Be reminded that the content of the reviewers is based ONLY ON THE GIVEN MODULES by the subject's instructor. Thank you and good luck on your exam! Laus Deo Semper!

This reviewer is not for sale.

Lesson 1: Fundamentals of Research and Statistics

Statistics
The process of collecting data in a systematic manner, examining those data, and making inferences from them.

Variable
Any measurable characteristic of a person, environment, or experimental treatment that varies from person to person, environment to environment, or experimental situation to experimental situation.

Constant
A number that represents a construct that does not change.
e.g., The number of days in the month of June

Dependent Variable (DV)
An outcome of interest (e.g., some aspect of behavior) that is observed and measured by a researcher in order to assess the effects of the independent variable.
It is the most important variable in a study.
e.g., The effect of sleep deprivation on driving performance.
o Driving performance is the DV, as it is being measured based on the effect of sleep deprivation.

Independent Variable (IV)
The variable that an experimenter uses to describe or explain the differences in the dependent variable or to cause changes in the dependent variable.
e.g., The effect of sleep deprivation on driving performance.
o Sleep deprivation is the IV since it is the one that is manipulated and affects driving performance.

Types of Independent Variables
o Subject Variable: based on a measurable characteristic of the subject that the experimenter does not directly change; a condition of the subject that exists before the research begins.
  § e.g., Diagnostic categories of clients participating in a study
o Manipulated or Experimental Variable: a variable that the experimenter systematically controls or manipulates and to which the subjects are assigned.
  § e.g., Amount of drug administered in a study

Data
Numbers or measurements that are collected as a result of observations.

Data Set
A collection of measurements or observations.

Population
A complete set of individuals, objects, or measurements having some common observable characteristic.

Sample
A subset of a population that shares the same characteristics as the population.

Statistic
A number resulting from the manipulation of sample data according to certain specified procedures; statistics for samples are variable rather than constant.
Statistics are numbers based on direct observation and measurement.
They are reported by Roman letters (e.g., x̄, s, r).

Parameter
A value summarizing a measurable characteristic of a total population that is estimated based on the value of a statistic.
Population parameters are constants and are often inferred values based on statistics.
They are represented by Greek letters (e.g., µ, σ, ρ).

If statistics can vary from sample to sample, how do we know that they can be considered representative of the population?
They can be considered representative when the sample is selected from the population using proper sampling methods, and statistical methods are used to check that it is reliable.

Relationship between a Population and a Sample
Figure 1.1 The sample is selected to represent a population.

SAMPLING
Simple Random Sample
o A subset of a population selected in such a way that each member of the population has an equal and independent opportunity to be selected.
Random Assignment
o Assigning subjects to different treatment conditions in such a way that each subject has an equal and independent opportunity to be placed in each treatment condition.
o The groups are equivalent to each other before the research begins.

Descriptive Statistics
A set of statistical procedures used to organize, summarize, and present the data collected in a research project.

Inferential Statistics
A collection of statistical procedures that allow one:
o to make generalizations about population parameters based on sample statistics
o to determine if there is a systematic relation between the IV and the DV
o to determine if there is a cause-and-effect relation between the IV and the DV
The goal is to detect meaningful and significant patterns in research results.
The basic question is whether the patterns observed in the sample data reflect corresponding patterns that exist in the population, or are simply random fluctuations that occur by chance.

Sampling Error
Informally called the "margin of error."
It is the naturally occurring discrepancy, or error, that exists between a sample statistic and the corresponding population parameter.
Whether the discrepancy matters depends on its size: a small difference usually does not matter, while a large one (such as a difference of 10) might.
Sample statistics vary from one sample to another and typically are different from the corresponding population parameters.
The natural differences that exist, by chance, between a sample statistic and a population parameter are called sampling error.

Figure 1.2 A demonstration of sampling error. Two samples are selected from the same population. Notice that the sample statistics are different from one sample to another and all the sample statistics are different from the corresponding population parameters.
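As a rough illustration of sampling error, the sketch below compares the means of two samples with the mean of the population they came from. The population values, the two samples, and the helper name `sampling_error` are all hypothetical and chosen only for this example; they are not taken from the module.

```python
from statistics import mean

def sampling_error(sample, population):
    """Sample mean minus population mean: the sampling error for the mean."""
    return mean(sample) - mean(population)

# Hypothetical population of 10 ages (illustrative values, not from the module).
population = [18, 19, 20, 20, 21, 22, 23, 24, 25, 28]

# Two hand-picked samples of n = 5 drawn from that population.
sample_1 = [18, 20, 21, 24, 25]
sample_2 = [19, 20, 22, 23, 28]

print("population mean:", mean(population))
print("sample 1 mean:", mean(sample_1), "error:", sampling_error(sample_1, population))
print("sample 2 mean:", mean(sample_2), "error:", sampling_error(sample_2, population))
```

Here the population mean is 22, while the sample means are 21.6 and 22.4: the statistics differ from each other and from the parameter, which is the pattern Figure 1.2 describes.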
In Figure 1.2, the average age, the average IQ, and the percentage of males and females in the two samples differ from the corresponding population parameters. This discrepancy is the sampling error.

The Inferential Function of Statistics
2 Types of Inferences
1. Parameter Estimation or Generalization
o involves using samples to make estimates about parameters.
o Most population parameters can never be directly known except in cases in which the population is small or there are sufficient funds to study the entire population.
2. Induction
o A conclusion based on observation and experience.
o It is said to be probably true.

Fundamentals of Research
3 Common Features of All Scientific Research
I. An attempt to answer empirical questions
II. The use of publicly verifiable information
Operational Definition
o The definition that a researcher uses to describe the processes by which an object, event, or construct is measured.
Direct Replication
o Repeating an experimental manipulation under the same conditions as the previous experiment.
Systematic Replication
o Repeating a previous experimental preparation but with one or more changes to the independent variable.
III. The use of systematic empiricism
Systematic empiricism suggests that we attempt to make our observations in a controlled manner. Specifically, we attempt to account for alternative hypotheses and determine if the behavior we are exploring is due to the specific conditions we are examining or due to other factors.

Data
The numerical form of information; the end product of most research efforts.
Our task is to organize, summarize, and interpret these data.

Statistics in the Context of Research
Statistical Analysis is concerned specifically with making sense out of data and permitting valid conclusions or inferences to be drawn from these data.
o Word of caution: The nature of the research design and the quality of the data we collect impose restrictions on the types of conclusions that can validly be drawn.
  § Research design and data quality limit the conclusions we can draw.
o No statistic is meaningful by itself. A single piece of data tells us nothing about relative frequencies and absolutely nothing about causation.
  § A single statistic lacks context and cannot show frequency patterns or cause-and-effect relationships.

[Figure: Role of Statistics in Research]

Methods of Gathering Information
Correlational Study
examines the relationship between variables
does not support cause-and-effect conclusions
always asks: Is there a relationship between...?

Case I: Correlation but no Cause and Effect
[Figure: Population | Random Selection | Sample (number of subjects) | IV, DV]
Statistical Analysis: Correlation coefficient between two variables
Statistical Inference: Correlation between variables; no cause and effect

Paradigm: Correlation
The line is without an arrow, as it indicates that the relationship between the IV and DV is not specified.
Research Question: Is there a relationship between parental attachment and marital satisfaction in adult life? (degree)
Hypothesis: There is a significant relationship between parental attachment and marital satisfaction in adult life.

Paradigm: Linear Regression
The arrow indicates that the relationship between the IV and DV is predicted.
Research Question: Does parental attachment predict marital satisfaction? (pattern)
Hypothesis: Parental attachment significantly predicts marital satisfaction.

Case II: Intact Group Design
Use of a subject variable as a grouping variable; significant difference among groups
no cause and effect
grouping is based on certain characteristics
looks for a difference on the basis of an independent variable (e.g., Will age make a difference in IQ?)
Statistical Analysis: Significant difference between groups
Statistical Inference: Difference between groups; no cause and effect

Intact Group/Causal-Comparative/Ex Post Facto/Differential Research Design
Research Question: Are there differences in cultural adaptation strategies used by foreign students based on their personality?
o Does personality influence the cultural adaptation strategies a foreign student uses?
Hypotheses: There are significant differences in cultural adaptation strategies used by foreign students based on their personality.
Personality significantly influences the cultural adaptation strategies a foreign student uses.

Intact Group Design
Grouping is based on a subject variable, that is, a measurable characteristic of the subject that the experimenter does not change.
Strength: It allows a researcher to determine if there is a systematic difference among different groups.
Weakness: It does not allow the researcher to determine whether the grouping variable causes changes in the dependent variable.
Confounding: A condition where the DV is affected by a variable that is related to the IV. Confounding prohibits one from assuming that the IV causes the DV.
o The difference may not be based on the IV but on other variables.

Case III: Experimental Design
Use of random assignment to groups to identify significant differences among groups and to make cause-and-effect inferences
Best research design, as the effect can be seen and observed in real time
Statistical Analysis: Comparison of group differences
Statistical Inference: Significant difference between groups, significant relation between IV and DV; cause and effect

Paradigm: Experimental Design
Research Question: Does motivational interviewing decrease symptoms of alcohol addiction better than group therapy?
o Does mindfulness-based CT decrease symptoms of anxiety?
Hypotheses: Motivational interviewing significantly decreases symptoms of alcohol addiction better than group therapy.
Mindfulness-based CT significantly decreases symptoms of anxiety.

Control Conditions in an Experiment
Experimental Condition
Individuals receive the experimental treatment.
Control Condition
Its purpose is to provide a baseline for comparison.
Individuals in a control condition do not receive the experimental treatment.
They either receive no treatment or they receive a neutral, placebo treatment.
The IV has no effect on this group.

Methods of Gathering Information
True Experiment
subjects are randomly assigned to a treatment condition
the researcher manipulates the independent variable and measures the dependent variable

Essential Elements of a True Experiment
1. The IV must be under the control of the experimenter.
2. The subjects must be randomly assigned to the treatments.
3. There must be controls for alternative hypotheses.

Establishing Cause and Effect Relations
Logical Ways:
1. Method of Agreement
Searching for the presence of one element whenever another element is present.
Both elements should be present to infer cause and effect.
e.g., Children who play violent video games have more aggressive behavior.
2. Method of Difference
Observing two situations that are alike in all ways except one.
If different effects are found, they are ascribed to the variable that was not common to the two situations.
e.g., The difference is the exposure of children to violent video games. Those who have exposure are more likely to have aggressive behavior than those without exposure.
3. Joint Method of Agreement and Difference
A method of investigation that combines the methods of agreement and difference.
It yields a more valid conclusion.

Methodological Way:
Double Blind
A control procedure wherein the subject and the researcher collecting the data are unaware of which experimental condition the subject is in.
o Placebo Effect: an improvement that appears to be caused by an inert treatment.

Measurement Scales
Constructs
internal attributes or characteristics that cannot be directly observed but are useful for describing and explaining behavior.

Operational Definition
identifies a measurement procedure (a set of operations) for measuring an external behavior and uses the resulting measurements as a definition and a measurement of a hypothetical construct.
2 Components of Operational Definition
o It describes a set of operations for measuring a construct.
o It defines the construct in terms of the resulting measurements.

Discrete Variable
consists of separate, indivisible categories.
No values can exist between two neighboring categories.
Can be counted.

Continuous Variable
There are an infinite number of possible values that fall between any two observed values.
A continuous variable is divisible into an infinite number of fractional parts.
Can be measured.

Real limits
boundaries of intervals for scores that are represented on a continuous number line.
located exactly halfway between the scores.
Each score has two real limits:
o upper real limit
o lower real limit
e.g., the real limit between the scores 150 and 151 is 150.5.
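The real-limits rule above can be sketched in a few lines. The function name `real_limits` and its `unit` parameter (the size of one measurement step) are assumptions made for this illustration, not notation from the module.

```python
def real_limits(score, unit=1.0):
    """Return (lower real limit, upper real limit) for a score measured in steps of `unit`."""
    half = unit / 2
    return score - half, score + half

lower, upper = real_limits(150)
print(lower, upper)          # 149.5 150.5
print(real_limits(151)[0])   # 150.5, the boundary shared by the scores 150 and 151
```

The upper real limit of 150 and the lower real limit of 151 are the same point, 150.5, which matches the example in the reviewer.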
Nominal Scale
consists of a set of categories that have different names.
Measurements on a nominal scale label and categorize observations, but do not make any quantitative distinctions between observations.
Examples:
o Sex of subject (male=0, female=1)
o DSM-5 diagnosis
o Religion of person
o Political party

Ordinal Scale
consists of a set of categories that are organized in an ordered sequence.
Measurements on an ordinal scale rank observations in terms of size or magnitude.
Examples:
o Rank in class (1st, 2nd, 3rd)
o Rank on personality dimension (high vs. low self-esteem)
o Aggressiveness Scale Scores (10=high aggression, 1=low aggression)

Interval Scale
consists of ordered categories that are all intervals of exactly the same size.
Equal differences between numbers on the scale reflect equal differences in magnitude.
o The zero point on an interval scale is arbitrary.
Examples:
o Temperature of 98.6 °F
o SAT verbal score of 540
o Most psychological test scores
o Raw score on a statistical test

Ratio Scale
is an interval scale with the additional feature of an absolute zero point.
With a ratio scale, ratios of numbers do reflect ratios of magnitude.
Examples:
o Height, weight, age, running speed
o Number of words recalled in a memory experiment
o Response latency
o Grams of food consumed
o Number of problems solved

Lesson 2: Frequency Tables, Percentiles, and Data Exploration

Frequency Distribution
is an organized tabulation of the number of individuals located in each category on the scale of measurement.
Two elements of Frequency Distribution:
1. The set of categories that make up the original measurement scale.
2. A record of the frequency, or number of individuals in each category.

X = categories of the measurement scale
f = frequency
fX = product of f and X
ΣX = sum of the fX values (ΣfX)

Frequencies and Percentages
p = relative frequency or proportion
p = f / n, where f is the frequency and n is the total number of observations
% = percentage form of p, that is, p(100)

Shape of a Frequency Distribution
Symmetrical Distribution
In a symmetrical distribution, it is possible to draw a vertical line through the middle so that one side of the distribution is a mirror image of the other.
It has balance.
A bell-shaped curve is a typical example.

[Figure: Bimodal distribution]

Skewed Distribution
the scores tend to pile up toward one end of the scale and taper off gradually at the other end.
Tail
the section where the scores taper off toward one end of a distribution.

Positively Skewed
a skewed distribution with the tail on the right-hand side; the tail points toward the positive (above-zero) end of the X-axis.
Mean is greater than the median.
e.g., the scores pile up on the left, so low scores are more prevalent.

Negatively Skewed
a skewed distribution with the tail on the left-hand side; the tail points toward the negative (below-zero) end of the X-axis.
Mean is less than the median.
e.g., the scores pile up on the right, so high scores are more plentiful than low scores.

Examples
1. A group of quiz scores ranging from 4-9 are shown in a histogram. If the bars in the histogram gradually increase in height from left to right, what can you conclude about the set of quiz scores?
a. There are more high scores than there are low scores.
b. There are more low scores than there are high scores.
c. The height of the bars always increases as the scores increase.
d. None of the above
Answer: a. *It is a negative skew, as the bars increase from left to right.

2. A set of scores is presented in a frequency distribution histogram. If the histogram shows a series of bars that tend to decrease in height from left to right, then what is the shape of the distribution?
a. symmetrical
b. positively skewed
c. negatively skewed
d. normal
Answer: b. *It is a positive skew, as the bars decrease from left to right.
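The frequency-distribution notation in this lesson (X, f, p = f / n, and %) can be sketched as a small table builder. The scores below are hypothetical, and `frequency_table` is an illustrative helper name rather than anything defined in the module.

```python
from collections import Counter

def frequency_table(scores):
    """Return rows of (X, f, p, percent), listed from the highest X to the lowest."""
    counts = Counter(scores)
    n = len(scores)
    rows = []
    for x in sorted(counts, reverse=True):
        f = counts[x]
        rows.append((x, f, f / n, 100 * f / n))
    return rows

# Hypothetical quiz scores (n = 10), for illustration only.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
table = frequency_table(scores)

print("X   f   p     %")
for x, f, p, percent in table:
    print(f"{x}   {f}   {p:.2f}  {percent:.0f}%")

# The sum of the fX products equals the sum of the raw scores.
sum_fx = sum(x * f for x, f, _, _ in table)
print("sum of fX =", sum_fx, "| sum of scores =", sum(scores))
```

The f column always sums to n and the p column to 1, which is a quick check that no score was dropped when the table was built.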
Rank or Percentile Rank
is defined as the percentage of individuals in the distribution with scores at or below the particular value.

Percentile
A score identified by its percentile rank.

[Table: frequency distribution with X, f, cf, and c% columns; n = 20]
cf = cumulative frequency
*To get the cf for a score, add all frequencies from the lowest value (X = 1) up to and including that score.
c% = cumulative percentage = (cf / N)(100)
N = total number of observations

1. What is the 95th percentile?
p = 95% = 0.95; n = 20 (add all f)
p = cf / n, so 0.95 = cf / 20
Multiply 0.95 by 20: cf = 19
The 95th percentile is X = 4.5.

2. What is the percentile rank for X = 3.5?
p = (cf / n)(100) = (14 / 20)(100)
The percentile rank is 70%.

Stem and Leaf Display
In 1977, J.W. Tukey presented a technique for organizing data that provides a simple alternative to a grouped frequency distribution table or graph.
This technique requires that each score be separated into two parts:
o The first digit (or digits) is called the stem, and
o the last digit is called the leaf.
e.g., X = 85 would be separated into a stem of 8 and a leaf of 5.

Arrange the data first from lowest to highest, then separate the stems and leaves according to tens (30s, 40s, 50s, ...):
32, 33
42, 46
52, 56, 57, 59
62, 63, 68
71, 73, 74, 74, 76, 76, 78
81, 82, 83, 85
93, 97

Box and Whisker Plot or Box Plot
is a way of summarizing a set of data measured on an interval scale. It is often used in exploratory data analysis. This type of graph is used to show the shape of the distribution, its central value, and its variability.
o The ends of the box are the upper and lower quartiles, so the box spans the interquartile range.
o The median is marked by a vertical line inside the box.
o The whiskers are the two lines outside the box that extend to the highest and lowest observations.

[Figure: box plot used for the questions below]
1. Where is the median?
Arrange the data from lowest to highest (values can repeat) and look at the middle for the median.
The median is 21.5.
2. What is the range of scores?
Look at the whiskers: the lower extreme and the upper extreme.
The range is from 15 to 29.
3. Where is the interquartile range?
The interquartile range is defined by the lower quartile and the upper quartile.
The interquartile range is 19 to 25.
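Returning to the stem and leaf display, the sketch below builds one from the 24 scores listed in this lesson. It assumes two-digit, non-negative whole-number scores, so the stem is the tens digit and the leaf is the ones digit; the function name `stem_and_leaf` is illustrative only.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Group each score's last digit (leaf) under its leading digit(s) (stem)."""
    display = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)   # e.g., 85 -> stem 8, leaf 5
        display[stem].append(leaf)
    return dict(display)

scores = [32, 33, 42, 46, 52, 56, 57, 59, 62, 63, 68,
          71, 73, 74, 74, 76, 76, 78, 81, 82, 83, 85, 93, 97]

for stem, leaves in stem_and_leaf(scores).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Each printed row corresponds to one row of the ordered data (30s, 40s, and so on), and the longest row, the 70s, shows where the scores pile up.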
Lesson 3: Measures of Central Tendency

Central tendency
is a statistical measure to determine a single score that defines the center of a distribution.
The goal is to find the single score that is most typical or most representative of the entire group.
Three measures of central tendency:
a. Mean
b. Median
c. Mode

Mean
also known as the arithmetic average.
is computed by adding all the scores in the distribution and dividing by the number of scores.

Mean for a population
is identified by the Greek letter mu, μ (pronounced "mew").
Formula: μ = ΣX / N

Mean for a sample
is identified by M or x̄ (read "x-bar").
The convention in many statistics textbooks is to use x̄ to represent the mean for a sample. However, in manuscripts and in published research reports the letter M is the standard notation for a sample mean.
Formula: M = ΣX / n

Alternative Definitions of the Mean
Dividing the Total Equally
The first alternative is to think of the mean as the amount each individual receives when the total (ΣX) is divided equally among all the individuals (N) in the distribution.
The Mean as a Balance Point for the Distribution
The mean is not necessarily located at the exact center of the group of scores. It is located in the middle of the distribution if you use the concept of distance to define the "middle."

The Weighted Mean
Also called the overall mean.
Formula: M = (ΣX1 + ΣX2) / (n1 + n2)
Example:
o Group 1 = 12 cases with a total score of 72
o Group 2 = 8 cases with a total score of 56
Solution: M = (72 + 56) / (12 + 8) = 128 / 20 = 6.4
Alternative Solution: [figure not reproduced]
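The weighted mean example above can be checked with a short sketch. The function name `weighted_mean` and the (ΣX, n) tuple layout are choices made for this illustration.

```python
def weighted_mean(groups):
    """Overall mean from (sum_of_scores, n) pairs: combined total / combined n."""
    total = sum(sum_x for sum_x, _ in groups)
    count = sum(n for _, n in groups)
    return total / count

# Group 1: n = 12, total = 72; Group 2: n = 8, total = 56 (values from the example).
groups = [(72, 12), (56, 8)]
overall = weighted_mean(groups)
print(overall)  # 6.4

# Simply averaging the two group means (6 and 7) would give 6.5, not 6.4,
# because Group 1 contributes more cases than Group 2.
naive = (72 / 12 + 56 / 8) / 2
print(naive)    # 6.5
```

The difference between 6.4 and 6.5 is the reason the totals and the ns are combined before dividing.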
Characteristics of the Mean
In general, these characteristics result from the fact that every score in the distribution contributes to the value of the mean. Specifically, every score adds to the total (ΣX) and every score contributes one point to the number of scores (n).
These two values (ΣX and n) determine the value of the mean.
Four of the more important characteristics of the mean:
1. Changing the value of any score will change the mean.
2. Adding a new score to a distribution, or removing an existing score, will usually change the mean. The exception is when the new score (or the removed score) is exactly equal to the mean.
3. If a constant value is added to or subtracted from every score in a distribution, the same constant will be added to or subtracted from the mean.
e.g., attractiveness ratings and alcohol consumption
4. If every score in a distribution is multiplied by (or divided by) a constant value, the mean will change in the same way.
Multiplying (or dividing) each score by a constant value is a common method for changing the unit of measurement.
e.g., To change a set of measurements from minutes to seconds, you multiply by 60; to change from inches to feet, you divide by 12.
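The four characteristics of the mean listed above can be verified on a small set of scores. The scores are hypothetical and were chosen so that the arithmetic is easy to follow by hand.

```python
from statistics import mean

scores = [3, 5, 8, 10, 14]                # hypothetical scores; mean = 40 / 5 = 8
m = mean(scores)

# 1. Changing one score (14 -> 24) changes the mean: 50 / 5 = 10.
changed = mean([3, 5, 8, 10, 24])

# 2. Removing a score equal to the mean (8) leaves the mean unchanged: 32 / 4 = 8.
removed = mean([3, 5, 10, 14])

# 3. Adding a constant (2) to every score adds the same constant to the mean.
shifted = mean([x + 2 for x in scores])

# 4. Multiplying every score by a constant (60, minutes to seconds) multiplies the mean.
scaled = mean([x * 60 for x in scores])

print(m, changed, removed, shifted, scaled)
```

Case 2 shows the stated exception: a score exactly equal to the mean can be added or removed without moving the mean.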
Median
If the scores in a distribution are listed in order from smallest to largest, the median is the midpoint of the list.
It is the point on the measurement scale below which 50% of the scores in the distribution are located.
It defines the middle of the distribution in terms of scores.
The goal of the median is to locate the midpoint of the distribution.
It is located so that half of the scores are on one side and half are on the other side.
There are no specific symbols or notation to identify the median. APA uses Mdn.
The definition and computations for the median are identical for a sample and for a population.

Examples:
Find the median:
3, 5, 8, 10, 11 = 8
1, 1, 4, 5, 7, 8 = 4.5
*If there are two middle scores, get the average of both midpoints.

Find the median for 1, 2, 3, 4, 4, 4, 4, 6
[Figure: a distribution with several scores clustered at the median]
The median for this distribution is positioned so that each of the four boxes at X = 4 is divided into two sections, with ¼ of each box below the median (to the left) and ¾ of each box above the median (to the right). As a result, there are exactly 4 boxes, 50% of the distribution, on each side of the median.

Using Interpolation to locate the Median
Interval containing the median: scores from X = 3.5 to X = 4.5, cumulative percentages from 37.5% to 87.5%.
1. For the scores, the width of the interval is 1 point. For the percentages, the width is 50 points.
2. The value of 50% is located 37.5 points down from the top of the percentage interval. As a fraction of the whole interval, this is 37.5 out of 50, or 0.75 of the total interval. Thus, the 50% point is located 0.75 or ¾ down from the top of the interval.
3. For the scores, the interval width is 1 point, and 0.75 down from the top of the interval corresponds to a distance of 0.75(1) = 0.75 points.
4. Because the top of the interval is 4.5, the position we want is 4.5 - 0.75 = 3.75.
For this distribution, the 50% point (the 50th percentile) corresponds to a score of X = 3.75. Note that this is exactly the same value that we obtained for the median in Example 3.9.
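The interpolation steps above can be sketched as a function. This version measures up from the lower real limit of the interval instead of down from the top, which gives the same point; the function name `interpolated_median` and the `unit` parameter are assumptions made for this example.

```python
def interpolated_median(scores, unit=1):
    """Locate the 50% point by interpolating inside the real limits of one score."""
    if not scores:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores)
    target = len(ordered) / 2                      # number of scores needed below the median
    for x in sorted(set(ordered)):
        below = sum(1 for s in ordered if s < x)   # scores below this interval
        f = ordered.count(x)                       # scores inside this interval
        if below + f >= target:
            lower_real_limit = x - unit / 2
            # Move up the interval by the fraction of its scores still needed.
            return lower_real_limit + (target - below) / f * unit

print(interpolated_median([1, 2, 3, 4, 4, 4, 4, 6]))   # 3.75
print(interpolated_median([3, 5, 8, 10, 11]))          # 8.0
```

For the first data set, 3 of the 8 scores lie below 3.5, so one more of the four scores at X = 4 is needed: 3.5 + (1/4)(1) = 3.75. This method treats scores as intervals on a continuous scale and is meant for tied scores at the middle; for an even number of widely spaced scores it will not always match the simple average of the two middle scores.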
When to use the Median
1. Extreme Scores or Skewed Distributions
When a distribution has a few extreme scores, scores that are very different in value from most of the others, then the mean may not be a good representative of the majority of the distribution. The problem comes from the fact that one or two extreme values can have a large influence and cause the mean to be displaced. In this situation, the fact that the mean uses all of the scores equally can be a disadvantage.

2. Undetermined Values
Occasionally, you will encounter a situation in which an individual has an unknown or undetermined score. This often occurs when you are measuring the number of errors (or amount of time) required for an individual to complete a task.
For example, suppose that preschool children are asked to assemble a wooden puzzle as quickly as possible. The experimenter records how long (in minutes) it takes each child to arrange all the pieces to complete the puzzle.
[Table: puzzle completion times, not reproduced]

3. Open-ended Distributions
A distribution is said to be open-ended when there is no upper limit (or lower limit) for one of the categories.
[Table: number of pizzas eaten during a 1-month period for a sample of n = 20]
In the table, the top category in this distribution shows that three of the students consumed "5 or more" pizzas. This is an open-ended category. Notice that it is impossible to compute a mean for these data because you cannot find ΣX (the total number of pizzas for all 20 students).
For these data, the median is 1.5.

4. Ordinal Scale
When scores are measured on an ordinal scale, the median is always appropriate and is usually the preferred measure of central tendency.
Ordinal measurements allow you to determine direction (greater than or less than) but do not allow you to determine distance.
The median is compatible with this type of measurement because it is defined by direction: half of the scores are above the median and half are below the median.
Because the mean is defined in terms of distances, and because ordinal scales do not measure distance, it is not appropriate to compute a mean for scores from an ordinal scale.
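The first case in the list above (extreme scores) can be illustrated with a small comparison. Both score sets are hypothetical; the only difference between them is one extreme value.

```python
from statistics import mean, median

typical = [10, 11, 12, 13, 14]        # hypothetical scores, no extreme values
with_extreme = [10, 11, 12, 13, 100]  # same scores, but the last one is extreme

print("mean:", mean(typical), "->", mean(with_extreme))
print("median:", median(typical), "->", median(with_extreme))
```

The single extreme score pulls the mean from 12 up to 29.2, while the median stays at 12, which is why the median is preferred for distributions like this.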
Mode
is the score or category that has the greatest frequency in a distribution.
There are no symbols or special notation used to identify the mode or to differentiate between a sample mode and a population mode.
The definition of the mode is the same for a population and for a sample distribution.
It can be used to determine the typical or most frequent value for any scale of measurement, including a nominal scale.
It is a score or category, not a frequency.
It is possible to have more than one mode. Specifically, it is possible to have two or more scores that have the same highest frequency.
Bimodal
distribution with two modes
Multimodal
distribution with more than two modes
No Mode
distribution with several equally high points

What is the mode for this sample?
[Table not reproduced]
Answer: Luigi's

When to use the Mode
1. Nominal Scales
The primary advantage of the mode is that it can be used with data measured on a nominal scale.
Recall that the categories that make up a nominal scale are differentiated only by name, such as classifying people by occupation or college major. Because nominal scales do not measure quantity (distance or direction), it is impossible to compute a mean or a median for data from a nominal scale.

2. Discrete Variables
Recall that discrete variables are those that exist only in whole, indivisible categories. Often, discrete variables are numerical values, such as the number of children in a family or the number of rooms in a house. When these variables produce numerical scores, it is possible to calculate means. However, the calculated means are usually fractional values that cannot actually exist.
For example, computing means will generate results such as "the average family has 2.4 children and a house with 5.33 rooms." The mode, on the other hand, always identifies an actual score (the most typical case).

3. Describing Shape
Because the mode requires little or no calculation, it is often included as a supplementary measure along with the mean or median as a no-cost extra.
The value of the mode (or modes) in this situation is that it gives an indication of the shape of the distribution as well as a measure of central tendency.
Remember that the mode identifies the location of the peak (or peaks) in the frequency distribution graph. For example, if you are told that a set of exam scores has a mean of 72 and a mode of 80, you should have a better picture of the distribution than would be available from the mean alone.
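Finding the mode (or modes) can be sketched with a frequency count. The category labels below are hypothetical, and the rule that a distribution in which every category ties has no mode is this sketch's reading of the "No Mode" entry above.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); works for nominal categories as well as scores."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    winners = [value for value, f in counts.items() if f == top]
    # Assumption: if every category ties (and there is more than one), report no mode.
    if len(winners) == len(counts) and len(counts) > 1:
        return []
    return winners

print(modes(["red", "blue", "blue", "green"]))  # ['blue']  (one mode)
print(modes([1, 1, 2, 2, 3]))                   # [1, 2]    (bimodal)
print(modes([1, 2, 3]))                         # []        (no mode)
```

Note that the function returns the category itself, not its frequency, in line with the point that the mode is a score or category.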
Presenting Means and Medians in Graphs
Graphs
can be used to report and compare measures of central tendency.
A graph allows several means (or medians) to be shown simultaneously, so it is possible to make quick comparisons between groups or treatment conditions.
When preparing a graph,
o list the different groups or treatment conditions on the horizontal axis. Typically, these are the different values that make up the independent variable or the quasi-independent variable.
o Values for the dependent variable (the scores) are listed on the vertical axis.
The means (or medians) are then displayed using a line graph, histogram, or bar graph, depending on the scale of measurement used for the independent variable.

Line Graph
is used when the values on the horizontal axis are measured on an interval or ratio scale.
[Figure: line graph of mean food consumption by drug dose]
The figure shows an example of a line graph displaying the relationship between drug dose (the independent variable) and food consumption (the dependent variable). In this study, there were five different drug doses (treatment conditions) and they are listed along the horizontal axis. The five means appear as points in the graph.
To construct this graph, a point was placed above each treatment condition so that the vertical position of the point corresponds to the mean score for the treatment condition. The points are then connected with straight lines.

Histogram
An alternative to the line graph.
For this example, the histogram would show a bar above each drug dose so that the height of each bar corresponds to the mean food consumption for that group, with no space between adjacent bars.

Bar Graph
is used to present means or medians when the groups or treatments shown on the horizontal axis are measured on a nominal or an ordinal scale.
[Figure: bar graph of median selling price by region]
The figure shows a bar graph displaying the median selling price for single-family homes in different regions of the United States.
To construct a bar graph, you simply draw a bar directly above each group or treatment so that the height of the bar corresponds to the mean (or median) for that group or treatment.
For a bar graph, a space is left between adjacent bars to indicate that the scale of measurement is nominal or ordinal.

Basic rules in constructing graphs of any type:
1. The height of a graph should be approximately two-thirds to three-quarters of its length.
2. Normally, you start numbering both the X-axis and the Y-axis with zero at the point where the two axes intersect. However, when a value of zero is part of the data, it is common to move the zero point away from the intersection so that the graph does not overlap the axes.
variability in the data influences how This reviewer is not for sale. FIRST SEMESTER | BACHELOR OF SCIENCE IN PSYCHOLOGY | BACHELOR OF ARTS IN PSCYHOLOGY PSYCHOLOGICAL STATISTICS PRELIMS REVIEWER easy it is to see patterns. In general, When the scores are measurements of low variability means that existing a continuous variable, the range can patterns can be seen clearly, be defined as the difference between whereas high variability tends to the upper real limit (URL) for the obscure any patterns that might largest score (Xmax) and the lower exist. real limit (LRL) for the smallest score (Xmin) Purpose for measuring variability range = URL for 𝑋-./ − is to obtain an objective measure of LRL for 𝑋-0" how the scores are spread out in a when the scores are all whole distribution. In general, a good numbers, the range can be obtained measure of variability serves two by purposes: 𝑋-./ − 𝑋-0" + 1 o 1. Variability describes the distribution. Specifically, it tells whether the scores are clustered Which of the following sets of scores has the close together or are spread out greatest variability? over a large distance. Usually, a. 3, 5, 7, 10, 11, 13 variability is defined in terms of b. 23, 25, 26, 27 distance. o 2. Variability measures how well c. 34, 35, 36, 37 an individual score (or group of d. 42, 44, 45, 46 scores) represents the entire The lowest value is 3, and the highest distribution. This aspect of is 13, therefore the range is 10. variability is very important for How many scores in the distribution are used inferential statistics, in which to compute the range? relatively small samples are used to answer questions about a. only 1 populations b. 2 Measures of Variability c. 50% of them 1. Range d. all of the scores the distance covered by the scores in The lowest score is subtracted from a distribution, from the smallest score the highest score. to the largest score. range = 𝑋-./ − 𝑋-0" This reviewer is not for sale. 
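The mode and the range described above can be sketched in a few lines of Python. This is only an illustration: the function names `modes` and `score_range`, the `use_real_limits` flag, and the list of college majors are assumptions made for this example, not part of the module.

```python
from collections import Counter

def modes(values):
    """Return every value that shares the highest frequency (one or more modes)."""
    counts = Counter(values)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

def score_range(scores, use_real_limits=False):
    """Range as Xmax - Xmin, or Xmax - Xmin + 1 when real limits
    are used for whole-number scores."""
    spread = max(scores) - min(scores)
    return spread + 1 if use_real_limits else spread

# Hypothetical nominal data: the mode works even though no mean can be computed.
majors = ["Psychology", "Biology", "Psychology", "Nursing"]
print(modes(majors))                      # ['Psychology']

# Hypothetical bimodal set: two scores share the highest frequency.
print(modes([2, 3, 3, 5, 5]))             # [3, 5]

# Set (a) from the practice item above.
print(score_range([3, 5, 7, 10, 11, 13]))                        # 10
print(score_range([3, 5, 7, 10, 11, 13], use_real_limits=True))  # 11
```

Note that only the two extreme scores enter the range, which is why it can change sharply when a single extreme value is added.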
2. Standard Deviation
The standard deviation uses the mean of the distribution as a reference point and measures variability by considering the distance between each score and the mean.
It is the most commonly used and the most important measure of variability.
It is a measure of the average distance or difference between each score and the mean.

Steps in computing SD:
1. Determine the deviation, or distance from the mean, for each individual score.
$\text{deviation score} = X - \mu$
2. Calculate the mean of the deviation scores.
For example, we start with a set of N = 4 scores. These scores add up to $\Sigma X = 12$, so the mean is $\mu = \frac{12}{4} = 3$. For each score, we compute the deviation. The positive and negative deviations cancel each other out, so their sum (and therefore their mean) is always zero.
*To correct this, proceed to step 3.
3. Compute the variance, or the mean squared deviation (the average squared distance from the mean). This solves the problem of the positive and negative values canceling each other out in step 2.
4. Get the square root of the variance to obtain the standard deviation, which measures the standard distance from the mean.
5. Standard deviation is the square root of the variance and provides a measure of the standard, or average, distance from the mean.
$\text{standard deviation} = \sqrt{\text{variance}}$

Worked example (N = 5):
o Get the mean first (add all X, then divide by 5). The mean is 6.
o Subtract the mean from every score.
o Square each deviation so that the values do not sum to zero.
o Add the squared deviations to get the sum of squared deviations (SS).
o Divide SS by N = 5 (40/5 = 8). The variance is 8.
o To get the standard deviation, take the square root of 8: $\sqrt{8} = 2.83$.

[Figure: A frequency distribution histogram for a population of N = 5 scores. The mean for this population is μ = 6. The smallest distance from the mean is 1 point and the largest distance is 5 points. The standard distance (or standard deviation) should be between 1 and 5 points.]

What is the variance for the following set of scores? 2, 2, 2, 2, 2
a. 0
b. 2
c. 4
d. 5
Answer: a. There is no difference between the scores.

Measuring Variance and Standard Deviation for a Population
$\text{variance} = \text{mean squared deviation} = \frac{\text{sum of squared deviations}}{\text{number of scores}}$
SS, or sum of squares, is the sum of the squared deviation scores.

Definitional formula: $SS = \Sigma(X - \mu)^2$
Steps:
1. Find each deviation score, $(X - \mu)$.
2. Square each deviation score, $(X - \mu)^2$.
3. Add the squared deviations: $SS = \Sigma(X - \mu)^2$.
In short, to get the sum of squares (SS): get the mean, subtract the mean from each score, square each deviation, and add all the squared deviations. The result is the SS.

Another formula finds the SS without using the mean:
$SS = \Sigma X^2 - \frac{(\Sigma X)^2}{N}$
o $\Sigma X$: sum of the scores
o $\Sigma X^2$: sum of the squared scores

Mean Square (MS) is often used to refer to variance, which is the mean squared deviation.

Population variance is represented by the symbol $\sigma^2$ and equals the mean squared distance from the mean. It is obtained by dividing the sum of squares by N.
$\sigma^2 = \frac{SS}{N}$
Population standard deviation is represented by the symbol $\sigma$ and equals the square root of the population variance.
$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{SS}{N}}$

Example: For the population of N = 4 scores 1, 4, 6, 1, find:
a) The sum of the squared deviations
$\mu = 3$
$SS = \Sigma(X - \mu)^2 = (1-3)^2 + (4-3)^2 + (6-3)^2 + (1-3)^2 = 18$
b) The variance
$\sigma^2 = \frac{SS}{N} = \frac{18}{4} = 4.5$
c) The standard deviation
$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{SS}{N}} = \sqrt{4.5} = 2.12$

Measuring Standard Deviation and Variance for a Sample

The Problem with Sample Variability
The goal of inferential statistics is to use the limited information from samples to draw general conclusions about populations. The basic assumption of this process is that samples should be representative of the populations from which they come.
A sample statistic is said to be biased if, on the average, it consistently overestimates or underestimates the corresponding population parameter.
[Figure: normal distribution of adult heights]
The population of adult heights forms a normal distribution. If you select a sample from this population, you are most likely to obtain individuals who are near average in height. As a result, the variability for the scores in the sample is smaller than the variability for the scores in the population. The sample will most likely be taken from the area where the majority of people are (the colored area in the graph).

Sample variance is represented by the symbol $s^2$ and equals the mean squared distance from the mean. It is obtained by dividing the sum of squares by n - 1.
$s^2 = \frac{SS}{n-1}$
Sample standard deviation is represented by the symbol $s$ and equals the square root of the sample variance.
$s = \sqrt{s^2} = \sqrt{\frac{SS}{n-1}}$
For a sample, the sum of squares is computed in the same way:
$SS = \Sigma X^2 - \frac{(\Sigma X)^2}{n}$
Subtracting 1 makes the denominator smaller, which makes the variance bigger.
Remember, sample variability tends to underestimate population variability unless some correction is made.

In general, with a sample of n scores, the first n - 1 scores are free to vary, but the final score is restricted. As a result, the sample is said to have n - 1 degrees of freedom. In a sample of n = 3 scores with a fixed mean, for example, the third score cannot change once the first two are set, so the number of scores that can vary is limited.
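The population example above (scores 1, 4, 6, 1) can be checked with a short Python sketch. The function names and the `sample` flag are assumptions introduced for illustration; treating the same four scores as a sample in the last line is also hypothetical and is shown only to contrast the N and n - 1 denominators.

```python
import math

def sum_of_squares(scores):
    """Definitional formula: SS = sum of (X - mean)^2."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def sum_of_squares_computational(scores):
    """Computational formula: SS = sum(X^2) - (sum X)^2 / N."""
    n = len(scores)
    return sum(x ** 2 for x in scores) - sum(scores) ** 2 / n

def variance(scores, sample=False):
    """Divide SS by N for a population, or by n - 1 for a sample."""
    n = len(scores)
    denominator = n - 1 if sample else n
    return sum_of_squares(scores) / denominator

def standard_deviation(scores, sample=False):
    """Square root of the variance."""
    return math.sqrt(variance(scores, sample))

scores = [1, 4, 6, 1]
print(sum_of_squares(scores))                # 18.0
print(sum_of_squares_computational(scores))  # 18.0
print(variance(scores))                      # 4.5
print(round(standard_deviation(scores), 2))  # 2.12

# Hypothetical: the same scores treated as a sample (divide by n - 1 = 3).
print(variance(scores, sample=True))         # 6.0
```

Both SS formulas give the same result; the computational version is convenient when the mean is not a whole number.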
Sample Variability and Degrees of Freedom

Degrees of freedom
Written as df.
For sample variance, degrees of freedom are defined as df = n - 1.
They determine the number of scores in the sample that are independent and free to vary.
They are a form of correction: we acknowledge that we are only estimating.

If sample variance is computed by dividing by n, instead of n - 1, how will the obtained values be related to the corresponding population variance?
A. They will consistently underestimate the population variance.
B. They will consistently overestimate the population variance.
C. The average value will be exactly equal to the population variance.
D. The average value will be close to, but not exactly equal to, the population variance.
Answer: A. A sample is most likely drawn from the middle of the distribution, close to the mean, which is where the majority of the scores are. The variability is therefore limited; it gets smaller and is underestimated.

Transformations of Scale
Adding a constant to each score does not change the standard deviation.
X1 = 41, X2 = 43, μ = 40, σ = 10
o If 2 is added to 41 and 43, their difference is still 2.
Multiplying each score by a constant causes the standard deviation to be multiplied by the same constant.
o If 41 and 43 are each multiplied by 2, they become 82 and 86, and their difference is 4.

Biased and Unbiased Statistics
Unbiased
A statistic is unbiased if the average value of the statistic is equal to the population parameter. (The average value of the statistic is obtained from all the possible samples for a specific sample size, n.)
Biased
A statistic is biased if the average value of the statistic either underestimates or overestimates the corresponding population parameter.
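The idea of a biased versus an unbiased statistic can be illustrated by listing every possible sample and averaging the two versions of sample variance. The sketch below assumes a small hypothetical population of three scores (0, 3, 9) and samples of size n = 2 drawn with replacement; these numbers are not taken from the module and are chosen only to keep the arithmetic small.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population, chosen only for illustration.
population = [0, 3, 9]

def variance(scores, denominator):
    """Sum of squared deviations divided by the given denominator."""
    mean = Fraction(sum(scores), len(scores))
    ss = sum((x - mean) ** 2 for x in scores)
    return ss / denominator

n = 2
# Every possible sample of size n, drawn with replacement (3 x 3 = 9 samples).
samples = list(product(population, repeat=n))

population_variance = variance(population, len(population))
average_dividing_by_n = sum(variance(s, n) for s in samples) / len(samples)
average_dividing_by_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)

print(population_variance)            # 14
print(average_dividing_by_n)          # 7
print(average_dividing_by_n_minus_1)  # 14
```

Under these assumptions, dividing by n averages to 7, which underestimates the population variance of 14, while dividing by n - 1 averages to exactly 14.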
In the biased versus unbiased comparison, the mean is computed for each sample, and the variance is computed two different ways:
(1) dividing by n, which is incorrect and produces a biased statistic; and
(2) dividing by n - 1, which is correct and produces an unbiased statistic.

Presenting the Mean and Standard Deviation in a Frequency Distribution Graph
[Figure not reproduced]

Error Variance
In the context of inferential statistics, this is the variance that exists in a set of sample data.
The term is used to indicate that the sample variance represents unexplained and uncontrolled differences between scores.
As the error variance increases, it becomes more difficult to see any systematic differences or patterns that might exist in the data.
[Figure: Graphs showing the results from two experiments. In experiment A, the variability is small and it is easy to see the 5-point mean difference between the two treatments. In experiment B, the 5-point mean difference between treatments is obscured by the large variability.]

Lesson 5: z-Scores: Location of Scores and Standardized Distributions

A raw score by itself does not necessarily provide much information about its position within a distribution.
The process of transforming X values into z-scores serves two useful purposes:
o 1. Each z-score tells the exact location of the original X value within the distribution.
o 2. The z-scores form a standardized distribution that can be directly compared to other distributions that also have been transformed into z-scores (e.g., to determine and interpret IQ scores).

Suppose you received a score of X = 76 on a statistics exam. How did you do? It depends on the mean and standard deviation.
[Figure: Two distributions of exam scores. For both distributions μ = 70, but for one distribution σ = 3 and for the other σ = 12. The relative position of X = 76 is very different for the two distributions.]
o Figure a (σ = 3): X = 76 is 2 SD above the mean, so the score is high relative to the mean.
o Figure b (σ = 12): X = 76 is only half an SD above the mean, so the score is close to average, not very high.

z-score
A z-score specifies the precise location of each X value within a distribution.
The sign of the z-score (+ or −) signifies whether the score is above the mean (positive) or below the mean (negative).
o Positive = right side of the mean
o Negative = left side of the mean
The numerical value of the z-score specifies the distance from the mean by counting the number of standard deviations between X and μ.
The locations identified by z-scores are the same for all distributions, no matter what mean or standard deviation the distributions may have.
o The mean is always 0 and the standard deviation is always 1 in a z-score distribution.
[Figure: The relationship between z-score values and locations in a population distribution.]

z-score formula:
$z = \frac{X - \mu}{\sigma}$

Example:
A distribution of scores has a mean of μ = 86 and a standard deviation of σ = 7. What z-score corresponds to a score of X = 95 in this distribution?
$z = \frac{95 - 86}{7} = \frac{9}{7} \approx 1.29$
In the z-score distribution, the mean (X = 86) is located at z = 0 and one standard deviation equals 1, so z = +1.29 is equivalent to the score of X = 95.

A distribution of scores has a mean of μ = 100 and a standard deviation of σ = 10. What z-score corresponds to a score of X = 130 in this distribution?
$z = \frac{130 - 100}{10} = +3.00$

Examples:
Of the following z-score values, which one represents the most extreme location on the left-hand side of the distribution?
a. z = +1.00
b. z = +2.00
c. z = −1.00
d. z = −2.00
Answer: d. The left side is the negative side.

For a population with μ = 100 and σ = 20, what is the z-score corresponding to X = 105?
a. +0.25
b. +0.50
c. +4.00
d. +5.00
Answer: a.
$z = \frac{X - \mu}{\sigma} = \frac{105 - 100}{20} = +0.25$

Determining a Raw Score from a z-Score
For a distribution with a mean of μ = 60 and σ = 8, what X value corresponds to a z-score of z = −1.50?
Formula: $X = \mu + z\sigma$
$X = 60 + (-1.50)(8)$
$X = 60 + (-12)$
$X = 48$
X is 12 points below the mean, or X = 48.

Standardized Scores
A distribution is standardized by transforming the scores into a new distribution with a predetermined mean and standard deviation that are whole round numbers.
The goal is to create a new (standardized) distribution that has "simple" values for the mean and standard deviation but does not change any individual's location within the distribution.
Whole numbers are used for easier interpretation.
Examples:
o SAT scores: μ = 500, σ = 100
o IQ tests: μ = 100, σ = 15

Using z-Scores to Standardize a Distribution
If every X value is transformed into a z-score, then the distribution of z-scores will have the following properties:
1. Shape. The distribution of z-scores will have exactly the same shape as the original distribution of scores. If the original distribution is negatively skewed, for example, then the z-score distribution will also be negatively skewed. Transforming raw scores into z-scores does not change anyone's position in the distribution. For example, any raw score that is above the mean by 1 standard deviation will be transformed to a z-score of +1.00.
2. The Mean. The z-score distribution will always have a mean of zero. In Figure 5.6, the original distribution of X values has a mean of μ = 100. When this value, X = 100, is transformed into a z-score, the result is
$z = \frac{100 - 100}{10} = 0$
3. The Standard Deviation. The distribution of z-scores will always have a standard deviation of 1. In Figure 5.6, the original distribution of X values has μ = 100 and σ = 10. In this distribution, a value of X = 110 is above the mean by exactly 10 points, or 1 standard deviation. When X = 110 is transformed, it becomes z = +1.00, which is above the mean by exactly 1 point in the z-score distribution.
o Thus, the standard deviation corresponds to a 10-point distance in the X distribution and is transformed into a 1-point distance in the z-score distribution.
o The advantage of having a standard deviation of 1 is that the numerical value of a z-score is exactly the same as the number of standard deviations from the mean. For example, a z-score of z = 1.50 is exactly 1.50 standard deviations from the mean.
[Figure 5.6: An entire population of scores is transformed into z-scores. The transformation does not change the shape of the distribution, but the mean is transformed into a value of 0 and the standard deviation is transformed to a value of 1.]

Transforming z-Scores to a Distribution with a Predetermined μ and σ
The procedure for standardizing a distribution to create new values for μ and σ is a two-step process:
1. The original raw scores are transformed into z-scores.
2. The z-scores are then transformed into new X values so that the specific μ and σ are attained.

Example:
An instructor gives an exam to a psychology class. For this exam, the distribution of raw scores has a mean of μ = 57 with σ = 14. The instructor would like to simplify the distribution by transforming all scores into a new, standardized distribution with μ = 50 and σ = 10. To demonstrate this process, we will consider what happens to two specific students: Maria, who has a raw score of X = 64 in the original distribution, and Joe, whose original raw score is X = 43.
Step 1: Transform each of the original raw scores into z-scores.
o Maria: X = 64, $z = \frac{64 - 57}{14} = +0.5$
o Joe: X = 43, $z = \frac{43 - 57}{14} = -1.0$
Remember: the values of μ and σ are for the distribution from which X was taken.
Step 2: Change each z-score into an X value in the new standardized distribution that has a mean of μ = 50 and a standard deviation of σ = 10. Apply the formula $X = \mu + z\sigma$.
o Maria's z-score, z = +0.50, indicates that she is located above the mean by 1/2 standard deviation. In the new, standardized distribution, this location corresponds to X = 55 (above the mean by 5 points).
$X = 50 + (0.5)(10) = 55$
o Joe's z-score, z = −1.00, indicates that he is located below the mean by exactly 1 standard deviation. In the new distribution, this location corresponds to X = 40 (below the mean by 10 points).
$X = 50 + (-1.00)(10) = 40$
[Figure 5.9: The distribution of exam scores from example 5.9. The original distribution was standardized to produce a distribution with μ = 50 and σ = 10. Note that each individual is identified by an original score, a z-score, and a new, standardized score. For example, Joe has an original score of 43, a z-score of −1.00, and a standardized score of 40.]

Example:
1. A distribution with μ = 47 and σ = 6 is being standardized so that the new mean and standard deviation will be μ = 100 and σ = 20. What is the standardized score for a person with X = 56 in the original distribution?
a. 110
b. 115
c. 120
d. 130
Answer: d.
$z = \frac{X - \mu}{\sigma} = \frac{56 - 47}{6} = 1.5$
$X = \mu + z\sigma = 100 + (1.5)(20) = 130$

2. A distribution with μ = 35 and σ = 8 is being standardized so that the new mean and standard deviation will be μ = 50 and σ = 10. In the new, standardized distribution your score is X = 60. What was your score in the original distribution?
a. X = 45
b. X = 43
c. X = 1.00
d. impossible to determine without more information
Answer: b. Do it in reverse: find the z-score in the standardized distribution, then convert it back to the original distribution.
$z = \frac{X - \mu}{\sigma} = \frac{60 - 50}{10} = 1$
$X = \mu + z\sigma = 35 + (1)(8) = 43$

Standardizing a Sample Distribution
The same principles can be used to identify individual locations within a sample, provided that you use the sample mean and the sample standard deviation to specify each z-score location. Thus, for a sample, each X value is transformed into a z-score so that:
o 1. The sign of the z-score indicates whether the X value is above (+) or below (−) the sample mean, and
o 2. The numerical value of the z-score identifies the distance from the sample mean by measuring the number of sample standard deviations between the score (X) and the sample mean (M).
Formula: $z = \frac{X - M}{s}$
o The sample of z-scores will have a standard deviation of $s_z = 1$.
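The z-score formulas in this lesson can be sketched as three small Python functions. The function names are assumptions made for this illustration; the numbers come from the worked examples above (Maria and Joe, and the practice items).

```python
def z_score(x, mean, sd):
    """z = (X - mean) / sd: signed distance from the mean in SD units."""
    return (x - mean) / sd

def raw_score(z, mean, sd):
    """X = mean + z * sd: convert a z-score back to a raw score."""
    return mean + z * sd

def standardize(x, old_mean, old_sd, new_mean, new_sd):
    """Two-step process: raw score -> z-score -> score in the new distribution."""
    return raw_score(z_score(x, old_mean, old_sd), new_mean, new_sd)

# Maria and Joe: original distribution mean 57, SD 14; new distribution mean 50, SD 10.
print(z_score(64, 57, 14))              # 0.5
print(z_score(43, 57, 14))              # -1.0
print(standardize(64, 57, 14, 50, 10))  # 55.0
print(standardize(43, 57, 14, 50, 10))  # 40.0

# Determining a raw score from a z-score: mean 60, SD 8, z = -1.50.
print(raw_score(-1.50, 60, 8))          # 48.0

# Working in reverse: a standardized score of 60 (mean 50, SD 10)
# maps back to the original distribution (mean 35, SD 8).
print(standardize(60, 50, 10, 35, 8))   # 43.0
```

The same functions apply to a sample if the sample mean M and sample standard deviation s are passed in place of μ and σ.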
