Summary

These notes cover fundamental concepts in statistics, including populations, samples, variables, data, parameters, statistics, descriptive and inferential statistics, and the correlational and experimental methods. The notes also discuss how to control variables in research designs.

Full Transcript

Wednesday January 8 2025

Statistics
● Help to organize and summarize data
● Help researchers answer their research questions by determining which general conclusions are justified

Population And Samples
● Population: the entire set of individuals of interest
● Sample: the set of individuals selected from the population
    ○ Intended to represent the population

Variables And Data
● Variable: a characteristic or condition that changes or has different values for different individuals
● Data (plural): measurements or observations
    ○ Data set: a collection of measurements or observations
    ○ Datum (singular): a single measurement or observation
        ■ Also called a score or raw score

Parameters And Statistics
● Parameter: a value that describes a population (both start with the letter P)
    ○ Usually derived from measurements of the individuals in the population
● Statistic: a value that describes a sample (both start with the letter S)
    ○ Usually derived from measurements of the individuals in the sample

Descriptive And Inferential Statistics
● Descriptive statistics summarize, organize, and simplify data
● Inferential statistics are techniques that allow us to study samples and then make generalizations about the populations they came from
● Sampling error is the naturally occurring discrepancy (error) between a sample statistic and the corresponding population parameter
● The goal of inferential statistics is to help researchers decide between two interpretations of a result: the difference seen in the samples is just sampling error, or it reflects a real difference in the population

Individual Variables And Relationships Among Variables
● Some studies describe individual variables
● Most studies are intended to examine the relationship between two or more variables

The Correlational Method
● Observe two different variables to determine whether there is a relationship between them
    ○ Correlation: used when the variables are numerical scores
    ○ Chi-square test: used when the variables are categories
● The results from a correlational study can demonstrate the existence of a relationship between two variables
    ○ They do not provide an explanation for the relationship
● A correlational study cannot demonstrate a cause-and-effect relationship

Experimental And Nonexperimental Methods
● Examine the relationship between variables by using one of the variables to define groups, and then measuring
the second variable to obtain scores for each group
    ○ Experimental study: results allow for a cause-and-effect explanation
    ○ Nonexperimental study: results do not permit a cause-and-effect explanation

The Experimental Method
● The goal of an experimental study is to demonstrate a cause-and-effect relationship
● 2 distinguishing characteristics:
    ○ Manipulation
    ○ Control
● 2 general categories of variables to control:
    ○ Participant: age, gender, and intelligence
    ○ Environmental: lighting, time of day, and weather conditions
● 3 methods to control variables:
    ○ Random assignment: assign participants to each group randomly
    ○ Matching: match the demographics in the two groups (if there is a 45-year-old female in group A, put another 45-year-old female in group B)
    ○ Holding constant: only sampling a specific demographic (for example, only selecting male participants)

Terminology In The Experimental Method
● Independent variable: the variable manipulated by the researcher
    ○ In behavioural research:
        ■ Usually consists of two or more treatment conditions
        ■ Participants are exposed to the conditions chosen by the researcher
    ○ Antecedent conditions:
        ■ Conditions that are present or manipulated prior to observing the dependent variable
● Dependent variable: the variable observed to assess the effect of the independent variable (in the video game example, it would be aggression)
● Control condition: participants do not receive the experimental treatment
    ○ They receive either no treatment or a neutral (placebo) treatment
    ○ Purpose: provide a baseline for comparison (this is different from exerting control over your variables)
● Experimental condition: participants receive the experimental treatment

Nonexperimental Method: Nonequivalent Groups
● Nonequivalent groups study: the researcher has no ability to control which group participants are assigned to
    ○ Compares pre-existing groups

Nonexperimental Method: Pre-Post Studies
● Pre-post study: the same variable is measured twice for each participant, once before and again after the
treatment
    ○ No control over the passage of time
    ○ “Before” scores are always earlier than the “after” scores
    ○ A difference may be caused by the treatment, but it is possible that the scores simply change as time goes on

Terminology In Nonexperimental Research
● The variable that defines the groups is not manipulated by the researcher, so instead of “independent variable” (which can be manipulated) it is called a quasi-independent variable (which cannot be manipulated)

Friday January 10 2025

Constructs And Operational Definitions
● Constructs: attributes and characteristics that cannot be directly observed but are useful for describing or explaining behaviour (an example is happiness)
● Operational definition:
    ○ Identifies a measurement procedure for measuring an external behaviour (happiness could be measured by how many times a person smiles during a conversation)
    ○ Uses the resulting measurements as a definition and measurement of a hypothetical construct

Types Of Variables
● Discrete variable:
    ○ Separate, indivisible categories
    ○ No values exist between categories
    ○ Example: dogs vs cats
● Continuous variable:
    ○ Contains an infinite number of possible values
    ○ Divisible into an infinite number of fractional parts
    ○ Example: temperature

Continuous Variables
● Very rare to obtain identical measurements for two different individuals
    ○ Each measurement category is an interval that must be defined by boundaries
    ○ Real limits: the boundaries of intervals for scores that are represented on a continuous number line
        ■ Upper real limit: at the top of the interval
        ■ Lower real limit: at the bottom of the interval

Measuring Variables
● Researchers must observe the variables and record the observations to establish relationships between variables
    ○ Requires measurement
    ○ Scale of measurement: the set of categories required to measure a variable
        ■ The measurement process classifies each individual into one category

Four Types Of Measurement Scales
● Nominal scale: a set of categories that have different names
    ○ Label and categorize observations
    ○ Do not make any quantitative distinctions between observations
    ○ Examples:
        ■ Where do you live: suburbs, city, town
        ■ Sex: female and male
● Ordinal scale: a set of categories that are organized in an ordered sequence
    ○ Ranks observations in terms of size or magnitude
    ○ Example:
        ■ How satisfied are you: 1. Very unsatisfied, 2. Unsatisfied, 3. Neutral, 4. Satisfied, 5. Very satisfied
● Interval scale: ordered categories that are all intervals of exactly the same size
    ○ Equal differences between numbers on the scale reflect equal differences in magnitude
    ○ The zero point is arbitrary and does not indicate a zero amount of the variable being measured
    ○ Example:
        ■ Temperature: the difference between 1 degree and 3 degrees is the same as the difference between 4 degrees and 6 degrees
● Ratio scale: an interval scale with the additional feature of an absolute zero point
    ○ Ratios of numbers reflect ratios of magnitude
    ○ Zero actually means zero
    ○ The happiness example (the number of smiles) would be classified as a ratio scale
    ○ Examples:
        ■ Height
        ■ Weight

Statistical Notation
● The individual measurements or scores obtained for a research participant are identified by the letter X (or X and Y if there are multiple scores for each individual)
● The number of scores in a data set is identified by N for a population or n for a sample
● Summing a set of scores is a common operation in statistics and has its own notation
    ○ The Greek letter sigma, Σ, stands for “the sum of”
        ■ Example: ΣX identifies the sum of the scores

Order Of Operations
● BEDMAS, with summation added; the steps are completed in this order:
    ○ All calculations within parentheses
    ○ Squaring or raising to other exponents
    ○ Multiplying and dividing, completed in order from left to right
    ○ Summation with the Σ notation
    ○ Any additional adding and subtracting, completed in order from left to right

Monday January 13 2025

Frequency Distributions
● After collecting data, the first task for a researcher is to organize and simplify the data so that it is possible to get a general overview of the results
● This is the goal
of descriptive statistical techniques
● One method for simplifying and organizing data is to construct a frequency distribution
● Frequency distribution: an organized tabulation showing exactly how many individuals are located in each category on the scale of measurement
    ○ Can be structured either as a table or as a graph; both present the same two elements:
        ■ The set of categories that make up the original measurement scale
        ■ A record of the frequency, or number of individuals, in each category

Frequency Distribution Tables
● A frequency table consists of at least two columns
    ○ The first lists the categories on the scale of measurement (X)
        ■ Values are listed from highest to lowest, without skipping any
    ○ The second lists the frequency (f)
        ■ Tallies are determined for each value (how often each X value occurs in the data set)
        ■ The sum of the frequencies should equal N

Building A Frequency Table
* in notes

Frequency Distribution Tables
● A third column can be used for the proportion (p) for each category: p = f/N
    ○ Because the proportions describe the frequency in relation to the total number, they are often called relative frequencies
● A fourth column can display the percentage of the distribution corresponding to each X value
    ○ The percentage is found by multiplying p by 100

Grouped Frequency Distribution
● Sometimes a set of scores covers a wide range of values
    ○ A list of all the X values would be too long to be a “simple” presentation of the data
● Grouped frequency distribution
    ○ In a grouped table, the X column lists groups of scores, called class intervals, rather than individual values
        ■ The grouped frequency distribution table should have about 10 class intervals
        ■ The width of each interval should be a relatively simple number such as 2, 5, 10, or 20
        ■ The bottom score in each class interval should be a multiple of the width
        ■ All intervals should be the same width

Example: Grouped Frequency Distribution

Frequency Distribution Graphs
● X axis: contains the score categories (X)
● Y axis:
contains the frequencies
● When the score categories consist of numerical scores from an interval or ratio scale, the graph should be either a histogram or a polygon

Histogram
● In a histogram, a bar is centered above each score (or class interval)
    ○ The height of the bar corresponds to the frequency
    ○ The width extends to the real limits, so that adjacent bars touch

Polygons
● In a polygon, a dot is centered above each score
    ○ The height of the dot corresponds to the frequency
    ○ A continuous line is drawn from dot to dot to connect the series of dots
    ○ The graph is completed by drawing a line down to the X axis (zero frequency) at each end of the range of scores

Bar Graph
● Used when the score categories (X values) are measurements from a nominal or ordinal scale
● A bar graph is just like a histogram, except that gaps or spaces are left between adjacent bars
    ○ Nominal scale: the space emphasizes that the scale consists of separate, distinct categories
    ○ Ordinal scale: separate bars are used because you cannot assume that the categories are all the same size

Graphs For Population Distributions
● When you can obtain an exact frequency for each score in a population, you can construct frequency distribution graphs that are exactly the same as the histograms, polygons, and bar graphs that are typically used for samples
    ○ Many populations are so large that it is impossible to know the exact number of individuals (frequency) for any specific category

Relative Frequencies
● When the exact number of individuals is not known, population distributions can be shown using relative frequency instead of the absolute number of individuals for each category

Smooth Curve
● If the scores in a population are measured on an interval or ratio scale, it is customary to present the distribution as a smooth curve rather than a jagged histogram or polygon
● The smooth curve emphasizes the fact that the distribution is not showing the exact frequency for each category
● Normal curve: one
commonly occurring population distribution
    ○ The word normal refers to a specific shape that can be precisely defined by an equation

Wednesday January 15 2025

Describing Frequency Distributions
● Researchers often simply describe a distribution by listing its characteristics
● Characteristics:
    ○ Central tendency: measures where the center of the distribution is located
    ○ Variability: measures the degree to which the scores are spread over a wide range or are clustered together
    ○ Shape

Shape
● A graph shows the shape of the distribution
● Symmetrical: the left side is roughly a mirror image of the right side
● Skewed: the scores tend to pile up toward one end of the scale and taper off gradually at the other end
    ○ Tail: the section where the scores taper off toward one end of the distribution

Positively And Negatively Skewed Distributions
● Positively skewed: the scores tend to pile up on the left side of the distribution with the tail tapering off to the right
    ○ Example: wealth among citizens
● Negatively skewed: the scores tend to pile up on the right side and the tail points to the left
    ○ Example: life expectancy

Different Shapes For Distributions

Stem-And-Leaf Displays
● Stem-and-leaf display: provides an efficient method for obtaining and displaying a frequency distribution
    ○ Each score is divided into a stem consisting of the first digit or digits, and a leaf consisting of the final digit
    ○ Then, go through the list of scores, one at a time, and write the leaf for each score beside its stem
● The resulting display provides an organized picture of the entire distribution
● The number of leaves beside each stem corresponds to the frequency, and the individual leaves identify the individual scores

Building A Stem-And-Leaf Display
X values: 25, 77, 38, 57, 52, 69, 64, 57, 44, 56, 52, 60, 39, 58, 58, 30, 50, 54, 51, 65

Stem | Leaf
2    | 5
3    | 8, 9, 0
4    | 4
5    | 7, 2, 7, 6, 2, 8, 8, 0, 4, 1
6    | 9, 4, 0, 5
7    | 7

Central Tendency
● A statistical measure to determine a single score that defines the
centre of a distribution
    ○ Goal: to find the single score that is most typical or representative of the entire group
        ■ The “average” or “typical” score
● This average value can be used to provide a simple description of an entire population or sample
● Measures of central tendency are also useful for making comparisons between groups of individuals or between data sets
● There is no single, standard procedure for determining central tendency
    ○ The problem is that no single measure produces a central, representative value in every situation
    ○ There can be problems in defining the “center” of a distribution
● To deal with these problems, statisticians have developed 3 different methods for measuring central tendency
    ○ Mean, median, mode

Mean
● The sum of the scores divided by the number of scores
* example in notes

Alternative Definitions Of The Mean
● Dividing the total equally:
    ○ Think of the mean as the amount each individual receives when the total (ΣX) is divided equally among all the individuals (N) in the distribution
● The mean is a balance point:
    ○ Think of the mean as a balance point for the distribution

The Weighted Mean
● Often it is necessary to combine two sets of scores and then find the overall mean for the combined group
* example in notes
    ○ To calculate the overall mean, we need 2 values:
        ■ The overall sum of the scores for the entire group (ΣX)
        ■ The total number of scores in the combined group (n)
● Unless there are the same number of scores in each group, the overall mean will not be halfway between the original two sample means
    ○ When the samples are not the same size, the larger one makes a larger contribution to the total group and therefore carries more weight in determining the overall mean

Alternative Methods For Calculating The Mean
● Frequency table
    ○ Determine the number of scores, n, by adding the frequencies
    ○ Find the sum of the scores, ΣX, by multiplying each X value by its frequency and then adding the products

Friday January 17 2025

Characteristics Of The Mean
● In
general, the characteristics of the mean result from the fact that every score in the distribution contributes to the value of the mean
    ○ Specifically, every score adds to the total (ΣX) and every score contributes one point to the number of scores (n)
● Changing the value of any score will change the mean
● Adding a new score to a distribution, or removing an existing score, will usually change the mean
    ○ The exception is when the new score (or the removed score) is exactly equal to the mean
* in notes
● If a constant value is added to every score in a distribution, the same constant will be added to the mean
    ○ Similarly, if you subtract a constant from every score, the same constant will be subtracted from the mean
    ○ Example: adding 2 to every value in the data set also adds 2 to the original mean (the same holds for subtraction)
● If every score in a distribution is multiplied by (or divided by) a constant value, the mean will change in the same way

The Median
● Goal: to locate the midpoint of a distribution
● If the scores in a distribution are listed in order from smallest to largest, the median is the midpoint of the list
● Defining the median as the midpoint of a distribution means that the scores are being divided into two equal-sized groups
● Calculating the median:
    ○ With an odd number of scores * in notes
    ○ With an even number of scores * in notes

The Mode
● The score or category that has the greatest frequency
● The only measure of central tendency that will always correspond to an actual score in the data
    ○ The mean and median are both calculated values and often produce an answer that does not equal any score in the distribution
* example in notes: mode = 8
● Although a distribution will have only one mean and only one median, it is possible to have more than one mode
● Bimodal distribution: a distribution with 2 modes
● Multimodal distribution: a distribution with more than 2 modes

The Mean, The Median, And The Mode
● Mean: a balance point; the distances above the mean have the same total as the
distances below the mean
● Median: the middle of the distribution (in terms of scores)
● Mode: the score or value that occurs most often

Selecting A Measure Of Central Tendency
● Extreme scores or skewed distributions
    ○ When a distribution has a few extreme scores (scores that are very different in value from most of the others), the mean may not be a good representative of the majority of the distribution
    ○ Because it is relatively unaffected by extreme scores, the median is commonly used when reporting the average value for a skewed distribution
● Undetermined values
    ○ Occasionally, you will encounter a situation in which an individual has an unknown or undetermined score
        ■ This often occurs when you are measuring the number of errors (or amount of time) required for an individual to complete a task
    ○ It is impossible to compute the mean for these data because of the undetermined value
    ○ However, it is possible to determine the median
● Open-ended distributions
    ○ When there is no upper limit (or lower limit) for one of the categories
    ○ It is impossible to compute a mean for these data because you cannot find ΣX
        ■ You can find the median
● Ordinal data
    ○ Many researchers believe that it is not appropriate to use the mean to describe central tendency for ordinal data
    ○ When scores are measured on an ordinal scale, the median is always appropriate and is usually the preferred measure of central tendency
● When to use the mode:
    ○ Nominal scale
    ○ The mode always identifies an actual score and is thus useful in describing discrete variables
    ○ The mode gives an indication of the shape of the distribution as well as a measure of central tendency

Reporting Measures Of Central Tendency
● Measures of central tendency are commonly used in the behavioural sciences to summarize and describe the results of a research study
    ○ These values may be reported in text describing the results, or presented in tables or graphs
● Graphs can also be used to report and compare measures of
central tendency
    ○ The means (or medians) are displayed using a line graph, histogram, or bar graph, depending on the scale of measurement used for the independent variable
● The height of a graph should be approximately two-thirds to three-quarters of its length
● Normally, the zero point for both the X and Y axes is at the point where the two axes intersect
    ○ However, when a value of zero is part of the data, it is common to move the zero point so that the graph does not overlap the axes

Central Tendency And The Shape Of The Distribution
● Symmetrical distribution: the right-hand side is a mirror image of the left-hand side
    ○ The median is exactly at the center because exactly half of the area in the graph will be on either side of the center
    ○ The mean is exactly at the center because each score on the left side of the distribution is balanced by a corresponding score on the right
    ○ If a symmetrical distribution has only one mode, it will also be in the center of the distribution

Measures Of Central Tendency For Skewed Distributions
● Skewed distributions: there is a strong tendency for the mean, median, and mode to be located in predictably different positions (especially for continuous variables)
    ○ Positively skewed: the most likely order of the 3 measures of central tendency from smallest to largest (left to right) is the mode, the median, and the mean
    ○ Negatively skewed: the most probable order is the mean, the median, and the mode

Monday January 20 2025

Central Tendency Recap
● A statistical measure to determine a single score that defines the center of a distribution
    ○ Goal: to find the single score that is most typical or most representative of the entire group
● Mean: a “balance point”; the distances above the mean have the same total as the distances below the mean
● Median: the middle of the distribution (in terms of scores)
● Mode: the score or value that occurs most often

When To Use Measures Of Central Tendency
● Median:
    ○ When the data contain
extreme scores
    ○ When the distribution is skewed
    ○ When there are undetermined or unknown values within the data set
    ○ When you have an open-ended distribution
    ○ When you are using an ordinal scale
● Mode:
    ○ When you are using a nominal scale
● Mean:
    ○ Most common
    ○ Used in all other instances

Central Tendency And The Shape Of The Distribution

Variability
● Variability: provides a quantitative measure of the differences between scores in a distribution and describes the degree to which the scores are spread out or clustered together
    ○ Describes the distribution
    ○ Measures how well an individual score (or group of scores) represents the entire distribution

Variability: The Range
● Range: the distance covered by the scores in a distribution, from the smallest score to the largest score
    ○ Common definition: the difference between the largest score (Xmax) and the smallest score (Xmin)

The Range: Example
* in notebook

Wednesday January 22 2025

Standard Deviation And Variance
● Deviation: the distance from the mean
    ○ Deviation score = X - μ
● Variance: equals the mean of the squared deviations
    ○ The average squared distance from the mean
● Standard deviation: the square root of the variance
    ○ Provides a measure of the standard, or average, distance from the mean

The Calculation Of Variance And Standard Deviation
* in notebook

Measuring Variance And Standard Deviation For A Population
● Variance: the mean of the squared deviations
    ○ Find the sum of the squared deviations (SS, the sum of squares), and then divide by the number of scores
● Definitional formula:
    ○ SS = Σ(X - μ)^2
● Population variance: represented by the symbol σ^2 and equals the mean squared distance from the mean
    ○ Population variance is obtained by dividing the sum of squares by N: σ^2 = SS/N
● Population standard deviation: represented by the symbol σ and equals the square root of the population variance

Measuring Variance And Standard Deviation For A Sample
● The goal of inferential statistics is to use the limited information from samples to
draw general conclusions about populations
    ○ The basic assumption of this process is that samples should be representative of the populations from which they are drawn
● This assumption poses a special problem for variability because samples consistently tend to be less variable than their populations
● Fortunately, the bias in sample variability is consistent and predictable, which means it can be corrected

Population Variability

Formulas For Sample Variance And Standard Deviation
● Definitional formula:
    ○ SS = Σ(X - M)^2
    ○ Calculating SS for a sample is exactly the same as for a population, except for minor changes in notation
● After you compute SS, however, it becomes critical to differentiate between samples and populations
● Sample variance: represented by the symbol s^2 and equals the mean squared distance from the mean
    ○ Sample variance is obtained by dividing the sum of squares by (n - 1): s^2 = SS/(n - 1)
● Sample standard deviation: represented by the symbol s and equals the square root of the sample variance

Sample Variance And Standard Deviation: Example
* in notebook

Friday January 24 2025

Frequency Distribution Histogram And Standard Deviation

Sample Variability And Degrees Of Freedom
● With a population, you find the deviation for each score by measuring its distance from the population mean
    ○ With a sample, the value of μ is unknown and you must measure distances from the sample mean
    ○ Because the value of the sample mean varies from one sample to another, you must first compute the sample mean
    ○ However, calculating the value of M places a restriction on the variability of the scores in the sample
● In general, when a sample has n scores, the first (n - 1) scores are free to vary, but the final score is restricted
● For a sample of n scores, the degrees of freedom, or df, for the sample variance are defined as df = n - 1
● The degrees of freedom determine the number of scores in the sample that are independent and free to vary
● This is obviously bad research
practice
    ○ The data should never be modified by a researcher
● The concept of degrees of freedom, however, allows us to determine what the missing value is, regardless of which data set we use
    ○ The number of values within the data set that are free to vary is equal to the number of values (n) minus 1
    ○ It “doesn’t matter” that Suzie modified some of the data: because we know the mean she originally calculated, the missing datum is restricted by the mean and the other values

Sample Variance As An Unbiased Statistic
● A sample statistic is unbiased if the average value of the statistic is equal to the population parameter
    ○ The average value of the statistic is obtained from all the possible samples for a specific sample size, n
● A sample statistic is biased if the average value of the statistic either underestimates or overestimates the corresponding population parameter

Presenting The Mean And Standard Deviation
● In frequency distribution graphs, we identify the position of the mean by drawing a vertical line and labeling it with μ or M
● Because the standard deviation measures distance from the mean, it is represented by a line or an arrow drawn from the mean outward for a distance equal to the standard deviation and labeled with σ or s

Means And Standard Deviations

Transformation Of Scale
● If a constant value is added to every score in a distribution, this does not change the standard deviation
● If each score in a distribution is multiplied by a constant, the standard deviation is multiplied by the same constant
    ○ Similarly, if you divide each score by a constant, the standard deviation will be divided by the same constant

Standard Deviation And Descriptive Statistics
● Describing the entire distribution:
    ○ Rather than listing all of the individual scores in a distribution, research reports typically summarize the data by reporting only the mean and the standard deviation
    ○ Example: M = 10, s = 2
● Describing the location of individual scores:
    ○ The mean
and the standard deviation can be used to reconstruct the underlying scale of measurement (the X values along the horizontal line)
    ○ The scale of measurement helps complete the picture of the entire distribution and helps to relate each individual score to the rest of the group

Variance And Inferential Statistics
● In very general terms, the goal of inferential statistics is to detect meaningful and significant patterns in research results
    ○ Variability plays an important role in the inferential process because the variability in the data influences how easy it is to see patterns
    ○ In general, low variability means that existing patterns can be seen clearly, whereas high variability tends to obscure any patterns that might exist

Monday January 27 2025
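
Looking back at the January 13 to January 20 notes, the frequency table columns (X, f, p, percentage), the grouped table, and the three measures of central tendency can be sketched in a few lines of Python. This is only an illustrative sketch, not something from the lectures: the function names are made up for this example, the scores are the X values from the stem-and-leaf example, and the two samples passed to weighted_mean are hypothetical numbers.

```python
from collections import Counter
from statistics import mean, median, multimode

# X values from the stem-and-leaf example in the notes.
scores = [25, 77, 38, 57, 52, 69, 64, 57, 44, 56,
          52, 60, 39, 58, 58, 30, 50, 54, 51, 65]


def frequency_table(xs):
    """Rows of (X, f, p, percent), listed from highest X to lowest."""
    n = len(xs)
    counts = Counter(xs)
    rows = []
    for x in range(max(xs), min(xs) - 1, -1):  # no X value is skipped
        f = counts.get(x, 0)                   # frequency of this X value
        p = f / n                              # proportion: p = f / N
        rows.append((x, f, p, 100 * p))        # percentage = p * 100
    return rows


def grouped_frequencies(xs, width):
    """Frequencies per class interval, keyed by the bottom score of the
    interval (always a multiple of the width), from highest to lowest."""
    counts = Counter((x // width) * width for x in xs)
    return dict(sorted(counts.items(), reverse=True))


def mean_from_table(rows):
    """Mean computed from a frequency table: n = sum of f, sum of X = sum of f*X."""
    n = sum(f for _, f, _, _ in rows)
    sum_x = sum(x * f for x, f, _, _ in rows)
    return sum_x / n


def weighted_mean(sum_x1, n1, sum_x2, n2):
    """Overall mean of two combined samples: total of the scores / total n."""
    return (sum_x1 + sum_x2) / (n1 + n2)


rows = frequency_table(scores)
print("sum of f:", sum(f for _, f, _, _ in rows))
print("grouped (width 10):", grouped_frequencies(scores, 10))
print("mean:", mean(scores), "from table:", mean_from_table(rows))
print("median:", median(scores))
print("mode(s):", sorted(multimode(scores)))

# Hypothetical samples: n = 12 with sum 72 (M = 6) and n = 8 with sum 56 (M = 7).
print("weighted mean:", weighted_mean(72, 12, 56, 8))
```

With these scores there are three modes, so the distribution is multimodal, and the weighted mean of the two hypothetical samples lands closer to the mean of the larger sample, as the notes describe.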

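
The variance, standard deviation, degrees of freedom, and transformation-of-scale ideas from the January 22 and January 24 notes can also be checked with a short Python sketch. The helper names and the five example scores below are assumptions chosen for illustration; they are not the notebook example from class.

```python
import math


def sum_of_squares(xs, center):
    """SS: the sum of the squared deviations from the given mean."""
    return sum((x - center) ** 2 for x in xs)


def population_variance(xs):
    """sigma^2 = SS / N, with deviations measured from the population mean."""
    mu = sum(xs) / len(xs)
    return sum_of_squares(xs, mu) / len(xs)


def sample_variance(xs):
    """s^2 = SS / (n - 1), with deviations measured from the sample mean M."""
    m = sum(xs) / len(xs)
    return sum_of_squares(xs, m) / (len(xs) - 1)


def population_sd(xs):
    return math.sqrt(population_variance(xs))


def sample_sd(xs):
    return math.sqrt(sample_variance(xs))


def restricted_score(known_scores, m, n):
    """Degrees of freedom: once M and n - 1 scores are known, the final
    score is fixed, because all n scores must add up to n * M."""
    return n * m - sum(known_scores)


# Hypothetical scores with mean 6 and SS = 40.
example = [1, 9, 5, 8, 7]
print("SS:", sum_of_squares(example, 6))
print("population variance:", population_variance(example))
print("sample variance:", sample_variance(example))
print("sample SD:", sample_sd(example))

# Transformation of scale: adding a constant leaves the SD unchanged,
# multiplying by a constant multiplies the SD by that constant.
print("SD after adding 3:", sample_sd([x + 3 for x in example]))
print("SD after doubling:", sample_sd([2 * x for x in example]))

# If four of the five scores and the mean are known, the fifth is restricted.
print("missing score:", restricted_score([1, 9, 5, 8], 6, 5))
```

Dividing the same SS by n - 1 instead of N gives the larger sample variance, which is the correction for the tendency of samples to be less variable than their populations.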