Good Measurement & Experiments Intro


Questions and Answers

Which of the following best illustrates a concrete construct?

  • An economist analyzes 'consumer confidence' through survey responses.
  • A researcher studying 'happiness' defines it as overall life satisfaction.
  • A biologist measures 'heart rate' as beats per minute. (correct)
  • A psychologist investigates 'intelligence' using a standardized IQ test.

Which scenario exemplifies the use of a conceptual definition in research?

  • Defining 'hunger' as 'the subjective feeling of needing food'. (correct)
  • Defining 'sleep quality' as the amount of time spent in bed.
  • Defining 'stress' as the score obtained on a stress assessment questionnaire.
  • Defining 'exercise' as 30 minutes of physical activity, three times a week.

How would a researcher operationalize the conceptual variable 'anxiety' in a study?

  • By researching the history of anxiety disorders.
  • By measuring anxiety levels using a standardized scale. (correct)
  • By explaining the causes and effects of anxiety.
  • By defining anxiety as a general state of worry.

Which research scenario relies on self-report measures?

  • Researchers ask participants to rate their mood on a scale of 1 to 7. (correct)

What distinguishes observational measures from other types of data collection?

  • They are based on direct observation of behavior or physical traces. (correct)

What type of data do physiological measures primarily record?

  • Biological data such as hormone levels or brain activity. (correct)

A researcher administers the same anxiety questionnaire to a group of participants on two separate occasions one month apart and finds a strong positive correlation between the scores. What type of reliability is primarily demonstrated in this scenario?

  • Test-retest reliability (correct)

Two researchers are independently coding children's behavior on a playground to measure aggression. If they have high interrater reliability, what does this indicate?

  • The two observers are in agreement about their observations. (correct)

A researcher uses Cronbach's alpha to assess which type of reliability?

  • Internal reliability (correct)

Which of the following scales of measurement is characterized by categories with no numerical value?

  • Nominal (correct)

A researcher measures participants' ranking in a race (1st, 2nd, 3rd). Which scale of measurement is being used?

  • Ordinal (correct)

Which scale of measurement has equal intervals between values but no true zero?

  • Interval (correct)

Which of the following is an example of a ratio scale of measurement?

  • Reaction time in milliseconds (correct)

Which correlation coefficient indicates the strongest positive relationship between two variables?

  • +0.75 (correct)

What does face validity primarily assess?

  • Whether a measure appears to measure what is intended (correct)

What does content validity assess about a test or measure?

  • Whether the measure includes all the parts of the construct it aims to assess. (correct)

Which type of validity is demonstrated when SAT scores predict college performance?

  • Criterion validity (correct)

What does convergent validity indicate about a measure?

  • It correlates strongly with other measures of the same construct. (correct)

What does discriminant validity primarily examine?

  • The extent to which a measure does not correlate with measures of different constructs. (correct)

In the context of measurement, what does internal validity refer to?

  • The degree to which a measure is free from confounding factors. (correct)

Flashcards

Abstract Construct

A mental or theoretical concept. Not directly observable (e.g., love, hunger, intelligence).

Concrete Construct

Directly observable or measurable construct (e.g., reaction time, height).

Conceptual Definition

Specifies precisely what the researcher means when referring to a variable.

Operationalization

How a conceptual variable is measured or manipulated in a study.

Self-Report Measures

Measures based on participants' verbal or written responses (e.g., surveys, interviews).

Observational Measures (Behavioral)

Measures based on direct observation of behavior or physical traces of behavior.

Physiological Measures

Measures that record biological data (e.g., fMRI, EEG, hormone levels, heart rate).

Reliability

The consistency or repeatability of a measure.

Test-Retest Reliability

The consistency of a measure across two or more testing occasions.

Interrater Reliability

The degree to which two or more observers agree on their observations.

Internal Reliability

The extent to which multiple items on the same measure consistently measure the same construct.

Nominal Scale

Categorical data with no numerical meaning (e.g., types of soda).

Ordinal Scale

Ranked order, but intervals are not necessarily equal (e.g., finishing places in a race).

Interval Scale

Numeric scales with equal intervals between values, but no true zero.

Ratio Scale

Numeric scales with equal intervals and a true zero (i.e., zero means "nothing" of that variable).

Scatterplots

Graphs displaying the relationship between two variables, often used to visualize correlation.

Correlation Coefficients

A statistical measure indicating the direction and strength of the relationship between two variables.

Validity (of a Measure)

Extent to which a measure assesses what it intends to measure.

Face Validity

Whether a measure appears, at face value, to measure what it claims.

Content Validity

Extent to which a test includes all parts of the construct it aims to assess.

Study Notes

  • These notes cover how to identify good measurement and introduce simple experimental designs.

Abstract vs. Concrete Constructs

  • An abstract construct is a mental or theoretical concept (e.g., "love," "hunger," "intelligence").
  • A concrete construct is more directly observable or measurable (e.g., "number of items recalled," "reaction time," or "height").
  • "Love" can be used as an example of an abstract concept.
  • "Blood pressure" can be used as a physiological index of stress, and a concrete concept.

Conceptual Definitions

  • The conceptual definition of a variable specifies precisely what the researcher means when referring to that variable; it is also known as the construct definition.
  • A conceptual definition of "hunger" might be "the subjective feeling of needing food."

Operationalization

  • Operationalization specifies how a conceptual variable is measured or manipulated in a study; the result is called an operational definition.
  • If the conceptual variable is "hunger," it might be operationalized as "the number of hours since the last meal" or "total calories consumed within a day."

Self-Report Measures

  • Self-report measures are based on participants' verbal or written responses (e.g., surveys, questionnaires, interviews).
  • Self-report measures can include parent/teacher reports when studying children.
  • A questionnaire might ask participants, "On a scale of 1–5, how hungry are you right now?"

Observational Measures (Behavioral)

  • Observational measures are based on direct observation of behavior or physical traces of behavior.
  • Physical traces might include counting wrappers in a trash can to gauge snacking.
  • Counting how many times someone opens a refrigerator can serve as an indicator of hunger.

Physiological Measures

  • Physiological measures record biological data (e.g., fMRI, EEG, hormone levels, heart rate).
  • Salivary cortisol can index stress level; saliva production could operationalize hunger.

Reliability

  • Reliability indicates the consistency or repeatability of a measure.
  • Reliability encompasses test-retest, interrater, and internal reliability.

Test-Retest Reliability

  • Test-retest reliability indicates the consistency of a measure across two or more testing occasions for constructs expected to remain stable, such as intelligence.
  • If you measure someone's intelligence in January and again in June, the scores should correlate strongly if the test is truly measuring "IQ."
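
As a minimal sketch (with made-up scores and the widely used scipy library), test-retest reliability can be quantified by correlating the two occasions:

    # Hypothetical scores: the same 7 participants tested in January and June
    from scipy.stats import pearsonr

    january = [12, 18, 25, 9, 30, 22, 15]
    june    = [14, 17, 27, 10, 28, 21, 16]

    r, p = pearsonr(january, june)
    print(f"test-retest r = {r:.2f}")  # a strong positive r suggests a stable measure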

Interrater Reliability

  • Interrater reliability indicates the degree to which two or more observers or coders agree on their observations of the same behavior.
  • Two observers count how many times a child on a playground shows aggression, and their tallies should be very similar if interrater reliability is high.
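
Beyond raw percent agreement, one standard index is Cohen's kappa, which corrects agreement for chance. A minimal sketch with hypothetical codings:

    # Two observers code 10 playground intervals as aggressive (1) or not (0)
    from collections import Counter

    rater_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
    rater_b = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # observed agreement

    # chance agreement expected from each rater's marginal proportions
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((count_a[c] / n) * (count_b[c] / n) for c in set(rater_a) | set(rater_b))

    kappa = (p_o - p_e) / (1 - p_e)
    print(f"agreement = {p_o:.2f}, kappa = {kappa:.2f}")  # 0.90 and 0.80 here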

Internal Reliability

  • Internal reliability indicates the extent to which multiple items on the same measure or scale consistently measure the same construct.
  • Internal reliability is often assessed using Cronbach's alpha.
  • If a depression questionnaire has 10 questions, each targeting symptoms of depression, responses to the items should correlate with one another if the scale is internally reliable.
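
Cronbach's alpha can be computed directly from its standard formula; here is a minimal sketch using numpy and made-up ratings:

    import numpy as np

    # rows = participants, columns = items on a hypothetical 4-item scale
    items = np.array([
        [4, 5, 4, 5],
        [2, 2, 3, 2],
        [5, 4, 5, 5],
        [3, 3, 2, 3],
        [1, 2, 1, 2],
    ])

    k = items.shape[1]
    sum_item_vars = items.var(axis=0, ddof=1).sum()  # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of total scores

    alpha = (k / (k - 1)) * (1 - sum_item_vars / total_var)
    print(f"Cronbach's alpha = {alpha:.2f}")  # values above ~.70 are often considered acceptable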

Scales of Measurement (Levels of Measurement)

  • Scales of measurement include Nominal, Ordinal, Interval, and Ratio

Nominal Scale

  • Categorical, with no numerical meaning; for example, types of soda: "Coke," "Sprite," "Pepsi."
  • Assigning "1" to Coke, "2" to Sprite, etc. is for labeling only.

Ordinal Scale

  • Ranked order, but intervals are not necessarily equal, for example, finishing places in a race: 1st, 2nd, 3rd
  • The gap between 1st and 2nd place might be 10 seconds, but between 2nd and 3rd might be 30 seconds.

Interval Scale

  • Numeric scales with equal intervals between values but no true zero; zero does not mean "none" of the construct.
  • Celsius temperature, where 0°C doesn't mean “no temperature.”

Ratio Scale

  • Numeric scales with equal intervals and a true zero, where zero means "nothing" of that variable.
  • Examples include height, weight, and reaction time (0 ms would mean no reaction time).

Scatterplots

  • Graphs used to display the relationship between two variables; often used to visualize correlation.
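
As an illustration (with made-up data), a scatterplot takes one variable per axis and one point per participant:

    import matplotlib.pyplot as plt

    hours_studied = [1, 2, 3, 4, 5, 6]
    exam_score = [55, 60, 58, 70, 75, 80]

    plt.scatter(hours_studied, exam_score)  # one point per participant
    plt.xlabel("Hours studied")
    plt.ylabel("Exam score")
    plt.show()  # an upward drift of points suggests a positive correlation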

Correlation Coefficients

  • A statistical measure (e.g., Pearson's r) indicates the direction and strength of the relationship between two variables, ranging from -1.0 to +1.0.
  • r = +0.80 indicates a strong positive correlation, while r = -0.05 indicates nearly no relationship.
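
Using the same made-up data as the scatterplot sketch above, Pearson's r can be computed with numpy:

    import numpy as np

    hours_studied = [1, 2, 3, 4, 5, 6]
    exam_score = [55, 60, 58, 70, 75, 80]

    r = np.corrcoef(hours_studied, exam_score)[0, 1]
    print(f"r = {r:+.2f}")  # the sign gives direction, the magnitude gives strength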

Validity (of a Measure)

  • Validity is the extent to which a measure assesses what it is intended to measure

Face Validity

  • Face validity questions whether a measure appears, at face value, to measure what it claims to measure.
  • A survey titled “Depression Questionnaire" that asks about mood and energy levels looks like it's measuring depression.

Content Validity

  • Content validity is the extent to which a test or measure includes all the parts of the construct it aims to assess.
  • A math test that covers algebra, geometry, and calculus (not just algebra) helps to capture the entire domain.

Criterion Validity

  • Criterion validity is the extent to which a measure predicts or relates to an outcome that it should theoretically predict.
  • SAT scores (the measure) can predict college performance (the outcome).

Convergent Validity

  • Convergent validity indicates the degree to which a measure correlates strongly with other measures of the same construct.
  • A new anxiety scale should correlate highly with an established anxiety questionnaire.

Discriminant (Divergent) Validity

  • Discriminant validity indicates the extent to which a measure does not correlate strongly with measures of different constructs.
  • A depression scale should not strongly correlate with a measure of physical fitness.

Internal Validity (as a measurement term)

  • In measurement contexts, "internal validity" can sometimes mean the measure is free from confounding factors, so that one can be confident the measure alone is capturing the construct.

Between-Groups Designs (Independent-Groups and Between-Subjects Designs)

  • Different groups of participants are placed into different levels of the independent variable.
  • Group 1 studies with classical music and Group 2 studies with no music; you then compare test performance.

Equivalent Groups, Posttest-Only Design

  • Participants are randomly assigned to groups (to ensure equivalence), then tested once (posttest) on the dependent variable.
  • Randomly assign people to watch either a funny or a serious video, then measure mood afterward (posttest-only).

Equivalent Groups, Pretest/Posttest Designs

  • Participants are randomly assigned to at least two groups, measured on the DV before exposure to the IV (pretest) and then again after exposure (posttest).
  • Two groups both get a mood pretest; then Group 1 gets the "funny" video and Group 2 the "serious" video, followed by a mood posttest.

Control Variables

  • Any variable the experimenter holds constant across conditions in order to isolate the effect of the IV on the DV.
  • Keeping the room temperature, time of day, or lighting consistent for all participants helps isolate the effect of the IV.

Dependent Variable

  • The variable that is measured to see whether it is affected by the independent variable (the "outcome").
  • Examples include test scores, reaction time, and mood ratings.

Independent Variable

  • The variable that is manipulated by the experimenter; it has distinct levels/conditions.
  • Classical music vs. no music in a memory study

Design Confounds

  • A design confound occurs when a second variable systematically varies with the IV, providing an alternative explanation for your results.
  • If time of day differs between conditions (morning vs. evening), it is confounded with music vs. no music.

Effect Size

  • A quantitative measure of the strength or magnitude of the relationship between variables.
  • Common effect size measures include Cohen's d and Pearson's r.
  • A large Cohen's d (e.g., 0.80) indicates a big difference between groups.
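
A minimal sketch (with hypothetical scores) of Cohen's d as the standardized mean difference, using the pooled standard deviation:

    import numpy as np

    music    = np.array([78, 82, 85, 90, 74, 88])  # hypothetical test scores
    no_music = np.array([70, 75, 72, 80, 68, 77])

    n1, n2 = len(music), len(no_music)
    pooled_sd = np.sqrt(((n1 - 1) * music.var(ddof=1) +
                         (n2 - 1) * no_music.var(ddof=1)) / (n1 + n2 - 2))

    d = (music.mean() - no_music.mean()) / pooled_sd
    print(f"Cohen's d = {d:.2f}")  # ~0.2 small, ~0.5 medium, ~0.8 large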

Internal Validity (in Experiments)

  • Internal validity is the degree to which a study rules out alternative explanations for a causal relationship between the IV and DV.
  • A well-controlled study with no confounds and random assignment has high internal validity.

Matched-Groups Design

  • Participants are matched on a particular characteristic (e.g., IQ, age) and then randomly assigned to different conditions.
  • Matched-groups designs are often used to reduce individual difference confounds in small samples.
  • Matching on reading level, then assigning matched pairs to different teaching-method conditions.

Order Effects (Carryover & Practice Effects)

  • Order effects occur when exposure to one condition changes how participants respond to subsequent conditions; they are a within-subjects phenomenon.
  • There are two types:
    • Carryover Effects: Residual influence from a previous condition (e.g., drug still in the system)
    • Practice (or Fatigue) Effects: Participants get better (practice) or worse (fatigue) over repeated tasks.

Pretest/Posttest Design

  • A type of between-groups design in which participants are measured on the DV both before and after exposure to the IV.
  • Also known as the "equivalent groups, pretest/posttest design."

Selection Effects

  • Selection effects occur when participants at one IV level systematically differ from those at another IV level.
  • They often arise if participants self-select or if assignment is not properly random.
  • Volunteers for the "stressful" condition might already be more thrill-seeking than those placed in a neutral condition.

Systematic Variability

  • In experiments, a variable's levels coincide in a predictable (non-random) manner with the IV.
  • This can create confounds if it's not controlled.
  • If all enthusiastic research assistants run the "treatment" group and all bored assistants run the "control" group, systematic variability has occurred.

Unsystematic Variability

  • Random or haphazard fluctuations across conditions that contribute noise to the data but do not provide a systematic alternative explanation.
  • Individual differences in mood add random variation that is not tied to the IV condition.

Within-Groups Designs (Within-Subjects Designs)

  • The same group of participants experiences all levels/conditions of the independent variable.

Concurrent Measures

  • Participants are exposed to all IV levels (or multiple stimuli) at the same time, and a single attitudinal or behavioral preference serves as the DV.
  • Infants see two faces simultaneously (male vs. female), and the time they spend looking at each face is measured.

Repeated Measures

  • Participants are measured on the DV multiple times, after each distinct level of the IV.
  • The participants rate the taste of Cookie A, then Cookie B, then Cookie C.

Attrition Threats

  • Attrition threats arise when certain participants (often extreme scorers) drop out of a study systematically.
  • Attrition threatens internal validity if one condition loses more participants than another.
  • If participants with severe anxiety drop out of the "candy therapy" condition while calmer individuals stay, attrition has occurred.

Blind Studies (and Double-Blind Studies)

  • In a single-blind (masked) study, either the participants or the observers are unaware of which condition participants are in.
  • In a double-blind study, neither the participants nor the researchers evaluating them know who's in which condition.
  • In a drug trial, neither the participants nor the experimenters know who receives the real drug vs. a placebo.

Ceiling and Floor Effects

  • Ceiling effect: all scores cluster at the top of the scale because the task is too easy or the measurement is capped.
  • Floor effect: all scores cluster at the bottom because the task is too hard or the measurement starts above zero.
  • A math test that is far too easy could produce near-100% scores for everyone, resulting in a ceiling effect.

Combined Threats

  • Occur when two or more internal validity threats overlap (e.g., selection-history, where a historical event only affects participants in one condition).
  • Only the experimental group experiences a campus wellness event, leading them to change their behavior differently from the control group.

Demand Characteristics

  • Cues in the research setting allow participants to guess the hypothesis or the researcher's expectations, potentially altering their behavior to "help" the study.
  • A participant notices the researcher's excitement about one condition and starts trying to please them in that condition.

History Threats

  • External events happening during the study affect all (or most) participants in the treatment group.
  • Example: A campus-wide stress-reduction campaign might reduce everyone's stress, not just the group receiving a specific "therapy".

Instrumentation Threats

  • Occur if the measurement instrument (e.g., coding guidelines, calibration) changes over time, making pretest/posttest scores not directly comparable.
  • Example: A researcher becoming more lenient in scoring anxiety over the course of the study.

Maturation Threats

  • A natural change in participants (like spontaneous improvement) that occurs over time, and is not due to the IV.
  • Students becoming less nervous by the end of the semester simply because they've adjusted to school demands.

Measurement Error

  • Factors that inflate or deflate a person's measured score relative to their true score on the DV.
  • Using a poorly calibrated scale to weigh participants introduces random error in weight measurements.

Observer Bias

  • Observer bias occurs when researcher expectations influence how they interpret outcomes or record behaviors.
  • A researcher unconsciously rates participants they "expect" to improve as more improved.

Placebo Effects

  • Improvement or change occurs simply because participants believe they are receiving a valid treatment, not from the treatment's "active" ingredients.
  • Example: Taking a sugar pill for headache relief can make a person feel better purely because they believe the pill is real medicine.

Precision & Power

  • Power is the likelihood of finding a statistically significant effect when one truly exists.
  • Precision refers to how narrow the estimate (e.g., a confidence interval) is around an effect.
  • Larger sample sizes and well-controlled methods increase power by reducing random noise.
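
Power can be estimated by simulation: assume a true effect, run many simulated experiments, and count how often the test reaches significance. A minimal sketch, assuming a true effect of d = 0.5 and scipy's two-sample t-test:

    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)

    def simulated_power(n_per_group, d=0.5, n_sims=2000):
        hits = 0
        for _ in range(n_sims):
            control = rng.normal(0.0, 1.0, n_per_group)
            treatment = rng.normal(d, 1.0, n_per_group)  # true effect of size d
            if ttest_ind(treatment, control).pvalue < 0.05:
                hits += 1
        return hits / n_sims  # proportion of significant results = power

    for n in (20, 200):
        print(f"n = {n:>3} per group: power ~ {simulated_power(n):.2f}")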

Regression Threats (Regression to the Mean)

  • The tendency for extreme scores at one measurement to be less extreme (closer to average) at the next measurement.
  • Students with unusually high nervousness scores on the first day might naturally score closer to average by the second day, regardless of any treatment.
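
This can be demonstrated by simulation: if each score is a stable trait plus random noise, people selected for extreme scores at time 1 will, on average, score closer to the mean at time 2 with no treatment at all. A minimal sketch with numpy:

    import numpy as np

    rng = np.random.default_rng(1)
    trait = rng.normal(50, 10, 10_000)         # stable component per person
    time1 = trait + rng.normal(0, 10, 10_000)  # trait + noise, occasion 1
    time2 = trait + rng.normal(0, 10, 10_000)  # trait + fresh noise, occasion 2

    extreme = time1 > np.percentile(time1, 90)  # top 10% at time 1
    print(f"extreme group, time 1: {time1[extreme].mean():.1f}")  # well above 50
    print(f"extreme group, time 2: {time2[extreme].mean():.1f}")  # closer to 50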

Reverse Confounds

  • When an unaccounted-for variable pushes the effect of the IV in the opposite direction, potentially masking a real difference.
  • If the "no coffee" group accidentally all had intense sugar rushes, it might cancel out the caffeine advantage in the "coffee" group.

Sample Sizes

  • The number of participants in each condition impacts variability of results.
  • Larger sample sizes reduce random error and increase the precision of estimates, thereby boosting power.
  • A study with 200 participants per group typically has higher power than one with only 20 participants per group.
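
The precision gain follows from the standard error of the mean, which shrinks with the square root of n; a quick worked check (assuming a population standard deviation of 15):

    import math

    sd = 15  # assumed population standard deviation
    for n in (20, 200):
        se = sd / math.sqrt(n)
        print(f"n = {n:>3}: standard error of the mean ~ {se:.2f}")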

Situation Noise

  • Any external distractions or variability in the testing environment that increase unsystematic variability within each group.
  • Running one group of participants in a noisy hallway and the other group in a quiet lab could add extra "noise" to the data.
