Questions and Answers
What are the two variants of the independent samples t-test mentioned?
- Normal distributions and non-normal distributions
- Homogeneous variances and heterogeneous variances
- Equal variances assumed and equal variances not assumed (correct)
- Equal means and unequal means
What factor can lead to non-representativeness in a sample according to the discussed content?
- Non-random non-response (correct)
- Stratified sampling
- Normal distribution of data
- Simple randomization
When can the data missing mechanism be ignored without issue?
- When data are MCAR or MAR (correct)
- When data are perfectly random
- When data are normally distributed
- When data are MNAR only
What does weighting aim to achieve in survey data analysis?
What is the relationship between age classes and weighting in the context provided?
What is the main purpose of using several items together to measure a construct?
Which of the following statements about Cronbach's alpha is true?
What characterizes data that is Missing Completely at Random (MCAR)?
Which mechanism describes data missing that depends on observed information?
What type of scale is implied by averaging multiple items together?
What is a key feature of Missing Not at Random (MNAR)?
Which statement best describes unit non-response?
What is the primary function of weighting in surveys?
What characterizes the Missing Not at Random (MNAR) scenario?
What is a risk of using a random sample from a small population?
What can help determine if a sample resembles the target population?
If the observed percentage of females in a sample is lower than the population percentage but not significantly different, what can be concluded?
Why could generalizability be lost when not all intended respondents participate?
When testing a sample proportion in SPSS, what does the test value represent?
What is a challenge when trying to gather information from non-respondents?
What does a comparison of important variables between respondents and non-respondents help to identify?
What is the null hypothesis (𝐻0) regarding the proportions of the different groups?
What is the impact of unit non-response on generalisability?
How can under- or over-sampling be corrected in analysis?
What does the term 'item non-response' refer to?
In SPSS, how is a weighting variable created?
What should be done if a variable is categorical when comparing respondents and non-respondents?
What example illustrates the need for weighting in stratified sampling?
What variable is added to the SPSS data file for differentiating respondents?
What is the primary benefit of using regression imputation compared to mean imputation?
What does multiple imputation address that single imputation methods do not?
When performing multiple imputation, what is generally assumed about the number of imputations?
What challenge is associated with regression imputation?
In the context of missing data mechanisms, which statement is true?
What is a key characteristic of pooled regression in multiple imputation?
What does stochastic regression imputation include that traditional regression imputation does not?
What is an inherent limitation of mean imputation?
Flashcards
Independent Samples t-test
A statistical test used to compare the means of two independent groups. It determines if there is a statistically significant difference between the average values of the two groups.
Weighting
A statistical technique used to adjust for non-representativeness in a sample. Weights are calculated for each observation, and these weights are applied to the data to make the sample more representative of the population.
MCAR (Missing Completely At Random)
A missing data mechanism in which the probability that a value is missing is unrelated to any data, whether observed or unobserved. In other words, the missingness is completely random.
MAR (Missing At Random)
MNAR (Missing Not At Random)
Item non-response
Unit non-response
Chi-square test
Stratified sampling
Generalizability
Comparing respondents and non-respondents
SPSS
Missing Not at Random (MNAR)
Random Sampling
Sample Bias
Does the sample resemble the population?
Testing a proportion
Test value
Why do we use scales to measure constructs?
What is a scale score?
Why are scales advantageous in research?
What is Cronbach's alpha?
What are Missingness Mechanisms?
What is Missing Completely at Random (MCAR)?
What is Missing at Random (MAR)?
What is Missing Not at Random (MNAR)?
Mean Imputation
Regression Imputation
Multiple Imputation
Predictive Mean Matching (PMM)
Study Notes
Week 6: Weighting and Scaling
- Survey methodology, Block 2, 2024
- Focus on weighting and scaling techniques for survey data
- Covers scales, reliability, missing data mechanisms, and unit/item non-response
Scales and Reliability of Scales
- Many constructs are complex and cannot be measured with a single question
- Combining several items that measure the same construct yields a scale score
- Summing or averaging the items into a scale score results in a new variable
- An advantage of a scale score is its level of measurement: it is usually treated as a numerical variable
- Scales are reliable when their items measure the same concept
- Cronbach's alpha and item analysis assess scale reliability
Reliability of Scales: Output
- Cronbach's alpha (closer to 1 indicates higher internal consistency)
- Item-total statistics (examine each item in relation to the scale): scale mean if item deleted, scale variance if item deleted, corrected item-total correlation, and Cronbach's alpha if item deleted
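The reliability output described above can be illustrated with a minimal sketch. It assumes the standard formula for Cronbach's alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score); the item scores and function names are hypothetical and are not taken from the course's SPSS output.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, same respondents in the same order."""
    k = len(items)
    # Total scale score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def alpha_if_item_deleted(items):
    """Recompute alpha with each item left out in turn (needs at least 3 items)."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]


# Hypothetical data: 3 items answered by 5 respondents.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(round(cronbach_alpha(items), 3))
print([round(a, 3) for a in alpha_if_item_deleted(items)])
```

If leaving out an item raises alpha noticeably, that item may not measure the same concept as the others.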
Introduction to Missingness Mechanisms
- Every data point has a probability of being missing, governed by a mechanism (MCAR, MAR, MNAR)
- Missing Completely at Random (MCAR): Missingness probability is unrelated to any data, observed or unobserved
- Missing at Random (MAR): Missingness probability depends only on observed data
- Missing Not at Random (MNAR): Missingness probability depends on unobserved data, such as the missing value itself
What It Would Look Like
- Example table with the columns Respondent, R (response), Y (weight), and X (gender), showing a few example cases with missing values
Missing Completely at Random (MCAR)
- The probability of a data point being missing is the same for all data points
- Missingness is unrelated to the data itself
Missing at Random (MAR)
- The probability of a data point being missing depends on the observed data
- Missingness relates to observable data.
Missing Not at Random (MNAR)
- The probability of a data point being missing depends on unobserved data
- Missingness relates to unobservable data
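The three mechanisms can be made concrete with a small simulation. This is a sketch under stated assumptions: the variables follow the grade/hours example used later in the notes, while the missingness probabilities (0.3, 0.6, 0.1), the cut-offs, and the function name `simulate_missing` are hypothetical choices made for illustration.

```python
import random
from statistics import mean


def simulate_missing(hours, grades, mechanism, rng):
    """Return a copy of grades in which some values are replaced by None."""
    result = []
    for x, y in zip(hours, grades):
        if mechanism == "MCAR":
            p = 0.3                    # same probability for every data point
        elif mechanism == "MAR":
            p = 0.6 if x < 5 else 0.1  # depends on the observed variable X (hours)
        elif mechanism == "MNAR":
            p = 0.6 if y < 6 else 0.1  # depends on Y itself, which is unobserved when missing
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        result.append(None if rng.random() < p else y)
    return result


# Hypothetical complete data: grade rises with study hours plus noise.
rng = random.Random(42)
hours = [rng.uniform(0, 10) for _ in range(5000)]
grades = [4 + 0.4 * x + rng.gauss(0, 1) for x in hours]

for mechanism in ("MCAR", "MAR", "MNAR"):
    observed = [y for y in simulate_missing(hours, grades, mechanism, rng) if y is not None]
    print(mechanism, "observed mean:", round(mean(observed), 2),
          "complete-data mean:", round(mean(grades), 2))
```

Under MCAR the observed mean should stay close to the complete-data mean. Under MAR and MNAR it tends to shift, but only under MAR can the shift be corrected using the observed hours.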
Random Sampling and Generalization
- Target population → sample (simple random sampling)
- Random sampling supports generalization to the population, but chance is involved, so a sample can still be unrepresentative (especially a small one)
- Check for sample bias by comparing important variables between the sample and the population, and between respondents and non-respondents
- Generalizability can be problematic if not all intended respondents participate
Does Sample Resemble Population? Example
- Target population: all those using a fitness center
- Administration may provide percentages for males, females, students, and staff
- Sample percentages could be compared against population percentages
Testing a Proportion in SPSS
- SPSS can test a sample proportion, such as the proportion of females in the sample, against the known population proportion, which is entered as the test value
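The logic of testing a proportion can be sketched outside SPSS. The example below assumes a two-sided one-sample z-test based on the normal approximation, which is not necessarily the procedure SPSS runs; the counts and the function name `proportion_z_test` are hypothetical.

```python
import math


def proportion_z_test(successes, n, p0):
    """Two-sided z-test of a sample proportion against the test value p0."""
    p_hat = successes / n
    # Standard error under the null hypothesis that the true proportion is p0.
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_hat, z, p_value


# Hypothetical example: 45 females in a sample of 100, population proportion 0.50.
p_hat, z, p_value = proportion_z_test(45, 100, 0.50)
print(p_hat, round(z, 2), round(p_value, 3))
```

A p-value above the chosen significance level means the sample proportion does not differ significantly from the population proportion.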
Does the Sample Resemble the Population?
- The sample percentages for different groups (students, staff) can be compared with the population percentages provided by the administration
- A chi-square test checks whether the observed group counts in the sample differ significantly from the counts expected under the population percentages
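A chi-square comparison of sample counts with population percentages can be sketched as follows. This assumes a goodness-of-fit test; the counts and population shares are hypothetical, and the p-value helper covers only the two-group case (one degree of freedom).

```python
import math


def chi_square_gof(observed, expected_props):
    """Chi-square goodness-of-fit statistic and degrees of freedom."""
    n = sum(observed)
    expected = [n * p for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


def p_value_df1(stat):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Hypothetical example: 70 students and 30 staff in the sample,
# while the administration reports 80% students and 20% staff.
stat, df = chi_square_gof([70, 30], [0.8, 0.2])
print(stat, df, round(p_value_df1(stat), 4))
```

The null hypothesis is that the group proportions in the population the sample was drawn from equal the reported population proportions.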
Unit vs Item Non-response
- Generalizability of a survey depends on all intended respondents participating
- Categories of non-response: Not reached, Refused to participate, Did not answer all questions
- Item non-response: Respondents participate but do not answer all questions
- Unit non-response: Respondents not contacted or refused to participate (data not collected)
Weighting
- Under- or over-sampling can be corrected through weighting
- Weights are calculated to adjust for groups whose share in the sample differs from their share in the population
- A weighting variable is added to adjust for sampling differences
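As an illustration of how such weights could be computed, the sketch below assumes the common definition weight = population share / sample share for each group. The group names, counts, and function name are hypothetical, and this is not the SPSS procedure itself.

```python
def group_weights(sample_counts, population_props):
    """Weight per group: population share divided by sample share."""
    n = sum(sample_counts.values())
    return {
        group: population_props[group] / (count / n)
        for group, count in sample_counts.items()
    }


# Hypothetical example: males are under-sampled, females over-sampled.
sample_counts = {"male": 30, "female": 70}
population_props = {"male": 0.5, "female": 0.5}

weights = group_weights(sample_counts, population_props)
weighted_counts = {g: sample_counts[g] * weights[g] for g in sample_counts}
print(weights)
print(weighted_counts)
```

After weighting, each group's weighted count matches its population share while the total sample size stays the same.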
Output Before/After Weighting
- Tables showing frequency distributions of a variable before and after applying weights
- Weights are applied to the sample to make it representative of the target population
Stratified Sampling and Weighting
- Population and age distribution (examples based on elderly people in a home)
- Random sampling from each age group creates a more representative sample
Unit Non-response
- Generalizability issues arise if not all intended respondents participate
- Non-respondents may differ from respondents (e.g. in age or gender)
Unit Non-response in SPSS
- Add a variable to the SPSS data file that distinguishes respondents from non-respondents
- Use SPSS to analyze differences between respondents and non-respondents on both categorical and numerical variables
Comparing Distributions
- Charts in SPSS comparing the frequency distributions of different groups (e.g., gender by response status)
- Chi-square and other statistical tests are used to analyze whether there is a significant relationship between the categorical variables
Comparing Averages
- Independent samples t-test used to compare average ages of respondents versus non-respondents in SPSS
- Determine whether differences exist between these groups (using t-tests or cross-tabs).
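The comparison of average ages can be sketched with the two variants of the independent samples t-test that SPSS reports: equal variances assumed (pooled) and equal variances not assumed (Welch). The sketch computes only the t statistic and degrees of freedom; the ages and function names are hypothetical.

```python
import math
from statistics import mean, variance


def t_equal_variances(a, b):
    """Pooled-variance t statistic and degrees of freedom."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def t_unequal_variances(a, b):
    """Welch t statistic with Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical ages of respondents and non-respondents.
respondents = [34, 41, 29, 52, 47, 38]
non_respondents = [25, 31, 22, 36, 28]

print(t_equal_variances(respondents, non_respondents))
print(t_unequal_variances(respondents, non_respondents))
```

When the two groups have similar variances and sizes, the two variants give nearly the same result; they diverge as the variances differ more.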
More on Weighting
- Weighting is essential in complex survey designs, such as cluster and stratified sampling
- A full treatment of weighting requires more advanced courses
Weighting Summarized
- Several issues can cause a sample to not accurately represent the population (stratified sampling, failed randomization, non-random non-response)
- A weighted sample accounts for these factors by applying appropriate weights
Item Missings
- Example table shows "Respondent," "R," "Y(Grade)," and "X(Hours)"
- Data points are highlighted if they have missing values (e.g., "MISSING").
MCAR, MAR, MNAR and Imputation
- If data are missing completely at random (MCAR) or missing at random (MAR), the missingness mechanism can be ignored when using multiple imputation or maximum likelihood procedures
- If data are missing not at random (MNAR), the missingness mechanism should not be ignored
Regression Imputation
- Regression imputation estimates missing values using the best possible value from a statistical model
- The imputed value is the model's best prediction, but the true grade remains uncertain; the prediction does not account for the uncertainty of the imputed value
Some Theory
- The sample mean of the incomplete data may differ from that of the complete data, even when the complete data are representative
- Multiple imputation accounts for the uncertainty about the imputed values
Regression Multiple Imputation
- Using regression to impute data values
Stochastic Regression Imputation
- Regression imputation that adds a random error term to each predicted value, so that the imputations reflect the uncertainty about the missing data
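The difference between plain and stochastic regression imputation can be sketched with a simple linear regression of grade on hours. This is a minimal illustration under assumptions: one predictor, normally distributed residual noise, and hypothetical data and function names. It needs at least three observed cases to estimate the residual spread.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares intercept, slope, and residual standard deviation."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    resid_sd = (sum(r * r for r in residuals) / (len(xs) - 2)) ** 0.5
    return intercept, slope, resid_sd


def regression_impute(hours, grades, stochastic=False, rng=None):
    """Fill None grades with predictions; optionally add random residual noise."""
    observed = [(x, y) for x, y in zip(hours, grades) if y is not None]
    intercept, slope, resid_sd = fit_line([x for x, _ in observed],
                                          [y for _, y in observed])
    rng = rng or random.Random(0)
    completed = []
    for x, y in zip(hours, grades):
        if y is None:
            y = intercept + slope * x          # best prediction from the model
            if stochastic:
                y += rng.gauss(0, resid_sd)    # add uncertainty around the prediction
        completed.append(y)
    return completed


# Hypothetical data with one missing grade.
hours = [1, 2, 3, 4, 5, 6]
grades = [2.1, 3.9, None, 8.2, 9.8, 12.1]
print(regression_impute(hours, grades))
print(regression_impute(hours, grades, stochastic=True, rng=random.Random(1)))
```

Plain regression imputation places every imputed value exactly on the regression line, which understates the variability of the data; the stochastic version scatters imputed values around the line.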
SPSS - Analyze Pattern
- Visual display of missing data in SPSS
- Charts showing the percentage of complete and incomplete data
- Summary of missing values and patterns
SPSS
- SPSS functionality to impute missing data values
- Options include the imputation method, the number of imputations (e.g., 5), and the output location
Multiple Imputation (Predictive Mean Matching) - Descriptives
- Example outputs showing results of data imputation.
- Mean, standard deviation, minimum, and maximum are reported for the original data, for the imputed values, and for the data after imputation
Multiple Imputation (PMM) - Pooled Regression
- Output showing estimates of coefficients in a regression model
- Imputation settings include the imputation method (automatic or custom), the number of imputations, and the maximum number of iterations
Pooled – Why?
- Multiple imputations produce different datasets that need to be combined
- Pooling combines the within-imputation and between-imputation variance
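Pooling can be sketched with Rubin's rules, which are assumed here as the combining method: the pooled estimate is the average across imputed datasets, and the total variance is the within-imputation variance plus (1 + 1/m) times the between-imputation variance. The coefficient values and the function name are hypothetical.

```python
from statistics import mean, variance


def pool_estimates(estimates, std_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total ** 0.5


# Hypothetical regression slope estimated in 5 imputed datasets.
slopes = [0.42, 0.45, 0.39, 0.44, 0.40]
slope_ses = [0.10, 0.11, 0.10, 0.12, 0.10]
print(pool_estimates(slopes, slope_ses))
```

The pooled standard error is larger than the average single-dataset standard error whenever the estimates differ across imputations, which is how the uncertainty due to missing data enters the result.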
Missing data mechanisms
- Diagrams (MCAR, MAR, MNAR) visually represent the relationships between variables and missing data
Description
This quiz focuses on Week 6 of Survey Methodology, specifically exploring weighting and scaling techniques for analyzing survey data. Key topics include scale construction, reliability assessments like Cronbach's alpha, and approaches to handling missing data. Test your understanding of these vital components in survey research.