Questions and Answers
Which test would be appropriate for examining the association between two nominal variables?
What should be examined before analyzing correlations between continuous data?
Which of the following is NOT a requirement for performing a Pearson correlation?
What does the homogeneity test assess in the context of ANOVA?
What is a critical step to take after using the 'Split file' function in data analysis?
What is the purpose of dividing a ranking question into smaller problems?
Which of the following represents user-missing values in survey data?
In conducting a survey, how should missing values be treated?
What is a characteristic of straightforward coding?
Which course could be considered the easiest based on a typical coding framework?
What numerical code is assigned to female respondents in the questionnaire?
Which method provides information about categorical variables?
What score represents the distance in multivariate space for respondents?
What could cause an outlier such as a weight of 888 kg in a dataset?
What should be examined to identify potential outliers in the dataset?
Which regression model specification helps in saving the Mahalanobis distance?
What type of outlier is identified by examining one variable at a time?
What is primarily used for analyzing continuous variables?
What is a potential consequence of removing outliers from a sample?
Which measure of central tendency is least affected by skewness in data?
When conducting an Independent Samples T-Test in SPSS, what is the primary dependent variable being analyzed?
What would you use as a summary measure for nominal data?
What is one of the advantages of reporting analyses both with and without outliers?
What should you be cautious of when analyzing data for central tendency?
In the context of analyzing more than two independent groups, which model is appropriate for a continuous dependent variable?
Which of the following is NOT a recommended action when dealing with outliers?
Study Notes
Coding and Screening for Surveys
- Straightforward coding involves assigning clear numerical values for variables like age, gender, and opinions.
- Age is coded as "age in years" in SPSS.
- Gender is coded as 1 = male, 2 = female in SPSS.
- Opinions are coded as 1=strongly disagree, 2=disagree, 3=neutral, 4=agree, 5=strongly agree in SPSS.
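The coding scheme above can be sketched as a small codebook. This is a minimal illustration in Python rather than SPSS; the names `GENDER_LABELS`, `OPINION_LABELS`, and `label_response`, and the example respondent, are assumptions made for this sketch.

```python
# Hypothetical codebook following the coding scheme in the notes:
# numeric codes are stored, value labels are applied only for display.
GENDER_LABELS = {1: "male", 2: "female"}
OPINION_LABELS = {
    1: "strongly disagree",
    2: "disagree",
    3: "neutral",
    4: "agree",
    5: "strongly agree",
}


def label_response(row):
    """Return a copy of a coded row with value labels applied."""
    return {
        "age": row["age"],  # age stays numeric (age in years)
        "gender": GENDER_LABELS[row["gender"]],
        "opinion": OPINION_LABELS[row["opinion"]],
    }


coded = {"age": 34, "gender": 2, "opinion": 4}  # made-up respondent
labelled = label_response(coded)
print(labelled)
```

Keeping the numeric codes in the data and the labels in a separate mapping mirrors the separation between values and value labels.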
Not So Straightforward Coding
- Coding more complex questions, like ranking courses by difficulty, requires careful design.
- Forced choice format breaks rankings into smaller, pairwise comparisons.
- This allows translating one ranking question into multiple distinct variables.
- Multiple-response questions, such as asking which courses participants liked most from a list, allow several answers per respondent.
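A minimal sketch of the forced-choice idea, assuming a hypothetical list of three courses: one ranking question is translated into one variable per pair of courses. The coding 1 = "first course of the pair is harder", 2 = "second course is harder" is an assumption for illustration, not a scheme taken from the notes.

```python
from itertools import combinations

courses = ["Statistics", "Methods", "Theory"]  # hypothetical course list

# One ranking question becomes one variable per pair of courses.
pairs = list(combinations(courses, 2))


def code_pairwise(ranking):
    """Translate a ranking (hardest first) into pairwise variables.

    Returns {(a, b): 1 if a was ranked harder than b, else 2}.
    """
    position = {course: i for i, course in enumerate(ranking)}
    return {(a, b): 1 if position[a] < position[b] else 2 for a, b in pairs}


# Made-up respondent who ranks Theory hardest and Methods easiest.
answers = code_pairwise(["Theory", "Statistics", "Methods"])
for pair, code in answers.items():
    print(pair, code)
```

With three courses this yields three pairwise variables; the number of pairs grows quickly as the list gets longer, which is one reason the design needs care.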
Some Remarks about Missing Values
- Missing values in surveys can be left empty (system-missing) or given explicitly defined codes (user-missing).
- These codes can differentiate between different types of missing data (e.g., "not applicable").
- Examples of missing value codes include 97 = does not apply, 98 = don't know, and 99 = refused to answer.
- Using numerical codes (e.g., 1 = male, 2 = female) and value labels is recommended for variables like gender.
- Assign a unique identifier to each participant for tracking and data analysis.
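The following sketch shows why user-missing codes must be declared before analysis. It uses the codes 97, 98, and 99 from the notes; the response data and the helper name `split_valid` are made up for illustration, and `None` stands in for an empty (system-missing) cell.

```python
from statistics import mean

# User-missing codes as listed in the notes.
USER_MISSING = {97: "does not apply", 98: "don't know", 99: "refused to answer"}


def split_valid(values):
    """Separate valid answers from system-missing (None) and user-missing codes."""
    valid = [v for v in values if v is not None and v not in USER_MISSING]
    missing = [v for v in values if v is None or v in USER_MISSING]
    return valid, missing


raw = [3, 5, 99, None, 4, 98, 2]  # made-up responses to a 1-5 opinion item
valid, missing = split_valid(raw)

# Treating 98 and 99 as real answers distorts the mean badly.
naive_mean = mean(v for v in raw if v is not None)
clean_mean = mean(valid)
print(f"naive mean: {naive_mean:.2f}, mean of valid answers: {clean_mean:.2f}")
```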
Screening Example Data
- Example values for the variables sex, age, anxiety, IQ, married, and income are given in a table.
- Age is in years.
- Anxiety is on a 1-7 ordinal scale.
- Married is the number of years married.
- Income is in 5 categories (1 = ≤1500, 2 = 1501-2500, ..., 5 = >5000).
- Other variables, with ranges and descriptive statistics like means and standard deviations, are shown.
Screening per Variable
- Descriptive statistics for several variables are presented in a table, including number of brothers/sisters, number of children, age, education years completed (self, father, mother, spouse), R’s occupation prestige score, and occupational category.
- Data summaries are provided for continuous and categorical variables, including frequency distributions and cross-tables.
- Additional categorical data for respondents' sex and most important problems in the last 12 months (e.g. Finance, Health, Lack of Basic services) is included.
Bivariate Screening
- Bivariate screening checks for unexpected combinations of values in pairs of variables.
- A scatterplot is useful for visualizing relationships between continuous variables and identifying potential outliers.
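As a rough sketch of a bivariate check, the snippet below flags respondents whose number of years married is at least their age, a combination that cannot be correct. The data and the rule itself are assumptions chosen for illustration; a real screening would use rules suited to the actual variables.

```python
# Made-up respondents: age in years, married = number of years married.
respondents = [
    {"id": 1, "age": 45, "married": 20},
    {"id": 2, "age": 23, "married": 30},  # married longer than alive
    {"id": 3, "age": 60, "married": 0},
]


def implausible_pairs(rows):
    """Return ids where the age/married combination is impossible."""
    return [row["id"] for row in rows if row["married"] >= row["age"]]


suspicious = implausible_pairs(respondents)
print("Check these respondents:", suspicious)
```

Note that each value is unremarkable on its own (23 is a valid age, 30 is a valid number of years married); only the combination is suspicious.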
Multivariate Screening
- Multivariate screening examines multiple variables together to identify outliers.
- Mahalanobis distance calculates the distance between a respondent and the average respondent in a multi-dimensional space.
- High Mahalanobis distances indicate potential outliers.
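A minimal sketch of the computation for two variables, written in plain Python so that every step is visible. It returns the squared Mahalanobis distance of each respondent from the centroid (the "average respondent"), using the sample covariance matrix. The height and weight data are made up, and the function name `mahalanobis_sq` is an assumption of this sketch.

```python
def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each (x, y) point from the centroid."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Sample variances and covariance (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # (dx, dy) S^-1 (dx, dy)^T written out for the 2x2 case.
        distances.append((dx * dx * syy - 2 * dx * dy * sxy + dy * dy * sxx) / det)
    return distances


# Made-up (height in cm, weight in kg); the last respondent is short but heavy.
data = [(160, 55), (165, 60), (170, 65), (175, 72),
        (180, 78), (185, 84), (162, 90)]
d2 = mahalanobis_sq(data)
most_unusual = d2.index(max(d2))
print("Largest distance at index", most_unusual)
```

The last respondent has neither the most extreme height nor an impossible weight, yet gets the largest distance because the combination runs against the pattern in the other cases.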
Potential Outliers
- Examine extreme values relative to the mean to determine whether they differ markedly from the other values.
- Look for irregular combinations of values across variables, as these suggest potential outliers.
- Scrutinize for data entry errors: A weight of 888 kg is suspicious and should be checked.
- Assess whether respondents fall outside the expected population.
- Check if the sample consists of multiple, distinct subgroups.
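One simple way to catch values such as a weight of 888 kg is a plausible-range check per variable. The limits, variable names, and data below are hypothetical and would need to be set from knowledge of the target population.

```python
# Hypothetical plausible ranges; adjust to the population being studied.
PLAUSIBLE_RANGE = {"weight_kg": (30, 250), "age": (18, 100)}

rows = [  # made-up data
    {"id": 1, "weight_kg": 72, "age": 34},
    {"id": 2, "weight_kg": 888, "age": 41},  # likely a data entry error
    {"id": 3, "weight_kg": 65, "age": 29},
]


def out_of_range(rows, limits):
    """Return (id, variable, value) for every value outside its plausible range."""
    flags = []
    for row in rows:
        for var, (low, high) in limits.items():
            if not low <= row[var] <= high:
                flags.append((row["id"], var, row[var]))
    return flags


flags = out_of_range(rows, PLAUSIBLE_RANGE)
print(flags)
```

A flagged value is a prompt to check the source questionnaire, not proof of an error.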
Handling Outliers
- Strategies for handling outliers include reducing their influence, transforming their values to bring them closer to the mean, or deleting them if other factors permit.
- Carefully consider whether removing outliers maintains the sample's randomness.
- Report the analysis both with and without outliers, together with the rationale behind each decision, to maintain transparency.
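A small sketch of reporting results with and without an outlier, using made-up weights in which 888 mimics a data entry error. The helper `summarize` is hypothetical.

```python
from statistics import mean, median

weights = [62, 70, 75, 81, 888]  # made-up data; 888 mimics a data entry error


def summarize(values):
    """Basic summary to report alongside the decision about outliers."""
    return {"n": len(values), "mean": mean(values), "median": median(values)}


with_outliers = summarize(weights)
without_outliers = summarize([w for w in weights if w != 888])

print("with outlier:   ", with_outliers)
print("without outlier:", without_outliers)
```

Presenting both summaries side by side lets the reader judge how much the conclusions depend on the excluded case.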
Analyses
- Central tendency measures (mean, median, mode) summarize data distributions.
- Histograms, boxplots, and various SPSS analysis options are used to understand and visualize the distributions of variables like age and sex.
- Appropriate statistical tests (e.g., t-tests) have to be selected for investigating differences or comparing characteristics across groups or conditions.
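The three measures of central tendency can be compared on made-up data with Python's standard `statistics` module. The income values below are invented and right-skewed on purpose; the sex codes follow the 1 = male, 2 = female scheme used in the notes.

```python
from statistics import mean, median, mode

incomes = [1200, 1400, 1500, 1600, 1800, 9000]  # made-up, right-skewed
sex = [1, 2, 2, 1, 2]  # nominal codes: 1 = male, 2 = female

income_mean = mean(incomes)      # pulled upward by the single high income
income_median = median(incomes)  # stays near the bulk of the data
sex_mode = mode(sex)             # the mode is the usual summary for nominal data

print(income_mean, income_median, sex_mode)
```

The gap between mean and median here comes entirely from one large value, which is why the median is often preferred for skewed variables.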
More Than Two Independent Groups
- Differences between more than two groups can be analyzed with the General Linear Model (GLM) procedure in SPSS, after checking its assumptions.
- Assumptions include normality of residuals in each subgroup, absence of significant outliers, and equal variances of the dependent variable across subgroups.
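To make the F-test concrete, here is a sketch of a one-way ANOVA computed by hand on made-up scores for three groups. It also prints the group variances as a crude look at the equal-variance assumption; it does not implement a formal homogeneity test such as Levene's, and it does not compute a p-value.

```python
from statistics import mean, variance

groups = {  # made-up scores for three independent groups
    "low": [4, 5, 6],
    "mid": [6, 7, 8],
    "high": [9, 10, 11],
}


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    ss_within = sum(sum((x - mean(v)) ** 2 for x in v) for v in groups.values())
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


f_value, df_b, df_w = one_way_anova_f(groups)
group_variances = {name: variance(v) for name, v in groups.items()}
print(f"F({df_b}, {df_w}) = {f_value:.2f}")
print("group variances:", group_variances)
```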
Histograms and Split Files
- Splitting a file in SPSS lets users analyze and plot data individually for specific subgroups or groups based on categorical variables.
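Outside SPSS, the same idea amounts to grouping cases by a categorical variable and running the analysis once per group. A minimal sketch with made-up data and a hypothetical helper `split_by`:

```python
from collections import defaultdict
from statistics import mean

rows = [  # made-up cases; sex coded 1 = male, 2 = female
    {"sex": 1, "age": 30},
    {"sex": 2, "age": 40},
    {"sex": 1, "age": 50},
    {"sex": 2, "age": 20},
]


def split_by(rows, key):
    """Group rows by the value of a categorical variable."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


split = split_by(rows, "sex")
mean_age = {group: mean(r["age"] for r in members) for group, members in split.items()}
print(mean_age)
```

Unlike the SPSS setting, which stays active until it is switched off, this grouping applies only to the analysis that uses `split`.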
Output ANOVA through GLM, Associations between Variables
- Output from the General Linear Model (GLM) procedure, including ANOVA results and F-tests for analyses with more than two groups, is shown for different categories like respondents' education levels.
- To analyze the relationship between two variables, choose the method by measurement level: Spearman correlation for ordinal data, Pearson correlation for continuous data, and the chi-squared test for nominal data. To prevent misinterpretation, start any correlation analysis by examining a scatterplot for patterns, outliers, and linearity.
- The significance value (p-value) from a correlation test is the probability of obtaining a result at least as extreme as the one observed if there is no true relationship between the variables.
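The difference between Pearson and Spearman correlation can be illustrated with a small hand-written sketch. The data are made up: `y` increases with `x` in a perfectly monotone but non-linear way. Spearman is computed here as the Pearson correlation of the ranks, with tied values given their average rank; significance values are not computed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # monotone but not linear
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson: {r_pearson:.3f}, Spearman: {r_spearman:.3f}")
```

Pearson falls short of 1 because the relationship is curved, which is exactly the kind of pattern a scatterplot would reveal before any coefficient is computed.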
χ² Tests
- Chi-squared (χ²) tests indicate whether observed frequencies in a categorical variable differ from expected frequencies.
- These tests can also assess whether two categorical variables are independent of each other.
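A sketch of the chi-squared statistic for a test of independence, computed from a made-up 2×2 table of observed frequencies. Expected frequencies are derived from the row and column totals. No p-value is computed; for 1 degree of freedom, the statistic can be compared with the critical value 3.841 at the 5% level.

```python
observed = [  # made-up 2x2 table of counts (rows and columns are categories)
    [20, 10],
    [10, 20],
]


def chi_squared(observed):
    """Return (chi-squared statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


chi2, df = chi_squared(observed)
print(f"chi-squared = {chi2:.3f}, df = {df}")
```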
Crosstabs in SPSS
- The Crosstabs procedure in SPSS creates cross-tabulation tables that show frequencies and percentages within groups of categorical variables; example variable options are shown.
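For readers without SPSS at hand, a cross-tabulation with row percentages can be sketched in plain Python. The records are made up, using the category names for sex and most important problem mentioned in the notes.

```python
from collections import Counter

records = [  # made-up (sex, most important problem) pairs
    ("male", "Finance"),
    ("male", "Health"),
    ("female", "Finance"),
    ("female", "Finance"),
]

counts = Counter(records)                         # cell frequencies
row_totals = Counter(sex for sex, _ in records)   # totals per sex

# Percentage of each cell within its row (within sex).
row_pct = {cell: 100 * n / row_totals[cell[0]] for cell, n in counts.items()}

for (sex, problem), n in sorted(counts.items()):
    print(f"{sex:<7}{problem:<9}{n:>3}{row_pct[(sex, problem)]:>7.1f}%")
```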
Description
Explore the techniques for coding survey data effectively, including straightforward and complex methods. Understand how to handle variables such as age, gender, and opinions, and learn about strategies for addressing missing values in surveys.