Survey Data Coding Techniques
26 Questions

Questions and Answers

Which test would be appropriate for examining the association between two nominal variables?

  • Spearman correlation
  • Pearson correlation
  • ANOVA
  • Crosstabs with chi-square test (correct)

What should be examined before analyzing correlations between continuous data?

  • Descriptive statistics of the dataset
  • Box plots for outliers
  • Scatterplots of the variables (correct)
  • Histograms of the data

Which of the following is NOT a requirement for performing a Pearson correlation?

  • Linear relationship
  • Continuous data
  • Normality of residuals (correct)
  • Observations must be independent

What does the homogeneity test assess in the context of ANOVA?

    Equal variances among groups

    What is a critical step to take after using the 'Split file' function in data analysis?

    Run the analysis without groups

    What is the purpose of dividing a ranking question into smaller problems?

    To translate a single question into several distinct variables

    Which of the following represents user-missing values in survey data?

    Values defined as 99 = refused to answer

    In conducting a survey, how should missing values be treated?

    They can be left empty or defined explicitly

    What is a characteristic of straightforward coding?

    It simplifies the coding process for quantitative data

    Which course could be considered the easiest based on a typical coding framework?

    Health psychology

    What numerical code is assigned to female respondents in the questionnaire?

    2

    Which method provides information about categorical variables?

    Frequency distributions

    What score represents the distance in multivariate space for respondents?

    Mahalanobis distance

    What could cause an outlier such as a weight of 888 kg in a dataset?

    Incorrect data entry

    What should be examined to identify potential outliers in the dataset?

    Score patterns of suspicious individuals

    Which regression model specification helps in saving the Mahalanobis distance?

    All variables as predictors

    What type of outlier is identified by examining one variable at a time?

    Univariate outlier

    What is primarily used for analyzing continuous variables?

    Descriptive statistics

    What is a potential consequence of removing outliers from a sample?

    The results may significantly differ if outliers are excluded.

    Which measure of central tendency is least affected by skewness in data?

    Median

    When conducting an Independent Samples T-Test in SPSS, what is the primary dependent variable being analyzed?

    Age

    What would you use as a summary measure for nominal data?

    Mode

    What is one of the advantages of reporting analyses both with and without outliers?

    It allows for a comprehensive understanding of data impact.

    What should you be cautious of when analyzing data for central tendency?

    Assuming a normal distribution in all cases.

    In the context of analyzing more than two independent groups, which model is appropriate for a continuous dependent variable?

    General Linear Model

    Which of the following is NOT a recommended action when dealing with outliers?

    Delete extreme cases to ensure randomness.

    Study Notes

    Coding and Screening for Surveys

    • Straightforward coding involves assigning clear numerical values for variables like age, gender, and opinions.
    • Age is coded as "age in years" in SPSS.
    • Gender is coded as 1 = male, 2 = female in SPSS.
    • Opinions are coded as 1=strongly disagree, 2=disagree, 3=neutral, 4=agree, 5=strongly agree in SPSS.
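
The coding scheme above can be sketched outside SPSS as a small codebook that maps stored numeric codes to value labels. This is only an illustration in Python: the dictionary names, the `label_response` helper, and the sample row are hypothetical, while the codes themselves follow the notes.

```python
# Hypothetical codebook mirroring the codes listed in the notes.
GENDER_LABELS = {1: "male", 2: "female"}
OPINION_LABELS = {
    1: "strongly disagree",
    2: "disagree",
    3: "neutral",
    4: "agree",
    5: "strongly agree",
}


def label_response(row):
    """Return a copy of a coded row with value labels in place of codes."""
    return {
        "age": row["age"],  # age stays numeric (age in years)
        "gender": GENDER_LABELS[row["gender"]],
        "opinion": OPINION_LABELS[row["opinion"]],
    }


coded = {"age": 34, "gender": 2, "opinion": 4}  # made-up respondent
labelled = label_response(coded)
print(labelled)
```

Storing the numeric code and attaching the label separately keeps the data analyzable while the output stays readable.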

    Not So Straightforward Coding

    • Coding more complex questions, like ranking courses by difficulty, requires careful design.
    • Forced choice format breaks rankings into smaller, pairwise comparisons.
    • This allows translating one ranking question into multiple distinct variables.
    • Multiple answers are allowed when asking which courses participants liked most from a list.
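
One way to picture how a single ranking or multiple-answer question becomes several variables is the sketch below. It assumes, purely for illustration, that each pair of courses gets one 0/1 variable and each listed course gets one 0/1 "liked" variable; the course names and function names are hypothetical.

```python
from itertools import combinations


def ranking_to_pairs(ranking):
    """Turn one ranking (easiest first) into one 0/1 variable per pair.

    The variable "A_vs_B" is 1 if A was ranked easier than B, else 0.
    """
    position = {course: i for i, course in enumerate(ranking)}
    pairs = {}
    for a, b in combinations(sorted(ranking), 2):
        pairs[f"{a}_vs_{b}"] = 1 if position[a] < position[b] else 0
    return pairs


def multi_response_to_flags(chosen, options):
    """Turn a 'tick all that apply' answer into one 0/1 variable per option."""
    return {f"liked_{option}": int(option in chosen) for option in options}


ranking = ["health", "stats", "methods"]  # made-up ranking, easiest first
pair_vars = ranking_to_pairs(ranking)
liked_vars = multi_response_to_flags({"stats"}, ["health", "stats"])
print(pair_vars)
print(liked_vars)
```

With three courses the ranking yields three pairwise variables; with k courses it yields k(k-1)/2.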

    Some Remarks about Missing Values

    • Missing values in surveys can be left empty (system-missing) or coded explicitly (user-missing).
    • These codes can differentiate between different types of missing data (e.g., "not applicable").
    • Examples of missing value codes include 97 = does not apply, 98 = don't know, and 99 = refused to answer.
    • Using numerical codes (e.g., 1 = male, 2 = female) and value labels is recommended for variables like gender.
    • Assign a unique identifier to each participant for tracking and data analysis.
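
A minimal sketch of how such missing-value codes might be handled before analysis, assuming the 97/98/99 scheme from the notes and a variable whose valid range does not include those numbers. The helper name and sample scores are hypothetical.

```python
# User-missing codes taken from the notes; only safe for variables whose
# valid values can never be 97, 98 or 99 (for example a 1-5 opinion scale).
USER_MISSING = {97: "does not apply", 98: "don't know", 99: "refused to answer"}


def recode_missing(value):
    """Return (clean_value, reason); blanks and user-missing codes become None."""
    if value is None:
        return None, "system missing"
    if value in USER_MISSING:
        return None, USER_MISSING[value]
    return value, None


raw_scores = [4, 99, None, 2, 97]  # made-up answers on a 1-5 scale
cleaned = [recode_missing(v) for v in raw_scores]
valid = [v for v, _ in cleaned if v is not None]
print(cleaned)
print(valid)
```

Keeping the reason next to the blanked value preserves the distinction between, say, a refusal and a question that did not apply.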

    Screening Example Data

    • Example values for the variables sex, age, anxiety, IQ, married, and income are given in a table.
    • Age is in years.
    • Anxiety is on a 1-7 ordinal scale.
    • Married is the number of years married.
    • Income is in 5 categories (1 = ≤ 1500, 2 = 1501-2500, ..., 5 = > 5000).
    • Other variables, with ranges and descriptive statistics like means and standard deviations, are shown.

    Screening per Variable

    • Descriptive statistics for several variables are presented in a table, including number of brothers/sisters, number of children, age, education years completed (self, father, mother, spouse), R’s occupation prestige score, and occupational category.
    • Data summaries are provided for continuous and categorical variables, including frequency distributions and cross-tables.
    • Additional categorical data for respondents' sex and most important problems in the last 12 months (e.g. Finance, Health, Lack of Basic services) is included.

    Bivariate Screening

    • Bivariate screening checks for unexpected combinations of values in pairs of variables.
    • A scatterplot is useful for visualizing relationships between continuous variables and identifying potential outliers.
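
As an illustration of checking for unexpected combinations, the sketch below flags respondents whose years married cannot fit their age. The rule, the assumed minimum marriage age of 16, and the sample rows are all hypothetical choices for this example, not part of the lecture material.

```python
def impossible_marriage(row, min_marriage_age=16):
    """Flag rows where years married exceeds the years lived since an
    assumed minimum marriage age (a hypothetical plausibility rule)."""
    return row["married"] > max(row["age"] - min_marriage_age, 0)


rows = [  # made-up respondents
    {"id": 1, "age": 45, "married": 20},
    {"id": 2, "age": 22, "married": 30},  # each value alone looks fine
    {"id": 3, "age": 30, "married": 0},
]
flagged = [r["id"] for r in rows if impossible_marriage(r)]
print(flagged)
```

Neither an age of 22 nor 30 years of marriage is unusual on its own; only the pair is, which is why screening per variable would miss it.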

    Multivariate Screening

    • Multivariate screening examines multiple variables together to identify outliers.
    • Mahalanobis distance calculates the distance between a respondent and the average respondent in a multi-dimensional space.
    • High Mahalanobis distances indicate potential outliers.
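
The idea behind the Mahalanobis distance can be sketched for two variables in plain Python. This is not the SPSS procedure (which saves the distance from a regression); it computes the squared distance of each respondent from the centroid using the sample covariance matrix. The data and function name are made up for illustration.

```python
from statistics import mean


def mahalanobis_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the centroid."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]] (n - 1 denominator).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, written out for a 2x2 matrix.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(d2)
    return distances


# Made-up (age, years married) pairs; the last one is an odd combination.
data = [(20, 1), (30, 8), (40, 18), (50, 27), (60, 38), (22, 30)]
d2 = mahalanobis_2d(data)
most_unusual = d2.index(max(d2))
print([round(d, 2) for d in d2], most_unusual)
```

Because the distance accounts for the correlation between the variables, the respondent who breaks the overall pattern gets the largest value even though neither score is extreme by itself.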

    Potential Outliers

    • Compare extreme values with the variable's mean to judge whether they differ markedly from the other values.
    • Look for irregular combinations of values across variables, as these suggest potential outliers.
    • Scrutinize the data for entry errors: a weight of 888 kg is suspicious and should be checked.
    • Assess whether respondents fall outside the expected population.
    • Check whether the sample consists of multiple, distinct subgroups.

    Handling Outliers

    • Strategies for handling outliers include minimizing their influence, transforming their values closer to the mean, or deleting them if other factors permit.
    • Carefully consider whether removing outliers maintains the sample's randomness.
    • Always report the analysis both with and without outliers, along with the rationale for the decision, to maintain transparency.
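
Reporting both versions can be as simple as running the same summary twice. The sketch below uses made-up weights that include the suspicious 888 kg entry; the 300 kg plausibility cutoff and the `summarize` helper are assumptions for this example.

```python
from statistics import mean, median

weights_kg = [62, 70, 75, 81, 68, 888]  # made-up data; 888 is the suspicious entry


def summarize(values):
    """Return n, mean (1 decimal) and median for a list of numbers."""
    return {"n": len(values), "mean": round(mean(values), 1), "median": median(values)}


with_outlier = summarize(weights_kg)
# Assumed plausibility cutoff of 300 kg, chosen only for this illustration.
without_outlier = summarize([w for w in weights_kg if w < 300])

print("with outlier:   ", with_outlier)
print("without outlier:", without_outlier)
```

The mean shifts from about 207 to about 71 once the entry is excluded, while the median barely moves, which shows why both results and the reason for any exclusion belong in the report.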

    Analyses

    • Central tendency measures (mean, median, mode) summarize data distributions.
    • Histograms, boxplots, and various SPSS analysis options are used to understand and visualize the distributions of variables like age and sex.
    • Appropriate statistical tests (e.g., t-tests) have to be selected for investigating differences or comparing characteristics across groups or conditions.

    More Than Two Independent Groups

    • Differences among more than two groups can be analyzed with the General Linear Model (GLM) in SPSS, after checking its assumptions.
    • Assumptions include normality of residuals in each subgroup, absence of significant outliers, and equal variances of the dependent variable across subgroups.
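
To show what the F-test in a one-way ANOVA compares, here is a hand-computed sketch in Python rather than SPSS output. It returns the F statistic and degrees of freedom only (no p-value), and prints the group variances as a rough look at the equal-variance assumption. The scores and function name are hypothetical.

```python
from statistics import mean, variance


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # made-up scores for three groups
f_stat, df_b, df_w = one_way_anova_f(groups)
group_variances = [variance(g) for g in groups]
print(f_stat, df_b, df_w, group_variances)
```

A large F means the group means differ more than the spread within groups would suggest; the significance still has to be read from the F distribution with the two degrees of freedom.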

    Histograms and Split Files

    • Splitting a file in SPSS lets users analyze and plot data individually for specific subgroups or groups based on categorical variables.
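
The effect of splitting a file can be imitated by grouping rows on a categorical variable and repeating the analysis per group. The sketch below is a plain-Python analogy, not SPSS syntax; the `split_by` helper and the respondents are made up.

```python
from collections import defaultdict
from statistics import mean

respondents = [  # made-up rows; sex coded 1 = male, 2 = female
    {"sex": 1, "age": 30},
    {"sex": 2, "age": 40},
    {"sex": 1, "age": 50},
    {"sex": 2, "age": 20},
]


def split_by(rows, key):
    """Group rows by the value of one categorical variable."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


by_sex = split_by(respondents, "sex")
mean_age_by_sex = {sex: mean(r["age"] for r in rows) for sex, rows in by_sex.items()}
# Analysis on all cases again, i.e. without the grouping.
overall_mean_age = mean(r["age"] for r in respondents)
print(mean_age_by_sex, overall_mean_age)
```

The last line mirrors the step of returning to an ungrouped analysis once the per-group output is no longer wanted.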

    Output ANOVA through GLM, Associations between Variables

    • Output from the General Linear Model (GLM) procedure, including ANOVA results and F-tests for analyses with more than two groups, is shown for different categories like respondents' education levels.
    • To analyze the relationship between two variables, choose the method by measurement level: Spearman correlation for ordinal data, Pearson correlation for continuous data, and the chi-squared test for nominal data. To prevent misinterpretation, begin a correlation analysis by examining the scatterplot for patterns, outliers, and linearity.
    • The significance value (p-value) of a correlation test is the probability of obtaining a result at least as extreme as the one observed if there is no true relationship between the variables.
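
The difference between the two correlation coefficients can be sketched as follows: Pearson's r measures linear association on the raw values, and Spearman's rho is the same formula applied to ranks. This is a minimal pure-Python illustration with made-up data; it computes the coefficients only, not their significance.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # increases with x, but not in a straight line
r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 3), round(rho, 3))
```

Here the relationship is perfectly monotonic but curved, so rho reaches 1 while r stays slightly below it, which is the kind of pattern a scatterplot reveals before any coefficient is computed.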

    χ² Tests

    • Chi-squared (χ²) tests indicate whether observed frequencies in a categorical variable differ from expected frequencies.
    • These tests can also check whether two categorical variables are independent of each other.

    Crosstabs in SPSS

    • The Crosstabs procedure in SPSS creates cross-tabulation tables that show frequencies and percentages within the groups of categorical variables; example variable options are shown.
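
What a cross-tabulation and its chi-squared statistic contain can be sketched in plain Python. This is an illustration, not the SPSS Crosstabs procedure: the respondents, variable names, and helper functions are made up, and the code returns the statistic and degrees of freedom without a p-value.

```python
from collections import Counter


def crosstab(rows, row_var, col_var):
    """Count how often each combination of two categorical variables occurs."""
    counts = Counter((r[row_var], r[col_var]) for r in rows)
    row_levels = sorted({r for r, _ in counts})
    col_levels = sorted({c for _, c in counts})
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table


def chi_square(table):
    """Chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Made-up respondents: sex (1 = male, 2 = female) by most important problem.
rows = (
    [{"sex": 1, "problem": "finance"}] * 30
    + [{"sex": 1, "problem": "health"}] * 10
    + [{"sex": 2, "problem": "finance"}] * 20
    + [{"sex": 2, "problem": "health"}] * 40
)
row_levels, col_levels, table = crosstab(rows, "sex", "problem")
stat, df = chi_square(table)
print(table, round(stat, 2), df)
```

The statistic grows as the observed counts move away from the counts expected under independence; dividing each cell by its row total would give the within-group percentages shown in a crosstab.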

    Related Documents

    Lecture 5 2024 PDF

    Description

    Explore the techniques for coding survey data effectively, including straightforward and complex methods. Understand how to handle variables such as age, gender, and opinions, and learn about strategies for addressing missing values in surveys.
