PSYU2248 Statistics II: Correlation and Scatterplots

Questions and Answers

What is the standardized regression coefficient formula, according to the content?

standardized beta = b * (s_x / s_y)

What is true about the relationship between the standardized and unstandardized regression coefficients?

  • They have the same signs but differ in value (correct)
  • They have the same value and sign
  • They are not related
  • They have different signs and values
In simple linear regression, the standardized coefficient IS the ____________.

    correlation coefficient

    What is the purpose of the regression equation in predicting scores?

    To predict scores on the dependent variable (Y) based on the independent variable (X)

    Higher perceived fairness of wealth statistically significantly predicted more support for redistribution of wealth. (True/False)

    False

    What is the purpose of running a statistical analysis within a research context?

    To answer research questions/hypotheses and relate the findings back to the research context

    Correlation analysis always follows experimental design.

    False

    What are the criteria for a cause-and-effect (causal) relationship?

    1. Covariance rule: there must be a relationship
    2. Temporal precedence: the cause must precede the effect
    3. Internal validity: excluding other potential causes of the effect

    ____ measures co-variation and is just covariance, standardized.

    Correlation

    Match the correlation coefficient with its description:

    • Pearson’s product-moment correlation (r) = Normal correlation coefficient
    • Spearman’s correlation (rs) = Correlation on ranked data, non-parametric correlation
    • Point-biserial correlation (rpb) = Use for one numeric and one dichotomous variable
    • Phi (φ) = Use for two dichotomous variables

    What does R-squared value of 0.75 indicate?

    An R-squared value of 0.75 indicates that 75% of the variation in the dependent variable is explained by the independent variable(s).

    What does the coefficient of 0.5 for x, with t = 4.58 and a p-value of 0.003, signify?

    The coefficient of 0.5 for x indicates that for every one-unit increase in x, the predicted value of the dependent variable y increases by 0.5. The low p-value of 0.003 indicates that this relationship is statistically significant.

    What does an unstandardized beta represent in regression analysis?

    The size of the effect of the independent variable on the dependent variable, in the original units of the variables (the change in Y for a one-unit increase in X)

    What are the assumptions required for a correlation to be appropriate?

    Linear and monotonic relationship

    Outliers are only problematic if they distort the results.

    True

    What are the key assumptions of correlation?

    Numeric data, independence of observations, monotonicity, linearity, no major gaps or outliers

    The correlation formula can be expressed in multiple ways: r = [cov(X,Y)] / [s_x * s_y]. Covariance is the __________ correlation.

    unstandardized

    What does a confidence interval for a correlation estimate?

    It provides a range within which the true population correlation coefficient is likely to fall.

    Study Notes

    Research Process and Design

    • Statistical analysis exists within a research context, and it's essential to understand the research context to properly apply and interpret statistical analyses.
    • The research process involves:
      • Making an observation
      • Reviewing the literature and identifying the theory
      • Generating aims, research questions, and hypotheses
      • Designing the study
      • Obtaining ethical approval
      • Running the study and collecting data
      • Analyzing the data
      • Writing up and disseminating the findings
    • Design steps:
      • Understanding research questions and hypotheses
      • Identifying the sampling population
      • Understanding how variables are measured
    • Statistics steps:
      • Describing variables using univariate and bivariate summaries
      • Fitting an appropriate statistical model
      • Formally testing assumptions
      • Interpreting results and drawing conclusions

    Stata Recap

    • Stata can be downloaded from the MQ student website.
    • Stata opens .dta data files directly and can also import data from Excel.
    • Useful Stata YouTube videos are available.

    Things You Should Know How to Do in Stata

    • Open a data file
    • Look at the data file to identify variables, number of observations, etc.
    • Run descriptive statistics for various types of variables
    • Create a new variable and attach value labels to categorical variables
    • Run statistical analyses, such as one-sample t-tests, independent t-tests, paired t-tests, correlations, and chi-square tests
    • Run assumption checks, such as Shapiro-Wilk and Levene's tests

    Revision (Plus) of Correlations and Scatterplots

    • Correlation analysis is used in non-experimental design, where the researcher doesn't intervene or manipulate the variables.
    • Correlation doesn't imply causation.
    • Criteria for a cause-and-effect relationship:
      • Covariance rule: there must be a relationship
      • Temporal precedence: the cause must precede the effect
      • Internal validity: excluding other potential causes of the effect
    • Correlations can be true or spurious.
    • Example of a spurious correlation: infant mortality rate and number of doctors in a population.

    Correlation Coefficient

    • The correlation coefficient measures the strength and direction of a linear relationship between two variables.
    • Pearson's product-moment correlation (r) is used for numeric variables.
    • The correlation coefficient ranges from -1 to 1.
    • Strength of correlation (judged by the absolute value of r, ignoring the sign):
      • 0 to 0.10: no real relationship
      • 0.10 to 0.30: weak relationship
      • 0.30 to 0.50: moderate relationship
      • 0.50 to 1: strong relationship
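The cut-offs above can be sketched as a small Python helper. This is only an illustration: the function name `correlation_strength` is hypothetical, and because the listed bands share their boundary values (0.10, 0.30, 0.50), the sketch assumes a boundary value belongs to the stronger band.

```python
def correlation_strength(r: float) -> str:
    """Label a correlation using the bands listed in the notes.

    Assumption: a value exactly on a boundary (0.10, 0.30, 0.50)
    is placed in the stronger band.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)  # strength ignores the direction of the relationship
    if size < 0.10:
        return "no real relationship"
    if size < 0.30:
        return "weak relationship"
    if size < 0.50:
        return "moderate relationship"
    return "strong relationship"


print(correlation_strength(-0.42))
print(correlation_strength(0.75))
```

The sign of r is dropped before labelling, so r = -0.42 and r = 0.42 receive the same label.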

    Scatterplots

    • Scatterplots visualize the relationship between two variables.
    • When analyzing a scatterplot, consider:
      • Monotonicity (does the trend keep in one direction?)
      • Linearity (can it be summarized by a straight line?)
      • Direction of association (positive or negative?)
      • Effect of X on Y (how steep is the slope?)
      • Correlation (how strong is the correlation?)
      • Gaps (are there any gaps?)
      • Outliers (are there any outliers?)
    • Scatterplots are essential for checking the assumptions of correlation.

    Calculating Correlation and Covariance

    • Correlation formula: r = cov(x, y) / (s_x * s_y)
    • Covariance formula: cov(x, y) = Σ(x − x̄)(y − ȳ) / (n − 1)
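As a worked illustration of the two formulas, the sketch below computes the covariance and then r by hand in plain Python. The data (`hours`, `score`) are made up for this example and do not come from the lesson.

```python
import math


def covariance(x, y):
    """Sample covariance: sum of cross-products of deviations over n - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross_products / (n - 1)


def pearson_r(x, y):
    """Correlation: the covariance divided by both standard deviations."""
    s_x = math.sqrt(covariance(x, x))  # cov(x, x) is the variance of x
    s_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (s_x * s_y)


# Hypothetical data, for illustration only.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

print(covariance(hours, score))           # unstandardized: depends on the units
print(round(pearson_r(hours, score), 4))  # standardized: always between -1 and 1
```

Rescaling either variable (for example, hours to minutes) changes the covariance but leaves r unchanged, which is why correlation is described as standardized covariance.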

    Confidence Intervals

    • Confidence intervals are interval estimates that provide a range of values within which the true population estimate is likely to lie.
    • Formula for a 95% CI: point estimate ± 1.96 × SE
    • SE (standard error) measures how much the estimate varies from sample to sample.
    • Calculating a CI for a correlation involves transforming the correlation coefficient into a z-score (Fisher's z transformation), building the interval on that scale, and converting the limits back to the r scale.
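A minimal sketch of a confidence interval for a correlation follows. It assumes the transformation meant in the notes is Fisher's z (z = atanh(r), with standard error 1 / sqrt(n − 3)), which is the usual approach. The function name `correlation_ci` and the example values r = 0.50, n = 103 are hypothetical.

```python
import math


def correlation_ci(r: float, n: int, z_crit: float = 1.96):
    """Approximate CI for a correlation via Fisher's z transformation.

    Assumes bivariate normal data and n > 3; z_crit = 1.96 gives a 95% CI.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)            # transform r to the z scale
    se = 1 / math.sqrt(n - 3)    # standard error on the z scale
    lower = math.tanh(z - z_crit * se)  # back-transform each limit to r
    upper = math.tanh(z + z_crit * se)
    return lower, upper


lower, upper = correlation_ci(r=0.50, n=103)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```

Because the limits are built on the z scale and then converted back, the resulting interval is generally not symmetric around r.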

    Regression Output

    • Source table: Model, Residual, and Total
    • Number of observations: 9
    • F-statistic: 21.00, p = 0.0025
    • R-squared: 0.7500, Adj R-squared: 0.7143
    • Root MSE: 0.84515

    Coefficients Table

    • x: Coefficient: 0.5, Std. Err: 0.1091089, t: 4.58, p: 0.003
    • _cons: Coefficient: 1, Std. Err: 0.5194625, t: 1.93, p: 0.096

    Model as a Whole Effects

    • Model is statistically significant, F(1, 7) = 21.00, p = 0.003
    • R-squared: 0.75 (75%), a large amount of variance explained

    Effect of X

    • The effect of X is statistically significant, t(7) = 4.58, p = 0.003
    • For every one-point increase in X, Y increases by 0.5 points (b = 0.5)
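The reported statistics hang together arithmetically, which the short Python check below illustrates. It relies on three standard identities for a regression with a single predictor: t = b / SE, F = t², and R² = F / (F + residual df). The variable names are chosen for this sketch only.

```python
# Values taken from the regression output in the notes.
b = 0.5            # unstandardized coefficient for x
se_b = 0.1091089   # standard error of the coefficient
df_resid = 7       # residual degrees of freedom (n - 2 = 9 - 2)

t_stat = b / se_b                          # t tests whether b differs from 0
f_stat = t_stat ** 2                       # with one predictor, F equals t squared
r_squared = f_stat / (f_stat + df_resid)   # share of variance explained

print(round(t_stat, 2), round(f_stat, 2), round(r_squared, 2))
```

With only one predictor, the test of the model as a whole and the test of the effect of X are the same test, which is why both report p = 0.003.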

    Intercept

    • The intercept (also called the constant term), a, is the predicted score on Y when X = 0
    • Here a = 1, so the predicted score on Y when X = 0 is 1

    Standardized Regression Coefficient

    • Standardized beta: 0.8232
    • Unstandardized beta is not comparable between different IVs on different scales
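To illustrate why the standardized coefficient is comparable across scales, the sketch below uses the usual conversion, standardized beta = b × (s_x / s_y), on made-up data. The data and the function names (`slope`, `standardized_beta`) are hypothetical and are not the data behind the output above.

```python
import statistics


def slope(x, y):
    """Least-squares slope b for a simple linear regression of y on x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def standardized_beta(x, y):
    """Rescale b into standard-deviation units: beta = b * s_x / s_y."""
    return slope(x, y) * statistics.stdev(x) / statistics.stdev(y)


# Hypothetical data: the same predictor recorded in hours and in minutes.
hours = [1, 2, 3, 4, 5]
minutes = [h * 60 for h in hours]
score = [2, 4, 5, 4, 5]

print(slope(hours, score), slope(minutes, score))  # b changes with the scale
print(round(standardized_beta(hours, score), 4))   # beta does not
print(round(standardized_beta(minutes, score), 4))
```

The unstandardized slope shrinks by a factor of 60 when hours become minutes, while the standardized beta stays the same; in simple linear regression it also equals the correlation between the two variables.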

    Using the Regression Equation to Predict Scores

    • Regression line predicts a score of Y for any given value of X
    • We can substitute in values of X to find predicted scores for Y
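Substituting values into the regression equation can be sketched as follows, using the intercept (a = 1) and slope (b = 0.5) reported in the notes. The function name `predict_y` and the chosen X values are illustrative only.

```python
def predict_y(x: float, intercept: float = 1.0, slope: float = 0.5) -> float:
    """Predicted Y from the regression line: Y-hat = a + b * X."""
    return intercept + slope * x


for x_value in (0, 4, 10):
    print(x_value, predict_y(x_value))
```

Predictions are most trustworthy for X values inside the range that was actually observed; the line says nothing reliable about values far outside it.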

    Wealth Inequality Example

    • Example from Open Stats Lab, study by Dawtry et al. (2015)
    • Examined why people differ in their assessments of the increasing wealth inequality within developed nations

    Study Methods + Hypotheses

    • Design: cross-sectional online survey study

    • Sample: 305 US adults recruited from an online survey pool (Amazon’s MTurk)

    • Participants reported their attitudes toward redistribution of wealth, measured using a four-item scale (redist1 – redist4), which was converted into a single variable called support_for_redistribution.

    • Participants also reported their political orientation on a scale from 1 (extremely liberal) to 9 (extremely conservative), measured by the variable political_preference.

    • Additionally, participants reported their perceived fairness of the distribution of household income across the US population, measured by the variable fairness.

    Hypotheses

    • Hypothesis 1: Support for redistribution of wealth is predicted by perceived fairness of wealth distribution, with individuals who think the current system is fair having less support for redistribution.
    • Hypothesis 2: Support for redistribution of wealth is predicted by political orientation, with more liberal individuals being more likely to support redistribution.

    Regression Analysis

    • Simple Linear Regression (SLR) was used to test the hypotheses.
    • For Hypothesis 1, the independent variable (IV) was fairness, and the dependent variable (DV) was support for redistribution.
    • For Hypothesis 2, the independent variable (IV) was political preference, and the dependent variable (DV) was support for redistribution of wealth.
    • A negative predictive relationship was hypothesized for both hypotheses.

    Results

    • Higher perceived fairness of wealth statistically significantly predicted less support for redistribution of wealth (F(1, 303) = 234.57, p < 0.05).
    • The results supported Hypothesis 1, indicating that individuals who perceived the current system as fair were less likely to support redistribution of wealth.


    Description

    This quiz covers the link between statistics and research design, a Stata walk-through, and a revision of correlation and scatterplots, including confidence intervals.
