Document Details


Uploaded by SlickRetinalite2649

Tags

linear models, generalized linear models, statistical modeling, exam questions

Summary

The document contains exam questions on linear models (LMs) and generalized linear models (GLMs), including their key differences, assumptions, properties, and formulas. The questions cover topics such as linear predictors, link functions, variance structure, the assumptions of linear models, and different types of data. It also explains terms such as homoscedasticity and heteroscedasticity, and the interpretation of p-values.

Full Transcript


**What problems can occur when fitting LMs to non-normal data?**

- Some assumptions may be violated. LMs assume the residuals to be normally distributed. When the variance of the residuals is not constant, heteroscedasticity occurs. Non-normal data may have skewed or heteroscedastic residuals.
- The support may not fit. LMs assume that the response variable can take on any real value, which may not hold for every type of data.
- Predictions may be biased when the distribution of the data differs significantly from the normal distribution.
- One can obtain predicted values that do not make sense for the example or in the given context.

**What is the difference between LMs and GLMs?**

- Link function: LMs use an identity link function, so the expected value of the response variable is modelled directly as a linear combination of the predictors. GLMs use different link functions depending on the data, which relate the linear predictor to the mean of the distribution function.
- Linear predictor: both model a linear combination of the predictors; in a GLM the linear predictor is connected to the mean of the response only through the link function.
- Variance structure: LMs assume homoscedasticity, whereas GLMs can handle heteroscedasticity.
- Types of data: LMs can only handle continuous data; GLMs are applicable to a wider range of data types.
- Distributions: LMs assume that the response variable is normally distributed; GLMs allow response variables that follow other distributions from the exponential family.
- Overall, GLMs are more flexible and can be applied to a wider range of data types, whereas LMs are best suited for continuous data with homoscedasticity.

**Which objective function minimises the (weighted) LSE? What is the (weighted) least squares error (LSE)?**

- The weighted sum of squared residuals.
- The aim is to find model parameters that minimise this weighted sum of squared differences between the observed and predicted values. The differences are squared so that the direction of the deviation is eliminated.
- In ordinary least squares (OLS), all weights are equal, typically 1. In other words, every observation is treated equally, so each data point contributes equally to the total error.
- In weighted least squares (WLS), different observations can have different weights, allowing more flexibility in handling heteroscedasticity. In some data sets, some observations might be more reliable or important than others. For example, under heteroscedasticity we do not want parts with high and low variance to have the same weight, because data points with higher variance might be less reliable. If we treated all observations equally (as in OLS), the model might be skewed.

**Write down the formula for the (weighted) sum of squares of the residuals for simple LMs.**

- S(β0, β1) = Σ_i w_i (y_i − (β0 + β1 x_i))², with all w_i = 1 in the ordinary (unweighted) case.

**What is the difference between errors and residuals?**

- Errors: the true deviations of the observed values from the true population regression line, i.e. the differences between the actual data points and the true underlying model (which we do not know).
- Residuals: the differences between the observed values and the values predicted by the fitted regression model; they measure the discrepancy between the data and the estimated model.

**Explain the standard assumptions of LMs. What additional assumption is often made?**

- Linearity: the relationship between the independent variables and the dependent variable is linear; changes in the independent variable(s) correspond to proportional changes in the dependent variable.
- Independence: the residuals (errors) are independent; one error is not correlated with another, and the residuals show no systematic pattern.
- Normally distributed errors (the additional assumption often made): the residuals are normally distributed, which is important for hypothesis testing and confidence intervals.
- Equal variances / homoscedasticity: the variability of the errors does not change with the value of the independent variable(s).

**Explain the terms homoscedasticity and heteroscedasticity.**

- Homoscedasticity: the variance is constant over all observations.
- Heteroscedasticity: the data in one part of the regression line have more variance than in another part.

**What is the interpretation of a p-value?**

- It gives information about significance: how likely it is to obtain the given values without a true effect.
- It can be used to decide whether the null hypothesis should be rejected based on the observations.
- A p-value tells you how likely it is to get your study results (or more extreme ones) if the null hypothesis is actually true. Typically, a significance level of 5% is used. A p-value smaller than 0.05 is strong evidence against the null hypothesis, suggesting a statistically significant effect; a p-value larger than 0.05 is weak evidence against the null hypothesis, so one might not reject the null hypothesis.
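The p-value definition above can be checked numerically. A minimal sketch for a two-sided z-test using only the standard library; the observed statistic z = 1.96 is an illustrative assumption:

```python
import math

def two_sided_p_value(z):
    """Two-sided p-value for an observed z statistic.

    P(|Z| >= |z|) under the null hypothesis, where Z is standard normal.
    Uses the identity P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2))

# A z statistic of about 1.96 sits right at the 5% significance level.
p = two_sided_p_value(1.96)
print(round(p, 3))  # -> 0.05
```

Larger |z| (observations further from what the null predicts) gives a smaller p-value, matching the verbal interpretation above.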
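The OLS/WLS distinction above can also be made concrete in code. The closed-form estimates below are the standard weighted least-squares solution for a simple linear model; the data and weights are made up for illustration:

```python
def wls_simple(x, y, w):
    """Weighted least squares for the simple model y = b0 + b1*x.

    Minimises S(b0, b1) = sum_i w_i * (y_i - (b0 + b1 * x_i))^2.
    With all weights equal to 1 this reduces to ordinary least squares.
    """
    sw = sum(w)
    # Weighted means of x and y.
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Closed-form slope and intercept.
    b1 = (sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
          / sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x)))
    b0 = ybar - b1 * xbar
    return b0, b1

# Illustrative data lying exactly on y = 1 + 2x.
x = [0, 1, 2, 3]
y = [1, 3, 5, 7]

b0, b1 = wls_simple(x, y, [1, 1, 1, 1])  # OLS: all weights equal
print(b0, b1)                            # -> 1.0 2.0

# Down-weighting a high-variance (noisy) observation, as WLS allows,
# pulls the fit back towards the reliable points.
y_noisy = [1, 3, 5, 10]
b0_w, b1_w = wls_simple(x, y_noisy, [1, 1, 1, 0.1])
```

With equal weights the noisy data would give a slope of 2.9; down-weighting the outlying point moves the estimate closer to the true slope of 2.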
**Give an example of a case where variable transformations (polynomial effects, interaction effects) are useful.**

- Polynomial effects: suppose a non-linear relationship between study hours and exam scores, where after a certain point the benefit of additional study hours diminishes; including a quadratic term can model this relationship.
- Interaction effects: introduce, for example, past exam scores as a possible interaction term, to capture how the effect of one variable (study hours) changes at different levels of another variable (past exam scores).
- Improved fit: transformations allow the model to fit the data more accurately.
- Better interpretation: they provide meaningful insights into the underlying phenomena and help make better predictions and decisions.

**What data types are the distributions most suitable for?**

- Bernoulli is most suitable for binary/dichotomous data.
- Binomial is most suitable for discrete data (which can take on only specific, separate values) or (bounded) counts.
- Poisson is most suitable for discrete or count data.
- Exponential is most suitable for continuous data representing the time until an event occurs.
- Gamma is most suitable for positive continuous data (representing the aggregated time until multiple events occur).
- Normal is most suitable for continuous data that is symmetrically distributed around a mean.

**Give the general form of a distribution from the exponential family (i.e. in terms of functions a, b, c and d).**

- f(y; θ) = exp[ a(y) b(θ) + c(θ) + d(y) ]. The distribution is in canonical form if a(y) = y; b(θ) is then called the natural parameter.

**What is a nuisance parameter?**

- A parameter that has an influence on the model, but is not the parameter of primary interest.
- Treated as a constant.
- Even if it is not of primary interest, it must still be accounted for in the analysis.
- Primary role: providing a more accurate/complete model.
- Adjustment: nuisance parameters often need to be estimated or integrated out of the model.

**What are the 3 components of GLMs?**

- Random component: the distribution of the response variable, assumed to come from the exponential family.
- Linear predictor: a linear combination of the predictors. Each variable is multiplied by a coefficient, which shows how much that variable affects the outcome; all these products are added together, plus a constant term (the intercept). Similar to a simple regression line.
- Link function: links the mean of the distribution of the response variable to the linear predictor, i.e. it transforms the expected value of the response variable to ensure it fits the scale of the linear predictor. Its inverse takes the score from the linear predictor and transforms it back to the support of the chosen distribution.

**Explain the syntax of the glm function in R.**

glm(response ~ predictor1 + predictor2, family = x(link = "x"), data = data)

Formula: predictors are combined with + (main effects only) or * (main effects plus their interaction).

Family:
- gaussian for the normal distribution (identity link = default)
- binomial for binary/binomial data (logit link = default)
- poisson for count data (log link = default)
- Gamma for continuous positive data (inverse link = default)

Weights: optional vector of weights for the fitting process.

**What is the difference between log odds and odds?**

Odds:
- the ratio comparing success to failure, p / (1 − p)
- odds > 1: the event is more likely to happen than not
- odds = 1: a 50/50 chance of the event happening or not (success or failure)
- odds < 1: the event is less likely to happen than not

Log odds:
- the natural logarithm of the odds, log(p / (1 − p)), also called the logit
- unlike odds, which are restricted to positive values, log odds can take any real value, which matches the scale of the linear predictor in logistic regression

**How is the coefficient β1 of a Poisson GLM (log link) interpreted?**

- If β1 > 0, the expected count increases as the predictor increases.
- If β1 < 0, the expected count decreases as the predictor increases.
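The odds/log-odds flashcard above translates directly into code. A minimal sketch; the probability 0.75 is an arbitrary illustrative value:

```python
import math

def odds(p):
    """Odds of an event with probability p: the ratio of success to failure."""
    return p / (1 - p)

def log_odds(p):
    """Log odds (logit): the natural logarithm of the odds.

    Maps probabilities in (0, 1) to the whole real line, which is the
    scale of the linear predictor in logistic regression.
    """
    return math.log(odds(p))

print(odds(0.75))     # -> 3.0  (success is 3 times as likely as failure)
print(odds(0.5))      # -> 1.0  (50/50 chance)
print(log_odds(0.5))  # -> 0.0  (log odds of an even chance)
```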
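The final fragment of the transcript concerns interpreting a coefficient β1 under a log link: a one-unit increase in the predictor multiplies the expected count by exp(β1). A short numerical sketch; the coefficients b0 and b1 below are assumed illustrative values, not taken from the transcript:

```python
import math

# Assumed illustrative coefficients of a Poisson GLM with log link:
# log(E[count]) = b0 + b1 * x
b0, b1 = 1.0, 0.2

def expected_count(x):
    """Expected count at predictor value x under the log link."""
    return math.exp(b0 + b1 * x)

# A one-unit increase in x multiplies the expected count by exp(b1),
# so with b1 > 0 the expected count increases.
ratio = expected_count(3) / expected_count(2)
print(round(ratio, 4), round(math.exp(b1), 4))  # both -> 1.2214
```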
