Regression Analysis

Questions and Answers

What is the primary goal of regression analysis?

  • Calculating correlation coefficients.
  • Determining the standard error of estimate.
  • Establishing the 'nature of relationship' between variables for prediction. (correct)
  • Determining if a relationship exists between variables.

In regression analysis, what is the term for the variable whose value is being predicted?

  • Dependent variable (correct)
  • Intervening variable
  • Explanatory variable
  • Independent variable

Simple linear regression involves studying the causal relationship of multiple dependent variables and multiple independent variables.

False (B)

What does the term 'regression' literally mean?

Returning to an average value. (D)

What is the name of the criterion that is frequently used to select a line of 'best fit'?

Least squares criterion

According to the least squares criterion, the line of best fit is the one that maximizes the sum of the squares of the vertical distances from the observed points to the line.

False (B)

The equations for estimating a and b, used in determining a line of best fit, are also known as ______ equations.

normal

In the least squares line of regression of Y on X, what does the regression coefficient byx measure?

The change in Y corresponding to a unit change in X (A)

The regression lines of Y on X and X on Y are reversible equations.

False (B)

What happens to the two regression lines in the case of perfect correlation (positive or negative)?

They coincide. (B)

Match the descriptions with the appropriate variable from the regression equations:

  • bxy = Regression coefficient of X on Y
  • byx = Regression coefficient of Y on X
  • X̄ = Mean value of X
  • Ȳ = Mean value of Y

For estimation purposes, when would you consider using the regression line of X on Y rather than Y on X?

When you want to estimate the value of X from known values of Y (A)

The two lines of regression always intersect at the origin.

False (B)

According to the document, sales (in thousand rupees) and the number of building permits issued is an example of what calculation?

Regression equation calculation

Which of the following represents the coefficient of correlation?

The square root of the product of the two regression coefficients. (C)

If one of the regression coefficients is greater than unity, the other can also be greater than unity.

False (B)

According to the document, the regression coefficients are (blank) of change of origin but not of scale.

independent

If you change the scale of the independent and dependent variables, what adjustments must be made to the regression coefficients?

Multiply one coefficient and divide the other by appropriate constants reflecting the change in scale. (C)

The coefficient of correlation is dependent on the change of origin and scale.

False (B)

What is the geometric mean if the regression coefficient of X on Y is 2.5 and the same for regression equation of Y on X is 0.8?

It is impossible because the regression coefficients are not consistent.
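A short worked check of why these two coefficients are inconsistent, using the property from the study notes that the product of the regression coefficients equals r²:

```latex
r^2 = b_{xy} \cdot b_{yx} = 2.5 \times 0.8 = 2.0 > 1
\quad\Rightarrow\quad |r| = \sqrt{2} \approx 1.41
```

Since a correlation coefficient must satisfy |r| ≤ 1, no data set can produce this pair of regression coefficients.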

What is the formula for byx in terms of r, σx, and σy?

r · σy / σx (B)

A company analyzes data and finds a correlation coefficient of 0.85, with the equation 2Y/5 = 8X used to calculate the inventory of a product Y. After receiving the results, it is discovered that r was calculated incorrectly. Which of the following is correct if we assume that only r was calculated incorrectly?

that only r was calculated incorrectly (C)

Match the term and the description related to formulas from the document:

  • R² = Represents a formula to find the coefficient of determination
  • Explained variation = Represents total variation times the coefficient of determination
  • Total variation = The equation to represent explained variation plus unexplained variation
  • Syx = sqrt[Unexplained variation / n] = Represents a formula for calculating the standard error of estimate

Provided there is sufficient data, calculating the coefficient of determination can only be done by creating models to be graphed using total variation equations and the explained variation.

False (B)

The regression equations derived from the document provide a method for calculation of variance and also the (blank) that can be expected within a group:

correlation

In the context of time series analysis, what is the term for variations that are completely unpredictable and caused by unusual events?

Irregular or random variations (B)

Which component of a time series refers to the general tendency of the data to increase or decrease over a long period?

Secular Trend (B)

Seasonal variations in a time series can be observed even when the data are recorded annually.

False (B)

Which phase is NOT a component of the business cycle?

Growth (D)

What is the typical length of the period over which cyclical phases last?

7 to 9 years

The multiplicative model assumes that the components of a time series cannot affect one another.

False (B)

The (blank) method of identifying trend in time series data is the simplest method of estimating trend:

freehand curve

Which of the following is a disadvantage of using the freehand curve method for trend analysis?

It is subjective and may not be suitable for forecasting (A)

If a time series is divided into two equal parts, what is the primary advantage of analyzing it via semi-averages?

A simple method that can identify linear trend (C)

Time series mean calculations are not affected greatly by extreme data elements.

False (B)

Why should extreme values not be included when calculating semi-averages?

Because extreme points do not give a true picture of the growth factor.

What action should be performed when calculating moving averages if extreme values are recorded in the first data set?

Sacrifice data at both the start and end points in exchange for a smoother data set (B)

Unlike regression analysis, time division cannot be used for analyzing cyclical trends.

True (A)

The four-yearly moving average is implemented by working with the original data averages and creating a (blank) centered average:

two-period
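A minimal sketch of the centering step, assuming yearly values in a plain list; the function name `centered_moving_average` and the sample figures are hypothetical and only illustrate the idea of averaging four-year averages in pairs:

```python
def centered_moving_average(values, period=4):
    """Centered moving average for an even period.

    Step 1: plain moving averages over `period` consecutive values.
    Step 2: a two-period moving average of those averages, which lines
            each result up with an original time point.
    """
    if period % 2 != 0:
        raise ValueError("centering is only needed for an even period")
    plain = [sum(values[i:i + period]) / period
             for i in range(len(values) - period + 1)]
    return [(plain[i] + plain[i + 1]) / 2 for i in range(len(plain) - 1)]


# Hypothetical yearly data (seven years).
values = [10, 12, 14, 16, 18, 20, 22]
trend = centered_moving_average(values)
print(trend)  # [14.0, 16.0, 18.0], aligned with years 3, 4 and 5
```

Two values are lost at each end of the series, which is the price paid for the smoother trend.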

Match the equations with the components of the time series that they are measuring:

  • ΣY = na + bΣX = Normal equation
  • b is the slope of a trend equation = Represents the change in the value for a unit change in the variable
  • a + bx + cx² = A second-degree equation
  • Σ(Y − Yc)² = Method of least squares

Flashcards

Regression Analysis

Analysis that studies the functional relationship between variables to provide a mechanism for prediction or forecasting.

Regression Equation

A mathematical equation allowing prediction of one variable's value from known values of others.

Dependent Variable

The variable whose value is predicted in regression analysis.

Independent Variables

Variables used to predict the value of the dependent variable.

Simple Regression Analysis

Regression analysis involving one dependent and one independent variable.

Least Squares Approach

The line minimizes the sum of squared vertical distances from observed points to the line.

Regression Coefficient

Represents the slope of the line of regression, measuring the change in Y for a unit change in X.

Regression line of Y on X

The line of "best fit" that minimizes the sum of squared vertical distances from the observed points to the line.

Regression line of X on Y

The line that minimizes the sum of the squares of the horizontal distances from the observed points to the line.

Regression

A measure of the average relationship between two or more variables.

Finding constants c and d

Constants c and d are determined according to the least squares criterion.

Finding Regression Coefficients

Formulas for regression coefficients in terms of deviations of X and Y.

Coincidence of regression lines

In case of perfect correlation (positive or negative), the two regression lines would coincide.

Two Regression Equations

If X and Y are uncorrelated, the two regression equations reduce to Y = Ȳ and X = X̄.

Calculate byx and bxy

Calculate regression coefficients for simple equations

regression coefficient

It is the amount of change in the dependent variable for a unit change in the independent variable.

The regression line of Y on X

Is used to estimate or predict the value of Y from known values of X.

The regression line of X on Y

Is used to estimate or predict the value of X from known values of Y.

Correlation and Regression

Properties of regression coefficients

Properties of Regressions

The regression coefficients are independent of change of origin but not scale.

Data Calculations

It can be determined by regression equations for calculating deviation in data.

Standard Error of Estimate

A standard deviation that measures the scatter or spread of the actual values around the regression line.

Study Notes

Regression Analysis

  • Regression analysis is used to predict the value of one variable from known values of other variables
  • The variable to be predicted is the dependent or explained variable
  • Variables used to predict the dependent variable are called independent or explanatory variables
  • Simple regression analysis is confined to the study of only two variables, a dependent variable and one independent variable
  • Simple linear regression is used when the relationship between the dependent and independent variable is linear
  • Regression helps to study the functional relationship between variables for prediction or forecasting

Meaning of Regression

  • The word 'regression' originally meant 'stepping back' or 'returning to average value'
  • Sir Francis Galton introduced regression as a statistical concept in 1877, studying heights of fathers and sons
  • Galton's studies showed that offspring of abnormally tall or short parents tend to revert toward the average height of the population

Lines of Regression - The Least Squares Approach

  • This involves estimating or predicting values of a dependent variable based on known values of an independent variable
  • Bivariate data consists of pairs of observations (X, Y) on two quantitative variables X and Y
  • The approach assumes X and Y are approximately linearly related, so the points on a scatter diagram cluster around a straight line
  • A line can be visually fitted to approximate the data and predict a value of Y for a given X
  • The least squares criterion selects the line of "best fit" by minimizing the sum of the squares of vertical distances from observed points
  • A line of best fit is represented by the equation Y = a + bX
  • Constants a and b are determined to minimize the sum of squared vertical distances
  • The determination of a and b uses differential calculus and results in two normal equations
  • The normal equations are ΣY = na + bΣX and ΣXY = aΣX + bΣX²
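As a brief sketch of where the two normal equations come from (the symbol S for the sum of squared vertical distances is introduced here for illustration and is not named in the notes), set the partial derivatives of S with respect to a and b to zero:

```latex
\begin{aligned}
S &= \sum (Y - a - bX)^2 \\
\frac{\partial S}{\partial a} &= -2\sum (Y - a - bX) = 0
  \;\Rightarrow\; \sum Y = na + b\sum X \\
\frac{\partial S}{\partial b} &= -2\sum X\,(Y - a - bX) = 0
  \;\Rightarrow\; \sum XY = a\sum X + b\sum X^2
\end{aligned}
```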

Solving for Regression Coefficients

  • Solving the normal equations simultaneously yields formulas for a and b
  • The formula for b = [nΣXY - (ΣX)(ΣY)] / [nΣX² - (ΣX)²]
  • The formula for a = Ȳ − bX̄, where X̄ and Ȳ are the means of X and Y, respectively
  • The line of best fit is called the least squares line of regression of Y on X, where b is a regression coefficient, measuring the change in Y per unit change in X
  • The line of regression of X on Y, which estimates a value of X for a given value of Y, is given by X = c + dY
  • The constant d is called the regression coefficient of X on Y, denoted by bxy
  • The regression coefficient bxy measures the change in X corresponding to a unit change in Y
  • The formula for d = [nΣXY - (ΣX)(ΣY)] / [nΣY² - (ΣY)²]
  • The formula for c = X̄ − dȲ
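The formulas above can be sketched in a few lines of Python. The function name `regression_lines` and the five data points are hypothetical, chosen only so the arithmetic is easy to follow by hand:

```python
def regression_lines(xs, ys):
    """Return (a, b, c, d) for Y = a + bX and X = c + dY from the sum formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    b = numerator / (n * sum_x2 - sum_x ** 2)   # byx: slope of Y on X
    d = numerator / (n * sum_y2 - sum_y ** 2)   # bxy: slope of X on Y
    mean_x, mean_y = sum_x / n, sum_y / n
    a = mean_y - b * mean_x                     # intercept of Y on X
    c = mean_x - d * mean_y                     # intercept of X on Y
    return a, b, c, d


# Hypothetical data: sums are ΣX=15, ΣY=20, ΣXY=66, ΣX²=55, ΣY²=86.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b, c, d = regression_lines(xs, ys)
print(f"Y on X: Y = {a:.2f} + {b:.2f} X")   # Y on X: Y = 2.20 + 0.60 X
print(f"X on Y: X = {c:.2f} + {d:.2f} Y")   # X on Y: X = -1.00 + 1.00 Y
```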

Equations for Regression Lines

  • The equation for the line of regression of Y on X can be written as Y − Ȳ = byx(X − X̄)
  • The equation for the line of regression of X on Y can be written as X − X̄ = bxy(Y − Ȳ)
  • There are always two lines of regression; which one is used depends on which variable is treated as dependent
  • The regression line of Y on X is used to estimate Y from known values of X
  • Similarly, regression line of X on Y is used to predict or estimate values of X from known values of Y
  • The two regression equations are not reversible
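To illustrate why the two equations are not reversible, the sketch below compares algebraically inverting the Y-on-X line with using the X-on-Y line directly. The coefficients are those obtained from the hypothetical data points (1, 2), (2, 4), (3, 5), (4, 4), (5, 5); the function names are illustrative only:

```python
# Regression lines for the hypothetical data set named above:
#   Y on X: Y = 2.2 + 0.6 X
#   X on Y: X = -1.0 + 1.0 Y
a, byx = 2.2, 0.6
c, bxy = -1.0, 1.0


def x_by_inverting_y_on_x(y):
    """Solve Y = a + byx * X for X (an algebraic inversion, not a regression estimate)."""
    return (y - a) / byx


def x_from_x_on_y(y):
    """Estimate X from the regression line of X on Y."""
    return c + bxy * y


y_value = 6.0
inverted = x_by_inverting_y_on_x(y_value)   # about 6.33
estimated = x_from_x_on_y(y_value)          # 5.0
print(inverted, estimated)
```

The two answers differ because the lines minimize different distances; they would agree only under perfect correlation.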

Two Regression Lines and Perfect Correlation

  • The two regression lines intersect at the point (X̄, Ȳ), the point of means
  • The two regression equations are derived on different bases (vertical distances for Y on X, horizontal distances for X on Y)
  • The two lines coincide in the case of perfect correlation (positive or negative), so one line is sufficient
  • If X and Y are uncorrelated, the two regression equations reduce to Y = Ȳ and X = X̄, and the lines are perpendicular to each other

Regression Coefficients - Some Formulas

  • The regression coefficient of Y on X, byx = Cov(X,Y) / σx²
  • The regression coefficient of X on Y, bxy = Cov(X,Y) / σy²
  • The covariance between X and Y is given by Cov(X,Y) = (ΣXY / n) – (ΣX ΣY / n²)
  • The variances of X and Y are given by σx² = Σ(X − X̄)² / n and σy² = Σ(Y − Ȳ)² / n, respectively
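A small sketch of the covariance and variance route to the same coefficients, assuming population (divide-by-n) moments as in the formulas above. The function name and data are hypothetical:

```python
def coefficients_from_moments(xs, ys):
    """Return (byx, bxy) as Cov(X,Y)/σx² and Cov(X,Y)/σy², dividing by n throughout."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Cov(X,Y) = ΣXY/n − (ΣX ΣY)/n², written here with the means.
    cov_xy = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov_xy / var_x, cov_xy / var_y


# Hypothetical data: Cov = 1.2, σx² = 2.0, σy² = 1.2.
byx, bxy = coefficients_from_moments([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(byx, 4), round(bxy, 4))  # 0.6 1.0
```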

Further Formulas for Regression Coefficients

  • byx = r (σy / σx) and bxy = r (σx / σy), where r is the coefficient of correlation
  • The two regression equations can be expressed as Y − Ȳ = r (σy / σx)(X − X̄) and X − X̄ = r (σx / σy)(Y − Ȳ)
  • When there is perfect correlation (r = ±1), both regression equations become (Y − Ȳ)/σy = ±(X − X̄)/σx
  • The two regression lines therefore coincide in the case of perfect correlation. If r = 0, i.e., if X and Y are uncorrelated, the two regression equations reduce to Y = Ȳ and X = X̄, and hence the lines are perpendicular to each other

Properties of Regression Coefficients

  • The coefficient of correlation and the two regression coefficients have the same sign, since bxy and byx both depend on Cov(X,Y)
  • Each coefficient has the same sign as Cov(X,Y)
  • The coefficient of correlation is the geometric mean of the regression coefficients, i.e., bxy · byx = r², so r = ±√(bxy · byx), taking the sign of the regression coefficients
  • If one of the regression coefficients is greater than unity, the other must be less than unity, because their product r² cannot exceed one
  • Regression coefficients are independent of change of origin but not of scale
  • If U = (X − A)/h and V = (Y − B)/k, then byx = (k/h) bvu and bxy = (h/k) buv
  • The correlation coefficient is independent of change of origin and scale; therefore rxy equals ruv
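These properties can be checked numerically. The sketch below is illustrative only: the data, the origins A and B, and the positive scale factors h and k are all assumed values, and the helper name `coefficients` is hypothetical:

```python
import math


def coefficients(xs, ys):
    """Return (b of ys on xs, b of xs on ys, r) using divide-by-n moments."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / var_x, cov / var_y, cov / math.sqrt(var_x * var_y)


# Hypothetical data and an assumed change of origin (A, B) and scale (h, k).
xs = [10, 20, 30, 40, 50]
ys = [12, 14, 15, 14, 15]
A, h = 30, 10
B, k = 14, 2
us = [(x - A) / h for x in xs]
vs = [(y - B) / k for y in ys]

byx, bxy, r_xy = coefficients(xs, ys)
bvu, buv, r_uv = coefficients(us, vs)

print(math.isclose(byx * bxy, r_xy ** 2))   # product of coefficients equals r²
print(math.isclose(byx, (k / h) * bvu))     # byx rescales by k/h
print(math.isclose(bxy, (h / k) * buv))     # bxy rescales by h/k
print(math.isclose(r_xy, r_uv))             # r is unchanged
```

All four checks print True for this data; with h and k of opposite signs the sign of r would flip, so the sketch assumes both are positive.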

Standard Error of Estimate

  • The primary use of regression is to estimate values of the dependent variable from the independent variable
  • The reliability of such estimates depends on the closeness of the relationship between variables
  • The standard error of estimate measures the extent of spread or scatter of the observed points about the regression line
  • It is the standard deviation of the actual values around the values predicted by the regression line
  • Syx = sqrt[ Σ(Y − Yc)² / n ], where Yc is the computed (estimated) value of Y
  • Syx = sqrt[ (ΣY² − aΣY − bΣXY) / n ] is a convenient formula for computation
  • Syx = σy √(1 − r²), where r is the coefficient of correlation
  • It indicates the extent of the possible variation (or error) that may be present in the estimates
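As a rough check that the three expressions for the standard error of estimate agree, the sketch below computes each one for a small hypothetical data set. The function name and the data are assumptions made for illustration:

```python
import math


def standard_error_three_ways(xs, ys):
    """Return Syx from the definition, the computational shortcut, and σy·sqrt(1 − r²)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    # Least squares line of Y on X.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n

    # 1. Definition: scatter of actual Y around the computed values Yc.
    direct = math.sqrt(sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / n)
    # 2. Shortcut that avoids computing each Yc.
    shortcut = math.sqrt((sum_y2 - a * sum_y - b * sum_xy) / n)
    # 3. Via the correlation coefficient and the standard deviation of Y.
    r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    sigma_y = math.sqrt(sum_y2 / n - (sum_y / n) ** 2)
    via_r = sigma_y * math.sqrt(1 - r * r)
    return direct, shortcut, via_r


# Hypothetical data; the unexplained variation is 2.4, so Syx = sqrt(2.4 / 5).
direct, shortcut, via_r = standard_error_three_ways([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(direct, 4), round(shortcut, 4), round(via_r, 4))  # 0.6928 0.6928 0.6928
```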

Guidelines for problems

  • If the data give n paired values of X and Y, use b = [nΣXY − (ΣX)(ΣY)] / [nΣX² − (ΣX)²]
  • However, if the question gives summary notation such as ∑x (summations), use those sums to solve the problem [refer to Examples 2, 3, 4, 5]

Examples for Regression Analysis

  • Building permits are expressed in hundreds, so x = 3 in the data means the number of building permits is 300
  • The examples for this chapter show that questions can appear in different forms in the exam

Examples of solved questions

  • Examples 3, 4, 5, and 6 build a basic, clear understanding of the regression coefficient
  • Examples 7, 8, and 9 help with application-level questions and practical issues
  • Further numerical examples and formulas are in the text

Multiple Linear Regression Analysis

  • The multiple linear regression model is a more general form of the linear regression model in which the criterion variable Y is specified as a function of multiple predictor variables
  • The general formula used for prediction is Y = B0 + B1·x1 + B2·x2 + B3·x3 + B4·x4 + ... + Bn·xn
  • Here B0 is the intercept, and x1 to xn are the predictor variables, with B1 to Bn the coefficients of the respective predictor variables
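The notes do not say how the coefficients B0 to Bn are estimated. One common choice, assumed here, is to extend the least squares criterion and solve the resulting normal equations. The sketch below does this with plain Gaussian elimination; the function names and the data (generated exactly from Y = 1 + 2·x1 + 3·x2) are hypothetical:

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix · x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][j] * solution[j] for j in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_multiple_regression(predictors, ys):
    """Return [B0, B1, ..., Bn] that minimize the sum of squared errors."""
    design = [[1.0] + list(row) for row in predictors]   # leading 1 for the intercept
    size = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(size)]
           for i in range(size)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(size)]
    return solve_linear_system(xtx, xty)


# Hypothetical data built exactly from Y = 1 + 2*x1 + 3*x2.
predictors = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 5)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in predictors]
fitted = fit_multiple_regression(predictors, ys)
print([round(value, 6) for value in fitted])   # close to [1, 2, 3]
```

For real data a numerical library is usually a better choice than hand-rolled elimination; this version only shows the mechanics.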

Non-linear Trend/ Parabolic trend:

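The parabolic curve used to measure a non-linear trend has the form Yc = a + bX + cX². When the origin is chosen so that ΣX = 0 (and ΣX³ = 0), the normal equations simplify to ΣY = na + cΣX², ΣXY = bΣX², and ΣX²Y = aΣX² + cΣX⁴. This gives b = ΣXY / ΣX² directly, while a and c are found by solving the first and third equations together.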
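A minimal sketch of fitting a parabolic trend, assuming an odd number of equally spaced periods so that X can be centered with ΣX = 0 and ΣX³ = 0. Under that assumption the least squares normal equations give b = ΣXY / ΣX², c = (nΣX²Y − ΣX²ΣY) / (nΣX⁴ − (ΣX²)²), and a = (ΣY − cΣX²) / n. The function name and data are hypothetical:

```python
def fit_parabolic_trend(ys):
    """Fit Yc = a + bX + cX² with X centered on the middle period."""
    n = len(ys)
    if n < 3 or n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of periods, at least 3")
    xs = [i - n // 2 for i in range(n)]              # e.g. -2, -1, 0, 1, 2
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x ** 2 for x in xs)
    sum_x4 = sum(x ** 4 for x in xs)
    sum_x2y = sum(x ** 2 * y for x, y in zip(xs, ys))

    b = sum_xy / sum_x2
    c = (n * sum_x2y - sum_x2 * sum_y) / (n * sum_x4 - sum_x2 ** 2)
    a = (sum_y - c * sum_x2) / n
    return a, b, c


# Hypothetical series generated from Yc = 35 + 5X + 3X² for X = -2..2.
series = [37, 33, 35, 43, 57]
a, b, c = fit_parabolic_trend(series)
print(a, b, c)   # 35.0 5.0 3.0
```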

Shifting of Origin for parabolic trend

  • Yc = 35 + 5X + 3X²
  • Shifting the origin changes a and b, while c remains unchanged
  • Yc = a₂ + b₂X + cX², where a₂ and b₂ are the constants after the shift
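A short worked expansion shows why only a and b change. Here k (the number of periods by which the origin is moved forward) and X' (time measured from the new origin) are symbols introduced for illustration, with X = X' + k:

```latex
\begin{aligned}
Y_c &= a + b(X' + k) + c(X' + k)^2 \\
    &= \underbrace{(a + bk + ck^2)}_{a_2}
     + \underbrace{(b + 2ck)}_{b_2}\,X' + c\,X'^2 \\[4pt]
\text{With } a = 35,\ b = 5,\ c = 3,\ k = 1:\quad
a_2 &= 35 + 5 + 3 = 43, \qquad b_2 = 5 + 6 = 11, \qquad c = 3
\end{aligned}
```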

Trend values using regression model
