Bivariate Data & Correlation

Summary

This document provides an introduction to bivariate data, explaining what it is and how it relates to correlation and regression, with examples and formulas.

Full Transcript


# UNIT - I

## Correlation and Regression

### Bivariate Data

The word *bivariate* is used for situations in which two characteristics are measured on each individual, these characteristics being represented by two variables. Statistical data relating to the simultaneous measurement of two variables are called **bivariate data**. A bivariate data set consists of measurements on two variables, normally denoted X and Y, and the observations on each individual are paired. Thus, for n individuals we have the pairs (x₁, y₁), (x₂, y₂), …, (xₙ, yₙ).

**Eg.:** Bivariate data of heights (X) in cm and weights (Y) in kg of 8 students can be represented as:

| X (cm) | Y (kg) |
|---|---|
| 176 | 64 |
| 172 | 60 |
| 180 | 62 |
| 173 | 67 |
| 182 | 60 |
| 170 | 59 |
| 168 | 20 |
| 174 | 21 |

### Bivariate Frequency Distribution

When a bivariate data set is large, it is advantageous to rearrange it in the form of a two-way frequency table over the two variables X and Y, called a **bivariate frequency table**. The class intervals of X, lx₁–ux₁, …, lxₙ–uxₙ (with mid values x₁, …, xₙ), run across the columns, the class intervals of Y (with mid values y₁, …, yₘ) run down the rows, and each cell holds the frequency f(x, y) of the pairs falling in it:

| Y \ X | lx₁–ux₁ | lx₂–ux₂ | … | lxₙ–uxₙ | Total h(y) |
|---|---|---|---|---|---|
| ly₁–uy₁ | f(x₁, y₁) | f(x₂, y₁) | … | f(xₙ, y₁) | h(y₁) |
| ly₂–uy₂ | f(x₁, y₂) | f(x₂, y₂) | … | f(xₙ, y₂) | h(y₂) |
| … | … | … | … | … | … |
| lyₘ–uyₘ | f(x₁, yₘ) | f(x₂, yₘ) | … | f(xₙ, yₘ) | h(yₘ) |
| Total g(x) | g(x₁) | g(x₂) | … | g(xₙ) | N |

- g(x) = Σ_y f(x, y) — total of the frequencies **column-wise**
- h(y) = Σ_x f(x, y) — total of the frequencies **row-wise**
- N = ΣΣ f(x, y) = Σ g(x) = Σ h(y)

## Example of Representing a Bivariate Frequency Table

The data given below relate to the heights and weights of 20 people. Form a two-way frequency table with the class intervals 62–64, 64–66, and so on for heights, and 115–125 lbs, 125–135 lbs, and so on for weights. (A short sketch of tallying these observations into such a table follows the data.)

| S.No | Height (in) | Weight (lbs) |
|---|---|---|
| 1 | 70 | 170 |
| 2 | 65 | 135 |
| 3 | 65 | 136 |
| 4 | 64 | 137 |
| 5 | 69 | 148 |
| 6 | 63 | 124 |
| 7 | 65 | 128 |
| 8 | 70 | 129 |
| 9 | 71 | 129 |
| 10 | 62 | 129 |
| 11 | 70 | 163 |
| 12 | 67 | 139 |
| 13 | 63 | 122 |
| 14 | 68 | 134 |
| 15 | 67 | 140 |
| 16 | 66 | 138 |
| 17 | 67 | 132 |
| 18 | 69 | 148 |
| 19 | 67 | 129 |
| 20 | 67 | 152 |
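As a quick illustration (my addition, not part of the original exercise), here is a minimal Python sketch of the tally. The bin edges follow the class intervals stated above; treating each interval as including its lower bound and excluding its upper bound is an assumption, since the notes do not say how boundary values are assigned.

```python
# Tally the 20 (height, weight) pairs above into a two-way frequency table.
data = [(70, 170), (65, 135), (65, 136), (64, 137), (69, 148),
        (63, 124), (65, 128), (70, 129), (71, 129), (62, 129),
        (70, 163), (67, 139), (63, 122), (68, 134), (67, 140),
        (66, 138), (67, 132), (69, 148), (67, 129), (67, 152)]

h_bins = [(lo, lo + 2) for lo in range(62, 72, 2)]      # 62-64, ..., 70-72
w_bins = [(lo, lo + 10) for lo in range(115, 175, 10)]  # 115-125, ..., 165-175

# f[i][j] = number of pairs falling in weight class i and height class j
f = [[0] * len(h_bins) for _ in w_bins]
for h, w in data:
    i = next(k for k, (lo, hi) in enumerate(w_bins) if lo <= w < hi)
    j = next(k for k, (lo, hi) in enumerate(h_bins) if lo <= h < hi)
    f[i][j] += 1

for (lo, hi), row in zip(w_bins, f):
    print(f"{lo}-{hi}: {row}  h(y) = {sum(row)}")   # row totals h(y)
g = [sum(col) for col in zip(*f)]                    # column totals g(x)
print("g(x) =", g, " N =", sum(g))                   # grand total N = 20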
## Scatter Diagram

This is the simplest diagrammatic representation of bivariate data. For the bivariate distribution (xᵢ, yᵢ), i = 1, 2, …, n, if the values of the variables X and Y are plotted along the x-axis and y-axis of the x–y plane, the diagram of dots so obtained is known as a scatter diagram. From the scatter diagram we can form a fairly good, though vague, idea of whether the variables are correlated: if the points are very dense, that is, very close to each other, we may expect a fairly good amount of correlation between the variables, while if the points are widely scattered, poor correlation is expected. This method, however, is not suitable if the number of observations is fairly large.

## Correlation

- Numerical.
- Quantitative data.
- Correlation is denoted by r and written as r(x, y).

In a bivariate distribution, if a change in one variable effects a change in the other variable, the variables are said to be correlated; correlation thus refers to the relationship between the variables and is denoted by r.

- **Direct (positive) correlation:** if the two variables change in the same direction, that is, if one increases the other also increases, or if one decreases the other also decreases, the correlation is called positive; when the relationship is exact it is called perfect positive correlation. **Examples:** (i) heights and weights of a group of persons; (ii) income and expenditure.
- **Inverse (negative) correlation:** if the two variables change in opposite directions, that is, one increases while the other decreases and vice versa, the correlation is called negative. **Examples:** (i) price and demand of a commodity; (ii) increase in temperature and demand for sweaters.

## Scatter Diagrams for Typical Cases

- **Perfect positive correlation:** r = 1
- **Perfect negative correlation:** r = −1
- **No correlation:** r = 0

## Linear Correlation

If the amount of change in one variable bears a constant ratio to the amount of change in the other variable, the correlation is said to be linear.

**Ex.:**
X: 2, 4, 6, 8, 10
Y: 5, 10, 15, 20, 25

(Here Y = 2.5X, so the changes are always in a constant ratio.)

## Non-linear (Curvilinear) Correlation

If the amount of change in one variable does not bear a constant ratio to the amount of change in the other variable, the correlation is said to be non-linear.

**Ex.:**
X: 2, 4, 3, 2
Y: 4, 3, 1, 9, 6

## Computation of the Karl Pearson Correlation Coefficient for Grouped and Ungrouped Data and Its Properties

**Prof. Karl Pearson discovered a measure of the linear relationship between the variables X and Y, called the correlation coefficient. It is also known as the product moment correlation coefficient and is denoted by r(x, y) or ρ.** It measures the intensity or degree of the linear relationship between two variables and is given by:

$$ r(X,Y) = \frac{\operatorname{cov}(X,Y)}{\sigma_x \sigma_y} = \frac{\sum_{i=1}^n (x_i- \bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^n (x_i-\bar{x})^2}\sqrt{\sum_{i=1}^n (y_i-\bar{y})^2}} $$

If the joint probability distribution of X and Y is given, the correlation coefficient can be expressed in terms of expectations as shown below:

$$ r(X,Y) = \frac{\operatorname{cov}(X,Y)}{\sigma_x \sigma_y} = \frac{E[(X-E(X))(Y-E(Y))]}{\sqrt{E(X^2)-[E(X)]^2} \sqrt{E(Y^2)-[E(Y)]^2}} $$

## Properties of the Correlation Coefficient

**Property 1:** The correlation coefficient always lies between −1 and +1:

$$ -1 \leq r(X,Y) \leq 1 $$

**Proof:**

$$ r(x,y) = \frac{\operatorname{cov}(x,y)}{\sigma_x \sigma_y} = \frac{\sum_{i=1}^n (x_i- \bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^n (x_i-\bar{x})^2}\sqrt{\sum_{i=1}^n (y_i-\bar{y})^2}} $$

By the Cauchy–Schwarz inequality,

$$ \left(\sum_{i=1}^n a_ib_i\right)^2 \leq \sum_{i=1}^n a_i^2 \cdot \sum_{i=1}^n b_i^2 $$

Taking aᵢ = xᵢ − x̄ and bᵢ = yᵢ − ȳ gives

$$ \left[\sum_{i=1}^n (x_i- \bar{x})(y_i-\bar{y})\right]^2 \leq \sum_{i=1}^n (x_i- \bar{x})^2 \cdot \sum_{i=1}^n (y_i-\bar{y})^2 $$

Dividing both sides by the right-hand side,

$$ [r(x,y)]^2 \leq 1 $$

∴ −1 ≤ r(x, y) ≤ 1. (A short numerical sketch of this computation follows.)
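To make the definition concrete, here is a minimal Python sketch, my addition rather than part of the notes, that computes r directly from the deviation form of the formula. Run on the linear-correlation example above, it returns exactly 1.

```python
import math

def pearson_r(xs, ys):
    # Pearson's r from the deviation form of the definition:
    # r = sum((x - x̄)(y - ȳ)) / sqrt(sum((x - x̄)²) · sum((y - ȳ)²))
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# The linear-correlation example above: Y = 2.5X exactly.
print(pearson_r([2, 4, 6, 8, 10], [5, 10, 15, 20, 25]))  # 1.0, i.e. |r| at its bound
```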
**Property 2:** The correlation coefficient is independent of change of origin and scale.

**Proof:** Change of origin means a constant is added to or subtracted from the observations; change of scale means the observations are multiplied or divided by a constant. The Karl Pearson correlation coefficient is given by:

$$ r(X,Y) = \frac{\operatorname{Cov}(X,Y)}{\sigma_x \sigma_y} $$

Let us define, for constants a, b and positive constants h, k:

$$ U = \frac{X-a}{h}, \qquad V = \frac{Y-b}{k} $$

so that

$$ X = hU + a \;\text{→①} \qquad Y = kV + b \;\text{→②} $$

Taking expected values of ① and ②:

$$ E(X) = E(hU + a) = hE(U) + a \;\text{→③} \qquad E(Y) = E(kV + b) = kE(V) + b \;\text{→④} $$

Consider the covariance of X and Y: Cov(X, Y) = E[(X − E(X))(Y − E(Y))]. From ① and ③,

$$ X - E(X) = hU + a - hE(U) - a = h(U - E(U)) $$

and from ② and ④,

$$ Y - E(Y) = kV + b - kE(V) - b = k(V - E(V)) $$

Hence

$$ \operatorname{Cov}(X,Y) = E\big[h(U-E(U))\,k(V-E(V))\big] = hk\operatorname{Cov}(U,V) $$

Consider:

$$ \sigma_x = \sqrt{E[X-E(X)]^2} = \sqrt{E[h(U-E(U))]^2} = h\sqrt{E[U-E(U)]^2} = h\sigma_u $$

Similarly:

$$ \sigma_y = \sqrt{E[Y-E(Y)]^2} = \sqrt{E[k(V-E(V))]^2} = k\sqrt{E[V-E(V)]^2} = k\sigma_v $$

Therefore

$$ r(X,Y) = \frac{\operatorname{Cov}(X,Y)}{\sigma_x \sigma_y} = \frac{hk\operatorname{Cov}(U,V)}{h\sigma_u\, k\sigma_v} = \frac{\operatorname{Cov}(U,V)}{\sigma_u \sigma_v} = r(U,V) $$

∴ The correlation coefficient is independent of change of origin and scale. **Hence Proved.** (A numerical check of this invariance is sketched below.)
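As a quick numerical illustration of Property 2, the sketch below, again my addition with arbitrarily chosen constants a, b, h, k and arbitrary small data, redefines the same `pearson_r` helper for self-containment and checks that shifting and rescaling both variables leaves r unchanged.

```python
import math

def pearson_r(xs, ys):
    # Same helper as in the earlier sketch: r from the deviation form.
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
a, h = 10, 2.0   # arbitrary change of origin and scale for X
b, k = 50, 5.0   # arbitrary change of origin and scale for Y
us = [(x - a) / h for x in xs]   # U = (X - a)/h
vs = [(y - b) / k for y in ys]   # V = (Y - b)/k

print(pearson_r(xs, ys))  # 0.8
print(pearson_r(us, vs))  # 0.8 — identical, as Property 2 asserts
```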
**Property 3:** Two independent variables are uncorrelated, but the converse is not true.

**Proof:** Two random variables X and Y are said to be **uncorrelated** if Cov(X, Y) = 0.

- **Case 1: X and Y are independent.** If X and Y are two independent random variables, their joint density factorises:

$$ f_{X,Y}(x,y) = f_X(x) f_Y(y) $$

$$ E(XY) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\, f_{X,Y}(x,y)\, dx\, dy = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\, f_X(x) f_Y(y)\, dx\, dy = \int_{-\infty}^{\infty} x f_X(x)\, dx \int_{-\infty}^{\infty} y f_Y(y)\, dy $$

∴ E(XY) = E(X)·E(Y), and so

$$ \operatorname{Cov}(X,Y) = E(XY) - E(X)E(Y) = 0 $$

∴ Two independent random variables are always uncorrelated.

- **Case 2: the converse is not always true.** Two random variables may be dependent and yet uncorrelated. **Example:** let Y = X², where X takes each of the values below with equal probability:

| X | Y |
|---|---|
| −3 | 9 |
| −2 | 4 |
| −1 | 1 |
| 0 | 0 |
| 1 | 1 |
| 2 | 4 |
| 3 | 9 |

Here ΣX = 0 and ΣXY = ΣX³ = 0, so E(X) = 0 and E(XY) = 0, giving Cov(X, Y) = E(XY) − E(X)E(Y) = 0. Yet Y is completely determined by X, so uncorrelated variables need not be independent. (This is checked numerically in the sketch below.)
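A minimal numerical check of this example, my addition, treating the seven tabulated values as equally likely:

```python
# Y = X² is completely dependent on X, yet Cov(X, Y) = 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
n = len(xs)

# Cov(X, Y) = E(XY) - E(X)E(Y), with each value equally likely
cov = sum(x * y for x, y in zip(xs, ys)) / n - (sum(xs) / n) * (sum(ys) / n)
print(cov)  # 0.0 — uncorrelated, although Y is a function of X
```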
**Problems on the Correlation Coefficient:**

1. The following table gives the number of hours 10 students prepared for an examination and the marks they obtained. Find the correlation between them.

| Student No | No. of hours prepared | Marks obtained |
|---|---|---|
| 1 | 4 | 31 |
| 2 | 9 | 58 |
| 3 | 10 | 65 |
| 4 | 14 | 73 |
| 5 | 4 | 37 |
| 6 | 7 | 44 |
| 7 | 12 | 60 |
| 8 | 22 | 91 |
| 9 | 1 | 21 |
| 10 | 17 | 84 |

**Solution:** Let the number of hours prepared be X and the marks obtained be Y. The correlation coefficient is r(x, y) = Cov(x, y)/(σx σy).

| S.No | X | Y | XY | X² | Y² |
|---|---|---|---|---|---|
| 1 | 4 | 31 | 124 | 16 | 961 |
| 2 | 9 | 58 | 522 | 81 | 3364 |
| 3 | 10 | 65 | 650 | 100 | 4225 |
| 4 | 14 | 73 | 1022 | 196 | 5329 |
| 5 | 4 | 37 | 148 | 16 | 1369 |
| 6 | 7 | 44 | 308 | 49 | 1936 |
| 7 | 12 | 60 | 720 | 144 | 3600 |
| 8 | 22 | 91 | 2002 | 484 | 8281 |
| 9 | 1 | 21 | 21 | 1 | 441 |
| 10 | 17 | 84 | 1428 | 289 | 7056 |
| Σ | 100 | 564 | 6945 | 1376 | 36562 |

**Calculation:**

$$ \operatorname{Cov}(x,y) = \frac{1}{n}\sum X_iY_i - \bar{X}\bar{Y} = \frac{6945}{10} - \frac{100}{10}\cdot\frac{564}{10} = 694.5 - 564 = 130.5 $$

$$ \sigma_x = \sqrt{\frac{1}{n}\sum X_i^2 - \bar{X}^2} = \sqrt{\frac{1376}{10} - 10^2} = \sqrt{37.6} = 6.132 $$

$$ \sigma_y = \sqrt{\frac{1}{n}\sum Y_i^2 - \bar{Y}^2} = \sqrt{\frac{36562}{10} - 56.4^2} = \sqrt{475.24} = 21.8 $$

$$ r(x,y) = \frac{\operatorname{Cov}(x,y)}{\sigma_x \sigma_y} = \frac{130.5}{6.132 \times 21.8} \approx 0.976 $$

∴ The correlation between the number of hours prepared and the marks obtained is about 0.976, a strong positive correlation. (The sketch below reproduces this calculation.)
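The arithmetic above can be reproduced with a short sketch, my addition, using the same computational formulas Cov = (1/n)ΣXY − x̄ȳ and σ² = (1/n)ΣX² − x̄²:

```python
import math

hours = [4, 9, 10, 14, 4, 7, 12, 22, 1, 17]
marks = [31, 58, 65, 73, 37, 44, 60, 91, 21, 84]
n = len(hours)

x_bar, y_bar = sum(hours) / n, sum(marks) / n
cov = sum(x * y for x, y in zip(hours, marks)) / n - x_bar * y_bar
sigma_x = math.sqrt(sum(x * x for x in hours) / n - x_bar ** 2)
sigma_y = math.sqrt(sum(y * y for y in marks) / n - y_bar ** 2)

print(cov)                        # 130.5
print(sigma_x, sigma_y)           # ≈ 6.132, ≈ 21.80
print(cov / (sigma_x * sigma_y))  # ≈ 0.976
```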
## Correlation Ratio

The correlation ratio is a measure of the curvilinear relationship between two variables and is denoted by η (eta). If the variables are linearly related, the Karl Pearson correlation coefficient is the best measure of the association between them; if there is a curvilinear relationship between the variables, the correlation coefficient is not the appropriate measure. In such cases r may be very low even when the variables are strongly (but non-linearly) related, and the correlation ratio is then the better measure. Whereas r measures the concentration of points about the straight line of best fit, η measures their concentration about the curve of best fit.

## Spearman's Rank Correlation Coefficient

- Qualitative data.

The Karl Pearson correlation is based on the assumption that the variables are measured on an interval or ratio scale. If the measurements are made on a nominal or ordinal scale, the Karl Pearson correlation coefficient will not work. The alternative method of finding correlation is Spearman's rank correlation coefficient. The coefficient of rank correlation, based on ranks, was developed by **Charles Edward Spearman in 1904**; the method is especially useful for qualitative measures.

Let X₁, X₂, …, Xₙ and Y₁, Y₂, …, Yₙ be the ranks of n individuals for two characteristics A and B respectively. The correlation obtained from the ranks of the individuals on the two characteristics is called rank correlation. The ranks of an individual on the two characteristics may or may not be the same.

**How to calculate rank correlation:** since the ranks x and y are each a permutation of 1, 2, …, n,

$$ \sum x = \sum y = \frac{n(n+1)}{2}, \qquad \sum x^2 = \sum y^2 = \frac{n(n+1)(2n+1)}{6} $$

Writing dᵢ = Xᵢ − Yᵢ for the difference in ranks, the coefficient reduces to:

$$ \rho = 1 - \frac{6\sum d_i^2}{n(n^2-1)} $$

**Problems on the Rank Correlation Coefficient**

1. Compute Spearman's rank correlation coefficient. (A computational check of this example is given at the end of this section.)

| X | Y | Rank Xᵢ | Rank Yᵢ | dᵢ = Xᵢ − Yᵢ | dᵢ² |
|---|---|---|---|---|---|
| 20 | 19 | 3 | 2 | 1 | 1 |
| 36 | 25 | 1 | 1 | 0 | 0 |
| 14 | 9 | 4 | 4 | 0 | 0 |
| 29 | 10 | 2 | 3 | −1 | 1 |
| 5 | 2 | 6 | 6 | 0 | 0 |
| 11 | 6 | 5 | 5 | 0 | 0 |
| | | | | Σdᵢ² | 2 |

**Solution:**

$$ \rho = 1 - \frac{6\sum d_i^2}{n(n^2-1)} = 1 - \frac{6(2)}{6(6^2-1)} = 1 - \frac{12}{210} = 0.9428 $$

∴ This is a strong positive correlation.

## Partial Correlation

The correlation between any two variables that is studied after eliminating the linear effect of the other variables on them is called partial correlation. In other words, it is the relationship that remains between two variables when the effect of the other variables is excluded.

**Coefficient of Partial Correlation**
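As noted above, here is a small computational check of the rank-correlation example, my addition; the ranking helper assumes the values contain no ties, which holds for these data.

```python
def ranks(values):
    # Rank 1 for the largest value, assuming no tied values (true here).
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

xs = [20, 36, 14, 29, 5, 11]
ys = [19, 25, 9, 10, 2, 6]

n = len(xs)
d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
rho = 1 - 6 * d2 / (n * (n ** 2 - 1))
print(d2, rho)  # 2, ≈ 0.9428 — matching the hand calculation
```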
