Summary

This document provides an overview of the Unit 3 statistical concepts: hypothesis testing, types of sampling (probability and non-probability methods), and analysis of variance (ANOVA), together with specific tests such as the t-test, F-test and chi-square test.

Full Transcript

UNIT 3 SYLLABUS: Hypothesis testing: one-tailed and two-tailed tests for means of small samples (t-test) – F-test – one-way and two-way analysis of variance (ANOVA) – chi-square test for simple sample standard deviation, independence of attributes and goodness of fit.

Sample Design
All the items under consideration in any field constitute a "universe" or "population". A complete enumeration of all the items in the population is known as a "census enquiry". Since a complete census enquiry is generally not possible, we select a "sample" – a few items from the universe – for our study. The researcher selects the sample using a "sampling design" – a definite plan determined before any data are actually collected.

Types of Sampling
Probability sampling techniques:
1. Simple random sampling
2. Systematic sampling
3. Stratified sampling
4. Cluster/area sampling
5. Multi-stage sampling
Non-probability sampling techniques:
1. Deliberate sampling
2. Quota sampling
3. Sequential sampling
4. Snowball sampling
5. Panel samples

Sampling Techniques
Sample: a sample can be defined as a part of the target population that represents the total population.
Sampling process:
1. Define the population.
2. Identify the sampling frame.
3. Specify the sampling unit.
4. Select the sampling method.
5. Determine the sample size.
6. Specify the sampling plan.
7. Select the samples.
Factors for the determination of sample size for a survey/research:
1. Inappropriate sampling frame.
2. Defective measuring device.
3. Non-respondents.
4. Indeterminacy principle.
5. Natural bias in the reporting of data.
Sampling errors: these are the random variations of the sample estimates around the true population values. Sampling error decreases as the sample size increases, and it tends to be of smaller magnitude when the population is homogeneous.

Determination of Sample Size
To determine the sample size for a pilot study, the following formula is used:
Sample size, n = (σ Zα / e)²
where σ represents the standard deviation of the population, Zα is the value determined by the researcher's chosen level of confidence, and e represents the error expected in the study.
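
As a rough illustration of the sample-size formula above, here is a minimal sketch (assuming scipy is available); the σ, e and confidence values are made-up inputs, not figures from the notes.

```python
# A rough sketch of n = (sigma * Z_alpha / e)^2; sigma, e and the confidence
# level below are made-up illustrative inputs, not figures from the notes.
import math
from scipy.stats import norm

def pilot_sample_size(sigma: float, error: float, confidence: float = 0.95) -> int:
    """Return n = (sigma * z / e)^2, rounded up to the next whole respondent."""
    z = norm.ppf(1 - (1 - confidence) / 2)   # two-sided critical value, ~1.96 at 95%
    return math.ceil((sigma * z / error) ** 2)

print(pilot_sample_size(sigma=12.0, error=2.0))   # ~139 for these assumed inputs
```
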
Hypothesis Testing
Hypothesis: in statistics, a hypothesis is a statement characterising the population that the researcher wishes to verify on the basis of the available sample information.
Hypothesis testing: the process by which a choice is made between two actions, i.e., either accept or reject the presumed statement.
Terminologies:
1. Null hypothesis: a statement about the population whose credibility or validity the researcher wants to assess based on the sample. It is formulated specifically to test for possible rejection or nullification, and it always states 'no difference'. The researcher's main aim is tested against this statement. E.g.: there is no significant difference in customers' opinion on opening Walmart outlets in Chennai city.
2. Alternative (alternate) hypothesis: the conclusion that we accept when the data fail to support the null hypothesis. E.g.: customers prefer kirana shops rather than established outlets.
3. Significance level: a percentage value giving the probability of rejecting the null hypothesis when it is true. Normally, 5% and 1% significance levels are considered for evaluation.
4. One-tailed test: a hypothesis test in which there is only one rejection region, i.e., we are concerned with whether the observed value deviates from the hypothesised value in one direction only.
5. Two-tailed test: a hypothesis test in which the null hypothesis is rejected if the sample value is significantly higher or lower than the hypothesised value. It is the test that involves both rejection regions.

Types of hypothesis: descriptive hypothesis, relational hypothesis, working hypothesis, null hypothesis, analytical hypothesis, statistical hypothesis, common-sense hypothesis, simple and composite hypothesis.

Sources of hypothesis:
1. Theory.
2. Observation.
3. Past experience.
4. Case studies.
5. Similarity.

Steps involved in Hypothesis Testing
1. Formulate the hypothesis.
2. Select the level of significance.
3. Select an appropriate test.
4. Calculate the test value.
5. Obtain the critical (table) value.
6. Make the decision.

Errors in hypothesis testing
1. Type I error: the hypothesis is true but the test rejects it.
2. Type II error: the hypothesis is false but the test accepts it.

Level of significance and confidence
Significance means the percentage risk of rejecting a null hypothesis when it is true, and it is denoted by α; it is generally taken as 1%, 5% or 10%. (1 − α) is the confidence level, the region in which the null hypothesis is accepted when it is true.

Acceptance and rejection regions at the 5% significance level:
Two-tailed test (H0: μ = μ0, Ha: μ ≠ μ0): two rejection regions, one in each tail, each with significance level α/2 = 0.025 (2.5%), and a central acceptance region with confidence level (1 − α) = 95%.
Left-tailed test (H0: μ = μ0, Ha: μ < μ0): a single rejection region in the left tail with significance level α = 0.05 (5%) and an acceptance region with confidence level (1 − α) = 95%.
Right-tailed test (H0: μ = μ0, Ha: μ > μ0): a single rejection region in the right tail with significance level α = 0.05 (5%) and an acceptance region with confidence level (1 − α) = 95%.

HYPOTHESIS TESTING PROCEDURES
Z-test (large samples): the z-test is a hypothesis test in which the z-statistic follows a normal distribution. It is best used for samples larger than 30 because, under the central limit theorem, as the sample size gets larger the sample mean is approximately normally distributed. A z-test is used to determine whether two population means are different when the variances are known and the sample size is large.

t-test (small samples): a t-test is an analysis of two population means through statistical examination; a two-sample t-test is commonly used with small sample sizes, testing the difference between the samples when the variances/standard deviations of the two normal distributions are not known. A t-test uses the t-statistic, the t-distribution and the degrees of freedom to determine the probability of a difference between populations for hypothesis testing. The t-test is often called Student's t-test after the pen name of its founder, "Student".

t-test: test for a specified mean
Two-tailed test hypotheses: H0: μ = μ0, H1: μ ≠ μ0.
Test statistic: t = (x̄ - μ0) / (S/√n), where S = √{n s² / (n - 1)}.
Inference: (n - 1) is the degrees of freedom for the distribution; this value is used to find the table value for the given level of significance. If the calculated value is less than the table value at the 5% or 1% significance level, the null hypothesis is accepted. If the calculated value is more than the table value, the null hypothesis is rejected and the alternative hypothesis is accepted.
Note: a one-tailed test is performed the same way; the difference lies in the statement of the hypotheses and the table values for the significance level.
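
A minimal sketch of the one-sample t-test just described, following the S = √{ns²/(n-1)} convention used in the notes (assuming scipy is available); the data, hypothesised mean and significance level are illustrative assumptions.

```python
# A minimal sketch of the one-sample t-test above; the data and the hypothesised
# mean mu0 are made-up illustrative values, not taken from the notes.
import math
from scipy.stats import t as t_dist

data = [12.1, 11.6, 12.8, 11.9, 12.4, 12.0, 11.7, 12.3]
mu0 = 12.5                                    # hypothesised mean under H0
n = len(data)
mean = sum(data) / n
s2 = sum((x - mean) ** 2 for x in data) / n   # biased sample variance s^2
S = math.sqrt(n * s2 / (n - 1))               # S = sqrt(n*s^2/(n-1)) as in the notes
t_calc = (mean - mu0) / (S / math.sqrt(n))
t_table = t_dist.ppf(1 - 0.05 / 2, df=n - 1)  # two-tailed table value at 5%, (n-1) d.f.
print(abs(t_calc) > t_table)                  # True -> reject H0, accept H1
```
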
t-test: test of significance for the difference between two population means when the population standard deviations are not known
Two-tailed test hypotheses: H0: μ1 = μ2, H1: μ1 ≠ μ2.
Test statistic: t = (x̄1 - x̄2) / [Sp √(1/n1 + 1/n2)], where Sp = √{(n1 s1² + n2 s2²) / (n1 + n2 - 2)}.
Inference: (n1 + n2 - 2) is the degrees of freedom for the distribution; this value is used to find the table value for the given level of significance. If the calculated value is less than the table value at the 5% or 1% significance level, the null hypothesis is accepted. If the calculated value is more than the table value, the null hypothesis is rejected and the alternative hypothesis is accepted.
Note: a one-tailed test is performed the same way; the difference lies in the statement of the hypotheses and the table values for the significance level.

t-test: paired observations
The condition of independence may not hold for all samples. When the samples are related to each other, a t-test can still be performed for small samples by converting the paired data into a single series of differences. The test statistic becomes t = d̄ / (Sd/√n), where d = x - y and Sd here represents the standard deviation of the population of differences.
Note: if the Sd value is computed from the sample, the denominator becomes Sd/√(n - 1).
The inference is similar to the previous t-tests discussed.

F-test
This test is based on a test statistic that follows the F-distribution and is used to check the equality of two population variances.
Two-tailed test hypotheses: H0: σ1² = σ2², H1: σ1² ≠ σ2².
Test statistic: F is the ratio of the two variance estimates, formed so that the larger estimate is in the numerator; the value of F is therefore always greater than 1.
Table value: the (n1 - 1) degrees of freedom of the numerator give the column and the (n2 - 1) degrees of freedom of the denominator give the row; with this table value the final interpretation is made. If the calculated value is less than the table value, the null hypothesis is accepted. If the calculated value is more than the table value, the null hypothesis is rejected and the alternative hypothesis is accepted.
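
A short sketch of the F-test for equality of two variances as described above (assuming scipy is available); the two samples are made-up illustrative data.

```python
# A short sketch of the F-test for equality of two population variances; the two
# samples are made-up illustrative data, not taken from the notes.
from statistics import variance               # unbiased sample variance (n-1 denominator)
from scipy.stats import f as f_dist

x = [23, 27, 25, 29, 30, 24, 26]
y = [31, 36, 28, 35, 32, 34]
vx, vy = variance(x), variance(y)
# Put the larger variance estimate in the numerator so that F >= 1.
if vx >= vy:
    F, df1, df2 = vx / vy, len(x) - 1, len(y) - 1
else:
    F, df1, df2 = vy / vx, len(y) - 1, len(x) - 1
F_table = f_dist.ppf(0.95, df1, df2)          # 5% table value (numerator d.f. = column)
print(F > F_table)                            # True -> reject H0 of equal variances
```
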
ANOVA
Analysis of variance (ANOVA) is a technique to test the equality of means when more than two populations are considered. It partitions the total variation into between-sample variation and within-sample variation. There are two types: (i) one-way ANOVA and (ii) two-way ANOVA.

One-Way Analysis of Variance
Methodology: write down the hypotheses for one-way ANOVA, i.e., H0: the population means are equal; H1: the population means are not all equal. Then:
1. Calculate N (the total number of observations).
2. Calculate T (the total of all the observations).
3. Calculate the correction factor T²/N.
4. Calculate the sums of squares:
(i) Total sum of squares: SST = [∑X1² + ∑X2² + … + ∑Xn²] - T²/N
(ii) Column sum of squares: SSC = [(∑X1)²/N1 + (∑X2)²/N1 + … + (∑Xn)²/N1] - T²/N, where N1 refers to the number of elements in each column.
5. Prepare the ANOVA table and calculate the F-ratio (F is calculated such that F > 1).

One-way ANOVA table:
Source of variation | Sum of squares | Degrees of freedom | Mean sum of squares | Variance ratio
Between columns | SSC | c - 1 | MSC = SSC/(c - 1) | F = MSC/MSE (or F = MSE/MSC, whichever is greater than 1)
Within columns (errors) | SSE | N - c | MSE = SSE/(N - c) |
Total | SST | N - 1 | |

6. After calculating the F-ratio, the final interpretation is made by comparison with the respective table value.
Finding the table value when F is calculated as MSC/MSE: the (c - 1) degrees of freedom give the column and the (N - c) degrees of freedom give the row. Compare the table value with the calculated value at the 5% or 1% level of significance. If calculated value < table value, the null hypothesis is accepted. If calculated value > table value, the null hypothesis is rejected and the alternative hypothesis is accepted. Accordingly, state the final interpretation in words.

Example – one-way ANOVA
Three samples are obtained from normal populations with equal variances. Test the hypothesis at the 5% level of significance that the sample means are equal.

X1 | X1² | X2 | X2² | X3 | X3²
8 | 64 | 7 | 49 | 12 | 144
10 | 100 | 5 | 25 | 9 | 81
7 | 49 | 10 | 100 | 13 | 169
14 | 196 | 9 | 81 | 12 | 144
11 | 121 | 9 | 81 | 14 | 196
Total 50 | 530 | 40 | 336 | 60 | 734

Solution:
H0: the population means are equal. H1: the population means are not all equal.
1. Number of observations, N = 15.
2. Total of all observations, T = 50 + 40 + 60 = 150.
3. Correction factor = T²/N = 150²/15 = 22500/15 = 1500.
4. Total sum of squares, SST = 530 + 336 + 734 - 1500 = 100.
5. Sum of squares between samples, SSC = 50²/5 + 40²/5 + 60²/5 - 1500 = 40.
6. Sum of squares within samples, SSE = 100 - 40 = 60.

ANOVA table:
Source of variation | Sum of squares | Degrees of freedom | Mean sum of squares | Variance ratio
Between columns | SSC = 40 | c - 1 = 3 - 1 = 2 | MSC = SSC/(c - 1) = 40/2 = 20 | F = MSC/MSE = 20/5 = 4 (since MSC > MSE)
Within columns (errors) | SSE = 60 | N - c = 15 - 3 = 12 | MSE = SSE/(N - c) = 60/12 = 5 |
Total | SST = 100 | N - 1 = 15 - 1 = 14 | | F = 4 (calculated value)

Table value for v1 = 2 and v2 = 12 at the 5% level of significance = 3.89.
Calculated value > table value, so the null hypothesis is rejected and the alternative hypothesis is accepted: the population means are not equal at the 5% level of significance.
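
The same one-way ANOVA example can be cross-checked with scipy's built-in routine (a sketch, assuming scipy is available); the three samples below are the columns X1, X2, X3 from the worked example.

```python
# A cross-check of the one-way ANOVA example above; the three samples are the
# columns X1, X2, X3 from the worked example.
from scipy.stats import f_oneway, f as f_dist

x1 = [8, 10, 7, 14, 11]
x2 = [7, 5, 10, 9, 9]
x3 = [12, 9, 13, 12, 14]
F, p = f_oneway(x1, x2, x3)
print(round(F, 2))                        # 4.0, matching MSC/MSE = 20/5 in the table
print(F > f_dist.ppf(0.95, 2, 12))        # True -> reject H0 (table value ~3.89)
```
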
Two-Way Analysis of Variance
Methodology: write down the hypotheses for two-way ANOVA, i.e., H0: there is no significant difference between column means or between row means; H1: there is a significant difference between column means and/or between row means. Then:
1. Calculate N (the total number of observations).
2. Calculate T (the total of all the observations).
3. Calculate the correction factor T²/N.
4. Calculate the sums of squares:
(i) Total sum of squares: SST = [∑X1² + ∑X2² + … + ∑Xn²] - T²/N
(ii) Column sum of squares: SSC = [(∑X1)²/N1 + (∑X2)²/N1 + … + (∑Xn)²/N1] - T²/N, where N1 refers to the number of elements in each column.
(iii) Row sum of squares: SSR = [(∑Y1)²/N2 + (∑Y2)²/N2 + … + (∑Yn)²/N2] - T²/N, where N2 refers to the number of elements in each row.
5. Prepare the ANOVA table and calculate the F-ratios (F-values are calculated such that F > 1).

Two-way ANOVA table:
Source of variation | Sum of squares | Degrees of freedom | Mean sum of squares | Variance ratio
Between columns | SSC | c - 1 | MSC = SSC/(c - 1) | Fc = MSC/MSE
Between rows | SSR | r - 1 | MSR = SSR/(r - 1) | FR = MSR/MSE
Residual (errors) | SSE | N - c - r + 1 | MSE = SSE/(N - c - r + 1) |
Total | SST | N - 1 | |

6. After calculating the Fc and FR ratio values, the final interpretation is made by comparison with the respective table values.
Finding the table values: for Fc, the (c - 1) degrees of freedom give the column and the (N - c - r + 1) degrees of freedom give the row; for FR, the (r - 1) degrees of freedom give the column and the (N - c - r + 1) degrees of freedom give the row. Compare the respective table value with the calculated value at the 5% or 1% level of significance. If calculated value < table value, the null hypothesis is accepted; if calculated value > table value, the null hypothesis is rejected and the alternative hypothesis is accepted. Accordingly, state the two final interpretations in words.

Example – two-way ANOVA
In a certain factory, production can be accomplished by four different workers on five different machines. A sample study, in the context of a two-way design without repeated values, is made with the two-fold objective of examining whether the four workers differ with respect to mean productivity and whether the mean productivity is the same for all five machines. The researcher involved in this study reports the following data:
i. Sum of squares for variation between machines = 35.2
ii. Sum of squares for variation between workmen = 53.8
iii. Sum of squares for total variation = 174.2
Set up the ANOVA table for the given information and draw the inference about variation at the 5% level of significance.

Solution:
H0: the workers do not differ significantly with respect to mean productivity, and the mean productivity is the same for the five machines.
H1: the workers differ significantly with respect to mean productivity, and the mean productivity is not the same for the machines.

Two-way ANOVA table:
Source of variation | Sum of squares | Degrees of freedom | Mean sum of squares | Variance ratio
Between columns | SSC = 35.2 | c - 1 = 5 - 1 = 4 | MSC = SSC/(c - 1) = 35.2/4 = 8.8 | Fc = MSC/MSE = 8.8/7.1 = 1.24
Between rows | SSR = 53.8 | r - 1 = 4 - 1 = 3 | MSR = SSR/(r - 1) = 53.8/3 = 17.93 | FR = MSR/MSE = 17.93/7.1 = 2.53
Residual (errors) | SSE = 85.2 | N - c - r + 1 = 20 - 5 - 4 + 1 = 12 | MSE = SSE/(N - c - r + 1) = 85.2/12 = 7.1 |
Total | SST = 174.2 | N - 1 = 20 - 1 = 19 | |

Calculated values: Fc = MSC/MSE = 1.24 and FR = MSR/MSE = 2.53.
Table values: for Fc (column = 4, row = 12) = 3.26; for FR (column = 3, row = 12) = 3.49.
Final interpretation: calculated Fc < table value of Fc and calculated FR < table value of FR, so H0 is accepted. There is no significant difference in mean productivity with respect to either workers or machines.

Coding method: the coding method is another way of solving two-way ANOVA. In this method the first step is to subtract a chosen constant value from all the values in the data set, after which the regular methodology discussed earlier is followed.
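
Before moving on to non-parametric methods, the two-way ANOVA example above can be reproduced directly from its given sums of squares (a sketch, assuming scipy is available for the critical values).

```python
# A sketch reproducing the two-way ANOVA example above directly from its given
# sums of squares (columns = machines, rows = workers), as in the worked example.
from scipy.stats import f as f_dist

SSC, SSR, SST = 35.2, 53.8, 174.2
N, c, r = 20, 5, 4
SSE = SST - SSC - SSR                               # residual sum of squares = 85.2
df_c, df_r, df_e = c - 1, r - 1, N - c - r + 1      # 4, 3 and 12 degrees of freedom
MSC, MSR, MSE = SSC / df_c, SSR / df_r, SSE / df_e
Fc, Fr = MSC / MSE, MSR / MSE                       # ~1.24 and ~2.53
print(Fc > f_dist.ppf(0.95, df_c, df_e))            # False -> accept H0 for machines (table ~3.26)
print(Fr > f_dist.ppf(0.95, df_r, df_e))            # False -> accept H0 for workers (table ~3.49)
```
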
NON-PARAMETRIC METHODS
Non-parametric tests can be applied when the data do not follow any specific distribution and no assumptions are made about the population; they are therefore called distribution-free tests, and the data may be measured on any scale. Commonly used non-parametric tests are:
− Chi-square test
− The sign test
− Wilcoxon signed-ranks test
− Mann–Whitney U test
− Kruskal–Wallis (H) test
− Spearman rank correlation test

CHI-SQUARE TEST
First used by Karl Pearson, the chi-square test is the simplest and most widely used non-parametric test in statistical work. It is calculated using the formula
χ² = ∑ (O - E)² / E
where O denotes the observed frequencies and E the expected frequencies. The greater the discrepancy between the observed and expected frequencies, the greater the value of χ². The calculated value of χ² is compared with the table value of χ² for the given degrees of freedom.

Applications of the chi-square test:
– Test of goodness of fit (determine whether the actual numbers are similar to the expected/theoretical numbers).
– Test for independence of attributes (e.g. disease and treatment, vaccination and immunity).
– Test of whether the population has a specified value of the variance.

CHI-SQUARE TEST FOR INDEPENDENCE OF ATTRIBUTES
H0: in the population, the two categorical variables are independent, i.e. there is no relationship between the two variables.
H1: in the population, the two categorical variables are dependent, i.e. there is a relationship between the two variables.
Summarise the data in a two-way contingency table, with one set of columns for the observed counts and another for the expected counts. The expected count is calculated from the observed data as
Expected count, E = (row total × column total) / sample size.
The table is then expanded to calculate χ² = ∑ (O - E)² / E. The calculated value is compared with the table value and the final interpretation is made. For the table value, degrees of freedom = (r - 1)(c - 1), which gives the row to be used under the significance-level columns of the table.

Example: the following contingency table shows the classification of 1000 workers in a factory according to the disciplinary action taken by the management and their promotional experience:

Disciplinary action | Promoted | Not promoted | Total
Offenders | 30 | 670 | 700
Non-offenders | 70 | 230 | 300
Total | 100 | 900 | 1000

Use the chi-square test to ascertain whether the disciplinary action taken and promotional experience are associated.

Solution:
H0: there is no significant relationship between the disciplinary action taken and the promotional experience of the workers.
H1: there is a significant relationship between the disciplinary action taken and the promotional experience of the workers.

Expected frequency table (E = row total × column total / 1000):
Disciplinary action | Promoted | Not promoted | Total
Offenders | 70 | 630 | 700
Non-offenders | 30 | 270 | 300
Total | 100 | 900 | 1000

Observed, O | Expected, E | (O - E) | (O - E)² | (O - E)²/E
30 | 70 | -40 | 1600 | 22.86
670 | 630 | 40 | 1600 | 2.54
70 | 30 | 40 | 1600 | 53.33
230 | 270 | -40 | 1600 | 5.93
∑ (O - E)²/E = 84.66

Calculated value = 84.66.
Table value: degrees of freedom = (r - 1)(c - 1); for this 2 × 2 contingency table, r = 2 and c = 2, so the number of degrees of freedom = (2 - 1)(2 - 1) = 1. At the 5% level of significance the table value = 3.8415.
Final interpretation: calculated value (84.66) > table value (3.8415), so the null hypothesis is rejected and the alternative hypothesis is accepted. There exists a significant relationship between the disciplinary action taken and the promotional experience of the workers in the factory.
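
The same independence example can be cross-checked with scipy (a sketch, assuming scipy is available); correction=False is passed so that the statistic matches the uncorrected χ² computed by hand above.

```python
# A cross-check of the independence example above; the observed counts are the
# contingency table from the text, and correction=False keeps the statistic
# identical to the uncorrected chi-square computed by hand.
from scipy.stats import chi2_contingency

observed = [[30, 670],    # offenders: promoted, not promoted
            [70, 230]]    # non-offenders: promoted, not promoted
chi2_calc, p, dof, expected = chi2_contingency(observed, correction=False)
print(round(chi2_calc, 2), dof)   # ~84.66 with 1 degree of freedom (table value 3.8415)
print(p < 0.05)                   # True -> reject H0: the attributes are associated
```
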
CHI-SQUARE TEST FOR GOODNESS OF FIT
The chi-square goodness-of-fit test is appropriate when the following conditions are met: the sampling method is simple random sampling; the variable under study is categorical; and the expected number of sample observations at each level of the variable is at least 5. The approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyse the sample data, and (4) interpret the results.
Analysis:
Degrees of freedom: the degrees of freedom (DF) equal the number of levels (k) of the categorical variable minus 1: DF = k - 1.
Expected frequency counts: the expected frequency count at each level of the categorical variable equals the sample size times the hypothesised proportion from the null hypothesis, i.e. Ei = n pi, where Ei is the expected frequency count for the ith level, n is the total sample size, and pi is the hypothesised proportion of observations in level i.
Test statistic: the test statistic is a chi-square random variable defined by χ² = ∑ (Oi - Ei)²/Ei, where Oi is the observed frequency count and Ei is the expected frequency count for the ith level of the categorical variable.
The final results are interpreted by comparison with the table value obtained using the DF value.

Example: a sample analysis of the examination results of 500 students was made. It was found that 220 students had failed, 170 had secured a third class, 90 had secured a second class and the rest a first class. Does this sample support the general belief that these categories are in the ratio 4 : 3 : 2 : 1 respectively?

Solution:
H0: the results of the four categories of students follow the ratio 4 : 3 : 2 : 1.
H1: the results of the four categories of students do not follow the ratio 4 : 3 : 2 : 1.
Expected frequencies (Ei = n pi):
E1 = 500 × (4/10) = 200
E2 = 500 × (3/10) = 150
E3 = 500 × (2/10) = 100
E4 = 500 × (1/10) = 50

Category | Observed, O | Expected, E | (O - E) | (O - E)² | (O - E)²/E
Failures | 220 | 200 | 20 | 400 | 2
Third class | 170 | 150 | 20 | 400 | 2.67
Second class | 90 | 100 | -10 | 100 | 1
First class | 20 | 50 | -30 | 900 | 18
Total | 500 | | | | ∑ (O - E)²/E = 23.67

Table value: number of degrees of freedom = k - 1 = 4 - 1 = 3; at the 5% level of significance the table value = 7.8147.
Final interpretation: calculated value (23.67) > table value (7.8147), so the null hypothesis is rejected and the alternative hypothesis is accepted. Hence the results of the four categories of students do not follow the ratio 4 : 3 : 2 : 1.

CHI-SQUARE TEST FOR A SPECIFIED POPULATION VARIANCE OR STANDARD DEVIATION
To test a claim about the value of the variance or the standard deviation of a population, the test statistic follows a chi-square distribution with n - 1 degrees of freedom and is given by χ² = n s² / σ², where n is the sample size, s is the sample standard deviation and σ is the population standard deviation.

Example: a random sample of size 20 from a population gives a sample standard deviation of 6. Test the hypothesis that the population standard deviation is 9 at the 1% level of significance.
Solution: H0: σ = 9 and H1: σ ≠ 9. Substituting the given values (n = 20, s = 6, σ = 9) into χ² = n s²/σ² gives χ² = (20 × 6²)/9² = 720/81 = 8.89.
Final interpretation: number of degrees of freedom = n - 1 = 20 - 1 = 19; at the 1% level of significance the table value = 36.1909. Calculated value < table value, so the null hypothesis is accepted and the population standard deviation of the given distribution can be taken as 9.

Parametric vs non-parametric
Parametric tests use information about the population, or make certain assumptions: the population is assumed to be normally distributed, the data are distributed normally, and the population variances are equal. Non-parametric tests are used when no assumptions are made about the population distribution; they are also known as distribution-free tests, although information about the sampling distribution is still used.
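
To close the unit, a short sketch cross-checking the last two chi-square examples above (the 4:3:2:1 goodness-of-fit test and the specified-standard-deviation test), assuming scipy is available; all numbers come from the worked examples.

```python
# Goodness of fit: 500 examination results against the expected ratio 4:3:2:1.
from scipy.stats import chisquare, chi2

observed = [220, 170, 90, 20]                 # failed, third, second, first class
expected = [200, 150, 100, 50]                # 500 students in the ratio 4:3:2:1
stat, p = chisquare(observed, f_exp=expected)
print(round(stat, 2), stat > chi2.ppf(0.95, df=3))   # ~23.67, True -> reject H0

# Specified standard deviation: chi2 = n*s^2 / sigma^2 with n = 20, s = 6, sigma = 9.
n, s, sigma = 20, 6, 9
stat_sd = n * s**2 / sigma**2
print(round(stat_sd, 2), stat_sd < chi2.ppf(0.99, df=n - 1))   # ~8.89, True -> accept H0
```
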
