RADIVIEW - Statistics & Probability PDF

Summary

This document contains notes on statistics and probability, including topics like hypothesis testing. It appears to be a reviewer for a mid-term exam for STEM students.

Full Transcript

STEM STUDENTS’ MIDTERMS: SECOND QUARTER REVIEWER
PROB AND STATS

PALAWAN NATIONAL SCHOOL SCIENCE, TECHNOLOGY, ENGINEERING, AND MATHEMATICS STUDENTS’ ASSOCIATION (STEMSA)
Bgy. Manggahan, Puerto Princesa City, Palawan
Telephone No.: PNS STEMSA 09684548804
E-mail Address: [email protected]

BASIC CONCEPTS OF HYPOTHESIS TESTING

Hypotheses are tentative statements that explain facts about a situation based on available evidence.

Test of hypothesis is a statistical testing procedure to resolve a hypothesis.

A statistical hypothesis is a statement about the numerical value of a population parameter. It is a statement or tentative assertion which aims to explain facts about a certain phenomenon.

TWO KINDS OF HYPOTHESIS

1. A null hypothesis, denoted by Ho, is a statement that there is no difference between a parameter and a specific value.
2. An alternative hypothesis, denoted by Ha, is a statement that there exists a difference between a parameter and a specific value. It is the opposite or the negation of the null hypothesis.

Example:
Claim: The average monthly income of Filipino families in the low-income bracket is Php 8 000.
Ho: The average monthly income of Filipino families in the low-income bracket is Php 8 000 (μ = 8 000).
Ha: The average monthly income of Filipino families in the low-income bracket is not equal to Php 8 000 (μ ≠ 8 000).

Example:
Claim: The average number of hours it takes a nine-year-old child to learn a certain task is less than 0.50 hour.
Ho: The average number of hours it takes a nine-year-old child to learn a certain task is equal to 0.50 hour (μ = 0.50).
Ha: The average number of hours it takes a nine-year-old child to learn a certain task is less than 0.50 hour (μ < 0.50).

TYPES OF TESTS AND THE REJECTION REGION

1. Directional test
A test of any statistical hypothesis where the alternative hypothesis is expressed by less than (<) or greater than (>) is called a directional or one-tailed test, since the rejection region lies entirely on one tail of the sampling distribution.

2. Nondirectional test
A test where the alternative hypothesis is written with a not-equal sign (≠) is called a nondirectional or two-tailed test, since no assertion is made on the direction of the difference. The rejection region is split into two parts, one in each tail of the sampling distribution.

TYPES OF ERRORS

Type I error occurs when we reject the null hypothesis when it is true. This is also known as alpha error (α error).

Type II error occurs when we accept the null hypothesis when it is false. This is also known as beta error (β error).

LEVEL OF SIGNIFICANCE

The probability of committing a Type I error is known as the level of significance. It is denoted by the Greek letter α (alpha). The value of α tells us the probability of committing an error in rejecting the null hypothesis when it is actually true.

THE PARAMETER

Parameter: a numerical characteristic of a population.

Rejection region: the interval in the sampling distribution that leads to rejection of the null hypothesis.

Examples of parameters and their symbols:
1. Mean (μ): the average of the outcomes based on a repeated process or experiment.
2. Variance (σ²): a measure of variability that shows the degree of spread in a data set.
3. Standard deviation (σ): denotes how far, on average, an observed value is from its mean; it is the square root of the variance.

Example: The average weight of grade 11 students 17 years and older is 59 kilograms for males.
Parameter: The average weight of grade 11 students is 59 kg.

Hypothesis testing is a statistical testing procedure used to resolve a hypothesis. A hypothesis is tested in order to identify whether it is true or not. If it is found to be true, it is accepted; if it is found to be false, it is rejected.

Sample mean: the estimate of the population mean.

Population mean: the average of the group characteristic.

TEST-STATISTIC WHEN THE POPULATION VARIANCE IS KNOWN AND N > 30

The z-test of one-sample mean is used to test if the sample mean X̅ differs significantly from the population mean μ. The formula for the z-test can be written as:

$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

where:
X̅ = sample mean
μ = population mean
n = sample size
σ = standard deviation

Example: A certain drug is claimed by its manufacturers to reduce overweight men by 4.75 kg per month, with a standard deviation of 0.89 kg. Ten randomly chosen men reported losing an average of 4.25 kg within a month. Do these data support the claim of the manufacturer at the 0.05 level of significance?

Solution: Follow the five-step procedure in testing a hypothesis.

Step 1:
Ho: The average weight loss per month using the drug is equal to 4.75 kg (μ = 4.75).
Ha: The average weight loss per month using the drug is not equal to 4.75 kg (μ ≠ 4.75).

Step 2:
Type of test: two-tailed or nondirectional test
Critical value: With the use of table 1, α = 0.05, two-tailed test, the critical value is z = ±1.96.
Rejection region: z < −1.96 or z > 1.96

Step 3: Compute the test value.
Given: X̅ = 4.25, μ = 4.75, n = 10, σ = 0.89
Substitute the given values in the formula:

$$z = \frac{4.25 - 4.75}{0.89 / \sqrt{10}} = -1.78$$

The test value or the computed value is z = −1.78.

Step 4:
Decision: Since the computed or test value does not fall within the rejection region, we accept the null hypothesis.

Step 5:
Conclusion: There is no significant difference between the sample mean and the population mean. Thus, the manufacturer is correct in claiming that the drug can reduce overweight men by 4.75 kg per month.

TEST-STATISTIC WHEN THE POPULATION VARIANCE IS UNKNOWN AND N > 30

If σ is unknown, we can still use the z-test by replacing σ with s (the sample standard deviation), given that n > 30.

Example: A study shows that the cost of raising a child from birth to age one is more than Php 92 000. A random sample of 36 families reveals a mean of Php 95 000 with a standard deviation of Php 5 000. Based on these sample data, can it be concluded that the study is correct in its claim? Use the 0.01 level of significance.

Solution:

Step 1:
Ho: The average cost of raising a child from birth to age one is equal to Php 92 000 (μ = 92 000).
Ha: The average cost of raising a child from birth to age one is more than Php 92 000 (μ > 92 000).

Step 2:
Type of test: one-tailed or directional test
Critical value: With the use of table 1, α = 0.01, one-tailed test, the critical value is z = 2.33.
Rejection region: z > 2.33

Step 3: Compute the test value.
Given: X̅ = 95 000, μ = 92 000, n = 36, σ = 5 000
Substitute the given values in the formula:

$$z = \frac{95\,000 - 92\,000}{5\,000 / \sqrt{36}} = 3.6$$

The test value or the computed value is z = 3.6.

Step 4:
Decision: Since the computed or test value falls within the rejection region, we reject the null hypothesis.

Step 5:
Conclusion: There is a significant difference between the sample mean and the population mean. Thus, the study is correct in stating that the cost of raising a child from birth to age one is more than Php 92 000.

TEST-STATISTIC WHEN THE POPULATION VARIANCE IS UNKNOWN AND N < 30

When σ is unknown and the sample size is less than 30, we use the t-test of one-sample mean. The formula for the t-test is given as:

$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$

where:
X̅ = mean of the sample
μ = mean of the population
n = size of the sample
s = standard deviation of the sample
df = n − 1

Example: The director of a certain school for secretarial studies claimed that his graduates can type more than 85 words per minute. A random sample of 15 graduates has been found to have an average of 80 words per minute with a standard deviation of 7.6 words per minute. Using the 0.05 level of significance, test the claim of the director.

Solution:

Step 1: Formulate the null and alternative hypotheses.
Ho: The average number of words that graduates can type is 85 words per minute (μ = 85).
Ha: The average number of words that graduates can type is more than 85 words per minute (μ > 85).

Step 2:
Type of test: The test is one-tailed (right-tailed).
Critical value: Using the t-distribution table, the critical value of t at the 0.05 level, one-tailed test, df = 15 − 1 = 14, is t = 1.761.
Rejection region: t > 1.761

Step 3: Compute the test value.
Given: X̅ = 80, μ = 85, n = 15, s = 7.6
Substitute the values in the formula:

$$t = \frac{80 - 85}{7.6 / \sqrt{15}} = -2.55$$

Step 4:
Decision: Accept the null hypothesis, since the computed value falls outside the rejection region.

Step 5:
Conclusion: There is no significant evidence that the population mean is greater than 85. Thus, the claim of the director of a certain school for secretarial studies that his graduates can type more than 85 words per minute is not supported by the data.

STATISTICAL TEST

A statistical test uses the data obtained from a sample to make a decision about whether the null hypothesis should be rejected.

The numerical value obtained from a statistical test is called the test value.

FORMULA FOR Z-TEST FOR PROPORTION

$$z = \frac{\hat{p} - p}{\sqrt{pq / n}}$$

where:
p̂ = x/n (sample proportion)
p = population proportion
n = sample size
q = 1 − p

ASSUMPTIONS FOR TESTING A PROPORTION

Example: A dietician of PPC claims that 60% of Palawenos are trying to avoid junk foods in their diets. She randomly selected 200 people and found that 128 people stated that they were trying to avoid junk foods in their diets. Compute the z-value.

Given:
x = 128, n = 200
p = 0.60
p̂ = 128/200 = 0.64
q = 1 − 0.60 = 0.40

Substitute in the formula:

$$z = \frac{0.64 - 0.60}{\sqrt{(0.60)(0.40) / 200}} = \frac{0.04}{\sqrt{0.24 / 200}} = \frac{0.04}{\sqrt{0.0012}} = \frac{0.04}{0.03464} = 1.15$$

DRAWING CONCLUSIONS ABOUT THE POPULATION BASED ON THE TEST STATISTIC VALUE AND THE REJECTION REGION

STEP 1: State the hypotheses and identify the claim.
STEP 2: Find the critical value(s).
STEP 3: Compute the test value. First, it is necessary to find p̂ = x/n. Then use the formula:

$$z = \frac{\hat{p} - p}{\sqrt{pq / n}}$$

STEP 4: Make the decision.
STEP 5: Summarize the results and draw conclusions.

CHI SQUARE TEST FOR GOODNESS OF FIT

Formula:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

χ² = chi square
O = observed or actual frequencies
E = expected frequencies

The chi square test for goodness of fit can be used to analyze nominal and ordinal data. It determines whether the distribution of a set of data fits some claimed distribution.

Example:

Day      O      E      O−E    (O−E)²    (O−E)²/E
Mon      14     16     −2     4         0.25
Tues     22     16     6      36        2.25
Wed      16     16     0      0         0
Thur     18     16     2      4         0.25
Fri      12     16     −4     16        1
Sat      19     16     3      9         0.56
Sun      11     16     −5     25        1.56
Total    112    112                     5.87

A. Null hypothesis: The number of breakdowns in a factory on the various days of a week is uniformly distributed.
Alternative hypothesis: The number of breakdowns in a factory on the various days of a week is not uniformly distributed.

B. Level of significance: α = 0.05

C. Compute the test statistic:

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 5.87$$

D. df = n − 1 = 7 − 1 = 6; critical value: 12.592

E. Draw a conclusion: Because the computed statistic χ² = 5.87 is less than the critical value 12.592, accept the null hypothesis and reject the alternative hypothesis. Conclude that the number of breakdowns in a factory on the various days of a week is uniformly distributed.

SCATTER PLOT

Scatter plots are diagrams that are used to show the degree and pattern of relationship between two sets of data.

Bivariate data: The data collected in a study that involves two variables, one independent and one dependent, are called bivariate data.

Abscissa: the value of the independent variable x.

Ordinate: the value of the dependent variable y.

A quick description of the correlation in a scatter plot should always include a description of the:
Form (linear or non-linear)
Direction (positive or negative)
Strength (perfect, strong, weak)

[Figure: example scatter plots showing perfect positive, highly positive, low positive, low negative, highly negative, and perfect negative correlation, and no correlation.]

Here are the procedures for drawing a scatter plot:
Step 1: Choose your dependent and independent variables.
Step 2: Draw an x-axis for the independent variable.
Step 3: Add the y-axis for the dependent variable.
Step 4: Mark each data point on your scatter plot.
Step 5: Label your graph and your axes.

Example: Mrs. Cruz surveyed her 6 students on how many hours they studied before the exam and the grades that they got after the exam. In this example, the dependent variable is the grade the student received on the exam and the independent variable is the number of hours of study.

PEARSON’S CORRELATION COEFFICIENT

Pearson’s correlation coefficient (also known as Pearson r), denoted by r, is a test statistic that measures the strength of the linear relationship between two variables.

Formula:

$$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$$

The correlation coefficient (r) is a number between −1 and 1 that describes both the strength and the direction of correlation. In symbols, we write −1 ≤ r ≤ 1.

Example: Teachers of Pag-asa National High School instilled among their students the value of time management and excellence in everything they do. The table below shows the time in hours spent in studying (X) by six Grade 11 students and their scores in a test (Y). Solve for the Pearson’s sample correlation coefficient r.

STEPS:

4. Substitute the values obtained from step 3 in the formula:

$$r = \frac{6(420) - (21)(95)}{\sqrt{[6(91) - (21)^2][6(1975) - (95)^2]}}$$

$$r = \frac{2520 - 1995}{\sqrt{[546 - 441][11850 - 9025]}}$$

$$r = \frac{525}{\sqrt{[105][2825]}}$$

$$r = \frac{525}{\sqrt{296625}}$$

$$r = 0.96395 \text{ or } 0.96$$

The value of r is a positive number. Therefore, there is a positive correlation between the hours spent in studying and the scores in the test.
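The three one-sample mean examples in the reviewer (the weight-loss drug, the cost of raising a child, and the typing speed) all use the same ratio, (X̅ − μ) divided by the standard error. As a rough check of the arithmetic, here is a minimal Python sketch. The function name `one_sample_statistic` is a hypothetical helper introduced for illustration, and the critical values are copied from the reviewer rather than computed.

```python
import math


def one_sample_statistic(sample_mean, pop_mean, sd, n):
    """Return (sample_mean - pop_mean) / (sd / sqrt(n)).

    The same expression serves as the z statistic (sd = sigma, or s with n > 30)
    and as the t statistic (sd = s with n < 30).
    """
    standard_error = sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error


# Drug example: two-tailed z-test, critical values +/-1.96 (from the reviewer).
z_drug = one_sample_statistic(4.25, 4.75, 0.89, 10)
reject_drug = abs(z_drug) > 1.96

# Cost example: right-tailed z-test, critical value 2.33 (from the reviewer).
z_cost = one_sample_statistic(95_000, 92_000, 5_000, 36)
reject_cost = z_cost > 2.33

# Typing example: right-tailed t-test, df = 14, critical value 1.761 (from the reviewer).
t_typing = one_sample_statistic(80, 85, 7.6, 15)
reject_typing = t_typing > 1.761

print(round(z_drug, 2), reject_drug)
print(round(z_cost, 2), reject_cost)
print(round(t_typing, 2), reject_typing)
```

Only the comparison in the decision step changes between a two-tailed test (absolute value against the critical value) and a right-tailed test (the statistic itself against the critical value).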
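The z-test for a proportion in the junk-food example can be reproduced the same way. The sketch below assumes only the values given in the reviewer (x = 128, n = 200, p = 0.60); `proportion_z` is a hypothetical helper name.

```python
import math


def proportion_z(successes, n, p0):
    """z statistic for a one-sample proportion test against the claimed p0."""
    p_hat = successes / n          # sample proportion, x / n
    q0 = 1 - p0                    # q = 1 - p
    return (p_hat - p0) / math.sqrt(p0 * q0 / n)


z_junk_food = proportion_z(128, 200, 0.60)
print(round(z_junk_food, 2))
```

Note that the standard error uses the claimed proportion p, not the sample proportion p̂, which matches the formula in the reviewer.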
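The chi square goodness-of-fit example (factory breakdowns by day of the week) can be checked with a short Python sketch. It assumes the observed counts from the reviewer's table and a uniform expected frequency; the critical value 12.592 is taken from the reviewer, not computed here.

```python
# Observed breakdowns per day, as listed in the reviewer's table.
observed = {"Mon": 14, "Tues": 22, "Wed": 16, "Thur": 18,
            "Fri": 12, "Sat": 19, "Sun": 11}

total = sum(observed.values())
expected = total / len(observed)   # uniform distribution: same E for every day

# Sum of (O - E)^2 / E over all categories.
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())

df = len(observed) - 1
critical_value = 12.592            # alpha = 0.05, df = 6 (from the reviewer)
reject_null = chi_square > critical_value

print(total, expected, chi_square, df, reject_null)
```

The unrounded statistic is 5.875; the reviewer's 5.87 comes from rounding each term to two decimals before adding. The decision is the same either way.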
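The Pearson r substitution can also be verified numerically. Because the raw table of hours and scores is not reproduced in the transcript, this sketch works directly from the sums that appear in the substitution step (n = 6, ΣX = 21, ΣY = 95, ΣXY = 420, ΣX² = 91, ΣY² = 1975); `pearson_r_from_sums` is a hypothetical helper name.

```python
import math


def pearson_r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Pearson r computed from the summary sums used in the reviewer's formula."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r_from_sums(n=6, sum_x=21, sum_y=95,
                        sum_xy=420, sum_x2=91, sum_y2=1975)
print(round(r, 5))
```

With raw data available, the same sums would first be accumulated from the X and Y columns and then passed to the function.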
