Statistics and Probability Teaching Guide PDF

Summary

This teaching guide for senior high school statistics and probability in the Philippines, published in 2016 by the Commission on Higher Education, is designed to assist teachers. It covers various topics, including descriptive statistics, probability, sampling, and inference, and emphasizes the practical applications of statistical methods.

Full Transcript

Commission on Higher Education in collaboration with the Philippine Normal University TEACHING GUIDE FOR SENIOR HIGH SCHOOL Statistics and Probability CORE SUBJECT This Teaching Guide was collaboratively developed and review...

Commission on Higher Education in collaboration with the Philippine Normal University TEACHING GUIDE FOR SENIOR HIGH SCHOOL Statistics and Probability CORE SUBJECT This Teaching Guide was collaboratively developed and reviewed by educators from public and private schools, colleges, and universities. We encourage teachers and other education stakeholders to email their feedback, comments, and recommendations to the Commission on Higher Education, K to 12 Transition Program Management Unit - Senior High School Support Team at [email protected]. We value your feedback and recommendations. Published by the Commission on Higher Education, 2016 Chairperson: Patricia B. Licuanan, Ph.D. Commission on Higher Education K to 12 Transition Program Management Unit Office Address: 4th Floor, Commission on Higher Education, C.P. Garcia Ave., Diliman, Quezon City Telefax: (02) 441-1143 / E-mail Address: [email protected] DEVELOPMENT TEAM Team Leader: Jose Ramon G. Albert, Ph.D. Writers: Zita VJ Albacea, Ph.D., Mark John V. Ayaay Isidoro P. David, Ph.D., Imelda E. de Mesa This Teaching Guide by the Technical Editors: Commission on Higher Education Nancy A. Tandang, Ph.D., Roselle V. Collado is licensed under a Creative Commons Attribution- Copy Reader: Rea Uy-Epistola NonCommercial-ShareAlike Illustrator: Michael Rey O. Santos 4.0 International License. This means you are free to: Cover Artists: Paolo Kurtis N. Tan, Renan U. Ortiz Share — copy and redistribute CONSULTANTS the material in any medium or THIS PROJECT WAS DEVELOPED WITH THE PHILIPPINE NORMAL UNIVERSITY. format University President: Ester B. Ogena, Ph.D. Adapt — remix, transform, and VP for Academics: Ma. Antoinette C. Montealegre, Ph.D. build upon the material. VP for University Relations & Advancement: Rosemarievic V. Diaz, Ph.D. The licensor, CHED, cannot Ma. Cynthia Rose B. Bautista, Ph.D., CHED revoke these freedoms as long as Bienvenido F. Nebres, S.J., Ph.D., Ateneo de Manila University you follow the license terms. Carmela C. Oracion, Ph.D., Ateneo de Manila University However, under the following Minella C. Alarcon, Ph.D., CHED terms: Gareth Price, Sheffield Hallam University Attribution — You must give Stuart Bevins, Ph.D., Sheffield Hallam University appropriate credit, provide a link to the license, and indicate if SENIOR HIGH SCHOOL SUPPORT TEAM changes were made. You may do CHED K TO 12 TRANSITION PROGRAM MANAGEMENT UNIT so in any reasonable manner, but Program Director: Karol Mark R. Yee not in any way that suggests the licensor endorses you or your use. Lead for Senior High School Support: Gerson M. Abesamis NonCommercial — You may Lead for Policy Advocacy and Communications: Averill M. Pizarro not use the material for Course Development Officers: commercial purposes. John Carlo P. Fernando, Danie Son D. Gonzalvo ShareAlike — If you remix, Teacher Training Officers: transform, or build upon the material, you must distribute Ma. Theresa C. Carlos, Mylene E. Dones your contributions under the Monitoring and Evaluation Officer: Robert Adrian N. Daulat same license as the original. Administrative Officers: Ma. Leana Paula B. Bato, Printed in the Philippines by EC-TEC Commercial, No. 32 St. Louis Kevin Ross D. Nera, Allison A. Danao, Ayhen Loisse B. Dalena Compound 7, Baesa, Quezon City, [email protected] Preface Prior to the implementation of K-12, Statistics was taught in public high schools in the Philippines typically in the last quarter of third year. In private schools, Statistics was taught as either an elective, or a required but separate subject outside of regular Math classes. In college, Statistics was taught practically to everyone either as a three unit or six unit course. All college students had to take at least three to six units of a Math course, and would typically “endure” a Statistics course to graduate. Teachers who taught these Statistics classes, whether in high school or in college, would typically be Math teachers, who may not necessarily have had formal training in Statistics. They were selected out of the understanding (or misunderstanding) that Statistics is Math. Statistics does depend on and uses a lot of Math, but so do many disciplines, e.g. engineering, physics, accounting, chemistry, computer science. But Statistics is not Math, not even a branch of Math. Hardly would one think that accounting is a branch of mathematics simply because it does a lot of calculations. An accountant would also not describe himself as a mathematician. Math largely involves a deterministic way of thinking and the way Math is taught in schools leads learners into a deterministic way of examining the world around them. Statistics, on the other hand, is by and large dealing with uncertainty. Statistics uses inductive thinking (from specifics to generalities), while Math uses deduction (from the general to the specific). “Statistics has its own tools and ways of thinking, and statisticians are quite insistent that those of us who teach mathematics realize that statistics is not mathematics, nor is it even a branch of mathematics. In fact, statistics is a separate discipline with its own unique ways of thinking and its own tools for approaching problems.” - J. Michael Shaughnessy, “Research on Students’ Understanding of Some Big Concepts in Statistics” (2006) Statistics deals with data; its importance has been recognized by governments, by the private sector, and across disciplines because of the need for evidence-based decision making. It has become even more important in the past few years, now that more and more data is being collected, stored, analyzed and re-analyzed. From the time when humanity first walked the face of the earth until 2003, we created as much as 5 exabytes of data (1 exabyte being a billion “gigabytes”). Information communications technology (ICT) tools have provided us the means to transmit and exchange data much faster, whether these data are in the form of sound, text, visual images, signals or any other form or any combination of those forms using desktops, laptops, tablets, mobile phones, and other gadgets with the use of the internet, social media (facebook, twitter). With the data deluge arising from using ICT tools, as of 2012, as much as 5 exabytes were being created every two days (the amount of data created from the beginning of history up to 2003); a year later, this same amount of data was now being created every ten minutes. In order to make sense of data, which is typically having variation and uncertainty, we need the Science of Statistics, to enable us to summarize data for describing or explaining phenomenon; or to make predictions (assuming trends in the data continue). Statistics is the science that studies data, and what we can do with data. Teachers of Statistics and Probability can easily spend much time on the formal methods and computations, losing sight of the real applications, and taking the excitement out of things. The eminent statistician Bradley Efron mentioned how diverse statistical applications are: “During the 20th Century statistical thinking and methodology has become the scientific framework for literally dozens of fields including education, agriculture, economics, biology, and medicine, and with increasing influence recently on the hard sciences such as astronomy, geology, and physics. In other words, we have grown from a small obscure field into a big obscure field.” In consequence, the work of a statistician has become even fashionable. Google’s chief economist Hal Varian wrote in 2009 that “the sexy job in the next ten years will be statisticians.” He went on and mentioned that “The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it's going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. “ This teaching guide, prepared by a team of professional statisticians and educators, aims to assist Senior High School teachers of the Grade 11 second semester course in Statistics and Probability so that they can help Senior High School students discover the fun in describing data, and in exploring the stories behind the data. The K-12 curriculum provides for concepts in Statistics and Probability to be taught from Grade 1 up to Grade 8, and in Grade 10, but the depth at which learners absorb these concepts may need reinforcement. Thus, the first chapter of this guide discusses basic tools (such as summary measures and graphs) for describing data. While Probability may have been discussed prior to Grade 11, it is also discussed in Chapter 2, as a prelude to defining Random Variables and their Distributions. The next chapter discusses Sampling and Sampling Distributions, which bridges Descriptive Statistics and Inferential Statistics. The latter is started in Chapter 4, in Estimation, and further discussed in Chapter 5 (which deals with Tests of Hypothesis). The final chapter discusses Regression and Correlation. Although Statistics and Probability may be tangential to the primary training of many if not all Senior High School teachers of Statistics and Probability, it will be of benefit for them to see why this course is important to teach. After all, if the teachers themselves do not find meaning in the course, neither will the students. Work developing this set of teaching materials has been supported by the Commission on Higher Education under a Materials Development Sub-project of the K-12 Transition Project. These materials will also be shared with Department of Education. Writers of this teaching guide recognize that few Senior High School teachers would have formal training or applied experience with statistical concepts. Thus, the guide gives concrete suggestions on classroom activities that can illustrate the wide range of processes behind data collection and data analysis. It would be ideal to use technology (i.e. computers) as a means to help teachers and students with computations; hence, the guide also provides suggestions in case the class may have access to a computer room (particularly the use of spreadsheet applications like Microsoft Excel). It would be unproductive for teachers and students to spend too much time working on formulas, and checking computation errors at the expense of gaining knowledge and insights about the concepts behind the formulas. The guide gives a mixture of lectures and activities, (the latter include actual collection and analysis of data). It tries to follow suggestions of the Guidelines for Assessment and Instruction in Statistics Education (GAISE) Project of the American Statistical Association to go beyond lecture methods, and instead exercise conceptual learning, use active learning strategies and focus on real data. The guide suggests what material is optional as there is really a lot of material that could be taught, but too little time. Teachers will have to find a way of recognizing that diverse needs of students with variable abilities and interests. This teaching guide for Statistics and Probability, to be made available both digitally and in print to senior high school teachers, shall provide Senior High School teachers of Statistics and Probability with much-needed support as the country’s basic education system transitions into the K-12 curriculum. It is earnestly hoped that Senior High School teachers of Grade 11 Statistics and Probability can direct students into examining the context of data, identifying the consequences and implications of stories behind Statistics and Probability, thus becoming critical consumers of information. It is further hoped that the competencies gained by students in this course will help them become more statistical literate, and more prepared for whatever employment choices (and higher education specializations) given that employers are recognizing the importance of having their employee know skills on data management and analysis in this very data-centric world. K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL – CORE SUBJECT Grade: 11/12 Core Subject Title: Statistics and Probability No. of Hours/Semester: 80 hours/semester Prerequisite (if needed): Core Subject Description: At the end of the course, the students must know how to find the mean and variance of a random variable, to apply sampling techniques and distributions, to estimate population mean and proportion, to perform hypothesis testing on population mean and proportion, and to perform correlation and regression analyses on real-life problems. PERFORMANCE CONTENT CONTENT STANDARDS LEARNING COMPETENCIES CODE STANDARDS Random Variables The learner demonstrates The learner is able to apply The learner … and Probability understanding of key an appropriate random 1. illustrates a random variable (discrete and continuous). M11/12SP-IIIa-1 Distributions concepts of random variable for a given real-life 2. distinguishes between a discrete and a continuous variables and probability problem (such as in random variable. M11/12SP-IIIa-2 distributions. decision making and games 3. finds the possible values of a random variable. M11/12SP-IIIa-3 of chance). 4. illustrates a probability distribution for a discrete random variable and its properties. M11/12SP-IIIa-4 5. constructs the probability mass function of a discrete random variable and its corresponding histogram. M11/12SP-IIIa-5 6. computes probabilities corresponding to a given random variable. M11/12SP-IIIa-6 7. illustrates the mean and variance of a discrete random variable. M11/12SP-IIIb-1 8. calculates the mean and the variance of a discrete random variable. M11/12SP-IIIb-2 9. interprets the mean and the variance of a discrete random variable. M11/12SP-IIIb-3 10. solves problems involving mean and variance of probability distributions. M11/12SP-IIIb-4 Normal The learner demonstrates The learner is able to The learner … Distribution understanding of accurately formulate and 11. illustrates a normal random variable and its M11/12SP-IIIc-1 key concepts of normal solve real-life problems in characteristics. probability distribution. different disciplines 12. constructs a normal curve. M11/12SP-IIIc-2 K to 12 Senior High School Core Curriculum – Statistics and Probability December 2013 Page 1 of 7 K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL – CORE SUBJECT PERFORMANCE CONTENT CONTENT STANDARDS LEARNING COMPETENCIES CODE STANDARDS involving normal distribution. 13. identifies regions under the normal curve M11/12SP-IIIc-3 corresponding to different standard normal values. 14. converts a normal random variable to a standard normal variable and vice versa. M11/12SP-IIIc-4 15. computes probabilities and percentiles using the M11/12SP-IIIc-d- standard normal table. 1 Sampling and The learner demonstrates The learner is able to apply Sampling understanding of key suitable sampling and The learner … Distributions concepts of sampling and sampling distributions of 1. illustrates random sampling. sampling distributions of the the sample mean to solve M11/12SP-IIId-2 sample mean. real-life problems in 2. distinguishes between parameter and statistic. M11/12SP-IIId-3 different disciplines. 3. identifies sampling distributions of statistics (sample M11/12SP-IIId-4 mean). 4. finds the mean and variance of the sampling distribution M11/12SP-IIId-5 of the sample mean. 5. defines the sampling distribution of the sample mean for normal population when the variance is: M11/12SP-IIIe-1 (a) known (b) unknown 6. illustrates the Central Limit Theorem. M11/12SP-IIIe-2 7. defines the sampling distribution of the sample mean M11/12SP-III-3 using the Central Limit Theorem. 8. solves problems involving sampling distributions of the M11SP-IIIe-f-1 sample mean. Estimation of The learner demonstrates The learner is able to The learner … Parameters understanding of key estimate the population concepts of estimation of mean and population 1. illustrates point and interval estimations. M11/12SP-IIIf-2 population mean and proportion to make sound 2. distinguishes between point and interval estimation. M11/12SP-IIIf-3 K to 12 Senior High School Core Curriculum – Statistics and Probability December 2013 Page 2 of 7 K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL – CORE SUBJECT PERFORMANCE CONTENT CONTENT STANDARDS LEARNING COMPETENCIES CODE STANDARDS population proportion. inferences in real-life problems in different 3. identifies point estimator for the population mean. M11/12SP-IIIf-4 disciplines. 4. computes for the point estimate of the population M11/12SP-IIIf-5 mean. 5. identifies the appropriate form of the confidence interval estimator for the population mean when: (a) the population variance is known, (b) the population M11/12SP-IIIg-1 variance is unknown, and (c) the Central Limit Theorem is to be used. 9. illustrates the t-distribution. M11/12SP-IIIg-2 10. constructs a t-distribution. M11/12SP-IIIg-3 11. identifies regions under the t-distribution corresponding to different t-values. M11/12SP-IIIg-4 11. identifies percentiles using the t-table. M11/12SP-IIIg-5 12. computes for the confidence interval estimate based on the appropriate form of the estimator for the M11/12SP-IIIh-1 population mean. 13. solves problems involving confidence interval estimation of the population mean. M11/12SP-IIIh-2 14. draws conclusion about the population mean based on its confidence interval estimate. M11/12SP-IIIh-3 15. identifies point estimator for the population proportion. M11/12SP-IIIi-1 16. computes for the point estimate of the population proportion. M11/12SP-IIIi-2 17. identifies the appropriate form of the confidence interval estimator for the population proportion based M11/12SP-IIIi-3 on the Central Limit Theorem. 18. computes for the confidence interval estimate of the population proportion. M11/12SP-IIIi-4 19. solves problems involving confidence interval M11/12SP-IIIi-5 estimation of the population proportion. K to 12 Senior High School Core Curriculum – Statistics and Probability December 2013 Page 3 of 7 K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL – CORE SUBJECT PERFORMANCE CONTENT CONTENT STANDARDS LEARNING COMPETENCIES CODE STANDARDS 20. draws conclusion about the population proportion based on its confidence interval estimate M11/12SP-IIIi-6 21. identifies the length of a confidence interval. M11/12SP-IIIj-1 22. computes for the length of the confidence interval. M11/12SP-IIIj-2 23. computes for an appropriate sample size using the M11/12SP-IIIj-3 length of the interval. 24. solves problems involving sample size determination. M11/12SP-IIIj-4 Tests of The learner demonstrates The learner is able to The learner … Hypothesis understanding of key perform appropriate tests 1. illustrates: concepts of tests of of hypotheses involving the (a) null hypothesis hypotheses on the population mean and (b) alternative hypothesis M11/12SP-IVa-1 population mean and population proportion to (c) level of significance population proportion. make inferences in real-life (d) rejection region; and problems in different (e) types of errors in hypothesis testing. disciplines. 2. calculates the probabilities of committing a Type I and Type II error. M11/12SP-IVa-2 3. identifies the parameter to be tested given a real-life M11/12SP-IVa-3 problem. 4. formulates the appropriate null and alternative M11/12SP-IVb-1 hypotheses on a population mean. 5. identifies the appropriate form of the test-statistic when: (a) the population variance is assumed to be known M11/12SP-IVb-2 (b) the population variance is assumed to be unknown; and (c) the Central Limit Theorem is to be used. K to 12 Senior High School Core Curriculum – Statistics and Probability December 2013 Page 4 of 7 K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL – CORE SUBJECT PERFORMANCE CONTENT CONTENT STANDARDS LEARNING COMPETENCIES CODE STANDARDS 6. identifies the appropriate rejection region for a given level of significance when: (a) the population variance is assumed to be known M11/12SP-IVc-1 (b) the population variance is assumed to be unknown; and (c) the Central Limit Theorem is to be used. 7. computes for the test-statistic value (population mean). M11/12SP-IVd-1 8. draws conclusion about the population mean based on M11/12SP-IVd-2 the test-statistic value and the rejection region. 9. solves problems involving test of hypothesis on the M11/12SP-IVe-1 population mean. 10. formulates the appropriate null and alternative M11/12SP-IVe-2 hypotheses on a population proportion. 11. identifies the appropriate form of the test-statistic M11/12SP-IVe-3 when the Central Limit Theorem is to be used. 12. identifies the appropriate rejection region for a given level of significance when the Central Limit Theorem is M11/12SP-IVe-4 to be used. 13. computes for the test-statistic value (population proportion). M11/12SP-IVf-1 14. draws conclusion about the population proportion based on the test-statistic value and the rejection M11/12SP-IVf-2 region. 15. solves problems involving test of hypothesis on the M11/12SP-IVf-g- population proportion. 1 K to 12 Senior High School Core Curriculum – Statistics and Probability December 2013 Page 5 of 7 K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL – CORE SUBJECT CONTENT CONTENT STANDARDS PERFORMANCE LEARNING COMPETENCIES CODE STANDARDS ENRICHMENT The learner demonstrates The learner is able to 1. illustrates the nature of bivariate data. M11/12SP-IVg-2 understanding of key perform correlation and Correlation and concepts of correlation and regression analyses on 2. constructs a scatter plot. M11/12SP-IVg-3 Regression regression analyses. real-life problems in Analyses different disciplines. 3. describes shape (form), trend (direction), and variation M11/12SP-IVg-4 (strength) based on a scatter plot. 4. estimates strength of association between the variables M11/12SP-IVh-1 based on a scatter plot. 5. calculates the Pearson’s sample correlation coefficient. M11/12SP-IVh-2 6. solves problems involving correlation analysis. M11/12SP-IVh-3 7. identifies the independent and dependent variables. M11/12SP-IVi-1 8. draws the best-fit line on a scatter plot. M11/12SP-IVi-2 9. calculates the slope and y-intercept of the regression M11/12SP-IVi-3 line. 10. interprets the calculated slope and y-intercept of the M11/12SP-IVi-4 regression line. 11. predicts the value of the dependent variable given the M11/12SP-IVj-1 value of the independent variable. 12. solves problems involving regression analysis. M11/12SP-IVj-2 K to 12 Senior High School Core Curriculum – Statistics and Probability December 2013 Page 6 of 7 K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL – CORE SUBJECT Code Book Legend Sample: M11/12SP-IIIa-1 LEGEND SAMPLE Learning Area and Strand/ Subject or Mathematics Specialization First Entry M11/12 Grade Level Grade 11/12 Domain/Content/ Uppercase Letter/s Statistics and Probability SP Component/ Topic - Roman Numeral Quarter Third Quarter III *Zero if no specific quarter Lowercase Letter/s *Put a hyphen (-) in between letters to indicate Week Week one a more than a specific week - illustrates a random variable (discrete and Arabic Number Competency 1 continuous) K to 12 Senior High School Core Curriculum – Statistics and Probability December 2013 Page 7 of 7 Table of Contents Chapter 1: Exploring Data Chapter 3: Sampling Introducing Statistics 1 Coin Tossing revisited from a Data Collection Activity 7 Statistical Perspective 204 Basic Terms in Statistics 17 The Need for Sampling 221 Levels of Measurement 24 Sampling Distribution of the Sample Data Presentation 31 Mean 242 Measures of Central Tendency 44 Sampling without Replacement 265 Other Measures of Location 54 Sampling from a Box of Marbles, Nips, Measures of Variation 60 or Colored Paper Clips and One-Peso More on Describing Data: Coins 285 Summary Measures and Graphs 69 Sampling from the Periodic Table 299 Chapter 2: Random Variables and Chapter 4: On Estimation of Parameters Probability Distributions Concepts of Point and Interval Probability 86 Estimation 316 Geometric Probability 98 Point Estimation of the Population Random Variables 108 Mean 321 Probability Distributions of Confidence Interval Estimation of the Discrete Random Variables 117 Population Mean 328 Probability Density Functions 130 Point and Confidence Interval Mean and Variance of Discrete Estimation of the Population Proportion 344 Random Variables 144 More on Point Estimates and More about Means and Confidence Intervals 351 Variances 155 The Normal Distribution and Its Chapter 5: Tests of Hypothesis Properties 164 Basic Concepts in Hypothesis Testing 362 Areas Under a Standard Normal Steps in Hypothesis Testing 368 Distribution 182 Test on Population Mean 374 Areas under a Normal Test on Population Proportion 385 Distribution 194 More on Hypothesis Tests Regarding the Population Proportion 390 Chapter 6: Correlation and Regression Analysis Examining Relationships with Correlation 399 Biographical Notes 421 ! ! CHAPTER 1: EXPLORING DATA Lesson 1: Introducing Statistics TIME FRAME: 60 minutes OVERVIEW OF LESSON In decision making, we use statistics although some of us may not be aware of it. In this lesson, we make the students realize that to decide logically, they need to use statistics. An inquiry could be answered or a problem could be solved through the use of statistics. In fact, without knowing it we use statistics in our daily activities. LEARNING COMPETENCIES: At the end of the lesson, the learner should be able to identify questions that could be answered using a statistical process and describe the activities involved in a statistical process. LESSON OUTLINE: 1. Motivation 2. Statistics as a Tool in Decision-Making 3. Statistical Process in Solving a Problem REFERENCES: Albert, J. R. G. (2008).Basic Statistics for the Tertiary Level (ed. Roberto Padua, Welfredo Patungan, Nelia Marquez), published by Rex Bookstore. Handbook of Statistics 1 (1st and 2nd Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031 Workbooks in Statistics 1 (From 1st to 13th Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031 1" ! DEVELOPMENT OF THE LESSON A. Motivation You may ask the students, a question that is in their mind at that moment. You may write their answers on the board. (Note: You may try to group the questions as you write them on the board into two, one group will be questions that are answerable by a fact and the other group are those that require more than one information and needs further thinking). The following are examples of what you could have written on the board: Group 1: How old is our teacher? Is the vehicle of the Mayor of our city/town/municipality bigger than the vehicle used by the President of the Philippines? How many days are there in December? Does the Principal of the school has a post graduate degree? How much does the Barangay Captain receive as allowance? What is the weight of my smallest classmate? Group 2: How old are the people residing in our town? Do dogs eat more than cats? Does it rain more in our country than in Thailand? Do math teachers earn more than science teachers? How many books do my classmates usually bring to school? What is the proportion of Filipino children aged 0 to 5 years who are underweight or overweight for their age? The first group of questions could be answered by a piece of information which is considered always true. There is a correct answer which is based on a fact and you don’t need the process of inquiry to answer such kind of question. For example, there is one and only one correct answer to the first question in Group 1 and that is your age as of your last birthday or the number of years since your birth year. On the other hand, in the second group of questions one needs observations or data to be able to respond to the question. In some questions you need to get the observations or responses of all those concerned to be able to answer the question. On the first question in the second group, you need to ask all the people in the locality about their age and among the values you obtained you get a representative value. To answer the second question in the second group, 2" ! you need to get the amount of food that all dogs and cats eat to respond to the question. However, we know that is not feasible to do so. Thus what you can do is get a representative group of dogs and another representative group for the cats. Then we measure the amount of food each group of animal eats. From these two sets of values, we could then infer whether dogs do eat more than cats. So as you can see in the second group of questions you need more information or data to be able to answer the question. Either you need to get observations from all those concerned or you get representative groups from which you gather your data. But in both cases, you need data to be able to respond to the question. Using data to find an answer or a solution to a problem or an inquiry is actually using the statistical process or doing it with statistics. Now, let us formalize what we discussed and know more about statistics and how we use it in decision-making. B. Main Lesson 1. Statistics as a Tool in Decision-Making Statistics is defined as a science that studies data to be able to make a decision. Hence, it is a tool in decision-making process. Mention that Statistics as a science involves the methods of collecting, processing, summarizing and analyzing data in order to provide answers or solutions to an inquiry. One also needs to interpret and communicate the results of the methods identified above to support a decision that one makes when faced with a problem or an inquiry. Trivia: The word “statistics” actually comes from the word “state”— because governments have been involved in the statistical activities, especially the conduct of censuses either for military or taxation purposes. The need for and conduct of censuses are recorded in the pages of holy texts. In the Christian Bible, particularly the Book of Numbers, God is reported to have instructed Moses to carry out a census. Another census mentioned in the Bible is the census ordered by Caesar Augustus throughout the entire Roman Empire before the birth of Christ. Inform students that uncovering patterns in data involves not just science but it is also an art, and this is why some people may think “Stat is eeeks!” and may view any statistical procedures and results with much skepticism Make known to students that Statistics enable us to characterize persons, objects, situations, and phenomena; explain relationships among variables; formulate objective assessments and comparisons; and, more importantly make evidence-based decisions and predictions. 3" ! And to use Statistics in decision-making there is a statistical process to follow which is to be discussed in the next section. 2. Statistical Process in Solving a Problem You may go back to one of the questions identified in the second group and use it to discuss the components of a statistical process. For illustration on how to do it, let us discuss how we could answer the question “Do dogs eat more than cats?” As discussed earlier, this question requires you to gather data to generate statistics which will serve as basis in answering the query. There should be plan or a design on how to collect the data so that the information we get from it is enough or sufficient for us to minimize any bias in responding to the query. In relation to the query, we said earlier that we cannot gather the data from all dogs and cats. Hence, the plan is to get representative group of dogs and another representative group of cats. These representative groups were observed for some characteristics like the animal weight, amount of food in grams eaten per day and breed of the animal. Included in the plan are factors like how many dogs and cats are included in the group, how to select those included in the representative groups and when to observe these animals for their characteristics. After the data were gathered, we must verify the quality of the data to make a good decision. Data quality check could be done as we process the data to summarize the information extracted from the data. Then using this information, one can then make a decision or provide answers to the problem or question at hand. To summarize, a statistical process in making a decision or providing solutions to a problem include the following: Planning or designing the collection of data to answer statistical questions in a way that maximizes information content and minimizes bias; Collecting the data as required in the plan; Verifying the quality of the data after they were collected; Summarizing the information extracted from the data; and Examining the summary statistics so that insight and meaningful information can be produced to support decision-making or solutions to the question or problem at hand. Hence, several activities make up a statistical process which for some the process is simple but for others it might be a little bit complicated to implement. Also, not all questions or problems could be answered by a simple statistical 4" ! process. There are indeed problems that need complex statistical process. However, one can be assured that logical decisions or solutions could be formulated using a statistical process. KEY POINTS Difference between questions that could be and those that could not answered using Statistics. Statistics is a science that studies data. There are many uses of Statistics but its main use is in decision-making. Logical decisions or solutions to a problem could be attained through a statistical process. ASSESSMENT Note: Answers are provided inside the parentheses and italicized. 1. Identify which of the following questions are answerable using a statistical process. a. What is a typical size of a Filipino family? (answerable through a statistical process) b. How many hours in a day? (not answerable through a statistical process) c. How old is the oldest man residing in the Philippines? (answerable through a statistical process) d. Is planet Mars bigger than planet Earth? (not answerable through a statistical process) e. What is the average wage rate in the country? (answerable through a statistical process) f. Would Filipinos prefer eating bananas rather than apple? (answerable through a statistical process) g. How long did you sleep last night? (not answerable through a statistical process) h. How much a newly-hired public school teacher in NCR earns in a month? (not answerable through a statistical process) i. How tall is a typical Filipino? (answerable through a statistical process) j. Did you eat your breakfast today? (not answerable through a statistical process) 2. For each of the identified questions in Number 1 that are answerable using a statistical process, describe the activities involved in the process. 5" ! For a. What is a typical size of a Filipino family? (The process includes getting a representative group of Filipino families and ask the family head as to how many members do they have in their family. From the gathered data which had undergone a quality check a typical value of the number of family members could be obtained. Such typical value represents a possible answer to the question.) For c. How old is the oldest man residing in the Philippines? (The process includes getting the ages of all residents of the country. From the gathered data which had undergone a quality check the highest value of age could be obtained. Such value is the answer to the question.) For e. What is the average wage rate in the country? (The process includes getting all prevailing wage rates in the country. From the gathered data which had undergone a quality check a typical value of the wage rate could be obtained. Such value is the answer to the question.) For f. Would Filipinos prefer eating bananas rather than apple? (The process includes getting a representative group of Filipinos and ask each one of them on what fruit he/she prefers, banana or apple? From the gathered data which had undergone a quality check the proportion of those who prefers banana and proportion of those who prefer apple will be computed and compared. The results of this comparison could provide a possible answer to the question.) For i. How tall is a typical Filipino? (The process includes getting a representative group of Filipinos and measure the height of each member of the representative group. From the gathered data which had undergone a quality check a typical value of the height of a Filipino could be obtained. Such typical value represents a possible answer to the question.) Note: Tell the students that getting a representative group and obtaining a typical value are to be learned in subsequent lessons in this subject. 6" CHAPTER 1: EXPLORING DATA Lesson 2: Data Collection Activity TIME FRAME: 60 minutes OVERVIEW OF LESSON As we have learned in the previous lesson, Statistics is a science that studies data. Hence to teach Statistics, real data set is recommend to use. In this lesson,we present an activity where the students will be asked to provide some data that will be submitted for consolidation by the teacher for future lessons. Data on heights and weights, for instance, will be used for calculating Body Mass Index in the integrative lesson. Students will also be given the perspective that the data they provided is part of a bigger group of data as the same data will be asked from much larger groups (the entire class, all Grade 11 students in school, all Grade 11 students in the district). The contextualization of data will also be discussed. LEARNING COMPETENCIES: At the end of the lesson, the learner should be able to: Recognize the importance of providing correct information in a data collection activity; Understand the issue of confidentiality of information in a data collection activity; Participate in a data collection activity; and Contextualize data LESSON OUTLINE: 1. Preliminaries in a Data Collection Activity 2. Performing a Data Collection Activity 3. Contextualization of Data REFERENCES Albert, J. R. G. (2008). Basic Statistics for the Tertiary Level (ed. Roberto Padua, Welfredo Patungan, Nelia Marquez), published by Rex Bookstore. Handbook of Statistics 1 (1st and 2nd Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031 Workbooks in Statistics 1 (From 1st to 13th Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031 https://www.khanacademy.org/math/probability/statistical-studies/statistical-questions/v/statistical- questions https://www.illustrativemathematics.org/content-standards/tasks/703 7" DEVELOPMENT OF THE LESSON A. Preliminaries in a Data Collection Activity Before the lesson, prepare a sheet of paper listing everyone’s name in class with a “Class Student Number” (see Attachment A for the suggested format). The class student number is a random number chosen in the following fashion: (a) Make a box with “tickets” (small pieces of papers of equal sizes) listing the numbers 1 up to the number of students in the class. (b) Shake the box, get a ticket, and assign the number in the ticket to the first person in the list. (c) Shake the box again, get another ticket, and assign the number of this ticket to the next person in the list. (d) Do (c) until you run out of tickets in the box. At this point all the students have their corresponding class student number written across their names in the prepared class list. Note that the preparation of the class list is done before the class starts. At the start of the class, inform each student confidentially of his/her class student number. Perhaps, when the attendance is called, each student can be provided a separate piece of paper that lists her/his name and class student number. Tell students to remember their class student number, and to always use this throughout the semester whenever data are requested of them. Explain to students that in data collection activity, specific identities like their names are not required, especially because people have a right to confidentiality, but there should be a way to develop and maintain a database to check quality of data provided, and verify from respondent in a data collection activity the data that they provided (if necessary). These preliminary steps for generating a class student number and informing students confidentially of their class student number are essential for the data collection activities to be performed in this lesson and other lessons so that students can be uniquely identified, without having to obtain their names. Inform also the students that the class student numbers they were given are meant to identify them without having to know their specific identities in the class recording sheet (which will contain the consolidated records that everyone had provided). This helps protect confidentiality of information. In statistical activities, facts are collected from respondents for purposes of getting aggregate information, but confidentiality should be protected. Mention that the agencies mandated to collect data is bound by law to protect the confidentiality of information provided by respondents. Even market research organizations in the private sector and individual researchers also guard confidentiality as they merely want to obtain aggregate data. This way, respondents can be truthful in giving 8" information, and the researcher can give a commitment to respondents that the data they provide will never be released to anyone in a form that will identify them without their consent. B. Performing a Data Collection Activity Explain to the students that the purpose of this data collection activity is to gather data that they could use for their future lessons in Statistics. It is important that they do provide the needed information to the best of their knowledge. Also, before they respond to the questionnaire provided in the Attachment B as Student Information Sheet (SIS), it is recommended that each item in the SIS should be clarified. The following are suggested clarifications to make for each item: 1. CLASS STUDENT NUMBER: This is the number that you provided confidentially to the student at the start of the class. 2. SEX: This is the student’s biological sex and not their preferred gender. Hence, they have to choose only one of the two choices by placing a check mark (√) at space provided before the choices. 3. NUMBER OF SIBLINGS: This is the number of brothers and sisters that the student has in their nuclear or immediate family. This number excludes him or her in the count. Thus, if the student is the only child in the family then he/she will report zero as his/her number of siblings. 4. WEIGHT (in kilograms): This refers to the student’s weight based on the student’s knowledge. Note that the weight has to be reported in kilograms. In case the student knows his/her weight in pounds, the value should be converted to kilograms by dividing the weight in pounds by a conversion factor of 2.2 pounds per kilogram. 5. HEIGHT (in centimeters): This refers to the student’s height based on the student’s knowledge. Note that the height has to be reported in centimeters. In case the student knows his/her height in inches, the value should be converted to centimeters by multiplying the height in inches by a conversion factor of 2.54 centimeters per inch. 6. AGE OF MOTHER (as of her last birthday in years): This refers to the age of the student’s mother in years as of her last birthday, thus this number should be reported in whole number. In case, the student’s mother is dead or nowhere to be found, ask the student to provide the age as if the mother is alive or around.You could help the student in determining his/her mother’s age based 9" on other information that the student could provide like birth year of the mother or student’s age. Note also that a zero value is not an acceptable value. 7. USUAL DAILY ALLOWANCE IN SCHOOL (in pesos): This refers to the usual amount in pesos that the student is provided for when he/she goes to school in a weekday. Note that the student can give zero as response for this item, in case he/she has no monetary allowance per day. 8. USUAL DAILY FOOD EXPENDITURE IN SCHOOL (in pesos): This refers to the usual amount in pesos that the student spends for food including drinks in school per day. Note that the student can give zero as response for this item, in case he/she does not spend for food in school. 9. USUAL NUMBER OF TEXT MESSAGES SENT IN A DAY: This refers to the usual number of text messages that a student send in a day. Note that the student can give zero as response for this item, in case he/she does not have the gadget to use to send a text message or simply he/she does not send text messages. 10. MOST PREFERRED COLOR: The student is to choose a color that could be considered his most preferred among the given choices. Note that the student could only choose one. Hence, they have to place a check mark (√) at space provided before the color he/she considers as his/her most preferred color among those given. 11. USUAL SLEEPING TIME: This refers to the usual sleeping time at night during a typical weekday or school day. Note that the time is to be reported using the military way of reporting the time or the 24-hour clock (0:00 to 23:59 are the possible values to use) 12. HAPPINESS INDEX FOR THE DAY : The student has to response on how he/she feels at that time using codes from 1 to 10. Code 1 refers to the feeling that the student is very unhappy while Code 10 refers to a feeling that the student is very happy on the day when the data are being collected. After the clarification, the students are provided at most 10 minutes to respond to the questionnaire. Ask the students to submit the completed SIS so that you could consolidate the data gathered using a formatted worksheet file provided to you as Attachment C. Having the data in electronic file makes it easier for you to use it in the future lessons. Be sure that the students provided the information in all items in the SIS. 10" Inform the students that you are to compile all their responses and compiling all these records from everyone in the class is an example of a census since data has been gathered from every student in class. Mention that the government, through the Philippine Statistics Authority (PSA), conducts censuses to obtain information about socio-demographic characteristics of the residents of the country. Census data are used by the government to make plans, such as how many schools and hospitals to build. Censuses of population and housing are conducted every 10 years on years ending in zero (e.g., 1990, 2000, 2010) to obtain population counts, and demographic information about all Filipinos. Mid-decade population censuses have also been conducted since 1995. Censuses of Agriculture, and of Philippine Business and Industry, are also conducted by the PSA to obtain information on production and other relevant economic information. PSA is the government agency mandated to conduct censuses and surveys. Through Republic Act 10625 (also referred to as The Philippine Statistical Act of 2013), PSA was created from four former government statistical agencies, namely: National Statistics Office (NSO), National Statistical Coordination Board (NSCB), Bureau of Labor and Employment of Statistics (BLES) and Bureau of Agricultural Statistics (BAS). The other agency created through RA 10625 is the Philippine Statistical Research and Training Institute (PSRTI) which is mandated as the research and training arm of the Philippine Statistical System. PSRTI was created from its forerunner the former Statistical Research and Training Center (SRTC). C. Contextualization of Data Ask students what comes to their minds when they hear the term “data” (which may be viewed as a collection of facts from experiments, observations, sample surveys and censuses, and administrative reporting systems). Present to the student the following collection of numbers, figures, symbols, and words, and ask them if they could consider the collection as data. 3, red, F, 156, 4, 65, 50, 25, 1, M, 9, 40, 68, blue, 78, 168, 69, 3, F, 6, 9, 45, 50, 20, 200, white, 2, pink, 160, 5, 60, 100, 15, 9, 8, 41, 65, black, 68, 165, 59, 7, 6, 35, 45, Although the collection is composed of numbers and symbols that could be classified as numeric or non-numeric, the collection has no meaning or it is not contextualized, hence it cannot be referred to as data. 11" Tell the students that data are facts and figures that are presented, collected and analyzed. Data are either numeric or non-numeric and must be contextualized. To contextualize data, we must identify its six W’s or to put meaning on the data, we must know the following W’s of the data: 1. Who? Who provided the data? 2. What? What are the information from the respondents and What is the unit of measurement used for each of the information (if there are any)? 3. When? When was the data collected? 4. Where? Where was the data collected? 5. Why? Why was the data collected? 6. HoW? HoW was the data collected? Let us take as an illustration the data that you have just collected from the students, and let us put meaning or contextualize it by responding to the questions with the Ws. It is recommended that the students answer theW-questions so that they will learn how to do it. 1. Who? Who provided the data? The students in this class provided the data. 2. What? What are the information from the respondents and What is the unit of measurement used for each of the information (if there are any)? The information gathered include Class Student Number, Sex, Number of Siblings, Weight, Height, Age of Mother, Usual Daily Allowance in School, Usual Daily Food Expenditure in School, Usual Number of Text Messages Sent in a Day, Most Preferred Color, Usual Sleeping Time and Happiness Index for the Day. The units of measurement for the information on Number of Siblings, Weight, Height, Age of Mother, Usual Daily Allowance in School, Usual Daily Food Expenditure in School, and Usual Number of Text Messages Sent in a Day are person, kilogram, centimeter, year, pesos, pesos and message, respectively. 3. When? When was the data collected? 12" The data was collected on the first few days of classes for Statistics and Probability. 4. Where? Where was the data collected? The data was collected inside our classroom. 5. Why? Why was the data collected? As explained earlier, the data will be used in our future lessons in Statistics and Probability 6. HoW? HoW was the data collected? The students provided the data by responding to the Student Information Sheet prepared and distributed by the teacher for the data collection activity. Once the data are contextualized, there is now meaning to the collection of number and symbols which may now look like the following which is just a small part of the data collected in the earlier activity. Usual Number Usual Usual daily Age of number Class of daily food Most Usual Happiness Sex Weight Height mother of text Student siblings allowance expenditure Preferred Sleeping Index for (in kg) (in cm) (in messages Number (in in school in school Color Time the Day years) sent in a person) (in pesos) (in pesos) day 1 M 2 60 156 60 200 150 20 RED 23:00 8 2 F 5 63 160 66 300 200 25 PINK 22:00 9 3 F 3 65 165 59 250 50 15 BLUE 20:00 7 4 M 1 55 160 55 200 100 30 BLACK 19:00 6 5 M 0 65 167 45 350 300 35 BLUE 20:00 8 : : : : : : : : : : : : : : : : : : : : : : : : KEY POINTS Providing correct information in a government data collection activity is a responsibility of every citizen in the country. Data confidentiality is important in a data collection activity. Census is collecting data from all possible respondents. Data to be collected must be clarified before the actual data collection. Data must be contextualized by answering six W-questions. 13" ATTACHMENT A: CLASS LIST CLASS STUDENT CLASS STUDENT STUDENT NAME STUDENT NAME NUMBER NUMBER 1. 36. 2, 37. 3. 38. 4. 39. 5. 40. 6. 41. 7. 42. 8. 43. 9. 44. 10. 45. 11. 46. 12. 47. 13. 48. 14. 49. 15. 50. 16. 51. 17. 52. 18. 53. 19. 54. 20. 55. 21. 56. 22. 57. 23. 58. 24. 59. 25. 60. 26. 61. 27. 62. 28. 63. 29. 64. 30. 65. 31. 66. 32, 67. 33. 68. 34. 69. 35. 70. 14" ATTACHMENT B: STUDENT INFORMATION SHEET Instruction to the Students: Please provide completely the following information. Your teacher is available to respond to your queries regarding the items in this information sheet, if you have any. Rest assured that the information that you will be providing will only be used in our lessons in Statistics and Probability. 1. CLASS STUDENT NUMBER: ______________ 2. SEX (Put a check mark, √): ____Male __ Female 3. NUMBER OF SIBLINGS: _____ 4. WEIGHT (in kilograms): ______________ 5. HEIGHT (in centimeters): ______ 6. AGE OF MOTHER (as of her last birthday in years): ________ (If mother deceased, provide age if she was alive) 7. USUAL DAILY ALLOWANCE IN SCHOOL (in pesos): _________________ 8. USUAL DAILY FOOD EXPENDITURE IN SCHOOL (in pesos): ___________ 9. USUAL NUMBER OF TEXT MESSAGES SENT IN A DAY: ______________ 10. MOST PREFERRED COLOR (Put a check mark, √. Choose only one): ____WHITE ____RED ____ PINK ____ ORANGE ____YELLOW ____GREEN ____BLUE ____PEACH ____BROWN ____GRAY ____BLACK ____PURPLE 11. USUAL SLEEPING TIME (on weekdays): ______________ 12. HAPPPINESS INDEX FOR THE DAY: On a scale from 1 (very unhappy) to 10 (very happy), how do you feel today? ______ 15" ATTACHMENT C: CLASS RECORDING SHEET (for the Teacher’s Use) Usual Number Usual Usual Daily Age of number Class of Daily food Most Usual Happiness Sex Weight Height mother of text Student siblings allowance expenditure Preferred Sleeping Index for (in kg) (in cm) (in messages Number (in in school in school Color Time the Day years) sent in a person) (in pesos) (in pesos) day 16" CHAPTER 1: EXPLORING DATA Lesson 3: Basic Terms in Statistics TIME FRAME: 60 minutes OVERVIEW OF LESSON As continuation of Lesson 2 (where we contextualize data) in this lesson we define basic terms in statistics as we continue to explore data. These basic terms include the universe, variable, population and sample. In detail we will discuss other concepts in relation to a variable. LEARNING OUTCOME(S): At the end of the lesson, the learner is able to Define universe and differentiate it with population; and Define and differentiate between qualitative and quantitative variables, and between discrete and continuous variables (that are quantitative); LESSON OUTLINE: 1. Recall previous lesson on ‘Contextualizing Data’ 2. Definition of Basic Terms in Statistics (universe, variable, population and sample) 3. Broad of Classification of Variables(qualitative and quantitative, discrete and continuous) REFERENCES Albert, J. R. G. (2008). Basic Statistics for the Tertiary Level (ed. Roberto Padua, WelfredoPatungan, Nelia Marquez), published by Rex Bookstore. Handbook of Statistics 1 (1st and 2nd Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031 Takahashi, S. (2009). The Manga Guide to Statistics. Trend-Pro Co. Ltd. Workbooks in Statistics 1 (From 1st to 13th Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031 17# DEVELOPMENT OF THE LESSON A. Recall previous lesson on ‘Contextualizing Data’ Begin by recalling with the students the data they provided in the previous lesson and how they contextualized such data. You could show them the compiled data set in a table like this: Usual Number Usual Usual Daily Age of number Class of Daily food Most Usual Happiness Sex Weight Height mother of text Student siblings allowance expenditure Preferred Sleeping Index for (in kg) (in cm) (in messages Number (in in school in school Color Time the Day years) sent in a person) (in pesos) (in pesos) day 1 M 2 60 156 60 200 150 20 RED 23:00 8 2 F 5 63 160 66 300 200 25 PINK 22:00 9 3 F 3 65 165 59 250 50 15 BLUE 20:00 7 4 M 1 55 160 55 200 100 30 BLACK 19:00 6 5 M 0 65 167 45 350 300 35 BLUE 20:00 8 : : : : : : : : : : : : : : : : : : : : : : : : Recall also their response on the first Ws of the data, that is, on the question “Who provided the data?” We said last time the students of the class provided the data or the data were taken from the students. Another Ws of the data is What? What are the information from the respondents? and What is the unit of measurement used for each of the information (if there are any)? Our responses are the following: The information gathered include Class Student Number, Sex, Number of Siblings, Weight, Height, Age of Mother, Usual Daily Allowance in School, Usual Daily Food Expenditure in School, Usual Number of Text Messages Sent in a Day, Most Preferred Color, Usual Sleeping Time and Happiness Index. The units of measurement for the information on Number of Siblings, Weight, Height, Age of Mother, Usual Daily Allowance in School, Usual Daily Food Expenditure in School, and Usual Number of Text Messages Sent in a Day are person, kilogram, centimeter, year, pesos, pesos and message, respectively. B. Main Lesson 1. Definition of Basic Terms The collection of respondents from whom one obtain the data is called the universe of the study. In our illustration, the set of students of this Statistics and Probability class is our universe. But we must precaution the students that a universe is not necessarily composed of people. Since there are studies where the observations were taken from plants or animals or even from non-living things like buildings, vehicles, farms, etc. So formally, we define universe as the collection or set of 18# units or entities from whom we got the data. Thus, this set of units answers the first Ws of data contextualization. On the other hand, the information we asked from the students are referred to as the variables of the study and in the data collection activity, we have 12 variables including Class Student Number. A variable is a characteristic that is observable or measurable in every unit of the universe. From each student of the class, we got the his/her age, number of siblings, weight, height, age of mother, usual daily allowance in school, usual daily food expenditure in school, usual number of text messages sent in a day, most preferred color, usual sleeping time and happiness index for the day. Since these characteristics are observable in each and every student of the class, then these are referred to as variables. The set of all possible values of a variable is referred to as a population. Thus for each variable we observed, we have a population of values. The number of population in a study will be equal to the number of variables observed. In the data collection activity we had, there are 12 populations corresponding to 12 variables. A subgroup of a universe or of a population is a sample. There are several ways to take a sample from a universe or a population and the way we draw the sample dictates the kind of analysis we do with our data. We can further visualize these terms in the following figure: VARIABLE 1 VARIABLE 2 VARIABLE 12 Unit 1 Value 1 Value 1 Value 1 Unit 2 Value 2 Value 2 Value 2 Unit 3 Value 3 Value 3 Value 3 : : : …..# : : : : : Unit N Value N Value N Value N UNIVERSE POPULATION POPULATION POPULATION OF OF VARIABLE 1 OF VARIABLE 2 VARIABLE 12 Unit 1 Value 1 : : : OR# : Unit n Value n SAMPLE A SAMPLE OF A SAMPLE OF UNITS POPULATION VALUES Figure 3.1 Visualization of the relationship among universe, variable, population and sample. 19# 2. Broad Classification of Variables Following up with the concept of variable, inform the students that usually, a variable takes on several values. But occasionally, a variable can only assume one value, then it is called a constant. For instance, in a class of fifteen-year olds, the age in years of students is constant. Variables can be broadly classified as either quantitative or qualitative, with the latter further classified into discrete and continuous types (see Figure 3.3 below). Figure 3.3 Broad Classification of Variables (i) Qualitative variables express a categorical attribute, such as sex (male or female), religion, marital status, region of residence, highest educational attainment. Qualitative variables do not strictly take on numeric values (although we can have numeric codes for them, e.g., for sex variable, 1 and 2 may refer to male, and female, respectively). Qualitative data answer questions “what kind.” Sometimes, there is a sense of ordering in qualitative data, e.g., income data grouped into high, middle and low-income status. Data on sex or religion do not have the sense of ordering, as there is no such thing as a weaker or stronger sex, and a better or worse religion. Qualitative variables are sometimes referred to as categorical variables. (ii) Quantitative (otherwise called numerical) data, whose sizes are meaningful, answer questions such as “how much” or “how many”. Quantitative variables have actual units of measure. Examples of quantitative variables include the height, weight, number of registered cars, household size, and total household expenditures/income of survey respondents. Quantitative data may be further classified into: 20# a. Discrete data are those data that can be counted, e.g., the number of days for cellphones to fail, the ages of survey respondents measured to the nearest year, and the number of patients in a hospital. These data assume only (a finite or infinitely) countable number of values. b. Continuous data are those that can be measured, e.g. the exact height of a survey respondent and the exact volume of some liquid substance. The possible values are uncountably infinite. With this classification, let us then test the understanding of our students by asking them to classify the variables, we had in our last data gathering activity. They should be able to classify these variables as to qualitative or quantitative and further more as to discrete or continuous. If they did it right, you have the following: TYPE OF TYPE OF VARIABLE QUANTITATIVE VARIABLE VARIABLE Class Student Number Qualitative Sex Qualitative Number of Siblings Quantitative Discrete Weight (in kilograms) Quantitative Continuous Height (in centimeters) Quantitative Continuous Age of Mother Quantitative Discrete Usual Daily Allowance in School (in Quantitative Discrete pesos) Usual Daily Food Expenditure in Quantitative Discrete School (in pesos) Usual Number of Text Messages Quantitative Discrete Sent in a Day Usual Sleeping Time Qualitative Most Preferred Color Qualitative Happiness Index for the Day Qualitative Special Note: For quantitative data, arithmetical operations have some physical interpretation. One can add 301 and 302 if these have quantitative meanings, but if, these numbers refer to room numbers, then adding these numbers does not make any sense. Even though a variable may take numerical values, it does not make the corresponding variable quantitative! The issue is whether performing arithmetical operations on these data would make any sense. It would certainly not make sense to sum two zip codes or multiply two room numbers. 21# KEY POINTS A universe is a collection of units from which the data were gathered. A variable is a characteristic we observed or measured from every element of the universe. A population is a set of all possible values of a variable. A sample is a subgroup of a universe or a population. In a study there is only one universe but could have several populations. Variables could be classified as qualitative or quantitative, and the latter could be further classified as discrete or continuous. ASSESSM ENT Note: Answers are provided inside the parentheses and italicized. 1. A market researcher company requested all teachers of a particular school to fill up a questionnaire in relation to their product market study. The following are some of the information supplied by the teachers: highest educational attainment predominant hair color body temperature civil status brand of laundry soap being used total household expenditures last month in pesos number of children in the household number of hours standing in queue while waiting to be served by a bank teller amount spent on rice last week by the household distance travelled by the teacher in going to school time (in hours) consumed on Facebook on a particular day a. If we are to consider the collection of information gathered through the completed questionnaire, what is the universe for this data set? (The universe is the set of all teachers in that school) b. Which of the variables are qualitative? Which are quantitative? Among the quantitative variables, classify them further as discrete or continuous. highest educational attainment (qualitative) predominant hair color (qu

Use Quizgecko on...
Browser
Browser