Introduction to Statistics PDF
Summary
This document is a presentation on introduction to statistics. It covers basic definitions, branches of statistics, and types of variables. Examples of different sampling methods and some relevant statistics are also included.
Full Transcript
Introduction to Statistics

Lesson Outline:
1 Basic Definitions and Terms
2 Branches of Statistics
3 Types of Variables
4 Levels of Measurement

1 Basic Definitions and Terms

STATISTICS
It is a branch of science that deals with the collection, presentation, organization, analysis, and interpretation of data.
The term “statistics” originated from the Latin word “status,” which means state. The term itself became popular only in the 19th century, and its original definition was “the science dealing with data about the condition of a state or community.”

WHY SHOULD YOU STUDY STATISTICS?
1. Like professionals, you must be able to read and understand the various statistical studies performed in your field.
2. You may be called on to conduct research in your field, since statistical procedures are basic to research.
3. You can also use the knowledge gained from studying statistics to become a better consumer and citizen.

POPULATION VS. SAMPLE
Population – consists of all subjects (human or otherwise) that are being studied.
Sample – a group of subjects selected from the population.
Why do researchers rarely (or never) use the entire population in a statistical study?

Example: We wish to determine the average expenditure of all households in Metro Manila.
Population: All households in Metro Manila
Sample: All households in Valenzuela or in Manila

PARAMETER VS. STATISTIC
Parameter – a numerical measure that describes a characteristic of a population.
Statistic – a numerical measure that describes a characteristic of a sample.

2 Branches of Statistics

DESCRIPTIVE STATISTICS
It involves the organization, summarization, and display of data. It includes summary calculations, graphs, charts, and tables.

INFERENTIAL STATISTICS
It deals with predictions and inferences based on the analysis and interpretation of the information gathered by the statistician.
It includes performing estimation and hypothesis tests, determining relationships, and making predictions.

DESCRIPTIVE STATISTICS VS. INFERENTIAL STATISTICS
Example: Suppose you are to conduct a social experiment in your school. In the experiment, you will drop a Php500 bill as if you had not noticed. You want to know whether the students would keep the money or return it to the owner. The social experiment is done to measure the morality of students at your school by determining the percent of students who would return the money.

How would you conduct the social experiment?
You could present the scenario to all the students in your school. However, that can be time-consuming and expensive.
You can instead present the scenario to a sample of students and use the results to make a statement about all the students at the school.

Suppose we conduct it with 100 students. We find that 83 out of 100 students said that they would return the money to the owner.
Descriptive Statistics: 83% of the students said that they would return the money.
Inferential Statistics: With a 95% level of confidence, we can say that between about 76% and 90% of the students would return the money.

PROCESS OF STATISTICS
1. Identify the research objective.
A researcher must determine the question(s) he or she wants answered. The question(s) must be detailed enough to identify the group that is to be studied and the questions that are to be answered.

PROCESS OF STATISTICS
2. Collect the information needed to answer the questions.
Everybody collects and uses information in day-to-day life, much of it in numerical or statistical form. Gaining access to an entire population is often difficult and expensive. In conducting research, we typically look at a subset of the population called a sample.

Example
A research objective is presented. For each research objective, identify the population and sample in the study.

The Philippine Mental Health Association contacted 2,024 teenagers who are 15 to 19 years of age and live in Valenzuela City and asked whether or not they had been prescribed medications for any mental disorders, such as depression or anxiety.
Population: Teenagers 15 to 19 years old who live in Valenzuela City.
Sample: The 2,024 teenagers 15 to 19 years old who live in Valenzuela City and were contacted.

A farmer wanted to learn about the weight of his soybean crop. He randomly sampled 100 plants and weighed the soybeans on each plant.
Population: Entire soybean crop
Sample: The 100 selected soybean plants

PROCESS OF STATISTICS
3. Organize and summarize the information.
This step in the process is referred to as descriptive statistics. Descriptive statistics describe the information collected through numerical measurements, charts, graphs, and tables. The main purpose of descriptive statistics is to provide an overview of the information collected.

PROCESS OF STATISTICS
4. Draw conclusions from the information.
In this step the information collected from the sample is generalized to the population. This process is referred to as inferential statistics. Inferential statistics uses methods that take results obtained from a sample, extend them to the population, and measure the reliability of the result.
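
Steps 3 and 4 can be illustrated with the social experiment from earlier in the lesson, in which 83 of 100 sampled students said they would return the money. The sketch below is only an illustration, not part of the original slides. It assumes a simple random sample and uses the normal-approximation (Wald) interval, a method the lesson does not name. The function name `proportion_summary` is hypothetical.

```python
import math

def proportion_summary(successes, n, z=1.96):
    """Return the sample proportion and an approximate confidence interval.

    z = 1.96 corresponds to a 95% confidence level under the normal
    approximation (an assumption; the lesson does not specify a method).
    """
    p_hat = successes / n                     # descriptive: the sample statistic
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of p_hat
    margin = z * se                           # margin of error
    return p_hat, (p_hat - margin, p_hat + margin)

p_hat, (low, high) = proportion_summary(83, 100)
print(f"Sample proportion (statistic): {p_hat:.0%}")
print(f"Approximate 95% interval for the population proportion: {low:.1%} to {high:.1%}")
```

The 83% is a statistic that describes the sample only. The interval, roughly 75.6% to 90.4% under these assumptions, is an inference about the parameter: the proportion among all students in the school.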
PROCESS OF STATISTICS
Reminder: If the entire population is studied, then inferential statistics is not necessary, because descriptive statistics will provide all the information that we need regarding the population.

Example
For each of the following statements, decide whether it belongs to the field of descriptive statistics or inferential statistics.
1. A basketball player wants to know his average score for the past 10 games.
Answer: Descriptive Statistics
2. Based on last year’s electricity bills, Mr. Bing would like to forecast the average monthly electricity bill he will pay next year.
Answer: Inferential Statistics
3. A car manufacturer wishes to estimate the average lifetime of batteries by testing a sample of 50 batteries.
Answer: Inferential Statistics
4. Because of the current economy, 49% of 18- to 34-year-olds have taken a job to pay the bills.
Answer: Descriptive Statistics
5. In 2025, the world population is predicted to be 8 billion people (Source: United Nations).
Answer: Inferential Statistics

3 Types of Variables

VARIABLES
A variable is a characteristic that is observable or measurable in every unit of the population.
Example: A teacher was concerned about the test performance of her students in the subject Statistics.
Population: all of the students in her Statistics class.
Variable: Test scores in Statistics

TYPES OF VARIABLES
Variables can be classified into two groups:
Qualitative variables – variables that have distinct categories according to some characteristic or attribute.
Example: gender, religion, preferred course, region of residence
Quantitative variables – variables that can be counted or measured.
Example: height, weight, body temperature, monthly allowance

QUANTITATIVE VARIABLES
Quantitative variables may be further classified into:
Discrete variable – a quantitative variable that has either a finite number of possible values or a countable number of possible values. The term countable means that the values result from counting, such as 0, 1, 2, 3, and so on.
Example: number of siblings, number of people vaccinated with Pfizer, number of students in a class
Continuous variable – a quantitative variable that can assume an infinite number of values between any two specific values. Its values are obtained by measuring and often include fractions or decimals.
Example: temperature, height, weight, speed, age

Example
Determine whether the following variables are qualitative or quantitative.
1. Number of children: Quantitative
2. Student number: Qualitative
3. Attendance record (absent/present): Qualitative
4. Zip code: Qualitative
5. Average monthly salary of employees: Quantitative

Example
Classify each variable as a discrete or continuous variable.
1. The height of the highest building in Makati: Continuous
2. The number of cars that arrive at a McDonald’s drive-through between 12:00 P.M. and 1:00 P.M.: Discrete
3. Time of a runner to finish a lap: Continuous
4. The number of students in a class: Discrete
5. The amount of money a person spends in online shopping: Continuous

4 Levels of Measurement

LEVELS OF MEASUREMENT
1. Nominal
2. Ordinal
3. Interval
4. Ratio

NOMINAL LEVEL
This is the first level of measurement.
Nominal scales identify, name, classify, or categorize objects or events.
The data cannot be arranged in an ordering scheme.
Example: jersey number of an athlete, mobile number, eye color, type of school (public or private)

ORDINAL LEVEL
Like nominal scales, ordinal scales identify, name, classify, or categorize objects or events, but they have the additional property of a logical or natural order to the categories or values.
Example: degree of agreement, level of satisfaction, socioeconomic class, letter grades

INTERVAL LEVEL
Interval scales rank data, and precise differences between units of measurement do exist; however, there is no meaningful zero.
Example: temperature in Fahrenheit/Celsius, IQ

RATIO LEVEL
This is the highest level of measurement. Ratio scales identify, order, represent equal distances between score values, and have an absolute zero point.
Example: height, weight, salary, age

Example
Classify the following data as nominal, ordinal, interval, or ratio.
1. Student number: Nominal
2. Position of government officials: Ordinal
3. Lot areas of houses in a village: Ratio
4. Number of vehicles registered: Ratio
5. Ranking of students in a class: Ordinal

Collection of Data
Prepared by: Ms. Fatima Perez

Lesson Outline:
1 Sources of Data
2 Methods of Collecting Data
3 Sampling Techniques

1 SOURCES OF DATA

Use of Documented Data
The researcher can obtain documented data from previous studies of individuals or of private, government, or non-government agencies. They may find documented data in published or written reports, unpublished documents, periodicals, and others.
Primary Sources
Secondary Sources

PRIMARY SOURCES
Primary Sources are individuals, business firms, or agencies, whether private, government, or non-government, that collected the documented data themselves.
Example:
1. Central Bank (CB) – primary source of data for banking and finance.
2. Philippine Statistics Authority (PSA) – a major collector of data for both private and government needs. It provides data on various subject matters such as household income and expenditure, housing, education, health, employment, and others.

Primary Data
Primary Data are data documented
by the primary source. The data collectors themselves documented this data.
Example (continued):
3. Pulse Asia – a source of data on the opinions or sentiments of the people on current issues.

SECONDARY SOURCES
Secondary Sources are individuals or organizations that got the documented data from other sources. They did not collect the data themselves but simply compiled data from various sources.
Example:
1. The United Nations’ compiled data for its yearbook, which were originally gathered by government statistical agencies of different countries.
2. A medical researcher's documented data for his research paper, which were originally collected by DOH.
3. The documented data of the research team of a congressman for its report, which were originally collected by DepEd and CHED.
4. The documented data for a thesis, which were originally collected by DOLE.

Secondary Data
Secondary Data are data documented by a secondary source. An individual or agency, other than the data collectors, documented this data.

Important Reminders when Sourcing Data
Always examine the source.
Researchers must exercise caution when using secondary data. It is advisable to use primary data instead.
A primary source usually presents a lot of vital information that will help the researcher assess the quality of the data and its viability for use in the inquiry.
A primary source provides a detailed discussion of the research design.

2 METHODS OF COLLECTING DATA

SURVEY
It is a method of collecting data on the variable of interest by asking people questions.
When the data come from asking all the people in the population, the study is called a census.
When the data come from asking a sample of people selected from a well-defined population, the study is called a sample survey.
The people who answer the questions are the respondents.

Example
Actual survey in the Philippines: Pulse Asia conducted a sample survey on voter response to political ads in the 2022 election.
Its respondents were selected registered voters who intended to vote in the 2022 election.
https://www.pulseasia.ph/january-2022-nationwide-survey-on-the-may-2022-elections/

Example
Actual survey in the Philippines: The Food and Nutrition Research Institute regularly conducts the National Nutrition Survey, which generates data on malnutrition, the prevalence of anemia, Vitamin A and iodine deficiencies, and the nutrient intake/adequacy of the members of the household.
https://1cms-img.imgix.net/malnutri.jpeg

PERSONAL INTERVIEWS
Interviewers personally ask the respondents and record their answers on the questionnaire.
Suitable for obtaining data for many types of research problems, including those that concern the sentiments, emotions, and opinions of people regarding certain issues.
The interviewer plays a significant role. They can make or break a survey.

Advantages of conducting a personal interview:
It yields accurate and complete responses.
It works even for respondents who do not know how to read or write, because the interviewer asks the questions.
It has a relatively high response rate.
Disadvantages:
An interviewer’s manual needs to be prepared before the actual interview.
This is the most expensive and time-consuming method.

TELEPHONE INTERVIEWS OR VIDEO CALL
Respondents are asked via telephone or video call.
It costs less and does not require too much time.
A telephone interview should be short and should not take more than 10 minutes.
It limits the scope of the survey.
The researcher should not consider a telephone interview if it is not possible to contact the targeted respondents this way.

SELF-ADMINISTERED QUESTIONNAIRES
The respondents personally fill out the questionnaire without assistance from the researcher.
The questionnaire can be distributed to a large group of respondents either by mail or personally.
It is less expensive.
It gives respondents the convenience of answering the questions in their own time and at their preferred place.
Respondents feel free to express their views and reveal their actual practices.
ONLINE SURVEYS
Respondents read the questions and send their answers via the internet or email.
Respondents can manipulate the results by completing the internet form several times. This happens if the form is not password-protected.

FOCUS GROUP DISCUSSION
It is an in-depth discussion among participants in a small group, which a moderator facilitates.
The moderator gets the different sentiments, ideas, and attitudes of all participants to provide a clear and rich description of certain issues, programs, and problems, but for explanatory purposes only.
https://ferpection.com/en/focus-group-qualitative-research/

EXPERIMENT
The researcher intervenes by controlling the conditions that may affect the response variable, using a randomization mechanism in assigning the treatments and controlling the extraneous variables.
An experiment is a more effective method of data collection for establishing cause and effect.
A control variable is any factor you control or hold constant during an experiment. A control variable is also called a controlled variable or constant variable.
An experimental group is the group that receives an experimental procedure or a test sample. This group is exposed to changes in the independent variable being tested.
A control group is a group separated from the rest of the experiment such that the independent variable being tested cannot influence the results.

OBSERVATION
It is a method of collecting data on the phenomenon of interest by recording the observations made about the phenomenon as it actually happens.
The observation method is useful in studying the reactions of individuals or groups of persons/objects in a given situation or environment as it happens.

QUESTIONNAIRE
It is a list of well-planned questions written on paper, which can be either personally administered or mailed by the researcher to the respondents using any of the following forms.
a) Close-ended questions
b) Open-ended questions

CLOSE-ENDED QUESTIONS
Guided-response Type
Give your reason for not looking for a job. Check all that apply.
___ Believe no work available
___ Awaiting results of previous job application
___ Temporary illness/disability
___ Waiting for rehire/job recall
___ Schooling
___ Others, please specify ____

Dichotomous Type
Did you attend any religious service in the past week?
___ Yes ___ No

Multiple-choice Type
Which among these problems do you consider the biggest problem facing the country today?
a. Poverty
b. Graft and Corruption
c. Terrorism
d. Illegal drugs
e. Unemployment
f. Traffic

Ranking Questions
Below are problems that our country is experiencing. Please rank them in the order in which the government should prioritize them. Put (1) for the first priority, (2) for the second, and so on.
___ Poverty
___ Terrorism
___ Unemployment
___ Graft and Corruption
___ Drugs
___ Traffic

Rating Scale
How satisfied are you with your present job?
1 Very Dissatisfied
2 Dissatisfied
3 Neutral
4 Satisfied
5 Very Satisfied

Recall Type
Please supply the information asked for:
Age ____ Sex ____ Date of Birth ____ Place of Birth ____

OPEN-ENDED QUESTIONS
Example:
a.) What are your sentiments on premarital sex? What else? Anything else?
b.) What are your opinions on Filipinos leaving their families to work abroad? What else? Anything else?
c.) How did you spend your summer vacation?

PITFALLS TO AVOID IN WORDING QUESTIONS
Avoid vague questions.
Avoid biased questions.
Avoid confidential and sensitive questions.
Avoid questions that are difficult to answer.
Avoid questions that are confusing or perplexing to answer.
Keep the question short and simple.

Avoid vague questions.
Example: How often did you watch a movie in a movie theatre?
____ Very often ____ Often ____ Not too often ____ Never
The word “often” is vague.
Revised Question: How many times did you watch a movie in a movie theatre last July?
Avoid using the phrase “in the past month”.

Avoid vague questions.
Example: What is your income?
Revised Question: In the year 2018, what was your gross family income? Please include income from all family members who live here, including wages, allowances, honoraria, commissions and bonuses, and payments received from the practice of a profession or any economic activity.
___ less than P50,000
___ P50,000 to less than P100,000
___ P100,000 to less than P150,000
___ P150,000 to less than P200,000
___ P200,000 or more

Avoid biased questions.
Example: There are many types of sports like basketball, badminton, billiards, bowling, and tennis. Which type of sport do you usually enjoy watching?
Problem: The sports mentioned will be at the top of the respondents' minds.

Avoid confidential and sensitive questions.
Example: Have you ever cheated in an exam?
Problem: Even if you assure these students of their anonymity, many students will still hesitate to admit the truth about their actual behavior, because admitting to cheating is the same as admitting that one is without honor and integrity.
Remedy: Consider using the observation method instead of the survey method in collecting data.

Avoid questions that are difficult to answer.
Example:
If you were the president of this nation, what would you do to attain economic recovery?
If you were a scientist, what device would you invent to make life more convenient?
If a new computer shop were to open near the school, would you visit the store?

Avoid questions that are confusing or perplexing to answer.
Example: Did you eat out and watch a movie last weekend?
Remedy: Ask these as two separate questions.
Did you eat out last weekend? ___ Yes ___ No
Did you watch a movie last weekend? ___ Yes ___ No

Avoid questions that are confusing or perplexing to answer.
Example: Indicate whether you agree or disagree with the following statement about Metro Manila Development Authority (MMDA) traffic enforcers: “MMDA traffic enforcers should not be required to apprehend traffic violators.”
Double negatives are not recommended in Standard English communication.
Revised: “MMDA traffic enforcers should be required to apprehend traffic violators.”

3 SAMPLING TECHNIQUES

Sampling Techniques

Probability Sampling
It involves the selection of respondents from the population using random selection, in which each element of the population has an equal or independent chance of being chosen.
Simple random sampling
Stratified sampling
Systematic sampling
Cluster sampling
Multistage sampling

Non-probability Sampling
The probability of each case being selected from the total population is not known. Units of the sample are chosen on the basis of personal judgment or convenience.
Convenience sampling
Quota sampling
Purposive sampling

PROBABILITY SAMPLING
Simple Random Sampling
The names of respondents are written on small pieces of paper, rolled, placed in a jar, and picked at random.
All members of the population have an equal chance of being selected.
https://lc.gcumedia.com/

PROBABILITY SAMPLING
Stratified Sampling
A sample is obtained by dividing the population into subgroups or strata according to some characteristic relevant to the study. (There can be several subgroups.) Subjects are then selected from each subgroup.
It is used when it is important for the sample to have members from each segment of the population.
https://lc.gcumedia.com/

PROBABILITY SAMPLING
Cluster Sampling
A sample is obtained by dividing the population into sections or clusters, then selecting one or more clusters and using all the members of the chosen cluster(s) as the members of the sample.
Some clusters aren't sampled; data is only collected from the chosen clusters.
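
The three selection procedures described above can be sketched in a few lines. This is only an illustration under stated assumptions, not part of the original slides. The population of four class sections with ten students each is hypothetical, the function names are made up for this example, and the same sections are treated as strata in one call and as clusters in another purely to show the contrast.

```python
import random

def simple_random_sample(population, k, rng):
    """Every member has the same chance of selection (like drawing names from a jar)."""
    return rng.sample(population, k)

def stratified_sample(strata, k_per_stratum, rng):
    """Draw a simple random sample from every stratum, so each subgroup is represented."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, k_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep all members of the chosen clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(clusters[name])
    return chosen, sample

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Hypothetical population: 4 class sections of 10 students each.
sections = {s: [f"{s}-{i:02d}" for i in range(1, 11)] for s in "ABCD"}
students = [name for members in sections.values() for name in members]

srs = simple_random_sample(students, 8, rng)
strat = stratified_sample(sections, 2, rng)
chosen, clus = cluster_sample(sections, 2, rng)

print("Simple random:", srs)
print("Stratified (2 per section):", strat)
print("Cluster (sections", chosen, "):", clus)
```

Note the difference in what is randomized: stratified sampling takes some members from every section, while cluster sampling takes every member from some sections.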
https://lc.gcumedia.com/

PROBABILITY SAMPLING
Systematic Sampling
Each member of the population is assigned a number. The members of the population are ordered in some way, a starting number is randomly selected, and then sample members are selected at regular intervals from the starting number. (Ex. Every 3rd, 5th, or 100th member is selected.)
https://lc.gcumedia.com/

PROBABILITY SAMPLING
Multi-stage Sampling
A sample is obtained by using more than one kind of sampling.
https://i.ytimg.com/vi/WO69QS3c2c4/maxresdefault.jpg

NON-PROBABILITY SAMPLING
Convenience Sampling
This technique is resorted to by a researcher who needs the information in the fastest way possible. The researcher uses subjects who are convenient.
It involves the non-random selection of respondents based on their availability or convenient accessibility to the researcher.
Example: A computer software store conducts a marketing study by interviewing potential customers who happen to be in the store browsing through the available software.
https://lc.gcumedia.com/

NON-PROBABILITY SAMPLING
Purposive Sampling
Also known as judgmental sampling. The sample is obtained based on the subject’s knowledge of the information required by the researcher. It enables researchers to select the cases that will best enable them to answer their research question(s) and meet their objectives.
https://cdn1.vectorstock.com/i/1000x1000/50/90/purposive-sampling-sample-taken-from-a-group-vector-28835090.jpg

NON-PROBABILITY SAMPLING
Purposive Sampling
Example: Suppose a researcher wants to make a historical study about Town A. The target population is the senior citizens who have lived in Town A since birth, because they are the most reliable persons to know the history of the town.
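
Among the probability techniques, the systematic sampling procedure described earlier (random start, then every k-th member) is mechanical enough to show directly. The sketch below is an illustration only, not part of the original slides. The numbered population of 20 members and the function name `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(population, k, rng):
    """Pick a random start among the first k members, then take every k-th member."""
    start = rng.randrange(k)      # random starting index in 0..k-1
    return population[start::k]   # regular interval of k from the start

rng = random.Random(1)  # fixed seed only so the sketch is repeatable

# Hypothetical ordered population of 20 numbered members.
members = list(range(1, 21))
sample = systematic_sample(members, 5, rng)
print(sample)
```

With 20 members and an interval of 5, the sample always has 4 members spaced 5 apart; only the starting point is random.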
https://cdn1.vectorstock.com/i/1000x1000/50/90/purposive-sampling-sample-taken-from-a-group-vector-28835090.jpg

NON-PROBABILITY SAMPLING
Quota Sampling
Selection of members in this sampling technique happens on the basis of a pre-set standard. Because the sample is formed on the basis of specific attributes, the created sample will have the same attributes that are found in the total population. It is an extremely quick method of collecting samples.
It is therefore a type of stratified sample in which the selection of cases within strata is entirely non-random.
Example: You are to investigate the relationship between students’ performance in Math and their attitude towards the subject. However, you are given only limited time to do the study. You may consider only 25 out of the 500 students in your school.
https://thumbs.dreamstime.com/b/people-crowd-pie-chart-group-graphic-sampling-statistics-vector-97203219.jpg

Census
It is a method of collecting data from the entire population.
https://lc.gcumedia.com/

CREDIT TO MS. FATIMA PEREZ OF THE PLV MATHEMATICS DEPARTMENT FOR THIS PRESENTATION