Introduction to Statistical Concepts PDF
Summary
This document introduces statistical concepts, objectives, and the importance of statistics in decision-making. It covers topics such as the definition of data, types of variables, levels of measurement, and different statistical techniques. The document also contains examples to illustrate these concepts.
Full Transcript
Introduction to the Statistical Concepts

Objectives:
Define statistics.
Enumerate the importance and limitations of statistics.
Explain the process of statistics.
Know the difference between descriptive and inferential statistics.
Distinguish between qualitative and quantitative variables.
Distinguish between discrete and continuous variables.
Determine the level of measurement of a variable.

Statistics
Statistics started in the 17th century. The word is derived from the Latin phrase statisticum collegium, which means council of state, and from the Italian word statista, which means statesman or politician.
17th century (original purpose) - gathering information for government and administrative bodies.
19th century - collection and analysis of data in general.
20th century - became an integral part of all aspects of life (industry, education, health, etc.).

According to Warren and Brown (1992), statistics originally meant the science dealing with data about the condition of a state or community, for example the Consumer Price Index (CPI), Gross Domestic Product (GDP), birth rates, mortality rates, unemployment rates, literacy rates, and monetary exchange rates.

STATISTICS (SCIENCE) - is a branch of science dealing with the collection, organization, presentation, analysis, and interpretation of any kind of data.
STATISTICS (MEASURE) - is any descriptive measure, such as the mean, median, or standard deviation, that is computed from sample data.
STATISTICS - is a scientific body of knowledge that deals with the collection, organization or presentation, analysis, and interpretation of data.

Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions. In addition, statistics is about providing a measure of confidence in any conclusions. A statistic (singular) is a numerical summary of a sample.

Why Statistics is important
Statistics is important because it enables people to make decisions based on empirical evidence.
Statistics provides us with the tools needed to convert massive amounts of data into pertinent information that can be used in decision-making. Statistics can provide us with information that we can use to make sensible decisions.

What information is referred to in the definition?
The information referred to in the definition is the DATA. According to the Merriam-Webster dictionary, data are “factual information used as a basis for reasoning, discussion, or calculation”. Data can be numerical, as in height, or nonnumerical, as in gender. In either case, data describe characteristics of an individual.

Fields of Statistics
A. Mathematical Statistics - The study and development of statistical theory and methods in the abstract.
B. Applied Statistics - The application of statistical methods to solve real problems involving randomly generated data, and the development of new statistical methodology motivated by real problems. Example branches of Applied Statistics: psychometrics, econometrics, and biostatistics.

Limitations of Statistics
1. Statistics is not suitable for the study of qualitative phenomena.
2. Statistics does not study individuals.
3. Statistical laws are not exact.
4. Statistical tables may be misused.
5. Statistics is only one of the methods of studying a problem.

Terms to remember
Universe - is the set of all entities under study.
Population - is the total or entire group of individuals or observations from which information is desired by a researcher. Apart from persons, a population may consist of mosquitoes, villages, institutions, etc.
Individual - is a person or object that is a member of the population being studied.
Sample - is a subset of the population.
Statistic - is a numerical summary of a sample.
Parameter - is a numerical summary of a population.

PROCESS OF STATISTICS
1. Identify the research objective.
A researcher must determine the question(s) he or she wants answered. The question(s) must clearly identify the population that is to be studied.
2. Collect the information needed to answer the questions.
Conducting research on an entire population is often difficult and expensive, so we typically look at a sample. This step is vital to the statistical process, because if the data are not collected correctly, the conclusions drawn are meaningless. Do not overlook the importance of appropriate data collection.

Example 1. The Philippine Mental Health Association contacted 1,028 teenagers who are 13 to 17 years of age and live in Antipolo City and asked whether or not they had been prescribed medications for any mental disorders, such as depression or anxiety.
Population: ALL teenagers 13 to 17 years of age who live in Antipolo City
Sample: The 1,028 teenagers 13 to 17 years of age who live in Antipolo City and were contacted

Example 2. A farmer wanted to learn about the weight of his soybean crop. He randomly sampled 100 plants and weighed the soybeans on each plant.
Population: The entire soybean crop
Sample: The 100 selected soybean plants

3. Organize and summarize the information.
Descriptive statistics allow the researcher to obtain an overview of the data and can help determine the type of statistical methods the researcher should use.
4. Draw conclusions from the information.
In this step the information collected from the sample is generalized to the population. Inferential statistics uses methods that take results obtained from a sample, extend them to the population, and measure the reliability of the result.
Take note: If the entire population is studied, then inferential statistics is not necessary, because descriptive statistics will provide all the information that we need regarding the population.

2 Major Divisions of Statistics: DESCRIPTIVE and INFERENTIAL
Descriptive Statistics - is a statistical procedure concerned with describing the characteristics and properties of a group of persons, places, or things; it is based on easily verifiable facts.
Inferential Statistics - is a statistical procedure used to draw inferences about the population based on the information obtained from the sample.
Topics covered under the two divisions include: presentation of data; correlation and regression; sampling distribution; definition of terms; summation; calculator exercises; simple time series analysis; hypothesis testing; sampling techniques; summary of measures; Z-test, t-test, F-test, and chi-square test; normal distribution; and test on proportion.

Example
For each of the following statements, decide whether it belongs to the field of descriptive statistics or inferential statistics.
Example 1. A badminton player wants to know his average score for the past 10 games. (Descriptive Statistics)
Example 2.
A car manufacturer wishes to estimate the average lifetime of batteries by testing a sample of 50 batteries. (Inferential Statistics)
Example 3. Janine wants to determine the variability of her six exam scores in Algebra. (Descriptive Statistics)
Example 4. A shipping company wishes to estimate the number of passengers traveling via their ships next year using their data on the number of passengers in the past three years. (Inferential Statistics)
Example 5. A politician wants to determine the total number of votes his rival obtained in the past election based on his copies of the tally sheet of electoral returns. (Descriptive Statistics)

Which of the following questions can be answered by descriptive statistics, and which by inferential statistics? Why?
1. How many students are interested in taking Mathematics online?
2. What are the highest and the lowest scores obtained by applicants in an interview test?
3. What are the characteristics of the most likeable professors according to students?
4. Who performed better in the entrance test?
5. What proportion of NU MOA students like mathematics?
6. Is there a significant difference in the academic performance of male and female students in Algebra?
7. Is there a significant correlation between educational attainment and job performance rating?
8. Is there a significant difference between the mean of First Year Section A students and Section B students in Filipino?
9. Is there a significant difference between the proportions of students who are interested in taking Algebra online and those who are not?
10.
Is there a correlation between the weight of the students before and after attending the feeding program of DepEd Calabarzon?

DISTINCTION BETWEEN QUALITATIVE AND QUANTITATIVE VARIABLES
Variables - are the characteristics of the individuals within the population. If variables did not vary, they would be constants, and statistical inference would not be necessary.

Example variables:
S = sex of a student
E = employment status of an employee
I = monthly income of a person
N = number of children of a teacher
H = height of a basketball player

2 groups of Variables: QUALITATIVE VARIABLE and QUANTITATIVE VARIABLE
Qualitative Variable (Categorical) - is a variable that yields categorical responses. Its value is a word or a code that represents a class or category.
Quantitative Variable (Numeric) - takes on numerical values representing an amount or quantity.

Determine whether the following variables are qualitative or quantitative.
1. Hair color
2. Temperature
3. Stages of breast cancer
4. Number of hamburgers sold
5. Number of children
6. Zip code
7. Place of birth
8. Degree of pain

Qualitative Variable
Dichotomous - the variable can take only two categories.
Multinomial - the variable can take more than two categories.

Quantitative Variable
Discrete Variable - a quantitative variable that has either a finite number of possible values or a countable number of possible values. It produces numerical responses that arise from count data; decimals have no meaning. (How many?)
Continuous Variable - a quantitative variable that has an infinite number of possible values that are not countable. It takes on numerical responses that arise from measured data; decimals have meaning. (How much?)

Determine whether the following quantitative variables are discrete or continuous.
1. The number of heads obtained after flipping a coin five times.
2.
The number of cars that arrive at a McDonald’s drive-through between 12:00 P.M. and 1:00 P.M.
3. The distance a 2005 Toyota Prius can travel in city conditions with a full tank of gas.
4. Number of words correctly spelled.
5. Time of a runner to finish one lap.

4 Levels of Measurement
Measurement - is the process of determining the value or label of the variable based on what has been observed.
The four levels of measurement are nominal, ordinal, interval, and ratio. It is important to know the level of measurement (LOM) used to measure a variable because it helps us interpret the values that the variable takes on. The LOM also helps us decide on the appropriate statistical technique to use in analyzing the collected data, because it tells us which mathematical treatment can be applied to the measurements that make up our data.
Quantitative: RATIO, INTERVAL
Qualitative: ORDINAL, NOMINAL

Nominal Level - Nominal scales are sometimes called categorical scales or categorical data. Such a scale classifies persons or objects into two or more categories. Whatever the basis for classification, a person can only be in one category, and members of a given category have a common set of characteristics.
Example: Method of payment (cash, check, debit card, credit card); Type of school (public vs. private); Eye color (blue, green, brown)

Ordinal Level - This involves data that may be arranged in some order, but differences between data values either cannot be determined or are meaningless. An ordinal scale not only classifies subjects but also ranks them in terms of the degree to which they possess the characteristic of interest. In other words, an ordinal scale puts the subjects in order from highest to lowest, from most to least. Although ordinal scales indicate that some subjects are higher or lower than others, they do not indicate how much higher or how much better.
Example: Food preferences; Stage of disease; Socioeconomic class (first, middle, lower)

Interval Level - This measurement level not only classifies and orders the measurements, but also specifies that the distances between intervals are equivalent along the whole scale, from low to high. A value of zero does not mean the absence of the quantity. Arithmetic operations such as addition and subtraction can be performed on values of the variable.
Example: Temperature on a Fahrenheit/Celsius thermometer; Trait anxiety (e.g., high anxious vs. low anxious); IQ (e.g., high IQ vs. average IQ vs. low IQ)

Ratio Level - A ratio scale represents the highest, most precise level of measurement. It has the properties of the interval level of measurement, and the ratios of the values of the variable have meaning. A value of zero means the absence of the quantity. Arithmetic operations such as multiplication and division can be performed on the values of the variable.
Example: Height and weight; Time; Time until death

Both interval and ratio data involve measurement. Most data analysis techniques that apply to ratio data also apply to interval data. Therefore, in most practical aspects, these types of data (interval and ratio) are grouped under metric data. In some other instances, these types of data are also known as numerical discrete and numerical continuous.

Example
Categorize each of the following as nominal, ordinal, interval, or ratio measurement.
1. Ranking of college athletic teams.
2. Employee number.
3. Number of vehicles registered.
4. Brands of soft drinks.
5. Number of car passers along C5 on a given day.
6. Zip code
7. Degree of pain
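
The four levels of measurement described above differ in which operations give meaningful answers. The following is a minimal Python sketch of that idea, not part of the original slides. All data values, the stage labels in STAGE_ORDER, and the variable names are made up for illustration. It uses the mode for nominal data, the middle category for ordinal data, differences for interval data, and ratios for ratio data.

```python
import math
from statistics import mode

# Nominal: categories only, so counting and the mode are meaningful.
eye_colors = ["Brown", "Blue", "Brown", "Green", "Brown"]  # illustrative data
most_common_color = mode(eye_colors)

# Ordinal: categories have an order, so a middle (median) category exists,
# but the gap between two stages is not a measurable distance.
STAGE_ORDER = ["Stage I", "Stage II", "Stage III", "Stage IV"]  # hypothetical labels
stages = ["Stage II", "Stage I", "Stage IV", "Stage II", "Stage III"]
ranks = sorted(STAGE_ORDER.index(s) for s in stages)
median_stage = STAGE_ORDER[ranks[len(ranks) // 2]]

# Interval: differences are meaningful, but zero is arbitrary, so a ratio
# changes when the same temperatures are expressed on another scale.
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

temps_c = [10.0, 20.0]
diff_c = temps_c[1] - temps_c[0]
ratio_c = temps_c[1] / temps_c[0]
ratio_f = celsius_to_fahrenheit(temps_c[1]) / celsius_to_fahrenheit(temps_c[0])

# Ratio: zero means "none of the quantity", so a ratio stays the same
# (up to floating-point rounding) when the unit changes.
heights_cm = [90.0, 180.0]
heights_in = [h / 2.54 for h in heights_cm]
ratio_cm = heights_cm[1] / heights_cm[0]
ratio_in = heights_in[1] / heights_in[0]

print("Nominal  - most common eye color:", most_common_color)
print("Ordinal  - median stage:", median_stage)
print("Interval - difference in Celsius:", diff_c)
print("Interval - ratio in Celsius vs. Fahrenheit:", ratio_c, "vs.", ratio_f)
print("Ratio    - same ratio in cm and inches:", math.isclose(ratio_cm, ratio_in))
```

The interval case is the one to notice: 20 degrees Celsius is not "twice as hot" as 10 degrees Celsius, because the ratio of the same two temperatures comes out differently in Fahrenheit, while the ratio of the two heights is the same in centimeters and inches.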