Stat-213 Chapter 1 PDF
Document Details
Uploaded by MeritoriousSelenite
University of Southern Mindanao
Summary
This document introduces the concept of statistics, defining it as a collection of methods for analyzing data. It covers theoretical and applied aspects of statistics, including data collection and types. It also explains different levels of measurement and data collection methods.
Full Transcript
INTRODUCTION

Intended Learning Outcomes:
1. Define statistics.
2. Identify the levels of measurement and data collection and sampling techniques.
3. Apply the levels of measurement, data collection and sampling techniques, and summation notation.

Most people hear about statistics through radio, television, newspapers, and magazines. The term statistics has different meanings as either a plural or a singular noun. In plural form, it refers to a set of numerical data, such as a record of the birth rate in a rural area compared with the birth rate in an urban area. In singular form, Statistics is an academic discipline such as Mathematics or Physics.

Statistics as an academic discipline stresses the analysis of data to facilitate decision-making. It is used to analyze the results of surveys and, as a tool in scientific studies, to make decisions based on controlled experiments.

WHAT IS STATISTICS?

Statistics is a collection of methods for planning experiments, obtaining data, and then analyzing, interpreting, and drawing conclusions based on the data.

Statistics has two aspects: theoretical and applied. Theoretical statistics deals with the development, derivation, and proof of statistical theorems, formulas, rules, and laws. Applied statistics involves applying those theorems, formulas, rules, and laws to solve real-world problems.

To gain information, a statistician collects data for the variables used to describe an event. Data are the values that the variables can assume. Variables whose values are determined by chance are called random variables. These data can be used in different ways.

There are two types of variables: qualitative and quantitative. Qualitative variables take values that are words or codes representing a class or category. Quantitative variables take values that are numbers representing an amount or a count. Quantitative variables can be further classified as discrete or continuous.
Discrete variables can be assigned values such as 0, 1, 2, 3, … and are said to be countable. Continuous variables, on the other hand, can assume all values between any two specific values, such as 0.5 or 1.2. For example, the temperature of a given person is a continuous variable, while the number of persons in a room is a discrete variable.

Types of Statistics

Statistics is sometimes divided into two main areas.

Descriptive Statistics summarizes or describes the important characteristics of a known set of data. For example, the National Statistics Office conducts surveys to determine the average age, income, and other characteristics of the Filipino population.

Inferential Statistics uses sample data to make inferences about a population. It consists of generalizing from samples to populations, performing hypothesis testing, determining relationships among variables, and making predictions. This kind of statistics uses the concept of probability, the chance that an event will happen.

In statistics, we commonly use the terms population and sample. A population is the complete collection of elements to be studied. Sometimes a population is very large. To save time and money, statisticians may study only a part of the population, called a sample. A sample is a subset of a population.

Closely related to the concepts of population and sample are the concepts of parameter and statistic. A parameter is a numerical measurement describing some characteristic of a population. A statistic is a numerical measurement describing some characteristic of a sample.

LEVEL OF MEASUREMENT

Aside from being classified as qualitative or quantitative, variables can also be classified according to how they are categorized, counted, or measured.

1. Nominal Level
This is characterized by data that consist of names, labels, or categories only. The data cannot be arranged in an ordering scheme.
There is no criterion by which values can be identified as greater than or less than other values. For example, in classifying the instructors in a university as male or female, no ranking can be placed on the data. Another example is classifying residents according to their area codes. Although numbers are assigned as area codes, they have no meaningful order.

2. Ordinal Level
This involves data that may be arranged in some order, but differences between data values either cannot be determined or are meaningless. An example is a grading system that uses letters (A, B, C, D, F).

3. Interval Level
This is the same as the ordinal level, with the additional property that meaningful differences between data values can be determined. Data at this level may lack an inherent zero starting point. For example, temperature measured in degrees Celsius or Fahrenheit is an interval measurement. There is a meaningful difference of one degree between consecutive values, such as 80 and 81 degrees, but a temperature of zero degrees does not mean that there is no heat.

4. Ratio Level
This is the interval level modified to include an inherent zero starting point. Both differences and ratios of data values are meaningful. This is the highest level of measurement. Examples are measurements of height, weight, or area: differences between values are meaningful, and a true zero exists.

DATA COLLECTION AND SAMPLING TECHNIQUES

Data can be collected in different ways. The most common is a survey conducted by telephone, mailed questionnaire, or personal interview. Other methods of collecting data include surveying records and direct observation.

Four Basic Methods of Sampling

1. Random Sampling/Simple Random Sampling
This is done by using chance methods or random numbers. For example, number each subject in the population, write each number on a card, place the cards in a bowl, and draw as many cards as needed. The subjects whose numbers are drawn compose the sample.

2. Systematic Sampling/Systematic Random Sampling
This is done by numbering each subject of the population and then selecting every kth number. For example, suppose there are 5 000 families in a city and 50 families are needed as a sample for an experiment. Since 5 000 ÷ 50 = 100, k = 100, which means that every 100th subject is selected. The first subject, however, is selected at random from subjects 1 to 100. If subject 88 is selected, the sample consists of the subjects numbered 88, 188, 288, and so on until 50 families are obtained.

3. Stratified Sampling
If a population has distinct groups, it is possible to divide the population into these groups and to draw simple random samples (SRSs) from each of the groups. The groups are called strata. Strata are designed so that the members of each stratum are more homogeneous, that is, more similar to each other. The samples from the strata are then combined to form the overall sample. This technique is particularly useful for populations that can be stratified by gender, race, or geography.

4. Cluster Sampling
This method uses intact groups called clusters. Suppose a medical researcher wants to study the patients in Metro Manila. Drawing a random sample would be very costly and time-consuming, since the patients would be spread over different parts of Metro Manila. Instead, a few hospitals could be selected at random, and the patients in these hospitals would be studied as a cluster.

SUMMATION NOTATION

This section describes the summation notation used to denote a sum of values. The uppercase Greek letter Σ (sigma) denotes the sum of all values.

\[ \sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n \]

The symbols above and below the summation sign Σ define the limits of summation. The subscript $i = 1$ means that we start with the first value of x, and the superscript n means that we end with the nth value of x, so the notation calls for the sum of all values from $x_1$ to $x_n$.
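As a rough illustration of how the limits of summation work, the sigma notation can be written as a short Python function. This is only a sketch: the function name summation and the sample values are hypothetical, chosen for illustration, and the indices are 1-based to match the notation.

```python
def summation(x, start, end):
    """Return x_start + ... + x_end using 1-based indices, as in sigma notation."""
    if not 1 <= start <= end <= len(x):
        raise ValueError("limits must satisfy 1 <= start <= end <= n")
    total = 0
    for i in range(start, end + 1):
        total += x[i - 1]  # x_i is stored at list position i - 1
    return total


# Hypothetical data: x_1 = 4, x_2 = 7, x_3 = 1, x_4 = 9
values = [4, 7, 1, 9]

full_sum = summation(values, 1, len(values))  # lower limit i = 1, upper limit n
partial_sum = summation(values, 2, 3)         # lower limit i = 2, upper limit 3

print(full_sum, partial_sum)
```

Changing the two limits changes which values enter the sum, which is exactly what the numbers below and above Σ control.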
Example 1
Let $x_1 = 8$, $x_2 = 9$, $x_3 = 12$, $x_4 = 15$, $x_5 = 6$, $x_6 = 3$, $x_7 = 10$, $x_8 = 5$, $x_9 = 2$, and $x_{10} = 1$. Evaluate the following.

a)
\[
\begin{aligned}
\sum_{i=1}^{10} x_i &= x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 + x_9 + x_{10} \\
&= 8 + 9 + 12 + 15 + 6 + 3 + 10 + 5 + 2 + 1 \\
&= 71
\end{aligned}
\]

b)
\[
\begin{aligned}
\sum_{i=1}^{5} x_i &= x_1 + x_2 + x_3 + x_4 + x_5 \\
&= 8 + 9 + 12 + 15 + 6 \\
&= 50
\end{aligned}
\]

c)
\[
\begin{aligned}
\sum_{i=6}^{10} x_i &= x_6 + x_7 + x_8 + x_9 + x_{10} \\
&= 3 + 10 + 5 + 2 + 1 \\
&= 21
\end{aligned}
\]

Here’s another way to solve this.
\[
\sum_{i=6}^{10} x_i = \sum_{i=1}^{10} x_i - \sum_{i=1}^{5} x_i = 71 - 50 = 21
\]

Example 2
Using the data from the previous example, suppose that before summing we wish to multiply each of the first five values of $x_i$ by 8. Evaluate $\sum_{i=1}^{5} 8x_i$.
\[
\begin{aligned}
\sum_{i=1}^{5} 8x_i &= 8x_1 + 8x_2 + 8x_3 + 8x_4 + 8x_5 \\
&= 8(8) + 8(9) + 8(12) + 8(15) + 8(6) \\
&= 64 + 72 + 96 + 120 + 48 \\
&= 400
\end{aligned}
\]

Here’s another way to solve this.
\[
\sum_{i=1}^{5} 8x_i = 8 \sum_{i=1}^{5} x_i = 8(50) = 400
\]

Formulas of Summation

Formula 1
For any constant c,
\[
\sum_{i=1}^{n} c x_i = c \sum_{i=1}^{n} x_i
\]
This follows by factoring c out of every term:
\[
\begin{aligned}
\sum_{i=1}^{n} c x_i &= c x_1 + c x_2 + c x_3 + \cdots + c x_n \\
&= c (x_1 + x_2 + x_3 + \cdots + x_n) \\
&= c \sum_{i=1}^{n} x_i
\end{aligned}
\]

Example 3
Evaluate $\sum_{i=1}^{8} 10$.
This means that 10 is added eight times.
\[
\sum_{i=1}^{8} 10 = 10 + 10 + 10 + 10 + 10 + 10 + 10 + 10 = 8(10) = 80
\]
We can generalize this as Formula 2.

Formula 2
For any constant c,
\[
\sum_{i=1}^{n} c = nc
\]

Formula 3
\[
\sum_{i=1}^{n} (x_i + y_i + z_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i + \sum_{i=1}^{n} z_i
\]

Formula 4
\[
\begin{aligned}
\sum_{i=1}^{n} (x_i \pm c) &= (x_1 \pm c) + (x_2 \pm c) + (x_3 \pm c) + \cdots + (x_n \pm c) \\
&= \left( \sum_{i=1}^{n} x_i \right) \pm nc
\end{aligned}
\]

Example 4
Given: $\sum_{i=1}^{10} x_i = 55$, $\sum_{i=1}^{10} y_i = 65$, and $\sum_{i=1}^{10} z_i = 165$.
Evaluate:
a) $\sum_{i=1}^{10} (x_i + y_i + z_i)$
b) $\sum_{i=1}^{10} (3x_i + 4y_i - 7)$

Solution
a)
\[
\begin{aligned}
\sum_{i=1}^{10} (x_i + y_i + z_i) &= \sum_{i=1}^{10} x_i + \sum_{i=1}^{10} y_i + \sum_{i=1}^{10} z_i \\
&= 55 + 65 + 165 \\
&= 285
\end{aligned}
\]

b)
\[
\begin{aligned}
\sum_{i=1}^{10} (3x_i + 4y_i - 7) &= 3 \sum_{i=1}^{10} x_i + 4 \sum_{i=1}^{10} y_i - \sum_{i=1}^{10} 7 \\
&= 3(55) + 4(65) - 10(7) \\
&= 165 + 260 - 70 \\
&= 355
\end{aligned}
\]

Formula 5
\[
\sum_{i=1}^{n} x_i^2 = x_1^2 + x_2^2 + x_3^2 + \cdots + x_n^2
\]

Formula 6
\[
\sum_{i=1}^{n} c x_i^2 = c x_1^2 + c x_2^2 + c x_3^2 + \cdots + c x_n^2 = c \sum_{i=1}^{n} x_i^2
\]

Formula 7
\[
\sum_{i=1}^{n} (x_i \pm c)^2 = (x_1 \pm c)^2 + (x_2 \pm c)^2 + (x_3 \pm c)^2 + \cdots + (x_n \pm c)^2
\]

Example 6
The scores of five students in an English class are 75, 80, 97, 91, and 63. Find the following if x represents a score.
a) $\sum x$
b) $\sum x^2$
c) $\left( \sum x \right)^2$

Solution
a)
\[
\sum x = 75 + 80 + 97 + 91 + 63 = 406
\]

b)
\[
\begin{aligned}
\sum x^2 &= (75)^2 + (80)^2 + (97)^2 + (91)^2 + (63)^2 \\
&= 5\,625 + 6\,400 + 9\,409 + 8\,281 + 3\,969 \\
&= 33\,684
\end{aligned}
\]

c)
\[
\left( \sum x \right)^2 = (75 + 80 + 97 + 91 + 63)^2 = (406)^2 = 164\,836
\]

TECHNOLOGY APPLICATION

Summation Notation in Excel
https://www.excel-easy.com/examples/sum.html

Use the SUM function in Excel to sum a range of cells, an entire column, or non-contiguous cells. To build more powerful SUM formulas, combine the SUM function with other Excel functions.

Sum Range
Most of the time, you'll use the SUM function in Excel to sum a range of cells.

Sum Entire Column
You can also use the SUM function in Excel to sum an entire column.
Note: you can also use the SUM function in Excel to sum an entire row. For example, =SUM(5:5) sums all values in the 5th row.

Sum Non-contiguous Cells
You can also use the SUM function in Excel to sum non-contiguous cells. Non-contiguous means not next to each other.
Note: =A3+A5+A8 produces the exact same result!

AutoSum
Use AutoSum or press ALT + = to quickly sum a column or row of numbers.
1. First, select the cell below the column of numbers (or next to the row of numbers) you want to sum.
2. On the Home tab, in the Editing group, click AutoSum (or press ALT + =).
3. Press Enter.
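The sums in Example 6 can also be reproduced outside Excel. The following is a minimal Python sketch and not part of the original lesson; the variable names and the constant c = 8 are chosen only for illustration. It computes the three quantities from Example 6 and then checks Formula 1 and Formula 4 on the same data.

```python
scores = [75, 80, 97, 91, 63]  # the five English scores from Example 6

sum_x = sum(scores)                          # sum of x
sum_x_squared = sum(x ** 2 for x in scores)  # sum of x^2: square each score, then add
square_of_sum = sum_x ** 2                   # (sum of x)^2: add first, then square

print(sum_x, sum_x_squared, square_of_sum)  # prints 406 33684 164836

# Check Formula 1 and Formula 4 with an illustrative constant c.
c = 8
n = len(scores)
formula_1_holds = sum(c * x for x in scores) == c * sum_x
formula_4_holds = sum(x - c for x in scores) == sum_x - n * c
print(formula_1_holds, formula_4_holds)  # prints True True
```

The sum of squares (33 684) and the square of the sum (164 836) come out different, so the order of squaring and adding matters.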