The Nature of Probability and Statistics
Marianne Jane Antoinette D. Pua, M.S.
Summary
This document introduces probability and statistics. It covers descriptive and inferential statistics, the types of variables, the levels of measurement, data collection, and sampling techniques.
**THE NATURE OF PROBABILITY AND STATISTICS**

**Introduction**

Decision makers make better decisions when they use all available information in an effective and meaningful way. The primary role of statistics is to provide decision makers with methods for obtaining and analyzing information to help make these decisions. Statistics is used, for example, to answer long-range planning questions, such as when and where to locate facilities to handle future sales.

The word statistics is derived from the Latin word *status*, meaning "state". In the beginning, statistics involved the compilation of data and graphs describing various aspects of a state or country.

The word statistics means different things to different people. To some, statistics means the actual numbers derived from data; others refer to statistics as a method of analysis. Specifically, statistics is defined as the science of collecting, organizing, presenting, analyzing, and interpreting numerical data for the purpose of assisting in making more effective decisions. Statistical methods are vital tools in research in education, psychology, medicine, business, agriculture, and other disciplines.

**Types of Statistics**

Statistics is a tool that helps us develop general and meaningful conclusions that go beyond the original data. There are two types of statistical analysis: descriptive statistics and inferential (or inductive) statistics.

1. *Descriptive Statistics* comprises all the methods used to collect, organize, summarize, or present data, usually to make the data easier to understand. It is concerned with summary calculations, such as averages and percentages, and with the construction of graphs, charts, and tables.
2. *Inferential Statistics* is concerned with the formulation of conclusions or generalizations about a population based on an observation or a series of observations of a sample drawn from that population. It consists of performing hypothesis testing, determining relationships among variables, and making predictions.
For example, the average family income of the residents in Region 2 can be estimated from figures obtained from a few hundred families (the sample).

**Population vs Sample**

A *population* is the collection of all possible observations of a specified characteristic of interest. Example: all students in ISU for SY 2009-2010.

A *sample* is a subset of the population. Example: first-year students in ISU.

**Quantitative and Qualitative Variables or Data**

In doing research, we first have to define the variables relevant to the data. The term *variable* means an item of interest that can take on many different values, while a collection of these values is called *data*. If a given value is fixed and does not vary, it is called a *constant*. There are two major classifications of variables: qualitative and quantitative.

1. *Qualitative Variables* are nonnumeric variables and cannot be measured. Examples include gender (male, female), religious affiliation (Roman Catholic, Iglesia ni Cristo, Methodist, etc.), and ethnicity (Ilocano, Tagalog, Ibanag, etc.).
2. *Quantitative Variables* are numerical variables and can be measured. Examples include the balance in your checking account and the number of children in your family.

Quantitative variables can be ordered and ranked. They are classified into two groups: discrete and continuous.

*Discrete variables* take values that are obtained by counting, so the results are whole numbers. Such variables can take on only specific or isolated values along a scale. For example, the number of children in a family may be 1, 2, 3, or any other whole number, but it can never be 1.25 or 0.5. The number of students in a room is another example. Discrete variables are also called discontinuous variables.

*Continuous variables* take values that are obtained by measuring. The results can be any value between two specific values. For example, if you measure the height of every student in the room, you could get any number between two reasonable limits, so height is a continuous variable.

**Levels of Measurement**

Variables can also be classified according to the level of measurement. There are four levels of measurement: nominal, ordinal, interval, and ratio.

1. *Nominal Data:* The weakest level of measurement. Names or numbers are used only to represent an item or characteristic. Examples include names, gender, religious affiliation, civil status, and college majors. Note that such data should not be treated as numerical, since relative size has no meaning.
2. *Ordinal or Rank Data:* Data that can be ordered or ranked, but for which a specific difference between levels cannot be determined. For example, performance ratings (Outstanding, Very Satisfactory, Satisfactory, Poor) can be ordered: Outstanding is higher than Very Satisfactory, Very Satisfactory is higher than Satisfactory, and so on, but there is no exact difference between any two of them. The grades behind Outstanding and Very Satisfactory may be close (4.65 and 4.45) or far apart (5.00 and 4.25), so the exact difference cannot be determined.
3. *Interval Data:* Data that can be ordered and have an exact difference between any two units, but have no meaningful zero or starting point. For example, temperature measured in degrees Celsius or Fahrenheit is interval data: it can be ordered and there is an exact difference between two degrees, but zero is not a true starting point, since there can be temperatures below zero.
4. *Ratio Data:* The highest level of measurement, which allows for all basic arithmetic operations, including division and multiplication. Data at this level can be ordered, have exact differences between units, and have a meaningful zero. Things that are counted are usually ratio level. Business data, such as cost, revenue, and profit, are examples.
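The difference between interval and ratio data can be checked numerically. The Python sketch below is illustrative only: the function names and the sample values (two temperatures and two peso amounts) are assumptions, not part of the original notes. It shows that the ratio of two Celsius readings changes when the unit changes, because zero degrees is not a true starting point, while the ratio of two costs stays the same in any unit.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Interval data: two hypothetical temperature readings in degrees Celsius.
warm_c, cool_c = 20.0, 10.0

# The "ratio" depends on the unit, so "twice as hot" has no real meaning.
ratio_celsius = warm_c / cool_c
ratio_fahrenheit = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)

# Differences are still meaningful: every 1 degree C step equals 1.8 degrees F.
diff_celsius = warm_c - cool_c
diff_fahrenheit = celsius_to_fahrenheit(warm_c) - celsius_to_fahrenheit(cool_c)

# Ratio data: two hypothetical costs, expressed in pesos and in centavos.
cost_a_pesos, cost_b_pesos = 200.0, 100.0
ratio_pesos = cost_a_pesos / cost_b_pesos
ratio_centavos = (cost_a_pesos * 100) / (cost_b_pesos * 100)

print("Temperature ratio in Celsius:   ", ratio_celsius)
print("Temperature ratio in Fahrenheit:", ratio_fahrenheit)
print("Difference in C and in F:       ", diff_celsius, diff_fahrenheit)
print("Cost ratio in pesos / centavos: ", ratio_pesos, ratio_centavos)
```

The cost ratio is 2.0 in both units, so saying that one cost is twice another is meaningful. The temperature ratio is 2.0 in Celsius but 1.36 in Fahrenheit, so only differences should be compared for interval data.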
**Data Collection**

Data can be collected in various ways:

1. Focus group
2. Telephone interview
3. Mail questionnaires
4. Door-to-door survey
5. Mall intercept
6. New product registration
7. Personal interview
8. Experiments

**Sources of Data**

1. *Secondary Data:* Data that are already available. For example, ISU enrollment data. Secondary data are less expensive to obtain; however, they may not satisfy the researcher's needs.
2. *Primary Data:* Data that must be collected.

**Sampling Techniques**

Sampling techniques are used when only a part of the population is to be surveyed. If it would take too long or be too expensive to interview the whole population, a sample is used. If a sample is chosen correctly so that it represents the population, it is called *unbiased*; if it does not represent the whole population, it is called *biased*. There are many ways to collect a sample, statistical or nonstatistical. The most commonly used methods are:

**A. Statistical Sampling**

1. *Simple Random Sampling:* This is used to ensure that all elements of the population have an equal opportunity of being selected for the sample.
2. *Stratified Random Sampling:* This is obtained by selecting simple random samples from strata (mutually exclusive sets). Some of the criteria for dividing a population into strata are gender (male, female); age (under 18, 18 to 28, 29 to 39); and occupation (blue-collar, professional, other).
3. *Cluster Sampling:* This is a simple random sample of groups or clusters of elements. Cluster sampling is useful when it is difficult or costly to generate a simple random sample. For example, to estimate the average annual household income in a large city, simple random sampling would require a complete list of households in the city from which to sample, and stratified random sampling would again require that list. A less expensive way is to let each block within the city represent a cluster.
A sample of clusters could then be randomly selected, and every household within these clusters could be interviewed to find the average annual household income.

**B. Nonstatistical Sampling**

1. *Judgement Sampling:* The person taking the sample has direct or indirect control over which items are selected for the sample.
2. *Convenience Sampling:* The decision maker selects a sample from the population in a manner that is relatively easy and convenient.
3. *Quota Sampling:* The decision maker requires the sample to contain a certain number of items with a given characteristic. Many political polls are, in part, quota sampling.

**Note:** A random number table provides lists of randomly generated numbers that can be used to select random samples. Computer packages can also be used to generate lists of random numbers. For the table, refer to any statistics textbook.

**Worksheet no. 1**

1. What is statistics? Give one specific application of statistics in each of the following fields:
   a. Education
   b. Business
   c. Psychology
   d. Biology
   e. Economics
2. Differentiate the following and give an example of each.
   a. Descriptive and inferential statistics
   b. Sample and population
   c. Discrete and continuous variables
   d. Qualitative and quantitative data
3. Write the correct answer in the space provided.
   a. _______________ A collection of all the objects to be studied.
   b. _______________ The highest level of measurement.
   c. _______________ The level of measurement that can only be classified into groups.
   d. _______________ A subset or a part of the subjects to be studied.
   e. _______________ The level of measurement of the variable I.Q.
   f. _______________ A sample that does not represent a population correctly.
   g. _______________ The use of preexisting groups in a sample.
   h. _______________ A sampling procedure done by taking every third item to be tested.
4. In the space provided before each item, write **Q** if the variable is qualitative; if it is quantitative, write **D** if it is discrete and **C** if it is continuous.
   a. _______ Educational attainment
   b. _______ ID number
   c. _______ IQ score
   d. _______ Political affiliation
   e. _______ Rank of teachers
   f. _______ Place of residence
   g. _______ Time required to
   h. _______ Brand of watches
   i. _______ Student number
   j. _______ Height of the building
   k. _______ Number of years in school
   l. _______ Speed of cars
   m. _______ Weight of children
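
The three statistical sampling methods described under Sampling Techniques (simple random, stratified random, and cluster sampling) can be sketched in Python as follows. This is a minimal illustration under assumed data: the 60-household population, the `block` and `occupation` fields, the sample sizes, and the helper function names are all hypothetical and do not come from the notes. Only the standard-library `random` module is used, in place of a random number table.

```python
import random
from collections import defaultdict

# Fixed seed so the sketch gives the same selections each time it is run.
rng = random.Random(42)

# Hypothetical population: 60 households spread over 6 city blocks,
# each tagged with one of three occupation groups.
OCCUPATIONS = ("blue-collar", "professional", "other")
population = [
    {"id": i, "block": i % 6, "occupation": OCCUPATIONS[i % 3]}
    for i in range(60)
]


def simple_random_sample(units, n, rng):
    """Select n units so that every unit has an equal chance of selection."""
    return rng.sample(units, n)


def stratified_random_sample(units, key, n_per_stratum, rng):
    """Split units into strata by `key`, then draw a simple random sample from each."""
    strata = defaultdict(list)
    for unit in units:
        strata[unit[key]].append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample


def cluster_sample(units, key, n_clusters, rng):
    """Randomly select whole clusters and keep every unit inside them."""
    clusters = sorted({unit[key] for unit in units})
    chosen = set(rng.sample(clusters, n_clusters))
    return [unit for unit in units if unit[key] in chosen]


srs = simple_random_sample(population, 12, rng)
stratified = stratified_random_sample(population, "occupation", 4, rng)
clustered = cluster_sample(population, "block", 2, rng)

print(len(srs), len(stratified), len(clustered))  # prints: 12 12 20
```

The stratified sample always contains exactly 4 households from each occupation group, while the cluster sample contains all 10 households from each of the 2 selected blocks. Cluster sampling needs only a list of blocks, not a complete list of households, which is why the notes describe it as the less expensive option.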