Chapter 1-2 Probability and Statistics PDF

Summary

This document introduces the concepts of probability and statistics. It covers definitions and classifications, including descriptive and inferential statistics.

Full Transcript


2/18/2021
Compiled by: Bacha E., Applied Mathematics, ASTU

Chapter One
Introduction

Definition of statistics

1. Plural sense (layman's definition): Statistics is defined as a collection of numerical facts or figures (that is, the raw data themselves).
   Examples: statistics of births, deaths, students, imports and exports, etc.
2. Singular sense (formal definition): Statistics is the subject that deals with the methods of collecting, organizing, presenting, analyzing and interpreting numerical data.

Statistical methods can be used to answer questions such as:
- What kind of data, and how much of it, needs to be collected?
- How should we organize and summarize the data?
- How can we analyze the data and draw conclusions from it?
- How can we assess the strength of the conclusions and evaluate their uncertainty?

Classifications of Statistics

Depending on how the data are used, statistics is sometimes divided into two main areas or branches.

i. Descriptive Statistics
- Is the branch of statistics devoted to the summarization and description of data.
- It includes the construction of graphs, charts, and tables, and the calculation of various descriptive measures such as averages, measures of variation, and percentiles.

Examples:
1. Suppose that the marks of 6 students in a Probability and Statistics course are 40, 45, 50, 60, 70 and 80. The average mark of the 6 students is 57.5.
2. 85% of the instructors at ASTU are male.
3. The average age of the football players who participated in the 2018 World Cup in Russia was 26 years.

ii. Inferential Statistics
- Consists of methods for drawing conclusions about a population, and measuring the reliability of those conclusions, based on information obtained from a sample of the population.
- It deals with making inferences and/or conclusions about a population based on data obtained from a sample of observations.
- It involves making predictions and generalizing about the phenomena represented by the data.
- It consists of performing hypothesis tests, determining relationships among variables and making predictions.
- For example, the average income of all families in Ethiopia (the population) can be estimated from figures obtained from a few hundred families (the sample).

Descriptive statistics:
- Collect
- Organize
- Summarize
- Display
- Analyze

Inferential statistics:
- Predict and forecast values of population parameters
- Test hypotheses about values of population parameters
- Make decisions

Stages of Statistical Investigation

There are five stages or steps in any statistical investigation:
1. Collection of data
2. Organization of data
3. Presentation of data
4. Analysis of data
5. Interpretation of data

Definition of some statistical terms

Population: the complete set of observations or measurements of the individuals or objects under study. The word "population" doesn't necessarily refer to people.
e.g.
- All clients of a telephone company
- All students of Adama Science and Technology University (ASTU)
- All households in Adama town

Sample: a part or subset of the population under study.

Survey: an investigation of a certain population to assess its characteristics. It may be a census or a sample survey.

Census survey: a complete enumeration of the population under study.

Sample survey: the process of collecting data covering a representative part or portion of a population.

Parameter: a characteristic or measure obtained from population data.
Examples:
- Population mean ($\mu$, read as "mu")
- Population variance ($\sigma^2$, read as "sigma squared")
- Population standard deviation ($\sigma$)
- etc.

Statistic: a characteristic or measure obtained from a sample. That is, a statistic describes a characteristic of the sample, which can then be used to make inferences about unknown parameters.
Examples:
- Sample mean ($\bar{x}$)
- Sample variance ($S^2$)
- Sample standard deviation ($S$)

Sample size (n): the number of elements or observations included in the sample.

Variable: an item of interest that can take numerical or non-numerical values for different elements.
Example: sex, marital status, age, weight, height, expenditure, etc.

There are two types of variable.
1. Qualitative variable: a variable that assumes non-numerical values. e.g. sex, religion, marital status, nationality, language, hair color, etc.
2. Quantitative variable: a variable that assumes numerical values. e.g. age, income, height, weight, family size, volume, expenditure, etc.

Note that quantitative variables are either discrete (they can assume only certain values, and there are usually "gaps" between the values, such as the number of bedrooms in your house) or continuous (they can assume any value within a specific range, such as the air pressure in a tire).

Data: a measurement or observation value recorded for a certain element or variable.

Based on their nature, data are divided into two types:
1. Qualitative (categorical) data
2. Quantitative (numerical) data
   1. Discrete data
   2. Continuous data

Qualitative data: the recorded values of a qualitative variable.
Example:
- Gender: male, female
- Religion: Orthodox, Muslim, Catholic, Protestant, Wakefeta, etc.
- Marital status: single, married, divorced, widowed
- Political party

Quantitative data: the recorded values of a quantitative variable.
Example: number of children, weight, income, defects per hour, voltage, wage of workers.

Quantitative data are of two types:
1. Discrete data
2. Continuous data

- Discrete data: the possible values are known and countable. e.g. number of children per family, number of students in a class, etc.
- Continuous data: can take any value within a specific range. Most of them are obtained by measurement. e.g. income, length, salary, weight, height, etc.

Applications, Uses and Limitations of Statistics

Applications:
Statistics is a very broad subject, with applications in a vast number of different fields. It can be applied in any field of study which seeks quantitative evidence.

Statistics has wide application in engineering:
- To determine the probability that a product is reliable.
- To compare the breaking strength of two types of materials.
- To control the quality of products in a given production process.
- etc.

Functions/Uses of Statistics
- It condenses and summarizes a mass of data.
- It facilitates comparison of data.
- It helps to predict future trends.
- It helps to formulate and review policies.
- It helps in formulating and testing hypotheses.
Limitations of Statistics
- It does not deal with individual values.
- It does not deal with qualitative characteristics directly.
- Statistical conclusions are not universally true.
- It can be misused: statistics cannot be used to full advantage in the absence of a proper understanding of the subject matter.
- etc.

Levels of Measurement

Proper knowledge of the nature and type of the data to be dealt with is essential in order to choose and apply the proper statistical method for their analysis and for inference. Four levels of measurement scales are commonly distinguished:

1. Nominal scale
- No ranking or ordering.
- No arithmetic or relational (ordering) operations are applicable.
- No numerical or quantitative value.
Example: sex (male or female), marital status (married, single, widowed, divorced).

2. Ordinal scale
- Values can be arranged in some order, but the differences between the data values are meaningless.
- Arithmetic operations are not applicable.
- All relational operations are applicable.
Example: letter grades (A, B, C, D, F); rating scales (excellent, very good, good, fair, poor); military rank (general, colonel, lieutenant, etc.).

3. Interval scale
- All relational operations are applicable.
- All arithmetic operations except division are applicable (ratios of values are not meaningful).
- There is no true zero, or starting point. That is, zero on the scale is arbitrary (artificial origin).
Example: temperature (°C), intelligence quotient (IQ).

4. Ratio scale
- All arithmetic and relational operations are applicable.
- Zero on the scale implies absolute absence of the characteristic under consideration.
Example: weight, age, number of students, number of children per family, etc.

CHAPTER TWO
Methods of Data Collection and Presentation

I.
Methods of Data Collection

Based on their source, data can be classified into two types:
i. Primary data
ii. Secondary data

Primary data are those collected by the investigator for the purpose of a specific study, whereas secondary data are obtained from data already collected by some other agency for the same or a different purpose.

Examples of secondary data
Taking data from:
- Different organizations such as
  - Central Statistics Agency (CSA)
  - World Bank
  - Commercial Bank of Ethiopia
  - National Bank of Ethiopia, …
- Books
- Journals
- Internet
- etc.

The common methods of primary data collection are:
i) Direct personal interviews
   - Face-to-face interview
   - Telephone interview
ii) Written questionnaire method
iii) Experimentation
iv) Indirect interview
v) Observation

Methods of Data Presentation

There are two methods of data presentation:
1. Tabular presentation of data (frequency distribution)
   - Categorical (qualitative) frequency distribution
   - Discrete frequency distribution
   - Continuous frequency distribution
2. Diagrammatic and graphical presentation

1. Categorical Frequency Distribution
- Used for data that can be placed in specific categories, such as nominal or ordinal data.
- Count the occurrences in each category and find the totals.

Example: A social worker collected the following data on marital status from 60 persons (M = married, S = single, W = widowed, D = divorced):
M M M S S S S D W W D W D S ……

Table 1.
The marital status of 60 adults

Marital status    S    M    D    W    Total
Frequency (f_i)   25   20   8    7    60

2. Discrete Frequency Distribution
- Count the number of times each possible value is repeated.

Example: In a survey of 30 families, the number of children per family was recorded, giving the following data:
4 2 4 3 2 8 3 4 4 2 2 8 5 3 4 5 4 5 4 3 5 2 7 3 3 6 7 3 8 4

These individual observations can be arranged in ascending order of magnitude to form an array:
2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 7, 7, 8, 8, 8.

The distribution of children in the 30 families would be:

No. of children   2   3   4   5   6   7   8   Total
Frequency (f_i)   5   7   8   4   1   2   3   30

3. Continuous Frequency Distribution
- Continuous frequency distributions arise from continuous variables.
- When the range of the data is large, the data must be grouped into classes that are more than one unit in width.

Basic terms in a continuous frequency distribution
- Class frequency (or frequency): the number of items belonging to a class.
- Class limits (C.L.): each class has two limits:
  i) Lower class limit (LCL)
  ii) Upper class limit (UCL)

Example: Consider the marks of 40 students (out of 60) given below.
$LCL_1 = 6$, $LCL_2 = 12$, …, $LCL_6 = 36$
$UCL_1 = 11$, $UCL_2 = 17$, …, $UCL_6 = 41$

Table 1: The marks of 40 students out of 60

Class limits   No. of students   Class boundary   Class mark   Relative frequency   % R.F   L.C.F   M.C.F
(mark)         (f_i)             (C.B)            (C.M)        (R.F)
6 - 11         2                 5.5 - 11.5       8.5          0.05                 5       2       40
12 - 17        4                 11.5 - 17.5      14.5         0.1                  10      6       38
18 - 23        10                17.5 - 23.5      20.5         0.25                 25      16      34
24 - 29        16                23.5 - 29.5      26.5         0.4                  40      32      24
30 - 35        5                 29.5 - 35.5      32.5         0.125                12.5    37      8
36 - 41        3                 35.5 - 41.5      38.5         0.075                7.5     40      3
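The derived columns of the table of 40 students' marks follow mechanically from the class limits and the frequencies. The sketch below is a minimal Python illustration and is not part of the original slides; the variable names are made up for the example, and it assumes that the unit of measure is the gap between the first two classes and that all classes share it.

```python
# Class limits and frequencies copied from the table of 40 students' marks.
class_limits = [(6, 11), (12, 17), (18, 23), (24, 29), (30, 35), (36, 41)]
frequencies = [2, 4, 10, 16, 5, 3]

n = sum(frequencies)

# Unit of measure: lower limit of the second class minus upper limit of the first.
unit = class_limits[1][0] - class_limits[0][1]

rows = []
running_total = 0
for (lcl, ucl), f in zip(class_limits, frequencies):
    running_total += f
    rows.append({
        "limits": (lcl, ucl),
        "f": f,
        "boundaries": (lcl - unit / 2, ucl + unit / 2),  # (LCB, UCB)
        "mark": (lcl + ucl) / 2,                         # class mark
        "rf": f / n,                                     # relative frequency
        "pct": 100 * f / n,                              # percentage frequency
        "lcf": running_total,                            # this class and all before it
        "mcf": n - running_total + f,                    # this class and all after it
    })

for row in rows:
    print(row)
```

Each dictionary in `rows` corresponds to one row of the table, so the printed values can be checked against it column by column.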
Unit of Measure
- Unit of measure (U): the difference between the lower class limit of a class and the upper class limit of the class before it.
  $U = LCL_{i+1} - UCL_i$
e.g. From the above example, $LCL_2 = 12$ and $UCL_1 = 11$, so
  $U = LCL_2 - UCL_1 = 12 - 11 = 1$

Class Boundary (C.B)
- Add half the unit of measure to each upper class limit to get the upper class boundary (UCB).
- Subtract half the unit of measure from each lower class limit to get the lower class boundary (LCB).
That is,
  $LCB_i = LCL_i - \frac{U}{2}$,  $UCB_i = UCL_i + \frac{U}{2}$
e.g. Using the above example, find the lower and upper class boundaries.
  $LCB_1 = 5.5$, $LCB_2 = 11.5$, …, $LCB_6 = 35.5$
  $UCB_1 = 11.5$, $UCB_2 = 17.5$, …, $UCB_6 = 41.5$

Class Mark (C.M): the mid-point of a class interval. It is obtained as:
  $C.M_i = \frac{LCL_i + UCL_i}{2}$  or  $C.M_i = \frac{LCB_i + UCB_i}{2}$
e.g. Using the above example, find the class marks.
  $C.M_1 = 8.5$, $C.M_2 = 14.5$, $C.M_3 = 20.5$, $C.M_4 = 26.5$, $C.M_5 = 32.5$, $C.M_6 = 38.5$

Class Width (w): the difference between the lower class limits (or the upper class limits, lower class boundaries, upper class boundaries, or class marks) of any two successive classes. That is:
  $w = LCL_{i+1} - LCL_i$  or  $w = UCL_{i+1} - UCL_i$  or
  $w = LCB_{i+1} - LCB_i$  or  $w = UCB_{i+1} - UCB_i$  or
  $w = C.M_{i+1} - C.M_i$
e.g. Calculate the class width of the above example.
Solution: $w = LCL_{i+1} - LCL_i = LCL_2 - LCL_1 = 12 - 6 = 6$

Relative and Percentage Frequency Distribution

Relative frequency (R.F): the number of objects/cases in a category divided by the total number of objects. It gives the proportion of the total that falls in each category.
  $R.F = \frac{\text{class frequency}}{\text{total frequency}} = \frac{f_i}{n}$

Percentage frequency distribution:
  $\% R.F = \frac{f_i}{n} \times 100\%$

Example: See the R.F and % R.F columns of the table of marks of 40 students.

Cumulative Frequency Distribution
1. Less than cumulative frequency (L.C.F): obtained by adding the frequencies of all the preceding classes to the frequency of that class. In other words, L.C.F is the total number of observations less than the UCB of that class.
2. More than cumulative frequency (M.C.F): obtained by adding the frequencies of all the succeeding classes to the frequency of that class. M.C.F is the total number of observations greater than the LCB of that class.

Example: Using the above example, find the L.C.F and M.C.F (see the L.C.F and M.C.F columns of the same table).

Constructing a continuous frequency distribution

Practical steps in constructing a continuous frequency distribution:

1. Determine the number of classes (k)
   Using Sturges' rule of thumb: $k = 1 + 3.322 \log n$, where k is the number of classes, log is the common (base-10) logarithm and n is the total number of observations in the sample.

2. Determine the class width (w)
   $w = \frac{\text{Range}}{k}$, rounded to the nearest integer, where
   Range = largest value - smallest value, that is, $R = L - S$.

3. Determine the class limits
   - The lower class limit of the first class should be less than or equal to the smallest value among the observations collected from the field.
   - Add the class width to the lower class limit to obtain the lower class limit of the next higher class.
   - Subtract the unit of measure from the 2nd LCL to obtain the 1st UCL.
   - Then add the class width to the UCL to obtain the upper class limit of the next higher class.

Example: Construct a continuous frequency distribution for the following raw data on marks (out of 100) obtained by 50 students in a Statistics course.
57, 53, 65, 55, 50, 45, 64, 52, 15, 46, 42, 63, 33, 64, 53, 25, 54, 35, 48, 55, 70, 47, 39, 58, 52, 36, 65, 75, 26, 20, 55, 60, 83, 61, 45, 63, 49, 42, 35, 18, 51, 45, 42, 65, 39, 59, 45, 41, 30, 40.

Solution: n = 50, L = 83, S = 15. Then
- $k = 1 + 3.322 \log n = 1 + 3.322 \log 50 = 6.64 \approx 7$
- $R = L - S = 83 - 15 = 68$
- $w = R/k = 68/7 = 9.71 \approx 10$

Table 2: The marks of 50 students (out of 100) obtained in Statistics course.

Class limits   15-24   25-34   35-44   45-54   55-64   65-74   75-84   Total
f_i            3       4       10      15      12      4       2       50
C.B
C.M
R.F
L.C.F
M.C.F

Diagrammatic and Graphical Methods of Data Presentation

i) Diagrammatic Presentation of Data
- It is usually used to present qualitative and discrete data.
- The common diagrammatic presentations of data are:
  1. Bar chart
     i) Simple bar chart
     ii) Component (subdivided) bar chart
  2. Pie chart

Bar chart: used to represent and compare the frequency distributions of discrete data and of attributes or categorical data. Bars can be drawn either vertically or horizontally.

- Simple bar chart: used to display data on one variable.
  - The bars are thick lines (narrow rectangles) of the same breadth.
  - It is used for aggregate data.
- Component bar chart: used when there is a desire to show how a total (or aggregate) is divided into its component parts.

Example: The number of students in the four departments of the Science College is given as follows. Draw simple and component bar charts.
Department           Physics   Maths   Chemistry   Biology
Number of students   200       400     450         600
Male                 170       350     250         200
Female               30        50      200         400

[Figure: simple bar chart of the number of students (frequency) by department, and sub-divided bar chart of male and female students (frequency) by department.]

Pie Chart

Pie chart: a circle that is divided into sections or wedges according to the percentage of frequencies in each category of the distribution.

Example: Draw a pie chart to represent the following population data of a town.

Men    Women   Girls   Boys
2500   2000    4000    1500

Solution: First find the percentage of each class:
  $\% = \frac{f_i}{n} \times 100\%$

Class    Frequency (f_i)   Percentage
Men      2500              25%
Women    2000              20%
Girls    4000              40%
Boys     1500              15%

[Figure: pie chart with sectors Men 25%, Women 20%, Girls 40%, Boys 15%.]
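The percentage step in the pie chart solution can be checked with a few lines of code. The sketch below is a Python illustration that is not part of the original slides. The helper name `pie_shares` is hypothetical, and the central angle of each sector (the same share applied to 360 degrees) is an addition that the slides do not compute.

```python
# Population data of the town, copied from the example.
population = {"Men": 2500, "Women": 2000, "Girls": 4000, "Boys": 1500}


def pie_shares(counts):
    """Return {category: (percentage, central angle in degrees)}."""
    n = sum(counts.values())
    return {name: (100 * f / n, 360 * f / n) for name, f in counts.items()}


shares = pie_shares(population)
for name, (pct, angle) in shares.items():
    print(f"{name:<6} {pct:5.1f}%  {angle:6.1f} degrees")
```

The percentages should match the table in the solution, and the angles are what one would measure with a protractor when drawing the chart by hand.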
Advantages - a pie chart can:
- display relative proportions of multiple classes of data
- show areas proportional to the number of data points in each category
- summarize a large data set in visual form
- be visually simpler than other types of graphs
- permit a visual check of the reasonableness or accuracy of calculations

Disadvantages - a pie chart can:
- reveal little about central tendency, dispersion, skew, or kurtosis
- be easily manipulated to yield false impressions

Exercise
Construct a sub-divided bar chart for the four types of products in relation to the opinions of consumers purchasing them, as given below:

Products    Definitely   Probably   Unsure   No
Product 1   50%          40%        10%      2%
Product 2   60%          30%        12%      15%
Product 3   70%          45%        8%       8%
Product 4   60%          35%        5%       20%

Graphical Presentation of Data
- Graphical presentation is used to present continuous data.
- The common graphical presentations of data are:
  1. Histogram
  2. Frequency polygon
  3. Cumulative frequency curves (ogives)
     - Less than ogive (less than cumulative frequency curve)
     - More than ogive (more than cumulative frequency curve)

Histogram
To construct a histogram:
- the class boundaries or the class marks are plotted on the horizontal axis, and
- the class frequencies are plotted on the vertical axis.
Example: Draw a histogram for the marks of 50 students (out of 100) obtained in Statistics course (Table 2).

Frequency Polygon
A frequency polygon is a line graph in which class frequencies are plotted against the class marks and the successive points are connected by straight lines.
Example: Draw a frequency polygon for the marks of 50 students (out of 100) obtained in Statistics course (Table 2).

Cumulative frequency curves (Ogives)
To draw a less than ogive, the less than cumulative frequencies are plotted against the upper class boundaries of their respective classes; to draw a more than ogive, the more than cumulative frequencies are plotted against the lower class boundaries. In both cases the points are joined by either straight lines or a smooth curve.
Example: Draw cumulative frequency curves for the marks of 50 students (out of 100) obtained in Statistics course (Table 2).

Exercise
Draw (a) a histogram, (b) a frequency polygon and (c) an ogive for the following frequency distribution of grades in a final examination in Introduction to Statistics.
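Returning to the worked example of the 50 students' marks, the construction steps and the points needed for the frequency polygon and the ogives can be reproduced in code. The following is a sketch in Python and is not part of the original slides. It assumes, as in the solution, that the first lower class limit is the smallest mark and that the unit of measure is 1; all names are illustrative.

```python
import math
from itertools import accumulate

# Raw marks (out of 100) of the 50 students, copied from the example.
marks = [
    57, 53, 65, 55, 50, 45, 64, 52, 15, 46, 42, 63, 33, 64, 53, 25, 54,
    35, 48, 55, 70, 47, 39, 58, 52, 36, 65, 75, 26, 20, 55, 60, 83, 61,
    45, 63, 49, 42, 35, 18, 51, 45, 42, 65, 39, 59, 45, 41, 30, 40,
]

n = len(marks)
largest, smallest = max(marks), min(marks)

# Step 1: number of classes by Sturges' rule, rounded to the nearest integer.
k = round(1 + 3.322 * math.log10(n))
# Step 2: class width = range / k, rounded to the nearest integer.
w = round((largest - smallest) / k)
# Assumption: marks are whole numbers, so the unit of measure is 1.
unit = 1

# Step 3: class limits, starting from the smallest observation.
classes = [(smallest + i * w, smallest + (i + 1) * w - unit) for i in range(k)]

# Tally the marks falling inside each pair of class limits.
freqs = [sum(lcl <= m <= ucl for m in marks) for lcl, ucl in classes]

class_marks = [(lcl + ucl) / 2 for lcl, ucl in classes]
lcf = list(accumulate(freqs))                   # less than cumulative frequency
mcf = list(accumulate(reversed(freqs)))[::-1]   # more than cumulative frequency

# Points to plot: (x, y) pairs for each graph.
polygon_points = list(zip(class_marks, freqs))
less_than_ogive = [(ucl + unit / 2, c) for (_, ucl), c in zip(classes, lcf)]
more_than_ogive = [(lcl - unit / 2, c) for (lcl, _), c in zip(classes, mcf)]

print(classes)
print(freqs)
print(less_than_ogive)
print(more_than_ogive)
```

The lists of points are deliberately left as plain data so that they can be plotted by hand or passed to any plotting tool.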
