Business Statistics Notes PDF
Amity Directorate of Distance & Online Education
Summary
These notes from the Amity Directorate of Distance & Online Education provide an introduction to business statistics, including definitions, functions, limitations, and methods of data collection. It also categorizes the study of statistics into descriptive and inferential statistics.
Business Statistics
Module-1: Introduction to Statistics

Objectives
1. To introduce the limitations, applications and functions of statistics
2. To discuss data collection and presentation techniques

Outcomes
1. The learner will be able to utilize the knowledge of statistics in answering statistical questions

According to A. M. Tuttle, “Statistics are measurements, enumerations or estimates of natural phenomenon, usually systematically arranged, analyzed and presented as to exhibit important interrelationships among them.”

1.1 Introduction
The word Statistics is derived from the Italian word ‘Stato’, which means ‘state’; the word ‘Statista’ refers to a person who is involved with the affairs of state. Thus, statistics originally meant the collection of facts useful for affairs of the state, such as taxes, land records, population demography, etc. There is evidence of the use of some of the principles of statistics by ancient Indian civilization as well. Some of the techniques find mention in Vedic Mathematics. However, the modern statistical methods spread from Italy to France, Holland and Germany in the 16th century.

Definitions of Statistics
The definitions of statistics are as follows:
“Statistics are the classified facts representing the conditions of the people in the state. Specially those facts which can be stated in number or in table of numbers or in any tabular or classified arrangement.” – Webster
“Statistics may be defined as the science of collection, presentation, analysis and interpretation of data.” – Croxton and Cowden
“By statistics we mean quantitative data affected to a marked extent by multiplicity of causes.” – Yule and Kendall

Functions of Statistics
1. Condensation: Statistics compresses a mass of figures into a small amount of meaningful information, for example, average sales, the BSE index, the growth rate, etc.
It is impossible to get a precise idea about the profitability of a business from a mere record of income and expenditure transactions. Information such as Return On Investment (ROI), Earnings Per Share (EPS), profit margins, etc., however, can be easily remembered, understood and thus used in decision-making.
2. Forecast: Statistics helps in forecasting by analyzing trends, which are essential for planning and decision-making. Predictions based on gut feeling or hunch can be harmful for the business. For example, to decide the refining capacity for a petrochemical plant, it is necessary to predict the demand for the petrochemical product mix, the supply of crude oil, the cost of crude, substitution products, etc., for the next 10 to 20 years, before committing an investment.
3. Testing of hypotheses: Hypotheses are statements about population parameters based on past knowledge or information. They must be checked for validity in the light of current information. Inductive inference about the population based on sample estimates involves an element of risk. However, sampling keeps the decision-making costs low. Statistics provides a quantitative base for testing our beliefs about the population.
4. Relationship between Facts: Statistical methods are used to investigate the cause and effect relationship between two or more facts. The relationship between demand and supply, or between money supply and price level, can be best understood with the help of statistical methods.
5. Expectation: Statistics provides the basic building block for framing suitable policies. For example, how much raw material should be imported, how much capacity should be installed, or how much manpower should be recruited depends upon the expected value of the outcome of our present decisions.

1.2 Limitations of Statistics
Statistical techniques, because of their flexibility, have become popular and are used in numerous fields. But statistics is not a cure-all technique and has a few limitations. It cannot be applied to all kinds of situations and cannot be made to answer all queries. The major limitations are:
1. Statistics deals only with those problems which can be expressed in quantitative terms and are amenable to mathematical and numerical analysis. It is not suitable for qualitative data such as customer loyalty, employee integrity, emotional bonding, motivation, etc.
2. Statistics deals only with collections of data, and no importance is attached to an individual item.
3. Statistical results are only approximations and are not mathematically exact. There is always a possibility of random error.
4. Statistics, if used wrongly, can lead to misleading conclusions, and therefore should be used only after a complete understanding of the process and the conceptual base.
5. Statistical laws are not exact laws and are liable to be misused.
6. The greatest limitation is that statistical data can be used properly only by a professional. Only a person with thorough knowledge of the methods of statistics and proper training can draw conclusions from it.
7. If statistical data are not uniform and homogeneous, then the study of the problem is not possible. Homogeneity of data is essential for a proper study.
8. Statistical methods are not the only methods for studying a problem. There are other methods as well, and a problem can be studied in various ways.

1.3 Data Collection
The collection and analysis of data constitute the primary stages of execution of any statistical investigation. The procedure for collection of data depends upon various considerations such as the scope, objective, nature of investigation, etc.
Availability of resources such as time, money, manpower, etc., also affects the choice of procedure. Data may be collected either from a primary or from a secondary source, which are described below.

Types of Data – Primary and Secondary
The data used in a statistical study is termed either ‘primary’ or ‘secondary’, depending upon whether it was collected specifically for the study being undertaken or for some other purpose.
When the data used in a statistical study is collected under the control and supervision of the investigator, it is referred to as ‘primary data’. Primary data is collected afresh and for the first time, and thus happens to be original in character. On the other hand, when the data is not collected for this purpose but is derived from other sources, it is referred to as ‘secondary data’. Often, secondary data is collected by some other organization to satisfy its own needs, but is used by someone else for entirely different reasons.
The difference between primary and secondary data is only one of degree. Data which is primary in the hands of one becomes secondary in the hands of another. Suppose an investigator wants to study the working conditions of labourers in an industry. If the investigator or their agent collects the data directly, then it is called ‘primary data’. But if subsequently someone else uses this collected data for some other purpose, then it becomes ‘secondary data’.

1.4 Types of Statistics
The study of statistics can be categorized into two main branches: descriptive statistics and inferential statistics.
Descriptive statistics is used to summarize and graph the data for a chosen group. This helps in understanding a particular collection of observations. Descriptive statistics describe a sample. There is no uncertainty in the summary numbers, since they describe only the individuals or things that were actually measured.
Descriptive statistics give information that describes the data in some manner. For example, suppose a pet shop sells cats, dogs, birds and fish. If 100 pets are sold, and 35 out of the 100 were dogs, then one description of the data on the pets sold would be that 35% were dogs.
Inferential statistics are techniques that allow us to use samples to generalize about the populations from which the samples were taken. Hence, it is crucial that the sample represents the population accurately. The method of obtaining such a sample is called sampling. Since inferential statistics aim at drawing conclusions from a sample and generalizing them to a population, we need to be sure that our sample accurately represents the population. This requirement affects our process. At a broad level, we must do the following:
Define the population we are studying.
Draw a representative sample from that population.
Use analyses that incorporate the sampling error.

1.5 Methods of Collecting Data
Methods of collecting primary data
Generally, for managerial decision-making, it is necessary to analyze information regarding a large number of characteristics. Collection of primary data can be time consuming and expensive, and hence requires a great deal of deliberation. According to the nature of the information required, one of the following methods or a combination of them can be selected.
1. Observation Method: In this method, the investigator collects the data through personal observations. This method is very useful if data is created in the system through capturing transactions. Computerized transaction processing can be modified to generate the necessary data or information. An investigator well versed with the system or a part of the system is ideally suited for collecting this kind of data.
Since the investigator is solely involved in collecting the data, their training, knowledge and skills play an important role as far as the quality of the data is concerned. Sometimes, audio/video aids can also be used to record the observations.
2. Indirect Investigation: In this case, the information collected by oral or written interrogation forms the primary data. Usually enquiry commissions, boards of investigation, investigation teams and committees collect data in this manner. The quality of the data largely depends upon the person interviewed, their motives, memory and overall cooperation, and the interviewer’s repute with the person being interviewed.
3. Questionnaire with Personal Interview: This is the most common and popular method for data collection. In this method, individuals are personally interviewed and their answers are recorded to collect the data. The questionnaire is structured and followed in a specific sequence. Occasionally, a part of the questionnaire may be unstructured to motivate the interviewee to give additional information or information on intimate matters. Accuracy of the data depends on the ability, sincerity and tactfulness of the interviewer in conducting the interview in a friendly and professional environment.
4. Mailed Questionnaire: In this method, the structured questionnaire is mailed to selected people with a request to fill it in and return it. Along with the questions, supplementary information clarifying terms, explaining the process, etc., is also attached. In a few cases, inducements for filling in and returning the questionnaire are also given. To build rapport, a covering letter with the questionnaire is necessary to explain the reason for the data collection and to alleviate the respondent’s fears, if any. The respondents are assumed to be literate and able to answer the questions without any confusion.
This is a less expensive and faster method to collect a large volume of data, over a wide geographic area, in a standard form, and at the convenience of the respondent. Hence this method is popular and extensively used. However, this method needs a guard against two drawbacks, viz. the absence of an interviewer, which results in a large proportion of non-response, and the possibility of reduced reliability of the replies if the respondent is not sufficiently motivated. These shortcomings can be overcome by increasing the sample size and designing the questionnaire comprehensively.
5. Telephonic Interview: This method is less expensive but is limited in scope, as the respondent must possess a telephone and have it listed. Further, the respondent must be available and in the frame of mind to provide correct answers. This method is comparatively less reliable for public surveys. However, for industrial surveys, in developed regions, and with known customers, this method is best suited. There is a limit to the number of questions that the interviewee can answer in three to four minutes. The method is efficient if there are just three to five yes/no type questions and two to three short questions.
6. Internet Surveys: Of late, Internet surveys have become popular. These are less expensive, fast and can be interactive. However, their scope is limited to those who have regular Internet access. With rapid growth in technology and Internet connectivity, this is expected to become one of the main methods of collecting primary data. With its interactivity and multimedia facilities, it also combines the advantages of other methods.

Methods of Collecting Secondary Data
Secondary data is data that has been collected or analyzed by some other agency for another purpose. Sources of secondary data are -
1. Publications of central, state and local governments.
This is an important and reliable source of unbiased data.
2. Publications of foreign governments or of international bodies. Although this is a good source, the context under which the data was collected needs to be verified before using it. For international situations this data could be very useful and authentic.
3. Journals of trade, commerce, economics, science, engineering, medicine, etc. This data could be very reliable for a specific purpose.
4. Other published sources like books, magazines, reports, newspapers, etc.
5. Unpublished data, based on internal records and documents of an organization, can provide the most authentic and much cheaper information, provided we can identify the source.
6. Diaries, letters and mailers can also provide secondary data. The problem with unpublished data is that it is difficult to locate and to get access to.

Applications of Statistics
Data is a collection of any number of related observations. We can collect the number of telephones installed in a given day by several workers, or the number of telephones installed per day over a period of several days by one worker, and call the results our data. A collection of data is called a data set and a single observation is called a data point.
Data is used everywhere in day to day life. It is applicable in a very large number of fields such as economics, management, sociology, anthropology, agriculture, medicine, psychology and education. All these fields lean heavily on data and its analysis. The application of data is so vast and ever expanding that it is very difficult to define. Its use has permeated almost every facet of our lives.

Application of Statistics in Business Decision
Statistics is not restricted to information about the State; it also extends to almost every realm of business.
Statistics is about scientific methods to gather, organize, summarize and analyze data. More important still is to draw valid conclusions and make effective decisions based on such analysis. To a large degree, company performance depends on the precision and accuracy of its forecasts. Statistics is an indispensable instrument for manufacturing control and market research. Statistical tools are extensively used in business for time and motion study, consumer behaviour study, investment decisions, credit ratings, performance measurement and compensation, inventory management, accounting, quality control, distribution channel design, etc.
For managers, therefore, understanding statistical concepts and knowing how to use statistical tools is essential. With the increase in company size and in market uncertainty due to increased competition, the need for statistical knowledge and statistical analysis of various business circumstances has greatly increased. Earlier, when businesses were small and without much complexity, a single person, usually the owner or manager of the firm, used to take all decisions regarding the business. Example: A manager used to decide from where the necessary raw materials and other factors of production were to be acquired, how much output would be produced, where it would be sold, etc. This type of decision making was usually based on the experience and expectations of this single individual and as such had no scientific basis.

1.6 Classification of Data
Classification refers to the grouping of data into homogeneous classes and categories. It is the process of arranging things in groups or classes as per their resemblances and affinities.

Rules of Classification - The principal rules of classifying data are:
1. To condense the mass of data in such a way that salient features can be readily noticed; for example, household incomes can be grouped as higher income group, middle-income group and lower income group based on a certain criterion.
2. To facilitate comparison between attributes of variables; for example, comparison between education and income, income and expenditure on consumer durables, etc.
3. To prepare the data for tabulation.
4. To highlight the significant features of the data; for example, data is concentrated on one side, or one particular value may be dominant.
5. To enable grasping of data.
6. To study the relationship formed.

Bases of Classification
Some common types of bases of classification are:
1. Geographical classification: In this type, data is classified according to area or region. For example, state wise industrial production, city wise consumer behaviour, area wise sales figures, etc.
2. Chronological classification: In this type, data is classified according to the time of its occurrence; for example, monthly sales, daily demand, yearly production, etc.
3. Qualitative classification: When the data is classified according to some attributes which are not capable of measurement, it is known as qualitative classification. In dichotomous classification, an attribute is divided into two classes, one possessing the attribute and the other not possessing it; for example, smoker and non-smoker, employed and unemployed, etc. In many-fold classification, the attribute is divided so as to form several classes, like education level, religion, mother tongue, etc.
4. Quantitative classification (classification of data according to characteristics): It refers to the classification of data according to some characteristics which can be measured; for example, age, salary, height, etc. Quantitative data may be further classified into two types, namely discrete and continuous.
In the discrete case, the values the variable can take are countable (the set may be infinitely large, for example, the integers). Examples are the number of accidents, number of defectives, etc. In the case of continuous quantities, data can take any real value; for example, weight, height, distance, volume, etc.

1.7 Methods of Representation of Data
One of the most convincing and appealing ways in which statistical results may be represented is through graphs and diagrams. Diagrams and graphs are extensively used for the following reasons:
(i) Diagrams and graphs are attractive to the eye.
(ii) They are easier to remember.
(iii) They facilitate easy comparison of data from one period to another.
(iv) Diagrams and graphs give a bird’s eye view of the entire data; therefore, they convey meaning very quickly.

a. Bar Diagram
In a bar diagram, only the length of the bar is taken into account, not the width. In other words, a bar is a thick line whose width is shown merely for appearance; only its length carries information, and so it is called a one-dimensional diagram.

Simple Bar Diagram
It represents only one variable. Since the bars are of the same width and vary only in length (height), a comparative study becomes very easy. Simple bar diagrams are very popular in practice. A bar chart can be either vertical or horizontal; for example, sales, production, population figures, etc. for various years may be shown by simple bar charts.

Illustration - 1
The following table gives the birth rate per thousand of different countries over a certain period of time.

Country       India   Germany   U. K.   New Zealand   Sweden   China
Birth Rate    33      16        20      30            15       40

[Figure: Simple Bar Diagram; x-axis: Countries; y-axis: Birth Rate]

Comparing the size of the bars, China’s birth rate is the highest and India’s is next, whereas Germany and Sweden are in the lowest positions.

Illustration 2 - Represent the data by using a simple bar diagram.

Countries                          A    B    C    D    E    F
Production of Rice (000’s tons)    38   42   29   28   18   11

[Figure: simple bar diagram, Production of Rice (000’s tons), for countries A to F]

Sub-divided Bar Diagram
In a subdivided bar diagram, each bar representing the magnitude of a given value is further subdivided into various components. Each component occupies a part of the bar proportional to its share in the total.

Illustration - 1
Present the following data in a sub-divided bar diagram.

Year/Faculty   Science   Humanities   Commerce
2014-2015      240       560          220
2015-2016      280       610          280

[Figure: sub-divided bar diagram for 2014-15 and 2015-16; scale: 1 cm = 200; index: Science, Humanities, Commerce]

Illustration – 2
The numbers of students in University X during 2008 to 2011 are as follows. Represent the data by a similar diagram.

Year          Arts     Commerce   Science   Total
2008 - 2009   20,000   10,000     5,000     35,000
2009 - 2010   26,000   9,000      7,000     42,000
2010 - 2011   31,000   9,500      7,500     48,000

[Figure: sub-divided bar diagram; x-axis: Years; y-axis: Number of students; components: Arts, Commerce, Science]

Multiple Bar Diagram
In a multiple bar diagram, two or more sets of related data are represented and the components are shown as separate adjoining bars.
The height of each bar represents the actual value of the component. The components are shown by different shades or colours.

Illustration 1 - Construct a suitable bar diagram for the following data on the number of students in two different colleges in different faculties.

College   Arts   Science   Commerce   Total
A         1200   800       600        2600
B         700    500       600        1800

Fig: A multiple bar diagram showing numbers of students in two different colleges in different departments.

Illustration 2
Represent the following data on the results of the III semester B.B.A. examination of Mangalore University, held in May 2006, 2007 and 2008, in a multiple bar diagram.

Year   Class I   Class II   Class III   Failed
2006   100       300        500         300
2007   120       400        600         280
2008   100       500        700         300

[Figure: multiple bar diagram for 2006, 2007 and 2008; legend: Ist Class, IInd Class, IIIrd Class, Failed]

Percentage Bar Diagram
In a percentage bar diagram, the length of the entire bar is kept equal to 100 (hundred). The segments of each bar may vary, and each represents a percentage of the aggregate.

Illustration 1

Year   Men   Women   Children
1995   45%   35%     20%
1996   44%   34%     22%
1997   48%   36%     16%

1.8 Line Graph
A line graph is a type of chart used to show information changing over time. A line graph is plotted using multiple points connected by straight lines. It is also known as a line chart. The line graph consists of two axes, the ‘x’ axis and the ‘y’ axis.
The horizontal axis is known as the x axis.
The vertical axis is known as the y axis.

Plotting a line graph
Plotting a line graph is easy. There are simple steps to consider while plotting a line graph.
Draw the x-axis and y-axis on the graph paper. Make sure to write the title above the graph so that it states the purpose of the graph.
For instance, if one of the factors is time, it goes on the horizontal axis, referred to as the x-axis. The other factor would then go on the vertical axis, which is known as the y-axis.
Both axes are to be labeled as per their respective factors. For example, the x axis can be labeled as time or day.
Afterward, with the help of the given data, the exact values can be plotted on the graph. Once the points are joined, a clear inference about the trend can be made.

1.9 Pie Chart
A pie chart or a circle chart is a circular statistical graphic that is divided into slices to illustrate a numerical proportion. In a pie chart, the arc length of each slice is proportional to the quantity it represents. While it is named for its resemblance to a pie which has been sliced, there are variations on the way it can be presented. In a pie chart, categories of data are represented by wedges in the circle and are proportional in size to the percent of individuals in each category.
Pie charts are very widely used in the business world and the mass media. Pie charts are generally used to show percentage or proportional data, and usually the percentage represented by each category is provided next to the corresponding slice of pie. Pie charts are good for displaying data for around six categories or fewer.

Example:
Show the following data on the expenditure of an average working class family by a suitable diagram.

Item of Expenditure   Percent of Total Expenditure
Food                  65
Clothing              10
Housing               12
Fuel and Lighting     5
Miscellaneous         8

Solution:
The angles of the different sectors are calculated as shown below:
1. Food = 65/100 x 360 = 234°
2. Clothing = 10/100 x 360 = 36°
3. Housing = 12/100 x 360 = 43.2°
4. Fuel and Lighting = 5/100 x 360 = 18°
5. Miscellaneous = 8/100 x 360 = 28.8°

[Figure: Pie Chart of the expenditure items]

1.10 Frequency Distribution
A classification of data that shows the different values of a variable and their respective frequencies of occurrence is called a frequency distribution of the values.
There are two kinds of frequency distributions, namely, discrete frequency distribution (or simple, or ungrouped frequency distribution), and continuous frequency distribution (or condensed, or grouped frequency distribution).

a. Discrete Frequency Distribution
The process of preparing a discrete frequency distribution is simple. First, all the possible values of the variable are arranged in ascending order in a column. Then another column of ‘Tally’ marks is prepared to count the number of times a particular value of the variable is repeated. To facilitate counting, tally marks are grouped in blocks of five. The last column contains the frequency. To illustrate this, let us consider one example.

Example:
Construct a frequency distribution table for the following data on the number of family members in 30 families:

4 3 2 3 4 5 5 7 3 2
3 4 2 1 1 6 3 4 5 4
2 7 3 4 5 6 2 1 5 3

In the table below, each ‘||||’ stands for a complete block of five tally marks.

Number of Family Members   ‘Tally Marks’   Frequency
1                          |||             3
2                          ||||            5
3                          |||| ||         7
4                          |||| |          6
5                          ||||            5
6                          ||              2
7                          ||              2
Total                                      N = 30

b. Continuous Frequency Distribution
For continuous data a ‘grouped frequency distribution’ is necessary. For discrete data, a discrete frequency distribution is better than an array, but it does not condense the data. A ‘grouped frequency distribution’ is useful for condensing discrete data by putting them into smaller groups or classes called class-intervals. Some important terms used in the case of continuous frequency distributions are as follows:
1. Class limits: Class limits denote the lowest and highest values which can be included in the class. The two boundaries of a class are known as the lower limit and upper limit of the class. For example, in 10-18, 20-28, the values 10 and 18 are the limits of the first class; 20 and 28 are the limits of the second class, etc.
2. Class intervals: The class interval represents the width, the span or the size of a class. The width may be determined by subtracting the lower limit of one class from the lower limit of the following class. For example, classes 10-15, 15-20, etc. have class interval 15 – 10 = 5.
3. Class frequency: The number of observations falling within a particular class is known as its class frequency. Total frequency indicates the total number of observations, N = Σf.
4. Mid-point of a class: This is defined as the sum of two successive lower limits divided by two. Thus the class mark is the value lying halfway between the lower and upper class limits. For example, classes 10-20, 20-30, etc. have class marks 15, 25, etc.
5. Types of class intervals: There are many different ways in which the limits of class intervals can be shown.
6. Exclusive method: In this method, the class intervals are so arranged that the upper limit of one class is the lower limit of the next class. This method always presumes that the upper limit is excluded from the class; for example, with class limits 20-25, 25-30, an observation with value 25 is included in class 25-30.
7. Inclusive method: In this method, the upper limit of the class is included in the same class itself. In such a case there is no overlap of the upper limit of the former class and the lower limit of the successive class. For example, with class limits 20-29.5, 30-39.5, 40-49.5, etc. there is no ambiguity, but values from 29.5 to 30 or 39.5 to 40, etc. are not allowed.
8. Open end: In an open-end distribution, the lower limit of the very first class or the upper limit of the last class is not given.
For example, while stating the distribution of monthly salary of managers in rupees, one may specify class limits as below 10000, 10000-15000, 15000-20000, 20000-25000, above 25000. Similarly, while recording weights of college students in kg as grouped data, the class intervals could be less than 40, 40 to 50, 50 to 60, 60 to 70, 70 to 80, greater than 80.
9. Unequal class interval: This method is used where the width of the classes is not equal for all classes. It is of practical use when there are large gaps in the data, or the distribution of the data is uneven. It is used for explaining, visualizing and plotting data with unequal class intervals. However, we must adjust the formulae for calculations accordingly.

Cumulative and Relative Frequency
In many situations, rather than listing the actual frequency opposite each class, it may be appropriate to list either cumulative frequencies or relative frequencies or both.

Cumulative Frequencies
The cumulative frequency of a given class interval represents the total of all the previous class frequencies, including that of the class against which it is written.

Relative Frequencies
Relative frequency is obtained by dividing the frequency of each class by the total number of observations, i.e. the total frequency. If the relative frequency is multiplied by 100, we get the percentage frequency.
There are two important advantages in looking at relative frequencies (percentages) instead of the absolute frequencies in a frequency distribution. These are:
Relative frequencies facilitate the comparison of two or more sets of data.
Relative frequencies constitute the basis of understanding the probability concept.

Example: The ages of 50 employees are given. Find the cumulative frequency, relative frequency and percentage frequency.
22 21 37 33 28 42 56 33 32 59
40 47 29 65 45 48 55 43 42 40
37 39 56 54 38 49 60 37 28 27
32 33 47 36 35 42 43 55 53 48
29 30 32 37 43 54 55 47 38 62

Class Interval   Class Frequency   Cumulative Frequency   Relative Frequency   Percentage Frequency
20-30            7                 (0+7) = 7              7/50 = 0.14          14
30-40            16                (7+16) = 23            16/50 = 0.32         32
40-50            15                (23+15) = 38           15/50 = 0.30         30
50-60            9                 (38+9) = 47            9/50 = 0.18          18
60-70            3                 (47+3) = 50            3/50 = 0.06          6
                 N = Σf = 50                              Total = 1            Total = 100

A frequency distribution is constructed to satisfy three objectives: (i) to facilitate the analysis of data, (ii) to estimate frequencies of the unknown population distribution from the distribution of sample data, and (iii) to facilitate the computation of various statistical measures.
1.11 Histogram
A histogram consists of contiguous boxes and has both a horizontal axis and a vertical axis. The horizontal axis is labeled with what the data represents (for instance, distance from your home to school). The vertical axis is labeled either frequency or relative frequency. The graph will have the same shape with either label. The histogram (like the stemplot) can give you the shape of the data, the center, and the spread of the data. (The next section tells you how to calculate the center and the spread.)
The relative frequency is equal to the frequency for an observed value of the data divided by the total number of data values in the sample. (In the chapter on Sampling and Data (Section 1.1), we defined frequency as the number of times an answer occurs.)
RF = f/n
where f is the frequency, n is the total number of data values (or the sum of the individual frequencies), and RF is the relative frequency.
Example: If 3 students in a Mathematics class of 40 students received from 90% to 100%, then f = 3, n = 40 and
RF = f/n = 3/40 = 0.075
Seven and a half percent of the students received 90% to 100%.
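The employee-age table and the formula RF = f/n above can be checked with a short script. What follows is only an illustrative sketch: the function name frequency_table and its parameters (start, width, n_classes) are assumptions introduced here, and the classes are taken to be equal-width and formed by the exclusive method.

```python
from itertools import accumulate

# Ages of the 50 employees from the example above.
ages = [
    22, 21, 37, 33, 28, 42, 56, 33, 32, 59,
    40, 47, 29, 65, 45, 48, 55, 43, 42, 40,
    37, 39, 56, 54, 38, 49, 60, 37, 28, 27,
    32, 33, 47, 36, 35, 42, 43, 55, 53, 48,
    29, 30, 32, 37, 43, 54, 55, 47, 38, 62,
]


def frequency_table(data, start, width, n_classes):
    """Tabulate integer data into equal-width classes (hypothetical helper).

    Exclusive method: a value equal to a class's upper limit is
    counted in the next class, e.g. 30 falls in 30-40, not 20-30.
    """
    counts = [0] * n_classes
    for x in data:
        k = (x - start) // width            # index of the class containing x
        if not 0 <= k < n_classes:
            raise ValueError(f"{x} lies outside the class limits")
        counts[k] += 1
    total = sum(counts)                     # N = sum of class frequencies
    cumulative = list(accumulate(counts))   # running totals, class by class
    relative = [f / total for f in counts]  # RF = f / n
    percentage = [100 * rf for rf in relative]
    limits = [(start + i * width, start + (i + 1) * width)
              for i in range(n_classes)]
    return limits, counts, cumulative, relative, percentage


limits, counts, cumulative, relative, percentage = frequency_table(ages, 20, 10, 5)

for (lo, hi), f, cf, rf, pf in zip(limits, counts, cumulative, relative, percentage):
    print(f"{lo}-{hi}  f={f:2d}  cf={cf:2d}  RF={rf:.2f}  %={pf:.0f}")

# Histogram example: 3 students out of a class of 40 scored 90% to 100%.
rf_example = 3 / 40
```

The class frequencies are what a histogram of these ages would plot on its vertical axis; using the relative frequencies instead changes only the scale, not the shape.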
Ninety percent to 100% are quantitative measures.
Example: Formulate the histogram from the following data.

Class Interval   Frequency
10.5-18.5        3
18.5-26.5        5
26.5-34.5        5
34.5-42.5        2
42.5-50.5        4
50.5-58.5        2

Solution:
[Figure: Histogram]
1.12 Frequency Polygons
These are the frequencies plotted against the mid-points of the class intervals, with the points thus obtained joined by line segments. Comparing a histogram and a frequency polygon, in the frequency polygon the points replace the bars (rectangles). Also, when several distributions are to be compared on the same graph paper, frequency polygons are better than histograms.
Illustration 1
Draw a histogram and frequency polygon from the following data.

Age in Years   Number of Persons
10-20          3
20-30          16
30-40          22
40-50          35
50-60          24
60-70          15
70-80          2

[Figure: histogram and frequency polygon; x-axis: age, y-axis: No. of persons; scale along y-axis 1 cm = 5 units]
Frequency polygon showing the distribution of persons of different age groups
Ogives
When frequencies are added, they are called cumulative frequencies. The curve obtained by plotting cumulative frequencies is called a cumulative frequency curve or an ogive (pronounced as ojive).
To construct an ogive: (i) Add up the progressive totals of frequencies, class by class, to get the cumulative frequencies. (ii) Plot classes on the horizontal (x-axis) and cumulative frequencies on the vertical (y-axis).
Less than Ogive: To plot a less than ogive, the data is arranged in ascending order of magnitude and the frequencies are cumulated from the top, i.e. by adding.
Cumulative frequencies are plotted against the upper class limits. Ogives under this method give a positive (rising) curve.
Greater than Ogive: To plot a greater than ogive, the data is arranged in ascending order of magnitude and the frequencies are cumulated from the bottom, or subtracted from the total from the top. Cumulative frequencies are plotted against the lower class limits. Ogives under this method give a negative (falling) curve.
Uses: Certain values like the median, quartiles, quartile deviation, coefficient of skewness etc. can be located using ogives. Ogives are helpful in the comparison of two distributions.
Illustration 1
Draw less than and more than ogive curves for the following frequency distribution and obtain the median graphically. Verify the result.

CI   0-20  20-40  40-60  60-80  80-100  100-120  120-140  140-160
f    5     12     18     25     15      12       8        5

Size (upper limit)   lcf   mcf   Size (lower limit)
20                   5     100   0
40                   17    95    20
60                   35    83    40
80                   60    65    60
100                  75    40    80
120                  87    25    100
140                  95    13    120
160                  100   5     140

[Figure: "Less than" and "More than" ogives plotted on X and Y axes]
Key takeaways
Statistics: It is used as a general name for a large group of mathematical tools, aiming not at absolutely accurate results but at approximate results based on probability theory, used to collect, analyse and interpret numerical facts for solving specific problems. The facts being dealt with in statistics must be capable of numerical expression; else they do not fall within the purview of statistics. Statistics is also concerned with a group of data, for example, the population of a country, the sales price of the finished goods produced by a concern, etc.
Sample: A sample consists of one or more observations drawn from the population. A sample is the group of people who actually took part in your research.
Population: A population includes all of the elements from a set of data. The population is the broader group of people to which you expect to generalize your study results.
Frequency Polygon: These are the frequencies plotted against the mid-points of the class intervals, with the points thus obtained joined by line segments.
Bar Diagram: Only the length of the bar is taken into account, not its width. In other words, a bar is a thick line whose width is merely shown; because only its length is taken into account, it is called a one-dimensional diagram.
Simple Bar Diagram: It represents only one variable. Since the bars are of the same width and vary only in length (height), comparative study becomes very easy. Simple bar diagrams are very popular in practice.
Percentage bar diagram: The length of the entire bar is kept equal to 100 (hundred). The various segments of each bar may change and represent percentages of an aggregate.
Range: The range of the data is the difference between the largest value and the smallest value of the data.
Multiple bar diagram: It is used where two or more sets of related data are represented, and the components are shown as separate adjoining bars. The height of each bar represents the actual value of the component. The components are shown by different shades or colours.
Deviation bars: They are used to represent net quantities (excess or deficit), i.e. net profit, net loss, net exports or imports, etc. Such bars have both positive and negative values. Positive values lie above the base line and negative values lie below it.
Frequency Distribution: A frequency distribution is the principal tabular summary of either discrete data or continuous data. The frequency distribution shows actual, relative or cumulative frequencies. Actual and relative frequencies may be charted as either a histogram (a bar chart) or a frequency polygon.
Histogram: A histogram consists of contiguous boxes and has both a horizontal axis and a vertical axis.
Cumulative frequency: The cumulative frequency of a given class interval represents the total of all the previous class frequencies, including the frequency of the class against which it is written.
Pie Chart: A pie chart or circle chart is a circular statistical graphic that is divided into slices to illustrate numerical proportion. In a pie chart, the arc length of each slice is proportional to the quantity it represents.
Check your progress:
1. Techniques that allow us to use certain samples to generalize about the populations from which the samples were taken.
a) Sample data
b) Descriptive statistics
c) Inferential statistics
d) Probability
2. In which technique does the information collected by oral or written interrogation form the primary data?
a) Indirect investigation
b) Descriptive statistics
c) Inferential statistics
d) Observation method
3. A type of chart used to show information changing over time
a) Bar chart
b) Pie chart
c) Line graph
d) Multiple bar diagram
4. _________________ represents the total of all the previous class frequencies including the class against which it is written.
a) Cumulative frequency
b) Pie chart
c) Relative frequency
d) Multiple bar diagram
5. _______________ consists of contiguous boxes and has both a horizontal axis and a vertical axis.
a) Cumulative frequency
b) Pie chart
c) Histogram
d) Multiple bar diagram
Questions & Exercises
1. What is meant by Statistics? What are the functions of Statistics?
2. What is data collection? Explain the types of data.
3. What are the methods of representation of data?
4. What is meant by an ogive?
5. Draw less than and more than ogive curves for the following frequency distribution and obtain the median graphically. Verify the result.
CI   0-20  20-40  40-60  60-80  80-100  100-120  120-140  140-160
f    5     12     18     25     15      12       8        5

Module-2: Measures of Central Tendency
Objectives
1. To be introduced to the limitations, applications and functions of statistics
2. To know about the measures of central tendency
Outcomes
1.
The learner will be able to use the measures of central tendency in business situations
“It is the science of collection, presentation, analysis, and interpretation of numerical data from logical analysis.” – Croxton and Cowden
2.1.1 Introduction
A measure of central tendency is a single value which can be considered as representative of a set of observations. The value around which the observations can be considered as centered is known as an average, an average value or a location center. Since such representative values tend to lie centrally within a set of observations when arranged according to magnitude, these averages are called measures of central tendency.
2.1.2 Central Tendency Measures
Central tendency has three main measures: mode, median and mean. Each of these measures gives a specific indication of the distribution’s typical or central value.
Mean - The mean is the average of the numbers. It is easy to calculate: add up all the numbers, then divide by how many numbers there are. In other words, it is the sum divided by the count.
Median - Within a list of numbers sorted in ascending or descending order, the median is the middle number, and it may be more representative of that set of data than the average. The median is often used instead of the mean when the series includes outliers that may distort the average of the values.
Mode - The mode is the number most frequently seen in a dataset. A collection of numbers may have one mode, more than one mode, or no mode at all. The other popular measures of central tendency are a set’s mean, or average, and a set’s median, or middle value.
2.1.3 Average
An average is a single figure that sums up the characteristics of a whole group of figures.
In the words of Clark, “Average is an attempt to find one single figure to describe whole of figures.”
An average is described as a measure of central tendency as it is more or less a central value around which various values cluster.
In the words of Croxton and Cowden, “An average is a single value within the range of the data that is used to represent all of the values in the series. Since an average is somewhere within the range of the data, it is called a measure of central value.”
Objectives served by Averages
Averages serve the following purposes:
1. To obtain a clear and concise picture of a large number of numerical data.
2. To compare different groups by means of averages.
3. To obtain a clear picture of a whole group by studying sample data.
4. To express the relationship between different groups in definite terms.
Characteristics of a good average
1. It is rigidly defined and its value is always definite.
2. It is easy to understand and calculate, hence it is very popular.
3. It is based on all the observations, so that it can be a good representative.
4. It can be easily used for comparisons.
5. It is capable of further algebraic treatment, such as finding the sum of the observations when the mean and the number of observations are given, or finding the combined arithmetic mean when different groups are given.
6. It is not affected much by sampling fluctuations.
Essentials of a good Average
The essentials of a good average are as follows:
1. It must be defined rigidly.
2. It must be based on all the observations of the data.
3. It must be readily comprehensible or understandable.
4. It must be capable of being calculated with reasonable ease and rapidity.
5. It must be affected as little as possible by fluctuations of sampling.
6. It must be readily amenable to arithmetic or algebraic treatment.
2.1.4 Arithmetic Mean
Arithmetic mean is defined as the value obtained by dividing the total values of all items in the series by their number.
In other words, it is defined as the sum of the given observations divided by the number of observations, i.e., add the values of all items together and divide this sum by the number of observations.
Symbolically:
x̄ = (x1 + x2 + x3 + ... + xn)/n
Properties of Arithmetic Mean
1. The sum of the deviations of all the values of x from their arithmetic mean is zero.
2. The product of the arithmetic mean and the number of items gives the total of all items.
3. The combined arithmetic mean can be found when the means and sizes of different groups are given.
Demerits of Arithmetic Mean
1. Arithmetic mean is affected by extreme values.
2. Arithmetic mean cannot be determined by inspection and cannot be located graphically.
3. Arithmetic mean cannot be obtained if a single observation is lost or missing.
4. Arithmetic mean cannot be calculated when open-end class intervals are present in the data.
Arithmetic Mean for Ungrouped Data
A) Individual Series
1. Direct Method
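As a small illustration of the definition above (sum of the observations divided by their number), the following sketch computes the mean of an individual series and checks the stated properties numerically. The data in marks and the helper names arithmetic_mean and combined_mean are hypothetical, introduced only for this example; exact fractions are used so that the checks are not disturbed by rounding.

```python
from fractions import Fraction


def arithmetic_mean(values):
    """Mean of an individual series of integers: sum of values / count."""
    if not values:
        raise ValueError("mean of an empty series is undefined")
    return Fraction(sum(values), len(values))


def combined_mean(groups):
    """Combined mean from (mean, size) pairs: sum(n_i * mean_i) / sum(n_i)."""
    total = sum(mean * size for mean, size in groups)
    count = sum(size for _, size in groups)
    return total / count


marks = [12, 15, 18, 20, 25]            # hypothetical individual series
mean = arithmetic_mean(marks)           # 90 / 5

# Property 1: deviations from the mean add up to zero.
deviations = [x - mean for x in marks]
sum_of_deviations = sum(deviations)

# Property 2: mean multiplied by the number of items gives the total.
total_from_mean = mean * len(marks)

# Property 3: combined mean of two groups with known means and sizes.
both_groups = combined_mean([(Fraction(18), 5), (Fraction(30), 3)])

print(mean, sum_of_deviations, total_from_mean, both_groups)
```

Because every observation enters the sum, a single extreme or missing value changes the result, which is the behaviour listed under the demerits.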