Unit 1 Population and Sample PDF
Savitribai Phule Pune University
Summary
This document introduces the concept of population and sample in statistics. It explains the different definitions and uses of statistics, including examples of statistical data, and highlights the importance of these concepts in various fields. It also contains exercises for further understanding and analyzing the topics involved.
# Chapter 1... Population and Sample

## Contents

- 1.1 Introduction
- 1.2 Definition
- 1.3 Importance of Statistics
- 1.4 Scope and Applications of Statistics
- 1.5 Population and Sample
- 1.6 Types of Sampling

## Key Words:

- Uses of Statistics
- Scope of Statistics
- Limitations of Statistics
- Sample
- Population
- SRSWOR, SRSWR
- Stratified Sampling
- Random Sampling

## Objectives:

- This chapter discusses various aspects of statistics: its uses, scope and applications in various fields.
- The concepts of statistical population and sample are introduced.
- Random samples and methods of drawing a sample are introduced.

## 1.1 Introduction

- It is believed that statistics has been in use since the time man began to count and measure.
- In ancient days, kings maintained records of land, agricultural yield, wealth, taxes, livestock, soldiers, weapons, deaths, births etc.
- There are references indicating that the Hebrews conducted population censuses.
- In ancient days the Maurya kings, King Ashoka and the Gupta kings collected statistics.
- Kautilya's Arthashastra mentions that statistics of population, land etc. were collected from time to time.
- Emperor Akbar gave details of population, land, agriculture etc. in his publication Ain-i-Akbari.
- The word statistics seems to be derived from the Italian word 'statista' or the Greek word 'statistika'.
- Both words mean "political states".
- The word statistics carries several meanings.
- Often statistics is taken to mean statistical data, that is, numerical information on a characteristic under study.
- Ex: Statistics of a batsman, population statistics etc.
- Statistics, or statistical methods, is also treated as a branch of science which deals with:
  - Collection
  - Presentation
  - Analysis
  - Interpretation of data.

## 1.2 Definition

- A number of statisticians have attempted to define statistics.
- They used statistics for different purposes and from different viewpoints.
- Accordingly, each defined statistics with an emphasis on his own viewpoint.
- **Webster's definition**: Webster defines statistics as "the classified facts representing the conditions of people in the state, especially those facts which can be stated in a table or tables of numbers or in any tabular or classified arrangement."
- This definition gives importance to the presentation of facts and figures.
- The remaining aspects of statistics are not considered in this definition.
- **Horace Secrist's definition**: Secrist defines statistics as follows: 'By statistics we mean aggregates of facts affected to a marked extent by multiplicity of causes numerically expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a systematic manner for a predetermined purpose and placed in relation to each other.'
- This definition takes into account almost all functions and aspects of statistics.
- It covers the four important aspects, viz. (i) collection, (ii) presentation, (iii) analysis and (iv) interpretation of data.

## 1.3 Importance of Statistics

- Many natural phenomena, activities and experiments are subject to measurement; moreover, variation in different types of characteristics is inevitable.
- Ex: income of a family, height of a person, sales of a company, electricity consumption of a city etc.
- This produces voluminous data.
- Such data are difficult to comprehend.
- This forces the use of statistical methods.
- Thus statistics is important from the following viewpoints:
- Statistical methods enable us to condense the data.
- They facilitate several functions apart from summarisation.
- Statistical methods provide tools of comparison.
- Estimation and prediction are also possible using statistical tools.
- We can get an idea about the shape, spread and symmetry of the data.
- The inter-relation between two or more variables can be measured using statistical techniques.
- Statistical methods help in planning, controlling, decision-making etc.
- The use of statistical methods is important because a considerable amount of time, money and manpower can be saved.
- Uncertainties can be reduced to get reliable results.
- Statistical methods give systematic methods of data collection and investigation.
- Thus statistics reveals several aspects of phenomena.
- H. G. Wells expresses the importance and need of statistics in the following words.
- "Statistical thinking will one day be as necessary for effective citizenship as the ability to read and write".

## 1.4 Scope and Applications of Statistics

- The tools and techniques given by statistical methods are used in almost all fields at several phases.
- Because of the diversified applications of statistics, an exhaustive list of fields is difficult to prepare.
- We find the use of statistics indispensable in agriculture, business, commerce, demography, economics, education, government agencies, industries, social sciences, biological sciences, medical sciences, management sciences etc.
- We discuss briefly the scope of statistics in some of the above stated fields.
- **(a) Statistics in industry**: Industry makes use of statistics at several places such as administration, planning, production, growth and development.
- In many industries a 'Statistical Quality Control' division operates separately.
- Mainly, whether manufactured goods possess a desirable standard or not is examined using various control charts.
- These inspections are done at the time of production.
- On-line process capability studies are conducted to set up the machines to give the desired standards.
- Moreover, purchased goods or semifinished goods are inspected using acceptance sampling plans of various types.
- Nowadays, ISO 9000 makes use of statistics to a large extent.
- Apart from this, in some industries the technique known as design of experiments is also used.
- Newly installed machinery is tested for its performance using statistical methods.
- Sampling is used because of its several advantages.
- Multiple regression planes are used for forecasting when several factors are interlinked.
- Efficiency measurement, index numbers of production, work sampling etc. are very useful for administration and planning departments.
- **(b) Statistics and Economics**: In the field of economics, huge amounts of data need to be processed and interpreted.
- Statistics is very helpful in this field.
- In order to collect data, various statistical methods of investigation are used.
- Often questionnaires are drafted.
- A proper representative of a group is selected using sampling methods.
- Statistical methods are used in this activity to get reliable results.
- Estimation of national income, per capita income, poverty line, industrial production etc. is done using statistical techniques.
- The probability distribution of income can be useful in various economic activities.
- A tool known as the index number, developed in statistics, is used every now and then in economics.
- It performs a number of functions.
- It measures the average increase in prices, production, income, volume of import, export etc.
- Index numbers are called economic barometers.
- Index numbers are used in determining real income, deflation and cost of living index numbers.
- To measure the changes in prices of shares in the stock market, the index number provides the best tool.
- Several interlinked activities in economics can be studied.
- Ex: (i) the relation between prices and supply (ii) the relation between demand and prices (iii) the relation between sales and profit.
- Demand analysis and time series analysis techniques were developed mainly to study economics.
- Those are the gifts given by statistics.
- **(c) Statistics and Management Sciences**: Most managerial functions make use of statistics.
- For efficient working of various sections of management such as sales, production and marketing, statistical methods are used.
- Different statistical tools such as forecasting, tests of significance, index numbers, time series analysis, statistical quality control and estimation play a vital role in management activities.
- Apart from this, various optimisation techniques such as linear programming, transportation techniques, job assignment problems, sequencing, CPM and PERT, replacement problems and inventory control are also useful.
- Portfolio management makes use of regression analysis.
- The regression coefficient called the beta index of a portfolio is used in decision-making.
- Risk measurement is done using standard deviations and covariances.
- Various statistical techniques are used in decision-making.
- **(d) Statistics and Social Sciences**: Bowley says "Statistics is the science of measurement of social organism, regarded as a whole in all its manifestation".
- Research in social sciences needs questionnaires.
- Further analysis is required to be done using statistical tools.
- In social sciences we need to test association between two variables such as (i) education and criminality (ii) education and marriage adjustment score (iii) sex and education (iv) richness and criminality etc.

## 1.5 Population and Sample

- In order to study a group consisting of a large number of items, we need to draw a sample.
- We use the technique of sampling several times in everyday life.
- Ex: while purchasing food grains we inspect only a handful of grains and draw a conclusion about the whole sack.
- Population: In the technical language of statistics the word population is used in a somewhat wider sense.
- It does not mean only a human population.
- Ex:
  - In the study of industrial development, all the industries under consideration constitute the population.
  - In the study of socio-economic conditions of a particular village, all families or houses in the village will be a population.
  - In the study of agricultural yield, all the cultivated farms together will be a population.
  - In a titration experiment, the solution in the beaker is a population.
- Thus a population may be a group of employees, a collection of books, total industrial production, a group of persons suffering from a particular disease, a collection of explosives, a group of students etc.
- **Definition**: An aggregate of objects or individuals under study is called a population or universe.
- A population may contain a finite or an infinite number of elements.
- Accordingly, it is called a finite or an infinite population.
- **Statistical Population**: We have defined population as an aggregate of objects or individuals; however, we often record some quantitative or qualitative characteristic of each member of the population.
- These observations (or data) are collectively called the statistical population.
- Thus in further study we will be interested in the 'statistical population'.
- In order to study the population, one way is to collect information about each and every element in the population.
- This method is called census or complete enumeration.
- Every ten years the 'population census' of India is conducted.
- In this census, information regarding every individual is collected.
- **Limitations of census method**:
  - The census method provides reliable results; but due to the voluminous work involved, it is expensive and time consuming. It requires a large amount of manpower.
  - There are some situations where a census is possible but impracticable.
  - Ex: testing the blood of an individual. In this case, the entire blood cannot be tested. Thus a census cannot be used here.
  - Similarly, in testing explosives, testing the average life of bulbs produced in a lot, or testing the strength of construction material, the census method cannot be used.
  - If the population is infinite, a census cannot be used.
- **Sampling**: In order to overcome the limitations of census, sampling is used.
In this case, some representative items are selected from the population, so that all important characteristics of the population are covered by the items of this group.
- Such a group is called a sample and the method of selecting such a group is called a sampling method.
- **Definition**: Any part of the population under study is called a sample.
- **Illustrations**:
  - While purchasing food grains, we inspect only a handful of grains and draw conclusions about the quality of the whole lot.
  - In this case, the handful of grains is a sample and the whole lot is a population.
  - While examining the blood of an individual, a few drops are taken out of the body for diagnosis.
  - These drops form a sample whereas the entire blood in the body is a population.
  - In this case, conclusions based on the sample are accepted for the population without any doubt as far as the method is concerned.
  - In this case, a census is impracticable.
  - The sampling method is appealing in such situations.
  - For testing the quality of milk, a small quantity of milk is tested instead of the entire bulk.
  - A housewife confirms whether the food is properly cooked or not with the help of a few particles taken out of the container.
  - Clearly, the food in the container is a population, whereas the food taken out of the container for inspection is a sample.
- Note that:
  - Sampling is a well accepted means of collecting information.
  - It is believed to be a scientific and objective procedure of selecting items.
  - Thus, sampling plays an important role in further statistical analysis.
- As sampling methods are used to study the population, the samples should be chosen carefully.
- A natural requirement is that the sample should be representative of the population concerned.
- There are several methods of sampling in practice.
- We shall deal with some of these in later sections.

## 1.6 Types of Sampling

- The success of sampling mainly depends upon proper selection of the sampling method.
- Different sampling methods are used in practice.
- A sampling method which suits the purpose is selected.
- Sampling methods are mainly classified into two classes, viz. (i) non-random sampling and (ii) random sampling (or probability sampling).
- In the earlier discussion we have studied the importance of random sampling.
- We discuss two popularly used random sampling methods.
  - Simple random sampling
  - Stratified random sampling
- **1. Simple Random Sampling (SRS):** It is the easiest and most commonly used method of sampling.
- In this method each element of the population is given the same chance of getting selected in the sample.
- If the population consists of N elements then the probability of selecting any element at any draw is 1/N.
- Further, there are two types of simple random sampling due to a slight difference in the procedure of selecting the elements.
- **(a) Simple Random Sampling With Replacement (SRSWR):**
- In this method, the first element is selected at random from the population.
- It is recorded or studied completely and then returned to the population.
- Afterwards the second element is selected similarly.
- This process is continued till a sample of the required size is selected.
- In this method the population size remains the same at every draw.
- This method of sampling is called simple random sampling with replacement.
- One of the serious drawbacks of this method is that the same element may be selected more than once in the sample.
- **(b) Simple Random Sampling Without Replacement (SRSWOR):** There is another procedure of selecting elements in which elements are selected at random but are not returned to the population.
- This method of selecting a sample is called simple random sampling without replacement.
- In this method the population size goes on decreasing at each draw.
- The drawback of getting the same element selected more than once is overcome in SRSWOR.
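
The difference between the two procedures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 20 numbered units, the sample size of 5, the seed, and the helper names `srswr` and `srswor` are all hypothetical and do not come from the chapter.

```python
import random


def srswr(population, n, rng):
    """Simple random sampling WITH replacement.

    Every draw is made from the full population, so the same
    unit may appear more than once in the sample.
    """
    return [rng.choice(population) for _ in range(n)]


def srswor(population, n, rng):
    """Simple random sampling WITHOUT replacement.

    A unit that has been drawn is not available for later draws,
    so all units in the sample are distinct.
    """
    return rng.sample(population, n)


rng = random.Random(2024)          # arbitrary fixed seed, for a repeatable run
population = list(range(1, 21))    # hypothetical population: units numbered 1 to 20

with_replacement = srswr(population, 5, rng)
without_replacement = srswor(population, 5, rng)

print("SRSWR :", with_replacement)     # repeats are possible
print("SRSWOR:", without_replacement)  # repeats are impossible
```

Note that with replacement the sample size may even exceed the population size, whereas `random.sample` raises a `ValueError` when asked for more units than the population contains.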
- Illustrations of Simple Random Sampling:
  - Suppose a lot of 500 articles is submitted for inspection. To determine the proportion of defective articles one can use SRSWOR.
  - In order to conduct a socio-economic survey of a certain village we can take an SRSWOR and find the per capita income of the village.
  - In order to test the average petrol consumption of a lot of manufactured scooters, SRSWOR or SRSWR can be used.
  - To find the diameter of a rod, generally we take readings at a few points on the rod and then find the average of the readings. These readings form an SRSWOR. This is practised for physical measurements of articles.
  - Testing human blood by taking a few drops out of an individual's body is an SRSWOR.
  - In order to find the average life of a bulb we take an SRSWOR from a manufactured lot.
- Simple random sampling is widely used due to its simplicity and convenience.
- However, it suffers from some drawbacks: for example, the sample may not be a proper representative when the population is heterogeneous, widely spread etc.
- Some part of the population may not be represented in a simple random sample at all.
- In order to avoid these problems some other sampling methods are in use.
- **2. Stratified Random Sampling:** If the population is not homogeneous, SRS is not very effective.
- Therefore the entire population is divided into several homogeneous groups called strata (singular: stratum).
- A simple random sample of a suitable size is selected from each stratum, and these sampled observations are then combined to form a sample.
- The sample thus formed is called a stratified random sample.
- A properly designed stratified random sample gives better results than simple random sampling.
- Moreover, this method is more suitable from the administrative point of view.
- Illustrations of Stratified Random Sampling:
  - To estimate annual income per family we divide the population into homogeneous groups such as families with yearly income below ₹20,000; between ₹20,000 and ₹50,000; between ₹50,000 and ₹1 lakh; and above ₹1 lakh.
    Afterwards we use stratified random sampling taking the above groups as strata.
  - Suppose the proportion of defective articles is to be estimated in a manufacturing process. Then we can use stratified random sampling by taking the production in the different shifts as strata.
  - In order to estimate crop yield we can divide the field under cultivation into plots which are equally fertile and consider these as strata.
  - To conduct a health survey in a college we can use stratified random sampling by considering the faculties, classes, sex etc. as strata.
- In the above discussion we have seen how stratified random sampling is better than simple random sampling.
- However, in practice another simple procedure is adopted, which we discuss below.
- **3. Systematic Sampling:** To draw a systematic sample of size n, sampling units are numbered 1 to N, where N is the population size.
- In this method we divide the population into n equal parts according to serial numbers.
- Suppose each part includes k units (we assume N = nk).
- Note that the 1st group will contain units bearing serial numbers 1 to k, the 2nd group will contain units bearing serial numbers k + 1 to 2k, and so on.
- Then we select a number at random from 1 to k.
- Suppose this is j; then the jth unit in serial order from each group is taken.
- Thus it will form a sample of size n, which is called a systematic sample.
- If the jth unit is selected then the systematic sample will include the observations numbered j, j + k, j + 2k, ..., j + (n - 1)k in the original list.
- For example: If a sample of size 15 from 150 units is to be drawn, we need to make 15 groups, each of size 10.
- Thus, here N = 150, k = 10, n = 15.
- We need to select one unit from the first group.
- Suppose the 3rd unit gets selected at random.
- Then the other 14 units will be automatically selected.
- Those will bear serial numbers 13, 23, 33, ..., 143.
- The entire sample can be selected by taking every kth unit after the unit selected from the first group.
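
The systematic sampling procedure described above can be sketched as follows. This is a minimal illustration, assuming N = nk exactly as the text does; the function names `systematic_units` and `draw_systematic_sample` and the seed are hypothetical choices made for this example.

```python
import random


def systematic_units(j, k, n):
    """Serial numbers j, j + k, j + 2k, ..., j + (n - 1)k."""
    return [j + i * k for i in range(n)]


def draw_systematic_sample(N, n, rng):
    """Draw a systematic sample of size n from units numbered 1..N.

    Assumes N = n * k for a whole number k, as in the text.
    """
    if n <= 0 or N % n != 0:
        raise ValueError("this sketch assumes N = n * k exactly")
    k = N // n
    j = rng.randint(1, k)  # random start in the first group (1..k inclusive)
    return systematic_units(j, k, n)


# The worked example from the text: N = 150, n = 15, k = 10, start j = 3.
example = systematic_units(j=3, k=10, n=15)
print(example)  # 3, 13, 23, ..., 143

# The same design with a randomly chosen start (arbitrary seed).
random_sample = draw_systematic_sample(150, 15, random.Random(7))
print(random_sample)
```

Only the call to `rng.randint` involves chance; once the start j is fixed, every other unit in the sample is determined.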
- Thus only one unit drawn at random from the first group determines the entire sample.
- Illustrations of Systematic Sampling:
  - To select houses for a survey we can use house numbers; in this case a systematic sample is preferred.
  - Suppose a shopkeeper wants to study customers' purchasing habits; he may use his bill book. He can choose a systematic sample using the numbers on the bills he has.
  - Farms can be selected by taking a systematic sample using survey numbers.
  - Suppose a committee of n = 6 students is to be selected from a class of N = 60 students; then we can make 6 groups, each of k = 10 students, using their roll numbers. We select a student at random from the first group.
  - If the 7th student is selected then from each of the next groups we select the 7th student.
  - Thus the systematic sample will include students with roll numbers 7, 17, 27, 37, 47, 57.
- Generally, systematic sampling gives better results than simple random sampling.
- It is easier to implement than stratified sampling.
- However, there are two drawbacks, which are given below:
  - A systematic sample may not be a proper representative if the population has hidden periodicities.
  - For example, suppose daily sales during a year are available. If we take the sales of every seventh day in a sample, then the sample may contain all Sundays, on which sales might be high.
  - If N ≠ nk, the sample size does not remain fixed.
- **Case Study**
- A manager of a highway mall observed that customers' demand for tea and coffee was increasing and that it should be served at the earliest. He was thinking of installing an automatic tea and coffee machine.
- He decided to take customers' opinion on whether they would like the taste of tea and coffee prepared by machine.
- He gathered the opinions of customers for a week by using simple random sampling and installed the machine due to the favourable opinion about machine-made tea and coffee.
- He also took a review for a month by asking customers at random whether they were satisfied with the tea and coffee.
- It saved time and also gained customers' satisfaction.
- **Points to Remember**
- Advantages of sampling are (i) reduction in time, cost and manpower, (ii) increased accuracy, (iii) greater scope.
- A random sample is preferred to a non-random sample because it is a proper representative.
- The results based on a random sample are statistically valid.
- It is an unbiased selection procedure.
- Stratified sampling is used if the population is heterogeneous.

## Exercise 1.1

- **A. Theory Questions:**
- Define 'statistics'.
- Explain the importance of statistics or statistical methods.
- Describe the scope and utility of statistics in the field of (i) industry, (ii) economics, (iii) management sciences, (iv) social sciences.
- Mention the application of statistics in the following fields: (i) industry, (ii) economics, (iii) management sciences, (iv) social sciences.
- Explain the terms with illustration: Population, sample, sampling unit, sampling frame.
- Describe the limitations of sampling over census.
- Describe the advantages of sampling over census.
- What are the requirements of a good sample?
- Explain what a random sample is. Why is a random sample preferable? Explain the various methods of achieving randomness.
- Explain the procedure of drawing (a) SRSWR (b) SRSWOR (c) Stratified random sample (d) Systematic sample.
- State the advantages of simple random sampling and drawbacks of the same. Also explain how these drawbacks can be overcome.
- State the advantages and limitations of stratified sampling.
- State the limitations of stratified sampling.
- How does SRSWR differ from SRSWOR?
- Make a critical comparison between (a) Sampling and census. (b) Stratified random sampling and simple random sampling. (c) SRSWR and SRSWOR (d) Random sampling and non-random sampling.
- Give illustrations of each of the following sampling methods: (a) SRSWR (b) SRSWOR (c) Stratified sampling (d) Systematic sampling.
- Explain the situation where sampling has larger scope as compared to census.
- **B. Numerical Problems:**
- Explain with illustration the terms (i) finite population (ii) infinite population.
- In a population of size N = 6, the observations were 3, 4, 7, 9, 11, 14. Draw all possible SRSWOR of size 2.
- In a population of size N = 8, the observations were 2, 4, 7, 9, 11, 0, 25, 14. Draw all possible SRSWOR of size 5.
- If a population consists of 50 items then how many: (a) SRSWOR each of size 10 can be selected. (b) SRSWR each of size 10 can be selected.
- Suggest appropriate sampling methods, giving reasons, in each of the following situations.
  - To estimate the average price of books in a library, a sample of 500 books is to be selected from 10,000 books having accession numbers.
  - In order to estimate the average pocket money spent by the students in a certain college having 3000 students, a sample of 400 students is to be selected.
  - A market surveyor wants to select a sample of 1000 persons using a telephone directory.
  - To find the daily total requirement of electricity in a township containing 3000 houses, 500 offices, 600 shops and 100 factories, a sample of 1000 units is to be selected.
  - To find the daily total requirement of petrol for two wheelers in a certain city, a sample of 5% of two wheelers is to be selected using RTO registers.
  - In a socio-economic survey a sample of 1000 families is to be selected from a certain village.
  - To find the average house tax paid by citizens, a sample of 500 families is to be selected using municipal corporation records.
  - To find the average income of employees in a certain factory employing various categories such as managers, supervisors, clerks, workers.
  - In an industrial survey a sample of size 50 is to be selected. The area under consideration includes 100 small scale, 200 medium scale and 50 large scale industries.
- **21. Identify the sampling scheme used in the following situations:**
  - For an exhibition, 5 students are to be selected from each class to work as volunteers.
  - A teacher distributed hand-outs to the students in the first row only.
  - An examination question paper contains 10 questions of which any 5 are to be attempted. Ramesh selected questions bearing even serial numbers.
  - A salesman contacted the first 100 customers visiting the shop for a survey.
  - Suppose there are 10 divisions of F.Y.B.Com. named A, B, ... J in a certain college. A sample of 10 students from each division is chosen for managing sports activity.

## Answers 1.2

- (3, 4); (3, 7); (3, 9); (3, 11); (3, 14); (4, 7); (4, 9); (4, 11); (4, 14); (7, 9); (7, 11); (7, 14); (9, 11); (9, 14); (11, 14).
- In all 56 samples are possible.
- (a) <sup>50</sup>C<sub>10</sub> (b) 50<sup>10</sup>
- (a) stratified (b) stratified (c) stratified (d) stratified (e) SRSWOR, stratified (h) stratified
- (a) stratified (b) non-random (c) stratified (d) non-random (e) stratified.
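
The first three answers above can be checked by brute force. The sketch below is only a verification aid: it assumes, as the answers themselves appear to, that samples drawn without replacement are counted as unordered sets, while samples drawn with replacement are counted as ordered sequences of draws. The variable names are hypothetical.

```python
from itertools import combinations
from math import comb

# All possible SRSWOR of size 2 from the population of size N = 6.
population_6 = [3, 4, 7, 9, 11, 14]
pairs = list(combinations(population_6, 2))
for pair in pairs:
    print(pair)
print("number of samples of size 2:", len(pairs))

# Number of possible SRSWOR of size 5 from the population of size N = 8.
population_8 = [2, 4, 7, 9, 11, 0, 25, 14]
count_size_5 = len(list(combinations(population_8, 5)))
print("number of samples of size 5:", count_size_5)

# Samples of size 10 from a population of 50 items.
srswor_count = comb(50, 10)   # unordered selections without replacement
srswr_count = 50 ** 10        # ordered sequences of draws with replacement
print("SRSWOR:", srswor_count)
print("SRSWR :", srswr_count)
```

Listing the samples is practical only for small populations; for the 50-item case the counts are computed directly instead of enumerated.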