Collection of Data
Summary
This document details methods of data collection, including primary and secondary sources, and various data collection techniques. It also discusses sample surveys and censuses, illustrating with examples. The document is ideal for undergraduate-level students studying statistics in economics.
Full Transcript
STATISTICS FOR ECONOMICS 2024-25

CHAPTER 2
Collection of Data

Studying this chapter should enable you to:
- understand the meaning and purpose of data collection;
- distinguish between primary and secondary sources;
- know the mode of collection of data;
- distinguish between Census and Sample Surveys;
- be familiar with the techniques of sampling;
- know about some important sources of secondary data.

1. INTRODUCTION

In the previous chapter, you read about what economics is. You also studied the role and importance of statistics in economics. In this chapter, you will study the sources of data and the mode of data collection. The purpose of collection of data is to show evidence for reaching a sound and clear solution to a problem.

In economics, you often come across a statement like this:

“After many fluctuations the output of food grains rose to 132 million tonnes in 1978-79 from 108 million tonnes in 1970-71, but fell to 108 million tonnes in 1979-80. Production of food grains then rose continuously to 252 million tonnes in 2015-16 and touched 272 million tonnes in 2016-17.”

In this statement, you can observe that the food grains production in different years does not remain the same. It varies from year to year and from crop to crop. As these values vary, they are called variables. The variables are generally represented by the letters X, Y or Z. Each value of a variable is an observation. For example, the food grain production in India varies from 108 million tonnes in 1970-71 to 272 million tonnes in 2016-17, as shown in the following table. The years are represented by variable X and the production of food grain in India (in million tonnes) is represented by variable Y.

TABLE 2.1
Production of Food Grain in India (Million Tonnes)

X (year)    Y (production)
1970-71     108
1978-79     132
1990-91     176
1997-98     194
2001-02     212
2015-16     252
2016-17     272

Here, the values of the variables X and Y are the ‘data’, from which we can obtain information about the production of food grains in India. To know the fluctuations in food grains production, we need the ‘data’ on the production of food grains in India for various years. ‘Data’ is a tool which helps in understanding problems by providing information.

You must be wondering where ‘data’ come from and how we collect them. In the following sections we will discuss the types of data, the methods and instruments of data collection, and the sources of data.

2. WHAT ARE THE SOURCES OF DATA?

Statistical data can be obtained from two sources. The researcher may collect the data by conducting an enquiry. Such data are called Primary Data, as they are based on first hand information. Suppose you want to know about the popularity of a filmstar among school students. For this, you will have to enquire from a large number of school students, by asking them questions to collect the desired information. The data you get are an example of primary data.

If the data have been collected and processed (scrutinised and tabulated) by some other agency, they are called Secondary Data. They can be obtained either from published sources such as government reports, documents, newspapers and books written by economists, or from any other source, for example, a website. Thus, the data are primary to the source that collects and processes them for the first time and secondary for all sources that later use such data. Use of secondary data saves time and cost. For example, after collecting the data on the popularity of the filmstar among students, you publish a report. If somebody uses the data collected by you for a similar study, it becomes secondary data.

3. HOW DO WE COLLECT THE DATA?

Do you know how a manufacturer decides about a product or how a political party decides about a candidate?
They conduct a survey by asking questions about a particular product or candidate from a large group of people. The purpose of surveys is to describe some characteristics like price, quality, usefulness (in case of the product) and popularity, honesty, loyalty (in case of the candidate). The purpose of the survey is to collect data. Survey is a method of gathering information from individuals.

Preparation of Instrument

The most common type of instrument used in surveys is the questionnaire/interview schedule. The questionnaire is either self administered by the respondent or administered by the researcher (enumerator) or trained investigator. While preparing the questionnaire/interview schedule, you should keep in mind the following points:

- The questionnaire should not be too long. The number of questions should be as few as possible.

- The questionnaire should be easy to understand and avoid ambiguous or difficult words.

- The questions should be arranged in an order such that the person answering feels comfortable.

- The series of questions should move from general to specific. The questionnaire should start from general questions and proceed to more specific ones. For example:
  Poor Q
  (i) Is increase in electricity charges justified?
  (ii) Is the electricity supply in your locality regular?
  Good Q
  (i) Is the electricity supply in your locality regular?
  (ii) Is increase in electricity charges justified?

- The questions should be precise and clear. For example:
  Poor Q
  What percentage of your income do you spend on clothing in order to look presentable?
  Good Q
  What percentage of your income do you spend on clothing?

- The questions should not be ambiguous. They should enable the respondents to answer quickly, correctly and clearly. For example:
  Poor Q
  Do you spend a lot of money on books in a month?
  Good Q
  (Tick mark the appropriate option)
  How much do you spend on books in a month?
  (i) Less than Rs 200
  (ii) Rs 200-300
  (iii) Rs 300-400
  (iv) More than Rs 400

- The question should not use double negatives. Questions starting with “Wouldn’t you” or “Don’t you” should be avoided, as they may lead to biased responses. For example:
  Poor Q
  Don’t you think smoking should be prohibited?
  Good Q
  Do you think smoking should be prohibited?

- The question should not be a leading question, which gives a clue about how the respondent should answer. For example:
  Poor Q
  How do you like the flavour of this high-quality tea?
  Good Q
  How do you like the flavour of this tea?

- The question should not indicate alternatives to the answer. For example:
  Poor Q
  Would you like to do a job after college or be a housewife?
  Good Q
  What would you like to do after college?

The questionnaire may consist of closed-ended (or structured) questions or open-ended (or unstructured) questions. The above question about what a student wants to do after college is an open-ended question.

Closed-ended or structured questions can either be a two-way question or a multiple choice question. When there are only two possible answers, ‘yes’ or ‘no’, it is called a two-way question.

When there is a possibility of more than two options of answers, multiple choice questions are more appropriate. Example:
Q. Why did you sell your land?
(i) To pay off the debts.
(ii) To finance children’s education.
(iii) To invest in another property.
(iv) Any other (please specify).

Closed-ended questions are easy to use, score and codify for analysis, because all respondents choose from the given options. But they are difficult to write, as the alternatives should be clearly written to represent both sides of the issue. There is also a possibility that an individual’s true response is not present among the options given. For this, the choice of ‘Any Other’ is provided, where the respondent can write a response which was not anticipated by the researcher. Another limitation of multiple-choice questions is that they tend to restrict the answers by providing alternatives, without which the respondents may have answered differently.

Open-ended questions allow for more individualised responses, but they are difficult to interpret and hard to score, since there are a lot of variations in the responses. Example:
Q. What is your view about globalisation?

Mode of Data Collection

Have you ever come across a television show in which reporters ask questions from children, housewives or the general public regarding their examination performance or a brand of soap or a political party? The purpose of asking questions is to do a survey for collection of data. There are three basic ways of collecting data: (i) Personal Interviews, (ii) Mailing (questionnaire) Surveys, and (iii) Telephone Interviews.

Personal Interviews

This method is used when the researcher has access to all the members. The researcher (or investigator) conducts face-to-face interviews with the respondents.

Personal interviews are preferred due to various reasons. Personal contact is made between the respondent and the interviewer. The interviewer has the opportunity of explaining the study and answering the queries of respondents. The interviewer can request the respondent to expand on answers that are particularly important. Misinterpretation and misunderstanding can be avoided. Watching the reactions of respondents can provide supplementary information.

Personal interview has some demerits too. It is expensive, as it requires trained interviewers. It takes longer to complete the survey. Presence of the researcher may inhibit respondents from saying what they really think.

Mailing Questionnaire

When the data in a survey are collected by mail, the questionnaire is sent to each individual by mail with a request to complete and return it by a given date. The advantages of this method are that it is less expensive and that it allows the researcher to have access to people in remote areas too, who might be difficult to reach in person or by telephone. It does not allow influencing of the respondents by the interviewer. It also permits the respondents to take sufficient time to give thoughtful answers to the questions.

These days online surveys or surveys through short messaging service, i.e., SMS, are popular. Do you know how an online survey is conducted?

The disadvantages of mail survey are that there is less opportunity to provide assistance in clarifying instructions, so there is a possibility of misunderstanding the questions. Mailing is also likely to produce low response rates due to certain factors, such as returning the questionnaire without completing it, not returning the questionnaire at all, loss of the questionnaire in the mail itself, etc.

Telephone Interviews

In a telephone interview, the investigator asks questions over the telephone. The advantages of telephone interviews are that they are cheaper than personal interviews and can be conducted in a shorter time. They allow the researcher to assist the respondent by clarifying the questions. Telephonic interview is better in cases where the respondents are reluctant to answer certain questions in personal interviews.

The disadvantage of this method is access to people, as many people may not own telephones.

Pilot Survey

Once the questionnaire is ready, it is advisable to conduct a try-out with a small group, which is known as Pilot Survey or Pre-testing of the questionnaire. The pilot survey helps in providing a preliminary idea about the survey. It helps in pre-testing of the questionnaire, so as to know the shortcomings and drawbacks of the questions. Pilot survey also helps in assessing the suitability of questions, clarity of instructions, performance of enumerators and the cost and time involved in the actual survey.

Advantages and Disadvantages of the Modes of Data Collection

Personal Interview
  Advantages:
  - Highest response rate
  - Allows use of all types of questions
  - Better for using open-ended questions
  - Allows clarification of ambiguous questions
  Disadvantages:
  - Most expensive
  - Possibility of influencing respondents
  - More time-taking

Mailed Interview
  Advantages:
  - Least expensive
  - Only method to reach remote areas
  - No influence on respondents
  - Maintains anonymity of respondents
  - Best for sensitive questions
  Disadvantages:
  - Cannot be used by illiterates
  - Long response time
  - Does not allow explanation of ambiguous questions
  - Reactions cannot be watched

Telephonic Interviews
  Advantages:
  - Relatively low cost
  - Relatively less influence on respondents
  - Relatively high response rate
  Disadvantages:
  - Limited use
  - Reactions cannot be watched
  - Possibility of influencing respondents

Activities
- You have to collect information from a person who lives in a remote village of India. Which mode of data collection will be appropriate and why? Discuss.
- You have to interview the parents about the quality of teaching in a school. If the principal of the school is present there, what types of problems can arise?

4. CENSUS AND SAMPLE SURVEYS

Census or Complete Enumeration

A survey which includes every element of the population is known as Census or the Method of Complete Enumeration. If certain agencies are interested in studying the total population in India, they have to obtain information from all the households in rural and urban India. It is carried out every ten years. A house-to-house enquiry is carried out, covering all households in India. Demographic data on birth and death rates, literacy, employment, life expectancy, size and composition of population, etc., are collected and published by the Registrar General of India. The last Census of India was held in 2011.

According to the Census 2011, the population of India was 121.09 crore, up from 102.87 crore in 2001. Census 1901 indicated that the population of the country was 23.83 crore. Since then, in a period of 110 years, the population of the country has increased by more than 97 crore. The average annual growth rate of population, which was 2.2 per cent per year in the decade 1971-81, came down to 1.97 per cent in 1991-2001 and 1.64 per cent during 2001-2011.

Population and Sample

Population or the Universe in statistics means the totality of the items under study. Thus, the Population or the Universe is a group to which the results of the study are intended to apply. A population is always all the individuals/items who possess certain characteristics (or a set of characteristics), according to the purpose of the survey. The first task in selecting a sample is to identify the population. Once the population is identified, the researcher selects a method of studying it. If the researcher finds that a survey of the whole population is not possible, then he/she may decide to select a Representative Sample. A sample refers to a group or section of the population from which information is to be obtained. A good sample (representative sample) is generally smaller than the population and is capable of providing reasonably accurate information about the population at a much lower cost and in a shorter time.

Suppose you want to study the average income of people in a certain region. According to the Census method, you would be required to find out the income of every individual in the region, add them up and divide by the number of individuals to get the average income of people in the region. This method would require huge expenditure, as a large number of enumerators have to be employed. Alternatively, you select a representative sample of a few individuals from the region and find out their income. The average income of the selected group of individuals is used as an estimate of the average income of the individuals of the entire region.

Example
Research problem: To study the economic condition of agricultural labourers in Churachandpur district of Manipur.
Population: All agricultural labourers in Churachandpur district.
Sample: Ten per cent of the agricultural labourers in Churachandpur district.

Most of the surveys are sample surveys. These are preferred in statistics because of a number of reasons. A sample can provide reasonably reliable and accurate information at a lower cost and in a shorter time. As samples are smaller than the population, more detailed information can be collected by conducting intensive enquiries. As we need a smaller team of enumerators, it is easier to train them and supervise their work more effectively.

Activities
- In which years will the next Census be held in India and China?
- If you have to study the opinion of students about the new economics textbook of class XI, what will be your population and sample?
- If a researcher wants to estimate the average yield of wheat in Punjab, what will be her/his population and sample?

Now the question is how do you do the sampling? There are two main types of sampling, random and non-random. The following description will make their distinction clear.

Random Sampling

As the name suggests, random sampling is one where the individual units from the population (samples) are selected at random. Suppose the government wants to determine the impact of the rise in petrol price on the household budget of a particular locality. For this, a representative (random) sample of 30 households has to be taken and studied. The names of all 300 households of that area are written on paper and mixed, then 30 names to be interviewed are selected one by one.

In random sampling, every individual has an equal chance of being selected. In the above example, all 300 sampling units (also called sampling frame) of the population got an equal chance of being included in the sample of 30 units and hence the sample, so drawn, is a random sample. This is also called the lottery method. Nowadays computer programmes are used to select random samples.

[Figure: a population of 20 Kuchha and 20 Pucca houses, with a representative sample and a non-representative sample]

Exit Polls
You must have seen that when an election takes place, the television networks provide election coverage. They also try to predict the results. This is done through exit polls, wherein a random sample of voters who exit the polling booths are asked whom they voted for. From the data of the sample of voters, the prediction is made. You might have noticed that exit polls do not always predict correctly. Why?

Activity
You have to analyse the trend of foodgrains production in India for the last fifty years. As it is difficult to collect data for all the years, you are asked to select a sample of production of ten years. Using the Random Number Tables, how will you select your sample years?

Non-Random Sampling

There may be a situation in which you have to select 10 out of 100 households in a locality. You have to decide which household to select and which to reject. You may select the households conveniently situated or the households known to you or your friend. In this case, you are using your judgement (bias) in selecting 10 households. This way of selecting 10 out of 100 households is not a random selection. In a non-random sampling method, all the units of the population do not have an equal chance of being selected, and convenience or judgement of the investigator plays an important role in selection of the sample. They are mainly selected on the basis of judgement, purpose, convenience or quota and are non-random samples.

5. SAMPLING AND NON-SAMPLING ERRORS

Sampling Errors

A population consisting of numerical values has two important characteristics which are of relevance here. First, Central Tendency, which may be measured by the mean, the median or the mode. Second, Dispersion, which can be measured by calculating the “standard deviation”, “mean deviation”, “range”, etc.

The purpose of the sample is to get one or more estimates of the population parameters. Sampling error refers to the difference between the sample estimate and the corresponding population parameter (the actual value of the characteristic of the population, for example, average income). Thus, the difference between the actual value of a parameter of the population and its estimate (from the sample) is the sampling error. It is possible to reduce the magnitude of sampling error by taking a larger sample.

Example
Consider a case of incomes of 5 farmers of Manipur. The variable x (income of farmers) has measurements 500, 550, 600, 650, 700. We note that the population average is
(500 + 550 + 600 + 650 + 700) ÷ 5 = 3000 ÷ 5 = 600.
Now, suppose we select a sample of two individuals where x has measurements of 500 and 600. The sample average is
(500 + 600) ÷ 2 = 1100 ÷ 2 = 550.
Here, the sampling error of the estimate = 600 (true value) - 550 (estimate) = 50.

Non-Sampling Errors

Non-sampling errors are more serious than sampling errors because a sampling error can be minimised by taking a larger sample. It is difficult to minimise non-sampling error, even by taking a large sample. Even a Census can contain non-sampling errors. Some of the non-sampling errors are:

Sampling Bias
Sampling bias occurs when the sampling plan is such that some members of the target population could not possibly be included in the sample.

Non-Response Errors
Non-response occurs if an interviewer is unable to contact a person listed in the sample or a person from the sample refuses to respond. In this case, the sample observation may not be representative.

Errors in Data Acquisition
This type of error arises from recording of incorrect responses. Suppose the teacher asks the students to measure the length of the teacher’s table in the classroom. The measurements by the students may differ. The differences may occur due to differences in measuring tape, carelessness of the students, etc. Similarly, suppose we want to collect data on prices of oranges. We know that prices vary from shop to shop and from market to market. Prices also vary according to the quality. Therefore, we can only consider the average prices. Recording mistakes can also take place, as the enumerators or the respondents may commit errors in recording or transcribing the data; for example, he/she may record 13 instead of 31.

6. CENSUS OF INDIA AND NSSO

There are some agencies both at the national and state level to collect, process and tabulate the statistical data. Some of the agencies at the national level are Census of India, National Sample Survey (NSS), Central Statistics Office (CSO), Registrar General of India (RGI), Directorate General of Commercial Intelligence and Statistics (DGCIS), Labour Bureau, etc.

The Census of India provides the most complete and continuous demographic record of population. The Census has been conducted regularly every ten years since 1881. The first Census after Independence was conducted in 1951. The Census officials collect information on various aspects of population such as the size, density, sex ratio, literacy, migration, rural-urban distribution, etc. Census data is interpreted and analysed to understand many economic and social issues in India.

The NSS was established by the Government of India to conduct nation-wide surveys on socio-economic issues. The NSS does continuous surveys in successive rounds. The data collected by NSS are released through reports and its quarterly journal Sarvekshana. NSS provides periodic estimates of literacy, school enrolment, utilisation of educational services, employment, unemployment, manufacturing and service sector enterprises, morbidity, maternity, child care, utilisation of the public distribution system, etc. The NSS 60th round survey (January-June 2004) was on morbidity and healthcare. The NSS 68th round survey (2011-12) was on consumer expenditure. The NSS also collects details of industrial activities and retail prices for various goods. They are used by the Government of India for planning purposes.

7. CONCLUSION

Economic facts, expressed in terms of numbers, are called data. The purpose of data collection is to understand, explain and analyse a problem and the causes behind it. Primary data is obtained by conducting a survey. Survey includes various steps, which need to be planned carefully. There are various agencies which collect, process, tabulate and publish statistical data. These are used as secondary data. However, the choice of source of data and mode of data collection depends on the objective of the study.

Recap
- Data is a tool which helps in reaching a sound conclusion on any problem.
- Primary data is based on first hand information.
- Survey can be done by personal interviews, mailing questionnaires and telephone interviews.
- Census covers every individual/unit belonging to the population.
- Sample is a smaller group selected from the population from which the relevant information would be sought.
- In a random sampling, every individual is given an equal chance of being selected for providing information.
- Sampling error is due to the difference between the value of the sample estimate and the value of the corresponding population parameter.
- Non-sampling errors can arise in data acquisition, by non-response or by bias in selection.
- Census of India and National Sample Survey are two important agencies at the national level, which collect, process and tabulate data on many important economic and social issues.

EXERCISES

1. Frame at least four appropriate multiple-choice options for following questions:
(i) Which of the following is the most important when you buy a new dress?
(ii) How often do you use computers?
(iii) Which of the newspapers do you read regularly?
(iv) Rise in the price of petrol is justified.
(v) What is the monthly income of your family?
2. Frame five two-way questions (with ‘Yes’ or ‘No’).
3. State whether the following statements are True or False.
(i) There are many sources of data.
(ii) Telephone survey is the most suitable method of collecting data, when the population is literate and spread over a large area.
(iii) Data collected by investigator is called the secondary data.
(iv) There is a certain bias involved in the non-random selection of samples.
(v) Non-sampling errors can be minimised by taking large samples.
4. What do you think about the following questions? Do you find any problem with these questions? Describe.
(i) How far do you live from the closest market?
(ii) If plastic bags are only 5 per cent of our garbage, should it be banned?
(iii) Wouldn’t you be opposed to increase in price of petrol?
(iv) Do you agree with the use of chemical fertilisers?
(v) Do you use fertilisers in your fields?
(vi) What is the yield per hectare in your field?
5. You want to do a research on the popularity of Vegetable Atta Noodles among children. Design a suitable questionnaire for collecting this information.
6. In a village of 200 farms, a study was conducted to find the cropping pattern. Out of the 50 farms surveyed, 50% grew only wheat. What is the population and the sample size?
7. Give two examples each of sample, population and variable.
8. Which of the following methods give better results and why?
(a) Census (b) Sample
9. Which of the following errors is more serious and why?
(a) Sampling error (b) Non-Sampling error
10. Suppose there are 10 students in your class. You want to select three out of them. How many samples are possible?
11. Discuss how you would use the lottery method to select 3 students out of 10 in your class.
12. Does the lottery method always give you a random sample? Explain.
13. Explain the procedure for selecting a random sample of 3 students out of 10 in your class by using random number tables.
14. Do samples provide better results than surveys? Give reasons for your answer.
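
The sampling-error example in Section 5 (five farmers' incomes) and the lottery method discussed in the chapter and in Exercises 10 and 11 can be checked with a short program. The following is a minimal Python sketch and is not part of the original chapter. The function names (sampling_error, all_sampling_errors, lottery_sample) and the roll numbers 1 to 10 are hypothetical, chosen only for illustration. The program computes the sampling error as population mean minus sample mean, lists that error for every possible sample of a given size, and draws a lottery sample in which each unit is equally likely to be chosen.

```python
import itertools
import math
import random
from statistics import mean

# Incomes of the 5 farmers in the Section 5 example (the whole population).
POPULATION = [500, 550, 600, 650, 700]


def sampling_error(population, sample):
    """Population mean minus sample mean, as in the worked example."""
    return mean(population) - mean(sample)


def all_sampling_errors(population, size):
    """Map every possible sample of `size` units to its sampling error."""
    return {
        sample: sampling_error(population, sample)
        for sample in itertools.combinations(population, size)
    }


def lottery_sample(units, size, seed=None):
    """Lottery method: draw `size` units without replacement,
    every unit having an equal chance of selection."""
    rng = random.Random(seed)  # the seed only makes the illustration repeatable
    return rng.sample(list(units), size)


if __name__ == "__main__":
    # Sample {500, 600} from the text: error = 600 - 550 = 50.
    print("Error for sample (500, 600):", sampling_error(POPULATION, [500, 600]))

    # Largest possible error shrinks as the sample grows.
    for size in (2, 3, 4):
        errors = all_sampling_errors(POPULATION, size)
        worst = max(abs(e) for e in errors.values())
        print(f"size {size}: {len(errors)} possible samples, worst error {worst}")

    # Hypothetical roll numbers 1..10 for a class of 10 students.
    students = range(1, 11)
    print("Possible samples of 3 out of 10:", math.comb(10, 3))
    print("One lottery draw:", lottery_sample(students, 3, seed=1))
```

For this small population the largest possible error falls from 75 with samples of two to 25 with samples of four, which is consistent with the chapter's remark that a larger sample reduces sampling error. It does nothing about non-sampling errors such as a wrongly recorded income.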