Business Statistics - S.C. Gupta and Indra Gupta PDF
Document Details
2013
S.C. Gupta and Indra Gupta
Summary
This is a textbook on Business Statistics by S.C. Gupta and Indra Gupta. Published in 2013, it covers statistical methods and their applications for B.Com., B.A. (Economics Honours) and M.B.A./M.M.S. programmes. The book includes theoretical discussion, solved examples and exercises.
Full Transcript
BUSINESS STATISTICS

OUR OUTSTANDING PUBLICATIONS
Fundamentals of Statistics - S.C. Gupta
Practical Statistics - S.C. Gupta & Indra Gupta
Business Statistics (UPTU) - S.C. Gupta & Indra Gupta
व्यवसायिक सांख्यिकी - S.C. Gupta & Arvind Kumar Singh (Hindi Version of Business Statistics)
Consumer Behaviour in Indian Perspective - Nair, Suja
Consumer Behaviour and Marketing Research - Nair, S.R.
Consumer Behaviour: Text and Cases - Nair, Suja
Communication - Rayudu, C.S.
Investment Management - Avadhani, V.A.
Management of Indian Financial Institutions - Srivastava & Nigam
Investment Management - Singh, Preeti
Personnel Management - Mamoria & Rao
Dynamics of Industrial Relations in India - Mamoria, Mamoria & Gankar
A Textbook of Human Resource Management - Mamoria & Gankar
International Trade and Export Management - Cherunilam, Francis
International Business (Text and Cases) - Subba Rao, P.
Production and Operations Management - Aswathappa & Sridhara Bhatt
Total Quality Management (Text and Cases) - Bhatt, S.K.
Quantitative Techniques for Decision Making - Sharma Anand
Operations Research - Sharma Anand
Advanced Accountancy - Arulanandam & Raman
Cost and Management Accounting - Arora, M.N.
Indian Economy - Misra & Puri
Advanced Accounting - Gowda, J.M.
Management Accounting - Gowda, J.M.
Accounting for Management - Jawaharlal
Accounting Theory - Jawaharlal
Managerial Accounting - Jawaharlal
Production & Operations Management - Aswathappa & Sridhara Bhatt
Business Environment (Text and Cases) - Cherunilam, Francis
Business Laws - Maheshwari & Maheshwari
Business Communication - Rai & Rai
Business Law for Management - Bulchandani, K.R.
Organisational Behaviour - Aswathappa, K.

BUSINESS STATISTICS
For B.Com. (Pass and Honours); B.A. (Economics Honours); M.B.A./M.M.S. of Indian Universities

S.C. GUPTA
M.A. (Statistics); M.A. (Mathematics); M.S. (U.S.A.)
Associate Professor in Statistics (Retired)
Hindu College, University of Delhi
Delhi-110007

Mrs. INDRA GUPTA

© Author
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording and/or otherwise without the prior written permission of the publisher.

First Edition : June 1988
Second Edition : 2013

Published by : Mrs. Meena Pandey for Himalaya Publishing House Pvt. Ltd., “Ramdoot”, Dr. Bhalerao Marg, Girgaon, Mumbai - 400 004. Phone: 022-23860170/23863863; Fax: 022-23877178. E-mail: [email protected]; Website: www.himpub.com

Branch Offices :
New Delhi : “Pooja Apartments”, 4-B, Murari Lal Street, Ansari Road, Darya Ganj, New Delhi - 110 002. Phone: 011-23270392/23278631; Fax: 011-23256286
Nagpur : Kundanlal Chandak Industrial Estate, Ghat Road, Nagpur - 440 018. Phone: 0712-2738731/3296733; Telefax: 0712-2721216
Bengaluru : Plot No. 91-33, 2nd Main Road, Seshadripuram, Behind Nataraja Theatre, Bengaluru - 560020. Phone: 08041138821; Mobile: 9379847017, 9379847005
Hyderabad : No. 3-4-184, Lingampally, Besides Raghavendra Swamy Matham, Kachiguda, Hyderabad - 500 027. Phone: 040-27560041/27550139
Chennai : New-20, Old-59, Thirumalai Pillai Road, T. Nagar, Chennai - 600 017. Mobile: 9380460419
Pune : First Floor, “Laksha” Apartment, No. 527, Mehunpura, Shaniwarpeth (Near Prabhat Theatre), Pune - 411 030. Phone: 020-24496323/24496333; Mobile: 09370579333
Lucknow : House No. 731, Shekhupura Colony, Near B.D. Convent School, Aliganj, Lucknow - 226 022. Phone: 0522-4012353; Mobile: 09307501549
Ahmedabad : 114, “SHAIL”, 1st Floor, Opp. Madhu Sudan House, C.G. Road, Navrang Pura, Ahmedabad - 380 009. Phone: 079-26560126; Mobile: 09377088847
Ernakulam : 39/176 (New No.: 60/251), 1st Floor, Karikkamuri Road, Ernakulam, Kochi - 682011. Phone: 0484-2378012, 2378016; Mobile: 09387122121
Bhubaneswar : 5 Station Square, Bhubaneswar - 751 001 (Odisha). Phone: 0674-2532129; Mobile: 09338746007
Kolkata : 108/4, Beliaghata Main Road, Near ID Hospital, Opp.
SBI Bank, Kolkata - 700 010. Phone: 033-32449649; Mobile: 7439040301
DTP by : Times Printographic
Printed at :

Dedicated to Our Parents

Preface
TO THE SIXTH EDITION

The book, originally written over 20 years ago, has been revised and reprinted several times during the intervening period. It is very heartening to note that there has been an increasing response to the book from the students of B.A. (Economics Honours), B.Com. (Pass and Honours), M.B.A./M.M.S. and other management courses, in spite of the fact that the book has not been revised for quite a long time. I take great pleasure in presenting to the readers the sixth, thoroughly revised and enlarged, edition of the book. The book has been revised in the light of the valuable criticism, suggestions and feedback received from teachers, students and other readers of the book from all over the country.

Some salient features of the new edition are :

The theoretical discussion throughout has been refined, restructured, rewritten and updated. During the course of rewriting, a sincere attempt has been made to retain the basic features of the earlier editions, viz., the simplicity of presentation, lucidity of style and the analytical approach, which have been appreciated by teachers and students all over India.

Several new topics have been added at appropriate places to make the treatment of the subject matter more exhaustive and up-to-date. Some of the additions are given below :
Remark, page 5·57 : Effect of Change of Scale on Harmonic Mean.
Remark 4, page 6·3 : Effect of Change of Origin and Scale on Range.
Remark 6, page 6·10 : Effect of Change of Origin and Scale on Mean Deviation about Mean.
Remark 6, page 7·3 : Xmax and Xmin in terms of Mean and Range.
Remark 2, page 8·12 : Some Results on Covariance.
§ 8·10, page 8·45 : Lag and Lead Correlation.
Remark 1 and Theorem, page 9·4 : Necessary and Sufficient Condition for Minima of E.
Remark, page 9·24 : Limits for r.
§ 11·9, page 11·54 : Time Series Analysis in Forecasting.
§ 13·9, page 13·9 : Covariance in Terms of Expectation.
§ 13·10 and Remark, page 13·14 : Var(ax + by) = a² Var(x) + b² Var(y) + 2ab Cov(x, y).
Equations (14·29e) and (14·29f), page 14·31 : Distribution of the Mean (X̄) of n i.i.d. N(μ, σ²) variates.
§ 15·11·2, pages 15·18 – 15·19 : Sampling Distribution of Mean.

A number of solved examples, selected from the latest examination papers of various universities and professional institutes, have been added. These are bound to assist understanding and provide greater variety.

Exercise sets containing questions and unsolved problems at the end of each Chapter have been substantially reorganised and rewritten by deleting old problems and adding new problems, selected from the latest examination papers of various universities, C.A., I.C.W.A., and other management courses. All the problems have been very carefully graded and answers to the problems are given at the end of each problem.

An attempt has been made to rectify the errors in the last edition.

It is hoped that all these changes, additions and improvements will enhance the value of the book. We are confident that the book, in its present form, will prove to be of much greater utility to the students as well as teachers of the subject.

We express our deep sense of thanks and gratitude to our publishers M/s Himalaya Publishing House and Type-setters, M/s Times Printographics, Darya Ganj, New Delhi, for their untiring efforts, unfailing courtesy, and co-operation in bringing out the book in such an elegant form.

We strongly believe that the road to improvement is never-ending. Suggestions and criticism for further improvement of the book will be very much appreciated and most gratefully acknowledged.

January, 2013
S.C. GUPTA
INDRA GUPTA

Preface
TO THE FIRST EDITION

In ancient times Statistics was regarded only as the science of statecraft and was used to collect information relating to crimes, military strength, population, wealth, etc., for devising military and fiscal policies. But today, Statistics is not merely a by-product of the administrative set-up of the State: it embraces all sciences (social, physical and natural) and is finding numerous applications in diverse fields such as agriculture, industry, sociology, biometry, planning, economics, business, management, insurance, accountancy and auditing, and so on. Statistics (theory and methods) is used extensively by government, business and management organisations in planning future programmes and formulating policy decisions. It is almost impossible to think of any sphere of human activity where Statistics does not creep in. The subject of Statistics has made tremendous progress in the recent past, so much so that an elementary knowledge of statistical methods has become a part of general education in the curricula of many academic and professional courses.

This book is a modest though determined bid to serve as a text-book for B.Com. (Pass and Hons.) and B.A. Economics (Hons.) courses of Indian Universities. The main aim in writing this book is to present a clear, simple, systematic and comprehensive exposition of the principles, methods and techniques of Statistics in various disciplines, with special reference to Economics and Business. The stress is on the applications of techniques and methods most commonly used by statisticians. Lucidity of style and simplicity of expression have been our twin objectives in preparing this text. Mathematical complexity has been avoided as far as possible. Wherever desirable, the notations and terminology have been clearly explained and then all the mathematical steps have been explained in detail.
An attempt has been made to start with the explanation of the elementary aspects of a topic; the complexities and intricacies of the advanced problems are then explained and solved in a lucid manner. A number of typical problems, mostly from various university examination papers, have been solved as illustrations so as to expose the students to different techniques of tackling the problems and enable them to have a better and more thorough understanding of the basic concepts of the theory and its various applications. At many places explanatory remarks have been given to widen the readers’ horizon. Moreover, in order to enable the readers to have a proper appreciation of the subject-matter and to fortify their confidence in the understanding and application of methods, a large number of carefully graded problems, mostly drawn from various university examination papers, have been given as exercise sets in each chapter. Answers to all the problems in the exercise sets are given at the end of each problem.

The book contains 16 Chapters. We will not enumerate the topics discussed in the text since an idea of these can be obtained from a cursory glance at the table of contents. Chapters 1 to 11 are devoted to ‘Descriptive Statistics’, which consists in describing some characteristics of the numerical data such as averages, dispersion, skewness, kurtosis, correlation, etc. In spite of many recent developments in statistical techniques, the old topics like ‘Classification and Tabulation’ (Chapter 3) and ‘Diagrammatic and Graphic Representation’ (Chapter 4) have been discussed in detail, since they still constitute the bulk of statistical work in government and business organisations. The use of statistical methods as scientific tools in the analysis of economic and business data has been explained in Chapter 10 (Index Numbers) and Chapter 11 (Time Series Analysis).
Chapters 12 to 14 relate to advanced topics like Probability, Random Variable, Mathematical Expectation and Theoretical Distributions. An attempt has been made to give a detailed discussion of these topics on modern lines through the concepts of ‘Sample Space’ and ‘Axiomatic Approach’ in a very simple and lucid manner. Chapter 15 (Sampling and Design of Sample Surveys) explains the various techniques of planning and executing statistical enquiries so as to arrive at valid conclusions about the population. Chapter 16 (Interpolation and Extrapolation) deals with the techniques of estimating the value of a function y = f(x) for any given intermediate value of the variable x.

We must unreservedly acknowledge the deep debt of gratitude we owe to the numerous authors whose great and masterly works we have consulted during the preparation of the manuscript.

We take this opportunity to express our sincere gratitude to Prof. Kanwar Sen, Shri V.K. Kapoor and a number of students for their valuable help and suggestions in the preparation of this book.

Last but not least, we express our deep sense of gratitude to our Publishers M/s Himalaya Publishing House for their untiring efforts and unfailing courtesy and co-operation in bringing out the book in time in such an elegant form.

Every effort has been made to avoid printing errors though some might have crept in inadvertently. We shall be obliged if any such errors are brought to our notice. Valuable suggestions and criticism for the improvement of the book from our colleagues (who are teaching this course) and students will be highly appreciated and duly incorporated in subsequent editions.

June, 1988
S.C. GUPTA
Mrs. INDRA GUPTA

Contents

1. INTRODUCTION – MEANING AND SCOPE
  1·1. ORIGIN AND DEVELOPMENT OF STATISTICS
  1·2. DEFINITION OF STATISTICS
  1·3. IMPORTANCE AND SCOPE OF STATISTICS
  1·4. LIMITATIONS OF STATISTICS
  1·5. DISTRUST OF STATISTICS
  EXERCISE 1·1

2. COLLECTION OF DATA
  2·1. INTRODUCTION
    2·1·1. Objectives and Scope of the Enquiry
    2·1·2. Statistical Units to be Used
    2·1·3. Sources of Information (Data)
    2·1·4. Methods of Data Collection
    2·1·5. Degree of Accuracy Aimed at in the Final Results
    2·1·6. Type of Enquiry
  2·2. PRIMARY AND SECONDARY DATA
    2·2·1. Choice Between Primary and Secondary Data
  2·3. METHODS OF COLLECTING PRIMARY DATA
    2·3·1. Direct Personal Investigation
    2·3·2. Indirect Oral Investigation
    2·3·3. Information Received Through Local Agencies
    2·3·4. Mailed Questionnaire Method
    2·3·5. Schedules Sent Through Enumerators
  2·4. DRAFTING OR FRAMING THE QUESTIONNAIRE
  2·5. SOURCES OF SECONDARY DATA
    2·5·1. Published Sources
    2·5·2. Unpublished Sources
  2·6. PRECAUTIONS IN THE USE OF SECONDARY DATA
  EXERCISE 2·1

3. CLASSIFICATION AND TABULATION
  3·1. INTRODUCTION – ORGANISATION OF DATA
  3·2. CLASSIFICATION
    3·2·1. Functions of Classification
    3·2·2. Rules for Classification
    3·2·3. Bases of Classification
  3·3. FREQUENCY DISTRIBUTION
    3·3·1. Array
    3·3·2. Discrete or Ungrouped Frequency Distribution
    3·3·3. Grouped Frequency Distribution
    3·3·4. Continuous Frequency Distribution
  3·4. BASIC PRINCIPLES FOR FORMING A GROUPED FREQUENCY DISTRIBUTION
    3·4·1. Types of Classes
    3·4·2. Number of Classes
    3·4·3. Size of Class Intervals
    3·4·4. Types of Class Intervals
  3·5. CUMULATIVE FREQUENCY DISTRIBUTION
    3·5·1. Less Than Cumulative Frequency
    3·5·2. More Than Cumulative Frequency
  3·6. BIVARIATE FREQUENCY DISTRIBUTION
  EXERCISE 3·1
  3·7. TABULATION – MEANING AND IMPORTANCE
    3·7·1. Parts of a Table
    3·7·2. Requisites of a Good Table
    3·7·3. Types of Tabulation
  EXERCISE 3·2

4. DIAGRAMMATIC AND GRAPHIC REPRESENTATION
  4·1. INTRODUCTION
  4·2. DIFFERENCE BETWEEN DIAGRAMS AND GRAPHS
  4·3. DIAGRAMMATIC PRESENTATION
    4·3·1. General Rules for Constructing Diagrams
    4·3·2. Types of Diagrams
    4·3·3. One-dimensional Diagrams
    4·3·4. Two-dimensional Diagrams
    4·3·5. Three-dimensional Diagrams
    4·3·6. Pictograms
    4·3·7. Cartograms
    4·3·8. Choice of a Diagram
  EXERCISE 4·1
  4·4. GRAPHIC REPRESENTATION OF DATA
    4·4·1. Technique of Construction of Graphs
    4·4·2. General Rules for Graphing
    4·4·3. Graphs of Frequency Distributions
    4·4·4. Graphs of Time Series or Historigrams
    4·4·5. Semi-Logarithmic Line Graphs or Ratio Charts
  4·5. LIMITATIONS OF DIAGRAMS AND GRAPHS
  EXERCISE 4·2

5. AVERAGES OR MEASURES OF CENTRAL TENDENCY
  5·1. INTRODUCTION
  5·2. REQUISITES OF A GOOD AVERAGE OR MEASURE OF CENTRAL TENDENCY
  5·3. VARIOUS MEASURES OF CENTRAL TENDENCY
  5·4. ARITHMETIC MEAN
    5·4·1. Step Deviation Method for Computing Arithmetic Mean
    5·4·2. Mathematical Properties of Arithmetic Mean
    5·4·3. Merits and Demerits of Arithmetic Mean
  5·5. WEIGHTED ARITHMETIC MEAN
  EXERCISE 5·1
  5·6. MEDIAN
    5·6·1. Calculation of Median
    5·6·2. Merits and Demerits of Median
    5·6·3. Partition Values
    5·6·4. Graphic Method of Locating Partition Values
  EXERCISE 5·2
  5·7. MODE
    5·7·1. Computation of Mode
    5·7·2. Merits and Demerits of Mode
    5·7·3. Graphic Location of Mode
  5·8. EMPIRICAL RELATION BETWEEN MEAN (M), MEDIAN (Md) AND MODE (Mo)
  EXERCISE 5·3
  5·9. GEOMETRIC MEAN
    5·9·1. Merits and Demerits of Geometric Mean
    5·9·2. Compound Interest Formula
    5·9·3. Average Rate of a Variable Which Increases by Different Rates at Different Periods
    5·9·4. Wrong Observations and Geometric Mean
    5·9·5. Weighted Geometric Mean
  5·10. HARMONIC MEAN
    5·10·1. Merits and Demerits of Harmonic Mean
    5·10·2. Weighted Harmonic Mean
5·61 5·11. RELATION BETWEEN ARITHMETIC MEAN, GEOMETRIC MEAN AND HARMONIC MEAN 5·61 5·12. SELECTION OF AN AVERAGE 5·62 5·13. LIMITATIONS OF AVERAGES 5·63 EXERCISE 5·4 5·63 6. MEASURES OF DISPERSION 6·1 – 6·53 6·1. INTRODUCTION AND MEANING 6·1 6·1·1. Objectives or Significance of the Measures of Dispersion. 6·2 6·2. CHARACTERISTICS FOR AN IDEAL MEASURE OF DISPERSION 6·2 6·3. ABSOLUTE AND RELATIVE MEASURES OF DISPERSION 6·2 6·4. MEASURES OF DISPERSION 6·2 6·5. RANGE 6·3 (xiv) 6·5·1. Merits and Demerits of Range. 6·4 6·5·2. Uses. 6·4 6·6. QUARTILE DEVIATION OR SEMI INTER-QUARTILE RANGE 6·5 6·6·1. Merits and Demerits of Quartile Deviation. 6·5 6·7. PERCENTILE RANGE 6·6 EXERCISE 6·1 6·7 6·8. MEAN DEVIATION OR AVERAGE DEVIATION 6·9 6·8·1. Computation of Mean Deviation. 6·9 6·8·2. Short-cut Method of Computing Mean Deviation. 6·10 6·8·3. Merits and Demerits of Mean Deviation. 6·11 6·8·4. Uses. 6·11 6·8·5. Relative Measure of Mean Deviation. 6·11 EXERCISE 6·2 6·15 6·9. STANDARD DEVIATION 6·16 6·9·1. Mathematical Properties of Standard Deviation. 6·18 6·9·2. Merits and Demerits of Standard Deviation 6·18 6·9·3. Variance and Mean Square Deviation. 6·19 6·9·4. Different Formulae for Calculating Variance. 6·19 EXERCISE 6·3 6·29 6·10. STANDARD DEVIATION OF THE COMBINED SERIES 6·33 6·11. COEFFICIENT OF VARIATION 6·36 6·12. RELATIONS BETWEEN VARIOUS MEASURES OF DISPERSION 6·41 EXERCISE 6·4 6·42 6·13. LORENZ CURVE 6·47 EXERCISE 6·5 6·49 EXERCISE 6·6 6·50 7. SKEWNESS, MOMENTS AND KURTOSIS 7·1 – 7·34 7·1. INTRODUCTION 7·1 7·2. SKEWNESS 7·1 7·2·1. Measures of Skewness. 7·2 7·2·2. Karl Pearson’s Coefficient of Skewness. 7·2 EXERCISE 7·1 7·9 7·2·3. Bowley’s Coefficient of Skewness. 7·12 7·2·4. Kelly’s Measure of Skewness. 7·13 7·2·5. Coefficient of Skewness based on Moments. 7·13 EXERCISE 7·2 7·16 7·3. MOMENTS 7·18 7·3·1. Moments about Mean. 7·19 7·3·2. Moments about Arbitrary Point A. 7·19 7·3·3. Relation between Moments about Mean and Moments about Arbitrary Point ‘A’. 7·19 7·3·4. 
Effect of Change of Origin and Scale on Moments about Mean. 7·21 7·3·5. Sheppard’s Correction for Moments. 7·21 (xv) 7·3·6. Charlier Checks. 7·21 7·4. KARL PEARSON’S BETA (β) AND GAMMA (γ) COEFFICIENTS BASED ON MOMENTS 7·22 7·5. COEFFICIENT OF SKEWNESS BASED ON MOMENTS 7·22 7·6. KURTOSIS 7·23 EXERCISE 7·3. 7·30 8. CORRELATION ANALYSIS 8·1 – 8·46 8·1. INTRODUCTION 8·1 8·1·1. Types of Correlation 8·1 8·1·2. Correlation and Causation. 8·2 8·2. METHODS OF STUDYING CORRELATION 8·3 8·3. SCATTER DIAGRAM METHOD 8·3 EXERCISE 8·1 8·5 8·4. KARL PEARSON’S COEFFICIENT OF CORRELATION (COVARIANCE METHOD) 8·7 8·4·1. Properties of Correlation Coefficient 8·11 8·4·2. Assumptions Underlying Karl Pearson’s Correlation Coefficient. 8·17 8·4·3. Interpretation of r. 8·18 8·5. PROBABLE ERROR 8·18 EXERCISE 8·2 8·20 8·6. CORRELATION IN BIVARIATE FREQUENCY TABLE 8·25 EXERCISE 8·3 8·30 8·7. RANK CORRELATION METHOD 8·31 8·7·1. Limits for ρ. 8·31 8·7·2. Computation of Rank Correlation Coefficient (ρ). 8·32 8·7·3. Remarks on Spearman’s Rank Correlation Coefficient 8·38 EXERCISE 8·4 8·39 8·8. METHOD OF CONCURRENT DEVIATIONS 8·41 EXERCISE 8·5 8·43 8·9. COEFFICIENT OF DETERMINATION 8·43 EXERCISE 8·6 8·44 8.10. LAG AND LEAD CORRELATION 8.45 9. LINEAR REGRESSION ANALYSIS 9·1 – 9·33 9·1. INTRODUCTION 9·1 9·2. LINEAR AND NON-LINEAR REGRESSION 9·2 9·3. LINES OF REGRESSION 9·2 9·3·1. Derivation of Line of Regression of y on x. 9·2 9·3·2. Line of Regression of x on y. 9·4 9·3·3. Angle Between the Regression Lines. 9·5 9·4. COEFFICIENTS OF REGRESSION 9·6 9·4·1. Theorems on Regression Coefficients 9·7 EXERCISE 9·1 9·15 (xvi) — — 9·5. TO FIND THE MEAN VALUES (X , Y ) FROM THE TWO LINES OF REGRESSION 9·20 9·6. TO FIND THE REGRESSION COEFFICIENTS AND THE CORRELATION COEFFICIENT FROM THE TWO LINES OF REGRESSION 9·20 9·7. STANDARD ERROR OF AN ESTIMATE 9·23 9·8. REGRESSION EQUATIONS FOR A BIVARIATE FREQUENCY TABLE 9·26 9·9. CORRELATION ANALYSIS vs. REGRESSION ANALYSIS 9·28 EXERCISE 9·2 9·29 EXERCISE 9·3 9·32 10. 
INDEX NUMBERS 10·1 – 10·65 10·1. INTRODUCTION 10·1 10·2. USES OF INDEX NUMBERS 10·1 10·3. TYPES OF INDEX NUMBERS 10·3 10·4. PROBLEMS IN THE CONSTRUCTION OF INDEX NUMBERS 10·3 10·5. METHODS OF CONSTRUCTING INDEX NUMBERS 10·7 10·5·1. Simple (Unweighted) Aggregate Method. 10·7 10·5·2. Weighted Aggregate Method. 10·8 10·5·3. Simple Average of Price Relatives. 10·15 10·5·4. Weighted Average of Price Relatives 10.17 EXERCISE 10·1 10·20 10·6. TESTS OF CONSISTENCY OF INDEX NUMBER FORMULAE 10·26 10·6·1. Unit Test. 10·26 10·6·2. Time Reversal Test. 10·26 10·6·3. Factor Reversal Test. 10·27 10·6·4. Circular Test. 10·28 EXERCISE 10·2 10·33 10·7. CHAIN INDICES OR CHAIN BASE INDEX NUMBERS 10·35 10·7·1. Uses of Chain Base Index Numbers. 10·36 10·7·2. Limitations of Chain Base Index Numbers. 10·36 EXERCISE 10·3 10·38 10·8. BASE SHIFTING, SPLICING AND DEFLATING OF INDEX NUMBERS 10·40 10·8·1. Base Shifting. 10·40 10·8·2. Splicing. 10·41 10·8·3. Deflating of Index Numbers. 10·45 EXERCISE 10·4 10·48 10·9. COST OF LIVING INDEX NUMBER 10·51 10·9·1. Main Steps in the Construction of Cost of Living Index Numbers 10·52 10·9·2. Construction of Cost of Living Index Numbers. 10·53 10·9·3. Uses of Cost of Living Index Numbers. 10·53 10·10. LIMITATIONS OF INDEX NUMBERS 10·59 EXERCISE 10·5 10·60 (xvii) 11. TIME SERIES ANALYSIS 11·1 – 11·60 11·1. INTRODUCTION 11·1 11·2. COMPONENTS OF A TIME SERIES 11·1 11·2·1. Secular Trend. 11·2 11·2·2. Short-Term Variations. 11·3 11·2·3. Random or Irregular Variations. 11·4 11·3. ANALYSIS OF TIME SERIES 11·5 11·4. MATHEMATICAL MODELS FOR TIME SERIES 11·5 11·5. MEASUREMENT OF TREND 11·6 11·5·1. Graphic or Free Hand Curve Fitting Method. 11·6 11·5·2. Method of Semi-Averages. 11·7 11·5·3. Method of Curve Fitting by the Principle of Least Squares. 11·10 11·5·4. Conversion of Trend Equation. 11·22 11·5·5. Selection of the Type of Trend. 11·25 EXERCISE 11·1 11·25 11·5·6. Method of Moving Averages. 11·30 EXERCISE 11·2 11·38 11·6. 
MEASUREMENT OF SEASONAL VARIATIONS 11·39 11·6·1. Method of Simple Averages. 11·40 11·6·2. Ratio to Trend Method. 11·42 11·6·3. ‘Ratio to Moving Average’ Method. 11·44 11·6·4. Method of Link Relatives. 11·47 11·6·5. Deseasonalisation of Data. 11·49 11·7. MEASUREMENT OF CYCLICAL VARIATIONS 11·52 11·8. MEASUREMENT OF IRREGULAR VARIATIONS 11·53 11.9. TIME SERIES ANALYSIS IN FORECASTING 11.54 EXERCISE 11·3 11·54 EXERCISE 11·4 11·59 12. THEORY OF PROBABILITY 12·1 – 12·52 12·1. INTRODUCTION 12·1 12·2. SHORT HISTORY 12·1 12·3. TERMINOLOGY 12·2 12·4. MATHEMATICAL PRELIMINARIES 12·4 12·4·1. Set Theory. 12·4 12·4·2. Permutation and Combination. 12·6 12·5. MATHEMATICAL OR CLASSICAL OR ‘A PRIORI’ PROBABILITY 12·8 12·6. STATISTICAL OR EMPIRICAL PROBABILITY 12·9 EXERCISE 12·1 12·14 12·7. AXIOMATIC PROBABILITY 12·17 12·8. ADDITION THEOREM OF PROBABILITY 12·19 12·8·1. Addition Theorem of Probability for Mutually Exclusive Events. 12·20 12·8·2. Generalisation of Addition Theorem of Probability. 12·20 (xviii) 12·9. THEOREM OF COMPOUND PROBABILITY OR MULTIPLICATION THEOREM OF PROBABILITY 12·21 Generalisation of Multiplication Theorem of Probability. 12·22 12·9·1. Independent Events. 12·22 12·9·2. Multiplication Theorem for Independent Events. 12·22 EXERCISE 12·2 12·35 OBJECTIVE TYPE QUESTIONS 12·41 12·10. INVERSE PROBABILITY 12·43 Bayes’s Theorem (Rule for the Inverse Probability) 12·43 EXERCISE 12·3 12·49 13. RANDOM VARIABLE, PROBABILITY DISTRIBUTIONS AND MATHEMATICAL EXPECTATION 13·1 – 13·19 13·1. RANDOM VARIABLE 13·1 13·2. PROBABILITY DISTRIBUTION OF A DISCRETE RANDOM VARIABLE 13·2 13·3. PROBABILITY DISTRIBUTION OF A CONTINUOUS RANDOM VARIABLE 13·2 13·3·1. Probability Density Function (p.d.f.) of Continuous random Variable 13·2 13·4. DISTRIBUTION FUNCTION OR CUMULATIVE PROBABILITY FUNCTION 13·3 13·5. MOMENTS 13·4 EXERCISE 13·1 13·6 13·6. MATHEMATICAL EXPECTATION 13·7 Physical Interpretation of E(X). 13·7 13·7. THEOREMS ON EXPECTATION 13·8 13·8. 
VARIANCE OF X IN TERMS OF EXPECTATION 13·9 13.9. COVARIANCE IN TERMS OF EXPECTATION 13·9 13·10. VARIANCE OF LINEAR COMBINATION 13·14 13·11. JOINT AND MARGINAL PROBABILITY DISTRIBUTIONS 13·14 EXERCISE 13·2 13·16 14. THEORETICAL DISTRIBUTIONS 14·1 – 14·52 14·1. INTRODUCTION 14·1 14·2. BINOMIAL DISTRIBUTION 14·1 14·2·1. Probability Function of Binomial Distribution. 14·2 14·2·2. Constants of Bionomial Distribution 14·3 14·2·3. Mode of Binomial Distribution. 14·5 14·2·4. Fitting of Binomial Distribution. 14·10 EXERCISE 14·1 14·11 14·3. POISSON DISTRIBUTION (AS A LIMITING CASE OF BINOMIAL DISTRIBUTION) 14·16 14·3·1. Utility or Importance of Poisson Distribution. 14·18 14·3·2. Constants of Poisson Distribution 14·18 14·3·3. Mode of Poisson Distribution 14·19 14·3·4. Fitting of Poisson Distribution. 14·23 EXERCISE 14·2 14·25 (xix) 14·4. NORMAL DISTRIBUTION 14·28 14·4·1. Equation of Normal Probability Curve. 14·28 14·4·2. Standard Normal Distribution. 14·29 14·4·3. Relation between Binomial and Normal Distributions. 14·29 14·4·4. Relation between Poisson and Normal Distributions. 14·30 14·4·5. Properties of Normal Distribution. 14·30 14·4·6. Areas Under Standard Normal Probability Curve 14·33 14·4·7. Importance of Normal Distribution. 14·36 EXERCISE 14·3 14·45 15. SAMPLING THEORY AND DESIGN OF SAMPLE SURVEYS 15·1 – 15·26 15·1. INTRODUCTION 15·1 15·2. UNIVERSE OR POPULATION 15·1 15·3. SAMPLING 15·2 15·4. PARAMETER AND STATISTIC 15·2 15·4·1. Sampling Distribution. 15·3 15·4·2. Standard Error. 15·3 15·5. PRINCIPLES OF SAMPLING 15·4 15·5·1. Law of Statistical Regularity. 15·4 15·5·2. Principle of Inertia of Large Numbers. 15·5 15·5·3. Principle of Persistence of Small Numbers. 15·5 15·5·4. Principle of Validity. 15·5 15·5·5. Principle of Optimisation. 15·5 15·6. CENSUS VERSUS SAMPLE ENUMERATION 15·5 15·7. LIMITATIONS OF SAMPLING 15·7 15·8. PRINCIPAL STEPS IN A SAMPLE SURVEY 15·8 15·9. ERRORS IN STATISTICS 15·10 15·9·1. Sampling and Non-Sampling Errors. 15·10 15·9·2. 
Biased and Unbiased Errors. 15·13 15·9·3. Measures of Statistical Errors (Absolute and Relative Errors). 15·14 15·10. TYPES OF SAMPLING 15·14 15·10·1. Purposive or Subjective or Judgment Sampling. 15·14 15·10·2. Probability Sampling. 15·15 15·10·3. Mixed Sampling. 15·15 15·11. SIMPLE RANDOM SAMPLING 15·15 15·11·1. Selection of a Simple Random Sample. 15·16 15·11·2. Sampling Distribution of Mean 15·18 15·11·3. Merits and Limitations of Simple Random Sampling 15·19 15·12. STRATIFIED RANDOM SAMPLING 15·19 15·12·1. Allocation of Sample Size in Stratified Sampling. 15·20 15·12·2. Merits and Demerits of Stratified Random Sampling. 15·21 15·13. SYSTEMATIC SAMPLING 15·22 15·13·1. Merits and Demerits 15·23 (xx) 15·14. CLUSTER SAMPLING 15·23 15·15. MULTISTAGE SAMPLING 15·23 15·16. QUOTA SAMPLING 15·24 EXERCISE 15·1. 15·24 16. INTERPOLATION AND EXTRAPOLATION 16·1 – 16·28 16·1. INTRODUCTION 16·1 16·1·1. Assumptions. 16·1 16·1·2. Uses of Interpolation. 16·2 16·2. METHODS OF INTERPOLATION 16·2 16·3. GRAPHIC METHOD 16·2 16·4. ALGEBRAIC METHOD 16·3 16·5. METHOD OF PARABOLIC CURVE FITTING 16·3 16·6. METHOD OF FINITE DIFFERENCES 16·5 16·7. NEWTON’S FORWARD DIFFERENCE FORMULA 16·7 16·8. NEWTON’S BACKWARD DIFFERENCE FORMULA 16·11 EXERCISE 16·1 16·12 16·9. BINOMIAL EXPANSION METHOD FOR INTERPOLATING MISSING VALUES 16·15 EXERCISE 16·2 16·19 16·10. INTERPOLATION WITH ARGUMENTS AT UNEQUAL INTERVALS 16·20 16·11. DIVIDED DIFFERENCES 16·21 16·11·1. Newton’s Divided Difference Formula. 16·22 16·12. LAGRANGE’S FORMULA 16·24 16·13. INVERSE INTERPOLATION 16·26 EXERCISE 16·3 16·27 17. INTERPRETATION OF DATA AND STATISTICAL FALLACIES 17·1 – 17·14 17·1. INTRODUCTION 17·1 17·2. INTERPRETATION OF DATA AND STATISTICAL FALLACIES – MEANING AND NEED 17·1 17.3. FACTORS LEADING TO MIS-INTERPRETATION OF DATA OR STATISTICAL FALLACIES 17·2 17·3·1. Bias. 17·2 17·3·2. Inconsistencies in Definitions. 17·2 17·3·3. Faulty Generalisations. 17·3 17·3·4. Inappropriate Comparisons. 17·3 17·3·5. 
Wrong Interpretation of Statistical Measures. 17·4 17·3·6. (a) Wrong Interpretation of Index Numbers. 17·10 17·3·6. (b) Wrong Interpretation of Components of Time Series – (Trend, Seasonal and Cyclical Variations). 17·10 17·3·7. Technical Errors 17·11 17·4. EFFECT OF WRONG INTERPRETATION OF DATA – DISTRUST OF STATISTICS 17·11 EXERCISE 17·1 17·11 (xxi) 18. STATISTICAL DECISION THEORY 18·1 – 18·35 18·1. INTRODUCTION 18·1 18·2. INGREDIENTS OF DECISION PROBLEM 18·2 18·2·1. Acts. 18·2 18·2·2. States of Nature or Events. 18·2 18·2·3. Payoff Table. 18·2 18·2·4. Opportunity Loss (O.L.). 18·3 18·2·5. Decision Making Environment 18·3 18·2·6. Decision Making Under Certainty. 18·4 18·2·7. Decision Making Under Uncertainty. 18·4 18·3. OPTIMAL DECISION 18·5 18·3·1. Maximax Criterion. 18·5 18·3·2. Maximin Criterion. 18·6 18·3·3. Minimax Criterion. 18·6 18·3·4. Laplace Criterion of Equal Likelihoods. 18·6 18·3·5. Hurwicz Criterion of Realism. 18·7 18·3·6. Expected Monetary Value (EMV). 18·10 18·3·7. Expected Opportunity Loss (EOL) Criterion. 18·11 18·3·8. Expected Value of Perfect Information (EVPI). 18·12 18·4. DECISION TREE 18·23 18·4·1. Roll Back Technique of Analysing a Decision Tree. 18·24 EXERCISE 18·1. 18·29 19. THEORY OF ATTRIBUTES 19·1 – 19·27 19·1. INTRODUCTION 19·1 19·2. NOTATIONS 19·1 19·3. CLASSES AND CLASS FREQUENCIES 19·1 19·3·1. Order of Classes and Class Frequencies. 19·2 19·3·2. Ultimate Class Frequency. 19·2 19·3·3. Relation Between Class Frequencies. 19·3 EXERCISE 19·1 19·8 19·4. INCONSISTENCY OF DATA 19·10 19·4·1. Conditions for Consistency of Data. 19·10 19·4·2. Incomplete Data. 19·11 EXERCISE 19·2 19·13 19·5. INDEPENDENCE OF ATTRIBUTES 19·15 19·5·1. Criteria of Independence of Two Attributes. 19·15 19·6. ASSOCIATION OF ATTRIBUTES 19·18 19·6·1. (Criterion 1). Proportion Method. 19·18 19·6·2. (Criterion 2). Comparison of Observed and Expected Frequencies. 19·18 (xxii) 19·6·3. (Criterion 3) Yule’s Coefficient of Association. 19·18 19·6·4. (Criterion 4). 
Coefficient of Colligation. 19·19 EXERCISE 19·3 19·24 Appendix I : NUMERICAL TABLES T·1 – T·9 Appendix II : BIBLIOGRAPHY B·1 INDEX I·1 – I·6

1. INTRODUCTION — MEANING & SCOPE

1·1. ORIGIN AND DEVELOPMENT OF STATISTICS

The subject of Statistics is not a new discipline; it is as old as human society itself. It has been used right from the existence of life on this earth, although the sphere of its utility was very much restricted. In the old days, Statistics was regarded as the ‘Science of Statecraft’ and was the by-product of the administrative activity of the State. The word Statistics seems to have been derived from the Latin word ‘status’ or the Italian word ‘statista’ or the German word ‘statistik’ or the French word ‘statistique’, each of which means a political state. In ancient times the scope of Statistics was primarily limited to the collection of the following data by governments for framing military and fiscal policies : (i) Age and sex-wise population of the country ; (ii) Property and wealth of the country ; the former enabling the government to have an idea of the manpower of the country (in order to safeguard itself against any outside aggression) and the latter providing it with information for the introduction of new taxes and levies.

Perhaps one of the earliest censuses of population and wealth was conducted by the Pharaohs (Emperors) of Egypt in connection with the construction of the famous ‘Pyramids’. Such censuses were later held in England, Germany and other western countries in the middle ages. In India, an efficient system of collecting official and administrative statistics existed even 2000 years ago, in particular during the reign of Chandragupta Maurya (324 – 300 B.C.). Historical evidence of the prevalence of a very good system of collecting vital statistics and registering births and deaths even before 300 B.C. is available in Kautilya’s ‘Arthashastra’.
The records of land, agriculture and wealth statistics were maintained by Todar Mal, the land and revenue minister in the reign of Akbar (1556 – 1605 A.D.). A detailed account of the administrative and statistical surveys conducted during Akbar’s reign is available in the book ‘Ain-e-Akbari’ written by Abul Fazl (in 1596 – 97), one of the nine gems of Akbar.

In Germany, the systematic collection of official statistics originated towards the end of the 18th century when, in order to have an idea of the relative strength of different German States, information regarding population and output (industrial and agricultural) was collected. In England, statistics were the outcome of the Napoleonic wars. The wars necessitated the systematic collection of numerical data to enable the government to assess revenues and expenditure with greater precision and then to levy new taxes in order to meet the cost of war.

The sixteenth century saw the application of Statistics to the collection of data relating to the movements of heavenly bodies (stars and planets) to know about their position and for the prediction of eclipses. J. Kepler made a detailed study of the information collected by Tycho Brahe (1546 – 1601) regarding the movements of the planets and formulated his famous three laws relating to the movements of heavenly bodies. These laws paved the way for the discovery of Newton’s law of gravitation.

The seventeenth century witnessed the origin of Vital Statistics. Captain John Graunt of London (1620 – 1674), known as the Father of Vital Statistics, was the first man to make a systematic study of birth and death statistics. Important contributions in this field were also made by prominent persons like Caspar Neumann (in 1691), Sir William Petty (1623 – 1687), James Dodson, Thomas Simpson and Dr. Price.
The computation of mortality tables and the calculation of expectation of life at different ages by these persons led to the idea of ‘Life Insurance’ and a Life Insurance Institution was founded in London in 1698. William Petty wrote the book ‘Essay on Political Arithmetic’. In those days Statistics was regarded as Political Arithmetic. This concept of Statistics as Political Arithmetic continued even in the early 18th century when J.P. Süssmilch (1707 – 1767), a Prussian clergyman, formulated his doctrine that the ratio of births and deaths remains more or less constant and gave a statistical explanation to the theory of ‘Natural Order of Physiocratic School’.

The backbone of the so-called modern theory of Statistics is the ‘Theory of Probability’ or the ‘Theory of Games and Chance’ which was developed in the mid-seventeenth century. The theory of probability grew out of the prevalence of gambling among the nobles of England and France, who wished to estimate their chances of winning or losing in a gamble, the chief contributors being the mathematicians and gamblers of France, Germany and England. Two French mathematicians, Pascal (1623 – 1662) and P. Fermat (1601 – 1665), after a lengthy correspondence between themselves ultimately succeeded in solving the famous ‘Problem of Points’ posed by the French gambler Chevalier de Méré, and this correspondence laid the foundation stone of the science of probability. The next stalwart in this field was J. Bernoulli (1654 – 1705), whose great treatise on probability ‘Ars Conjectandi’ was published posthumously in 1713, eight years after his death, by his nephew Nicolaus Bernoulli (1687 – 1759). This contained the famous ‘Law of Large Numbers’ which was later discussed by Poisson, Khinchine and Kolmogorov.
De Moivre (1667 – 1754) also contributed a lot in this field and published his famous ‘Doctrine of Chances’ in 1718 and also discovered the Normal probability curve which is one of the most important contributions in Statistics. Other important contributors in this field are Pierre Simon de Laplace (1749 – 1827) who published his monumental work on probability, ‘Théorie Analytique des Probabilités’, in 1812; Gauss (1777 – 1855) who gave the principle of Least Squares and established the ‘Normal Law of Errors’ independently of De Moivre; L.A.J. Quetelet (1796 – 1874) who discovered the principle of ‘Constancy of Great Numbers’ which forms the basis of sampling; Euler, Lagrange, Bayes, etc.

Russian mathematicians have also made outstanding contributions to the modern theory of probability; the main contributors, to mention only a few, are : Chebychev (1821 – 1894), who founded the Russian School of Statisticians ; A. Markov (Markov Chains) ; Liapounoff (Central Limit Theorem) ; A. Khinchine (Law of Large Numbers) ; A. Kolmogorov (who axiomatised the calculus of probability) ; Smirnov, Gnedenko and so on.

Modern stalwarts in the development of the subject of Statistics are Englishmen who did pioneering work in the application of Statistics to different disciplines. Francis Galton (1822 – 1911) pioneered the study of ‘Regression Analysis’ in Biometry; Karl Pearson (1857 – 1936), who founded the greatest statistical laboratory in England, pioneered the study of ‘Correlation Analysis’. His Chi-Square test (χ²-test) of Goodness of Fit is the first and most important of the tests of significance in Statistics ; W.S. Gosset with his t-test ushered in an era of exact (small) sample tests. Perhaps most of the work in the statistical theory during the past few decades can be attributed to a single person Sir Ronald A.
Fisher (1890 – 1962) who applied Statistics to a variety of diversified fields such as genetics, biometry, psychology and education, agriculture, etc., and who is rightly termed the Father of Statistics. In addition to enhancing the existing statistical theory he is the pioneer in Estimation Theory (Point Estimation and Fiducial Inference); Exact (small) Sampling Distributions ; Analysis of Variance and Design of Experiments. His contributions to the subject of Statistics are described by one writer in the following words : “R.A. Fisher is the real giant in the development of the theory of Statistics.” It is only the varied and outstanding contributions of R.A. Fisher that put the subject of Statistics on a very firm footing and earned for it the status of a full-fledged science.

Indian statisticians also did not lag behind in making significant contributions to the development of Statistics in various diversified fields. The valuable contributions of C.R. Rao (Statistical Inference); Parthasarathy (Theory of Probability); P.C. Mahalanobis and P.V. Sukhatme (Sample Surveys) ; S.N. Roy (Multivariate Analysis) ; R.C. Bose, K.R. Nair, J.N. Srivastava (Design of Experiments), to mention only a few, have placed India’s name on the world map of Statistics.

1·2. DEFINITION OF STATISTICS

Statistics has been defined differently by different writers from time to time, so much so that scholarly articles have collected together hundreds of definitions, emphasizing precisely the meaning, scope and limitations of the subject. The reasons for such a variety of definitions may be broadly classified as follows :

(i) The field of utility of Statistics has been increasing steadily and thus different people defined it differently according to the developments of the subject. In the old days, Statistics was regarded as the ‘science of statecraft’ but today it embraces almost every sphere of natural and human activity.
Accordingly, the old definitions which were confined to a very limited and narrow field of enquiry were replaced by the new definitions which are more exhaustive and elaborate in approach.

(ii) The word Statistics has been used to convey different meanings in singular and plural sense. When used as plural, statistics means numerical set of data and when used in singular sense it means the science of statistical methods embodying the theory and techniques used for collecting, analysing and drawing inferences from the numerical data.

It is practically impossible to enumerate all the definitions given to Statistics both as ‘Numerical Data’ and ‘Statistical Methods’ due to limitations of space. However, we give below some selected definitions.

WHAT THEY SAY ABOUT STATISTICS — SOME DEFINITIONS
“STATISTICS AS NUMERICAL DATA”

1. “Statistics are the classified facts representing the conditions of the people in a State…specially those facts which can be stated in number or in tables of numbers or in any tabular or classified arrangement.”—Webster.

2. “Statistics are numerical statements of facts in any department of enquiry placed in relation to each other.”—Bowley.

3. “By statistics we mean quantitative data affected to a marked extent by multiplicity of causes.”—Yule and Kendall.

4. “Statistics may be defined as the aggregate of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to a reasonable standard of accuracy, collected in a systematic manner, for a predetermined purpose and placed in relation to each other.”—Prof. Horace Secrist.

Remarks and Comments.

1. According to Webster’s definition only numerical facts can be termed Statistics. Moreover, it restricts the domain of Statistics to the affairs of a State i.e., to social sciences. This is a very old and narrow definition and is inadequate for modern times since today, Statistics embraces all sciences – social, physical and natural.

2.
Bowley’s definition is more general than Webster’s since it is related to numerical data in any department of enquiry. Moreover it also provides for comparative study of the figures as against mere classification and tabulation of Webster’s definition.

3. Yule and Kendall’s definition refers to numerical data affected by a multiplicity of causes. This is usually the case in social, economic and business phenomena. For example, the prices of a particular commodity are affected by a number of factors viz., supply, demand, imports, exports, money in circulation, competitive products in the market and so on. Similarly, the yield of a particular crop depends upon a multiplicity of factors like quality of seed, fertility of soil, method of cultivation, irrigation facilities, weather conditions, fertilizer used and so on.

4. Secrist’s definition seems to be the most exhaustive of all the four. Let us try to examine it in detail.

(i) Aggregate of Facts. Single or isolated items cannot be termed Statistics unless they are a part of an aggregate of facts relating to a particular field of enquiry. For instance, the height of an individual or the price of a particular commodity does not form Statistics as such figures are unrelated and not comparable. However, an aggregate of the figures of births, deaths, sales, purchase, production, profits, etc., over different times, places, etc., will constitute Statistics.

(ii) Affected by Multiplicity of Causes. Numerical figures should be affected by a multiplicity of factors. This point has already been elaborated in remark 3 above. In physical sciences, it is possible to isolate the effect of various factors on a single item but it is very difficult to do so in social sciences, particularly when the effect of some of the factors cannot be measured quantitatively.
However, statistical techniques have been devised to study the joint effect of a number of factors on a single item (Multiple Correlation) or the isolated effect of a single factor on the given item (Partial Correlation) provided the effect of each of the factors can be measured quantitatively.

(iii) Numerically Expressed. Only numerical data constitute Statistics. Thus statements like ‘the standard of living of the people in Delhi has improved’ or ‘the production of a particular commodity is increasing’ do not constitute Statistics. In particular, the qualitative characteristics which cannot be measured quantitatively such as intelligence, beauty, honesty, etc., cannot be termed Statistics unless they are numerically expressed by assigning particular scores as quantitative standards. For example, intelligence is not Statistics but the intelligence quotients, which may be interpreted as the quantitative measure of the intelligence of individuals, could be regarded as Statistics.

(iv) Enumerated or Estimated According to Reasonable Standard of Accuracy. The numerical data pertaining to any field of enquiry can be obtained by completely enumerating the underlying population. In such a case data will be exact and accurate (but for the errors of measurement, personal bias, etc.). However, complete enumeration of the underlying population may not be possible (e.g., if the population is infinite, or if testing is destructive i.e., if the item is destroyed in the course of inspection, as in testing explosives, light bulbs, etc.), or, even if possible, it may not be practicable (because the population is very large, the cost of enumeration per unit is high, or our resources are limited in terms of time and money, etc.). In such cases the data are estimated by using the powerful techniques of Sampling and Estimation theory. However, the estimated values will not be as precise and accurate as the actual values.
The degree of accuracy of the estimated values largely depends on the nature and purpose of the enquiry. For example, while measuring the heights of individuals accuracy will be aimed in terms of fractions of an inch whereas while measuring distance between two places it may be in terms of metres and if the places are very distant, e.g., say Delhi and London, the difference of few kilometres may be ignored. However, certain standards of accuracy must be maintained for drawing meaningful conclusions. (v) Collected in a Systematic Manner. The data must be collected in a very systematic manner. Thus, for any socio-economic survey, a proper schedule depending on the object of enquiry should be prepared and trained personnel (investigators) should be used to collect the data by interviewing the persons. An attempt should be made to reduce the personal bias to the minimum. Obviously, the data collected in a haphazard way will not conform to the reasonable standards of accuracy and the conclusions based on them might lead to wrong or misleading decisions. (vi) Collected for a Pre-determined Purpose. It is of utmost importance to define in clear and concrete terms the objectives or the purpose of the enquiry and the data should be collected keeping in view these objectives. An attempt should not be made to collect too many data some of which are never examined or analysed i.e., we should not waste time in collecting the information which is irrelevant for our enquiry. Also it should be ensured that no essential data are omitted. For example, if the purpose of enquiry is to measure the cost of living index for low income group people, we should select only those commodities or items which are consumed or utilised by persons belonging to this group. Thus for such an index, the collection of the data on the commodities like scooters, cars, refrigerators, television sets, high quality cosmetics, etc., will be absolutely useless. (vii) Comparable. 
From a practical point of view, for statistical analysis the data should be comparable. They may be compared with respect to some unit, generally time (period) or place. For example, the data relating to the population of a country for different years or the population of different countries in some fixed year constitute Statistics, since they are comparable. However, the data relating to the size of the shoe of an individual and his intelligence quotient (I.Q.) do not constitute Statistics as they are not comparable. In order to make valid comparisons the data should be homogeneous i.e., they should relate to the same phenomenon or subject.

5. From the definition of Horace Secrist and its discussion in remark 4 above, we may conclude that : “All Statistics are numerical statements of facts but all numerical statements of facts are not Statistics”.

6. We give below the definitions of Statistics used in singular sense i.e., Statistics as Statistical Methods.

WHAT THEY SAY ABOUT STATISTICS — SOME DEFINITIONS
“STATISTICS AS STATISTICAL METHODS”

1. Statistics may be called the science of counting. —Bowley A.L.

2. Statistics may rightly be called the science of averages. —Bowley A.L.

3. Statistics is the science of the measurement of social organism, regarded as a whole in all its manifestations. —Bowley A.L.

4. “Statistics is the science of estimates and probabilities.” —Boddington

5. “The science of Statistics is the method of judging collective, natural or social phenomenon from the results obtained from the analysis or enumeration or collection of estimates.” —King

6. “Statistics is the science which deals with classification and tabulation of numerical facts as the basis for explanation, description and comparison of phenomenon.” —Lovitt

7.
“Statistics is the science which deals with the methods of collecting, classifying, presenting, comparing and interpreting numerical data collected to throw some light on any sphere of enquiry.”—Seligman

8. “Statistics may be defined as the science of collection, presentation, analysis and interpretation of numerical data.” —Croxton and Cowden

9. “Statistics may be regarded as a body of methods for making wise decisions in the face of uncertainty.”—Wallis and Roberts

10. “Statistics is a method of decision making in the face of uncertainty on the basis of numerical data and calculated risks.”—Prof. Ya-Lun-Chou

11. “The science and art of handling aggregate of facts—observing, enumeration, recording, classifying and otherwise systematically treating them.”—Harlow

Some Comments and Remarks.

1. The first three definitions due to Bowley are inadequate.

2. Boddington’s definition also fails to describe the meaning and functions of Statistics since it is confined to only probabilities and estimates which form only a part of the modern statistical tools and do not describe the science of Statistics in all its manifestations.

3. King’s definition is also inadequate since it confines Statistics only to social sciences. Lovitt’s definition is fairly satisfactory, though incomplete. Seligman’s definition, though very short and simple, is quite comprehensive. However, the best of all the above definitions seems to be the one given by Croxton and Cowden.

4. Wallis and Roberts’ definition is quite modern since statistical methods enable us to arrive at valid decisions. Prof. Chou’s definition in number 10 is a modified form of this definition.

5.
Harlow’s definition describes Statistics both as a science and an art : a science, since it provides tools and laws for the analysis of the numerical information collected from the source of enquiry, and an art, since it undeniably has its basis upon numerical data collected with a view to maintain a particular balance and consistency leading to perfect or nearly perfect conclusions. A statistician, like an artist, will fail in his job if he does not possess the requisite skill, experience and patience while using statistical tools for any problem.

1·3. IMPORTANCE AND SCOPE OF STATISTICS

In ancient times Statistics was regarded only as the science of Statecraft and was used to collect information relating to crimes, military strength, population, wealth, etc., for devising military and fiscal policies. But with the concept of the Welfare State taking root almost all over the world, the scope of Statistics has widened to social and economic phenomena. Moreover, with the developments in statistical techniques during the last few decades, today Statistics is viewed not as a mere device for collecting numerical data but as a body of sound techniques for their handling and analysis and for drawing valid inferences from them. Accordingly, it is not merely a by-product of the administrative set up of the State but it embraces all sciences (social, physical and natural) and is finding numerous applications in various diversified fields such as agriculture, industry, sociology, biometry, planning, economics, business, management, psychometry, insurance, accountancy and auditing, and so on. It is almost impossible to think of any sphere of human activity where Statistics does not creep in. It will not be an exaggeration to say that Statistics has assumed unprecedented dimensions these days and statistical thinking is becoming more and more indispensable every day for able citizenship.
In fact, to a very striking degree, modern culture has become a statistical culture and the subject of Statistics has made tremendous progress in the recent past, so much so that an elementary knowledge of statistical methods has become a part of the general education in the curricula of many universities all over the world. The importance of Statistics is amply explained in the following words of Carroll D. Wright (1887), United States Commissioner of the Bureau of Labour : “To a very striking degree our culture has become a Statistical culture. Even a person who may never have heard of an index number is affected…by … of those index numbers which describe the cost of living. It is impossible to understand Psychology, Sociology, Economics, Finance or a Physical Science without some general idea of the meaning of an average, of variation, of concomitance, of sampling, of how to interpret charts and tables.” There is no ground for misgivings regarding the practical realisation of the dream of H.G. Wells viz., “Statistical thinking will one day be as necessary for effective citizenship as the ability to read and write.” Statistics has become so indispensable in all phases of human endeavour that it is often remarked, “Statistics is what statisticians do” and it appears that Bowley was right when he said, “A knowledge of Statistics is like a knowledge of foreign language or of algebra; it may prove of use at any time under any circumstances.” Let us now discuss briefly the importance of Statistics in some different disciplines.

Statistics in Planning. Statistics is indispensable in planning, be it at the business, economic or government level. The modern age is termed the ‘age of planning’ and almost all organisations in government, business or management are resorting to planning for efficient working and for formulating policy decisions.
To achieve this end, the statistical data relating to production, consumption, prices, investment, income, expenditure and so on and the advanced statistical techniques such as index numbers, time series analysis, demand analysis and forecasting techniques for handling such data are of paramount importance. Today efficient planning is a must for almost all countries, particularly the developing economies, for their economic development, and in order that planning is successful, it must be based on a correct and sound analysis of complex statistical data. For instance, in formulating a five-year plan, the government must have an idea of the age and sex-wise break up of the population projections of the country for the next five years in order to develop its various sectors like agriculture, industry, textiles, education and so on. This is achieved through the powerful statistical tool of forecasting by making use of the population data for the previous years. Even for making decisions concerning the day to day policy of the country, an accurate statistical knowledge of the age and sex-wise composition of the population is imperative for the government. In India, the use of Statistics in planning was well visualised long back and the National Sample Survey (N.S.S.) was primarily set up in 1950 for the collection of statistical data for planning in India.

Statistics in State. As has already been pointed out, in the old days Statistics was the science of Statecraft and its objective was to collect data relating to manpower, crimes, income and wealth, etc., for formulating suitable military and fiscal policies.
With the inception of the idea of Welfare State and its taking deep roots in almost all the countries, today statistical data relating to prices, production, consumption, income and expenditure, investments and profits, etc., and statistical tools of index numbers, time series analysis, demand analysis, forecasting, etc., are extensively used by the governments in formulating economic policies. (For details see Statistics in Economics). Moreover as pointed out earlier (Statistics in planning), statistical data and techniques are indispensable to the government for planning future economic programmes. The study of population movement i.e., population estimates, population projections and other allied studies together with birth and death statistics according to age and sex distribution provide any administration with fundamental tools which are indispensable for overall planning and evaluation of economic and social development programmes. The facts and figures relating to births, deaths and marriages are of extreme importance to various official agencies for a variety of administrative purposes. Mortality (death) statistics serve as a guide to the health authorities for sanitary improvements, improved medical facilities and public cleanliness. The data on the incidence of diseases together with the number of deaths by age and nature of diseases are of paramount importance to health authorities in taking appropriate remedial action to prevent or control the spread of the disease. The use of statistical data and statistical techniques is so wide in government functioning that today, almost all ministries and the departments in the government have a separate statistical unit. In fact, today, in most countries the State (government) is the single unit which is the biggest collector and user of statistical data. 
In addition to the various statistical bureaux in all the ministries and the government departments in the Centre and the States, the main Statistical Agencies in India are Central Statistical Organisation (C.S.O.) ; National Sample Survey (N.S.S.), now called National Sample Survey Organisation (N.S.S.O.) and the Registrar General of India (R.G.I.).

Statistics in Economics. In the old days, Economic Theories were based on deductive logic only. Moreover, the statistical techniques were not that much advanced for applications in other disciplines. It gradually dawned upon economists of the Deductive School to use Statistics effectively by making empirical studies. In 1871, W.S. Jevons wrote that : “The deductive science of economy must be verified and rendered useful from the purely inductive science of Statistics. Theory must be invested with the reality of life and fact.” These views were supported by Roscher, Knies and Hildebrand of the Historical School (1843 – 1883), Alfred Marshall, Pareto and Lord Keynes. The following quotation due to Prof. Alfred Marshall in 1890 amply illustrates the role of Statistics in Economics : “Statistics are the straws out of which I, like every other economist, have to make bricks.” Statistics plays a very vital role in Economics, so much so that in 1926, Prof. R.A. Fisher complained of “the painful misapprehension that Statistics is a branch of Economics.”

Statistical data and advanced techniques of statistical analysis have proved immensely useful in the solution of a variety of economic problems such as production, consumption, distribution of income and wealth, wages, prices, profits, savings, expenditure, investment, unemployment, poverty, etc. For example, the studies of consumption statistics reveal the pattern of the consumption of the various commodities by different sections of the society and also enable us to have some idea about their purchasing capacity and their standard of living.
The studies of production statistics enable us to strike a balance between supply and demand which is provided by the laws of supply and demand. The income and wealth statistics are mainly helpful in reducing the disparities of income. The statistics of prices are needed to study the price theories and the general problem of inflation through the construction of the cost of living and wholesale price index numbers. The statistics of market prices, costs and profits of different individual concerns are needed for the studies of competition and monopoly. Statistics pertaining to some macro-variables like production, income, expenditure, savings, investments, etc., are used for the compilation of National Income Accounts which are indispensable for economic planning of a country. Exchange statistics reflect upon the commercial development of a nation and tell us about the money in circulation and the volume of business done in the country. Statistical techniques have also been used in determining the measures of Gross National Product and Input-Output Analysis. The advanced and sound statistical techniques have been used successfully in the analysis of cost functions, production functions and consumption functions. Use of Statistics in Economics has led to the formulation of many economic laws some of which are mentioned below for illustration : A detailed and systematic study of the family budget data which gives a detailed account of the family budgets showing expenditure on the main items of family consumption together with family structure and composition, family income and various other social, economic and demographic characteristics led to the famous Engel’s Law of Consumption in 1895. Vilfredo Pareto in 19th-20th century propounded his famous Law of Distribution of Income by making an empirical study of the income data of various countries of the world at different times. 
The study of the data pertaining to the actual observation of the behaviour of buyers in the market resulted in the Revealed Preference Analysis of Prof. Samuelson.

Time Series Analysis, Index Numbers, Forecasting Techniques and Demand Analysis are some of the very powerful statistical tools which are used extensively in the analysis of economic data and also for economic planning. For instance, time series analysis is extensively used in Business and Economic Statistics for the study of the series relating to prices, production and consumption of commodities, money in circulation, bank deposits and bank clearings, sales in a departmental store, etc., (i) to identify the forces or components at work, the net effect of whose interaction is exhibited by the movement of the time series; (ii) to isolate, study, analyse and measure them independently.

The index numbers, which are also termed ‘economic barometers’, are the numbers which reflect the changes over a specified period of time in (i) prices of different commodities, (ii) industrial/agricultural production, (iii) sales, (iv) imports and exports, (v) cost of living, etc., and are extremely useful in economic planning. For instance, the cost of living index numbers are used for (i) the calculation of real wages and for determining the purchasing power of money; (ii) the deflation of income and value series in national accounts; (iii) grant of dearness allowance (D.A.) or bonus to the workers in order to enable them to meet the increased cost of living and so on.

The demand analysis consists in making an economic study of the market data to determine the relation between : (i) the prices of a given commodity and its absorption capacity for the market i.e., demand; and (ii) the price of a commodity and its output i.e., supply.

Forecasting techniques based on the method of curve fitting by the principle of least squares and exponential smoothing are indispensable tools for economic planning.
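The use of cost of living index numbers for calculating real wages and the purchasing power of money can be illustrated by a short sketch. It assumes the usual deflation rule, real wage = (money wage ÷ cost of living index) × 100, with the index expressed on base 100. The function names and the wage and index figures below are hypothetical and are chosen only for illustration.

```python
def real_wages(money_wages, cost_of_living_index):
    """Deflate money wages by a cost of living index (base period = 100).

    Assumed rule: real wage = money wage / index * 100.
    """
    if len(money_wages) != len(cost_of_living_index):
        raise ValueError("wages and index must cover the same periods")
    return [w / i * 100 for w, i in zip(money_wages, cost_of_living_index)]


def purchasing_power(index_value):
    """Purchasing power of one unit of money relative to the base period."""
    return 100 / index_value


# Hypothetical figures for three successive years (illustration only).
money = [4000.0, 4400.0, 5000.0]   # money wages
index = [100.0, 110.0, 125.0]      # cost of living index, base year = 100

real = real_wages(money, index)
power = [purchasing_power(i) for i in index]

for year, (m, r, p) in enumerate(zip(money, real, power), start=1):
    print(f"Year {year}: money wage {m:.0f}, real wage {r:.0f}, "
          f"purchasing power {p:.2f}")
```

With these figures money wages rise by 25 per cent over the period, but real wages do not change, because the index rises in the same proportion.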
The increasing interaction of mathematics and statistics with economics led to the development of a new discipline called Econometrics, and the first Econometric Society was founded in U.S.A. in 1930 for “the advancement of economic theory in its relation to mathematics and statistics…” Econometrics aimed at making Economics a more realistic, precise, logical and practical science. Econometric models based on sound statistical analysis are used for maximum exploitation of the available resources. In other words, an attempt is made to obtain optimum results subject to a number of constraints on the resources at our disposal, say, of production capacity, capital, technology, precision, etc., which are determined statistically.
Statistics in Business and Management.
Prior to the Industrial Revolution, when production was at the handicraft stage, business activities were very limited and were confined to small units operating in their own areas. The owner of the concern personally looked after all the departments of business activity like sales, purchase, production, marketing, finance and so on. But after the Industrial Revolution, business activities have grown to such unprecedented dimensions, both in size and in market competition, that the activities of most business enterprises and firms are no longer confined to one particular locality, town or place but extend over larger areas. Some of the leading houses have a network of business activities in almost all the leading towns and cities of the country and even abroad. Accordingly, it is impossible for a single person (the owner of the concern) to look after its activities, and management has become a specialised job. A manager and a team of management executives are imperative for the efficient handling of the various operations like sales, purchase, production, marketing, control, finance, etc., of the business house.
It is here that statistical data and the powerful statistical tools of probability, expectation, sampling techniques, tests of significance, estimation theory, forecasting techniques and so on play an indispensable role. According to Wallis and Roberts : “Statistics may be regarded as a body of methods for making wise decisions in the face of uncertainty.” A refinement over this definition is provided by Prof. Ya-Lun-Chou as follows : “Statistics is a method of decision making in the face of uncertainty on the basis of numerical data and calculated risks.” These definitions reflect the applications of Statistics in business, since modern business has its roots in the accuracy and precision of the estimates and statistical forecasts regarding the future demand for the product, market trends and so on. Business forecasting techniques which are based on the compilation of useful statistical information on lead and lag indicators are very useful for obtaining estimates which serve as a guide to future economic events. Wrong expectations, which might be the result of faulty and inaccurate analysis of the various factors affecting a particular phenomenon, might lead the businessman to disaster.
The time series analysis is a very important statistical tool which is used in business for the study of : (i) Trend (by the method of curve fitting by the principle of least squares) in order to obtain the estimates of the probable demand for the goods; and (ii) Seasonal and Cyclical movements in the phenomenon, for determining the ‘Business Cycle’ which may also be termed the four-phase cycle composed of prosperity (period of boom), recession, depression and recovery. The upswings and downswings in business depend on the cumulative nature of the economic forces (affecting the equilibrium of supply and demand) and the interaction between them.
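As a rough illustration of how the seasonal movement in a business series may be measured, the sketch below computes seasonal indices by the method of simple averages. The quarterly sales figures and the function name are hypothetical, and the method shown is only one of several possible ones; it assumes that trend and cyclical effects are negligible, which is rarely true in practice.

```python
# Hypothetical quarterly sales for three years (illustrative figures only).
quarterly_sales = {
    2010: [30, 40, 36, 34],
    2011: [34, 52, 50, 44],
    2012: [40, 58, 54, 48],
}


def seasonal_indices(data):
    """Seasonal indices by the method of simple averages.

    Each quarter's average over the years is expressed as a percentage
    of the average of all the quarterly averages.
    """
    rows = list(data.values())
    n_seasons = len(rows[0])
    # Average of each quarter across all the years.
    season_means = [
        sum(row[q] for row in rows) / len(rows) for q in range(n_seasons)
    ]
    grand_mean = sum(season_means) / n_seasons
    return [100 * mean / grand_mean for mean in season_means]


indices = seasonal_indices(quarterly_sales)
for quarter, index in enumerate(indices, start=1):
    print(f"Quarter {quarter}: seasonal index = {index:.2f}")
```

An index above 100 marks a quarter in which sales are typically above the yearly average, and the indices for the four quarters add up to 400.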
Most of the business and commercial series, e.g., series relating to prices, production, consumption, profits, investments, wages, etc., are affected to a great extent by business cycles. Thus the study of business cycles is of paramount importance in business, and a businessman who ignores the effects of booms and depressions is bound to fail since his estimates and forecasts will be faulty. The studies of Economic Barometers (Index Numbers of Prices) enable the businessman to have an idea about the purchasing power of money. The statistical tools of demand analysis enable the businessman to strike a balance between supply and demand. [For details, see Statistics in Economics]. The technique of Statistical Quality Control, through the powerful tools of ‘Control Charts’ and ‘Inspection Plans’, is indispensable to any business organisation for ensuring that the quality of the manufactured product is in conformity with the consumer’s specifications. (For details see Statistics in Industry).
Statistical tools are used widely by business enterprises for the promotion of new business. Before embarking upon any production process, the business house must have an idea about the quantum of the product to be manufactured, the amount of the raw material and labour needed for it, the quality of the finished product, marketing avenues for the product, the competitive products in the market and so on. Thus the formulation of a production plan is a must, and this cannot be achieved without collecting statistical information on the above items by resorting to the powerful technique of ‘Sample Surveys’. As such, most of the leading business and industrial concerns have full-fledged statistical units with trained and efficient statisticians for formulating such plans and arriving at valid decisions in the face of uncertainty with calculated risks.
These units also carry on research and development programmes for the improvement of the quality of the existing products (in the light of the competitive products in the market), introduction of new products and optimisation of the profits with existing resources at their disposal. Statistical tools of probability and expectation are extremely useful in Life Insurance which is one of the pioneer branches of Business and Commerce to use Statistics since the end of the seventeenth century. Statistical techniques have also been used very widely by business organisations in : (i) Marketing Decisions (based on the statistical analysis of consumer preference studies – demand analysis). (ii) Investment (based on sound study of individual shares and debentures). (iii) Personnel Administration (for the study of statistical data relating to wages, cost of living, incentive plans, effect of labour dispute/unrest on the production, performance standards, etc.). (iv) Credit policy. (v) Inventory Control (for co-ordination between production and sales). (vi) Accounting (for evaluation of the assets of the business concerns). (For details see Statistics in Accountancy and Auditing). (vii) Sales Control (through the statistical data pertaining to market