Questions and Answers
Which of the following best describes statistics?
- Data that is only applicable to mathematics.
- Exclusively quantitative data.
- Both quantitative data, and a method of dealing with quantitative information. (correct)
- A method of dealing with qualitative data.
How did Adolphe Quetelet contribute to the field of statistics?
- Applied statistical methods to solve problems in biology, medicine, and sociology. (correct)
- Pioneered regression analysis.
- Founded a statistical laboratory in England.
- Discovered the Chi-square test.
What is the primary difference between descriptive and inferential statistics?
- Descriptive statistics summarize and present data, while inferential statistics draw conclusions about a population based on a sample. (correct)
- Descriptive statistics are more complex than inferential statistics.
- Descriptive statistics are used to draw conclusions about a population, while inferential statistics organize and summarize data.
- Inferential statistics are used only in biological sciences, while descriptive statistics are used in all fields.
Which of the following is an example of a qualitative variable?
How does a 'parameter' differ from a 'statistic' in statistics?
Which of the following statements is true regarding primary data?
What is a key advantage of using indirect observation for collecting data?
Which of the following is a primary consideration when framing a questionnaire for data collection?
What is the main disadvantage of using secondary data?
Which of the following describes 'ungrouped data'?
Which of the following is the most important reason for data classification?
What is the purpose of 'captions' and 'stubs' in a statistical table?
What characteristic defines a 'simple table' in statistics?
What is the primary purpose of creating a frequency distribution?
What differentiates 'grouped' from 'ungrouped' frequency distribution?
What is the 'class interval' in the context of frequency distribution?
When constructing a frequency distribution, what is the main guideline for determining the number of classes?
In a statistical table, what information does the 'head note' typically provide?
What is the key advantage of using a chart or diagram to present data, rather than a table?
What are the key components of a graph?
In a graph, where are positive data values typically measured in relation to the origin?
Which of the following is the primary difference between a histogram and a frequency polygon?
Which type of chart is best suited for showing the components of a whole, where each component is represented as a percentage?
Which type of diagram is most suitable for representing geographical data like rainfall distribution across a country?
What is a key limitation of using diagrams for data presentation?
In statistics, what is the difference between 'census method' and 'sampling method'?
Which of the following statements best describes a ‘random sample’?
What is the first step in selecting a systematic sample?
What is the main purpose of stratification in stratified random sampling?
How does cluster sampling differ from stratified sampling?
Which of the following best describes 'multistage sampling'?
Which of the following is an example of convenience sampling?
What is the primary goal of 'purposive' or 'judgment' sampling?
What is the purpose of calculating an ‘average’ in statistics?
Which is a key objective of calculating averages?
What characterizes a 'mathematical average'?
Why is a 'weighted mean' used instead of a simple arithmetic mean in some circumstances?
Which of the following statements makes the arithmetic mean a poor measure?
Which of the following describes the geometric mean appropriately?
Harmonic Mean is useful in all of the following circumstances EXCEPT
Median is preferred over mean because
For finding the typical value, which measure will be the most useful?
Flashcards
What does statistics involve?
Refining numerical and non-numerical 'data' into usable 'information'.
Origin of the word 'Statistics'?
Italian “Statistica,” Latin “Status,” German “Statistik,” French “Statistique,” meaning 'political state' or 'government'.
What is bio-statistics?
Term for statistics applied to biological sciences.
What type of statistics was used in India before 300 BC?
Who is Captain John Graunt?
Who was Francis Galton?
What are the functions of statistics?
What is descriptive statistics?
What is inferential statistics?
What is a population?
What is a sample?
What is a variable?
Quantitative variable
Qualitative variable
Discrete variable
Continuous variable
What is a constant?
What is a parameter?
What is a statistic?
What does the term primary data describe?
Secondary data?
Classification
Numerical classification
Descriptive classification
Spatial classification
Temporal classification
What are the objectives of tabulation?
Table basics
What is a simple table?
Complex Table
What does the term 'Essentials of a good Table' mean?
Frequency Distribution?
Ungrouped Frequency Distribution
Grouped Frequency Distribution?
Relative Frequency
Percentage Frequency
Frequency distribution?
Cumulative Frequency Distribution
Less than cumulative frequency?
More than cumulative frequency
Data Presentation
Study Notes
- Statistics involves refining numerical and non-numerical data into usable information.
- The word "statistics" may originate from Italian, Latin, German, or French terms meaning 'political state' or 'government'.
- Biostatistics applies quantitative data and statistical methods to the biological sciences.
Origin and Growth of Statistics
- Statistics is an age-old subject used by early humans to record populations, births, deaths, taxes, and crop production.
- In India, a system for collecting official and administrative statistics existed more than 2000 years ago, during Chandragupta Maurya's reign.
- Collection of "Vital Statistics" and registration of births/deaths was in vogue even before 300 BC.
- During Akbar's reign (1556-1605 A.D), Todarmal maintained land and agricultural statistics records.
- A census of lands occurred in Egypt around 1400 BC.
- Captain John Graunt (1620-1674) studied birth and death statistics and is known as "Father of Vital Statistics".
- Theoretical developments in modern statistics began in the mid-17th century with the introduction of probability theory and the theory of games of chance.
- Gottfried Achenwall first used the term "statistics" in 1749, defining it as 'the political science of the several countries'.
- Adolphe Quetelet (1796-1874) used statistical methods to solve problems in biology, medicine, and sociology.
- Francis Galton (1822-1911) pioneered statistical methods in biometry and is called the father of biostatistics.
- Karl Pearson (1857-1936) founded the statistical laboratory in England (1911) and laid foundations for descriptive and correlational statistics.
- W.S. Gosset discovered Student's "t" distribution in 1908.
- Ronald A. Fisher (1890–1962) applied statistics to diversified fields such as genetics, biometry, education, and agriculture.
Definition of Statistics
- Statistics involves statistical data (plural sense) and statistical methods (singular sense).
- Bowley defined statistics as the science of counting, averages, and measurement.
- Boddington defined statistics as the science of estimates and probabilities.
- Lovitt defined statistics as the science dealing with the collection, classification, and tabulation of numerical facts.
- ‘Statistics is the study of methods and procedures for collection, classification, presentation, analysis and interpretation of data to make scientific inferences from it’.
- Bio-statistics applies statistical methods to biological sciences problems.
Functions of Statistics
- Statistics presents facts in a definite form.
- Statistics simplifies masses of figures.
- Statistics facilitates data comparison.
- Statistics formulates and tests hypotheses.
- Statistics helps in prediction.
- Statistics helps in formulation of policies.
Applications of Statistics
- Statistics is a tool for decision-making, especially in uncertain situations.
- Statistics is used across a spectrum of subjects and fields including physics, chemistry, biology, education, medicine, economics, engineering, and sociology.
- Statistical methods have seen increased usage due to the availability of high-speed computers with large storage and processing capabilities.
Broad Classification of Statistics
- Statistics is classified into descriptive and inferential statistics.
- Descriptive Statistics: Collection, organization, summarization, analysis, and presentation of data in a convenient form.
- Inferential Statistics: A body of methods and procedures for drawing conclusions about population characteristics based on a sample.
Concepts Used in Statistics
- Population: The complete set of individuals or objects studied in a statistical enquiry, also called the universe.
- Sample: A small group or portion of a population selected for investigation.
- Variable: Quantitative (numerical) or qualitative (categorical) characteristics of data.
- Quantitative variable: Characteristic measured on a scale.
- Qualitative variable: Expressed in qualities called attributes.
- Discrete variable: A quantitative variable that takes only integer (whole-number) values.
- Continuous variable: A quantitative variable that can take any value within a range, including fractional values.
- Constant: A numerical value that remains the same across the population.
- Parameter: A descriptive measure of a population.
- Statistic: A descriptive measure of a sample.
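The parameter/statistic distinction can be sketched in a few lines of Python. The population below is simulated and purely hypothetical (the milk-yield figures, the seed, and the sample size of 50 are assumptions for illustration, not values from these notes).

```python
import random
import statistics

# Hypothetical population: daily milk yield (litres) for 1,000 cows.
random.seed(42)
population = [random.gauss(12.0, 2.5) for _ in range(1000)]

# Parameter: a descriptive measure computed from the whole population.
population_mean = statistics.mean(population)

# Statistic: the same measure computed from a sample drawn from it.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.2f}")
print(f"Statistic (sample mean):     {sample_mean:.2f}")
```

The two values will usually be close but not identical; inferential statistics is concerned with how far a statistic can be trusted as an estimate of the parameter.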
Collection of Data
- Observations are raw materials that researchers handle.
- Observations in animal husbandry experiments include body weight and milk yield.
Primary vs Secondary Data
- A statistical investigation starts with data collection, either directly or from available records.
- Data collected by the investigator directly from the sample is primary data.
- Primary data can be gathered through personal surveys, investigators, or questionnaires distributed and collected by post.
- The primary source is where primary data is collected from.
Methods of Collecting Primary Data
- Direct personal observation: the investigator personally collects data from the sources, using an interview schedule.
- Indirect observation involves trained investigators/enumerators collecting data and submitting schedules to the chief investigator.
- Data collection via questionnaires involves preparing questionnaires, sending them to respondents, including a letter explaining the data collection's aim, and a stamped return envelope.
- Questions in a questionnaire should be simple, short, easy to understand, and logically arranged.
Merits and Demerits of Different Methods of Primary Data Collection
Direct Personal Observation:
- Merits: First-hand data, reliable, accurate, good response, can be used even if respondents are illiterate.
- Demerits: Costly in time and money, not suitable for big surveys.
Indirect Personal Observation:
- Merits: Saves time, useful in large-scale surveys, yields reliable results, can be used even if the respondents are illiterate.
- Demerits: Requires knowledgeable enumerators, requires more money and time.
Data Collection Through Questionnaires:
- Merits: Economical, good for extensive inquiry, eliminates personal biases.
- Demerits: Poor response, possibility of incomplete/ inaccurate data, not suited to illiterate respondents.
Secondary Data
- Secondary data comes from available print and non-print sources like reports, journals, CDs, and websites; collected from a secondary source.
- Secondary data sources include publications from national and international agencies, government, research organizations, educational institutions, books, journals, newspapers, and private firms.
- While primary data is collected for a specific purpose, secondary data is taken from sources that were compiled for some other purpose.
Merits of Secondary Data
- Convenient and quicker to use.
- Saves time, labour and money.
- Sometimes the only possibility.
Demerits of Secondary Data
- Data may not be accurate or may be missing.
- Data may not be reliable.
Classification of Data
- Classification of data is the next step after data collection.
- It involves arranging the primary data, which may initially be crude and unorganized, in a definite pattern and presenting it in a systematic way.
- Ungrouped data is hard to handle, and it is difficult to study and draw inferences from.
Objectives of Classification
- Remove unnecessary details.
- Better understand the data.
- Detect errors in the data.
- Bring out explicitly significant features in the data
- Enable quick comparisons and drawing inferences.
Types of Classification
- Numerical Classification (by size): Data classification by quantitative characteristics.
- Descriptive Classification (by attribute): Classification by qualitative characteristics.
- Spatial Classification (by space): Classification of data by location of occurrence
- Temporal Classification (by time): Classification of data by time of occurrence
Tabulation of Data
- Aims to simplify complex data.
- Economizes space and depicts trend.
- Facilitates comparison.
- Aids in statistical analysis.
Parts of a Table
- Table number, title, caption, stub, body, head note, and footnote.
- The table number appears in a numbered sequence.
- The title should be brief and self-explanatory.
- Captions and stubs are column and row headings.
- The body of the table contains numerical data.
- The head note explains the contents of the table.
- The footnote gives specific information about the content in the table's body.
General Structure of a Table
- Includes the table number, title, head note (if any), stub heading, main caption, sub-caption, tertiary items, and footnotes.
Types of Tables
- Simple table: Based on one quality or characteristic.
- Complex table: Formed on the basis of more than one quality or characteristic.
Types of Characteristics
- Two qualities make a two-way table.
- Three qualities make a three-way table.
- More than three qualities make a manifold table.
Essentials of a Good Table
- Designed according to the objective.
- Should not be overloaded with detail.
- Attractive and well balanced.
- Complete in itself.
- Units of measurement stated clearly.
- Appropriate size.
Classification of Data According to Class Interval (Frequency Distribution)
- Frequency distribution: simple and effective method of organising and presenting numerical data
- Frequency distribution is a table that groups data into non-overlapping intervals called classes.
Class Interval
- A class interval is the range of values, between a lower and an upper limit, that defines a class.
- Each class interval has two limits: upper and lower.
- The width (or length) of a class interval is the difference between the upper and lower boundaries of the class.
- Frequency: the number of observations in each class.
Formation of Frequency Distribution
Ungrouped Frequency Distribution
- Observations are arranged systematically, and frequency distribution for each value is obtained.
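A minimal sketch of an ungrouped frequency distribution, using Python's `collections.Counter`. The litter-size observations are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical observations: litter size recorded for 12 animals.
litter_sizes = [3, 5, 4, 3, 6, 4, 4, 5, 3, 4, 6, 4]

# Count how often each distinct value occurs.
frequency = Counter(litter_sizes)

# List the values in ascending order with their frequencies.
for value in sorted(frequency):
    print(value, frequency[value])
```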
Grouped Frequency Distribution
- Formed by the method of tally marks after the number of classes and the class interval have been chosen.
- There is no strict rule for the number of classes; it is decided by looking at the data.
- The classes should cover the entire range of the data and must be continuous.
- Classes should be neither too large nor too small; the number of classes usually lies between 5 and 20.
- A smaller or larger number of classes is used depending on the dataset.
Choosing the Class Interval
- The class interval is the range of values between the lower and upper limits of a class.
- Classes should preferably be of equal width, with a size that brings out the characteristic features of the data.
- To find the width, divide the difference between the maximum and minimum values in the data by the number of classes decided above.
Rules to Determine Number of Required Classes
- The number of classes can be calculated using the formula suggested by either Sturges' rule or Yule's rule:
- By Sturges' rule, K = 1 + 3.322 × log10(N); by Yule's rule, K = 2.5 × N^(1/4), where K is the number of required classes and N is the number of observations.
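The two rules and the class-width calculation can be sketched as follows. The function names are hypothetical, and rounding K up to the next whole number is an assumption (the notes do not say how to round).

```python
import math

def sturges_classes(n):
    """Number of classes by Sturges' rule: K = 1 + 3.322 * log10(N)."""
    return math.ceil(1 + 3.322 * math.log10(n))  # rounding up is an assumption

def yule_classes(n):
    """Number of classes by Yule's rule: K = 2.5 * N**(1/4)."""
    return math.ceil(2.5 * n ** 0.25)  # rounding up is an assumption

def class_width(values, k):
    """Class width: (maximum - minimum) / number of classes."""
    return (max(values) - min(values)) / k

n = 100
print("Sturges:", sturges_classes(n))
print("Yule:   ", yule_classes(n))
```

For N = 100, Sturges' rule gives 1 + 3.322 × 2 = 7.644 and Yule's rule gives about 7.9, so both suggest 8 classes after rounding up.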
Types of Class Intervals
- Type "c" refers to the cases where end classes are open
- Type "d" refers to the case where there are unequal class intervals
Tally Marks
- Tallying is done after the class intervals have been formed.
- The classes are written one below the other, and the number of measurements belonging to each class is counted and recorded.
- Each observation is marked with a stroke; after four strokes, the fifth item is indicated by striking through the previous four. This is called forming a frequency distribution by the method of tally marks.
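The tallying procedure can be sketched as below. The body weights, the starting limit of 10, and the class width of 5 are hypothetical, and excluding the upper limit from each class (10 ≤ x < 15) is an assumed convention.

```python
# Hypothetical data: body weights (kg) of 20 animals.
weights = [12, 25, 31, 18, 22, 27, 35, 14, 29, 21,
           33, 16, 24, 28, 19, 37, 23, 26, 30, 20]

lower, width, k = 10, 5, 6  # classes 10-15, 15-20, ..., 35-40 (assumed)

# Count observations per class; the upper limit is excluded (10 <= x < 15).
counts = [0] * k
for w in weights:
    counts[(w - lower) // width] += 1

def tally(n):
    """Render n as tally marks, writing each complete group of five as '||||/'."""
    groups = ["||||/"] * (n // 5)
    if n % 5:
        groups.append("|" * (n % 5))
    return " ".join(groups)

for i, f in enumerate(counts):
    lo = lower + i * width
    print(f"{lo}-{lo + width}  {tally(f):<12} {f}")
```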
Array Method
- An array shows the orderly arrangement of the data by magnitude in ascending/descending order.
- We form the class interval as in the previous method.
- This method is not easy when the number of observations is large. It can be adopted when the number of observations is less than 50.
Advantage of Presenting the Data on a Table of Frequency Distribution
- The data is presented economically in tabular form.
- The tabular method offers clear expressions.
- The form facilitates data comparison.
- The collected information is summarized in an attractive manner.
Relative Frequency Distribution
- The relative frequency is the fraction (proportion) of the total items belonging to the class.
- Relative frequency of a class = Frequency of the class / Total number of observations = Class frequency / Total frequency.
Percentage Frequency
- Percentage frequency is calculated by multiplying relative frequency by 100.
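A short sketch of both calculations, using hypothetical class frequencies (five classes, 20 observations in total):

```python
# Hypothetical class frequencies for five classes.
frequencies = [2, 4, 7, 5, 2]
total = sum(frequencies)

# Relative frequency = class frequency / total frequency.
relative = [f / total for f in frequencies]

# Percentage frequency = relative frequency * 100.
percentage = [r * 100 for r in relative]

for f, r, p in zip(frequencies, relative, percentage):
    print(f"{f:>3}  {r:.2f}  {p:.0f}%")
```

The relative frequencies always sum to 1 and the percentage frequencies to 100, which is a quick check on the arithmetic.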
Cumulative Frequency Distribution
- The cumulative frequency distribution gives the total number of observations lying below the upper class boundary of a given class.
Two Types of Cumulative Frequency Distributions
- Less than cumulative frequency distribution: obtained by successively adding the frequencies of all previous classes, including the class against which the total is written.
- More than cumulative frequency distribution: obtained by finding the cumulative total of frequencies starting from the highest class and moving to the lowest.
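Both cumulative types can be sketched with `itertools.accumulate`. The class frequencies are hypothetical, listed from the lowest class to the highest.

```python
from itertools import accumulate

# Hypothetical class frequencies, lowest class first.
frequencies = [2, 4, 7, 5, 2]

# Less-than type: running total from the lowest class upward.
less_than = list(accumulate(frequencies))

# More-than type: running total from the highest class downward,
# then reversed so each total lines up with its own class.
more_than = list(accumulate(reversed(frequencies)))[::-1]

print("Less than:", less_than)
print("More than:", more_than)
```

The last less-than value and the first more-than value both equal the total number of observations.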
Graphical and Diagrammatic Presentation of Data
- Data collected by the investigator are usually raw and less organized. Tabular, graphic and diagrammatic forms present data in the most understandable way.
- Diagrams and graphs are drawn to present data to make it more understandable.
Graphical Presentation
- Graphs of frequency: Histogram, frequency polygon, smoothed frequency curve, Ogive (Cumulative frequency curve)
- Graphs of time series: One variable graph, 2 or more variables graph, Graph of different units
Diagrammatic Presentation
- One dimensional diagram: e.g. Line Diagram, Bar Diagram: Simple bar, Subdivided, Multiple bar, and Percentage bars
- Two-dimension diagrams: Rectangle, Square, Circle, and Pie charts
- Three-dimensional diagram, Pictogram (Ideograph), and Cartogram (Map)
Advantages of Diagrams and Graphs
- They attract the viewer's attention and can create lasting impressions.
- They save time and space.
- They facilitate comparisons.
Limitations of Diagrams and Graphs
- Approximate indicators.
- Fail to disclose small differences when large figures are involved.
Graphical Representation of Data
- A graph consists of two lines: a horizontal line (X axis/abscissa) and a vertical line (Y axis/ordinate).