Introduction to Biostatistics

Questions and Answers

Which of the following best describes statistics?

  • Data that is only applicable to mathematics.
  • Exclusively quantitative data.
  • Both quantitative data, and a method of dealing with quantitative information. (correct)
  • A method of dealing with qualitative data.

How did Adolphe Quetelet contribute to the field of statistics?

  • Applied statistical methods to solve problems in biology, medicine, and sociology. (correct)
  • Pioneered regression analysis.
  • Founded a statistical laboratory in England.
  • Discovered the Chi-square test.

What is the primary difference between descriptive and inferential statistics?

  • Descriptive statistics summarize and present data, while inferential statistics draw conclusions about a population based on a sample. (correct)
  • Descriptive statistics are more complex than inferential statistics.
  • Descriptive statistics are used to draw conclusions about a population, while inferential statistics organize and summarize data.
  • Inferential statistics are used only in biological sciences, while descriptive statistics are used in all fields.

Which of the following is an example of a qualitative variable?

Color of a flower (C)

How does a 'parameter' differ from a 'statistic' in statistics?

A parameter describes a population, while a statistic describes a sample. (A)

Which of the following statements is true regarding primary data?

Primary data are collected for a specific purpose directly from the source. (B)

What is a key advantage of using indirect observation for collecting data?

It is less time-consuming compared to direct personal observation. (B)

Which of the following is a primary consideration when framing a questionnaire for data collection?

Questions should be logically arranged and easy to understand. (A)

What is the main disadvantage of using secondary data?

The method and purpose of data collection may be unknown. (D)

Which of the following describes 'ungrouped data'?

Data in its raw and unorganized form collected from an experiment or survey. (A)

Which of the following is the most important reason for data classification?

To enable quick comparisons and drawing inferences. (C)

What is the purpose of 'captions' and 'stubs' in a statistical table?

To provide headings for columns and rows, respectively. (B)

What characteristic defines a 'simple table' in statistics?

It is based on just one quality or characteristic. (A)

What is the primary purpose of creating a frequency distribution?

To condense quantitative data in a meaningful way. (D)

What differentiates 'grouped' from 'ungrouped' frequency distribution?

Grouped distribution involves organizing data into intervals, while ungrouped lists individual values. (D)

What is the 'class interval' in the context of frequency distribution?

The range of values within which a class falls. (C)

When constructing a frequency distribution, what is the main guideline for determining the number of classes?

The number of classes should be between 5 and 20, depending on the data. (D)

In a statistical table, what information does the 'head note' typically provide?

Explanations or qualifications about the data in the table. (D)

What is the key advantage of using a chart or diagram to present data, rather than a table?

Charts and diagrams make it easier to understand complex relationships in the data. (B)

What are the key components of a graph?

X-axis (abscissa), Y-axis (ordinate), and origin. (A)

In a graph, where are positive data values typically measured in relation to the origin?

To the right of the origin on the X axis and above the origin on the Y axis. (B)

Which of the following is the primary difference between a histogram and a frequency polygon?

A histogram represents frequencies as areas of rectangles, while a frequency polygon uses points connected by lines. (D)

Which type of chart is best suited for showing the components of a whole, where each component is represented as a percentage?

Pie chart. (C)

Which type of diagram is most suitable for representing geographical data like rainfall distribution across a country?

Cartogram. (A)

What is a key limitation of using diagrams for data presentation?

Diagrams can misrepresent facts if not supported by tables. (D)

In statistics, what is the difference between 'census method' and 'sampling method'?

Census method collects data from all individuals, while sampling method collects from a portion of individuals. (C)

Which of the following statements best describes a ‘random sample’?

A sample where every element has an equal chance of inclusion. (D)

What is the first step in selecting a systematic sample?

Randomly selecting the first element. (B)

What is the main purpose of stratification in stratified random sampling?

To improve the precision of the estimate by reducing variance within each stratum. (B)

How does cluster sampling differ from stratified sampling?

Cluster sampling involves selecting all units within randomly chosen clusters, whereas stratified sampling selects units from every stratum. (D)

Which of the following best describes 'multistage sampling'?

Selecting samples in stages, where smaller sampling units are chosen from larger ones. (A)

Which of the following is an example of convenience sampling?

Selecting participants who are easily accessible to the researcher. (C)

What is the primary goal of 'purposive' or 'judgment' sampling?

To eliminate sources of data distortion based on the investigator's subjective choice. (D)

What is the purpose of calculating an ‘average’ in statistics?

To condense a dataset into a single, representative number. (C)

Which is a key objective of calculating averages?

To compare the distribution. (B)

What characterizes a 'mathematical average'?

The average is calculated from all terms in the series. (D)

Why is a weighted mean used instead of a simple arithmetic mean in some circumstances?

To adjust for relative importance when items vary in significance. (B)

Which of the following statements describes a weakness of the arithmetic mean?

Affected by fluctuations of sampling. (A)

Which of the following measures describes Geometric Mean appropriately?

Calculating percentage increase. (B)

Harmonic Mean is useful in all of the following circumstances EXCEPT

For giving same weights to bigger items. (B)

The median is preferred over the mean because

The mean is affected by extreme values, whereas the median is not. (D)

Which measure is the most useful for finding the typical value?

Mode (A)

Flashcards

What does statistics involve?

Refining numerical and non-numerical 'data' into usable 'information'.

Origin of the word 'Statistics'?

Italian “Statistica,” Latin “Status,” German “Statistik,” French “Statistique,” meaning 'political state' or 'government'.

What is bio-statistics?

Term for statistics applied to biological sciences.

What type of statistics was used in India before 300 BC?

System for collecting vital statistics and registration of births/deaths.

Who was Captain John Graunt?

First studied the statistics of births and deaths.

Who was Francis Galton?

British scientist who introduced regression, pioneered statistical methods in biometry, and is called the father of biostatistics.

What are the functions of statistics?

Presents facts, simplifies figures, facilitates comparison, tests hypotheses, helps in prediction/policies.

What is descriptive statistics?

Collecting, organising, summarising, analysing and presenting data in a convenient form.

What is inferential statistics?

Methods/procedures for drawing conclusions about population characteristics from a sample.

What is a population?

Set of individuals/objects studied through a statistical enquiry.

What is a sample?

Small group/portion of a population selected for investigation.

What is a variable?

Quantitative (numerical) or qualitative (categorical) characteristic of data.

Quantitative variable

Characteristic measured on a scale in units.

Qualitative variable

Expressed through 'attributes'.

Discrete variable

Quantitative variable with only integral values.

Continuous variable

Quantitative variable with intermediate values between integral values.

What is a constant?

Numerical value that does not vary within a population.

What is a parameter?

Descriptive measure of a population.

What is a statistic?

Descriptive measure of a sample.

What does the term primary data describe?

Data collected directly by the investigator from the source.

Secondary data?

Data collected from available sources (print & non-print).

Classification

Arranging data in a definite pattern and presenting it in a systematic way.

Numerical classification

Classification by quantitative characters (e.g. weight)

Descriptive classification

Classification by qualitative characters (e.g. breeds)

Spatial classification

Classification by location (e.g. district)

Temporal classification

Classification by the time of occurrence (e.g. different years)

What are the objectives of tabulation?

Simplify complex data, economise space, depict trends, facilitate comparison, statistical analysis

Table basics

Parts of a table: table number, title, caption, stub, body, head note, and footnote.

What is a simple table?

Basis: one quality or characteristic.

Complex Table

Formed on the basis of more than one quality or characteristic.

What does the term 'Essentials of a good Table' mean?

Should be well-balanced, clearly made

Frequency Distribution?

Arrangement/table that groups data into non-overlapping intervals (classes).

Ungrouped Frequency Distribution

Observations arranged in a systematic way, frequency obtained for each value.

Grouped Frequency Distribution?

Distribution using class intervals and tally marks to count observations.

Relative Frequency

Fraction/proportion of total items belonging to the class (frequency/total observations)

Percentage Frequency

Relative frequency multiplied by 100

Frequency?

Number of observations that lie in any class interval

Cumulative Frequency Distribution

Total frequency below or above any given value.

Less than cumulative frequency?

Adding successive frequencies of all previous classes.

More than cumulative frequency

Cumulative total of frequencies, starting from the highest class and moving to the lowest.

Data Presentation

Tabular, graphic and diagrammatic presentation of the data collected by the investigator.

Study Notes

  • Biostatistics involves refining numerical and non-numerical data into usable information.
  • The word "statistics" may originate from Italian, Latin, German, or French terms meaning 'political state' or 'government'.
  • Biostatistics applies quantitative data and statistical methods to the biological sciences.

Origin and Growth of Statistics

  • Statistics is an age-old subject used by early humans to record populations, births, deaths, taxes, and crop production.
  • In India, a system for collecting official and administrative statistics existed more than 2000 years ago, during Chandragupta Maurya's reign.
  • Collection of "Vital Statistics" and registration of births/deaths was in vogue even before 300 BC.
  • During Akbar's reign (1556-1605 AD), Todarmal maintained records of land and agricultural statistics.
  • A census of lands occurred in Egypt around 1400 BC.
  • Captain John Graunt (1620-1674) studied birth and death statistics and is known as the "Father of Vital Statistics".
  • Theoretical developments in modern statistics began during the mid-17th century with the introduction of probability theory and the theory of games of chance.
  • Gottfried Achenwall first used the term "statistics" in 1749, defining it as 'the political science of the several countries'.
  • Adolphe Quetelet (1796-1874) used statistical methods to solve problems in biology, medicine, and sociology.
  • Francis Galton (1822-1911) pioneered statistical methods in biometry and is called the father of biostatistics.
  • Karl Pearson (1857-1936) founded the statistical laboratory in England (1911) and laid foundations for descriptive and correlational statistics.
  • W.S. Gosset discovered Student's "t" distribution in 1908.
  • Ronald A. Fisher (1890–1962) applied statistics to diversified fields such as genetics, biometry, education, and agriculture.

Definition of Statistics

  • Statistics involves statistical data (plural sense) and statistical methods (singular sense).
  • Bowley defined statistics as the science of counting, averages, and measurement.
  • Boddington defined statistics as the science of estimates and probabilities.
  • Lovitt defined statistics as the science dealing with the collection, classification, and tabulation of numerical facts.
  • ‘Statistics is the study of methods and procedures for collection, classification, presentation, analysis and interpretation of data to make scientific inferences from it’.
  • Bio-statistics applies statistical methods to problems in the biological sciences.

Functions of Statistics

  • Statistics presents facts in a definite form.
  • Statistics simplifies masses of figures.
  • Statistics facilitates data comparison.
  • Statistics formulates and tests hypotheses.
  • Statistics helps in prediction.
  • Statistics helps in formulation of policies.

Applications of Statistics

  • Statistics is a tool for decision-making, especially in uncertain situations.
  • Statistics is used across a spectrum of subjects and fields including physics, chemistry, biology, education, medicine, economics, engineering, and sociology.
  • Statistical methods have seen increased usage due to the availability of high-speed computers with large storage and processing capabilities.

Broad Classification of Statistics

  • Statistics is classified into descriptive and inferential statistics.
  • Descriptive Statistics: Collection, organization, summarization, analysis, and presentation of data in a convenient form.
  • Inferential Statistics: A body of methods and procedures for drawing conclusions about population characteristics based on a sample.

Concepts Used in Statistics

  • Population: A set of individuals or objects studied through a statistical enquiry, also referred to as the universe.
  • Sample: A small group or portion of a population selected for investigation.
  • Variable: Quantitative (numerical) or qualitative (categorical) characteristics of data.
  • Quantitative variable: Characteristic measured on a scale.
  • Qualitative variable: Expressed in qualities called attributes.
  • Discrete variable: A quantitative variable taking only integral values.
  • Continuous variable: A quantitative variable taking intermediate values.
  • Constant: A numerical value that remains the same across the population.
  • Parameter: A descriptive measure of a population.
  • Statistic: A descriptive measure of a sample.
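
The distinction between a parameter and a statistic can be sketched with a short Python example. The population here is simulated, and the body-weight figures, the seed, and the sample size of 30 are all assumptions made purely for illustration.

```python
import random
import statistics

# Hypothetical population: body weights (kg) of 1,000 animals, simulated here.
random.seed(42)
population = [random.gauss(350, 25) for _ in range(1000)]

# Parameter: a descriptive measure computed from the whole population.
population_mean = statistics.mean(population)

# Statistic: the same measure computed from a sample drawn from that population.
sample = random.sample(population, 30)
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.1f}")
print(f"Statistic (sample mean):     {sample_mean:.1f}")
```

The two printed values will usually be close but not identical, which is why inferential methods are needed to draw conclusions about the population from the sample.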

Collection of Data

  • Observations are raw materials that researchers handle.
  • Observations in animal husbandry experiments include body weight and milk yield.

Primary vs Secondary Data

  • A statistical investigation starts with data collection, either directly or from available records.
  • Data collected by the investigator directly from the source are primary data.
  • Primary data can be gathered through personal surveys, investigators, or questionnaires distributed and collected by post.
  • The source from which primary data are collected is called the primary source.

Methods of Collecting Primary Data

  • In direct personal observation, the investigator collects data personally from the sources, using an interview schedule.
  • In indirect observation, trained investigators/enumerators collect the data and submit the schedules to the chief investigator.
  • In data collection through questionnaires, questionnaires are prepared and sent to respondents together with a letter explaining the aim of the data collection and a stamped return envelope.
  • Questions in a questionnaire should be simple, short, easy to understand, and logically arranged.

Merits and Demerits of Different Methods of Primary Data Collection

Direct Personal Observation:

  • Merits: First-hand data, reliable, accurate, good response, can be used even if respondents are illiterate.
  • Demerits: Costly in time and money, not suitable for big surveys.

Indirect Personal Observation:

  • Merits: Saves time, useful in large-scale surveys, yields reliable results, can be used even if the respondents are illiterate.
  • Demerits: Requires knowledgeable enumerators, requires more money and time.

Data Collection Through Questionnaires:

  • Merits: Economical, good for extensive inquiry, eliminates personal biases.
  • Demerits: Poor response, possibility of incomplete or inaccurate data, not suited to illiterate respondents.

Secondary Data

  • Secondary data are collected from a secondary source, that is, from available print and non-print sources such as reports, journals, CDs, and websites.
  • Secondary data sources include publications from national and international agencies, government, research organizations, educational institutions, books, journals, newspapers, and private firms.
  • While primary data are collected for a specific purpose, secondary data are drawn from sources that were compiled for other purposes.

Merits of Secondary Data

  • Convenient and quicker to use.
  • Saves time, labour and money.
  • Sometimes the only possibility.

Demerits of Secondary Data

  • Data may not be accurate or may be missing.
  • Data may not be reliable.

Classification of Data

  • Classification of data is the next step after data collection.
  • It involves arranging the primary data, which may initially be crude and unorganized, in a definite pattern and presenting it in a systematic way.
  • Ungrouped data are hard to handle, and it may be difficult to study them and draw inferences from them.

Objectives of Classification

  • Remove unnecessary details.
  • Better understand the data.
  • Detect errors in the data.
  • Bring out explicitly significant features in the data
  • Enable quick comparisons and drawing inferences.

Types of Classification

  • Numerical Classification (by size): Data classification by quantitative characteristics.
  • Descriptive Classification (by attribute): Classification by qualitative characteristics.
  • Spatial Classification (by space): Classification of data by location of occurrence
  • Temporal Classification (by time): Classification of data by time of occurrence

Tabulation of Data

  • Aims to simplify complex data.
  • Economizes space and depicts trend.
  • Facilitates comparison.
  • Aids in statistical analysis.

Parts of a Table

  • A table consists of a table number, title, caption, stub, body, head note, and footnote.
  • The table number appears in a numbered sequence.
  • The title should be brief and self-explanatory.
  • Captions and stubs are the column and row headings, respectively.
  • The body of the table contains numerical data.
  • The head note explains the contents of the table.
  • The footnote gives specific information about the content in the table's body.

General Structure of a Table

  • Includes the table number, title, head note (if any), stub heading, main caption, sub-caption, tertiary items, and footnotes.

Types of Tables

  • Simple table: Based on one quality or characteristic.
  • Complex table: Formed on the basis of more than one quality or characteristic.

Types of Characteristics

  • A table based on two qualities is a two-way table.
  • A table based on three qualities is a three-way table.
  • A table based on more than three qualities is a manifold table.

Essentials of a Good Table

  • Designed according to the objective.
  • Should not be overloaded with detail.
  • Attractive and well balanced.
  • Complete in itself.
  • Units of measurement stated clearly.
  • Appropriate size.

Classification of Data According to Class Interval (Frequency Distribution)

  • A frequency distribution is a simple and effective method of organising and presenting numerical data.
  • Frequency distribution is a table that groups data into non-overlapping intervals called classes.

Class Interval

  • A class interval is the range of values within which the observations of a class fall.
  • Each class interval has two limits: upper and lower.
  • The width or length of the class interval is the difference between the upper and lower boundaries of the class.
  • Frequency: Number of observations in each class.

Formation of Frequency Distribution

Ungrouped Frequency Distribution

  • Observations are arranged systematically, and the frequency of each value is obtained.
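
As a minimal sketch of an ungrouped frequency distribution, the following Python snippet counts how often each distinct value occurs. The litter-size data are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical observations: litter sizes recorded for 15 animals.
litter_sizes = [3, 2, 4, 3, 3, 5, 2, 4, 3, 2, 4, 3, 5, 3, 2]

# Count how often each distinct value occurs, then list values in ascending order.
frequency = dict(sorted(Counter(litter_sizes).items()))

for value, count in frequency.items():
    print(f"{value}: {count}")
```

Each line of the output pairs one observed value with its frequency, and the frequencies add up to the number of observations.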

Grouped Frequency Distribution

  • Frequencies are counted by the method of tally marks once the number of classes has been chosen.
  • There is no strict rule for the number of classes; it is decided by looking at the data.
  • The range of classes should cover the entire range of data, and the classes must be continuous.
  • Classes should be neither too large nor too small. The number of classes is usually between 5 and 20.
  • A smaller or larger number of classes may be used depending on the dataset.

Choosing the Class Interval

  • The class interval is the range of values covered by each class.
  • Class intervals should be of equal width and of a size that displays the characteristic features of the data.
  • The width is found by dividing the difference between the maximum and minimum values in the data by the number of required classes, as decided above.
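
The width calculation described above can be sketched in Python as follows. The milk-yield figures and the choice of five classes are hypothetical, and rounding the width up to a whole number is an assumption made here for convenience rather than a rule stated in the notes.

```python
import math

# Hypothetical milk yields (litres/day) for 20 cows.
yields = [12.5, 8.0, 15.2, 10.1, 9.4, 14.8, 11.3, 13.6, 7.9, 16.0,
          10.8, 12.1, 9.9, 13.0, 11.7, 14.2, 8.6, 15.5, 12.9, 10.4]

number_of_classes = 5  # assumed to have been decided beforehand

data_range = max(yields) - min(yields)       # maximum minus minimum
raw_width = data_range / number_of_classes   # width before rounding
class_width = math.ceil(raw_width)           # rounded up to a convenient size (an assumption)

print(f"Range: {data_range:.1f}, raw width: {raw_width:.2f}, class width: {class_width}")
```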

Rules to Determine Number of Required Classes

  • The number of classes can be calculated using the formula suggested by either Sturges' rule or Yule's rule.
  • Sturges' rule: K = 1 + 3.322 × log10(N). Yule's rule: K = 2.5 × N^(1/4). Here K is the number of required classes and N is the number of observations.
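
The two rules above can be evaluated with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and how the resulting K is rounded to a whole number of classes is left to the investigator.

```python
import math

def sturges_classes(n: int) -> float:
    """Sturges' rule: K = 1 + 3.322 * log10(N)."""
    return 1 + 3.322 * math.log10(n)

def yule_classes(n: int) -> float:
    """Yule's rule: K = 2.5 * N ** (1/4)."""
    return 2.5 * n ** 0.25

n = 100  # hypothetical number of observations
print(f"Sturges: {sturges_classes(n):.2f}")
print(f"Yule:    {yule_classes(n):.2f}")
```

For 100 observations both rules suggest roughly 8 classes, which falls within the 5 to 20 range mentioned earlier.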

Different types of class intervals are followed

  • Type "c" refers to the cases where end classes are open
  • Type "d" refers to the case where there are unequal class intervals

Tally Marks

  • Tallying is done after the class intervals have been formed.
  • The classes are written one below the other, and the number of observations belonging to each class is counted and recorded.
  • After four strokes, the fifth observation is marked by a stroke drawn across the previous four. Forming a frequency distribution in this way is called the method of tally marks.
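
The tally procedure can be sketched in Python as below. The body weights and class limits are hypothetical, the helper `tally` is an illustrative name, and the snippet assumes the exclusive convention, in which an observation equal to an upper limit is counted in the next class.

```python
def tally(count: int) -> str:
    """Render a count as tally marks: groups of five, then the remainder."""
    fives, rest = divmod(count, 5)
    return " ".join(["||||/"] * fives + (["|" * rest] if rest else []))

# Hypothetical body weights (kg) and class limits chosen for illustration.
weights = [41, 47, 52, 58, 44, 49, 55, 43, 51, 46,
           59, 48, 53, 45, 50, 42, 57, 47, 54, 49]
lower, width, n_classes = 40, 5, 4   # classes 40-45, 45-50, 50-55, 55-60

counts = [0] * n_classes
for w in weights:
    # Exclusive convention assumed: the upper limit belongs to the next class.
    counts[(w - lower) // width] += 1

for i, c in enumerate(counts):
    lo = lower + i * width
    print(f"{lo}-{lo + width}: {tally(c):<12} {c}")
```

Here `||||/` stands for four strokes crossed by a fifth, so a class with seven observations is shown as `||||/ ||`.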

Array Method

  • An array shows the orderly arrangement of the data by magnitude in ascending/descending order.
  • We form the class interval as in the previous method.
  • This method is not easy when the number of observations is large. It can be adopted when the number of observations is less than 50.

Advantage of Presenting the Data on a Table of Frequency Distribution

  • The data is presented economically in tabular form.
  • The tabular method offers clear expressions.
  • The form facilitates data comparison.
  • The table summarizes the collected information in an attractive manner.

Relative Frequency Distribution

  • The relative frequency is the fraction or proportion of the total items belonging to the class.
  • Relative frequency of a class = Frequency of the class / Total number of observations = Class frequency / Total frequency.

Percentage Frequency

  • Percentage frequency is calculated by multiplying relative frequency by 100.
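
As a small worked example of these two definitions, the Python sketch below computes relative and percentage frequencies. The class labels and frequencies are hypothetical.

```python
# Hypothetical class frequencies for a grouped distribution.
frequencies = {"40-45": 4, "45-50": 7, "50-55": 5, "55-60": 4}

total = sum(frequencies.values())

# Relative frequency = class frequency / total frequency.
relative = {cls: f / total for cls, f in frequencies.items()}

# Percentage frequency = relative frequency * 100.
percentage = {cls: r * 100 for cls, r in relative.items()}

for cls in frequencies:
    print(f"{cls}: {frequencies[cls]:>2}  {relative[cls]:.2f}  {percentage[cls]:.0f}%")
```

The relative frequencies sum to 1 and the percentage frequencies to 100, which is a quick check on the arithmetic.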

Cumulative Frequency Distribution

  • The cumulative frequency distribution gives the total frequency below the upper class boundary of a given class.

Two Types of Cumulative Frequency Distributions

  • Less than cumulative frequency distribution: obtained by successively adding the frequencies of all the previous classes, including the class against which it is written.
  • More than cumulative frequency distribution: obtained by finding the cumulative total of frequencies starting from the highest class and moving to the lowest.
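
Both types of cumulative frequency can be sketched in Python with running totals. The class frequencies are hypothetical and are listed from the lowest class to the highest.

```python
from itertools import accumulate

# Hypothetical class frequencies, ordered from the lowest class to the highest.
frequencies = [4, 7, 5, 4]

# "Less than" type: running total from the lowest class upward.
less_than = list(accumulate(frequencies))

# "More than" type: running total from the highest class downward,
# reversed back so that each entry lines up with its class.
more_than = list(accumulate(reversed(frequencies)))[::-1]

print(less_than)
print(more_than)
```

The last "less than" value and the first "more than" value both equal the total number of observations.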

Graphical and Diagrammatic Presentation of Data

  • Data collected by the investigator are usually raw and less organized. Tabular, graphic and diagrammatic forms present data in the most understandable way.
  • Diagrams and graphs are drawn to present data to make it more understandable.

Graphical Presentation

  • Graphs of frequency: Histogram, frequency polygon, smoothed frequency curve, Ogive (Cumulative frequency curve)
  • Graphs of time series: One variable graph, 2 or more variables graph, Graph of different units

Diagrammatic Presentation

  • One dimensional diagram: e.g. Line Diagram, Bar Diagram: Simple bar, Subdivided, Multiple bar, and Percentage bars
  • Two-dimensional diagrams: Rectangle, Square, Circle, and Pie charts
  • Three-dimensional diagram, Pictogram (Ideograph), and Cartogram (Map)

Advantages of Diagrams and Graphs

  • They attract the viewer's attention and can create lasting impressions.
  • They save time and space.
  • They facilitate comparisons.

Limitations of Diagrams and Graphs

  • Approximate indicators.
  • Fail to disclose small differences when large figures are involved.

Graphical Representation of Data

  • A graph consists of two lines, a horizontal line (X axis, or abscissa) and a vertical line (Y axis, or ordinate), which intersect at the origin.
