Data Presentation Methods

Questions and Answers

Which of the following is NOT a method of data presentation?

  • Inferential analysis (correct)
  • Summary indices
  • Graphical presentation
  • Tabular presentation

In a frequency table, what does 'CLASS FREQUENCY (CF)' represent?

  • The range of values within a class
  • The total number of observations in the dataset
  • The midpoint of a class interval
  • The number of observations falling in a particular class (correct)

Which of the following is LEAST suited for summarizing nominal data?

  • Mean (correct)
  • Frequency distribution
  • Median
  • Mode

What is the primary characteristic of measures of central tendency?

They indicate typical values for a group. (A)

Which measure of central tendency is most affected by extreme values?

Mean (A)

For which type of variable is the mean a suitable measure?

Numeric variables measured on interval or ratio scales (A)

What does the summation symbol '$\Sigma x$' represent in the context of calculating the mean?

The sum of the observations (C)

What is the first step in calculating the median of a dataset?

Arrange the data in ascending order (A)

When is the median a more appropriate measure of central tendency than the mean?

When the distribution is badly skewed (C)

Which of the following is a disadvantage of using the median as a measure of central tendency?

It does not use all information in the distribution. (C)

What does the mode represent?

The most frequent value (D)

Which type of data is the mode best suited for?

Nominal data (B)

What is a key advantage of using the mode?

It is easy to compute. (D)

What is the primary disadvantage of using the mode as a measure of central tendency?

It may not exist or may not be unique. (C)

How is the midrange calculated?

By averaging the maximum and minimum values (A)

Why is the midrange considered a less robust measure of central tendency?

It is highly sensitive to extreme values. (C)

What is the impact on the mode if several observations in a dataset share the same highest frequency?

The dataset is considered bimodal or multimodal. (B)

Which measure of central tendency is generally preferred for statistical analyses and tests of significance and why?

Mean, because it includes information from all observations (C)

What does it indicate if the mean is greater than the median in a dataset?

The distribution is skewed to the right (positive skew). (A)

In calculating the mean for grouped data, what does finding the 'class-mid mark' achieve?

It approximates the average observation within each interval. (B)

How are class boundaries determined?

They lie midway between the data values. (D)

Given the following dataset: 10, 12, 15, 18, 20. What is the midrange?

15 (A)

In a positively skewed distribution, which of the following relationships between mean, median, and mode is typically observed?

Mean > Median > Mode (B)

Consider a dataset with the values: 2, 2, 3, 4, 5, 5, 5, 6. What is the mode?

5 (C)

What is the median of the following data set: 5, 8, 12, 3, 6?

6 (D)

Given a data set with extreme outliers, which measure of central tendency should be used?

Median (A)

What is the cumulative frequency?

A running total of frequencies (C)

In a data set, how do you calculate the range?

Max - Min (B)

CLASS RELATIVE FREQUENCY is also known as what?

Percentage of that class data relative to the whole (B)

For grouped data, what is the modal class?

Class with the highest frequency (B)

Advantages of the Mean include which of the following?

It is easy to calculate (D)

What measure is the Median of grouped data?

bL (A)

Measures of central tendency do what?

Summarize data (C)

Which of the following is not a consideration in the choice of a measure?

Whether or not it is a continuous variation (D)

What is the difference between class limits and class boundaries?

Class limits are the lower and upper values of a class; class boundaries mark the separation between adjacent classes. (A)

Which of the following is most reliable?

Mean (A)

Which of the following is most fashionable?

Mode (B)

In the mode formula for grouped data, ∆1 is

the difference between the frequency of the modal class and the frequency of the class before the modal class. (B)

What does bL mean in the grouped median formula?

Lower boundary of the median class (C)

Flashcards

Data grouping

Grouping or collating data into frequencies to determine the rate of occurrence.

Tabular Presentation

Data presented in rows and columns.

Graphical Presentation

Visual representation of data using charts and diagrams.

Histogram

Quantitative data presented as adjacent, touching bars.

Frequency Polygon

Graph that uses lines to display quantitative data frequency.

Bar chart

A way to show qualitative data using rectangles.

Pie Chart

Circular chart that shows relative sizes of data.

Summary Indices

Numerical values that summarize data sets.

Class

A group into which data is classified.

Class Frequency

The number of observations in a particular class.

Measures of Location

Ways of describing data location.

Measures of Dispersion

How spread out or variable the data are.

Measures of Partition

Values that divide data into equal parts.

Central Tendency

Describes the central location of numerical data.

Mean

The arithmetic average of the observations in a data set.

Suitable variable for mean

A numeric variable measured on an interval or ratio scale.

Calculating Mean

Add up all the observations and divide by the number of observations; for grouped data, weight each class mid-mark by its class frequency.

Mean properties

The sum of the deviations of the values from the mean is always zero.

Mean disadvantage

It can be unduly influenced by extreme values in a dataset.

Median

The middle value in a sorted dataset.

Median data types

Used when data are measured on a ratio, interval, or ordinal scale.

Median properties

The value above or below which half (50%) of the observations fall.

Lower Boundary

The lower class boundary of the median class, used to calculate the median in grouped data.

Median advantages

Easy to calculate and understand; offers a better representation when there are outliers.

Mode

The value that appears most often in a dataset.

Modal Class

The class with the highest frequency in the dataset.

Mode advantages

Easy to compute; mainly useful for calling attention to values that cluster at several places.

Mid-Range

Minimum plus maximum, divided by 2.

Choice of measures

Depends on the shape or nature of the distribution and the scale of measurement (ordinal or numeric).

Study Notes

Data Presentation Methods

  • Data is grouped or collated into frequencies to determine the rate of occurrence.
  • Tabular presentation uses frequency tables and is useful for both quantitative and qualitative data.
  • Graphical or diagrammatic presentation can use:
      • Histograms or frequency polygons for quantitative (numerical) data
      • Bar or pie charts for qualitative (categorical) data
      • Dot maps for geographical mapping
  • Summary indices include measures such as mean, median, and mode.

Tabular Presentation Definitions

  • Class refers to one of the groups into which data can be classified
  • Class Frequency (CF) is the number of observations (NOB) in the data set that falls into a particular class
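
As a rough illustration of these definitions, the sketch below builds a small frequency table in Python. The ages and the class width of 10 are invented for the example and are not taken from the lesson; the table also shows the relative and cumulative frequencies mentioned in the quiz questions.

```python
from collections import Counter

# Hypothetical ages, invented purely for illustration.
ages = [23, 35, 41, 28, 52, 37, 44, 31, 29, 48]
CLASS_WIDTH = 10  # assumed width; classes are 20-29, 30-39, ...

# Class frequency: number of observations falling in each class,
# keyed by the lower class limit.
class_frequency = Counter((age // CLASS_WIDTH) * CLASS_WIDTH for age in ages)
total = sum(class_frequency.values())

frequency_table = []
cumulative = 0
for lower in sorted(class_frequency):
    freq = class_frequency[lower]
    cumulative += freq  # running total of frequencies
    label = f"{lower}-{lower + CLASS_WIDTH - 1}"
    frequency_table.append((label, freq, freq / total, cumulative))

for label, freq, relative, running_total in frequency_table:
    print(f"{label}: CF={freq}, relative={relative:.0%}, cumulative={running_total}")
```

Each row holds the class label, its class frequency, its share of all observations, and the running total up to that class.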

Measures of Summarizing Data

  • Measures of location or central tendency summarize data
  • Measures of dispersion, spread, or variation summarize data
  • Measures of partition summarize data and can be measures of dispersion

Measures of Central Tendency

  • Describes the location of the center of a distribution of numerical and ordinal measurements
  • Indicates what is typical for a group
  • Identifies data values around which other values are distributed
  • Locates the midpoint of a distribution
  • Common examples are mean, median, mode, and midrange

The Mean (x-bar)

  • Arithmetic average of the observations
  • It is the most widely used average measure
  • It is considered the most reliable and trustworthy measure
  • It locates the center of gravity of a distribution
  • It reveals where values for a group are centered
  • It is suitable for numeric variables measured on interval or ratio scales when numbers can be added
  • It cannot be used for nominal or ordinal variables because they cannot be added

Calculating the Mean

  • Add up all the individual observations (summation ∑) and divide by the number of observations.
  • Formula: Xbar = (x1 + x2 + x3 + … + xn) / N
  • Simplified formula: Xbar = ∑x / N
  • ∑x = sum of the observations
  • N = number of observations
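
A minimal sketch of the formula Xbar = ∑x / N in Python, using the five-value dataset from the midrange quiz question. The function name `arithmetic_mean` is just an illustrative choice.

```python
def arithmetic_mean(observations):
    """Return sum(x) / N for a non-empty list of numbers."""
    return sum(observations) / len(observations)

data = [10, 12, 15, 18, 20]   # dataset from the quiz section
print(arithmetic_mean(data))  # (10 + 12 + 15 + 18 + 20) / 5 = 15.0
```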

Calculating the Mean for Grouped Data

  • Find the class mid-mark for each interval (the average of the lower and upper class limits)
  • Multiply the class mid-mark of each interval by the corresponding class frequency
  • Add the results across all intervals.
  • Divide the results by the number of observations or total frequency.
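
The grouped-data steps can be sketched as follows. The three classes and their frequencies are hypothetical values chosen for the example, and `grouped_mean` is an illustrative helper, not a standard library function.

```python
def grouped_mean(classes):
    """classes: list of (lower_limit, upper_limit, frequency) tuples."""
    total_frequency = sum(freq for _, _, freq in classes)
    # Class mid-mark = average of the class limits, weighted by frequency.
    weighted_sum = sum(((lower + upper) / 2) * freq
                       for lower, upper, freq in classes)
    return weighted_sum / total_frequency

# Hypothetical grouped data: mid-marks are 62, 67 and 72.
classes = [(60, 64, 4), (65, 69, 6), (70, 74, 10)]
print(grouped_mean(classes))  # (62*4 + 67*6 + 72*10) / 20 = 68.5
```

Because every observation in a class is replaced by the class mid-mark, the result approximates the mean of the raw data rather than reproducing it exactly.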

Properties of the Mean (Xbar)

  • Affected by extremes of values
  • All the other observations lie about it
  • Makes use of all information
  • The sum of the deviations of the values from the mean is always zero; each deviation is formed by subtracting the mean from the value (x – Xbar)
  • ∑ (x – Xbar) = 0
  • (x1 – Xbar) + (x2 – Xbar) + (x3 – Xbar) + … + (xn – Xbar) = 0
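
The zero-sum property follows directly from the definition of the mean. A short derivation, writing the observations as x_i and the mean as \bar{x} (notation assumed here for the LaTeX form):

```latex
\begin{aligned}
\sum_{i=1}^{N} (x_i - \bar{x})
  &= \sum_{i=1}^{N} x_i - N\bar{x} \\
  &= \sum_{i=1}^{N} x_i - N \cdot \frac{1}{N} \sum_{i=1}^{N} x_i \\
  &= 0
\end{aligned}
```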

Advantages of the Mean

  • Easy to calculate
  • Uses all information in the distribution, making it reliable and accurate
  • Amenable to statistical procedures and testing

Disadvantages of the Mean

  • It can be unduly influenced by abnormal values in the distribution
  • It should not be used with badly skewed distributions

The Median

  • The middle number in an array of the data when the number of observations is odd
  • The arithmetic mean of the middle numbers in an array of data when the number of observations is even
  • It is the value above or below which half (50%) of the observations fall
  • It bisects the area of the histogram or frequency polygon
  • Used for interval, ratio, and ordinal scales, but not nominal scales
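
The odd and even cases can be sketched in Python as below. The first dataset comes from the quiz section; the second is the same data with one value dropped to show the even case. The function name `median_value` is illustrative.

```python
def median_value(observations):
    """Middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(observations)  # step 1: arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                  # odd number of observations
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even number

print(median_value([5, 8, 12, 3, 6]))  # sorted: 3, 5, 6, 8, 12 -> 6
print(median_value([5, 8, 12, 3]))     # sorted: 3, 5, 8, 12 -> 6.5
```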

Calculating the Median for Grouped Data

  • Worked example: sample size (n) = 80
  • Median position (n/2) = 40th observation
  • Median class = 75-79
  • Lower boundary of the median class (bL) = 74.5
  • Frequency in the median class (fmed) = 20
  • Cumulative frequency of the classes before the median class (F) = 36
  • Class width (c) = 5
  • Formula: Median = bL + ((n/2 – F) / fmed) * c
  • Substituting: Median = 74.5 + ((40 – 36) / 20) * 5 = 75.5
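
Plugging the listed values into the formula can be checked with a few lines of Python. The function and parameter names are illustrative; F is taken here to be the cumulative frequency of the classes before the median class.

```python
def grouped_median(b_l, n, cum_freq_before, f_med, class_width):
    """Median = bL + ((n/2 - F) / fmed) * c for grouped data."""
    return b_l + ((n / 2 - cum_freq_before) / f_med) * class_width

# Values from the worked example: n = 80, bL = 74.5, F = 36, fmed = 20, c = 5.
result = grouped_median(b_l=74.5, n=80, cum_freq_before=36, f_med=20, class_width=5)
print(result)  # 74.5 + ((40 - 36) / 20) * 5 = 75.5
```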

Class Boundaries & Intervals

  • A class (or group) boundary lies midway between the upper limit of one class and the lower limit of the next
  • Example: for a class labelled 7.1 – 7.3, the values 7.1 and 7.3 are the lower and upper limits of the class
  • The class boundaries are 0.05 below the lower class limit and 0.05 above the upper class limit (because the figures are recorded to 1 decimal place), giving 7.05 and 7.35
  • The class interval (width) is the difference between the upper and lower class boundaries: 7.35 – 7.05 = 0.3
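
A small sketch of the boundary rule for the 7.1 – 7.3 class, assuming the data are recorded to a fixed number of decimal places. It uses Python's `decimal` module to avoid floating-point rounding; the helper name `class_boundaries` is hypothetical.

```python
from decimal import Decimal

def class_boundaries(lower_limit, upper_limit, decimals):
    """Return (lower boundary, upper boundary, class width).

    Limits are passed as strings so they convert to Decimal exactly.
    """
    half_unit = Decimal(10) ** -decimals / 2  # 0.05 for 1 decimal place
    lower = Decimal(lower_limit) - half_unit
    upper = Decimal(upper_limit) + half_unit
    return lower, upper, upper - lower

lower, upper, width = class_boundaries("7.1", "7.3", decimals=1)
print(lower, upper, width)  # 7.05 7.35 0.30
```

Note that the width comes from the boundaries (0.3), not from the limits (7.3 – 7.1 = 0.2).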

Advantages of the Median

  • Can be used with distribution of any shape, especially when data are skewed
  • Easy to calculate and understand
  • Offers better representations when there are outliers
  • Not affected by extreme values: "the middle remains the middle"

Disadvantages of the Median

  • It does not use all information in the distribution
  • Only takes into account one or two observations
  • Provides no information about other observations.

The Mode

  • The value/observation that occurs most frequently when observations are arranged in an array
  • Most fashionable
  • There can be several modes: bimodal, multimodal
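
One way to find the mode, including the bimodal and multimodal cases, is to count frequencies and keep every value that reaches the highest count. This is a minimal sketch; the first dataset is from the quiz section and the second is made up to show two modes.

```python
from collections import Counter

def modes(observations):
    """Return every value that shares the highest frequency, sorted."""
    counts = Counter(observations)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)

print(modes([2, 2, 3, 4, 5, 5, 5, 6]))  # [5]
print(modes([1, 1, 2, 2, 3]))           # [1, 2] (bimodal)
```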

Formula for Grouped Mode

  • Mode is the value that has the highest frequency in a data set.
  • For grouped data, class mode (or, modal class) is the class with the highest frequency.
  • Formula: Mode = Lb + (∆1 / (∆1 + ∆2)) × C
  • Lb is the lower boundary of class mode
  • ∆1 is the difference between the frequency of class mode and the frequency of the class before the class mode.
  • ∆2 is the difference between the frequency of class mode and the frequency of the class after the class mode.
  • C is the class width
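
The grouped mode formula can be sketched in Python as follows. The frequencies are hypothetical (modal class 75-79 with frequency 20, neighbours with 14 and 18) and the function name is illustrative.

```python
def grouped_mode(lower_boundary, f_modal, f_before, f_after, class_width):
    """Mode = Lb + (d1 / (d1 + d2)) * C for grouped data."""
    delta_1 = f_modal - f_before  # modal class vs. the class before it
    delta_2 = f_modal - f_after   # modal class vs. the class after it
    return lower_boundary + (delta_1 / (delta_1 + delta_2)) * class_width

# Hypothetical values: Lb = 74.5, C = 5, so d1 = 6 and d2 = 2.
estimate = grouped_mode(74.5, f_modal=20, f_before=14, f_after=18, class_width=5)
print(estimate)  # 74.5 + (6 / 8) * 5 = 78.25
```

The estimate is pulled toward the neighbouring class with the higher frequency; with equal neighbours it would fall at the middle of the modal class.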

Advantages of the Mode

  • Easy to compute
  • Not affected by extreme values
  • Main usefulness is for calling attention to distribution in which the values cluster at several places
  • Only average available for nominal scale

Disadvantages of the Mode

  • It is not the best for biological or medical statistics
  • Can result in several observations with the same frequency (multimodal)
  • Least valuable

Mid-Range

  • Calculated as: (Minimum + Maximum) / 2
  • Highly sensitive to extreme values, since it depends only on the minimum and maximum
  • Does not consider all information in the distribution
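
A short sketch of the midrange in Python. The first dataset is from the quiz section; the second replaces its largest value with an invented outlier to show how strongly the two extremes drive the result.

```python
def midrange(observations):
    """(Minimum + Maximum) / 2; every other value is ignored."""
    return (min(observations) + max(observations)) / 2

print(midrange([10, 12, 15, 18, 20]))   # (10 + 20) / 2 = 15.0
print(midrange([10, 12, 15, 18, 200]))  # (10 + 200) / 2 = 105.0
```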

Choice of Measure of Central Tendency

  • Depends on the shape or nature of the distribution: whether it is skewed to the left, skewed to the right, or normal
  • Depends on the scale of measurement: whether ordinal or numerical
  • For a continuous variable with a unimodal and symmetric distribution, the mean, median, and mode will be identical and coincide at the same point
  • When the distribution is skewed, the median may be a more informative descriptive measure to use than the mean, as it is not affected by extreme values
  • For statistical analyses and tests of significance, the mean is better whenever possible since it includes information from all observations, and its theoretic properties provide for more powerful statistical tests

Notes on Mean vs. Median

  • If mean = median, the distribution is symmetrical
  • If mean > median, the distribution is skewed to the right (positive)
  • If mean < median, the distribution is skewed to the left (negative)
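
These comparisons can be expressed as a simple check in Python using the standard `statistics` module. It is only a rule of thumb applied to small made-up datasets, not a formal test of skewness, and `skew_direction` is a hypothetical helper name.

```python
from statistics import mean, median

def skew_direction(observations):
    """Compare mean and median to suggest the direction of skew."""
    m, md = mean(observations), median(observations)
    if m > md:
        return "skewed right (positive)"
    if m < md:
        return "skewed left (negative)"
    return "symmetrical"

print(skew_direction([1, 2, 3, 4, 20]))     # mean 6 > median 3
print(skew_direction([1, 17, 18, 19, 20]))  # mean 15 < median 18
print(skew_direction([1, 2, 3, 4, 5]))      # mean 3 = median 3
```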

Fundamentals of Biostatistics
