Descriptive Statistics: Data & Measures

Questions and Answers

Which aspect of data is NOT directly addressed by the definition of statistics?

  • Collecting data
  • Automating data entry (correct)
  • Analyzing data
  • Organizing data

In what way does statistics aid in decision-making for businessmen and industrialists?

  • By guaranteeing market success.
  • By eliminating all uncertainties.
  • By preparing for future uncertainties through the analysis of historical data. (correct)
  • By reducing the need for market research.

How are statistical principles applied in biological sciences to enhance resource application?

  • By determining the chemical composition of fertilizers.
  • By analyzing crop yields and animal responses to optimize resource allocation. (correct)
  • By predicting weather patterns.
  • By controlling environmental conditions.

What role does statistical metrology play in physical sciences?

  • It aids findings in various fields, such as astronomy, chemistry, and geology. (correct)

Why is 'up-to-date knowledge of expenditure pattern' important for Governments?

  • To enhance effective decision-making on critical issues such as health and defense. (correct)

What is the main difference between primary and secondary data?

  • Primary data is obtained first-hand, while secondary data is obtained from existing sources. (correct)

Which of the following is NOT an advantage of collecting primary data?

  • It is less time-consuming compared to secondary data. (correct)

What is a primary disadvantage of using secondary data?

  • It may not conform to the investigator's specific needs or be of determinable reliability. (correct)

In what capacity does statistics clarify data representation?

  • By making data representation easy and clear. (correct)

What values can a discrete random variable assume?

  • Certain fixed whole number values. (correct)

How does a continuous random variable differ from a discrete random variable?

  • It assumes an infinite number of values between any two points. (correct)

What is the primary characteristic of nominal data?

  • It is used only as labels or categories. (correct)

Based on the information provided, what distinguishes ordinal data from nominal data?

  • Ordinal data has magnitude and ordered categories. (correct)

What operation are you allowed to perform on interval data that you cannot on ordinal data?

  • Addition and subtraction (correct)

What is a key condition of ratio data that distinguishes it from interval data?

  • Knowledge of a true zero point. (correct)

What is a critical consideration when drawing bar charts?

  • The bars should not touch each other. (correct)

What type of data is suitable for representation in a bar chart?

  • Discrete, categorical, nominal, and ordinal data. (correct)

In a pie chart, what does the size of each sector represent?

  • Relative magnitudes or frequencies. (correct)

What is mainly criticized about pie charts by statisticians?

  • The difficulty in comparing different sections or comparing data across different charts. (correct)

When is a pie chart considered a reasonable way of displaying information?

  • When the intent is to compare the size of a slice with the whole pie, specifically when the slice is about 25% or 50% of the total. (correct)

A histogram is most similar to which of the following charts?

  • Bar chart (correct)

What type of data does a histogram represent?

  • Continuous data. (correct)

How do the bars appear in a histogram as contrasted to bars in a bar chart?

  • They touch each other. (correct)

Which of the following is an accurate definition for measures of central tendency?

  • Describe how data gathers around a central value. (correct)

What term is used to refer to the 'center' of a data set?

  • Average (correct)

Using the assumed mean method, which of the following options defines the final step?

  • Calculating the original mean using deviation from the assumed mean. (correct)

How is the arithmetic mean calculated for a series of data?

  • By taking the ratio of the total sum of data to the number of data points. (correct)

Under what condition is the trimmed mean a robust measure of central tendency?

  • When a small fraction of anomalous measurements shows abnormally large deviation. (correct)

How is the median of ungrouped data defined?

  • It is the value that divides the data set into two equal halves when the data is in order. (correct)

If n is the number of observations and is odd, how can you find the median?

  • $X_{(n+1)/2}$ (correct)

In finding the median of grouped data, what indicates the 'sum of all frequencies before Lm'?

  • $f_c$ (correct)

In which scenario is the 'graphical estimate of the median' most useful?

  • When using the cumulative frequency curve to see the value 'x' at the 50% point. (correct)

Under what conditions does the median value coincide with one of the items?

  • There is an odd number of items in an array. (correct)

What is the mode of an ungrouped data set?

  • The observation that occurs most frequently. (correct)

A distribution having more than one mode is called what?

  • Multimodal (correct)

What is 'frequency of modal class' also known as?

  • $F_m$ (correct)

How is the mode determined graphically with grouped data?

  • Using the histogram. (correct)

Quantiles are defined as:

  • Splitting or partitioning a distribution into a number of equal portions (correct)

How can quantiles be obtained?

  • By formula or by using the cumulative frequency curve (correct)

Measures of dispersion indicate what?

  • How well or poorly measures of central tendency represent a particular distribution (correct)

The range (R) of an ungrouped series can be determined by what?

  • $X_L - X_S$, the largest value minus the smallest (correct)

Under which category does the quartile deviation fall?

  • Measures of dispersion (correct)

The mean deviation (M.D) is defined as what?

  • $\dfrac{\sum |x - \overline{x}|}{n}$ (correct)

Flashcards

What is Statistics?

A scientific method of collecting, organizing, summarizing, presenting, and analyzing data to draw valid conclusions.

Statistics in Industry

Making decisions in the face of uncertainty is a unique problem faced by businessmen and industrialists. Analysis of historical data enables the businessman to prepare well in advance for the uncertainties of the future.

Statistics in Biological Science

Helps analyze crop yields under varying conditions and enhances medical and public health advancements.

Statistics in Physical Science

Statistical metrology aids findings in astronomy, chemistry, geology, meteorology, and oil exploration.

Statistics in Government

Used by the government to collect data for effective decision-making, covering expenditure, population, and health.

Primary Data

Data collected firsthand.

Secondary Data

Data obtained from existing sources. It is cheap, takes less time to collect than primary data, and is easily available from divisions in various ministries, banks, insurance companies, print media, and research institutions.

Uses of Statistical Data

Summarizes data, permits deductions, enables planning, reveals patterns, and makes representation clear.

Quantitative Random Variable

Data that can be expressed numerically; can be discrete or continuous.

Discrete Random Variable

Assumes certain fixed whole-number values, such as the number of cars in a park.

Continuous Random Variable

Assumes an infinite number of values between any two points, such as weight, length, or height.

Nominal Data

Data used only as labels, with no direction or properties beyond classification.

Ordinal Data

Data whose categories are ranked in order of magnitude, without specifying the size of the differences between ranks.

Interval Data

Specifies magnitude with equal intervals, but lacks a true zero point.

Ratio Data

Highest measurement scale with rank order, equal intervals, and a true zero point.

What is a Bar Chart?

A chart with bars representing categories. Bars should be neither too short nor too long and narrow, and should be separated by spaces about one and a half times the width of a bar.

Multiple Bar Chart

A bar chart that enables comparisons of more than one variable to be made at the same time.

Component Bar Chart

A bar chart that enables comparisons of more than one variable, such as age and sex, to be made at the same time.

What is a Pie Chart?

A circular chart, also called a circle graph, divided into sectors that illustrate relative magnitudes or frequencies.

What is a Histogram?

A graphical presentation of a frequency distribution. In a histogram the bars have to touch each other, unlike in a bar chart. It is applicable only to continuous data such as height and weight.

Measure of Central Tendency

A measure of how data tends to a central value such that each individual value in the distribution tends to cluster around it.

The Mean

The arithmetic mean is the ratio of the total (sum) of all the data in the series to the number of data points in the series. A mean can be arithmetic, geometric, or harmonic.

Mean

For ungrouped data $X_1, X_2, \dots, X_n$, the arithmetic mean is $\overline{X} = \dfrac{\sum X}{n}$.

Coding Method

Also called the assumed mean method: assume a value within the data set as the mean, then derive the original mean from the deviations about that assumed value.

Trimmed Mean

Sort the values, discard the same percentage of the smallest and largest values, and compute the mean of those that remain.
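
The trimmed mean steps on this card can be sketched in Python as follows. This is only a minimal illustration: the function name `trimmed_mean`, the 10% trimming fraction, and the measurements are hypothetical and do not come from the lesson.

```python
def trimmed_mean(values, fraction):
    """Mean after discarding `fraction` of the values from each end (fraction < 0.5)."""
    ordered = sorted(values)
    k = int(len(ordered) * fraction)    # number of values dropped from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical measurements with one abnormally large value.
measurements = [10, 11, 12, 12, 13, 13, 14, 14, 15, 95]

ordinary_mean = sum(measurements) / len(measurements)   # pulled upward by the value 95
robust_mean = trimmed_mean(measurements, 0.10)          # 10% trimmed from each end

print(ordinary_mean, robust_mean)
```

The single anomalous value shifts the ordinary mean noticeably, while the trimmed mean stays close to the bulk of the data.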

The Median

Divides an ordered data set into two equal halves. It can be found by formula or graphically.

The Mode

Most frequent observation in a data set. The mode of grouped data can be obtained using the histogram.

Quantiles

Values that partition or split a distribution into a number of equal portions, such as quartiles, deciles, and percentiles.

Quartiles

Split a distribution into four equal parts.

Deciles

The nine values that split a distribution into ten equal parts.

Measure of Dispersion

A measure of the tendency of individual values to differ in size, indicating how closely they cluster around the average.

Range

Difference between the largest and smallest numbers in a set.

Quartile Deviation

Half the difference between the third and first quartile.

Mean Deviation

The average of the absolute values of the differences between each item and the mean.

Variance

A measure of dispersion based on the average of the squared deviations from the mean.

Standard Deviation

Square root of the variance.

Box Plot

Graphically depicts groups of numerical data through their five-number summaries. Box plots display differences between populations.

Skewness

A distribution is symmetric when its graph can be cut into two mirror-image halves. Skewness is the degree of departure from symmetry.

Q-Q Plot

A graphical technique for determining whether two data sets come from populations with a common distribution, by plotting the estimated quantiles of one against those of the other.

P-P Plot

A probability plot for assessing how closely two data sets agree, made by plotting their two cumulative distribution functions against each other.

Probability Sampling

Methods in which every item has a known chance of being chosen and selection is independent. They include random, stratified, systematic, and cluster sampling.

Study Notes

  • Descriptive Statistics 1 covers key concepts, including statistical data and measures
  • Version 1 of the material was released in July 2009

Week One

  • Statistics involves scientific methods for collection, organization, summarization, presentation, and analysis of data
  • Key goal is to draw valid conclusions based on this data analysis
  • Statistics is a tool applicable in natural, applied, social sciences, and any field with numerical data
  • Statistics includes data collection, computation, comparison, analysis, and interpretation
  • Numbers from statistics are referred to as data
  • Statistical analysis helps make decisions in the face of uncertainties in business and industry
  • Statistical analysis helps prepare for market/product research, feasibility studies, and economic forecasting
  • Statistical analysis is applied to crop yield analysis, animal diet optimization, and medicine advancements
  • Statistical metrology aids astronomy, chemistry, geology, meteorology, and oil explorations
  • Governments collect vast continuous data for decision-making related to expenditure, revenue, and population
  • Government is the primary user and producer of statistical data
  • Two main types of statistical data: primary and secondary
  • Primary data provides first-hand information on the topic
  • Collection of primary data represents a difficult but important task for statisticians
  • The investigator has greater confidence in data collected first-hand
  • The investigator appreciates the challenges involved because of direct involvement in the data collection
  • Primary survey reports are usually comprehensive and include term definitions
  • Primary data advantages include researcher confidence, comprehensive reports, and clear definitions
  • Primary data disadvantages include being time-consuming, expensive, and requiring manpower
  • Primary data may also be obsolete by the publication time
  • Secondary data is obtained from existing sources such as ministries, banks, and research institutions
  • Secondary data sources are often collected as part of routine jobs
  • Secondary data advantages include low cost, less time expenditure, and easy access
  • Secondary data disadvantages include potential for misuse, restricted access due to protocol, and unknown precision
  • Secondary data may also not conform to the investigator's needs and may contain errors
  • Statistics summarizes raw numerical data through measures like mean, standard deviation and coefficient of variation
  • Statistical planning relies on the analysis of historical data, which enables one to plan for the future
  • Quantitative random variables can be expressed numerically and are discrete or continuous
  • Discrete variables assume fixed whole number values
  • Continuous variables can take infinite values between two points and are often related to measuring devices
  • Four types of data measurement: nominal, ordinal, interval, and ratio
  • Nominal data uses numerals for labels and has no implied direction or properties
  • Ordinal data ranks items for order of magnitude
  • Ordinal data does not show the specific magnitude of the differences between ranks
  • Interval data specifies observation/item magnitude and has equal intervals
  • Arithmetic operations such as addition and subtraction are possible on interval data
  • Ratio data is the highest measurement scale with equality of order, intervals, and ratios, plus a knowledge of the true zero point
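
The summary measures named in the notes above (mean, standard deviation, and coefficient of variation) can be computed as in the sketch below. The data values are hypothetical, and the sample (n - 1) form of the standard deviation is an assumption, since the notes do not say which form is intended here.

```python
import statistics

# Hypothetical raw data (not from the lesson).
data = [12, 15, 11, 18, 14]

mean = statistics.mean(data)          # arithmetic mean
std_dev = statistics.stdev(data)      # sample standard deviation (n - 1 divisor, an assumption)
coeff_var = std_dev / mean * 100      # coefficient of variation, in percent

print(f"mean={mean}, sd={std_dev:.3f}, cv={coeff_var:.1f}%")
```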

Week Two

  • Bar charts are visual tools for comparisons among categories using bars of uniform width
  • Bar charts are applicable only to discrete, categorical, nominal and ordinal data
  • Bars in a bar chart should have appropriate dimensions and spacing
  • Bar length corresponds to category frequencies, always with a label for clarity
  • Multiple bar charts allow comparisons of more than one variable at the same time
  • Multiple bar charts can take account of further variables such as age and sex
  • Component bar charts can likewise take account of further variables such as age and sex

Week Three

  • Pie charts use sectors to represent the relative magnitudes of data
  • The arc length of each sector is proportional to the value it represents
  • Each sector makes up one slice of the pie
  • Pie charts display a part of a group in relation to its whole
  • Pie charts work best when the goal is to compare a slice with the whole pie, especially when the slice is about 25% or 50% of the total
  • Pie charts work best for comparisons within a single chart, not for comparisons between different pie charts
  • Pie charts can be improved by inserting the figures or by providing supporting tables
  • The pie chart needs to show the proportion of the whole and to label what each part is
  • Provide a chart title, for example one describing the sex of study respondents
  • The angle in degrees of each sector is calculated from its category's share of the total, as in the sex-of-respondents example
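
As a rough illustration of the degree calculation, each sector's angle is its category's share of the total multiplied by 360 degrees. The counts below are hypothetical and are not the figures used in the lesson.

```python
# Hypothetical counts of study respondents by sex (illustrative only).
counts = {"Male": 30, "Female": 50}

total = sum(counts.values())

# Each sector's angle is its share of the total multiplied by 360 degrees.
angles = {category: count / total * 360 for category, count in counts.items()}

for category, angle in angles.items():
    share = counts[category] / total * 100
    print(f"{category}: {share:.1f}% of respondents, {angle:.1f} degrees")
```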

Week Four

  • A histogram is a graph for frequency distribution that extends from a simple bar chart
  • Histograms apply only to continuous data such as height and weight
  • Bars have to touch in a histogram, unlike bars in a bar chart
  • The area of each bar is proportional to the frequency of its class
  • A rectangular bar is constructed over each class to cover its range
  • Creating a histogram starts with deciding on the class intervals
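
The class-interval step can be sketched as follows: continuous values are counted into adjacent classes, and the frequencies become the bar heights. The heights, the starting boundary, and the class width are all assumed values chosen for illustration.

```python
# Hypothetical heights in centimetres (illustrative only).
heights = [151.2, 158.4, 160.0, 162.7, 165.1, 167.9, 170.3, 171.8, 176.5, 183.0]

# Assumed class intervals: 150-160, 160-170, 170-180, 180-190.
lower, width, n_classes = 150, 10, 4

# Count how many values fall in each class [start, start + width).
frequencies = [0] * n_classes
for h in heights:
    index = int((h - lower) // width)
    frequencies[index] += 1

for i, freq in enumerate(frequencies):
    start = lower + i * width
    # Adjacent classes share a boundary, which is why histogram bars touch.
    print(f"{start}-{start + width}: {'#' * freq} ({freq})")
```

Because the classes here have equal width, bar height and bar area are both proportional to frequency.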

Week Five

  • Measures of central tendency describe how the individual values in a distribution tend to cluster around a central value
  • Such a measure describes the concentration of values in the middle of the distribution
  • The average refers to the center of a data set and may be the mean, median, or mode
  • Mean can be arithmetic, geometric, or harmonic
  • Arithmetic mean is the primary focus
  • The arithmetic mean is the ratio of the sum of the data to the number of data points in the series
  • The arithmetic mean is the value every element would have if the total were shared equally among them
  • For ungrouped data the mean is $\overline{x} = \dfrac{\sum x}{n}$, where $\overline{x}$ is read 'x bar'
  • The coding (assumed mean) method assumes a value within the data set as the mean and derives the original mean from the deviations about it
  • For grouped data, $x_1, x_2, \dots$ are the class midpoints and $f_1, f_2, \dots$ their respective frequencies
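
A minimal sketch of the three calculations mentioned above: the direct arithmetic mean, the assumed mean (coding) method, and the mean of grouped data from midpoints and frequencies. All data values and the assumed mean `A` are hypothetical. The grouped formula used, the sum of f times x divided by the sum of f, is the standard one and is stated here as an assumption because the notes do not write it out.

```python
# Hypothetical ungrouped data (illustrative only).
x = [52, 55, 58, 61, 64]
n = len(x)

# Direct arithmetic mean: sum of the data divided by the number of data points.
mean_direct = sum(x) / n

# Assumed mean (coding) method: pick a value A, average the deviations d = x - A,
# then add that average back to A.
A = 58                                  # assumed mean, chosen from within the data
deviations = [value - A for value in x]
mean_assumed = A + sum(deviations) / n

# Grouped data: class midpoints with their frequencies (hypothetical).
midpoints = [5, 15, 25, 35]
freqs = [2, 5, 2, 1]
mean_grouped = sum(f * m for f, m in zip(freqs, midpoints)) / sum(freqs)

print(mean_direct, mean_assumed, mean_grouped)
```

Both routes give the same mean for the ungrouped data; the assumed mean method only makes the arithmetic smaller.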

Week Six

  • In an ordered array, the median is the value that divides the data set into two equal halves
  • The median of ungrouped data can be found with the following procedure
  • Arrange the data from least to greatest
  • Label the ordered data points $x_1, x_2, \dots, x_n$
  • If the number of observations $n$ is odd, the median is $X_{(n+1)/2}$
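
The procedure can be sketched as below. The odd case follows the notes; the even case, which the notes do not spell out, uses the usual convention of averaging the two middle values and should be read as an assumption. The function name `median` and the data are hypothetical.

```python
def median(values):
    """Median of ungrouped data: sort, then take the middle value."""
    ordered = sorted(values)            # arrange from least to greatest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th value, which is index `middle` in 0-based Python.
        return ordered[middle]
    # Even n (assumed convention): mean of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_result = median([7, 3, 9, 1, 5])        # five observations
even_result = median([7, 3, 9, 1, 5, 11])   # six observations

print(odd_result, even_result)
```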

Week Seven

  • Quantiles split distributions into equal portions, including quartiles, deciles, and percentiles
  • The three quartiles split data into four parts
  • Nine deciles divide data into ten equal parts
  • Ninety-nine percentiles divide the distribution into 100 equal parts
  • Quartiles are obtained with formulas or using the cumulative frequency curve
  • Quartile calculation is similar between grouped and ungrouped data with respective modifications
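  • For grouped data the first quartile, for example, is $Q_1 = L_1 + \dfrac{\frac{n}{4} - f_c}{f_1} \times C$, where $L_1$ is the lower boundary of the quartile class, $f_c$ the sum of all frequencies before it, $f_1$ the frequency of that class, and $C$ the class width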
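
The grouped-data formula, lower class boundary plus (position minus cumulative frequency before the class) divided by the class frequency, times the class width, can be sketched as follows. The frequency table is hypothetical, and the position i·n/4 for the i-th quartile is the usual convention, assumed here rather than taken from the notes.

```python
def grouped_quantile(position, lower_boundary, cum_freq_before, class_freq, class_width):
    """L1 + ((position - fc) / f1) * C for the class containing the quantile."""
    return lower_boundary + (position - cum_freq_before) / class_freq * class_width

# Hypothetical frequency table: (lower class boundary, frequency), class width 10.
classes = [(0, 4), (10, 6), (20, 8), (30, 2)]
width = 10
n = sum(freq for _, freq in classes)

def quartile(i):
    """i-th quartile (i = 1, 2, 3) of the grouped data above."""
    position = i * n / 4
    cumulative = 0
    for lower, freq in classes:
        # The quartile class is the first class whose cumulative frequency reaches the position.
        if cumulative + freq >= position:
            return grouped_quantile(position, lower, cumulative, freq, width)
        cumulative += freq

print(quartile(1), quartile(2), quartile(3))
```

Deciles and percentiles follow the same pattern with positions i·n/10 and i·n/100.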

Week Eight

  • Dispersion measures how much the individual values of a variable differ in size, indicating how closely they cluster around the average
  • Measures of dispersion indicate how well or poorly a measure of central tendency represents a particular distribution
  • Measures of dispersion include the range, the semi-interquartile range, the mean deviation, the variance, and the standard deviation
  • The range $R$ of a set of numbers is the difference between the largest and the smallest number, $R = X_L - X_S$
  • Quartile deviation is half the difference between the third and first quartiles, as explained in chapter 3
  • Mean deviation: $\text{M.D} = \dfrac{\sum |x - \overline{x}|}{n}$, where $\overline{x} = \dfrac{\sum x}{n}$ and $|x - \overline{x}|$ is the absolute value of the difference between each $x$ and the mean
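
A short sketch of three of these measures, using hypothetical data. The quartile values `q1` and `q3` are simply assumed to be known already, so the sketch does not commit to any particular quartile method.

```python
# Hypothetical data (illustrative only).
x = [4, 8, 6, 10, 12]
n = len(x)

# Range: largest value minus smallest value.
data_range = max(x) - min(x)

# Mean deviation: average absolute difference between each value and the mean.
mean = sum(x) / n
mean_deviation = sum(abs(value - mean) for value in x) / n

# Quartile deviation: half the difference between the third and first quartiles.
q1, q3 = 5, 11                          # hypothetical quartile values, assumed known
quartile_deviation = (q3 - q1) / 2

print(data_range, mean_deviation, quartile_deviation)
```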

Week Nine

  • Box plots, also called box-and-whisker diagrams, depict numerical data through a five-number summary
  • The five numbers are the sample minimum, the first quartile, the median (second quartile), the third quartile, and the sample maximum
  • The plot may also indicate which observations, if any, might be outliers
  • Box plots help show differences between populations without assuming any underlying statistical distribution
  • They help identify the dispersion and skewness of the data
  • By convention, a box plot is drawn by laying the data set out horizontally along a number line
  • Q1, the median, and Q3 are marked on the number line
  • The interquartile range is calculated by subtracting the first quartile from the third
  • The box extends from the first quartile on its left to the third quartile on its right
  • The median is marked inside the box
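
The numbers behind a box plot can be computed as in the sketch below. The data are hypothetical. Two choices are assumptions not stated in the notes: the "inclusive" quartile method of Python's `statistics.quantiles` (other conventions give slightly different quartiles) and the common 1.5 × IQR rule for flagging possible outliers.

```python
import statistics

# Hypothetical data (illustrative only).
data = [1, 2, 3, 4, 5, 6, 7, 8, 30]

# statistics.quantiles with n=4 returns the three quartile cut points.
# The "inclusive" method is one of several quartile conventions (an assumption here).
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
five_number_summary = (min(data), q1, median, q3, max(data))

iqr = q3 - q1   # interquartile range: third quartile minus first

# Assumed outlier rule (a common convention, not stated in the notes):
# flag points more than 1.5 * IQR beyond the quartiles.
outliers = [v for v in data if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print(five_number_summary, iqr, outliers)
```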

Week Ten

  • Variance is closely related to the standard deviation
  • The variance is denoted by $\sigma^2$
  • The sample variance is $s^2 = \dfrac{\sum (x - \overline{x})^2}{n - 1}$
  • The standard deviation is the square root of the variance, $\sigma = \sqrt{\sigma^2}$
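
As a quick check of the sample variance formula (sum of squared deviations from the mean, divided by n - 1) and of the standard deviation as its square root, the sketch below computes both by hand and compares them with Python's `statistics` module. The sample values are hypothetical.

```python
import math
import statistics

# Hypothetical sample (illustrative only).
x = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(x)
mean = sum(x) / n

# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = sum((value - mean) ** 2 for value in x) / (n - 1)

# Standard deviation: square root of the variance.
sample_std_dev = math.sqrt(sample_variance)

# Cross-check against the standard library implementations.
matches_library = (
    math.isclose(sample_variance, statistics.variance(x))
    and math.isclose(sample_std_dev, statistics.stdev(x))
)

print(sample_variance, sample_std_dev, matches_library)
```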

Week Eleven

  • A distribution is symmetric if its graph can be cut into two mirror-image halves
  • Symmetric distributions yield bell-shaped curves
  • In a symmetric curve, frequencies at points equidistant from the maximum are equal
  • Skewness is the asymmetry of a distribution
  • An asymmetric distribution has tails of different lengths on either side of the center
  • When skewness is present, the mean, median, and mode do not coincide
  • Frequencies are not equal at points equidistant from the mode
  • Otherwise, when none of these signs is present, the distribution is symmetrical
  • Skewness is calculated from the sum of cubed deviations from the mean, $\sum (x - \overline{x})^3$
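
A minimal sketch of a skewness calculation built on the sum of cubed deviations from the mean. The normalization used, the third central moment divided by the second central moment raised to the power 1.5, is one common choice and is an assumption here, since the notes give only the cubed-deviation term. The data sets are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (one common normalization)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]        # mirror image about 3
right_skewed = [1, 2, 2, 3, 12]    # one long tail to the right

print(skewness(symmetric))      # zero for a symmetric data set
print(skewness(right_skewed))   # positive: longer tail on the right
```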

Week Twelve

  • A Q-Q (quantile-quantile) plot is used to assess whether two data sets come from populations with a common distribution
  • It plots the quantiles of the first data set against the quantiles of the second; a quantile is the point below which a given fraction (or percent) of the values fall
  • The two sample sizes are not always equal
  • Each axis shows the estimated quantiles of one of the data sets
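
The points of a Q-Q plot can be computed as in the sketch below, which estimates the same set of quantiles for two samples of different sizes and pairs them up. The samples and the choice of deciles are hypothetical, and the quantile estimates use the default method of Python's `statistics.quantiles`, which is only one of several conventions.

```python
import statistics

# Two hypothetical samples of different sizes (illustrative only).
sample_a = [2.1, 2.9, 3.4, 3.8, 4.2, 4.9, 5.5, 6.0]
sample_b = [1.8, 2.5, 3.1, 3.3, 3.9, 4.4, 4.6, 5.2, 5.8, 6.3, 6.9]

# Estimate the same set of quantiles (here the nine deciles) for each sample.
quantiles_a = statistics.quantiles(sample_a, n=10)
quantiles_b = statistics.quantiles(sample_b, n=10)

# Each (a, b) pair is one point on the Q-Q plot. Points lying near the line
# y = x suggest that the two samples come from a common distribution.
qq_points = list(zip(quantiles_a, quantiles_b))

for a, b in qq_points:
    print(f"{a:.2f}  {b:.2f}")
```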

Week Thirteen

  • A P-P (probability-probability) plot assesses how closely two data sets agree by plotting their two cumulative distribution functions against each other
  • It is a probability plot that can compare a data set with a theoretical model or with another data set
  • For a sample of $n$ values, the empirical cumulative distribution is compared with a given continuous distribution
  • The P p
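
A P-P comparison of a sample's empirical cumulative distribution with a continuous model can be sketched as follows. Everything specific here is an assumption made for illustration: the sample values, the choice of a normal distribution as the model, and the plotting position rank / (n + 1) for the empirical probabilities.

```python
from statistics import NormalDist

# Hypothetical sample (illustrative only), sorted so ranks follow the values.
sample = sorted([4.1, 5.3, 4.8, 6.2, 5.0, 5.7, 4.5, 5.9])
n = len(sample)

# Assumed theoretical model: a normal distribution fitted to the sample.
model = NormalDist.from_samples(sample)

pp_points = []
for rank, value in enumerate(sample, start=1):
    empirical = rank / (n + 1)        # empirical cumulative probability (assumed plotting position)
    theoretical = model.cdf(value)    # cumulative probability under the model
    pp_points.append((theoretical, empirical))

# Points close to the diagonal from (0, 0) to (1, 1) indicate good agreement.
for theoretical, empirical in pp_points:
    print(f"{theoretical:.3f}  {empirical:.3f}")
```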

Week Fourteen

  • Probability sampling means that every item in the population has a known chance of being selected
  • Probability sampling methods include random, stratified, systematic, and cluster sampling
  • The selection procedure should use a lottery or another random method that is free of bias
  • Stratified: divides the units into internally similar groups, for example by sex, and samples from each group
  • Systematic: start at a random point, then select units at a fixed interval in order
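
Three of the sampling methods can be sketched with Python's `random` module as below. The population, the sample sizes, the interval `k`, and the fixed seed are all hypothetical choices made so the example is small and repeatable.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 20 units, each tagged with a sex (illustrative only).
population = [(i, "F" if i % 2 == 0 else "M") for i in range(1, 21)]

# Simple random sampling: every unit has the same chance of selection.
simple = random.sample(population, 4)

# Systematic sampling: random start, then every k-th unit in order.
k = 5
start = random.randrange(k)
systematic = population[start::k]

# Stratified sampling: split into strata by sex, then sample within each stratum.
strata = {}
for unit in population:
    strata.setdefault(unit[1], []).append(unit)
stratified = [unit for group in strata.values() for unit in random.sample(group, 2)]

print(simple, systematic, stratified, sep="\n")
```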

Week Fifteen

  • Data collection is intended to be both useful and insightful, so it should be well planned and have clear objectives
  • Methods of data collection include the documentary, interview, questionnaire, and observation methods
  • A questionnaire should serve a specific objective and present its questions in a logical order
  • The questionnaire can then be sent to the respondents
  • It offers wide coverage and is timely relative to its cost
  • A disadvantage is that answers can be ambiguous, even though the cost is not high
  • The interview method involves personal contact with the respondent
  • Data collection needs careful planning to maximize effectiveness
  • Data must be collected with respect to various objectives
  • There are several advantages listed under the 15.2 heading
  • The observation method uses systematic scientific procedures to collect a large amount of data, which takes time
  • The documentary method collects information from existing records; its time and money costs should be taken into account
