Statistical Analysis Chapter 1

Questions and Answers

Discrete data cannot be ratio data.

False (B)

Even ratio data can be measured using an ordinal scale.

True (A)

Inferential statistics deal with making inferences about a population.

True (A)

A population can be considered a set of numbers.

True (A)

The daily sleep duration of housewives in Korea cannot be considered a population.

False (B)

Qualitative data refer to data that can be quantified.

False (B)

A frequency distribution table consists of class intervals and frequencies.

True (A)

Relative frequency is calculated by dividing the class frequency by the total frequency.

True (A)

Cumulative frequency is the sum of all frequencies up to and including a specific class.

True (A)

Every frequency distribution table can be represented using a graph.

True (A)

The sum of relative frequencies in a frequency distribution table is always 0.

False (B)

The first step in creating a frequency distribution table is determining the class width.

False (B)

Class width is determined by dividing the range of data by the number of class intervals.

True (A)

The mode is the value located at the center of a dataset arranged in ascending order.

False (B)

Variance is always greater than or equal to zero.

True (A)

The interquartile range can have a negative value.

False (B)

To standardize ordinal data, the percentile should be calculated.

True (A)

The mean of standardized values (Z-scores) of any population is always 0.

True (A)

Standardization refers to subtracting the mean from a data value and dividing it by the variance.

False (B)

Degrees of freedom refer to the actual number of observations used in calculating a sample statistic.

True (A)

The standard deviation has a different unit of measurement than the original data.

False (B)

If the sample size is sufficiently large, the sample variance becomes nearly equal to the population variance.

True (A)

Bivariate data refers to data obtained by simultaneously examining two variables.

True (A)

A contingency table cannot be considered a frequency table that displays bivariate data.

False (B)

The correlation coefficient is the covariance divided by the standard deviations of the two variables.

True (A)

The sample correlation coefficient is a type of sample statistic.

True (A)

The values of the sample correlation coefficient and the population correlation coefficient always match.

False (B)

If the correlation coefficient is 0, there is no relationship between the two populations.

True (A)

The correlation coefficient takes values between 0 and 1.

False (B)

Conditional probability can be expressed using joint probability and marginal probability.

True (A)

"No correlation" and "statistical independence" are different concepts.

True (A)

In bivariate data where samples are drawn simultaneously from two populations, it can be considered a compound event.

True (A)

If the joint probability of two events equals the product of their marginal probabilities, the two events are considered independent.

True (A)

If a conditional probability can be expressed as the product of a joint probability and a marginal probability, the two events are considered independent.

False (B)

A probability distribution consists of the sample space of a random variable and the associated probabilities.

True (A)

Any random variable, when standardized, has a variance of 0.

False (B)

The expected value of a random variable is conceptually the same as its mean.

True (A)

The expected value of a random variable is calculated by multiplying each possible value by its corresponding probability and summing the results.

True (A)

The standard deviation of a random variable is the square root of the average of the squared deviations from the mean.

True (A)

The variance of a new random variable created by adding two random variables is equal to the sum of their individual variances.

False (B)

A binomial random variable can be expressed as the sum of Bernoulli random variables.

True (A)

The variance of a binomial distribution is the mean multiplied by the failure probability.

True (A)

Independent trials mean that the outcome of one trial does not affect the others.

True (A)

The standard deviation of a binomial distribution is $np(1-p)$.

False (B)

A discrete random variable takes only integer values.

True (A)

A binomial random variable is a continuous random variable.

False (B)

A Bernoulli trial is an independent trial with exactly two mutually exclusive outcomes.

True (A)

If the probability of success is less than 0.5, the binomial distribution has a long tail to the left.

False (B)

Even if the success probability is 0.5, the binomial distribution may not be symmetric.

False (B)

The total of all binomial probabilities always sums to 1.

True (A)

Given the data: 6, 8, 10, 10, 10, 12, 14. Calculate the mean and median.

Mean = 10, Median = 10

Given the data: 6, 8, 10, 10, 10, 12, 14. Determine the range and interquartile range (IQR).

Range = 8, IQR = 4

Given the data: 6, 8, 10, 10, 10, 12, 14. Standardize the value 10 using Z-scores (assuming this data represents the population).

Z-score = 0

In calculating the sample mean for the grouped data in table C3-2, the sum of (x × Frequency) is 670. What divisor is used to find the average $\bar{x}$? $\bar{x}$ = 670 / (_____)

100

In calculating the sample variance for the grouped data in table C3-2, the sum of (Deviation² × Frequency) is 1017. Assuming a sample size n = 100, what divisor is used to find the sample variance $s^2$? $s^2$ = 1017 / (_____)

99

Given the sample variance $s^2 = 1017 / 99$ from table C3-2, calculate the sample standard deviation (s). Round to three decimal places.

3.205

Calculate the sample mean $\bar{x}$ for the March scores (x) from table C4.

800

Calculate the sample mean $\bar{y}$ for the September scores (y) from table C4.

820

Calculate the sample variance $s_x^2$ for the March scores (x) using the sum of squared deviations (850) from table C4.

212.5

Calculate the sample variance $s_y^2$ for the September scores (y) using the sum of squared deviations (350) from table C4.

87.5

Calculate the sample covariance $s_{xy}$ using the sum of the products of deviations (225) from table C4.

56.25

Calculate the sample standard deviation $s_x$ for the March scores (x) based on $s_x^2 = 212.5$. Round to two decimal places.

14.58

Calculate the sample standard deviation $s_y$ for the September scores (y) based on $s_y^2 = 87.5$. Round to two decimal places.

9.35

Calculate the sample correlation coefficient using $s_{xy}=56.25$, $s_x=14.58$, and $s_y=9.35$. Round to three decimal places.

0.413

According to the first contingency table (Gender vs Opinion), are 'Opinion' and 'Gender' related? Justify using probabilities.

Yes, they are related.

According to the second contingency table (AD type vs Decision), are 'Decision' and 'AD type' related? Justify using probabilities.

No, they are not related (independent).

Medical stats: 10% of the population has Disease A. A test is 90% accurate (gives correct positive if disease is present, correct negative if disease is absent). If a person receives a positive result, what is the probability they actually have Disease A?

0.5 or 50%

Peter bets $100. Probabilities: P(lose $100) = 0.5, P(break even, $0) = 0.3, P(win $100) = 0.2. Let X be the gain. What is the expected value E(X)?

-$30

Peter bets $100. Probabilities: P(lose $100) = 0.5, P(break even, $0) = 0.3, P(win $100) = 0.2. E(X) = -30. What is the variance V(X)?

6100

If Peter bets $1,000 (Y=10X) in a single race, using E(X)=-30 from the previous question, what is the expected value E(Y)?

-$300

If Peter bets $1,000 (Y=10X) in a single race, using V(X)=6100 from the previous question, what is the variance V(Y)?

610,000

Assuming races are independent, what is the expected value of Peter's total outcome if he bets $100 in each of 10 races ($Y = X_1 + ... + X_{10}$)? Use E(X) = -30.

-$300

Assuming races are independent, what is the variance of Peter's total outcome if he bets $100 in each of 10 races ($Y = X_1 + ... + X_{10}$)? Use V(X) = 6100.

61,000

Given X ~ Binomial(n=25, p=0.3), use the provided binomial probability table to find $P(X \le 12)$.

0.983

Given X ~ Binomial(n=25, p=0.3), use the provided binomial probability table to find $P(8 \le X \le 12)$.

0.471

Given X ~ Binomial(n=25, p=0.3), use the provided binomial probability table to find $P(X \ge 12)$.

0.044

Given X ~ Binomial(n=25, p=0.3), use the provided binomial probability table to find $P(X = 12)$.

0.027

Store expects 20,000 customers. 25% make a purchase. What is the expected number of purchasing customers?

5000

Store expects 20,000 customers, 25% make a purchase. What is the variance in the number of purchasing customers?

3750

Store expects 5000 purchasing customers (E(X)=5000). Average purchase is $50. What is the expected total sales revenue?

$250,000

Store's variance in number of purchasing customers is 3750 (V(X)=3750). Average purchase is $50. What is the standard deviation of the total sales revenue? Round to two decimal places.

$3061.86

Flashcards

Qualitative Data

Data described by non-numerical categories.

Quantitative Data

Data that can be counted and expressed numerically.

Discrete Data

Data with distinct, separate values.

Continuous Data

Data that can take any value within a range.

Nominal Data

Data with categories that cannot be ordered.

Ordinal Data

Data with ranked and ordered categories.

Interval Data

Data with equal intervals and an arbitrary zero point.

Ratio Data

Data with a true zero point and equal intervals.

Frequency Distribution Table

A summary of data that consists of class intervals and their frequencies.

Relative Frequency

Class frequency divided by total.

Cumulative Frequency

The sum of all frequencies up to a specific class.

Median

The value located at the center of a dataset when arranged in order.

Mode

The value that appears most frequently in a dataset.

Variance

A measure of the spread of data around the mean, always greater than or equal to zero.

Interquartile Range (IQR)

The difference between the third quartile (Q3) and the first quartile (Q1).

Standardization

Subtracting the mean from a data value, then dividing by the standard deviation.

Degrees of Freedom

The number of independent pieces of information used to calculate a statistic.

Bivariate Data

Data obtained from examining two variables.

Contingency Table

A frequency table that displays bivariate data.

Correlation Coefficient

The covariance divided by the standard deviations of the two variables, indicating the strength and direction of a linear relationship.

Sample Correlation Coefficient

A correlation coefficient computed from sample data; its value varies from sample to sample.

Conditional Probability

The probability of an event given that another event has occurred.

No Correlation != Independence

'No correlation' and 'statistical independence' are different concepts, not the same.

Probability Distribution

A probability distribution consists of the sample space of a random variable and the associated probabilities.

Expected Value

The sum of each possible value multiplied by its probability.

Bernoulli Trial

An independent trial with exactly two mutually exclusive outcomes (success or failure).

Binomial Random Variable

Represents the number of successes in a fixed number of independent Bernoulli trials.

Independent Trials

Trials in which the outcome of one trial does not affect the others.

Standard Deviation (Binomial)

The spread around the mean; for a binomial distribution it is sqrt(np(1-p)).

Study Notes

Statistical analysis notes
Sangyung Lee, PHD
Kyung Hee University

True and False Questions

There are 15 True or False questions worth 2 points each, for a total of 30 points

Chapter 1

Discrete data cannot be ratio data
Even ratio data can be measured using an ordinal scale
Inferential statistics deal with making inferences about a population
A population can be considered a set of numbers
The daily sleep duration of housewives in Korea cannot be considered a population
Qualitative data refer to data that can be quantified

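Chapter 1 Answers

In the answer lists in these notes, a statement marked (False) is given in its corrected form; the label records the verdict on the original statement.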

Discrete data can be ratio data (False)
Even ratio data can be measured using an ordinal scale (True)
Inferential statistics deal with making inferences about a population (True)
A population can be considered a set of numbers (True)
The daily sleep duration of housewives in Korea can be considered a population (False)
Qualitative data cannot be quantified (False)

Population vs Sample

In inferential statistics, population characteristics are called parameters
Sample characteristics are called statistics
Population parameters are: Mean (μ), Variance (σ²), and Standard Deviation (σ)
Sample statistics are: Mean (x̄), Variance (s²), and Standard Deviation (s)

Data types

Data is either Qualitative or Quantitative
Quantitative data branch out into discrete or continuous data
Qualitative data branch out into Nominal data or Ordinal data
Quantitative data branch out into Interval data or Ratio data

Chapter 2

A frequency distribution table consists of class intervals and frequencies
Relative frequency is calculated by dividing the class frequency by the total frequency
Cumulative frequency is the sum of all frequencies up to a specific class
Frequency distribution may be represented using a graph
The sum of relative frequencies in a frequency distribution table is always 0
The first step in creating a frequency distribution table is determining the class width
Class width is determined by dividing the range of data by a number of class intervals

Chapter 2 Answers

A frequency distribution table consists of class intervals and frequencies (True)
Relative frequency is calculated by dividing the class frequency by the total frequency (True)
Cumulative frequency is the sum of all frequencies up to a specific class (True)
Frequency distribution may be represented using a graph (True)
The sum of relative frequencies in a frequency distribution table is always 1 (False)
The first step in creating a frequency distribution table is determining the class interval (False)
Class width is determined by dividing the range of data by a number of class intervals (True)

Chapter 3

The mode is the value located at the center of a dataset arranged in ascending order
Variance is always greater than or equal to zero
The interquartile range can have a negative value
To standardize ordinal data, the percentile should be calculated
The mean of standardized values of any population is always 0
Standardization refers to subtracting the mean from a data value and dividing it by the variance
Degrees of freedom refer to the actual number of observations used in calculating a sample statistic
The standard deviation has a different unit of measurement than the original data
If the sample size is sufficiently large, the sample variance becomes nearly equal to the population variance

Chapter 3 Answers

The mode is the most frequent value; the value at the center of a dataset arranged in ascending order is the median (False)
Variance is always greater than or equal to zero (True)
Interquartile range cannot be a negative value (False)
To standardize ordinal data, the percentile is calculated (True)
The mean of standardized values of any population is always 0 (True)
Standardization subtracts the mean from a data value and divides it by the standard deviation (False)
Degrees of freedom is the actual number of observations used in calculating a sample statistic (True)
The standard deviation has the same unit of measurement as the original data (False)
When the sample size is adequately large, the sample variance is nearly equal to population variance (True)

Degrees of Freedom

Computing the sample mean uses up one degree of freedom: once x̄ is fixed, only n-1 of the deviations are free to vary, because the deviations must sum to zero
When estimating variance from a sample, divide by n-1 instead of n to correct for bias
This adjustment makes the sample variance an unbiased estimator of the population variance
Case: When the population mean (μ) is known, then the following is true:
- Formula: s² = 1/n * Σ(xᵢ - μ)²
- Reason: Deviations are measured from the true mean, so no correction is needed
Case: When population mean is unknown (typical), then the following is true:
- Formula: s² = 1/(n-1) * Σ(xᵢ - x̄)²
- Reason: The sample mean (x̄) replaces the unknown population mean, using one degree of freedom, thus a correction is required.
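
The effect of the two divisors can be checked with a small simulation. This is only a sketch: the population (normal, mean 0, variance 4), the sample size, and the number of repetitions are assumptions chosen for the demonstration, not values from the course.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 0.0, 2.0      # assumed population: variance = 4
N, REPS = 5, 20000        # small samples make the bias visible

sum_div_n = 0.0           # accumulates estimates that divide by n
sum_div_n_minus_1 = 0.0   # accumulates estimates that divide by n - 1

for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = sum(sample) / N
    # squared deviations measured from the sample mean, not the true mean
    ss = sum((x - x_bar) ** 2 for x in sample)
    sum_div_n += ss / N
    sum_div_n_minus_1 += ss / (N - 1)

avg_div_n = sum_div_n / REPS
avg_div_n_minus_1 = sum_div_n_minus_1 / REPS

print("average estimate, divisor n:    ", round(avg_div_n, 3))
print("average estimate, divisor n - 1:", round(avg_div_n_minus_1, 3))
```

With these assumed settings, the divisor-n average should settle near 4 × (n-1)/n = 3.2, while the divisor-(n-1) average should settle near the true variance of 4.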

Chapter 4

Bivariate data refers to data obtained by examining two variables
A contingency table cannot be considered a frequency table that displays bivariate data
The correlation coefficient is the covariance divided by the standard deviations of the two variables
The sample correlation coefficient is a type of sample statistic
The values of the sample correlation coefficient and the population correlation coefficient always match
If the correlation coefficient is 0, no relationship exists between the two populations
The correlation coefficient takes values between 0 and 1

Chapter 4 Answers

Bivariate data refers to data obtained by simultaneously examining two variables (True)
A contingency table can be considered a frequency table that displays bivariate data (False)
The correlation coefficient is the covariance divided by the standard deviations of the two variables (True)
The sample correlation coefficient is a type of sample statistic (True)
The values of the sample correlation coefficient and the population correlation coefficient do not always match, because the sample value varies from sample to sample (False)
If the correlation coefficient is 0, there is no relationship between the two populations (True)
The correlation coefficient takes values between -1 and 1 (False)

Chapter 5

Conditional probability can be expressed using joint probability and marginal probability
"No correlation" and "statistical independence" are different concepts
Bivariate data, where samples are drawn simultaneously from two populations, can be considered a compound event
If the joint probability of two events equals the product of their marginal probabilities, the two events are considered independent
If a conditional probability can be expressed as the product of a joint probability and a marginal probability, the two events are considered independent

Chapter 5 Answers

Conditional probability can be expressed using joint probability and marginal probability (True)
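"No correlation" and "statistical independence" are different concepts: independence implies zero correlation, but zero correlation does not imply independence (True)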
In bivariate data where samples are simultaneously drawn from two populations, it can be considered a compound event (True)
If two events' joint probability equals the product of marginal probabilities, the two events are independent (True)
If conditional probability can be expressed as the product of a joint probability and marginal probability, the two events are considered independent (False)

Chapter 6

A probability distribution consists of the sample space of a random variable and the associated probabilities
Any random variable, when standardized, has a variance of 0
The expected value of a random variable is conceptually the same as its mean
The expected value of a random variable is calculated by multiplying each possible value by its corresponding probability and summing the results
The standard deviation of a random variable is the square root of the average of the squared deviations from the mean
The variance of a new random variable created by adding two random variables is equal to the sum of their individual variances

Chapter 6 answers

The sample space of a random variable and its associated probabilities make up a probability distribution (True)
When standardized, any random variable has a variance of 1 (False)
The expected value of a random variable is conceptually the same as its mean (True)
The expected value of a random variable is calculated by multiplying each possible value by its corresponding probability and summing the results (True)
A random variable’s standard deviation is the square root of the average of the squared deviations from the mean (True)
The variance of a new random variable created by adding two random variables equals the sum of their individual variances only when their covariance is zero, for example when the two variables are independent (False)

Chapter 7

A binomial random variable can be expressed as the sum of Bernoulli random variables
The variance of a binomial distribution is the mean multiplied by the failure probability
Independent trials mean that the outcome of one trial does not affect the others
The standard deviation of a binomial distribution is np(1-p)
A discrete random variable takes only integer values
A binomial random variable is a continuous random variable
A Bernoulli trial is an independent trial with exactly two mutually exclusive outcomes
If the probability of success is less than 0.5, the binomial distribution has a long tail to the left
Even if the success probability is 0.5, the binomial distribution may not be symmetric
The total of all binomial probabilities always sums to 1

Chapter 7 Answers

A binomial random variable is the sum of Bernoulli random variables (True)
The variance of a binomial distribution is the mean multiplied by the failure probability (True)
Independent trials mean that the outcome of one trial does not affect the others (True)
The standard deviation of a binomial distribution is √(np(1-p)) (False)
A discrete random variable takes only integer values (True)
A binomial random variable is a discrete random variable (False)
A Bernoulli trial is an independent trial with exactly two mutually exclusive outcomes (True)
If the probability of success is less than 0.5, the binomial distribution has a long tail to the right (False)
If the success probability is 0.5, the binomial distribution is symmetric (False)
The total of all binomial probabilities always sums to 1 (True)

Problem Solving

There are 10 Problem Solving questions for a total of 70 points
The presenter suggested that you will need a calculator

Chapter 3 Formula

Population mean: μ = ΣXᵢ / N
Sample mean: x̄ = Σxᵢ / n
Range = maximum value - minimum value
Interquartile range (IQR) = Q3 - Q1
Population deviation = Xᵢ - μ
Sample deviation = xᵢ - x̄
Population variance: σ² = Σ(Xᵢ - μ)² / N
Population standard deviation: σ = √(Σ(Xᵢ - μ)² / N)
Sample variance: s² = Σ(xᵢ - x̄)² / (n-1)
Sample standard deviation: s = √(Σ(xᵢ - x̄)² / (n-1))
Z-score = (Xᵢ - μ) / σ

C3-1

Answer the questions based on the given data:
Data: 6, 8, 10, 10, 10, 12, 14
Calculate the mean and median
Determine the range and interquartile range (IQR)
Standardize the value 10 using Z-scores

C3-1 Answer

Given Data: 6, 8, 10, 10, 10, 12, 14
Mean = (6+8+10+10+10+12+14)/7 = 10
Median = 10
Range = max-min = 14-6 = 8
IQR = Q3-Q1 = 12-8 = 4
Z-score of 10 = (Xᵢ - μ)/σ = (10 - 10)/σ = 0
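
The C3-1 results can be reproduced with Python's standard `statistics` module. This is a sketch for checking the arithmetic, not the course's prescribed method. Quartile conventions differ between textbooks and software; the "exclusive" method is chosen here because it gives Q1 = 8 and Q3 = 12 for this dataset, matching the answer above.

```python
import statistics

data = [6, 8, 10, 10, 10, 12, 14]

mean = statistics.mean(data)
median = statistics.median(data)
data_range = max(data) - min(data)

# Quartiles with the "exclusive" method (an assumption about the convention)
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

# The data are treated as a population, so use the population standard deviation
sigma = statistics.pstdev(data)
z_of_10 = (10 - mean) / sigma

print(mean, median, data_range, iqr, z_of_10)
```

Because the value 10 equals the mean, its Z-score is 0 regardless of the size of σ.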

C3-2

The following table outlines the steps for calculating the sample mean, variance, and standard deviation of variable x.
You have to fill in the blanks

C3-2 Answer

Calculations for the sample mean, variance, and standard deviation of variable x.
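
The table itself is not reproduced in these notes. As a sketch, the totals quoted in the questions for table C3-2 (sum of x × frequency = 670, sum of squared deviation × frequency = 1017, n = 100) are enough to finish the calculation. The variable names are illustrative.

```python
import math

n = 100                 # sample size (total frequency)
sum_x_f = 670           # sum of x * frequency
sum_sq_dev_f = 1017     # sum of (x - mean)^2 * frequency

x_bar = sum_x_f / n             # sample mean
s2 = sum_sq_dev_f / (n - 1)     # sample variance, divisor n - 1
s = math.sqrt(s2)               # sample standard deviation

print(x_bar, round(s2, 3), round(s, 3))
```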

Chapter 4 Formula

Population covariance: σxy = Σ((Xᵢ - μx)(Yᵢ - μy)) / N
Sample covariance: sxy = Σ((xᵢ - x̄)(yᵢ - ȳ)) / (n-1)
Correlation: population ρ = σxy / (σx σy); sample r = sxy / (sx sy)

C4

Calculate the sample covariance and correlation coefficient between the statistics scores of five students
Fill in the blanks

C4 Answer
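
The C4 table is not reproduced in these notes. As a sketch, the totals quoted in the questions (sum of squared deviations 850 for March and 350 for September, sum of products of deviations 225, five students) are enough to recover the variances, covariance, and correlation. The variable names are illustrative.

```python
import math

n = 5            # five students
ss_x = 850       # sum of squared deviations, March scores (x)
ss_y = 350       # sum of squared deviations, September scores (y)
sp_xy = 225      # sum of products of deviations

var_x = ss_x / (n - 1)      # sample variance of x
var_y = ss_y / (n - 1)      # sample variance of y
cov_xy = sp_xy / (n - 1)    # sample covariance

sd_x = math.sqrt(var_x)
sd_y = math.sqrt(var_y)
r = cov_xy / (sd_x * sd_y)  # sample correlation coefficient

print(var_x, var_y, cov_xy, round(sd_x, 2), round(sd_y, 2), round(r, 3))
```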

Chapter 5 Formula

Marginal Probability
- P(A) = Probability that event A occurs
Joint Probability
- P(A∩B) = Probability that event A and B occur together
Conditional Probability
- P(A|B) = Probability that event A occurs given that event B has occurred
- P(A|B) = P(A ∩ B) / P(B)
- P(B|A) = P(A ∩ B) / P(A)
- P(A ∩ B) = P(A|B) × P(B) = P(B|A) × P(A)
When events A and B are mutually independent:
- P(A|B) = P(A)
- P(A ∩ B) = P(A|B) × P(B) = P(A) × P(B)

C5-1

Determine whether there is a relationship between "opinion" and "gender" according to the table below
Determine whether there is a relationship between "decision" and "ad type" according to the table below

C5-1 Answer

Opinion and gender are related:
- P(Agree ∩ Male) = 0.42
- P(Agree) × P(Male) = 0.54 × 0.6 = 0.324
- P(Agree ∩ Male) ≠ P(Agree) × P(Male)
Answer: Related
Decision and ad type are not related:
- P(Purchase ∩ Offline) = 0.324
- P(Purchase) × P(Offline) = 0.54 × 0.6 = 0.324
- P(Purchase ∩ Offline) = P(Purchase) × P(Offline)
Answer: Not related (independent)
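
The independence check used in C5-1 can be written as a small helper. This is a sketch; the function name and the tolerance are illustrative choices. The tolerance is needed because floating-point products such as 0.54 × 0.6 are not stored exactly.

```python
import math

def is_independent(p_joint, p_a, p_b, tol=1e-9):
    """Return True when P(A and B) equals P(A) * P(B) within a tolerance."""
    return math.isclose(p_joint, p_a * p_b, abs_tol=tol)

# Opinion vs gender: P(Agree and Male) = 0.42, P(Agree) = 0.54, P(Male) = 0.6
opinion_gender = is_independent(0.42, 0.54, 0.6)

# Decision vs ad type: P(Purchase and Offline) = 0.324, P(Purchase) = 0.54, P(Offline) = 0.6
decision_ad = is_independent(0.324, 0.54, 0.6)

print(opinion_gender, decision_ad)
```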

C5-2

The test gives the correct result for 90% of people who have the disease and for 90% of people who do not
10% of the world's population has the disease
What is the probability of one having the disease, given a test comes back as positive?

C5-2 Answer

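P = test positive, Pᶜ = test negative
D = has disease, Dᶜ = no disease
P(D) = 0.1, so P(Dᶜ) = 0.9
P(P|D) = 0.9
P(Pᶜ|Dᶜ) = 0.9, so P(P|Dᶜ) = 0.1
P(D∩P) = P(D) × P(P|D) = 0.1 × 0.9 = 0.09
P(Dᶜ∩P) = P(Dᶜ) × P(P|Dᶜ) = 0.9 × 0.1 = 0.09
P(D|P) = P(D∩P) / (P(D∩P) + P(Dᶜ∩P)) = 0.09 / 0.18 = 0.5
Therefore, if a test comes back positive, there is a 50% chance that you actually have the disease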
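
The same calculation as a short function, as a sketch. The parameter names `sensitivity` (correct positive rate among the diseased) and `specificity` (correct negative rate among the healthy) are standard terms introduced here for illustration; the notes describe both simply as 90% accuracy.

```python
def posterior(prior, sensitivity, specificity):
    """P(disease | positive test) by Bayes' rule."""
    true_pos = prior * sensitivity                 # P(D and P)
    false_pos = (1 - prior) * (1 - specificity)    # P(not D and P)
    return true_pos / (true_pos + false_pos)

p = posterior(prior=0.1, sensitivity=0.9, specificity=0.9)
print(round(p, 3))
```

The result is far below the 90% accuracy of the test because healthy people outnumber diseased people nine to one, so false positives are as common as true positives.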

Chapter 6 Formula

X: Random variable; P(X): Assigned Probability of X; E(X): Expected Value of X; V(X): Variance of X; COV(X,Y): Covariance of X and Y
E(X) = Σ(X × P(X))
E(a) = a
E(aX) = a × E(X)
E(X+a) = E(X) + a
E(X+Y) = E(X) + E(Y)
V(X) = Σ((X - E(X))² × P(X))
V(a) = 0
V(aX) = a² × V(X)
V(X+Y) = V(X) + V(Y) + 2 × COV(X,Y)

C6

Given the "Peter problem":
With a 50% probability, he loses all his money.
With a 30% probability, he breaks even.
With a 20% probability, he wins $100.
- What are the expected value and variance if his bet is $100?
- What are the expected value and variance if his bet is $1,000?
- Under the assumption that the races are statistically independent, what are the expected value and variance of his net gain if he bets $100 in each of 10 races?

C6-1

Compute the expected value and variance of the amount Peter can gain from a $100 bet
Fill in the blanks

C6-1 Answer

What are the expected value and variance of the amount Peter can gain from a $100 bet?
E(X)=(-100)×0.5+0×0.3+100×0.2=-30
V(X)=(-100+30)²×0.5+(0+30)²×0.3+(100+30)²×0.2=6100
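
The two sums can be checked with a short sketch that represents the distribution as a dictionary of gain to probability. The helper names are illustrative.

```python
def expected_value(dist):
    """E(X) for a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """V(X) = sum of (x - E(X))^2 * P(x)."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

gain = {-100: 0.5, 0: 0.3, 100: 0.2}   # Peter's gain from a $100 bet

e_x = expected_value(gain)
v_x = variance(gain)
print(round(e_x, 2), round(v_x, 2))
```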

Q6-2

Find the expected value and variance if he bets $1,000
- Y = 10X

Q6-2 Answer

Y=10X
E(Y)=E(10X)=10×E(X)=-300
V(Y)=V(10X)=100×V(X)=610000

Q6-3

Assuming the races are independent, what are the expected value and variance if he bets $100 in each of 10 races?

Q6-3 Answer

Let Y=X1+X2+X3+...+X10
E(Y)=E(X1+X2+X3+...+X10)=10×E(X)=-300
V(Y)=V(X1+X2+X3+...+X10)=10×V(X)=61000
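
A sketch comparing the two ways of staking $1,000, using E(X) = -30 and V(X) = 6100 from the $100 bet. The expected values agree, but one large bet scales the variance by 10², while ten independent small bets only add ten variances.

```python
E_X, V_X = -30, 6100     # results for a single $100 bet

# One $1,000 bet: Y = 10X
e_single = 10 * E_X
v_single = 10 ** 2 * V_X     # V(aX) = a^2 * V(X)

# Ten independent $100 bets: Y = X1 + ... + X10
e_ten = 10 * E_X
v_ten = 10 * V_X             # covariances are zero, so the variances add

print(e_single, v_single, e_ten, v_ten)
```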

Chapter 7 formula

The probability that a desired event (success) occurs X times in n independent Bernoulli trials with success probability p:
X = X₁ + X₂ + … + Xₙ (where Xᵢ = 1 if trial i is a success and Xᵢ = 0 if it is a failure)
P(X = k) = nCk × p^k × (1-p)^(n-k)
nCk = n! / (k! × (n-k)!)
n! = n × (n-1) × (n-2) × … × 1
E(X) = Σ(X × P(X)) = np
V(X) = Σ((X - E(X))² × P(X)) = np(1-p)
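
The formulas above can be checked numerically. In this sketch, n = 10 and p = 0.3 are illustrative values, not taken from the notes; `math.comb` supplies nCk.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3   # illustrative values
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                                              # should be 1
mean = sum(k * pk for k, pk in enumerate(pmf))                # should be n * p
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))   # should be n * p * (1 - p)

print(round(total, 6), round(mean, 6), round(var, 6))
```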

C7-1

Use the provided binomial probability table to answer questions about X~Binomial(n=25, p=0.3)

C7-1 Answer

If X~Binomial(n=25, p=0.3), P(X≤12)=0.983
If X~Binomial(n=25, p=0.3), P(8≤X≤12)=P(X≤12)-P(X≤7)=0.983-0.512=0.471
If X~Binomial(n=25, p=0.3), P(X≥12)=1-P(X≤11)=1-0.956=0.044
If X~Binomial(n=25, p=0.3), P(X=12)=P(X≤12)-P(X≤11)=0.983-0.956=0.027
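
When no table is at hand, the cumulative probabilities can be summed directly. This is a sketch, with `binom_cdf` as an illustrative helper; its results should agree with the table values above up to rounding in the third decimal place.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

n, p = 25, 0.3

p_le_12 = binom_cdf(12, n, p)                          # P(X <= 12)
p_8_to_12 = binom_cdf(12, n, p) - binom_cdf(7, n, p)   # P(8 <= X <= 12)
p_ge_12 = 1 - binom_cdf(11, n, p)                      # P(X >= 12)
p_eq_12 = binom_cdf(12, n, p) - binom_cdf(11, n, p)    # P(X = 12)

print(round(p_le_12, 3), round(p_8_to_12, 3), round(p_ge_12, 3), round(p_eq_12, 3))
```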

C7-2

25% of visiting customers make a purchase at the store; the average amount is $50
The store expects about 20k visitors per year

C7-2 Answer

Let X~Binomial(n=20,000, p=0.25) be the number of purchasing customers
Expected number of purchasing customers: E(X)=np=20000×0.25=5000
By the formula for the variance of a binomial distribution:
V(X)=np(1-p)=20000×0.25×0.75=3750
If each purchasing customer spends $50 on average, total sales revenue is Y=50X, so E(Y)=E(50X)=50×E(X)=50×5000=250000
V(Y)=V(50X)=50²×V(X)=50²×3750=9375000
Standard deviation of Y: √V(Y)=√9375000≈3061.86
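
The C7-2 arithmetic as a sketch. It follows the notes in treating the $50 average purchase as a fixed amount per purchasing customer, so revenue is Y = 50X; the variable names are illustrative.

```python
import math

n, p = 20_000, 0.25      # visitors per year, purchase probability
avg_purchase = 50        # dollars per purchasing customer, treated as a constant

e_x = n * p                      # expected number of purchasers
v_x = n * p * (1 - p)            # variance of the number of purchasers

e_y = avg_purchase * e_x         # E(50X) = 50 * E(X)
v_y = avg_purchase ** 2 * v_x    # V(50X) = 50^2 * V(X)
sd_y = math.sqrt(v_y)            # standard deviation of revenue

print(e_x, v_x, e_y, v_y, round(sd_y, 2))
```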

Studying That Suits You

Use AI to generate personalized quizzes and flashcards to suit your learning preferences.

Statistical Analysis Chapter 1

Choose a study mode

Podcast

Questions and Answers

Discrete data cannot be ratio data.

Even ratio data can be measured using an ordinal scale.

Inferential statistics deal with making inferences about a population.

A population can be considered a set of numbers.

The daily sleep duration of housewives in Korea cannot be considered a population.

Qualitative data refer to data that can be quantified.

A frequency distribution table consists of class intervals and frequencies.

Relative frequency is calculated by dividing the class frequency by the total frequency.

Cumulative frequency is the sum of all frequencies up to and including a specific class.

Every frequency distribution table can be represented using a graph.

The sum of relative frequencies in a frequency distribution table is always 0.

The first step in creating a frequency distribution table is determining the class width.

Class width is determined by dividing the range of data by the number of class intervals.

The mode is the value located at the center of a dataset arranged in ascending order.

Variance is always greater than or equal to zero.

The interquartile range can have a negative value.

To standardize ordinal data, the percentile should be calculated.

The mean of standardized values (Z-scores) of any population is always 0.

Standardization refers to subtracting the mean from a data value and dividing it by the variance.

Degrees of freedom refer to the actual number of observations used in calculating a sample statistic.

The standard deviation has a different unit of measurement than the original data.

If the sample size is sufficiently large, the sample variance becomes nearly equal to the population variance.

Bivariate data refers to data obtained by simultaneously examining two variables.

A contingency table cannot be considered a frequency table that displays bivariate data.

The correlation coefficient is the covariance divided by the standard deviations of the two variables.

The sample correlation coefficient is a type of sample statistic.

The values of the sample correlation coefficient and the population correlation coefficient always match.

If the correlation coefficient is 0, there is no relationship between the two populations.

The correlation coefficient takes values between 0 and 1.

Conditional probability can be expressed using joint probability and marginal probability.

"No correlation" and "statistical independence" are different concepts.

Bivariate data, in which samples are drawn simultaneously from two populations, can be considered a compound event.

If the joint probability of two events equals the product of their marginal probabilities, the two events are considered independent.

If a conditional probability can be expressed as the product of a joint probability and a marginal probability, the two events are considered independent.

A probability distribution consists of the sample space of a random variable and the associated probabilities.

Any random variable, when standardized, has a variance of 0.

The expected value of a random variable is conceptually the same as its mean.

The expected value of a random variable is calculated by multiplying each possible value by its corresponding probability and summing the results.

The standard deviation of a random variable is the square root of the average of the squared deviations from the mean.

The variance of a new random variable created by adding two random variables is equal to the sum of their individual variances.

A binomial random variable can be expressed as the sum of Bernoulli random variables.

The variance of a binomial distribution is the mean multiplied by the failure probability.

Independent trials mean that the outcome of one trial does not affect the others.

The standard deviation of a binomial distribution is $np(1-p)$.

A discrete random variable takes only integer values.

A binomial random variable is a continuous random variable.

A Bernoulli trial is an independent trial with exactly two mutually exclusive outcomes.

If the probability of success is less than 0.5, the binomial distribution has a long tail to the left.

Even if the success probability is 0.5, the binomial distribution may not be symmetric.

The total of all binomial probabilities always sums to 1.

Given the data: 6, 8, 10, 10, 10, 12, 14. Calculate the mean and median.

Given the data: 6, 8, 10, 10, 10, 12, 14. Determine the range and interquartile range (IQR).

Given the data: 6, 8, 10, 10, 10, 12, 14. Standardize the value 10 using Z-scores (assuming this data represents the population).

In calculating the sample mean for the grouped data in table C3-2, the sum of (x * Frequency) is 670. What divisor is used to find the average $\bar{x}$? $\bar{x} = 670 / (_____)$

In calculating the sample variance for the grouped data in table C3-2, the sum of (Squared Deviation * Frequency) is 1017. Assuming a sample size n=100, what divisor is used to find the sample variance $s^2$? $s^2 = 1017 / (_____)$

Given the sample variance $s^2 = 1017 / 99$ from table C3-2, calculate the sample standard deviation (s). Round to three decimal places.

Calculate the sample mean $ar{x}$ for the March scores (x) from table C4.

Calculate the sample mean $ar{y}$ for the September scores (y) from table C4.

Calculate the sample variance $s_x^2$ for the March scores (x) using the sum of squared deviations (850) from table C4.

Calculate the sample variance $s_y^2$ for the September scores (y) using the sum of squared deviations (350) from table C4.

Calculate the sample covariance $s_{xy}$ using the sum of the products of deviations (225) from table C4.

Calculate the sample standard deviation $s_x$ for the March scores (x) based on $s_x^2 = 212.5$. Round to two decimal places.

Calculate the sample standard deviation $s_y$ for the September scores (y) based on $s_y^2 = 87.5$. Round to two decimal places.

Calculate the sample correlation coefficient using $s_{xy}=56.25$, $s_x=14.58$, and $s_y=9.35$. Round to three decimal places.

According to the first contingency table (Gender vs Opinion), are 'Opinion' and 'Gender' related? Justify using probabilities.

According to the second contingency table (AD type vs Decision), are 'Decision' and 'AD type' related? Justify using probabilities.

Medical stats: 10% of the population has Disease A. A test is 90% accurate (gives correct positive if disease is present, correct negative if disease is absent). If a person receives a positive result, what is the probability they actually have Disease A?

Peter bets $100. Probabilities: P(lose $100) = 0.5, P(break even, $0) = 0.3, P(win $100) = 0.2. Let X be the gain. What is the expected value E(X)?

Peter bets $100. Probabilities: P(lose $100) = 0.5, P(break even, $0) = 0.3, P(win $100) = 0.2. E(X) = -30. What is the variance V(X)?

If Peter bets $1,000 (Y=10X) in a single race, using E(X)=-30 from the previous question, what is the expected value E(Y)?

If Peter bets $1,000 (Y=10X) in a single race, using V(X)=6100 from the previous question, what is the variance V(Y)?

Assuming races are independent, what is the expected value of Peter's total outcome if he bets $100 in each of 10 races ($Y = X_1 + ... + X_{10}$)? Use E(X) = -30.

Assuming races are independent, what is the variance of Peter's total outcome if he bets $100 in each of 10 races ($Y = X_1 + ... + X_{10}$)? Use V(X) = 6100.

Given X ~ Binomial(n=25, p=0.3), use the provided binomial probability table to find $P(X \le 12)$.

Given X ~ Binomial(n=25, p=0.3), use the provided binomial probability table to find $P(8 \le X \le 12)$.

Given X ~ Binomial(n=25, p=0.3), use the provided binomial probability table to find $P(X \ge 12)$.
