Random Variables and Probability

Questions and Answers

Which of the following is a key characteristic of a 'fair' die in the context of probability?

  • The numbers on the die are chosen randomly without sequence.
  • It has been used in several previous rolls, influencing future outcomes.
  • Each of its faces has an equal probability of landing face up. (correct)
  • Its sides are made of different materials, affecting its balance.

How are discrete random variables different from continuous random variables?

  • Discrete variables are associated with measurable quantities, while continuous variables represent countable entities.
  • Discrete variables can assume any value within a given range, while continuous variables can only take specific, predetermined values.
  • Discrete random variables must be integers, while continuous random variables can be fractions.
  • Discrete variables have a finite or countable set of possible values, while continuous variables can take on a continuum of values within a range. (correct)

Given a random variable X representing the sum of two dice rolls, and another random variable Y representing the maximum of the same two dice rolls, how would you describe the relationship between the event spaces of X and Y?

  • The event spaces for X and Y are identical because they are derived from the same sample space.
  • The event space for X is a subset of the event space for Y, because the sum will always be greater or equal to the max.
  • The event space for Y is a subset of the event space for X, because the max will always be less than or equal to the sum.
  • The event spaces for X and Y are different, as they represent different functions applied to the same original space of outcomes. (correct)

What is the primary goal of creating a probability distribution for a random variable?

  • To describe the likelihood of all possible values that the random variable can assume. (correct)

If events A and B are labeled as 'independent', what does this imply about their probabilities?

  • The occurrence of one event does not affect the probability of the other event occurring. (correct)

A factory has two machines: Machine A produces 100 items with 5% defective, and Machine B produces 150 items with 8% defective. If one item is chosen at random from each machine, what additional information is needed to calculate the probability that both items are non-defective?

  • Whether the machines are operating independently. (correct)

What is the primary purpose of a joint probability distribution involving two random variables?

  • To analyze the relationship between the two variables. (correct)

How does calculating the marginal probability from a joint probability distribution simplify analysis?

  • It isolates the probability of each variable, regardless of the other. (correct)

Which of the following statements accurately describes the relationship between probability mass function (PMF) and cumulative distribution function (CDF)?

  • CDF gives the probability that a random variable is less than or equal to a value, while PMF gives the probability it is exactly equal to a value. (correct)

What characteristic differentiates the use of standard deviation from variance?

  • Standard deviation is measured in the same units as the data. (correct)

Why is it important to normalize covariance when comparing the relationship between variables in different datasets?

  • Normalization scales the values to a consistent range between -1 and +1, allowing straightforward comparisons. (correct)

What does a high standard deviation signify in the context of a dataset?

  • It indicates that the values in the dataset are spread out over a wider range. (correct)

Why is correlation more often used than covariance when comparing the relationships between variables across machine learning domains?

  • Correlation is usually preferred for machine learning. (correct)

What is the significance of the expected value (mean) of a random variable?

  • It provides a 'weighted average' of possible outcomes. (correct)

How would you describe covariance?

  • It indicates how two variables change together linearly. (correct)

Given two independent events A and B, where P(A) = 0.6 and P(B) = 0.4, what is the probability of both A and B occurring?

  • 0.24 (correct)

In the context of random variables, what does the term 'outcome' refer to?

  • A possible value that the random variable can take. (correct)

How does knowing the probability distribution of a customer's taste for commodities benefit a shop manager?

  • It helps predict which products will sell the most. (correct)

In the context of random variables, what does 'PMF' mean?

  • Probability Mass Function (correct)

Given that events A and B are mutually exclusive, what can be said about whether they are independent?

  • They are not independent. (correct)

In a study, event A is 'a person smokes' and event B is 'a person develops lung cancer'. If these events are independent, what is true?

  • Smoking has no effect on the likelihood of getting lung cancer. (correct)

What is the benefit of finding the correlation between variables?

  • It allows comparing two sets of data. (correct)

You have a six-sided die that is weighted such that the probability of rolling a 6 is twice as high as that of any other number. What is the probability of rolling a 6?

  • $2/7$ (correct)

You have a jar of marbles, 3 blue, 2 green, and 5 red. What is the probability that you draw a blue marble, followed by a red marble without replacement?

  • $1/6$, since $(3/10) \times (5/9) = 15/90$ (correct)

Consider rolling two dice. Let X be the event of rolling doubles, Y the event that the sum is greater than 7. What statement best describes the relationship between rolling doubles and rolling a sum greater than 7?

  • They are positively correlated. (correct)

Event A is 'it rains tomorrow', and event B is 'the local baseball team wins their game tomorrow'. Are these two events typically independent or dependent, and why?

  • Dependent, as rain can affect the likelihood of the team playing well or the game being played. (correct)

A coffee shop owner uses past sales data and finds that the mean daily coffee sales is 200 cups with a standard deviation of 20 cups. What does the standard deviation tell you about the daily sales?

  • Most days, sales are between 180 and 220 cups. (correct)

Which of the following is an example of discrete data?

  • The number of students in a class. (correct)

The weights of items from machine A are normally distributed with a mean of 50 grams, and the weights of items from machine B are also normally distributed with a mean of 50 grams. Knowing only this, what additional information is needed to determine which machine is more consistent in its production weights?

  • The standard deviations of the item weights. (correct)

How does statistical independence simplify the calculation of joint probabilities?

  • By multiplying the probabilities of individual events. (correct)

Which of the following is a discrete random variable?

  • The number of books on a bookshelf. (correct)

The number of dogs in each household of your neighborhood is a discrete random variable. You perform a survey to calculate the following probabilities: P(X=0) = 0.4, P(X=1) = 0.3, P(X=2) = 0.2. What is P(X>2)?

  • 0.1 (correct)

In an experiment involving the toss of a fair coin and the roll of a fair six-sided die, what constitutes the sample space?

  • The set of all possible combined outcomes of the coin toss and the die roll (12 pairs in total). (correct)

If the covariance between height and weight is positive, what does that generally indicate?

  • As height increases, weight tends to increase. (correct)

Based on the probability that the event will happen, what does 'weight' in weighted average refer to?

  • How likely each value is to occur. (correct)

What best describes the relationship between correlation and dimension reduction?

  • Dimensionality reduction may be needed if there are multicollinear predictive variables. (correct)

What is the difference in information that is communicated from variance to standard deviation?

  • Standard deviation and variance communicate the same information, but standard deviation is in the same units as the original data. (correct)

When is it useful to find the joint probability mass function (PMF)?

  • When we consider the relationship between the two variables. (correct)

Which of the following is a discrete random variable?

  • Number of heads in 5 coin tosses (correct)

Consider the following two datasets: Dataset A: 1, 2, 3, 4, 5 Dataset B: 5, 4, 3, 2, 1

What is the covariance between Dataset A and Dataset B?

  • -2.5, using the sample covariance (dividing by n − 1) (correct)

What is the benefit of using correlation in machine learning training when dealing with high-dimensional data?

  • Correlation allows us to reduce the number of features in the dataset. (correct)

Consider two random variables X and Y with joint PMF given by:

         Y = 0   Y = 1   Y = 2
X = 0    1/6     1/4     1/8
X = 1    1/8     1/6     1/6

What is P(X=0, Y≤1)?

  • 5/12 (correct)
  • 8/12
  • 9/24
  • 3/24

According to the joint PMF table above, what is P(Y=1 | X=0)?

  • 6/13 (correct)

According to the joint PMF table above, what is the relation between X and Y?

  • Dependent (correct)

How do we decide which features to drop based on the correlation coefficient threshold?

  • Drop features with a correlation coefficient higher than the threshold. (correct)

The duration of a heart operation is normally distributed with mean 170 minutes and standard deviation 14 minutes. What percentage of operations last between 142-198 minutes?

  • 95% (correct)

What happens to the variance of the dataset if each value is multiplied by 2?

  • It is multiplied by 4. (correct)

Why is correlation preferred over covariance for comparisons between variables across domains?

  • Correlation is not affected by the units of measurement. (correct)

A correlation near +1 means:

  • All of them (correct)

A random variable Y has the following PMF: P(Y = 1) = 0.2, P(Y = 2) = 0.5, P(Y = 3) = 0.3. What is E[Y]?

  • 2.1 (correct)

If Cov(X, Y) = 0, which statement is always true?

  • X and Y have no linear relationship. (correct)

Dataset A: [1, 2, 3], Dataset B: [3, 2, 1]. What is their correlation coefficient?

  • -1.0 (correct)

If all values in a dataset are multiplied by 3, what happens to the variance?

  • It is multiplied by 9. (correct)

A normal distribution has mean μ = 50 and standard deviation σ = 10. What is P(40 ≤ X ≤ 60)?

  • 68% (correct)

A coin is flipped 5 times. What is the probability of getting exactly 3 heads?

  • 10/32 (correct)

In a ML dataset, two features have a correlation coefficient of 0.9. What should you do?

  • Drop one of the features. (correct)

Flashcards

What is a Random Variable (RV)?

A variable whose value is determined by the outcome of a random experiment.

What is a Discrete Random Variable?

A random variable with a finite or countable set of possible values.

What is a Continuous Random Variable?

A random variable that takes on a continuum of possible values within an interval.

What is a probability distribution?

A function that describes the likelihood of obtaining the possible values that a random variable can assume

What is PMF? (probability mass function)

The probability that a random variable takes a value exactly equal to x.

What is CDF? (cumulative distribution function)

The probability that a random variable takes a value less than or equal to x.

When are Random variables independent?

Random variables X and Y are independent if for any sets A and B, P(X in A, Y in B) = P(X in A) * P(Y in B).

What is joint probability distribution?

The probability distribution for two or more random variables.

What is the expected value (mean)?

The average value of a random variable, calculated by weighting each possible value by its probability.

What is Variance?

A measure of the spread of the values of a random variable around its mean.

What is Standard deviation?

The square root of the variance; a measure of the spread of the values of a random variable in the same units as the variable itself.

What is Covariance?

A measure of how much two random variables change together.

What is Correlation?

A standardized measure of the strength and direction of a linear relationship between two variables.

Study Notes

  • Session 2 covers statistics, including random variables, mean, variance, standard deviation, covariance, and correlation.
  • It also covers probability, including independence of events and joint vs. marginal probability.

Random Variables

  • Probability is often linked to at least one event.
  • A random variable (r.v.) represents the outcome of a random experiment as a number.
  • The outcomes of such experiments are random.
  • Rolling a die or drawing a ball are toy examples of such experiments.
  • It is often useful to know the likelihood of a random variable taking on a specific value.
  • A "fair" die means each face has an equal chance of landing face up.
  • Random variables can be discrete, taking values in a finite or countable set, e.g. cards or dice.
  • Random variables can be continuous, taking values in an interval, e.g. the lifetime of a car or an amount of water.
  • The focus here is on discrete random variables.
  • A random variable can be any suitable function of the outcomes, including sum, difference, product, max, min, etc.
  • The event space of a random variable differs from the original outcome space.
  • X(a,b) = a+b and Y(a,b) = max(a,b) are examples of random variables defined on two dice rolls.
  • Sx denotes the set of values that X can take, and Sy the set of values that Y can take.
  • Solving a problem requires identifying the random variable and the probabilities P(r.v. = value).

Example

  • The space S covers every possible outcome of dice rolls
  • 36 outcomes are possible for 2 dice rolls
  • For R.v. X(a,b) = a+b, Sx covers from 2 to 12
  • For R.v. Y(a,b) = max(a,b) Sy covers from 1 to 6
  • P(X=2) = P({(1,1)}) = 1/36
  • P(X=3) = P({(1,2),(2,1)}) = 2/36
  • P(X=4) = P({(3,1),(1,3),(2,2)}) = 3/36
  • P(Y=1) = P({(1,1)}) = 1/36
  • P(Y=2) = P({(2,1),(1,2),(2,2)}) = 3/36
  • P(Y=3) = P({(1,3),(3,1),(3,2),(2,3),(3,3)}) = 5/36
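
The probabilities listed above can be reproduced by enumerating the 36 outcomes. The following is a minimal Python sketch, assuming two fair dice; the helper name `pmf` is illustrative and not part of the notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space S: all 36 ordered outcomes (a, b) of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def pmf(rv):
    """Return {value: probability} for a random variable rv(a, b) on two fair dice."""
    counts = Counter(rv(a, b) for a, b in outcomes)
    return {value: Fraction(n, len(outcomes)) for value, n in sorted(counts.items())}

pmf_sum = pmf(lambda a, b: a + b)      # X(a, b) = a + b
pmf_max = pmf(lambda a, b: max(a, b))  # Y(a, b) = max(a, b)

print(pmf_sum[4])  # 1/12, i.e. 3/36
print(pmf_max[3])  # 5/36
```

The keys of `pmf_sum` run from 2 to 12 and those of `pmf_max` from 1 to 6, matching Sx and Sy.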

Probability Distribution

  • The likelihood of obtaining each possible value is captured by a probability distribution.
  • A probability distribution pairs each possible outcome with its statistical likelihood.
  • It shows the likelihood of each value of a discrete random variable occurring in an experiment.
  • The probabilities of a discrete random variable's values make up its probability distribution.
  • If the probability distribution is known, the probability of each outcome can be found in advance.

PMF vs CDF

  • Probability mass function (PMF): the probability that a random variable takes a value exactly equal to x: $F(X = x_i) = P\{X = x_i\}$
  • Cumulative distribution function (CDF): the probability that a random variable takes a value less than or equal to x: $F(X \le x_i) = P\{X \le x_i\}$
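
As a small illustration of how the two relate, a CDF can be built as the running sum of a PMF. This sketch assumes the random variable Y = max of two fair dice, whose PMF is (2k − 1)/36 for k = 1..6; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

# PMF of Y = max of two fair dice: P(Y = k) = (2k - 1)/36 for k = 1..6.
values = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(2 * k - 1, 36) for k in values]

# CDF: running sum of the PMF, so cdf[y] = P(Y <= y).
cdf = dict(zip(values, accumulate(pmf)))

print(cdf[3])  # 1/4, i.e. (1 + 3 + 5)/36
```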

Independent Random Variables

  • Random variables X and Y are independent if, for any sets of real numbers A and B, P{X ∈ A, Y ∈ B} = P{X ∈ A}.P{Y ∈ B}. For events X and Y this reads P(X ∩ Y) = P(X).P(Y).
  • For mutually independent events X, Y and Z: P(X ∩ Y ∩ Z) = P(X).P(Y).P(Z)
  • Independence and mutual exclusivity are different concepts.
  • Mutually exclusive events cannot happen simultaneously.
  • Independent events can happen at the same time, but the occurrence of one says nothing about the other.

Exam Question

  • If A and B are independent events, their complements are independent as well: P(A^c ∩ B^c) = P(A^c).P(B^c)
  • Let A = "Ahmed passes the exam" and B = "Mahmoud passes the exam", with A and B independent:
  • P(A) = 3/4
  • P(B) = 2/3
  • Probability of both passing = P(A ∩ B) = P(A).P(B) = (3/4) * (2/3) = 1/2
  • Probability of at least one passing = P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 3/4 + 2/3 − 1/2 = 11/12
  • Probability of neither passing = P(A^c ∩ B^c) = P(A^c).P(B^c) = (1 − P(A)).(1 − P(B)) = (1/4) * (1/3) = 1/12
  • Probability of only Ahmed passing = P(A ∩ B^c) = P(A).P(B^c) = P(A).(1 − P(B)) = (3/4) * (1/3) = 1/4
  • Probability of only Mahmoud passing = P(A^c ∩ B) = P(A^c).P(B) = (1 − P(A)).P(B) = (1/4) * (2/3) = 1/6
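
The arithmetic in this exam question can be checked with exact fractions. This is a minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction

p_a = Fraction(3, 4)  # P(A): Ahmed passes
p_b = Fraction(2, 3)  # P(B): Mahmoud passes

# Independence lets each joint probability be written as a product.
both = p_a * p_b
at_least_one = p_a + p_b - both
neither = (1 - p_a) * (1 - p_b)
only_a = p_a * (1 - p_b)
only_b = (1 - p_a) * p_b

# The four disjoint cases cover every possibility, so they sum to 1.
assert both + only_a + only_b + neither == 1
print(both, at_least_one, neither, only_a, only_b)  # 1/2 11/12 1/12 1/4 1/6
```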

Machine Example

  • In a factory, Machine A produces 100 items daily, of which 9% are defective; Machine B produces 150 items daily, of which 18 are defective. One item is chosen at random from each machine, and the machines are assumed to operate independently.
  • Probability of a defective item from A: P(DA) = 0.09.
  • Probability of a non-defective item from A: P(NA) = 0.91.
  • Probability of a defective item from B: P(DB) = 18/150 = 0.12.
  • Probability of a non-defective item from B: P(NB) = 0.88.
  • Probability that both items are non-defective: P(NA ∩ NB) = P(NA).P(NB) = 0.91 * 0.88 ≈ 0.80.
  • Probability that exactly one item is defective: E = (NA ∩ DB) ∪ (NB ∩ DA), so P(E) = P(NA ∩ DB) + P(NB ∩ DA) = P(NA).P(DB) + P(NB).P(DA) = 0.1092 + 0.0792 = 0.1884.
  • Takeaways: know what a random variable is and how discrete and continuous r.v.s differ.
  • A probability can be assigned to an event made up of one or more outcomes of a sample space.
  • The joint probability of values taken by two independent r.v.s is easy to compute: multiply the individual probabilities.
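
The machine example can be recomputed in a few lines. This sketch assumes the two machines are independent, as the product rule used in the example requires; the variable names are illustrative.

```python
p_def_a = 0.09      # P(D_A): defective item from machine A
p_def_b = 18 / 150  # P(D_B): defective item from machine B (0.12)
p_ok_a = 1 - p_def_a
p_ok_b = 1 - p_def_b

# Both non-defective: product of the two non-defective probabilities.
both_ok = p_ok_a * p_ok_b

# Exactly one defective: two disjoint cases, added together (0.1092 + 0.0792).
exactly_one_def = p_ok_a * p_def_b + p_ok_b * p_def_a

print(round(both_ok, 4))          # 0.8008
print(round(exactly_one_def, 4))  # 0.1884
```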

Joint Probability

  • Joint probability gives the likelihood of two or more random variables taking particular values together; it is needed in particular when the variables are not independent.
  • A joint probability distribution is a probability distribution for two or more random variables, with f(x,y) = P(X = x, Y = y).
  • A joint probability distribution is used to look for a relationship between two variables.

Example

  • Two dice are rolled at the same time. Find the probability that the number 3 occurs on both.
  • Here f(x,y) = P(X = x, Y = y), where X and Y label the values shown by the two dice.
  • A joint probability table lists the probability of each pair of values occurring at the same time.
  • The table can be used to look up the probability of any joint event.
  • For two fair dice, the probability of X = 3 and Y = 3 is 1/36.
  • In experiments on possible causes of cancer, the number of cigarettes smoked may be recorded as one variable and the patient's age as a second.
  • The joint probability mass function (PMF) gives the probability of each pair of values of X and Y: $F(X = x_i, Y = y_j) = P\{X = x_i, Y = y_j\}$
  • The joint cumulative distribution function gives the probability of all values up to a given pair: $F(x_i, y_j) = P\{X \le x_i, Y \le y_j\}$
  • The individual PMFs of X and Y can be calculated from the joint PMF.
  • $P\{X = x_i\} = \sum_j P\{X = x_i, Y = y_j\}$ (marginal probability)
  • The marginal PMFs can be computed from the joint PMF, but the reverse is not true in general.
  • The marginal probabilities of one variable sum to 1.

Example

  • Let X and Y be two random variables.
  • To test whether X and Y are independent, check that P(X=xi, Y=yj) = P(X=xi).P(Y=yj) for every pair of values. Example: P(X=2,Y=2) = 1/6 is not equal to P(X=2).P(Y=2) = 3/16, so X and Y are not independent.
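
The marginal and independence checks can be sketched in Python. The joint PMF used here is the one from the quiz question earlier on this page (X in {0, 1}, Y in {0, 1, 2}), not the table behind the 1/6 versus 3/16 example; all names are illustrative.

```python
from fractions import Fraction as F

# Joint PMF f(x, y) = P(X = x, Y = y); values taken from the quiz table on this page.
joint = {
    (0, 0): F(1, 6), (0, 1): F(1, 4), (0, 2): F(1, 8),
    (1, 0): F(1, 8), (1, 1): F(1, 6), (1, 2): F(1, 6),
}

xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})

# Marginal PMFs: sum the joint PMF over the other variable.
p_x = {x: sum(joint[x, y] for y in ys) for x in xs}
p_y = {y: sum(joint[x, y] for x in xs) for y in ys}

# X and Y are independent only if every cell factorizes into its marginals.
independent = all(joint[x, y] == p_x[x] * p_y[y] for x in xs for y in ys)

print(p_x[0], p_y[1], independent)  # 13/24 5/12 False
```

A single cell that fails to factorize is enough to conclude dependence, which is why the check uses `all`.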

Mean, Variance, Std

  • The distribution of a random variable X gives the probability P(X=xi) of each value.
  • The distribution by itself does not provide a single value that summarizes X, such as a typical value.
  • A plain, unweighted average of the possible values does not take P(X=xi) into account.
  • For jointly distributed variables, P(X=xi) is the marginal probability, obtained by summing over all values of Y.
  • The expected value / mean is denoted by E(X) or µ.
  • E(X) is calculated as a weighted average.
  • The weight of each value is how likely that value is to occur.
  • The weights are given by the probability distribution of the r.v.

Expectation

  • Let $F(X = x_i)$ be the PMF of a discrete random variable X.
  • $E[X] = \sum_i x_i \, P(X = x_i)$, where the sum runs over all possible values $x_i$

Tossing a Die

  • A fair die is rolled; if a prime number (2, 3 or 5) comes up, the player wins that number of pounds.
  • If a non-prime number (1, 4 or 6) comes up, the player loses that number of pounds.
  • Values are:
    • xi: 2, 3, 5, -1, -4, -6
    • P(X = xi): 1/6, 1/6, 1/6, 1/6, 1/6, 1/6
  • E[X] = 2(1/6) + 3(1/6) + 5(1/6) + (-1)(1/6) + (-4)(1/6) + (-6)(1/6) = -1/6
  • The game is unfavourable to the player, since E[X] is negative.
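
The expected value of this game can be computed directly from the definition. A minimal sketch, assuming the payoff rule described above; the function name `payoff` is illustrative.

```python
from fractions import Fraction

def payoff(face):
    """Win `face` pounds on a prime face (2, 3, 5); lose `face` pounds otherwise."""
    return face if face in (2, 3, 5) else -face

# E[X] = sum over faces of payoff * probability, with P = 1/6 per face.
expected = sum(payoff(face) * Fraction(1, 6) for face in range(1, 7))
print(expected)  # -1/6
```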

Distribution

  • E(c) = c, where c is a constant
    • $E(X^2) = \sum_i x_i^2 \, P(X = x_i)$ and similarly $E(X^n) = \sum_i x_i^n \, P(X = x_i)$; in general $E(W(X)) = \sum_i W(x_i) \, P(X = x_i)$ for a function W(X)
  • For a real number α:
    • E(αX) = α E(X)
    • E(X ± α) = E(X) ± α
  • Expectation is also linear across two random variables.
    • E(X ± Y) = E(X) ± E(Y)
    • E(αX + βY + γ) = α E(X) + β E(Y) + γ, where α, β, γ are constants
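
These properties can be checked numerically. The sketch below reuses the die-game values (2, 3, 5, -1, -4, -6, each with probability 1/6); the helper `expect` and the constants `alpha` and `beta` are illustrative choices, not part of the notes.

```python
from fractions import Fraction

values = [2, 3, 5, -1, -4, -6]
p = Fraction(1, 6)  # uniform PMF: each value has probability 1/6

def expect(g):
    """E[g(X)] = sum_i g(x_i) * P(X = x_i) for the uniform PMF above."""
    return sum(g(x) * p for x in values)

alpha, beta = 3, 10  # arbitrary constants chosen for the check
e_x = expect(lambda x: x)

# Linearity: E(alpha * X + beta) = alpha * E(X) + beta.
assert expect(lambda x: alpha * x + beta) == alpha * e_x + beta

print(expect(lambda x: x ** 2))  # 91/6
```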

Variability

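  • Variance measures how far the values of a random variable spread around its mean (the weighted average).
  • For a discrete random variable, the weights are the PMF values P(X = xi).
  • Variance: $Var(X) = E[(X - \mu)^2]$, or equivalently $Var(X) = E[X^2] - (E[X])^2$
  • The second form follows from expanding the square and applying the properties of E(X).
  • The standard deviation is denoted by sigma (σ).
  • Standard deviation: $\sigma = \sqrt{Var(X)}$, with $\sigma \ge 0$
  • A high standard deviation means the values are spread over a wide range.
  • A low standard deviation means the values lie close to the mean.
  • Importance: variance quantifies how much individual values can be expected to differ from the mean.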
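
Both forms of the variance can be compared on a concrete PMF. This sketch again uses the die-game values with probability 1/6 each, as an assumed example; the names are illustrative.

```python
import math
from fractions import Fraction

values = [2, 3, 5, -1, -4, -6]
p = Fraction(1, 6)

mean = sum(x * p for x in values)                       # E[X] = -1/6
var_def = sum((x - mean) ** 2 * p for x in values)      # E[(X - mu)^2]
var_short = sum(x * x * p for x in values) - mean ** 2  # E[X^2] - (E[X])^2

# The definition and the shortcut formula agree exactly.
assert var_def == var_short

std = math.sqrt(var_def)  # standard deviation, in the same units as X (pounds)
print(var_def)  # 545/36
```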

Formulae

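  • For a real number α:
    • $Var(\alpha X) = \alpha^2 \, Var(X)$
    • $Var(X \pm \alpha) = Var(X)$
    • $\sigma_{X \pm \alpha} = \sigma_X$
    • $\sigma_{\alpha X} = |\alpha| \, \sigma_X$
  • These properties can be applied on top of an existing variance calculation, without recomputing from scratch.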
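
The scaling and shifting rules can be verified on a small dataset using the population variance from Python's `statistics` module. The data and the constants `alpha` and `shift` are illustrative assumptions.

```python
import math
import statistics

data = [1, 2, 3, 4, 5]
alpha, shift = -2, 10  # illustrative constants, not from the notes

var = statistics.pvariance(data)
var_scaled = statistics.pvariance([alpha * x for x in data])
var_shifted = statistics.pvariance([x + shift for x in data])

std = statistics.pstdev(data)
std_scaled = statistics.pstdev([alpha * x for x in data])

assert var_scaled == alpha ** 2 * var               # Var(aX) = a^2 Var(X)
assert var_shifted == var                           # Var(X + a) = Var(X)
assert math.isclose(std_scaled, abs(alpha) * std)   # sigma_{aX} = |a| sigma_X
```

A negative `alpha` is used on purpose: the standard deviation scales by the absolute value of the constant.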

Covariance and Correlation

  • The two terms are often used interchangeably, but they are distinct measures.
  • Both are used to assess the dependency and linear relationship between two variables.
  • Covariance: how two variables vary together.
  • Correlation: how strongly a change in one variable is associated with a change in the other.
  • Covariance measures the degree to which two variables change together.
  • Covariance expresses the tendency of two variables to increase or decrease together. A negative value suggests that one tends to decrease as the other increases.
  • The magnitude of covariance depends on the units of the variables, which makes it hard to interpret.
  • Correlation measures the strength and direction of the linear relationship between the variables.
  • Correlation takes values between -1 and +1.
  • It rescales covariance so that relationships can be compared.
  • Covariance reveals direction: $Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$, or equivalently $Cov(X, Y) = E[XY] - E[X]E[Y]$
  • Changing the units or scale of the variables changes the covariance.
  • This limits comparisons of covariance between datasets measured on different scales.
  • Correlation removes the scale by dividing the covariance by the product of the standard deviations: $\rho_{X,Y} = Cov(X, Y) / (\sigma_X \sigma_Y)$
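
The difference between the two measures shows up clearly on data. The sketch below uses the two datasets from the quiz on this page and assumes the sample convention (dividing by n − 1) by default; the function names are illustrative.

```python
import math

def covariance(xs, ys, sample=True):
    """Covariance of paired data; divides by n - 1 (sample) or n (population)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    total = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return total / (n - 1 if sample else n)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

a = [1, 2, 3, 4, 5]
b = [5, 4, 3, 2, 1]
print(covariance(a, b))   # -2.5
print(correlation(a, b))  # -1.0

# Rescaling one variable (e.g. a change of units) changes the covariance
# but leaves the correlation unchanged.
a_scaled = [100 * x for x in a]
print(covariance(a_scaled, b))   # -250.0
print(correlation(a_scaled, b))  # -1.0
```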

Properties

  • Cov(X, Y) shows how the two variables vary relative to each other.
  • Positive: Y tends to be above its average when X is above its average.
  • Zero: if X and Y are independent, then Cov(X, Y) = 0. The converse does not hold, since X and Y can have a nonlinear relationship and still have zero covariance.
  • Cov(X, Y) = Cov(Y, X)
  • Cov(X, X) = Var(X)
  • Cov(aX, bY) = ab Cov(X, Y) for any constants a and b
  • Cov(X1 + X2, Y) = Cov(X1, Y) + Cov(X2, Y)
  • Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
  • A correlation close to -1 or +1 indicates a strong linear relationship: it shows how closely a change in one variable tracks a change in the other.
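
The identity for Var(X + Y) can be checked numerically. This sketch uses the population form Cov(X, Y) = E[XY] − E[X]E[Y] on small made-up data; the data and helper names are assumptions for illustration only.

```python
import math

def mean(v):
    return sum(v) / len(v)

def cov(xs, ys):
    """Population covariance E[XY] - E[X]E[Y] for paired data."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

x = [1, 2, 3, 4, 5]  # illustrative data, not from the notes
y = [2, 1, 4, 3, 6]

s = [a + b for a, b in zip(x, y)]
lhs = cov(s, s)                              # Var(X + Y), since Cov(S, S) = Var(S)
rhs = cov(x, x) + cov(y, y) + 2 * cov(x, y)  # Var(X) + Var(Y) + 2 Cov(X, Y)

assert math.isclose(lhs, rhs)
print(round(lhs, 2))  # 8.96
```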
