Probability, Random Variables, and Distributions (BMF5324) PDF
National University of Singapore
Dr Yen Teik Lee
Summary
These lecture notes cover the topics of Probability, Random Variables and Distributions. The lecture notes include discussions of various probability concepts.
Full Transcript
PROBABILITY, RANDOM VARIABLES, AND DISTRIBUTIONS
Dr Yen Teik Lee
BMF5324

THE JOURNEY
[Figure: course roadmap. Sessions shown: Introduction, Team Pledge; Data Exploration and Wrangling, Statistics; Hypothesis Generation, Visualization; Probability; Inferential Statistics; Regression (1); Regression (2); Regression (3); Machine Learning & Time Series Analysis; Final Project; Course Summary.]

REVIEW

DATA LAB 2 TIPS
1. pypfopt
    from pypfopt import expected_returns, risk_models
    mu = expected_returns.mean_historical_return(stock_prices)
    Sigma = risk_models.sample_cov(stock_prices)
2. Portfolio optimization works if and only if the covariance matrix can be inverted.
3. Black-Litterman Model. How to get subjective views?

RANDOMNESS AND RISK

ORDER IN CHAOS: GALTON BOARD
NORMAL DISTRIBUTION
Image credit: MIT

WHY DO WE NEED STATISTICS?
Source: Forbes, The St. Louis Trust Company, Factset Data.

PROBABILITY

PENALTY KICKS
            Left     Middle   Right
P(Aim)      45%      17%      38%
P(Goal)     76.7%    81%      70%

ROADMAP
PROBABILITY: The Big Idea.
CONDITIONAL PROBABILITY: The Soul of Statistics.
LAW OF TOTAL PROBABILITY: The Mental Gymnastics.
BAYES’ RULE: What if you know P(B|A) but want to know P(A|B)?
INDEPENDENCE: A Crucial Concept.

PROBABILITY
Adapted from Professor Brandon Stewart's course.

MODELING DATA
“What is the relationship between stock price momentum (i.e., past returns) and future stock return?”

One plausible approach
Generate a deterministic model of stock return:
    return_it = f(past return_it)
But past return is not the only determinant!
We can attempt to account for everything:
    return_it = f(past return_it) + g(other_it)
But this is impossible.

A better approach
Treat the other factors as stochastic instead. So, we express it as
    return_it = f(past return_it) + ε_it
This allows us to have uncertainty over outcomes given our inputs.
This way of talking about stochastic outcomes is probability.
Factor zoo… 300 factors and expanding
http://review.chicagobooth.edu/finance/2018/article/300-secrets-high-stock-returns

VISUALLY
Image Source: Brandon Stewart, Shay O’brien

VISUALLY
[Figure: Probability runs from the Data Generating Process to the Observed Data; Inference runs from the Observed Data back to the Data Generating Process.]

THOUGHT EXPERIMENT
1. Start with probability.
2. Contemplate the world under hypothetical scenarios.
Is the observed relationship happening by chance or is it systematic?
What would the world look like under a certain assumption?

WHY PROBABILITY?
Helps us imagine the hypotheticals.
Describes uncertainty in how the data is generated.
Data Analytics: estimate the probability that something will happen.
Therefore, we need to know how probability gives rise to data.

SAMPLE SPACES
To define probability, we first define the set of possible outcomes.
The sample space is the set of all possible outcomes, and is expressed as S or Ω.
For example, if we flip a coin three times, there are 8 outcomes. So,
    Ω = {HHH, HHT, HTH, THH, HTT, TTH, THT, TTT}
Define illogical guesses to have probability = 0.

LADY TASTING TEA
Time: 1920s
Place: Cambridge, England
Setting: An afternoon tea party at the University
Claim: “I can tell whether the milk is poured first and the tea is added next, or whether the tea is poured first and the milk is added to the tea.”
Experiment: Perform a taste test. 8 cups of tea; 4 tea first and 4 milk first. Lady to randomly pick 4 out of the 8 cups and guess.

TEAMWORK
What is the sample space (hypothetical) for the Lady Tasting Tea?
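
One way to explore this question is to enumerate the sample space directly. The sketch below assumes the cups are labelled 0 to 7 and that cups 0 to 3 are the milk-first cups (both labelling choices are illustrative assumptions, not from the slides); each outcome is one way of picking 4 cups out of 8.

```python
from itertools import combinations
from collections import Counter

cups = range(8)               # assumed labels for the 8 cups
milk_first = {0, 1, 2, 3}     # assumed: which 4 cups were milk first

# Each outcome is one possible choice of 4 cups out of 8.
sample_space = list(combinations(cups, 4))
n_outcomes = len(sample_space)

# For each outcome, count how many picked cups really are milk first.
correct_counts = Counter(len(milk_first & set(pick)) for pick in sample_space)

print("number of outcomes:", n_outcomes)
for k in sorted(correct_counts):
    # columns: number correct, number of outcomes, share of the sample space
    print(k, correct_counts[k], correct_counts[k] / n_outcomes)
```

Under pure guessing every selection is equally likely, so the share printed for 4 correct cups is the chance of a perfect score by luck alone.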
LADY TASTING TEA
The sample space (hypothetical):
Count possibilities:

LADY TASTING TEA
Distributions related to the number of successes in a sequence of draws:
                              With replacement                  Without replacement
Given number of draws         Binomial distribution             Hypergeometric distribution
Given number of failures      Negative binomial distribution    Negative hypergeometric distribution

LADY TASTING TEA
The number of successes (correct guesses), X, follows the hypergeometric distribution:
    X ~ hypergeometric(N, K, n),
where N is the total number of cups of tea, K is the number of cups of one type (four, whether tea first or milk first), and n is the number of cups drawn.
Next session, we will use this distribution to make inference about whether the lady passes the taste test.

CONDITIONAL PROBABILITY
Let’s imagine that we sample an apple from a bag and the bag looks like this.
Source: Professor Brandon Stewart, Shay O’brien

CONDITIONAL PROBABILITY
Say P(A) > 0; then the probability of B conditional on A is
    P(B|A) = P(A,B) / P(A)
Therefore,
    P(A,B) = P(B|A) x P(A)

TEAMWORK
Solve a series of questions with your teammates.

What’s ?

Say we randomly draw two cards from a deck of 52 cards and define the events A = {Jack on 1st Draw}; B = {Jack on 2nd Draw}. What is P(A, B)?
    P(A) = 4/52
    P(B|A) = 3/51
    P(A,B) = P(A) x P(B|A) = 4/52 x 3/51

80% of your friends like Dark Chocolate and 20% like Dark Chocolate and Coffee. What percentage of those who like Dark Chocolate also like Coffee?
    P(Coffee | Dark Chocolate) = P(Dark Chocolate and Coffee) / P(Dark Chocolate) = 0.2/0.8 = 25%

LAW OF TOTAL PROBABILITY (LTP)
Let’s imagine that we sample an apple from a bag and the bag looks like this.
Source: Professor Brandon Stewart, Shay O’brien

LAW OF TOTAL PROBABILITY
Say we randomly draw two cards from a deck of 52 cards and define the events A = {Jack on 1st Draw}; B = {Jack on 2nd Draw}. What is P(A, B)? What’s P(B)?
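
A minimal sketch of the law of total probability for the two-card example, using exact fractions. The partition is A (Jack on the 1st draw) versus not A; the variable names are illustrative, and the value P(B|not A) = 4/51 is worked out here rather than taken from the slides.

```python
from fractions import Fraction

p_a = Fraction(4, 52)               # P(A): Jack on 1st draw
p_not_a = 1 - p_a                   # P(not A)
p_b_given_a = Fraction(3, 51)       # P(B|A): 3 Jacks left among 51 cards
p_b_given_not_a = Fraction(4, 51)   # P(B|not A): 4 Jacks left among 51 cards

# Multiplication rule: P(A,B) = P(A) x P(B|A)
p_a_and_b = p_a * p_b_given_a

# Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a

print(p_a_and_b)  # 1/221
print(p_b)        # 1/13
```

P(B) comes out equal to P(A) = 4/52 = 1/13: before seeing the first card, the second card is as likely to be a Jack as the first.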
VERIFYING LTP

Say you have designed a trading strategy around a trading signal, and you want to know the probability of trading (i.e., P(trade)).
    P(trade|signal) = 0.75
    P(trade|no signal) = 0.15
    P(signal) = 0.6 and P(no signal) = 0.4
Note: whether you observe the trading signal partitions the data. You either observe or do not observe the signal. Thus, you can apply the LTP.

CALCULATING P(TRADE)
    P(trade) = P(trade|signal) x P(signal) + P(trade|no signal) x P(no signal)
             = 0.75 x 0.6 + 0.15 x 0.4
             = 0.51

BAYES’ RULE
Let’s imagine that we sample an apple from a bag and the bag looks like this.
Source: Professor Brandon Stewart, Shay O’brien

BAYES’ RULE
What if you know P(B|A) but want to know P(A|B)? Think Bayes’ rule.
If P(B) > 0, Bayes’ rule says
    P(A|B) = P(B|A) x P(A) / P(B)
Proof? Multiplication rule (i.e., P(B|A) x P(A) = P(A,B)) and the definition of conditional probability.

BAYES’ RULE MECHANICS

U.S. Billionaires in 2014
76.5% (24.5%) of female (male) billionaires inherited their wealth.
82 (568) female (male) billionaires.
Is P(female | inherited billions) > P(male | inherited billions)?

CALCULATING P(Female | Inheritance)
    P(Female | Inheritance) = P(Inheritance | Female) x P(Female) / P(Inheritance)
                            = {0.765 x [82/(568 + 82)]} / {[0.765(82) + 0.245(568)] / (568 + 82)}
                            = 0.31

INDEPENDENCE

INDEPENDENCE
Intuition: If events A and B are independent, the occurrence of A carries no information about whether B occurred.
Formally,
    P(A,B) = P(A) x P(B)
Thus, P(A|B) = P(A) and P(B|A) = P(B).
Independence is a crucial concept in statistics.

ROADMAP
RANDOM VARIABLE, DISTRIBUTIONS & PROBABILITY: Overview. Discrete and Continuous Random Variable. Distribution. PMF/PDF to CDF.
CODING EXERCISE (TIME PERMITTING): Convergence of the probability of getting heads as the number of trials increases.

RANDOM VARIABLE, DISTRIBUTIONS & PROBABILITY

A Random Variable is a function that maps outcomes to real values.
Are you serious?
RANDOM VARIABLE
A GENTLE REVIEW
Outcomes                              Sample Space (Ω)       Event               Event Space (Σ)
Flipping a coin: head or tail         {head, tail}           get head            {H}
Rolling a die: 1, 2, 3, 4, 5, 6       {1, 2, 3, 4, 5, 6}     get even number     {2, 4, 6}
Rolling a roulette: 1-36, 0, 00       {1-36, 0, 00}          get 35              {35}

A Random Variable is a function that maps outcomes (inherently random or uncertain) to real values.
Flip twice: Ω = {HH, HT, TH, TT}
Define a RV that maps these outcomes to {0, 1, 1, 2} (e.g., the number of tails), so T takes values in {0, 1, 2}.

Question
How about stock returns?

RANDOM VARIABLE
Discrete Random Variable
The number of possible values is finite (or countably infinite).
Roulette game, the number of members in a household, etc.
Probability Mass Function

Continuous Random Variable
The number of possible values is uncountably infinite.
Asset prices, the distance between a planetary object and the Earth, etc.
Probability Density Function

DISTRIBUTION
Describes how a random variable is distributed with:
1. ‘Enough’ tries
2. Unlimited number of tries
[Figures: Binomial Distribution (PMF); Exponential Distribution (PDF), λ = mean number of events in an interval; Normal Distribution (PDF). Axes: Probability against Random Variable.]
Image credit: Wikipedia, Boost

Portfolio diversification can save lives
Recap: The Cancer Megafund

The Cancer Megafund Program Factsheet
Investment of $200M
10 years before payoff
5% probability of success
Annual profits of $2B a year for 10 years; 10% discount rate
This project either gives you 51% return or makes you lose $200M.
E(r) = 11.9%, SD = 423%

What if we invest in 150 programs?
Investment of $30 Billion!
Assume independent projects.
E(r) = 11.9%, SD = 423%/sqrt(150) = 34.6%
Source: Fernandez, Stein, and Lo. Commercializing biomedical research through securitization techniques. Nature Biotechnology. 30, 964 (2012).

Question
How do you calculate the probability of at least 2 hits (i.e., 99.59%)?
Which probability distribution?
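
One possible way to answer this question, sketched under the assumption that the number of successful programs follows a binomial distribution with n = 150 independent programs and success probability p = 0.05 (both numbers come from the factsheet; the helper name `binom_pmf` is illustrative):

```python
from math import comb

n = 150    # number of independent programs
p = 0.05   # probability that a single program succeeds

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# P(X >= 2) = 1 - P(X = 0) - P(X = 1)
p_at_least_2 = 1 - binom_pmf(0, n, p) - binom_pmf(1, n, p)
print(f"P(at least 2 hits) = {p_at_least_2:.2%}")
```

Working through the complement (zero or one hit) avoids summing the 149 terms from 2 to 150, and the result is in line with the 99.59% quoted in the question.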
Getting the probability of at least 2 hits

DISTRIBUTION
From PDF to CDF
[Figures: Binomial Distribution (CDF); Exponential Distribution (CDF); Normal Distribution (CDF). Horizontal axis: Random Variable.]
For the exponential distribution:
    CDF(x) = 1 − e^(−λx)
where λ is the mean number of events in an interval, i.e., the failure rate (e.g., failures/hr).
Image credit: Wikipedia, Boost

Question
Suppose λ = 0.1 represents the rate of defaults per year in a portfolio of loans. What is the probability that the next default will occur within 3 years?
Using the exponential CDF: P(T ≤ 3) = 1 − e^(−0.1 × 3) = 1 − e^(−0.3) ≈ 0.26.

TEAM CODING EXERCISE (TIME PERMITTING)

TEAMWORK
Objective: This exercise demonstrates how the empirical probability of getting heads in a series of coin flips converges to the theoretical probability of 0.5 (assuming a fair coin) as the number of trials increases. This is why having ‘enough tries’ or an unlimited number of tries matters.
Process:
1. Simulate flipping a fair coin multiple times (e.g., 50, 500, 10000 flips).
2. Track the cumulative number of heads after each flip.
3. Calculate the cumulative probability of getting heads as the number of flips increases.
Outcome: The graph shows how the cumulative probability of getting heads approaches 0.5 as the number of flips increases.

Questions?

COMING FUN: INFERENTIAL STATISTICS

THANK YOU
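
For reference, the three steps of the team coding exercise above could be sketched as follows. This is one possible version, not the course's own solution: it uses only the Python standard library, fixes a random seed (an assumption, for reproducibility), and prints the running proportion at 50, 500, and 10000 flips instead of drawing the graph.

```python
import random

def running_heads_proportion(n_flips: int, seed: int = 0) -> list[float]:
    """Simulate n_flips fair coin flips; return the proportion of heads after each flip."""
    rng = random.Random(seed)   # assumed fixed seed so the run is reproducible
    heads = 0
    proportions = []
    for i in range(1, n_flips + 1):
        heads += rng.random() < 0.5      # step 1 and 2: flip, then update the head count
        proportions.append(heads / i)    # step 3: cumulative proportion of heads
    return proportions

props = running_heads_proportion(10_000)
for n in (50, 500, 10_000):
    print(n, round(props[n - 1], 4))
```

Plotting `props` against the flip number (for example with matplotlib, if available) would give the graph described in the Outcome.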