Probability and Statistics Lecture Notes PDF
Imperial College Business School
Fanyin Zheng
Summary
These lecture notes cover probability and statistics, with topics including lecture logistics, the importance of probability and statistics, sets and events, probability functions, counting principles, and examples. They are part of a course at Imperial College Business School.
Full Transcript
Probability and Statistics
Fanyin Zheng
Imperial College Business School
Lecture 1

Outline
▶ Course logistics
▶ Why probability and statistics?
▶ Overview of the course
▶ Sets and events
▶ Probability and counting
▶ References: LM 2.1-2.3, 2.6-2.7

Logistics
▶ Lectures
  ▶ Introduce new materials
  ▶ Examples and exercises in class
  ▶ Group learning - please ask questions
  ▶ Attendance is mandatory
▶ Insendi, one-stop shop for
  ▶ Syllabus and other course materials
  ▶ Slides will be posted ahead of each lecture
  ▶ In-class notes will be posted after each lecture
  ▶ Problem sets for each week
  ▶ Notes from tutorial sessions
  ▶ Course announcements
  ▶ Lecture recordings
  ▶ Ed discussion for Q&A
▶ Please check Insendi frequently

Logistics
▶ Assessment: 30% midterm exam (on Nov. 6th), 70% final exam (Jan. exam period, tba)
▶ Problem sets
  ▶ Practice problems similar to exam questions
  ▶ Work on them on your own or with your group before tutorial
  ▶ Attend tutorials: solutions to problem sets explained in detail
  ▶ Solutions will be posted after tutorial
  ▶ Key to doing well in this class!
▶ Tutorials
  ▶ Review key concepts and problem sets
  ▶ Attendance is mandatory
▶ Extra lectures (in addition to weekly lectures on Tuesday)
  ▶ Wed. Nov. 20th, 9am-11am
  ▶ Fri. Dec. 13th, 2pm-3pm

Logistics
▶ Office hours
  ▶ Myself: 3pm-4pm on Tuesday or by appointment
  ▶ Tutorial leader: TBA
▶ Textbooks
  ▶ Larsen, R., and M. Marx. Introduction to Mathematical Statistics and Its Applications. (LM in lecture slides)
  ▶ Stock, J. H., and M. W. Watson. Introduction to Econometrics.
    (SW in lecture slides)
▶ To do well in this class
  ▶ Regular work and repeated practice are extremely important
  ▶ Do not wait until November to start working
  ▶ Attend lectures, attend tutorials, review lectures, work on problem sets
  ▶ Practice writing down complete and clear mathematical reasoning step by step

Why probability and statistics?
▶ Probability is the science of uncertainty
▶ The world is uncertain
▶ Foundation for any empirical study
▶ Data and evidence based answer to a question
▶ Economics: should the inflation target be relaxed to avoid a full-blown recession? How would it affect income inequality?
▶ Finance: how will Amazon’s stock perform next quarter?
▶ Management and business analytics: should Airbnb promote new hosts with few reviews?
▶ Life: which elective course/internship/date should you choose?

Why probability and statistics?
▶ How should we answer these questions?
  ▶ There is uncertainty in the outcomes
  ▶ We would like to collect some data
  ▶ Data is only a sample from the ground truth (or the population)
  ▶ Based on what we learn from the data, we might form beliefs about the outcomes
  ▶ We make informed decisions based on the beliefs
▶ Need both probability and statistics in this process
  ▶ Design data collection or experimentation
  ▶ Form beliefs from data and describe beliefs
  ▶ Assess confidence of beliefs and risk of decisions

Sets and events
▶ An experiment is a procedure which can be repeated and has a set of well-defined possible outcomes
▶ Sample space S is the set of all possible outcomes
▶ s ∈ S is one outcome
▶ An event A is any collection of outcomes
  ▶ A can be any individual outcome, any subset of the sample space, the entire sample space, or the null (empty) set
▶ Event A is said to occur if the realized outcome of the experiment is in A

Sets and events
Figure: Sample space as pebble world, two events A and B

Sets and events
▶ Tossing a fair coin twice is an experiment
▶ Sample space is S = {HH, HT, TH, TT}
▶ Event A is at least one head, A = {HH, HT, TH}
▶ Another event B is exactly two heads, B = {HH}
▶ Probability of A, probability of B?

Sets and events
▶ A set is a collection of elements
▶ s ∈ A if s is an element of set A
▶ s ∉ A if s is not an element of set A
▶ ∅, the empty set or null set, is the set without any element
▶ A is a subset of S, or A ⊆ S, if all elements of A are also elements of S

Sets and events
▶ The intersection of A and B, A ∩ B, is the event whose outcomes belong to both A and B
▶ The union of A and B, A ∪ B, is the event whose outcomes belong to either A or B (or both)
▶ A and B are mutually exclusive if they have no outcomes in common, A ∩ B = ∅
▶ The complement of A, A^c, is the event consisting of all the outcomes in S which are not in A
▶ A_1, A_2, ..., A_n form a partition if ∪_{i=1}^{n} A_i = S and A_i ∩ A_j = ∅ for all i ≠ j
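The set operations defined above map directly onto Python's built-in set type. The following is a minimal sketch that reuses the two-coin-toss sample space; the extra event C and the helper names complement and is_partition are illustrative assumptions, not part of the lecture.

```python
# Sample space and events from the two-coin-toss example.
S = {"HH", "HT", "TH", "TT"}
A = {"HH", "HT", "TH"}  # at least one head
B = {"HH"}              # exactly two heads
C = {"TT"}              # hypothetical extra event: no heads at all


def complement(event, sample_space=S):
    """All outcomes of the sample space that are not in the event."""
    return sample_space - event


def is_partition(events, sample_space=S):
    """True if the events are pairwise disjoint and their union is the sample space."""
    events = list(events)
    union = set().union(*events)
    pairwise_disjoint = all(
        events[i].isdisjoint(events[j])
        for i in range(len(events))
        for j in range(i + 1, len(events))
    )
    return union == sample_space and pairwise_disjoint


print("A ∩ B =", A & B)                          # intersection
print("A ∪ C =", A | C)                          # union
print("A^c   =", complement(A))                  # complement
print("A, C mutually exclusive:", A.isdisjoint(C))
print("{A, C} is a partition:", is_partition([A, C]))
print("{A, B} is a partition:", is_partition([A, B]))
```

Here A and C are mutually exclusive and together cover S, so they form a partition, while A and B overlap in HH and do not.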
Sets and events: Venn diagrams
Figure: Venn diagrams (textbook Figure 2.2.4) for an intersection A ∩ B, a union A ∪ B, a complement A^c, and two mutually exclusive events (A ∩ B = ∅), each drawn inside the sample space S

Sets and events
Figure: A_1, ..., A_8 form a partition

Sets and events: properties of set operators
▶ Commutative: A ∩ B = B ∩ A, A ∪ B = B ∪ A
▶ Associative: (A ∩ B) ∩ C = A ∩ (B ∩ C), (A ∪ B) ∪ C = A ∪ (B ∪ C)
▶ Distributive: A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C), A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
▶ De Morgan’s theorem: (A ∩ B)^c = A^c ∪ B^c, (A ∪ B)^c = A^c ∩ B^c

Sets and events: example
▶ There are two events, A and B, in sample space S
▶ What are the expressions for the following events?
  ▶ Exactly one of the two occurs
  ▶ At most one of the two occurs
▶ Hint: Venn diagram

Probability function
▶ Probability function P is a function from events to real numbers
▶ Probability of event A is P(A)
▶ P satisfies the following axioms
  ▶ Axiom 1: P(A) ≥ 0 for any event A
  ▶ Axiom 2: P(S) = 1
  ▶ Axiom 3: If A and B are mutually exclusive, P(A ∪ B) = P(A) + P(B)
  ▶ If A_i and A_j are mutually exclusive for all i ≠ j, P(∪_{i≥1} A_i) = Σ_{i≥1} P(A_i)

Properties of probability
We can show the following properties are true using the three axioms of probability function P:
▶ P(∅) = 0
▶ P(A^c) = 1 − P(A)
▶ P(A) ≤ 1
▶ P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Counting: interpretation of probability
▶ What is the interpretation of probability? Or how do we compute probability?
▶ When we toss a fair coin, we say the probability of heads is 1/2
▶ If we toss it 10,000 times and count the number of heads, we get 5,013
▶ 5013/10000 = 0.5013 ≈ 0.5
▶ If an experiment is repeated many times, we can count the frequency of a given event’s occurrence to estimate its probability
▶ We will come back to this idea later in this course when we introduce estimators

Counting: interpretation of probability
▶ Let’s formalize how we get that the probability of heads is 1/2 in a fair coin toss
▶ If the sample space S contains a finite number of equally likely outcomes
  ▶ P(A) = (number of outcomes in A) / (total number of outcomes in S)

Counting: multiplication
▶ Counting the number of possible outcomes might be complicated
▶ We introduce (or review) a couple of tools next
▶ If an experiment has 2 parts, where the first part has m possible outcomes, and the second part has n possible outcomes (regardless of the outcome in the first part), then the experiment has m × n outcomes.

Counting: multiplication
Figure: Multiplication rule

Counting: multiplication example
▶ If a password is required to have 8 characters (letters or numbers) and is case-sensitive, how many possible outcomes are there?
▶ Hint: multiplication rule with 8 parts

Counting: multiplication example
▶ 10 runners compete in a race. Assume ties are not possible. All 10 complete the race.
  How many possible outcomes are there for the first-, second-, and third-place finishers?
▶ Hint: multiplication rule with 3 parts

Counting: multiplication special cases
▶ Sampling with replacement: take k draws from a group of n options, with replacement
  ▶ n × n × ··· × n = n^k
▶ Sampling without replacement: take k draws from a group of n options, without replacement (n ≥ k)
  ▶ n × (n − 1) × (n − 2) × ··· × (n − (k − 1)) = [n(n − 1)(n − 2) ··· 3 · 2 · 1] / [(n − k)(n − k − 1) ··· 3 · 2 · 1] = n!/(n − k)!
▶ If n = k, we have n!, the number of permutations of n objects
▶ Note 0! = 1
▶ So far we have seen ordered sets; let’s look at unordered sets next

Counting: combination
▶ Consider our runners example.
  Suppose we only care about which 3 runners won the top 3 places, but not which one of the three is first, second, or third
▶ We are simply choosing 3 runners out of 10, and do not care about their order
▶ An unordered collection of k elements is called a combination of size k
▶ The number of different combinations of size k taken from a group of n options without replacement can be derived as follows
  ▶ Recall the number of ordered collections is n!/(n − k)!
  ▶ Each combination of k distinct elements shows up k! times
  ▶ So the number of (unordered) combinations is (n choose k) = n!/((n − k)! k!)
  ▶ Note 0 ≤ k ≤ n, (n choose 0) = (n choose n) = 1

Counting and probability example
▶ An urn contains eight chips, numbered 1 through 8. Three chips are drawn without replacement. What is the probability that the largest chip in the sample is a 5?

Counting and probability example
▶ An urn contains n red chips numbered 1 to n, n white chips numbered 1 to n, and n blue chips numbered 1 to n. Two are drawn at random and without replacement. What is the probability that the two drawn are either the same color or the same number?

Counting and probability example
▶ What is the probability that at least two people in this room were born on the same day?
  ▶ Forget about the year to be nice to those obviously very old in the room
  ▶ Only consider 365 days a year to simplify the problem
▶ There are 39 US presidents who have died: three pairs and one triple died on the same day.
▶ Rare coincidence (hence conspiracy theories) or simple data mining?
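One way to approach the birthday question numerically is through the complement: compute the probability that all birthdays are distinct (sampling without replacement from 365 days) and subtract it from 1. The sketch below assumes 365 equally likely days and independent birthdays, as the simplification above suggests; the function name and the chosen group sizes are illustrative assumptions.

```python
def birthday_collision_probability(n_people: int, n_days: int = 365) -> float:
    """P(at least two of n_people share a day), assuming n_days equally likely days."""
    if n_people > n_days:
        return 1.0  # more people than days: a shared day is certain
    p_all_distinct = 1.0
    for i in range(n_people):
        # The (i+1)-th person must avoid the i days already taken.
        p_all_distinct *= (n_days - i) / n_days
    return 1.0 - p_all_distinct


for n in (10, 23, 39, 50):
    print(n, round(birthday_collision_probability(n), 3))
```

Under these assumptions the probability already exceeds 1/2 with 23 people, which suggests why shared dates in a group of 39 need not be a rare coincidence.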
Counting and probability example
Figure: X-axis: the number of people in the room, Y-axis: the probability of at least two people with the same birthday.

Summary and next time
▶ Sets, events, and Venn diagram
▶ Probability and its properties
▶ Counting and calculating probabilities
  ▶ Multiplication rule
  ▶ Ordered sets with and without replacement, permutation
  ▶ Unordered sets without replacement: combination
  ▶ Apply properties of probability in calculations
▶ Next time: conditional probability, independence, Bayes’ rule, random variables
▶ Problem set 1 is posted and will be reviewed in tutorial next week
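The counting rules summarized above can be checked numerically. The following is a minimal sketch using the 10-runner, top-3 setting from the lecture; it relies on math.perm and math.comb from the Python standard library (available from Python 3.8), and the variable names are illustrative assumptions.

```python
import math
from itertools import combinations, permutations

n, k = 10, 3  # 10 runners, top 3 places

ordered_with_replacement = n ** k                  # n^k
ordered_without_replacement = math.perm(n, k)      # n!/(n-k)!
unordered_without_replacement = math.comb(n, k)    # n!/((n-k)! k!)

print(ordered_with_replacement)       # 1000
print(ordered_without_replacement)    # 720
print(unordered_without_replacement)  # 120

# Brute-force check by listing every outcome explicitly.
runners = range(n)
assert len(list(permutations(runners, k))) == ordered_without_replacement
assert len(list(combinations(runners, k))) == unordered_without_replacement

# Each unordered combination shows up k! times among the ordered outcomes.
assert ordered_without_replacement == unordered_without_replacement * math.factorial(k)
```

The last assertion mirrors the argument in the combination slide: dividing the number of ordered collections by k! gives the number of combinations.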