AIRC 1115 Day 4: Introduction to Aviation Statistics
British Columbia Institute of Technology
Summary
This document is an introduction to aviation statistics from the British Columbia Institute of Technology, focusing on inferential statistics. It covers Z-scores, t-tests, ANOVA, chi-square tests, correlation, regression, and confidence intervals, with example calculations and visualizations.
Full Transcript
AIRC 1115
Introduction to Aviation Statistics
Day 4

Learning Outcomes
Describe statistical inference and the impact of inference on the validity of data

Video
Inferential statistics tutorial: https://www.youtube.com/watch?v=6E6pB5JFLgM (13 mins)
What are the 5 statistical inference tests described in the video?
What does each one test?

Common Inferential Tests
1. T-Tests
2. ANOVA
3. Chi-Square
4. Correlation
5. Regression

Descriptive vs. Inferential Stats
Descriptive: Summarizes and organizes sample data
Inferential: Makes conclusions about the broader population

What are Inferential Statistics?
Test if patterns observed in a sample reflect the population
Help determine if patterns are real or due to chance

Key Roles of Inferential Stats
1. Make predictions about the population based on sample data
   Test if patterns observed in a sample reflect the population
2. Test hypotheses to determine significance
   Help determine if patterns are real or due to chance

But First… a descriptive stats test: Z Score

Normal distribution
Source: Rose et al. (2024: 329)

Skewness
Source: Rose et al. (2024: 330)

Kurtosis
Source: Rose et al. (2024: 330)

Z-Score
Z-scores, also known as standard scores, are a way to standardize values from different normal distributions, allowing for easy comparison.
To calculate the Z-score for a given data point (X) in a normal distribution:
Z = (X - μ) / σ
where:
Z is the Z-score
X is the data point
μ is the mean of the distribution
σ is the standard deviation of the distribution

A Z-score tells you how many standard deviations a particular data point is from the mean. A positive Z-score indicates that the data point is above the mean, while a negative Z-score indicates it is below the mean.

Z-Score - Example Question 1: Basic Z-Score Calculation
A student scored 85 on a math test. The class average score is 75, with a standard deviation of 5. What is the z-score of the student's score?
Answer: z = (85 - 75) / 5 = 2.0. The student's score is 2 standard deviations above the mean.

Z-Score - Example Question 2: Z-Score for Below-Average Performance
The average time to complete a task is 30 minutes, with a standard deviation of 4 minutes. Alex completed the task in 24 minutes. What is Alex's z-score? Interpret the z-score: Did Alex complete the task faster or slower than average?
Answer: z = (24 - 30) / 4 = -1.5. Alex's time is 1.5 standard deviations below the mean, so Alex completed the task faster than average.

Z-Score - Example Question 3: Comparing Two Z-Scores
Two students, Jane and Mark, took different exams. Jane scored 78 on an exam with a mean of 70 and a standard deviation of 4. Mark scored 88 on an exam with a mean of 80 and a standard deviation of 5. Calculate the z-score for both Jane and Mark. Who performed better relative to their peers?
Answer:
Jane: z = (78 - 70) / 4 = 2.0
Mark: z = (88 - 80) / 5 = 1.6
Jane performed better relative to her group, since her z-score (2.0) is higher than Mark's (1.6).

Back to… Common Inferential Tests
1. T-Tests
2. ANOVA
3. Chi-Square
4. Correlation
5. Regression

T-Test
Compares the means of two groups to test if the difference is significant.
Example: Comparing exam scores of two classes.
Types:
Independent t-test: Compares two groups
Paired t-test: Compares one group at two different times

ANOVA (Analysis of Variance)
Compares the means of three or more groups
Example: Testing if test scores vary between public, private, and homeschool groups
One-Way ANOVA: Compares one factor across groups

Chi-Square Test
Tests for relationships between categorical variables.
Example: Link between gender and car type preference.
Compares observed vs. expected frequencies

Correlation Analysis
Measures the strength and direction of a relationship between two numerical variables.
Example: Hours studied vs. exam scores.
Coefficient ranges: -1 to +1.

Scatterplots
Use scatterplots to examine the relationship between two ratio variables:
Is the association positive or negative?
Is the relationship linear?
How strong is the relationship?
Are there any outliers?

Scatterplot exploring association between ratio variables
Source: Rose et al. (2024: 339)

Example scatterplots
Source: Rose et al. (2024: 340)

Example: Correlation

Pearson's correlation coefficient (r)
Measure of the strength of linear association between two metric variables
Takes a value between -1 and 1:
-1 = perfect negative correlation
0 = no correlation
1 = perfect positive correlation

Scatterplots showing Pearson's r
Source: Rose et al. (2024: 357)

Correlation coefficient strength descriptors
Absolute value of correlation coefficient    Descriptor
0.10 to 0.29                                 Low
0.30 to 0.49                                 Medium
0.50 to 0.69                                 High
0.70 and above                               Very high
Source: Rose et al. (2024: 357)

Regression Analysis
Predicts the value of a dependent variable based on one or more independent variables
Example: Predicting exam scores based on study hours
Equation: y = a + b(x), where y is the dependent variable, x is the independent variable, a is the intercept, and b is the slope.
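
To make the correlation and regression slides concrete, here is a minimal Python sketch that computes Pearson's r and the least-squares line y = a + b(x) by hand. The hours and scores values are hypothetical, invented for illustration and not taken from the slides. The function name pearson_and_line is likewise an assumption.

```python
import math

# Hypothetical data (not from the slides): hours studied and exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]


def pearson_and_line(x, y):
    """Return (r, a, b): Pearson's r and the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    r = sxy / math.sqrt(sxx * syy)   # strength and direction of linear association
    b = sxy / sxx                    # slope
    a = mean_y - b * mean_x          # intercept
    return r, a, b


r, a, b = pearson_and_line(hours, scores)
print(f"r = {r:.2f}")                                    # r = 0.98
print(f"score = {a:.1f} + {b:.1f} * hours")              # score = 46.0 + 6.0 * hours
print(f"predicted score for 6 hours: {a + b * 6:.1f}")   # 82.0
```

With these made-up numbers, r of about 0.98 falls in the "Very high" band of the descriptor table, and the fitted line predicts roughly 6 extra points per additional hour studied.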
Scatter plot, regression line and regression equation
Source: Rose et al. (2024: 359)

R²: Coefficient of Determination
R-squared (R²) tells you how well the independent variable(s) in a statistical model explain the variation in the dependent variable. It ranges from 0 to 1, where 1 indicates a perfect fit of the model to the data.
Example: Study hours and test performance
R² = 0.67 -> 67% of the variation in test performance is explained by study hours

Estimation
Point estimates
Interval estimates

Estimating population parameters
Characteristics of the sample are sample statistics
Characteristics of the population are population parameters
Sample statistics provide a point estimate of the population parameter
Inferential statistics can be used to provide an interval estimate called a confidence interval
Confidence intervals of means can be used to compare mean scores of groups and sub-groups

Point and interval estimates
Source: Rose et al. (2024: 344)

The Role of Point Estimation
Point estimation serves as the foundation for inferential statistics. It involves using sample statistics to estimate population parameters. The sample mean (x̄) is the most common point estimate of the population mean (μ).
Example: Suppose you want to estimate the average time customers spend on your website. You take a random sample of 100 visitors and find that the sample mean time spent is 5 minutes. In this case, 5 minutes serves as the point estimate for the population mean.

Confidence Interval
A confidence interval is a range of values, calculated from sample data, that is expected to contain the population parameter a stated proportion of the time (the confidence level) if the sampling were repeated. Analysts most often use 95% or 99% confidence levels.
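
As a rough illustration of how an interval estimate for a mean can be computed, the Python sketch below uses the summary statistics for the 18–29 age group from the sub-group table later in this deck (n = 30, mean 5.10, standard deviation 0.898). The slides do not say which method produced the published bounds. The t-based formula here (mean ± t × sd / √n) is an assumption, and the critical value 2.045 is read from a standard t table for 29 degrees of freedom.

```python
import math

# Summary statistics for the 18-29 age group, as reported in the slides.
n = 30
mean = 5.10
sd = 0.898

# Assumption: two-sided 95% critical value of Student's t with n - 1 = 29
# degrees of freedom, taken from a standard t table (not stated in the slides).
T_CRIT_95_DF29 = 2.045


def mean_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) bounds of a t-based confidence interval for a mean."""
    standard_error = sd / math.sqrt(n)   # estimated spread of the sample mean
    margin = t_crit * standard_error     # half-width of the interval
    return mean - margin, mean + margin


lower, upper = mean_confidence_interval(mean, sd, n, T_CRIT_95_DF29)
print(f"95% CI: {lower:.2f} to {upper:.2f}")   # 95% CI: 4.76 to 5.44
```

Under these assumptions the result agrees with the 4.76 to 5.44 bounds shown for that age group, which suggests, but does not confirm, that the table was built the same way.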
Confidence Interval
Example: If a statistical model generates a point estimate of 10.00 with a 95% confidence interval of 9.50 to 10.50, one can be 95% confident that the true value falls within that range.

Confidence intervals for the mean
Confidence interval    Sample mean    Lower confidence limit    Upper confidence limit
95%                    4.25           3.80                      4.70
99%                    4.25           3.68                      4.82
Source: Rose et al. (2024: 344)

Confidence intervals for the mean by sub-group
                                                            95% confidence interval for mean
Age group      n      Mean satisfaction level    Standard deviation    Lower bound    Upper bound
18–29          30     5.10                       0.898                 4.76           5.44
30–39          35     4.72                       1.108                 4.34           5.10
40–49          38     4.19                       1.055                 3.84           4.53
50–59          41     3.91                       0.969                 3.60           4.21
60 and over    31     3.70                       0.961                 3.35           4.06
Total          175    4.30                       1.111                 4.13           4.46
Source: Rose et al. (2024: 345)

Chart of confidence intervals for the mean by sub-group
Source: Rose et al. (2024: 346)

Homework
1. Read Article "Quant Analysis 101: Inferential Statistics"
   Source: Grad Coach
   Pages: All
2. Read Article "What is Inferential Statistics?"
   Source: Appinio Blog
   Pages: All