BREES Statistics Revision PDF
University of Bristol
BREES
Karen Still
Summary
This BREES statistics revision handout, prepared by Karen Still at the University of Bristol, summarises the statistical concepts and methods covered in the statistics exam. It also gives information about the exam itself, including its format, instructions and practice questions.
Full Transcript
BREES: Statistics
Statistics exam revision
Karen Still
School of Cellular and Molecular Medicine

Information about the statistics exam

Statistics exam
1 hour 30 minutes (extra time if you have agreed AEAs)
30% of the unit mark
In person – check your exam timetable for the date, time & venue
You will be tested on the statistics content of the course only
You must bring a calculator – any type is allowed, as long as it can’t be used to communicate with someone else (i.e. not a phone/tablet)
Check the University rules about what you can take into an in-person exam: https://www.bristol.ac.uk/students/your-studies/exams-assessments/what-to-take/

Statistics exam content and SPSS analysis
25 MCQs:
  15 MCQs on statistical theory and application. Includes calculations.
  5 MCQs on a data set released in advance of the exam. Perform an analysis of this data using SPSS BEFORE THE EXAM, and bring a print-out of the analysis to the exam.
  5 MCQs refer to a pre-performed SPSS analysis of a second data set. The analysis will be released as part of the exam paper.
Each MCQ is worth 1 mark. Negative marking is NOT used.
A list of relevant statistical formulas is provided.

Data Set One analysis
Data Set One will be released on Thursday 26th November (see Assessment, submission and feedback).
Analyse the data using appropriate methods in SPSS and print out your analyses.
You may bring a maximum of 4 pages, double sided (i.e. 8 sides) into the exam.
You may annotate your printed sheets with anything you like, either by hand or typing.
Make sure your notes contain your name/student ID.
You will not be permitted to take your printed sheets out of the exam room.

Answer sheet for statistics exam
Answers must be transferred onto the Examination Answer Sheet during the 1.5 hours
The answer sheet contains more possible answers than the question paper - ignore all the answer slots after 25

Checklist: bring to the statistics exam…
❑ A pencil (HB or softer)
❑ An eraser
❑ A pencil sharpener
❑ A calculator
❑ A print-out of your SPSS analysis of Data Set One
❑ Your UCard

Any questions?

Statistics revision

1a) How is my data distributed?
Parameters
Population variables can be described using parameters:
  Central tendency (average): mean (μ), median and mode
  Distribution: standard deviation (σ), range and IQR
Example data (N = 5): 924 g, 876 g, 703 g, 776 g, 1032 g

1b) Normal distribution and the Z-score
The Z-score is the number of standard deviations a value is away from the mean:
  Z = (X − μ) / σ = (70 − 60) / 15 = 0.67
What proportion of values are less than 70? The Z-table says 74.9% of values lie below a point 0.67 standard deviations above the mean.
[Figure: normal distribution with μ = 60 and σ = 15; x-axis from 30 to 90, with 70 marked]
68% of values lie within ±1σ of the mean
95% of values lie within ±1.96σ of the mean
99.7% of values lie within ±3σ of the mean

Question: You measure the exam scores of 120 students in BREES. The average mark is 48%. The standard deviation is 12%. Marks are normally distributed. Estimate the percentage of students who got a 2:1 or first (>60%)
A. 4%
B. 12%
C. 16%
D. 32%
E. 36%

2a) Sampling distributions
Population: μ = 162.0 g, σ = 6.8 g
Mean of the sample means (μ_M) = population mean = 162.0 g
Standard deviation of the sample means (s.e.m. or σ_M) = population st. dev. / √N = 6.8 g / √4 = 3.4 g
[Figure: distribution of sample means, N = 4; μ_M = 162.0 g, σ_M = 3.4 g; x-axis from 153 g to 171 g]
The greater N, the smaller the s.e.m.
If the population variable is normally distributed, the sampling distribution will be too.
If N > 30 then the sampling distribution will be normal, whatever the distribution of the population variable.

2b) 95% confidence intervals – estimates of population mean
We estimate the population mean, so we need a t-distribution.
df = N − 1
t-score – how many s.e.m. either side of the mean contain 95% of values (i.e. sample means)
Our experiment is a single sample mean. We can estimate the population mean and the s.e.m.

t-distribution
df     0.95   0.99
2      4.30   9.93
3      3.18   5.84
4      2.78   4.60
5      2.57   4.03
8      2.31   3.36
10     2.23   3.17
20     2.09   2.85
50     2.01   2.68
100    1.98   2.63

Question: 5 volunteers donate blood samples for an experiment to measure concentration of a circulating hormone. The mean value is 82 pM. The standard deviation is 16 pM. Assuming that concentration is normally distributed in the population, what is the width of the 95% CI for the mean value in the population?
A. 7.0 pM
B. 14.0 pM
C. 28.0 pM
D. 39.8 pM
E. 19.8 pM

3) p-values
Test the null hypothesis that there is no difference in the means of the populations from which samples were randomly drawn.
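
Looking back at sections 1b and 2b, the Z-score and confidence-interval calculations can be checked with a short script. This is a minimal sketch in Python using only the standard library; the function names are illustrative and not part of the handout, and the critical t-values are copied from the table in 2b rather than computed.

```python
from math import sqrt
from statistics import NormalDist

# Critical t-values for a 95% CI, copied from the t-table in section 2b
T_CRIT_95 = {2: 4.30, 3: 3.18, 4: 2.78, 5: 2.57, 8: 2.31,
             10: 2.23, 20: 2.09, 50: 2.01, 100: 1.98}


def z_score(x, mean, sd):
    """Number of standard deviations that x lies away from the mean."""
    return (x - mean) / sd


def proportion_above(x, mean, sd):
    """Proportion of a normal population with values greater than x."""
    return 1.0 - NormalDist().cdf(z_score(x, mean, sd))


def ci95_width(sd, n):
    """Full width of the 95% CI for a population mean (t-distribution)."""
    sem = sd / sqrt(n)          # standard error of the mean
    t = T_CRIT_95[n - 1]        # df = N - 1
    return 2 * t * sem          # t s.e.m. either side of the sample mean


# 1b worked example: proportion of values below 70 when mean = 60, sd = 15
print(f"{NormalDist().cdf(0.67):.1%}")           # 74.9%

# 1b question: marks above 60% when mean = 48%, sd = 12%
print(f"{proportion_above(60, 48, 12):.1%}")     # 15.9%

# 2b question: N = 5, sd = 16 pM
print(f"{ci95_width(16, 5):.1f} pM")             # 39.8 pM
```

The exact proportion above Z = 1 is 15.9%, which matches the 16% obtained from the "68% within ±1σ" rule of thumb (32% outside, half of it in the upper tail).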
[Figure: t-distribution of the difference of sample means under the null hypothesis, for alpha = 0.05; central area 0.95, with 0.025 in each tail, centred on 0]
p is the probability that we would sample a difference at least as large as our experimental difference if the null hypothesis was true.
alpha is the threshold that p is compared against to decide whether or not to reject the null hypothesis.

Question: 5 volunteers donate blood samples for an experiment to measure concentration of a circulating hormone. The mean value is 82 pM. The standard deviation is 16 pM. Assuming that concentration is normally distributed in the population, what test should we use to see if our sample mean is significantly different to the population reference value of 78 pM?
A. Independent sample t-test
B. One-sample t-test
C. Paired t-test
D. Tukey’s test
E. N is not large enough

4a) Type I and Type II errors and power
α-value (e.g. 0.05)
Type I error: we say there is an effect when there isn’t (H0 is true, but we reject H0)
Type II error: we say there isn’t an effect when there is (H1 is true, but we do not reject H0)

Question: How can we reduce the chance of a type I error?
A. Lower alpha
B. Raise alpha
C. Set alpha after the data are in
D. Increase the power
E. Do a one-sided test
[Figure: the Type I error region under H0, shown for an α-value of 0.05 and then for an α-value of 0.01]

4b) Binary data – yes or no, left or right…
Does Drug X prolong life for 6 months?
Outcome (dependent) variable: alive or dead after 6 months

            alive   dead   total
Drug X      164     87     251
control     133     96     229
total       297     183    480

Risk ratio, e.g. chance of being alive after 6 months on treatment compared to control:
  (164/251) / (133/229) = 1.13
Is this significant?
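
The hormone question and the Drug X table above can both be worked through numerically. The following is a minimal sketch in Python, standard library only; the function names are illustrative assumptions, not part of the handout. The chi-squared statistic here is the plain Pearson version without a continuity correction, and the p-value uses the identity P(χ² > x) = erfc(√(x/2)), which holds only for 1 degree of freedom (a 2 × 2 table).

```python
from math import erfc, sqrt


def one_sample_t(sample_mean, sample_sd, n, reference):
    """t statistic for a sample mean against a fixed reference value."""
    sem = sample_sd / sqrt(n)
    return (sample_mean - reference) / sem


def risk_ratio(a, b, c, d):
    """Risk ratio for a 2x2 table [[a, b], [c, d]] (rows = exposure groups)."""
    return (a / (a + b)) / (c / (c + d))


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) and p-value, df = 1."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = erfc(sqrt(chi2 / 2))    # upper-tail probability, valid for df = 1 only
    return chi2, p


# Hormone example: mean 82 pM, sd 16 pM, N = 5, reference value 78 pM
t = one_sample_t(82, 16, 5, 78)
print(f"t = {t:.2f} with df = 4")

# Drug X table: rows are Drug X / control, columns are alive / dead
rr = risk_ratio(164, 87, 133, 96)
chi2, p = chi_squared_2x2(164, 87, 133, 96)
print(f"risk ratio = {rr:.2f}, chi-squared = {chi2:.2f}, p = {p:.3f}")
```

With these numbers t comes out at roughly 0.56, well below the critical value of 2.78 for df = 4 in the t-table, and the chi-squared p-value is roughly 0.10, so neither result would be significant at alpha = 0.05. Statistical software may also report a continuity-corrected chi-squared value for 2 × 2 tables, which will differ slightly.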
Use the χ² (chi-squared) test.
Exposure (independent) variable: treatment (Drug X or control)

5a) Scatter plots and correlation
Correlation/relationship/association between bivariate data
Linear if it can be described using a straight line
Positive if a stronger grip is associated with a stronger arm strength
Quantify with Pearson’s coefficient, r and r²
There is no assumption of causality

5b) Regression equation
outcome = a + (b × predictor)
Grip = 54.7 + (0.71 × arm strength)
(a) Y intercept: the outcome when predictor = 0
(b) gradient
For each 1 unit increase in arm strength the grip increases by 0.71 units
[Figure: scatter plot with fitted line; x-axis Predictor (X), y-axis Outcome (Y)]

Question: You test several concentrations of a drug (µg) in a randomised control trial and measure the blood pressure (mmHg) in 100 volunteers. You perform a linear regression analysis which yields the equation: y = 120.7 – 0.2x. Predict the blood pressure of someone treated with 20 µg of the drug. Assume this was in the tested range.
A. 100.7 mmHg
B. 111.2 mmHg
C. 116.7 mmHg
D. 120.5 mmHg
E. 124.7 mmHg

6a) ANOVA = ANalysis Of Variance
A comparison of more than two means.
Null hypothesis: mean A = mean B = mean C
Post-hoc to determine which one(s) is(are) not equal!
[Figure: results for three cat foods: Whiskas, Iams, James Wellbeloved]
In this case p = 0.004. We reject the null hypothesis.

Question: How many factors are there in this cat food experiment?
A. 0
B. 1
C. 2
D. 3
E. 4

6b) Parametric vs non-parametric

Thank you for listening…
Any questions?
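
As a check on the regression equation in section 5b, a prediction is just the intercept plus the gradient times the predictor. This is a minimal sketch in Python; the function name `predict` and the arm strength of 100 units are illustrative assumptions, not values from the handout.

```python
def predict(intercept, gradient, predictor):
    """outcome = a + (b x predictor) for a fitted straight line."""
    return intercept + gradient * predictor


# Blood pressure question: y = 120.7 - 0.2x, dose x = 20 µg
print(f"{predict(120.7, -0.2, 20):.1f} mmHg")    # 116.7 mmHg

# Grip example: Grip = 54.7 + (0.71 x arm strength), hypothetical arm strength of 100
print(f"{predict(54.7, 0.71, 100):.1f}")         # 125.7
```

A prediction like this is only trustworthy inside the range of predictor values that was actually tested, which is why the question states that 20 µg was in the tested range.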