
Biostatistics 2.pdf


Transcript


Biostatistics

8/26/24 Chapter 1
- biostatistics: the field of statistics related to science
- statistics: the collection, organization, and reporting of data and results
- sources of data: surveys/observational studies, experiments, previously obtained records
- random variable: a variable whose value has some chance or probability
- quantitative value: meaningful numeric data (e.g., total number of students in a class, dollar amounts)
- qualitative observation: a name, label, or category (e.g., color, wearing glasses?, zip code, Social Security number, sports jersey number)
- continuous: the entire spectrum of values is possible (can have many decimals), e.g., height, weight
- discrete: countable values only, with gaps on the number line, e.g., # of students
- population: the entire group to be studied
- sample: the group selected for the study

8/28/24 Chapter 1
Levels of measurement/measurement scales (listed from weakest to strongest)
* Goal when collecting data: stronger levels of measurement produce more powerful results.
- nominal: qualitative data, or classification (categories, labels), e.g., gender, wears glasses (yes or no)
- ordinal: data can be ordered, but the differences are unknown or unclear, e.g., pain scale 1-10 (Likert scale); top 100 colleges; grading scale A, B, C, D, F (difference between 1 or 19)
- interval: data can be ordered and differences are known, but no natural zero exists, e.g., time (years), temperature (°C, °F), shoe size
- ratio: data can be ordered, differences are known, and a natural zero exists, e.g., age, time, height

Data collection: survey methods
- voluntary response: subjects can choose to respond or not (e.g., email)
- convenience: the easiest data for the researcher to obtain
- random: everyone has an equal chance of selection
- simple random (SRS): each selection or pull is also random
- stratified: break into groups, then randomly sample within the groups
- cluster: break into groups, then sample everyone from within selected groups
- systematic: 1st person selected at random, then a pattern or process
- incentive-based study

8/30/24 Chapter 1 - Experimental Design Guidelines
· purpose and goals of study
· obtain people with the disease
· obtain "treatment"
· obtain medical equipment and personnel
· break subjects into groups
  ↳ placebo effect
· get baseline data
· apply treatment
  ↳ with personnel, including observations for adverse effects
· collect data
· data analysis (analyze results)
· report conclusions

Precision vs Accuracy
- precision: how close the points are to each other
- accuracy: how close the points are to a target value

Correlation vs Causation
- causation: A causes B or B causes A
- correlation: relationship between A and B
* Correlation does NOT imply causation: because two variables are correlated does not mean one is causing the other.

Chapter 2 8/30/24 Types of graphs, charts, diagrams - Descriptive Statistics
· common types of graphs/charts/tables/diagrams
- pie/circle graph
- line graph
- time series: a line graph where x = time (secs, years)
- bar graph: x is qualitative; bars do not touch
- histogram: x is quantitative; bars touch
- scatter plot
- box and whisker plot
- pictograph: pictures with a key
- stem and leaf plot [example sketch illegible]

9/4/24 Common graphs/charts, cont.
- dot plot (like a bar graph)
- frequency distribution [example table illegible]
- tally graph
- contingency table: a table of subjects; the # (counts) must be in the table. e.g.
a drug table [labels and counts illegible]
- spreadsheets
- Venn diagrams
- tree diagrams
- flowchart
- array: list of numbers
- ordered array
- frequency polygon: a line graph
  - uses midpoints for x
  - includes one grouping above and one below the given groupings; the shape should stay the same
  - finding a midpoint: take the two numbers of the grouping, add them, and divide by 2

e.g., grades: 68, 72, 75, 80, 82, 86, 90, 91, 94, 99

Grade      # of students
50-59      0
60-69      1
70-79      2
80-89      3
90-99      4
100-109    0

* Relative: fraction or percentage; must add up to 1 (100%)

Relative frequency distribution / relative bar graph (%)

Grade    Fraction      Percent
60-69    1/10 = .1     10%
70-79    2/10 = .2     20%
80-89    3/10 = .3     30%
90-99    4/10 = .4     40%

Cumulative: running total

Grade    Cumulative grade    Cumulative # of students    Cumulative relative percentage
60-69    60-69               1                           10%
70-79    60-79               3                           30%
80-89    60-89               6                           60%
90-99    60-99               10                          100%

Common Shapes (distributions) of Histograms
- [shape label illegible] ↳ height
- skewed distributions:
  - skewed left ↳ retirement ages
  - skewed right ↳ income

9/6/24
- cyclical: won't ever go to zero; constant up and down
- bimodal: 2 peaks; has to start and finish at zero, e.g., rate of change in height, rate of speed
- trimodal: 3 peaks (must start and end at zero)
- multimodal: 3 or more peaks (still must start and end at zero)
- uniform: all bars in theory have the same height, even if sample results vary, e.g., rolling a die, deck of cards
- increasing, decreasing
* Test: given x and y, figure out the shape.
  e.g., x = age of local libraries: could be bimodal, skewed right or left
  e.g., x = waiting time in line at McDonald's, y = # of people: could be trimodal, bell curve

Shapes and distributions of the normal bell curve
Kurtosis: how peaked the data is (height)
- mesokurtic: middle height, e.g., average light bulb; mesokurtic (excess) kurtosis will always be zero
- leptokurtic: thin/narrow, e.g.,
ages in first grade; leptokurtic (excess) kurtosis will always be positive (+)
- platykurtic: flat, e.g., speed between two stop lights/signs, average LED bulb; platykurtic (excess) kurtosis will always be negative

9/9/24 Measures of Central Tendency
- mean: add the numbers, then divide by the sample size
- median: center number of the ordered set; for skewed data, use the median
- mode: most frequently occurring number. If each number occurs only once, there is no mode.
e.g., 40, 40, 70, 80, 95, 95

Measures of spread
- range = max - min, e.g., 97 - 40 = 57

Standard Deviations + Variances
6 steps for standard deviation, 5 steps for variance:
1. Find the mean.
2. Take each number and subtract the mean (e.g., four numbers = four subtraction problems).
3. Square each value from (2).
4. Add each answer from (3).
5. Divide the sum from step 4 by n - 1 if the data is from a sample or unknown; divide by N if the data is a population.
6. Take the square root.

Find the variance and standard deviation: 84, 92, 76, 90, 88
mean = (84 + 92 + 76 + 90 + 88)/5 = 86
subtract the mean: 84 - 86, 92 - 86, 76 - 86, 90 - 86, 88 - 86
(-2)^2 + 6^2 + (-10)^2 + 4^2 + 2^2 = 160
variance = 160/4 = 40
standard deviation = sqrt(40) = 6.32

9/11/24 Coefficient of Variation (CV)
CV = standard deviation / mean, as a percent = 6.32/86 = 7.3%

5 Number Summary
* minimum
* Q1 (first quartile): 25th percentile (median of lower half)
* Q2: median
* Q3 (third quartile): 75th percentile (median of upper half)
* maximum

Find the 5 number summary: 71, 74, 79, 81, 85, 89, 91, 94, 94, 99
min = 71, Q1 = 79, Q2 = 87, Q3 = 94, max = 99
* Ignore the median when breaking odd lists into halves.

Boxplots: each section holds 25% of the data
Purpose:
- symmetric or not
- outliers
- where the quartiles start and stop

IQR: interquartile ("inner quartile") range, the middle range
IQR = Q3 - Q1

Outliers
- a low outlier is a point below Q1 - 1.5(IQR)
- a high outlier is a point above Q3 + 1.5(IQR)
Is there an outlier? No.
79 - 1.5(15) = 79 - 22.5 = 56.5
94 + 1.5(15) = 94 + 22.5 = 116.5

* Central tendency: mean, median, mode

9/11/24 Chapter 3 Probability
- probability: likelihood or chances of an event
Main ways to find probabilities:
- classical/theoretical, e.g.,
a deck of cards: P(success) = (# of successes)/n
* Outcomes must be equally likely.

9/13/24 3 primary methods for probability
- classical/theoretical: (# of successes)/n
- relative frequency: (# of successes) out of the trials actually observed
- subjective: best guess

Bayesian statistics: given a prior distribution, update the model as new information is known, to end with a posterior distribution.

Elementary Probability Properties
- P(event) must be between 0 and 100 percent (between 0 and 1)
- P(sum of all possible outcomes) = 1

Definitions
- independent: the outcome or probability of event A does NOT impact/influence/affect the outcome or probability of event B
- dependent: NOT independent
  ↳ outcomes then change
- mutually exclusive: events A and B cannot occur at the same time

Types of Sets
- overlapping: subjects may be in group A, group B, both, or neither
- disjoint: no overlap between groups A and B
- subset: everyone in A is also in B

Conditional Probability
P(one event occurring given another event outcome), written P(A|B), where "|" means "given"

Complement: not in the outcome set (opposite)
P(A) + P(not A) = 1

Multiplication Rule
If A and B are independent:
n(A and B) = n(A) · n(B)
P(A and B) = P(A) · P(B)

- You have 20 T-shirts, 10 pairs of pants, and 2 pairs of shoes. How many different combinations are possible?
  20 · 10 · 2 = 400
- A high school Master lock has 43 numbers, but the same number cannot be used twice in a row. How many 3-number codes are possible?
  43 · 42 · 42 = 75,852
- P(2 coin flips both tails) = .5 · .5 = .25

Addition Rule: used for overlapping events
e.g., I had 140 students one semester and I brought in candy for Halloween. I let students take a bag of M&M's, a bag of Skittles, or both.
* 79 total bags of Skittles were taken, 70 bags of M&M's were taken, and 32 people took both.
How many students didn't take any?
Venn diagram:
Skittles only = 79 - 32 = 47
both = 32
M&M's only = 70 - 32 = 38
47 + 32 + 38 = 117 took candy
140 - 117 = 23 didn't take candy

9/16/24 Examples: contingency table probabilities - Titanic (total = 2201)
P(children) = 109/2201 = 5%
P(survived) = 711/2201 = 32%
P(adult) = 2092/2201 = 95%
P(adult and died) = 1438/2201 = 65%
P(children and survived) = 57/2201 = 2.6%
P(children or survived) = (109 + 711 - 57)/2201 = 35%
P(adult or died) = (2092 + 1490 - 1438)/2201 = 97%
P(surviving | children) = 57/109 = 52%
P(not surviving | adult) = 1438/2092 = 69%
P(child | survived) = 57/711 = 8%
P(2 people both survived)

Screening Testing: COVID Rapid Testing (last non-commercially available test)
T = test +
F = test -
D = has COVID
not D = doesn't have COVID

              D (has COVID)   not D   Total
T (test +)    70              27      97
F (test -)    23              658     681
Total         93              685     778

Formulas
Sensitivity = P(T|D) = 70/93 = .75 = 75%
Specificity = P(F|not D) = 658/685 = 96%
Accuracy = (70 + 658)/778 = 94%

If you don't know the population prevalence (%) of a disease:
Predictive Value Positive (PVP) = P(D|T) = 70/97 = 72%
Predictive Value Negative (PVN) = P(not D|F) = 658/681 = 97%

If the population percent is known (the conditional probabilities and P(D) are given):
PVP = P(T|D)P(D) / [P(T|D)P(D) + P(T|not D)P(not D)]
PVN = P(F|not D)P(not D) / [P(F|not D)P(not D) + P(F|D)P(D)]

9/18/24 Screening Testing: Sens, Spec, PVP, PVN, RR, OR
If population prevalence is unknown:
PVP = P(D|T)
PVN = P(not D|F)
If known, here with P(D) = .001:
PVP = (70/93)(.001) / [(70/93)(.001) + (27/685)(.999)] ≈ .0188
PVN = (658/685)(.999) / [(658/685)(.999) + (23/93)(.001)] ≈ .9997

Relative Risk (RR)
RR = P(disease | risk factor present) / P(disease | risk factor absent)
RR = P(D|T)/P(D|F) = (70/97)/(23/681) = 21.4
You are at 21.4 times greater risk of having COVID if you test positive vs. testing negative.

Odds (against): odds are failure to success.
- odds are 2 to 1: probability = 1/3
- odds are 5 to 1: probability = 1/6
- odds are 1 : 1: probability = 1/2
[further odds examples illegible: 120, +100, +165]

Odds ratio = (test successes)/(test failures)
OR = (70)(658) / ((27)(23)) = 74

Exam 1 Topics
- graphs and diagrams
- variance and standard deviation
- shapes of graphs, e.g., skewed left or right, bimodal
- meso, platy, and why
- definitions (Chapter 1):
- nominal, ordinal
- qualitative, quantitative
- continuous: number line (height)
- discrete: gaps in the number line (number of ...)
- exam question on overlapping, disjoint, subset
- are things independent/dependent
- 5 number summary, box plots, outliers with formula
- ranges/mean/median/mode
- addition rule with Venn diagram
- multiplication rule
- sensitivity, specificity, PVP, PVN, OR, RR, accuracy; whether it is a good test

- nominal: name, label, group (names of people who took the exam)
- ordinal: order, but differences unknown (letter grades)
- interval: differences known; no zero starting point (pH)
- ratio: differences known; zero starting point (exam percentages, height, weight)

9/23/24 Exam Review
Suppose I recorded the calorie count for n = 6 students: 1420, 1510, 1920, 2500, 2700, 3000

5 number summary
min = 1420
Q1 = 1510
Q2 (median) = 2210
Q3 = 2700
max = 3000

Outliers
Above Q3 + 1.5(Q3 - Q1) = 2700 + 1.5(1190) = 2700 + 1785 = 4485
Below Q1 - 1.5(1190) = 1510 - 1785 = -275
No outliers.

For the same calorie data:
- relative histogram: in percents
- histogram: in whole numbers
- frequency polygon: groupings from 0-999 to 4,000-4,999 [sketch illegible]

First 151 days in Pittsburgh: gale forecast (T = gale forecast, D = gale occurred)

                 Gale: Yes   Gale: No   Totals
Forecast: Yes    15          11         26
Forecast: No     2           123        125
Totals           17          134        151

Sensitivity: P(T|D) = 15/17
Specificity: P(F|not D) = 123/134
PVP = P(D|T) = 15/26
PVN = P(not D|F) = 123/125
Accuracy = (15 + 123)/151 = 138/151
Relative risk: RR = (15/26)/(2/125) = 36, so 36 times the risk of it happening
Odds ratio: ad/bc = (15)(123)/((11)(2)) = 84

P(gale) = 17/151
P(gale and gale forecasted) = 15/151
P(gale or gale predicted) = (17 + 26 - 15)/151 = 28/151
P(2 days with no wind): use the multiplication rule
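
The exam review numbers above can be checked with a short script. This is only a sketch under stated assumptions: the function names are made up for illustration, quartiles follow the median-of-halves rule from the 5 number summary notes, and the four cell counts for the gale table (15, 11, 2, 123) are derived from the totals in the notes (26 forecasts, 17 gales, 15 days with both, 151 days in all).

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    lower = data[:half]    # for an odd-length list this leaves out the median
    upper = data[-half:]   # same number of values taken from the top
    return data[0], median(lower), median(data), median(upper), data[-1]


def outlier_fences(q1, q3):
    """Return (low fence, high fence) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def screening_measures(tp, fp, fn, tn):
    """Screening measures from a 2x2 table.

    tp: test + and has condition    fp: test + and no condition
    fn: test - and has condition    tn: test - and no condition
    """
    total = tp + fp + fn + tn
    pvp = tp / (tp + fp)              # P(D | T)
    risk_if_negative = fn / (fn + tn)  # P(D | F)
    return {
        "sensitivity": tp / (tp + fn),   # P(T | D)
        "specificity": tn / (tn + fp),   # P(F | not D)
        "pvp": pvp,
        "pvn": tn / (tn + fn),           # P(not D | F)
        "accuracy": (tp + tn) / total,
        "relative_risk": pvp / risk_if_negative,
        "odds_ratio": (tp * tn) / (fp * fn),
    }


# Calorie counts from the exam review (n = 6 students).
calories = [1420, 1510, 1920, 2500, 2700, 3000]
summary = five_number_summary(calories)
print(summary)                                   # (1420, 1510, 2210.0, 2700, 3000)
print(outlier_fences(summary[1], summary[3]))    # (-275.0, 4485.0)

# Gale forecast table; cell counts are derived from the totals (an assumption).
gale = screening_measures(tp=15, fp=11, fn=2, tn=123)
for name, value in gale.items():
    print(f"{name}: {value:.3f}")
```

The relative risk comes out at about 36 and the odds ratio at about 84, in line with the review. Library routines such as `statistics.quantiles` use interpolation rules, so they can return slightly different quartiles than the median-of-halves method used here.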

Tags

biostatistics data collection experimental design