Mitmw Reviewer Aly PDF
Document Details

Uploaded by AwestruckSugilite3877
Summary
This document discusses measures of central tendency, including mean, median, and mode. It also explains linear regression and offers a formula to calculate the correlation coefficient and the line of best fit. It is mainly about statistics.
Full Transcript
MEASURE OF CENTRAL TENDENCY

Measure of central tendency
- one of the basic statistical concepts, used to find a single value representing the center of a set of data
- involves the method of finding the central value of a statistical series or set of quantitative data
- any single value that is used to identify the center of the data or a typical value

Three types of central tendency
• Mean: sum of all observed values divided by the number of observations
• Median: positional middle value when observations are ordered from smallest to largest or vice versa
• Mode: observed value that occurs most frequently
  Unimodal: one mode
  Bimodal: two modes
  Trimodal: three modes

Mean
- used with quantitative data
- most popular measure of central location
- affected by extreme values
- unique, one answer
- useful when comparing sets of data
- Formula: ∑x/n

Median
- used with quantitative data
- extreme values do not affect the median as strongly as they do the mean
- unique, one answer
- useful when comparing sets of data
- Formula: (n + 1)/2 position

Mode
- used with quantitative and qualitative data
- may not exist
- may not be unique
- extreme values do not affect the mode
- no values repeat: mode is every value and useless
- more than 1 mode: difficult to interpret and compare
- Formula: none

LINEAR REGRESSION AND CORRELATION

Data analysis: provides insight that improves decisions
Linear regression model: has an important role in many analyses and predictions
Correlation or simple linear regression analysis: determines if two numeric variables are significantly linearly related
Correlation analysis: provides information on the strength and direction of the linear relationship between two variables
Simple linear regression analysis: estimates parameters in a linear equation that can be used to predict values of one variable based on the other
Linear regression: statistical model that attempts to show the relationship between two variables with a linear equation
Regression analysis: graphing a line over a set of data points that most closely fits the overall shape of the data
Regression: shows the extent to which changes in a dependent variable, which is put on the y-axis, can be attributed to changes in an explanatory variable, which is placed on the x-axis

Line of best fit or least-squares regression line
- usually of most interest
- line that fits the data better than any other line that might be drawn
- for a set of bivariate data, the line that minimizes the sum of the squares of the vertical deviations from each data point to the line

Three main purposes of regression analysis
- To describe or model a set of data with one dependent variable and one or more independent variables
- To predict or estimate the values of the dependent variable based on given value(s) of the independent variable(s)
- To control or administer standards from a useable statistical relationship

Linear correlation coefficient
- determines the strength of a linear relationship between two variables
- statistic used by statisticians
- denoted by the variable r

If r is positive, the relationship between the variables has a positive correlation. In this case, if one variable increases, the other variable also tends to increase.
If r is negative, the linear relationship between the variables has a negative correlation. In this case, if one variable increases, the other variable tends to decrease.
The closer |r| is to 1, the stronger the linear relationship between the variables.

Strength of linear association
r value    Interpretation
1          perfect positive linear relationship
0          no relationship
-1         perfect negative linear relationship

Other strengths of association
r value    Interpretation
0.9        strong association
0.5        moderate association
0.25       weak association

Correlation: measure of association between two numerical variables
Pearson's sample correlation coefficient, r: measures the direction and the strength of the linear association between two numerical paired variables

Regression
- specific statistical methods for finding the line of best fit for one response (dependent) numerical variable based on one or more explanatory (independent) variables
- statistical methods to assess the goodness of fit of the model
- correlation coefficient

Simple linear regression
- statistical methods for finding the line of best fit for one response (dependent) numerical variable based on one explanatory (independent) variable

Least squares regression
- minimizes the sum of the squares of the errors of the data points
- minimizes the Mean Square Error

Steps to reaching a solution
• Draw a scatterplot of the data.
• Visually, consider the strength of the linear relationship.
• If the relationship appears relatively strong, find the correlation coefficient as a numerical verification.
• If the correlation is still relatively strong, then find the simple linear regression line.
• Interpret and visualize the result.

y = a + bx
- value of b: slope
- value of a: y-intercept
- r: correlation coefficient
- r^2: coefficient of determination

Strength of the association: r^2

Coefficient of determination
- r^2
- percent of the variation in the response variable that is explained or determined by the model and the explanatory variable

Real life application
- Multiple regression: cost estimating for future space flight vehicles
- Nonlinear application: predicting when solar maximum will occur
- Periodic: estimating seasonal sales for department stores
- predicting student grades based on time spent studying

ETHICAL ISSUES IN MANAGEMENT OF DATA

Data
- simply a piece of information
- facts or statistics

Data ethics
- National Center for Biotechnology Information: new branch of ethics that studies and evaluates moral problems related to data, algorithms, and corresponding practices in order to formulate and support morally good solutions
- branch of ethics that studies and evaluates moral concerns related to data
- includes but is not limited to any kind of information created
- includes algorithms, scripts, and research processes (references, results, samples, and raw data)

Sensitive data: personally identifiable information

Using data ethically
- What is data?
- How is data used?
- What is data misconduct?

How we work with data
- Generating
- Curating
- Recording
- Processing
- Disseminating
- Sharing
- Using

Misconduct
- more than just plagiarism
- fabrication and falsification
- Department of Health and Human Services: fabrication, falsification, or plagiarism in proposing, performing, or reviewing research results

Fabrication: making up results and recording or reporting them

Falsification
- manipulation of research materials, equipment, or process
- changing or omitting results so research is not accurately represented

Plagiarism
- appropriation of another's ideas, processes, results, or words without giving proper credit
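
Worked example (illustration only, not part of the original reviewer): the measures of central tendency and the line y = a + bx from the notes above can be checked with a short Python sketch. The function names and the sample data (hours studied versus a score) are hypothetical. The slope, intercept, and r use the standard textbook computational formulas, which are assumed here because the reviewer's own formula did not survive in the transcript.

```python
import math
import statistics


def central_tendency(values):
    """Return (mean, median, modes) for a list of numbers."""
    mean = sum(values) / len(values)       # sum of x divided by n
    median = statistics.median(values)     # middle value of the ordered data
    modes = statistics.multimode(values)   # every most-frequent value, in order of first appearance
    return mean, median, modes


def least_squares(xs, ys):
    """Fit y = a + b*x by least squares and return (a, b, r)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    sxy = n * sum_xy - sum_x * sum_y       # shared numerator of b and r
    sxx = n * sum_x2 - sum_x ** 2
    syy = n * sum_y2 - sum_y ** 2

    b = sxy / sxx                          # slope
    a = (sum_y - b * sum_x) / n            # y-intercept: mean(y) - b * mean(x)
    r = sxy / math.sqrt(sxx * syy)         # Pearson's correlation coefficient
    return a, b, r


# Hypothetical data: hours studied (x) and score (y)
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

mean, median, modes = central_tendency(scores)
a, b, r = least_squares(hours, scores)

print("mean:", mean, "median:", median, "modes:", modes)
print("line of best fit: y = %.2f + %.2fx" % (a, b))
print("r = %.3f, r^2 = %.3f" % (r, r * r))
```

For this sample the scores are bimodal (4 and 5), the fitted line is y = 2.2 + 0.6x, and r^2 = 0.6, so about 60 percent of the variation in the scores is explained by hours studied. When no value repeats, `statistics.multimode` returns every value, which matches the note that such a mode is "every value and useless".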