Summary

This document is a review for CS 201 Exam 2, covering data analysis concepts and techniques, and provides examples using Excel commands. Topics covered include univariate, bivariate, and multivariate analysis.

Full Transcript

CS 201 Exam 2 Review

Data Analysis Review

When it comes to data analysis, start by asking:
‒ What is the question asking you to do?
‒ How many variables?
‒ What kind of variables?

1. How many members used vs. did not use the exercise circuit?
2. What is the average monthly revenue of AFC?
3. Describe the visits of members.
4. Do those that use the pool visit AFC more or less often?
5. Do members rate social aspects of the club or fitness as being more important?
6. What is the relationship between Likelihood to Recommend and visits?
7. To what extent do age, Likelihood to Recommend and education influence fees paid?

Data analysis hinges on the variable(s) in question

1. Univariate Analysis: One variable at a time, or “describing data”
‒ How many members used vs. did not use the exercise circuit?
‒ What is the average monthly revenue of AFC?
‒ Describe the visits of members.
2. Bivariate Analysis: Two variables at a time; the focus shifts to analyzing the relationships between the variables
‒ Do those that use the pool visit AFC more or less often?
‒ Do members rate social aspects or fitness as being more important?
‒ What is the relationship between likelihood to recommend and visits?
3. Multivariate Analysis: More than two variables at a time
‒ To what extent do age, likelihood to recommend and education influence fees paid?

Excel Command Examples

1. How many members used the exercise circuit?
‒ Use the =countif function on the variable in question
‒ In this case, you are counting the “1’s” and the “0’s” to see how many used and did not use the circuit
2. What is the average monthly revenue of AFC?
‒ Use the =average function and highlight the range of values for the variable in question, in this case “fees paid”
3. Describe the visits of members.
‒ Run a Descriptive Statistics Summary on the variable in question
‒ Click on Data > Data Analysis > Descriptive Statistics
‒ Click on Summary Statistics

Excel Command Examples (cont.)

4. Do those that use the pool visit AFC more or less often?
‒ Determine how many variables and what kind of variables are in question
‒ Here we have TWO variables, one categorical (pool usage) and one continuous (visits)
‒ We would look at the average number of visits to the center across the two groups, those who use the pool vs. those who do not
‒ Analyze whether the two groups truly differ on number of visits
‒ Ideal for an independent samples t-test
‒ Copy the two variables to a different section and sort them (pool and corresponding visits) so that group 1 is together (pool users and their visits) and group 2 is together (non-pool users and their visits)
‒ Run an independent samples t-test in Excel on the (newly sorted) visits variable, comparing visits of pool users vs. non-users
‒ Review the t-test statistic and p-value to determine the significance (see Chapter 18 Part B Individual Assignment)

5. Do members rate social aspects or fitness as being more important?
‒ Determine how many variables and what kind of variables are in question
‒ Here we have TWO variables, both continuous
‒ Ideal for a paired samples t-test
‒ Compare the two means (mean importance scores) when both measures are provided by the same group (here, all AFC members)
‒ Run a paired samples t-test on the social and fitness importance variables
‒ Review the t-test statistic and p-value to determine the significance (see Chapter 18 Part B Individual Assignment)

Excel Command Examples (cont.)

6. What is the relationship between Likelihood to Recommend and visits?
‒ Determine how many variables and what kind of variables are in question
‒ Here we have TWO variables, both continuous
‒ Correlation is most appropriate
‒ Use the =correl function, inserting the data arrays of the two variables inside the parentheses
‒ Hit “Enter”

7. To what extent do age, Likelihood to Recommend and education influence fees paid?
‒ Here we have multiple variables, indicating regression is most appropriate
‒ Identify the Y (outcome) and X (predictor/explanatory) variables
‒ Copy and paste the variables to a separate place so they are adjacent to one another
‒ Be sure to fill in missing values, such as by using the average, if needed
‒ Click on Data > Data Analysis > Regression
‒ Input the range of the Y variable and the range of the X variables
‒ Click on the Labels box, if needed
‒ Click on the Confidence Level box; the default is 95%
‒ Select “OK”

Some considerations with data analysis

Pivot tables: a powerful and efficient way to look at all kinds of relationships between variables, cross-tabs AND other relationships
‒ Cross tab: examines a relationship between two categorical variables
‒ Other relationships, such as between one categorical and one continuous variable (e.g., gender and visits)

Independent samples t-test: comparing means across two groups, using one categorical variable (e.g., gender) and one continuous variable (e.g., visits)
‒ The purpose of the categorical variable (e.g., gender) is simply to classify the data into two groups.
‒ We only analyze the continuous variable (e.g., visits)
‒ Only include data on visits, in this example, in the t-test calculation (e.g., range 1 = male visits and range 2 = female visits)

Correlation and simple regression (regression with one predictor, x, variable) produce equivalent results: Multiple R in the regression output is the same as the correlation “r” between two (continuous) variables, apart from sign, since Multiple R is never negative

Multiple Regression: involving two or more predictor variables (2 or more x's in the regression)
‒ Y variable = dependent variable = outcome = what we are trying to predict or understand, such as fees paid.
The dependent variable, y, is usually a quantitative metric such as sales or revenue
‒ X variables = independent variables = explanatory variables = predictors = the variables we think will predict or explain the outcome, such as age or importance factors, like fitness or social aspects

Primary research involves collecting data by “asking” or _______________ people
Answer: “watching”

Which of the following is not true about observation research?
a. Less versatile, behavior in the moment, can’t observe past or future intentions
b. Reveals the “why” behind behaviors
c. Not as efficient, is on their terms, may need to “wait” for behaviors to happen
d. Highly accurate and less subjective as observation is the “real deal”
Answer: b. Reveals the “why” behind behaviors

What are the two ways observation research can be carried out?
1. Direct – watching actual activity
2. Indirect – observing the outcome of activity

Which of the following are factors to consider when designing primary research?
a. Degree of structure
b. Degree of disguise
c. Setting
d. Method of administration
e. All of the above
Answer: e. All of the above
When it comes to the standardization of questions in primary research, with __________________ both the questions and responses are consistent, whereas with ________________ only the question is consistent.
Answer: Fixed alternative; Open-ended

Which of the following is not true about highly structured questions?
a. most useful when possible replies are known, are limited in number and are clear cut
b. work well for obtaining factual information and assessing opinions about issues
c. are best suited to understanding motivations, or the reasons “why”
d. can be used to collect ratings on attitudes, intentions, awareness and so on.
e. all of the above
Answer: c. are best suited to understanding motivations, or the reasons “why”

Name one way to lessen concerns about disguising research.
1. By letting respondents know the study is “blind” and why
2. Allowing respondents to “opt out”
3. Debriefing: providing respondents with details of the research after data have been collected

Name and briefly describe one method of conducting Primary Research.
1. Personal Interviews – can take place anywhere, can collect very detailed information
2. Telephone Interviewing – becoming increasingly difficult, due largely to mobile phones (no lists, can’t do random digit dialing, RDD).
3. Mail (Paper) Surveys – historically the leading form of primary research. The shift to paperless is one reason for the downward trend.
4. Online Surveys – administered online, such as through a website; the predominant method

Which of the following are true of Personal Interviews?
a. Involves direct, face-to-face conversation
b. Generally strong sampling control (including higher response rates)
c. Great flexibility, but higher levels of interviewer bias
d. Time- and cost-intensive
e. All of these are true
Answer: e. All of these are true

Name three ways online surveys can be further optimized.
1. As short as possible
2. Optimize for mobile phones
3. Simple questions
4. Use of visuals/graphics
5. 3D technology or AR as appropriate
6. Show respondents their progress
7. PRETEST
8. Highlight incentives
9. Gaming techniques
10. Keep it interactive

Which of the following is not a characteristic of online surveys?
a. A method of administration that relies on the web for completing the survey
b. Explosion in use over the past decade
c. Response rates are often very low
d. Good flexibility; visuals and complex material possible
e. Usually quick and inexpensive
Answer: c. Response rates are often very low

Name and briefly define two of the four types of scales used in Primary Research.
1. Nominal – Identification. Numbers don’t mean anything.
2. Ordinal – Order. Numbers are meaningful. Ranking.
3. Interval – Comparison of the size of the differences. Attitude measurement.
4. Ratio – Comparison of absolute magnitude, includes natural zero.

Match the scale with the correct statistic
____ All summary statistics    a. Ratio
____ Mean                      b. Ordinal
____ Median                    c. Nominal
____ Mode                      d. Interval
Answer: All summary statistics = Ratio; Mean = Interval; Median = Ordinal; Mode = Nominal

Interval and ratio-level data are considered ____________ variables and can accommodate numerous types of descriptive statistics
Answer: “continuous/numeric”

Which of the following applies to itemized-rating scales?
a. A scale on which individuals must indicate their ratings of an attribute or object
b. Is a form of interval scale
c. Examples include Likert Summated ratings scales and Semantic Differential scales
d. Measures “unobservable” concepts such as attitudes
e. All of the above
Answer: e. All of the above

Nominal and Ordinal scales are often used to group respondents or objects into categories and are considered ___________ variables or measures
Answer: categorical

Which of the following are considerations in designing scales?
a. Number of items in a scale
b. Individual versus composite measures
c. Odd or even number
d. Including a “don’t know” or “not applicable” response category
e. All of the above
Answer: e. All of the above

What is the difference between validity and reliability of measures?
Reliability: Results are consistent. How well the measure obtains consistent scores across time or situations.
Validity: Results satisfy the objectives. How well a test measures what it is supposed to measure.

Which of the following applies to telescoping error?
a. Remembering events as having occurred more recently than they did
b. Gets worse as time frame asked about is shorter
c. Respondents tend to bring in purchases from broader time frames
d. Ideal time frame is 2-4 weeks
e. All of the above relate to telescoping error
Answer: e. All of the above relate to telescoping error

Which of the following is not a consideration when developing the content of questions?
a. Is the question necessary?
b. Is the question too revealing?
c. Are several questions needed instead of one?
d. Do respondents have the necessary information?
e. Will respondents give the information?
Answer: b. Is the question too revealing?
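The scale-to-statistic matching earlier in this review (mode for nominal, median for ordinal, mean for interval and ratio) can be made concrete with a short sketch. This is only an illustration in Python rather than Excel; the lookup table, the function name summarize, and the sample values are all hypothetical and not part of the course material.

```python
from statistics import mean, median, mode

# Central-tendency statistic matched to each scale level, following the
# matching question in this review. Names and data are illustrative only.
CENTRAL_TENDENCY = {
    "nominal": mode,     # categories: only the most frequent value is meaningful
    "ordinal": median,   # ordered categories: the middle value is meaningful
    "interval": mean,    # equal-sized differences: averaging is meaningful
    "ratio": mean,       # natural zero: all summary statistics apply
}

def summarize(values, scale):
    """Return the central-tendency statistic suited to the given scale level."""
    return CENTRAL_TENDENCY[scale](values)

# Made-up example values.
gender_codes = [1, 2, 2, 1, 2]        # nominal: codes are labels only
education_levels = [1, 3, 2, 3, 4]    # ordinal: higher code = more education
importance_ratings = [4, 5, 3, 4]     # interval: itemized rating scale

print(summarize(gender_codes, "nominal"))
print(summarize(education_levels, "ordinal"))
print(summarize(importance_ratings, "interval"))
```

The table is a simplification: a higher-level scale can also use the statistics of the levels below it (for example, a median on ratio data), which is why the review lists ratio data as supporting all summary statistics.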
What is meant by a filter question and give an example? What is meant by a filter question and give an example? A question used to determine if a respondent has the knowledge or “qualifies” for the study Examples: “Do you do the grocery shopping for your family?” “Have you shopped at Trader Joes within the past six months?” “Did you vote in the last presidential election?” What is one of the most important things you can do before launching your questionnaire? What is one of the most important things you can do before launching your questionnaire? PRETEST PRETEST PRETEST Name two considerations with closed-ended questions? Name two considerations with closed-ended questions?  Include a “don’t know” if applies to a sizeable portion (>/+ 20%)  Responses must be exhaustive, may need to include an “other” option  Responses must be mutually-exclusive  Response order bias occurs when responses to a question are influenced by the order Which of the following is not a best practice regarding the wording of a question? a. Avoid simple words, err on the side of simplicity b. Avoid ambiguous words and questions c. Avoid leading questions d. Avoid assumed consequences e. Avoid double-barrel questions Which of the following is not a best practice regarding the wording of a question? a. Avoid simple words, err on the side of simplicity b. Avoid ambiguous words and questions c. Avoid leading questions d. Avoid assumed consequences e. Avoid double-barrel questions a. Avoid simple words, err on the side of simplicity Name one of three common nonprobability sampling techniques Name one of three common nonprobability sampling techniques 1. Convenience Sample Being included in a sample as a matter of convenience, right place, right time. Examples are mall intercepts, surveys on websites 2. Judgement Sample Sample elements are handpicked because they are expected to serve the research purpose. Examples are snowball and onsite panels. 3. 
Quota Sample Sample is constructed with certain characteristics that reflect the target population. Online panels are examples. What is the difference between a Census and a Sample? What is the difference between a Census and a Sample? Census Collecting data from all members of a population. Parameters are characteristics or measures of a population. Sample A subset of individuals or entities from a larger group. Statistics are characteristics or measures of a sample. Why is Probability Sampling the preferred approach? Why is Probability Sampling the preferred approach? Sampling error can be estimated and inferences can be made about the population. What is a meant by Snowball Sampling and when might it be used? What is a meant by Snowball Sampling and when might it be used? Judgement sample that is used to sample special, hard to find, populations in which an initial set of respondents are located and asked to others with same, special characteristics. What is the difference between precision and confidence? What is the difference between precision and confidence? Precision The degree of ERROR in an estimate of a population parameter. Confidence The degree to which one can feel confident that an estimate approximates the TRUE VALUE. As the precision gets smaller our confidence gets larger Which of the following factors do not influence sample size? a. Precision – the greater the precision desired, the larger the sample needed b. Size of population c. Confidence – the bigger the sample, the more confident that the true value falls within the range d. Variation of the characteristic – the greater the variability in the sample the larger the sample size needed Which of the following factors do not influence sample size? a. Precision – the greater the precision desired, the larger the sample needed b. Size of population c. Confidence – the bigger the sample, the more confident that the true value falls within the range d. 
Variation of the characteristic – the greater the variability in the sample the larger the sample size needed b. Size of population What is a sampling frame and give one example? What is a sampling frame and give one example? Sampling Frame is a LIST of population elements from which a sample will be drawn; could be geographic areas, institutions, individuals, or other units. Examples: Customer database Member directories Lists developed by data compilers What are the two approaches to drawing a sample and give one example of each? What are the two approaches to drawing a sample and give one example of each? 1. Probability sample: A sample in which each target population element has a known, nonzero chance of being included in the sample. ‒ Examples: Simple Random, Systematic, Stratified, and Cluster sampling (including area) 2. Nonprobability: A sample that relies on personal judgment in the element selection process. ‒ Examples: Convenience, Judgment, and Quota sampling What is meant by target population and give two examples? What is meant by target population and give two examples? A target population comprises all individuals or entities that meet the designated criteria to qualify for a research study. Examples: ‒ College students ‒ Primary grocery shoppers ‒ Household with kids 6-12 What could you do if greater precision is desired What could you do if greater precision is desired 1. Increase the sample size (if possible) 2. Decrease the degree of confidence, such as from 95% to 90% Which of the following is true about precision? a. Reflected by the width of a confidence interval, how precise we are with our estimate b. More precision means a more narrow estimate, or range c. Estimated by the sample error. d. Desired precision of an estimate is also sometimes called the allowable or acceptable error in the estimate e. All are true about precision Which of the following is true about precision? a. 
Reflected by the width of a confidence interval, how precise we are with our estimate b. More precision means a more narrow estimate, or range c. Estimated by the sample error. d. Desired precision of an estimate is also sometimes called the allowable or acceptable error in the estimate e. All are true about precision e. All are true about precision If we want to be very certain that we capture the true population parameter a ___________ interval is better, though we become ___________ precise in our estimate. If we want to be very certain that we capture the true population parameter a ___________ interval is better, though we become ___________ precise in our estimate. wider less Which of the following applies to random error? a. Error in measurement that is temporary b. Usually a result of personal or measurement situation c. Affects the measurement in irregular ways d. Is difficult to control e. All of the above Which of the following applies to random error? a. Error in measurement that is temporary b. Usually a result of personal or measurement situation c. Affects the measurement in irregular ways d. Is difficult to control e. All of the above e. All of the above What are the two key considerations in data analysis? What are the two key considerations in data analysis? 1. What type of analysis will be done?  Univariate analysis: One variable is analyzed at a time. Major purpose is to describe.  Bivariate or Multivariate analysis: Two or more variables analyzed together. Purpose is to understand relationships 2. What level of measurement will be used?  Four levels of measurement based on the data  Scales each lend themselves to different analyses An error in measurement that is also known as constant error because it affects the measurement in a constant way is called__________________________? An error in measurement that is also known as constant error because it affects the measurement in a constant way is called__________________________? 
Systematic Error Which of the following does not apply to sampling error? a. Can be estimated (assuming probability sample) b. Should be the focus as the single most important error c. Due to chance d. Usually less troublesome than other kinds of error e. Decreased by increasing sample size Which of the following does not apply to sampling error? a. Can be estimated (assuming probability sample) b. Should be the focus as the single most important error c. Due to chance d. Usually less troublesome than other kinds of error e. Decreased by increasing sample size b. Should be the focus as the single most important error Name three sources of error besides sampling error? Name three sources of error besides sampling error?  Noncoverage - failure to include qualified elements of the defined population in the sampling frame  Nonresponse - failing to obtain information from some of the sample elements of the population. Refusals and Not at homes  Response - individual provides an inaccurate response, consciously or subconsciously  Recording - Errors due to data editing, coding, or analysis errors.  Administrative - Mistakes made by people or machines. Why is sampling error considered less troublesome? Why is sampling error considered less troublesome? Can reduce it by increasing sample size Can account for it by calculating margin of error (assuming probability sample). The goal in sample surveys is to decrease ___________ error, not any one source of error. The goal in sample surveys is to decrease ___________ error, not any one source of error. Total or overall The degree of nonresponse is a good indicator of ________________ The degree of nonresponse is a good indicator of ________________ Overall quality The response rate itself is also a good way to check the quality of a survey Which of the following are the best strategies for handling nonresponse? a. 
Identify cases with a significant amount of item nonresponse and eliminate completely during the editing process b. Eliminate the case with the missing item(s) from further analyses c. Substitute values for the missing items d. Contact the respondent again e. All are helpful strategies Which of the following are the best strategies for handling nonresponse? a. Identify cases with a significant amount of item nonresponse and eliminate completely during the editing process b. Eliminate the case with the missing item(s) from further analyses c. Substitute values for the missing items d. Contact the respondent again e. All are helpful strategies e. All are helpful strategies What are THREE ways to improve response rates? What are THREE ways to improve response rates?  Respondent interest in topic  Survey length  Guarantee of confidentiality or anonymity  Interviewer characteristics and training  Personalization  Response incentives  Follow-up surveys Which of these strategies will not help lessen respondent nonresponse? a. Guarantee anonymity b. Place sensitive questions right up front c. Include statement showing the situation is NOT unusual, called counter biasing d. Ask in terms of “other people,” e.g., Do you think most students cheat? e. Ask for general rather than the specifics (e.g., income categories vs specific income) Which of these strategies will not help lessen respondent nonresponse? a. Guarantee anonymity b. Place sensitive questions right up front c. Include statement showing the situation is NOT unusual, called counter biasing d. Ask in terms of “other people,” e.g., Do you think most students cheat? e. Ask for general rather than the specifics (e.g., income categories vs specific income) b. Place sensitive questions right up front Name two tasks in the data preparation stage to ensure quality standards are met Name two tasks in the data preparation stage to ensure quality standards are met 1. Convert all responses to consistent units, e.g. 
months to years, days to weeks, cents to dollars 2. Assess degree of nonresponse, delete record if >/= 50% are missing 3. Check for consistency across responses 4. Look for evidence that the respondent wasn’t thinking about answers, e.g., straight lining” (all 5’s) 5. Verify that branching questions were followed correctly, e.g., “if yes, continue to 3” “Otherwise skip to 5” Preliminary steps in the data analysis phase involve which of the following steps: a. Editing or cleaning the data b. Coding and converting data into consistent symbols c. Running frequency analysis d. Identifying outliers e. All of the above Preliminary steps in the data analysis phase involve which of the following steps: a. Editing or cleaning the data b. Coding and converting data into consistent symbols c. Running frequency analysis d. Identifying outliers e. All of the above e. All of the above The three types of primary research are: The three types of primary research are: Most primary research is ______________in nature Most primary research is ______________in nature Descriptive What are the primary uses of Frequency Analysis? a. Univariate categorical analysis, such as Cross tabulation b. Identify blunders and cases with excessive item nonresponse c. Identify outliers d. Identify the median e. All are primary uses of Frequency Analysis What are the primary uses of Frequency Analysis? a. Univariate categorical analysis, such as Cross tabulation b. Identify blunders and cases with excessive item nonresponse c. Identify outliers d. Identify the median e. All are primary uses of Frequency Analysis e. All are primary uses of Frequency Analysis Which of the following is not true about descriptive statistics? a. Statistics that describe important characteristics/ properties of the data b. Relies on measures of central tendency and dispersion c. Should be the core of most analyses d. Most commonly used descriptive statistics are the mean and standard deviation e. 
Involves correlation which adds greater depth of understanding Which of the following is not true about descriptive statistics? a. Statistics that describe important characteristics/ properties of the data b. Relies on measures of central tendency and dispersion c. Should be the core of most analyses d. Most commonly used descriptive statistics are the mean and standard deviation e. Involves correlation which adds greater depth of understanding e. Involves correlation which adds greater depth of understanding Besides calculating the mean and standard deviation, which of the following are common approaches to Descriptive Statistics a. Frequency analysis b. Median split c. Cumulative % breakdowns d. Two-box technique e. All are common approaches to Descriptive Analysis Besides calculating the mean and standard deviation, which of the following are common approaches to Descriptive Statistics a. Frequency analysis b. Median split c. Cumulative % breakdowns d. Two-box technique e. All are common approaches to Descriptive Analysis e. All are common approaches to Descriptive Analysis What is the Two-Box Technique? What is the Two-Box Technique? Converting interval-level ratings into a categorical measure, such as for presentation purposes, by only showing percentage of respondents choosing one of the top two positions on a rating scale. What are the two forms of hypotheses testing? What are the two forms of hypotheses testing? Data analysis should always begin with a ________________________ Data analysis should always begin with a ________________________ Summary of descriptive measures, or univariate statistics When it comes to hypothesis testing, we are always looking to reject the ____________ hypothesis in favor of the ____________ hypothesis. When it comes to hypothesis testing, we are always looking to reject the ____________ hypothesis in favor of the ____________ hypothesis. 
Answer: NULL, ALTERNATIVE

Confidence intervals take ____________ into account but do not account for other common types of error (e.g., response error, nonresponse error)
Answer: sampling error

What is the primary goal of Descriptive Statistics?
Answer: The aim is to describe properties of the data based on measures of central tendency and dispersion.
• Central Tendency: the central point of the distribution
• Variability: how the scores are scattered around the central point
Together, central tendency and variability are the two primary values used to describe a distribution.
JD Schramm - The Secret to Successful Storytelling with Statistics

What is the p-value in statistics?
Answer:
• The probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true in the population
• A p-value that is less than the significance level indicates the results are statistically significant
• Is used with many statistics in both univariate and multivariate analyses to determine if results are meaningful

Which of the following is true about cross tabs?
a. Looking to see whether one variable has an influence on another
b. Can also be used to look at the joint distributions of two variables
c. Most used multivariate tool for categorical variables
d. Also known as “contingency tables” or “two-way analysis”
e. All of the above
Answer: e. All of the above

Which of the following is true about independent samples t-tests?
a. Determines whether two groups differ on some characteristic that is assessed on a continuous measure
b. Tests the difference in means between two variables: a continuous outcome variable and a categorical independent variable
c. The p-value is used with the test statistic to help decide the outcome
d. All are true about independent samples t-tests
Answer: d. All are true about independent samples t-tests

Which of the following characteristics is not true of correlation?
a. A statistic that indicates the degree of linear association between two continuous variables
b. The correlation coefficient can range from −1 to +1; the closer to −1 or +1, the stronger the association
c. Assesses the degree to which the variables change consistently (or not) across cases
d. Indicates causality
e. The p-value is used to decide significance
Answer: d. Indicates causality

Which of the following are true of regression?
a. Seeks to identify the “best fit” between the predictors and the outcome
b. Produces coefficients for each predictor variable that show the individual impact
c. Also produces a model “coefficient of multiple determination” (R²) that reveals the combined “fit,” or how much variation is explained
d. Uses the F statistic along with the p-value to determine model significance
e. All are true of regression
Answer: e. All are true of regression

Regression Analysis is a statistical technique used to understand the influence of a _________________ on a ___________ variable.
Answer: set of independent or predictor variables; dependent or outcome

How is a Chi-Square Goodness-of-Fit Test used and with what kind of variables?
Answer: A statistical test on categorical variables to determine whether some observed pattern of frequencies corresponds to an expected pattern.

What is the key difference between univariate and multivariate analysis?
Answer: Univariate analysis is based on one variable at a time (or describing data), while multivariate analysis involves more than two variables at a time (or understanding relationships).
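
The Chi-Square goodness-of-fit idea described above (observed frequencies compared against an expected pattern) can be sketched in a few lines of Python. This is only an illustration, not part of the course's Excel workflow: the counts are hypothetical, the 50/50 expected split is an assumption, and the function name chi_square_gof is made up for this example. The value 3.841 is the standard chi-square critical value for 1 degree of freedom at the 0.05 significance level.

```python
def chi_square_gof(observed, expected):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test.

    observed and expected are lists of category counts of equal length.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of categories")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    # Sum of (observed - expected)^2 / expected over all categories
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return statistic, df


# Hypothetical data (not from the AFC data set): 200 members,
# counted as "used the exercise circuit" vs. "did not use it".
observed = [120, 80]
# Assumed expected pattern: an even 50/50 split.
expected = [100, 100]

statistic, df = chi_square_gof(observed, expected)

# Critical value for df = 1 at the 0.05 significance level.
CRITICAL_05_DF1 = 3.841
reject_null = statistic > CRITICAL_05_DF1

print(f"chi-square = {statistic:.2f}, df = {df}, reject null: {reject_null}")
```

With these made-up counts the statistic exceeds the critical value, so the null hypothesis (that usage follows the expected 50/50 pattern) would be rejected. With real data, a statistics package or Excel would report the p-value directly, which is then compared to the significance level.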
A Chi-Square statistic is used to test relationships between __________________ variables, referred to as Tests of Independence in cross tabulation
Answer: categorical

What is a one-sample t-test and what kind of variables are involved?
Answer: Based on continuous variables, it compares the mean of the sample data to a known value, such as a population or historical mean.

What are the three MUST HAVES for effective communication?
1. KNOW YOUR AUDIENCE
2. USE A “GOLDILOCKS” LEVEL OF DETAIL – NOT TOO MUCH, NOT TOO LITTLE
3. ALWAYS END WITH A VERY CLEAR CALL TO ACTION
JD Schramm - The Secret to Successful Storytelling with Statistics

Besides knowing your audience, what is another fundamental rule in research presentations?
Answer: KNOW YOUR STUFF

What is meant by research reports serving as an archive?
Answer: The research report is often the ONLY documented history of the project that lives on.

Why is knowing your audience SO important?
• Comfort level with technical/analytical content?
• Involvement in the project?
• Interest in the project?
• History with the project?
• Relationship with the team?
