Pre-Finals Reviewer PDF
STI
2024
Summary
This document contains a pre-finals reviewer for a course in data management and statistics for students. It covers basic concepts of data management, different statistical treatments, data collection and sampling methods, and chi-square analysis, with examples from various fields including education and healthcare.
PRE-FINALS PERIOD
October 24 – November 18, 2024

Basic Concepts of Data Management
Different Statistical Treatments
Data Collection and Sampling Methods
Chi-Square Analysis

INTRODUCTION TO DATA MANAGEMENT

Data Management, or Statistics, is the science of collecting, organizing, analyzing, presenting and interpreting data.

BRANCHES OF STATISTICS
❑ Descriptive Statistics
❑ Inferential Statistics

DESCRIPTIVE STATISTICS

A type of statistics that deals with the collection, presentation and description of data. It describes the main features of a data set and provides a summary of the information the data contains.
❑ Measures of Central Tendency (mean, median, mode)
❑ Measures of Variability (range, variance and standard deviation)
❑ Visual Representation (histograms, bar charts and pie charts)

EXAMPLES OF DESCRIPTIVE STATISTICS

Understanding Survey Demographics: Descriptive statistics summarize respondent demographics, such as age and income, to check that the sample is representative.

Marketing Research (Analyzing Skincare Product Use): Descriptive statistics can show how often Gen Z males use different products (such as sunscreen or cleansers), revealing popular items and daily routines. This is useful for product development.

Educational Research (Analyzing Student Performance): Descriptive statistics summarize test scores, attendance, and extracurricular data across districts, highlighting high- and low-performing areas. This helps in resource allocation and targeted interventions.

Healthcare Research (Evaluating Patient Satisfaction): Hospitals use descriptive statistics, such as average and median satisfaction scores, to assess patient feedback across departments and identify areas needing quality improvements.

INFERENTIAL STATISTICS

A type of statistics that interprets values from sample data to draw conclusions, predict the behavior of a given population, and make judgements or decisions.
❑ Hypothesis Testing – the analyst tests an assumption regarding a population parameter.
❑ Regression Analysis – a prediction method that expresses the relationship between a phenomenon and the variables (factors) that affect it.

EXAMPLES OF INFERENTIAL STATISTICS

Regression Analysis: In education research, a regression analysis might find that each additional study hour is associated with a 2.5-point increase in exam scores at the 95% confidence level, which helps quantify the impact of study time on performance.

ANOVA (Analysis of Variance): Researchers investigated the effects of three different diets on weight loss among participants over 12 weeks. Using ANOVA, they found significant differences in weight loss among the three diet groups, suggesting that at least one diet was more effective than the others.

Chi-Square Test: A study analyzed the relationship between smoking status (smoker, non-smoker) and the incidence of lung cancer in a population. A chi-square test indicated a significant association between smoking and lung cancer: smokers had a higher incidence of the disease than non-smokers.

WHAT IS DATA?

Data are facts and figures that are collected, analyzed, and summarized for presentation and interpretation. Data are either numeric or non-numeric and must be contextualized.

METHODS OF DATA COLLECTION

Observations – a way to gather data by watching people or events, or by noting physical characteristics, in their natural settings.
Experimentations – a method in which one or more variables are consciously manipulated and the outcome or effect of the manipulation on other variables is observed.
Simulations – a technique that reproduces actual events and processes under test conditions.
Interviews – asking one or more people for their opinions about a certain topic.
▪ Structured
▪ Semi-structured
▪ Unstructured
Panel Method/Focus Group – involves interviewing the same group of people on two or more occasions.
Census – a collection of information from all units in the population.
Survey – a collection of information from a sub-part of the population.

LEVELS OF MEASUREMENT

DATA
  CATEGORICAL (Qualitative)
    Nominal – categorical data without ranking
    Ordinal – categorical data with ranking
  NUMERICAL (Quantitative)
    Interval – data with order but no true zero point
    Ratio – data with order and a true zero point

How do you select the respondents for your study? Different Sampling Methods

PLANNING AND CONDUCTING A STUDY

SAMPLING METHOD

The process by which researchers select a representative subset of the total population to study, so that they can draw conclusions about the entire population. The representative subset of a population is called a sample.

Example: If you are researching the opinions of students in your university, you could survey a sample of 100 students.

TYPES OF SAMPLING METHOD

1. Non-probability Sampling – a sampling method where each member of the target population does not have an equal chance of being included. The selection is based on non-random criteria such as availability, geographical proximity, expertise or particular characteristics of the individuals.
2. Probability Sampling – a sampling method where each member of the target population has an equal chance of selection. The selection is based on the principle of randomization.

SAMPLING
  NON-PROBABILITY     PROBABILITY
  1. Convenience      1. Simple Random
  2. Quota            2. Systematic
  3. Snowball         3. Stratified
  4. Purposive        4. Cluster

NON-PROBABILITY SAMPLING

1. Convenience Sampling – a process of selecting a group of individuals who are readily available for study.
2. Purposive Sampling – the researcher uses their own judgement to select a sample that they believe will provide the data they need.
3. Quota Sampling – collects information from an assigned number, or quota, of individuals from each of several sample units that fulfill certain prescribed criteria.
4. Snowball Sampling – one or more members of a population are located and then used to lead the researcher to other members of the population.

Determine which type of non-probability sampling was used in each of the following scenarios:
1. A human resource director interviews the qualified applicants for a supervisory position.
2. A group of researchers included their close friends in the sampled population.
3. STI Novaliches has 100 employees consisting of teaching and non-teaching personnel. The 20 employees selected consist of 10 representatives from each group.
4. The Philippine Embassy interviewed victims of human trafficking and illegal recruiters in the Middle East.

PROBABILITY SAMPLING

1. Simple Random Sampling – a process in which every member of the population has an equal chance of being selected.
2. Systematic Sampling – a process of selecting every nth element in the population until the desired number of subjects or respondents is attained.
3. Stratified Sampling – a process of subdividing the population into subgroups called strata and drawing members at random from each subgroup or stratum.
4. Cluster Sampling – a process of selecting groups (clusters) from a population that is very large or spread out over a wide area.

Determine which type of probability sampling was used in each of the following scenarios:
1. To evaluate employee compensation, all businesses within 10 randomly chosen zip codes in the state were surveyed about salary benefits.
2. To determine the views and opinions of STIers, a group of researchers selects and surveys every 20th student in a class until 30 responses are collected.
3. The names of 25 employees are drawn out of a hat from a company of 250 employees.
4. 20 respondents from each dorm were surveyed about the quality of on-campus housing.

PLANNING AND CONDUCTING A SURVEY

Characteristics of a well-designed and well-conducted survey
a. A good survey must be representative of the population.
b. To support probabilistic conclusions, it must incorporate chance (random selection).
c. Survey questions must be neutral.
d. Errors and biases should be controlled.

PLANNING AND CONDUCTING AN EXPERIMENT

Characteristics of a well-designed and well-conducted experiment
A good statistical experiment includes:
a. Stating the purpose of the research
b. Designing the experiment
c. Examining the data set in secondary analysis
d. Documenting and presenting the results of the study

Control and Experimental Groups, Treatment, Random Assignment and Replication
a. Control groups and experimental units – used to compare effects and make inferences about associations or predictions.
b. Random assignment – the random allocation of treatments to units.
c. Replication – repeating the treatment on multiple units. All measurements, observations or data collected are subject to variation, as there are no completely deterministic processes.

Sources of Bias and Confounding, including Placebo Effect and Blinding
a. Confounding – a confounding variable correlates with both the dependent variable and the independent variables.
b. Placebo and Blinding – a placebo is an imitation pill identical in appearance to the actual treatment pill but without the active ingredients. Blinding keeps participants from knowing whether they receive the treatment or the placebo.
c. Blocking – the arranging of experimental units in groups (blocks) that are similar to one another.

What method is used to see if two categorical variables (independent and dependent) are related or different? Pearson's Chi-Square (χ²) Analysis

CHI-SQUARE ANALYSIS

WHAT IS A CHI-SQUARE TEST?

Pearson's chi-square (χ²) tests are nonparametric tests used for categorical variables such as nominal and ordinal data.

Chi-Square Test Types and Their Uses
1. Chi-Square Goodness-of-Fit Test – used to determine the difference between the observed and expected frequencies of one variable.
2. Chi-Square Test of Independence – used to determine the relationship between two categorical variables.

STEPS IN CONDUCTING CHI-SQUARE ANALYSIS

1. State the problem.
2. Construct an assumption and state the hypothesis.

   SYMBOL   DESCRIPTION              MEANING
   H0       Null Hypothesis          There is no significant relationship or effect between the variables.
   H1       Alternative Hypothesis   There is a significant relationship or effect between the variables.

3. Identify the level of significance (α) and compute the degrees of freedom (df).
4. Determine the critical value from the chi-square distribution table.
5. Compute the chi-square value (χ²) using the formula:

   $$\chi^2 = \sum \frac{(O - E)^2}{E}$$

   where:
   χ² = chi-square value
   O = observed frequency value
   E = expected frequency value

6. Decision making and conclusion.
   If χ² computed value ≥ χ² tabular value, then reject H0 and accept H1.
   If χ² computed value < χ² tabular value, then fail to reject H0.

TYPES OF CHI-SQUARE TEST

1. Chi-Square Goodness-of-Fit Test – used to test whether the frequency distribution of one categorical variable is significantly different from the expected frequencies. Often, but not always, the expectation is that the categories will have equal proportions.
   Examples:
   a. The sex ratio of students per classroom.
   b. The proportion of births in the Philippines on each day of the week.

Chi-Square Goodness-of-Fit Test

EXAMPLE 1
Researchers conducted a survey of 1600 coffee drinkers, asking how much coffee they drink, in order to confirm previous studies. Related studies have indicated that 72% of Filipinos drink coffee. At α = 0.05, is there enough evidence to conclude that the distribution in the survey and the distribution in previous studies are the same?

Response           % of Coffee Drinkers   Observed
2 cups per week    15%                    206
1 cup per week     13%                    193
1 cup per day      27%                    462
2+ cups per day    45%                    739

Solution
1. State the problem: Is there enough evidence to conclude that the distribution in the survey and the distribution in previous studies are the same?
2. Construct an assumption and state the hypothesis:
   H0: There is no significant difference between the frequency distribution of the survey and the result of previous studies.
   H1: There is a significant difference between the frequency distribution of the survey and the result of previous studies.
3. Level of significance: α = 0.05 (given in the problem)
   Degrees of freedom: df = number of responses - 1 = 4 - 1 = 3
4. Determine the critical value from the chi-square distribution table based on the level of significance α = 0.05 and degrees of freedom df = 3:
   χ² tabular value = 7.81
5. Compute the chi-square value (χ²). Each expected frequency is the ratio multiplied by the sample size, for example E = 0.15 × 1600 = 240.

   Response           Ratio   O     E     O - E   (O - E)²   (O - E)²/E
   2 cups per week    15%     206   240   -34     1156       4.8167
   1 cup per week     13%     193   208   -15     225        1.0817
   1 cup per day      27%     462   432   30      900        2.0833
   2+ cups per day    45%     739   720   19      361        0.5014

   Chi-square value (χ²) = 4.8167 + 1.0817 + 2.0833 + 0.5014 = 8.48
6. Decision making and conclusion: The computed chi-square value is greater than the tabulated chi-square value (χ² computed value > χ² tabular value), so reject H0.
   The computed chi-square value of 8.48 is greater than the tabulated value of 7.81 at the 0.05 level of significance with 3 degrees of freedom, so the null hypothesis is rejected. This means that there is a significant difference between the distribution found in the conducted survey and the distribution reported in previous studies.

TYPES OF CHI-SQUARE TEST

2. Chi-Square Test of Independence – used to test whether two categorical variables are significantly related to each other. If two variables are independent (unrelated), the probability of belonging to a certain group of one variable is not affected by the other variable.
   Examples:
   a. The sex and course preference of Senior High School students in STI Novaliches.
   b. Whether the proportion of students who are left-handed is the same for Multimedia Arts and Hospitality Management students.

Chi-Square Test of Independence

EXAMPLE 2
Ninety individuals, male and female, were given a test in psychomotor skills, and their scores were classified as high or low. Use the chi-square test of independence at the 0.05 level of significance. The data are shown below:

          Scores
Sex       High   Low
Male      18     28
Female    32     12

Solution
1. State the problem: Is there a significant relationship between sex and scores in psychomotor skills?
2. Construct an assumption and state the hypothesis:
   H0: There is no significant relationship between sex and scores in psychomotor skills among individuals.
   H1: There is a significant relationship between sex and scores in psychomotor skills among individuals.
3. Level of significance: α = 0.05 (given in the problem)
   Degrees of freedom: df = (number of rows - 1)(number of columns - 1) = (2 - 1)(2 - 1) = (1)(1) = 1
4. Determine the critical value from the chi-square distribution table based on the level of significance α = 0.05 and degrees of freedom df = 1:
   χ² tabular value = 3.84
5. Compute the chi-square value (χ²). Each expected frequency is (row total × column total) / grand total, for example 46 × 50 / 90 = 25.5556.

             High             Low
   Sex       O     E          O     E          TOTAL
   Male      18    25.5556    28    20.4444    46
   Female    32    24.4444    12    19.5556    44
   TOTAL     50               40               90

   $$\chi^2 = \frac{(18 - 25.5556)^2}{25.5556} + \frac{(32 - 24.4444)^2}{24.4444} + \frac{(28 - 20.4444)^2}{20.4444} + \frac{(12 - 19.5556)^2}{19.5556}$$
   $$\chi^2 = 2.2338 + 2.3354 + 2.7923 + 2.9192$$
   $$\chi^2 = 10.28$$
6. Decision making and conclusion: The computed chi-square value is greater than the tabulated chi-square value (χ² computed value > χ² tabular value), so reject H0.
   The computed chi-square value of 10.28 is greater than the tabulated value of 3.84 at the 0.05 level of significance with one degree of freedom. This confirms the research hypothesis: there is a significant relationship between sex and scores in psychomotor skills. In this sample, females were more likely than males to score high in psychomotor skills.

MATH IN THE MODERN WORLD
GOD BLESS ON YOUR EXAM!!! I AM ROOTING FOR YOU ☺
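
As a cross-check on the two worked chi-square examples in this reviewer, the arithmetic can be reproduced with a short script. This is only a minimal sketch in plain Python: the function names `chi_square_goodness_of_fit` and `chi_square_independence` are made up for illustration, and the critical values are simply the two table values quoted in the solutions (7.81 for df = 3 and 3.84 for df = 1), not computed from the distribution.

```python
# Illustrative sketch only: function names are hypothetical, not from the reviewer.

def chi_square_goodness_of_fit(observed, expected_proportions):
    """Return (chi-square value, df) for a goodness-of-fit test."""
    total = sum(observed)
    # Expected frequency = expected proportion x sample size
    expected = [p * total for p in expected_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


def chi_square_independence(table):
    """Return (chi-square value, df) for a test of independence on a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            # Expected frequency = (row total x column total) / grand total
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Critical values at alpha = 0.05, copied from the solutions in the reviewer.
CRITICAL_AT_05 = {1: 3.84, 3: 7.81}

# Example 1: coffee survey (observed counts and expected ratios).
gof_stat, gof_df = chi_square_goodness_of_fit(
    [206, 193, 462, 739], [0.15, 0.13, 0.27, 0.45]
)

# Example 2: sex (rows: male, female) by score (columns: high, low).
ind_stat, ind_df = chi_square_independence([[18, 28], [32, 12]])

for label, stat, df in [
    ("Goodness of fit", gof_stat, gof_df),
    ("Independence", ind_stat, ind_df),
]:
    decision = "reject H0" if stat >= CRITICAL_AT_05[df] else "fail to reject H0"
    print(f"{label}: chi-square = {stat:.2f}, df = {df}, {decision}")
```

Rounded to two decimals, the script gives 8.48 and 10.28, matching the hand computations. If a library such as SciPy is used instead, note that `scipy.stats.chi2_contingency` applies Yates' continuity correction to 2×2 tables by default, so it reproduces 10.28 only when called with `correction=False`.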