Practical Research I Term 1: Finals PDF
Summary
This document outlines the procedures for data preparation in quantitative research, focusing on data coding, entry, and transformation. It details how to turn answers from questionnaires or interviews into usable numerical data for analysis. The document also covers methods for ensuring data accuracy and consistency.
Full Transcript
PRACTICAL RESEARCH I Term 1: Finals

Lesson: Data Preparation

Data preparation in quantitative research follows a standard set of procedures to facilitate the conduct of quantitative analysis.

Procedures:

Data Coding
Data coding is an important first step in preparing data for research. It involves turning answers from questionnaires or interviews into numbers so they can be analyzed. This is especially important for open-ended questions to keep results consistent. However, not all data can be coded into numbers. For example, interview transcripts usually can’t be turned into numbers for statistical analysis.

Codebook
To make coding easier, researchers use a codebook. A codebook explains:
○ Each variable in the study
○ What questions or items measure that variable
○ The format of each item (like numbers or text)
○ The scale used to measure each item (like a scale from 1 to 5, or 1 to 7)
○ How each answer will be turned into a number

For example, if a question uses a 7-point scale where 1 means “strongly disagree” and 7 means “strongly agree,” you might code the responses as 1 for “strongly disagree,” 4 for “neutral,” and 7 for “strongly agree.”

Different types of data require different coding. For instance:
○ Nominal data (like types of industries) might be coded as 1 for manufacturing, 2 for retail, and so on.
○ Ratio data (like age or income) can be entered just as it was collected.

Data Entry
Data entry involves transferring information from questionnaires or interviews into computer files for processing. This task can be done faster and more accurately if two people work together: one person reads the information, and the other types it in.

For large and complex studies, a statistical program like SPSS (Statistical Package for the Social Sciences) is often used to help with data entry. These programs make it easier to handle many variables and items. However, the data is usually stored in a special format that might not work with other programs.
To avoid this problem, data can be entered into a spreadsheet or database, which makes it easier to share and analyze later. For smaller data sets (fewer than about 65,000 rows and 256 columns, the limits of older Excel versions; current versions accept roughly one million rows), you can use a spreadsheet like Microsoft Excel. Larger data sets with millions of rows will need a database. In a spreadsheet, each row represents one response, and each column represents a question or item.

Data Transformation
Data transformation is another step in the coding process. Sometimes, data needs to be changed or adjusted before it can be properly understood.

For example, if a question is reverse-coded, the data needs to be reversed. Reverse-coded items are questions worded so that agreement means the opposite of what the rest of the scale measures. If most of your questions are positive (“I enjoy my job”), a reverse-coded question might be negative (“I dislike my job”). If someone gives a high score (like 6 or 7) to the negative statement, it actually means they don’t enjoy their job. To make sure the scores can be compared, you need to reverse the score for the negative question. On a 7-point scale, the reversed score is 8 minus the original: if someone answered 7 (strongly agree) to the negative statement, you would change that to a 1. If they answered 2, you would change it to 6, and so on.

Other types of data transformations include:
○ Adding up scores from several questions to get a total score.
○ Giving different questions different weights (importance) to create an overall score.
○ Grouping specific answers into broader categories, like turning specific income amounts into ranges (e.g., $20,000-$30,000).

Data Cleansing
Data cleansing involves double-checking the data entered into the computer to make sure everything is correct. This is especially important when there are many respondents. For example, if someone accidentally enters an age of 501 years, it’s clearly a mistake that needs to be fixed.
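The reverse-coding rule and the out-of-range check described above can be sketched in Python. This is only an illustration under stated assumptions: the function names `reverse_code` and `out_of_range`, the sample values, and the plausible age range of 0 to 120 are all hypothetical and not part of the lesson.

```python
def reverse_code(score, scale_min=1, scale_max=7):
    """Reverse a rating on a fixed scale (on a 1-7 scale: 7 -> 1, 2 -> 6)."""
    if not scale_min <= score <= scale_max:
        raise ValueError(f"score {score} is outside {scale_min}-{scale_max}")
    return scale_min + scale_max - score


def out_of_range(values, low, high):
    """Return (position, value) pairs that fall outside the range low..high."""
    return [(i, v) for i, v in enumerate(values) if not low <= v <= high]


# Hypothetical answers to a negatively worded item on a 7-point scale.
negative_item = [7, 2, 4]
reversed_scores = [reverse_code(s) for s in negative_item]  # [1, 6, 4]

# Hypothetical ages; 0-120 is an assumed plausible range, not a fixed rule.
ages = [23, 501, 35]
suspect = out_of_range(ages, 0, 120)  # [(1, 501)]
```

A check like this only lists suspicious entries. Someone still has to go back to the original questionnaire to decide what the correct value should be.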
Computers often won’t catch these errors on their own, so humans need to look out for them.

Another issue in research is missing values, which occur when respondents don’t answer certain questions. This can happen if the questions are unclear or too personal. It’s best to catch these issues during pretests, before collecting the main data.

When entering data, some programs automatically mark missing values, while others need a specific code (like -1 or 999) to show that data is missing. During analysis, most software will simply ignore any data points with missing values, which can reduce your sample size and make it harder to find patterns. To avoid this, some programs offer imputation, a way to estimate and fill in missing values. For example, if a respondent skips one question in a series, you might use the average of their other answers to fill in the gap. If the question was skipped by many people, you might use the average answer from all respondents.

Memos vs Codes
○ Memos are your initial notes in the margin as you actively read transcripts
○ Codes are the themes, topics, or concepts that emerge as repetition across multiple transcripts

Memos to Codes
○ You develop codes; software does not develop codes
○ Use about 1/3 of your data to develop codes
○ Choose diverse transcripts
○ Memos that are repeated might indicate a code
○ You can modify codes later (iterative process)

Developing a Codebook
○ Dictionary or guidebook of all codes in the project
○ Provides guidelines for consistent coding across multiple staff
○ Provides the name of the code (easy to remember)
○ Provides a clear definition or description of what the code is and is not
○ Provides an example of an appropriate text excerpt for the code

What Makes a Code?
○ Theme relevant to the research question
○ Discussed by participants
○ Repeated in data
○ Clear issue

How Many Codes?
○ Depends on how rich or thin the data are
○ Until “Saturation”
○ Depends on depth of analysis
○ Depends on level of detail of code (narrow or wide focus)

Refining the Codebook
○ Codes can change during the project period
○ May want to split a code into two or combine multiple codes into one
○ Iterative process to determine the scope needed
○ Definitions need to be a clear YES or NO for quick coding
○ Can code positive or negative, but often easier to code one and differentiate later (ex: satisfaction)

Coding
○ Once the codebook is refined and finalized, systematically apply codes to text
○ Code the entire dataset
○ Involves reading, identifying, selecting, and applying a code based on interpretation
○ Need to be able to justify applying a code or not applying a code

Define Text Segment Length
○ With multiple coders, need to determine coding style
○ Macro “Lumper” coding vs Micro “Splitter” coding: can lump all text into large codes or split into several little text strings
○ Important to define for inter-coder agreement

Approaches to Coding
○ Try to code more than you may need
○ Code in stages if you have a lot of codes
○ Can use functional codes, such as indicating good quotations to be used later
○ Can use more than one code on the same text
○ Use code lengths long enough to provide context (codes will be separated out from transcripts)

Inter-coder Agreement
○ Measure of reliability of coding
○ Can two coders independently code data in the same way?
○ Way of assessing validity or accuracy in data
○ Improves consistency and quality of analysis
○ Have two coders or a team code the same few interviews
○ Software will calculate inter-coder agreement
○ Come to consensus on codes/revise definitions

Coding Manually vs Using Software
○ Software is not required for qualitative data analysis
○ Analysis is primarily done by investigators
○ Can code using highlighters or colored pencils
○ Can code using color-coding in Word
○ Software helps with sorting by codes and by subgroups across interviews
○ Software also calculates inter-coder agreement

Thick Description
○ Way of summarizing and providing interpretation on a topic across multiple interviews
○ Describes an act and the context surrounding that act
○ Provides social and cultural meaning
○ Describes how participants perceive an act or concept

How to Write Thick Description
○ Search through codes
○ Search by subgroups
○ Search for overlapping codes (ex: where “satisfaction” and “facilitators” overlap)
○ Search around a question (“In what context do people see family as a barrier or facilitator”)

What to look for when writing
○ Breadth: shows differences and range of context around an issue (ex: all types of barriers to physical activity)
○ Depth: shows details and thoroughly explains an issue (ex: access to the gym is a barrier and involves cost, location, etc.)
○ Context: shows meaning of an issue such as who, what, when, where, why (ex: context around life events that change physical activity levels)
○ Nuance: is an issue different under different circumstances? (ex: family can be perceived as a barrier to exercise or as a facilitator)

Comparisons and Categorizations
○ Analysis can go beyond summaries and thick descriptions around a concept
○ Can compare differences by circumstance
○ Can make comparisons across subgroups (males vs females, by age, by employment status, etc.)
○ Can categorize codes into groups that represent a single issue
○ Ultimately looking for patterns to build explanations

LESSON: Levels of Measurement

In quantitative research, there are four levels of measurement: nominal, ordinal, interval, and ratio. Ordinal, interval, and ratio measures are quantitative while nominal measures are not.

Nominal Level of Measurement
It identifies variables that differ in type or quality but not in quantity. The values represent categories, not numerical differences. For example, the variable “gender” might have categories like male and female. Assigning numbers to these categories (e.g., 1 for male, 2 for female) is just a way to label them; the numbers don’t imply any order or amount.

There’s no inherent order in nominal variables. For instance, hair color categories like brown, blond, and black are just different types, with no one being “higher” or “lower” than the others.

When assigning categories, ensure they are mutually exclusive (each case fits only one category) and exhaustive (every case fits into a category). For example, in civil status, categories like single, married, divorced, separated, and widowed should cover all possible cases without overlap. If you’re unsure whether all possibilities are covered, include an “Other” category to avoid misclassification and capture finer distinctions.

Ordinal Level of Measurement
The ordinal level of measurement is the first quantitative level. Here, numbers assigned to categories indicate order or rank, showing which cases are greater or less than others. However, the gaps between the categories don’t have a specific meaning.

Like nominal variables, ordinal variables must be mutually exclusive (each case fits only one category) and exhaustive (all possible cases are covered).

Ordinal variables allow you to logically rank categories. For example, “level of education” might include categories like some high school, high school, some college, college graduate, and postgraduate. Another example is measuring prejudice, where one person can be more or less prejudiced than another. The key is that you can rank the categories, but the exact distance between them doesn’t matter.

Interval Level of Measurement
The interval level of measurement uses values that represent fixed units, but there is no true zero point. The categories are mutually exclusive, exhaustive, and ordered.

Here, the gaps between numbers are meaningful: each unit difference is consistent across the scale. Unlike ordinal measures, where the distance between categories doesn’t matter, the actual distances between values are important in interval-level measures.
○ Examples include temperature (in degrees Fahrenheit or Celsius) and IQ tests, where the intervals between values are consistent (e.g., the difference between 75 and 80 degrees).

However, since there’s no absolute zero, you can’t say someone is twice as smart or that it’s three times as hot in one room compared to another.

With interval measures, you can determine:
1) that people differ on this variable,
2) that one person has more or less of the variable than another, and
3) how much more or less.

Ratio Level of Measurement
The ratio level of measurement has fixed units and an absolute zero point, meaning zero truly represents “none” of what you’re measuring. This allows for meaningful comparisons, like saying 10 is two units more than 8 and twice as much as 5. You can add, subtract, multiply, and divide ratio numbers.
○ For example, age is a ratio variable. If someone is 30 years old and another person is 15 years old, the first person is 15 years older and twice as old.

Ratio measures also have distinct categories, cover all possibilities, are ordered, and have equal gaps between values. Ratio measures include all features of interval measures but with a true zero point.
○ Examples include age, income, length of residency, and number of times you attend church. Because of the absolute zero, you can say things like someone is twice as rich as someone else.

Histogram and Bar Charts
Histograms and bar charts look similar but have key differences. In histograms, the bars touch because they represent equal intervals of data. In bar charts, the bars are separated by spaces, indicating that the categories are separate groups rather than equal intervals.

Pie charts are best for displaying nominal data, where each slice represents a percentage of the whole (100%).

Frequency polygons (line graphs) can also be used for interval/ratio data and sometimes provide a clearer view than histograms.

Before creating these charts, it’s important to organize raw data into frequency tables. This involves grouping scores or values into categories and then listing how often each category occurs.

Lesson: Drawing Conclusions

Results and Discussion
Present the results of your study according to the sequence of your objectives/questions. Important negative results should be reported too.

Format
○ Objective/Question 1
  Introduce the graph/table/figure
  Present the graph/table/figure
  Interpret the results
  Analysis of the results/give implications

Conclusion
In formulating your conclusion, be guided by the following questions:
○ What answer(s) have you found to your research question?
○ If you have a hypothesis, has it been strengthened, weakened, or falsified?

Do not introduce issues here that have not been mentioned earlier. If the results of your study do not allow you to draw any conclusions, you can end with a summing up.

A Gantt chart is a bar chart that graphically illustrates a schedule for planning, coordinating, and tracking specific tasks related to a project.
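As a closing illustration of the frequency-table step mentioned in the discussion of charts, grouping raw values into equal-width categories and counting each one might look like the sketch below. The function name `frequency_table`, the bin width of 10, and the sample ages are hypothetical, and the sketch assumes whole-number values.

```python
from collections import Counter


def frequency_table(values, bin_width):
    """Group whole-number values into equal-width bins and count each bin.

    Returns (lowest value, highest value, count) for every non-empty bin.
    """
    counts = Counter((v // bin_width) * bin_width for v in values)
    return [(start, start + bin_width - 1, counts[start])
            for start in sorted(counts)]


# Hypothetical raw ages from seven respondents.
ages = [18, 22, 25, 31, 37, 38, 44]
table = frequency_table(ages, 10)
# [(10, 19, 1), (20, 29, 2), (30, 39, 3), (40, 49, 1)]

for low, high, count in table:
    print(f"{low}-{high}: {count}")
```

Each row of the resulting table corresponds to one bar of a histogram or one point of a frequency polygon.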