Marketing Research II: Survey Design

TOPIC 3: Marketing Research II: Survey Design

© 2017 by Mercedes Esteban-Bravo & José M. Vidal-Sanz. All Rights Reserved. No part of this publication can be reproduced, stored, or distributed in any way (either manual, electronic, recording, mechanical, photocopying or including information storage and retrieval systems), without explicit permission from the authors. Updated in 2020.

Contents:
1. Introduction
2. Writing a good questionnaire (Questionnaire Design)
3. Measurement Scales
4. Sampling methods
5. Survey Errors

1. Introduction

We have already discussed what a survey is, how it is used, and the different ways to organize the data collection process (face-to-face, telephone, etc.). In this chapter we study in depth the details necessary to design a survey successfully, such as questionnaire design, scales, and sample selection methods.

[Figure: stages of a survey study]
▪ Questions: What questions to ask? In what order?
▪ Scales: How will answers be rated?
▪ Sampling design
▪ Data collection
▪ Data analysis: codification, statistical analysis, interpreting and integrating findings
▪ Write the research report

2. Writing a good questionnaire

A questionnaire is a structured list of questions asked of survey participants, administered in writing or verbally. Its main functions are:
▪ It sets the order of the interview.
▪ It ensures that all questions are posed in the same way.
▪ It is the basis for recording and collecting the data to be analyzed.

It also includes:
▪ Fieldwork procedures: instructions for selecting, approaching, and questioning respondents.
▪ Any reward, gift or payment offered to respondents.
▪ Communication aids (maps, pictures, ads and products, or return envelopes).

Questionnaire type:
▪ Structured: specifies a set of questions and the format of the response in a standardized way. Appropriate for quantitative research (surveys).
▪ Semi-structured: provides a script of questions, but the answers are not tabulated. Adequate for exploratory qualitative research when you have some prior information and want to clarify aspects.
▪ Unstructured: a completely open and flexible script. Suitable for exploratory qualitative research.

Steps:
1. Specify the information needed.
2. Determine the content of items/questions.
3. Write the questions (structure, wording).
4. Arrange the questions in proper order, form and layout.
5. Test the questionnaire.
6. Eliminate problems, and distribute.

Taxonomy of possible questions

1- By response format:
A) Open-end questions: the respondent is free to choose any response. They are used to express general attitudes and opinions, often in exploratory qualitative analysis.
   Advantages: easy to ask.
   Disadvantages: answers are difficult to tabulate and to analyze.
B) Closed-end questions: the respondent is provided with predetermined answers and is asked to choose the one that best describes his/her view. There are different alternatives:
   ▪ Dichotomous choice
   ▪ Multiple choice (select one among several alternatives)
   ▪ Multiple answer (marking some elements from a closed list)
   ▪ Measurement scales (measuring attitudes)
C) Mixed: closed questions with an option “Others, please specify: _______” that includes an open-end line of response.

2- By the information given by the interviewer: (i) aided versus (ii) unaided questions. Both open-ended and closed-ended questions can be classified as aided or unaided, depending on how much information is given to the respondent.

3- Depending on the type of information:
1) Introductory questions: to open the survey, generating a relaxed environment.
2) Identification questions (name, address and telephone number): used when the survey is not anonymous (confidentiality can still be assured).
3) Classification questions: socio-economic and demographic characteristics (relevant for segmentation purposes), including gender, marital status, education level, etc.
4) Basic information: relates to the basic research problem or goal.
5) Control questions: to test the quality of the information (checking that answers are mainly true). Example:
   Q.6. Have you ever purchased the X brand? Yes No
   ….
   Q.16. Select from the following list the brands you remember ever having bought: X Y Z U
6) Filter or logical questions: they serve to select subsamples to which an adequate sequence of questions is adapted, improving the coherence of the answers. Example:
   Q.7.a Have you heard of brand X? Yes (answer question 7b) No (go to question 8)
   Q.7.b Do you use brand X? Yes (Interviewer: ask question 7c) No (ask question 8)
   Q.7.c How often do you use it?
   Q.8. …….
7) Screening questions: used at the beginning of the questionnaire to discard elements of the sample that are not part of the target population of the study.
8) Scale questions: to quantify customers’ attitudes.
9) Special questions: related to personal hygiene, law observance, sexual practices, political ideology, health, religious practices, income, etc.

There are different approaches for dealing with special questions. Some examples are:
▪ Scales: a multi-item scale is used to build a construct; it is convenient to intersperse dual (positive and negative) statements.
▪ Circumvent: asking indirectly, in a hidden way.
▪ Inference: asking for the respondent’s opinion about third persons’ behaviour instead of their own case.
▪ Randomized questions: they use randomness to preserve respondents’ privacy, applying Bayes’ theorem.

Example: circumvent questions
Instead of asking how many times you brush your teeth, which may embarrass an unhygienic interviewee and lead to a false answer, we might ask:
▪ How many people do you share your toothpaste tube with?
▪ How many days does a toothpaste tube last in your home?
▪ What is your average monthly expenditure on toothpaste?

Example: inference questions
How frequently do you think your friends brush their teeth?
1. Never
2. Once every few weeks
3. Weekly
4. Daily
5. Two or three times per day

Example: scales
Instead of asking about sexist attitudes directly, use a scale. The Neosexism Likert scale is based on the following statements:
1. Discrimination against women in the labor force is no longer a problem in the U.S.
2. I consider the present employment system to be unfair to women.
3. Women shouldn't push themselves where they are not wanted.
4. Women will make more progress by being patient and not pushing too hard for change.
5. It is difficult to work for a female boss.
6. Women's requests in terms of equality between the sexes are simply exaggerated.
7. Over the past few years, women have gotten more from the government than they deserve.
8. Universities are wrong to admit women in costly programs such as medicine, when in fact a large number will leave their jobs after a few years to raise their children.
9. In order not to appear sexist, many men are inclined to overcompensate women.
10. Due to social pressures, firms frequently have to hire underqualified women.
11. In a fair employment system, men and women would be considered equal.

Example: randomized questions
Estimating the incidence of theft. Toss a coin and, given the result, answer the first question if you get heads and the second one if you get tails:
▪ Have you stolen any article from a retailer during the last two months?
▪ Is your zodiac sign Leo?
Yes _____ No ____

Assume the following survey results: the sample size is 240 and 50 respondents answer “Yes”.

ANALYSIS: denote heads as event A and tails as event B. The coin gives P (A) = 1/2 and P (B) = 1/2, and P (Leo) = 1/12.
Now, notice that:
P (answering Yes) = P (A) * P (answering Yes to the 1st question) + P (B) * P (answering Yes to the 2nd question)
50/240 = 1/2 * P (answering Yes to the 1st question) + 1/2 * 1/12
implying that P (answering Yes to the 1st question) = 2 * (50/240 - 1/24) = 1/3 ≈ 0.3333

Determining the order of questions

General types of question orderings (they can be used within question blocks, and even for ordering the blocks):
▪ Funnel sequence: the procedure of asking the most general or unrestrictive questions first, followed by successively more restrictive questions. It is the most commonly used, especially if respondents have some idea of the topic.
▪ Inverted-funnel sequence: inverts the funnel sequence, in the sense that questioning begins with specific questions and concludes with general ones. It is used in personal and telephone surveys.

General structure of a questionnaire
▪ Introduction (objectives, author or institution, assurance of either confidentiality or anonymity, explanation of incentives, thanks for the collaboration).
▪ Screening questions (e.g., informed consent to personal data collection, and questions to ensure the participant fits the target population).
▪ Blocks of questions (by topic):
   ▪ Initial questions should be simple and interesting.
   ▪ Each topic starts with general questions, then specific questions: first evaluation questions (what?) and afterwards diagnosis questions (why?).
   ▪ Place filter or logical questions when needed.
   ▪ Place difficult or delicate questions slightly after the centre of the questionnaire.
   ▪ Before reaching the end, focus on the easy questions.
▪ Classification questions at the end of the questionnaire.
▪ An end-of-survey message (e.g., Thank you for your cooperation.)
▪ The interviewer's instructions (if the survey is not self-administered) and the supporting material, placed as close as possible to the question where their use is required.
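Returning to the randomized-question example earlier in this section (coin toss, 50 “Yes” answers out of 240), the estimate can be checked with a few lines of Python. This is only a sketch: the function name `sensitive_rate` and its parameter names are illustrative assumptions, not part of the original material. It solves P(Yes) = P(A) * π + P(B) * P(Leo) for π, the share answering “Yes” to the sensitive question, using exact fractions to avoid rounding.

```python
from fractions import Fraction


def sensitive_rate(yes_count, sample_size, p_sensitive, p_yes_innocuous):
    """Solve P(Yes) = p_sensitive * pi + (1 - p_sensitive) * p_yes_innocuous for pi.

    p_sensitive     : probability that the coin selects the sensitive question
    p_yes_innocuous : known probability of "Yes" to the innocuous question
    (hypothetical helper, named here for illustration only)
    """
    p_yes = Fraction(yes_count, sample_size)
    p_innocuous = 1 - p_sensitive
    return (p_yes - p_innocuous * p_yes_innocuous) / p_sensitive


# Figures from the example: 240 respondents, 50 "Yes", fair coin, P(Leo) = 1/12
estimate = sensitive_rate(50, 240, Fraction(1, 2), Fraction(1, 12))
print(estimate, float(estimate))
```

With the figures of the example the estimate is exactly one third. Because only the aggregate count of “Yes” answers is used, no individual answer reveals which question was answered.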
Layout
▪ In printed questionnaires use booklets: they prevent any page from being lost or moved, they make it easier to move from one page to another, and they provide a greater sense of professionalism.
▪ Questionnaires should look easy to read: use white space and a clear typeface.
▪ Color coding: color in itself does not seem relevant, but different colors can be used to identify different subsamples.
▪ Numbering the questions helps respondents realize that they have inadvertently skipped a question, and simplifies the work of the interviewer (especially when some questions have to be skipped).
▪ Fit each question on one page; do not split questions or answer categories across two pages.
▪ In multiple choice questions, showing the options vertically generates less confusion than presenting them horizontally.
▪ In online surveys, when using measurement scales that require several responses, try to avoid tables, because a table is difficult to answer on a mobile phone. In printed surveys, however, a table can make the respondent's task easier.

Some errors to avoid:
▪ What type of religion do you practice? Jew___ Catholic ___ Protestant ___ Others___ Specify___ None___
▪ Do you think that Lenovo’s personal computers are the most compatible, and you get the best value for money?
▪ 1) Did you purchase a customer support renewal? Yes___ No___
  2) Do you think that customer support services are a good investment? Yes___ No___
  3) How much does it cost to maintain customer support service? ___ Less than €30, ___ Between €30 and €60, ___ More than €60

Pre-testing the questionnaire
▪ All aspects of the questionnaire should be pre-tested using a small sample.
▪ The pre-test should be conducted in an environment and context identical to the one that will be used in the survey.
▪ A debriefing procedure should be used, explaining the target to the interviewers.
▪ Specialists should review the whole pre-test process, analysing the presence of ambiguities and doubts, and also asking the respondents, leading to an improved questionnaire.
▪ Collect statistical information on the pre-test sample, such as the empirical variance (and use it to determine the sample size for the final survey).
▪ Finally, the survey can be launched.

Checklist for Unacceptable Questionnaires
▪ Major portions of questionnaire or key questions left unanswered
▪ Evidence that respondent did not understand instructions or did not take task seriously
▪ Missing pages
▪ Respondent not qualified for target population
▪ Questionnaire returned after cutoff date

3. Measurement Scales

“When you can measure what you are speaking about, and express it in numbers, you know something about it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts advanced to the stage of science.” (William Thomson, 1st Baron Kelvin)

3.1. Attitudes

An attitude is a person's predisposition to evaluate and respond to some stimulus. Generally, it is considered to have three components: cognitive, affective and behavioural. In practice, each of these components may involve multiple magnitudes. Attitude magnitudes can be defined in terms of:

1. A facet: an attitudinal direction. It can be defined with one sense or pole (zero to positive, for example knowledge or purchase intention), or with two opposite senses or poles (e.g. hate-love). Bipolar attitude magnitudes can be interpreted as two confronted unipolar magnitudes.
Remark. The pole or sense is generally described as having a positive valence (attractiveness or goodness) or a negative valence (adverseness or badness); e.g. love has a positive valence, fear and hate a negative one.
Ambivalent senses have both positive and negative connotations (but this classification is relatively subjective). If a person simultaneously holds attitudes with positive and negative valence towards a stimulus, we say that the person is ambivalent.

2. Intensity: strength of the attitude towards a pole (null, weak, strong).

For example, you might like marketing (between hate and love); thus, your attitude towards marketing has a positive sense in the emotional facet. If you are crazy about it, your emotional attitude has a high intensity.

How can we measure consumers' attitudes (direction, sense, intensity)?

How to measure attitude magnitudes

A measurement scale is a mapping that assigns (1) an attitudinal magnitude of a person towards an object or stimulus to (2) numbers or other symbols, following certain pre-specified rules. To elicit a measurement of the attitudinal magnitude, respondents select the numbers or symbols in the scale that best represent their attitude towards the stimulus.

When the stimulus has several attributes, we can study attitudinal magnitudes towards each attribute, or the global attitude. The scale can measure attitudes towards several objects simultaneously. Alternatively, we can study separately the attitude towards each object.

Different ways (and scales) to ask the same question can yield different responses.

Most scales fall into these categories:
▪ COMPARATIVE SCALES (the subject is asked to compare some objects directly against one another, e.g., pair-wise) or NON-COMPARATIVE SCALES.
▪ FORCED SCALES (a scale that forces respondents to express an opinion because a “no opinion” or “no knowledge” option is not provided) or NON-FORCED SCALES. In addition, they can have a neutral point or not. In the example below, a neutral point would be a central option saying “neither good nor bad”.
▪ BALANCED SCALES (a scale with an equal number of favorable and unfavorable categories) or UNBALANCED SCALES.
Service quality is:
BALANCED            UNBALANCED
Extremely bad       Extremely bad
Very bad            Very bad
Bad                 Somewhat good
Good                Good
Very good           Very good
Extremely good      Extremely good

3.2. Stevens’ taxonomy of measurement scales

In the early 1940’s, the Harvard psychologist S.S. Stevens coined the terms nominal, ordinal, interval, and ratio to describe a hierarchy of measurement scales, and classified statistical procedures according to the scales for which they were “permissible.” Although criticized by statisticians, Stevens’s categories still influence marketing researchers, and are described in most textbooks.

▪ Nominal: A scale whose numbers serve only as labels or tags for identifying and classifying objects, with a strict one-to-one correspondence between the numbers and the objects.
▪ Ordinal: A ranking scale in which numbers are assigned to objects to indicate the relative extent to which some characteristic is possessed. Thus it is possible to determine whether an object has more or less of a characteristic than some other object (when you score neutral as 0, comfortable as 1, and very comfortable as 2, you should be wary of any procedure that relies heavily on treating "very comfortable" as being twice as comfortable as comfortable).
▪ Interval: A scale in which the numbers are used to rate objects such that numerically equal distances on the scale represent equal distances in the characteristic being measured.
▪ Ratio: The highest scale. It allows the researcher to identify or classify objects, rank order the objects, and compare intervals or differences. It is also meaningful to compute ratios of scale values.

How can you measure jumpers’ performance?
1. Attach labels to the jumpers? Nominal scale.
2. Rank jumpers by the length of their jump? Ordinal scale.
3. Judges rate the technical quality of the jump? Typically an interval scale; the extremes are hard to interpret.
4. Measure the jump distance achieved: ratio scale.

Permissible transformations:
▪ In ordinal scales, only monotonically increasing transformations f are permissible (they preserve the one-to-one link between objects and measurement items): if s(i) > s(j) then f[s(i)] > f[s(j)].
▪ For interval scales, all linear transformations in which we add the same constant to each value and/or multiply each value by a constant are permissible, so that s(i) - s(j) = c (f[s(i)] - f[s(j)]).
▪ Ratio scales preserve relative ratios, so permissible transformations satisfy s(i)/s(j) = f[s(i)]/f[s(j)]. Thus, it is permissible to multiply ratio scale data by a constant, but we may not take logs or add a constant. Ratio scale data have a defined zero, which may not be changed.
▪ Nominal scales are invariant under any transformation that preserves the relationship between individuals and their identifiers.

▪ Stevens only approved certain types of statistical techniques for each type of scale data, but this has been refuted by statisticians, who are more flexible.
▪ Stevens’ classification should not be taken strictly, because scale type is not just an attribute of the data, but rather depends upon the questions we intend to ask of the data and the additional information we might have. For example: a university student identification number can be considered a nominal variable (a mere label). But if IDs are assigned sequentially, there could be more relationships, e.g. an assignment by the scores from an entrance exam at the university, and then they should be considered an ordinal scale. But if, let us say, the identification numbers are simply the students’ exam marks, the IDs would be an interval scale. If the marks were based on the time to solve an exercise, the IDs would be a ratio scale.
▪ Nevertheless, the taxonomy is widely used by marketing researchers, and you must know it.

3.3. Commonly used Scales

There are a large number of scales, which try to measure data of different nature. Some are simple: only one question on the subject is considered. Others are multi-item: several related questions are formulated, and an overall value can be calculated from the answers. The idea is to reduce the influence of the words chosen when formulating each question, and therefore reduce the measurement error.

MEASUREMENT SCALES
▪ Nominal. Non-comparative: binary, multiple choice.
▪ Ordinal. Comparative: pair comparison, forced ranking, Q-sort (also known as class or similarities). Non-comparative: verbal/semantic, Stapel, differential semantic (or bipolar adjective), Likert, picture, continuum, equal-width interval.
▪ Interval. Comparative: pair comparison in value (or $-paired metric).
▪ Ratio. Comparative: constant sum. Non-comparative: direct quantification, reference alternative.
▪ Guttman and Thurstone scales (non-comparative) are not discussed in this course.

Nominal, non-comparatives

Binary: to select between 2 alternatives.
Are you a social network user? Yes (=1) No (=2)

Multiple choice: to choose between several alternatives.
Where do you live? Europe (=1) America (=2) Asia (=3) Africa (=4) Oceania (=5)

Ordinal, non-comparatives

Semantic/Verbal: Respondents are instructed to check the category that best describes the intensity of their attitudinal facet towards the stimulus being measured.
How do you feel?
□ Delighted
□ Pleased
□ Mostly satisfied
□ Mixed (about equally satisfied and dissatisfied)
□ Mostly dissatisfied
□ Unhappy
□ Terrible
Often there is a neutral category. The main limitation is that the words typically mean something different to each respondent.

Stapel: Respondents are asked to indicate how accurately or inaccurately a single adjective describes the object by selecting a value from an even number of categories (from -5 to +5, with no zero). It is a multi-item scale: participants usually evaluate many adjectives, and the final measure is an average or weighted average.
  +5          +5
  +4          +4
  +3          +3
  +2          +2
  +1          +1
 Cheap    Easy to use
  -1          -1
  -2          -2
  -3          -3
  -4          -4
  -5          -5
The polar points are not defined, and therefore each respondent sets the criteria subjectively. The interpretation of the results is difficult.

Differential Semantic or bipolar adjective: It is a variation of the semantic scale for bipolar magnitudes: rather than attaching a description to each of the response categories, only the two extreme categories are labeled. It is generally used as a multi-item scale.

We would like to know your opinion about X-Ray diagnostics. For each factor, please put a check mark on the line at the position that best reflects your opinion between both extreme endpoints:
1) Useful :---:---:---:-X-:---:---:---: Useless
2) Unhealthy :---:-X-:---:---:---:---:---: Healthy
3) Kind :---:-X-:---:---:---:---:---: Cruel
4) Beautiful :---:---:-X-:---:---:---:---: Ugly
5) Pleasant :---:---:---:-X-:---:---:---: Unpleasant
6) Cheap :---:---:---:---:-X-:---:---: Expensive
7) Hard :---:---:---:---:-X-:---:---: Soft
8) Fast :---:---:-X-:---:---:---:---: Slow
9) Complex :---:---:---:-X-:---:---:---: Simple
10) Noisy :---:---:---:---:---:---:-X-: Quiet
It can include numbers to anchor responses: Bad 1 2 3 4 5 6 7 Good

Likert: Respondents are asked to indicate the amount of agreement or disagreement (from strongly disagree to strongly agree) on a five-point scale: 1 (Strongly disagree), 2 (Disagree), 3 (Neither agree nor disagree), 4 (Agree), 5 (Strongly agree).
Mac computers have high quality.   1 2 3 4 5
Mac computers are robust.          1 2 3 4 5
Mac computers are not cheap.       1 2 3 4 5

An overall (positive) attitude toward the brand can be computed using an average score. The negative statements can be reversed with respect to the neutral point: e.g., if the answers to the three items are 5, 3 and 2, the last item is reinterpreted as the opposite statement (the brand charges fair prices), transforming the 2 (disagree) into a 4 (agree). The average attitude towards the brand is then (5+3+4)/3 = 4, where we have transformed the unfavorable items.
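The Likert scoring rule just described (reverse the unfavorable items around the neutral point, then average) can be sketched in Python as follows. The function name `likert_score` and its arguments are illustrative assumptions; the reversal rule used, (points + 1) - answer, is the usual one for a scale numbered from 1 to `points`, and it maps the 2 of the example into a 4.

```python
def likert_score(responses, reversed_items, points=5):
    """Average Likert score after reverse-coding unfavorable items.

    responses      : list of answers, each between 1 and `points`
    reversed_items : set of positions (0-based) of negatively worded items
    (hypothetical helper, named here for illustration only)
    """
    adjusted = [
        (points + 1 - answer) if position in reversed_items else answer
        for position, answer in enumerate(responses)
    ]
    return sum(adjusted) / len(adjusted)


# Answers 5, 3, 2 to the three statements; the third one is negatively worded
score = likert_score([5, 3, 2], reversed_items={2})
print(score)  # (5 + 3 + 4) / 3 = 4.0
```

A weighted average would only change the last line of the function, replacing the plain mean by a sum of weight times adjusted answer divided by the sum of weights.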
In practice, it is very common to use a weighted average of all items. The scale items can also describe different attitudes toward a product, using “Dislike strongly”, “Dislike”, “Neutral”, “Like”, “Like strongly”.

Building the scale:
1. First we collect a long series of items related to the attitude we want to measure, selecting those which express a clearly favorable or unfavorable position. An item is a phrase or sentence that expresses a positive or negative reaction to the phenomenon that we want to study.
2. The more favorable the attitude, the higher the response score given to the item. The total sum of the scores is the overall attitude measure.
3. It is convenient to intersperse items with a positive and a negative meaning with respect to the attitude; the answer is added or subtracted in the global measurement.
It is convenient to include an additional column NA (no answer or don't know) if the topic is complex. It is better to include an odd number of alternatives (so that the scale is balanced and has a neutral point), e.g. 5 or 7 points.

Picture scales: A variant of semantic scales that uses graphic symbols instead of words. They are particularly useful for children and illiterate respondents.
Please tick the face that best shows how much you enjoyed your vacation in our resort:
[Figure: row of faces]

Another example: Thermometer Scale
Instructions:
▪ Please indicate how much you like McDonald’s hamburgers by coloring in the thermometer.
▪ Start at the bottom and color up to the temperature level that best indicates how strong your preference is.
Form: Like very much 100 / 75 / 50 / 25 / 0 Dislike very much

Continuum classification:
Please indicate your opinion regarding the kindness of the staff working in this restaurant:
Not at all friendly 1 2 3 4 5 6 7 8 9 10 Very kind

Equal width interval: Depending on the measured magnitude and how it is built, this scale might be an ordinal or an interval scale in the sense of Stevens.
For example, when it is used to measure age or income intervals, it is usually interpreted as an interval scale.

Please indicate in which category your total household income falls:
□ Less than $10,000
□ $10,000-$19,999
□ $20,000-$29,999
□ $30,000-$39,999
□ $40,000-$49,999
□ $50,000-$59,999
□ $60,000-$69,999
□ $70,000-$79,999
□ $80,000-$99,999
□ $100,000-$109,999
□ $110,000-$119,999
□ $120,000 and over

Please indicate your age:
□ Under 30 years old
□ 30-39 years
□ 40-49 years
□ 50-59 years
□ 60 years or older

When the intervals are finer, the scale tends to be closer to an “interval scale” in the sense of Stevens (notice that the first and last intervals are often less comparable in length).

But if this scale is used to evaluate preferences, then it is just an “ordinal scale” in the sense of Stevens.
How do you rate the writing style of this book?
□ Between 0 and 2 points
□ Between 3 and 4 points
□ Between 5 and 6 points
□ Between 7 and 8 points
□ Between 9 and 10 points

Ordinal, comparatives

Forced ranking: A respondent is presented with several objects simultaneously and asked to order or rank them according to some criterion. The data obtained are ordinal.
Please rank the following five brands in terms of your preference (1 denotes the most preferred brand, 2 the second most preferred, and so on):
Coke 2
Pepsi 3
7-Up 1
Dr. Pepper 5
Slice 4
Drawback: it is difficult to answer if there are many options.

Paired Comparison: A respondent is presented with two objects at a time and asked to select one object in the pair according to some criterion. The data obtained are ordinal.
Please indicate which of the following soft drinks you prefer, by circling your preferred brand in each pair:
Pepsi / Dr. Pepper
Pepsi / Slice
7-Up / Dr. Pepper
7-Up / Slice
Dr. Pepper / Slice
It is easier to answer than forced ranking, yet we cannot have a very large number of alternatives, because the number of pairs increases too much. With m alternatives the number of pairs is the binomial coefficient "m choose 2", that is, m(m - 1)/2.

Transformation of paired comparison data into an overall order:
1. Take the preference percentages for all pairs, for example all pairs formed with A, B, C, D, arranged in a table (row, column).
2. Allocate 1 to cells with a value higher than 0.5, and 0 otherwise.
3. Sum the points by column, and define the order from the largest to the smallest figure.

Q-sort (also known as Class or similarities): A procedure to simultaneously evaluate many objects or cards, classifying them in categories ordered with respect to some criterion.
Classify the following 100 cards showing brand logos according to the following classification stacks: Prefer most / Like / Neutral / Dislike / Prefer least

Interval, comparatives

Graded Paired Comparison (Dollar Metric): obtains paired comparison judgments of both which brand is preferred and the amount (in value) by which it is preferred.
For each pair, circle your preferred brand and indicate how much extra you would be willing to pay to get it with respect to the less preferred brand.
Coke, Pepsi 2c
Coke, 7-Up 8c
Coke, Dr. Pepper 5c
Coke, Slice 12c
Pepsi, 7-Up 6c

Ratio, comparatives

Constant Sum Scale: A comparative technique in which respondents are required to allocate a constant sum of units such as points, dollars, chits, stickers, or chips among a set of stimulus objects with respect to some criterion.
Distribute 100 points between the characteristics shown so that the allocation reflects the importance of each one when buying a car.
Price 35
Motor 20
Cylinder capacity 15
Consumption 20
Exterior design 10
Total 100

Ratio, non-comparatives

Direct quantification: The simplest way to obtain ratio scaled data is to ask directly for the quantification of a construct that is ratio scaled.
For example:
▪ How many T-shirts do you own? ____________
▪ How old are you? _________
The problem with this approach is that the respondent may not know, or may not want to reveal, the exact answer.

Reference alternative or fractionating scale: It is used to measure magnitudes in which ratios are meaningful. This approach has respondents compare alternatives with a reference alternative.

Example 1. Look at the volume of the bottle of perfume (100 cl.) that I am showing to you and assign 100 points to it. Now, I will ask you to rate the volume of the following bottles, as a fraction of the original one.
Alternative Bottle A _____
Alternative Bottle B _____
Alternative Bottle C _____
Alternative Bottle D _____

Example 2. If the seriousness of a murder is assigned 100 points, how do you rate the seriousness of each of the following crimes?
Manslaughter, arson, kidnapping, burglary, rape, theft, robbery, larceny, assault and battery, forgery, vandalism, narcotic violation.

When measuring with a scale, there are errors. If the error is systematic (it generates a bias with respect to what we want to measure), the scale is invalid. A different problem arises when the scale is too noisy, generating too much variability; then we say that the scale is unreliable. Good scales should be valid and reliable.

                   RELIABILITY (Yes)       RELIABILITY (No)
VALIDITY (Yes)     Reliable & valid        Unreliable & valid
VALIDITY (No)      Reliable & invalid      Unreliable & invalid

When designing a new scale, it is necessary to study its practical behavior (whether it is valid and reliable). If existing scales that have already been tested are used, it is not necessary to check their behavior, and they can be used directly. There are widely used standardized scales:
There are many compilations of commonly used ones.

4. Sampling finite populations

Sampling (finite populations) is the branch of statistics concerned with the selection of a subset of elements (a sample) from a finite set (population) to infer statistical knowledge about the population from the analysis of the sampled units. The analysis of the whole population is known as a census. Typically the population is quite large and we use sampling methods to save time and money.

Basic concepts
▪ Population/Universe: The set of all the elements, sharing some common set of characteristics, that comprise the universe for the purpose of the marketing research problem. Extent: refers to the limits (geographical boundaries). Time: the time period under consideration.
▪ Census: A study based on the whole population (this is time consuming and expensive). This is why we generally use samples.
▪ Sample: A sub-group of the elements of the population randomly selected for participation in the study.
▪ Sampling frame: A representation of the elements of the target population. It consists of a list or set of directions for identifying the target population. For example, the telephone book, an association directory, a mailing list, …
▪ Parameters: the value of some statistical measure that can be computed for the population (e.g. the mean or the variance of some quantitative variable). We think of it as a deterministic value.
▪ Estimators: the value of a statistical measure computed from the sample data. As samples are randomly drawn, the estimators are random variables.
▪ Elevation coefficient: the ratio N/n between the population size (N) and the sample size (n). Its inverse is the sampling fraction, n/N.
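The difference between a parameter and an estimator, and the role of the elevation coefficient, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is synthetic (1,000 made-up values), the sample is drawn by simple random sampling without replacement, and expanding the sample total by N/n is shown as one common use of the elevation coefficient rather than as a procedure taken from these slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so that the illustration is reproducible

# Hypothetical population of N = 1000 values of a quantitative variable X
population = [random.gauss(50, 10) for _ in range(1000)]
N = len(population)

# Simple random sample of n = 100 elements, without replacement
n = 100
sample = random.sample(population, n)

# Parameter: computed on the whole population (a fixed, deterministic value)
population_mean = statistics.fmean(population)

# Estimator: computed on the sample (it changes from sample to sample)
sample_mean = statistics.fmean(sample)

elevation_coefficient = N / n   # each sampled unit "represents" N/n units
sampling_fraction = n / N       # inverse of the elevation coefficient

# Expanding the sample total with the elevation coefficient estimates
# the population total
estimated_total = elevation_coefficient * sum(sample)

print(population_mean, sample_mean)
print(elevation_coefficient, sampling_fraction)
print(sum(population), estimated_total)
```

Re-running the sampling step with a different seed changes `sample_mean` and `estimated_total`, but not `population_mean`, which is the sense in which estimators are random variables and parameters are not.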
Consider a population with N elements numbered $1, \dots, N$. The population values of a variable X are given by the set $\{X_1, \dots, X_N\}$. The typical population parameters are the mean, the variance, and the total:

$\mu = N^{-1} \sum_{i=1}^{N} X_i$,
$\sigma^2 = N^{-1} \sum_{i=1}^{N} (X_i - \mu)^2 = N^{-1} \sum_{i=1}^{N} X_i^2 - \mu^2$,
$S = \sum_{i=1}^{N} X_i = N \mu$.

We draw a sample of n [...]

Example: N >> n (large universe). In the pretest, the results gave a variance of 5500. Set the confidence level (95%) and a maximum error of 5 €. The result is n = 845.1, so we choose 846 individuals to interview (rounding up).

If our focus is on a dichotomous variable (a 0-1 variable, where p is the proportion of ones), then the exact formula is

$n = \frac{z^2 \, p(1-p) \, N}{e^2 N + z^2 \, p(1-p)}$

With large populations (more than 100,000), the approximation is

$n = \frac{z^2 \, p(1-p)}{e^2}$

The value p can be estimated from pretest samples, or be set equal to the value that maximizes the variance p(1-p), in other words p = 0.5.

Example: We want to determine the quality and level of service offered by our library. It is therefore necessary to interview the users who come to the service. How do we calculate the sample size? Establish the confidence level (95%) and a maximum error e of 5%. The sampling frame is the registry of last year's users, which gives a figure of 43,700.

Important figures:
n = ?
e = 5% = 0.05
z = 1.96 (normal distribution table, 95% confidence level)
N = 43,700 (small universe)
p = 0.50
n = 380.81. The sample size should be 381 individuals (rounding up).

5. Survey Errors

Total survey error

“Total Survey Error refers to the accumulation of all errors that may arise in the design, collection, processing, and analysis of survey data. A survey error is defined as the deviation of a survey response from its underlying true value.” (Ref: Public Opinion Quarterly Volume 74 Number 5, Special Issue, 2010).
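Returning briefly to the two sample-size examples of Section 4, the figures can be checked numerically with a short Python sketch. The function names are hypothetical. The formula for proportions is the one given in that section; for the mean, the sketch assumes the usual large-population expression n = z²σ²/e², which does not appear in the text above but reproduces the figure of the pretest example.

```python
import math


def sample_size_mean(variance: float, max_error: float, z: float = 1.96) -> float:
    """Large-population sample size for a mean (assumed formula: z^2 * variance / e^2)."""
    return z**2 * variance / max_error**2


def sample_size_proportion(max_error: float, z: float = 1.96, p: float = 0.5,
                           population_size=None) -> float:
    """Sample size for a proportion; uses the finite-population formula when N is given."""
    spread = z**2 * p * (1 - p)
    if population_size is None:
        # Large-population approximation: z^2 p(1-p) / e^2
        return spread / max_error**2
    # Finite-population formula: z^2 p(1-p) N / (e^2 N + z^2 p(1-p))
    return spread * population_size / (max_error**2 * population_size + spread)


# Pretest example: variance 5500, maximum error 5 euros, 95% confidence.
print(math.ceil(sample_size_mean(5500, 5)))  # 846

# Library example: N = 43,700, e = 0.05, p = 0.5, 95% confidence.
print(math.ceil(sample_size_proportion(0.05, population_size=43_700)))  # 381
```

Rounding up with `math.ceil` rather than rounding to the nearest integer keeps the maximum error at or below the target.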
Total error can be decomposed into:

1. Sampling or experimental error. Statistical fluctuation that occurs because of chance variation in the elements selected for the sample. More specifically, this error refers to the differences between an estimate derived from a sample survey and the "true parameter" that would be obtained if the whole survey population were enumerated. It varies with the sample size, the sample design, the sampling fraction and the variability within the population.

2. Non-sampling error. It refers to all other factors that contribute to error in the derived estimate, for example poor questionnaire design, interviewer error, or coding errors.

The main problem arises when any of these errors (sampling or non-sampling) introduces a systematic component that results in a bias in the estimates.

A) Typical biases due to sampling errors are the use of an inappropriate sampling frame (one that includes elements that do not belong to the population and/or excludes population elements), and non-response biases due to refusals and to not-at-home problems.

Non-response bias occurs when the individuals who do not respond are different from the ones who participate. This happens if the reason for non-response is related to the topic being studied. A typical consequence is that extreme positions are overrepresented (those holding them are more likely to participate), while people indifferent to the problem may be underrepresented.

Methods to improve response rates
Reducing refusals: prior notification, incentives, follow-up.
Reducing not-at-homes: callbacks.

B) Typical biases introduced by non-sampling errors are administrative errors, respondent errors, and measurement errors (validity and reliability of scales).
Administrative errors:
– The execution of the investigation is incorrect
– The interviewer falsifies the results
– The interviewer introduces biases through her/his mistakes
– There are errors in processing the data

Respondent errors (respondents tend to answer questions with a certain slant that consciously or unconsciously misrepresents the truth):
– If it is conscious: it may be due to privacy reasons, the physical or social environment, lack of time and fatigue, the use of ambiguous or complex language in the questionnaire, fear of the organization that collects the information (e.g., the tax office), the individual's liking for distorting the truth, etc.
– If it is unconscious: it may be due to the individual's inability, forgetfulness, a tendency to please the interviewer, the desire to show a higher social level, etc.
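The non-response bias described under A) can be illustrated with a small deterministic Python sketch. All figures are hypothetical: a satisfaction score from 1 to 5, three groups with invented population shares, response rates and mean answers. The function names are also hypothetical. The sketch compares the true population mean with the mean we would expect among those who actually respond.

```python
def population_mean(groups):
    """Mean answer over the whole population; shares are assumed to sum to 1."""
    return sum(share * mean for share, _rate, mean in groups)


def respondent_mean(groups):
    """Expected mean answer among respondents only, weighting each group by share * rate."""
    responding = sum(share * rate for share, rate, _mean in groups)
    return sum(share * rate * mean for share, rate, mean in groups) / responding


# Hypothetical groups: (population share, response rate, mean answer on a 1-5 scale).
groups = [
    (0.2, 0.6, 1.0),  # very dissatisfied, eager to respond
    (0.6, 0.2, 3.0),  # indifferent, rarely respond
    (0.2, 0.3, 5.0),  # very satisfied
]

bias = respondent_mean(groups) - population_mean(groups)
print(round(population_mean(groups), 2), round(respondent_mean(groups), 2), round(bias, 2))
```

In this invented setting, the indifferent group makes up 60% of the population but only 40% of the expected respondents, and the expected respondent mean falls below the population mean because the dissatisfied group responds most often.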
