Wolf (1978) Social Validity - AQA PDF

Summary

This article by Montrose M. Wolf discusses social validity, a concept that is important in applied behavior analysis. The author argues for integrating subjective measurement into the field, exploring the role of subjective experience and societal values in research and practice.

Full Transcript

JOURNAL OF APPLIED BEHAVIOR ANALYSIS 1978, 11, 203-214 NUMBER 2 (SUMMER 1978)

SOCIAL VALIDITY: THE CASE FOR SUBJECTIVE MEASUREMENT or HOW APPLIED BEHAVIOR ANALYSIS IS FINDING ITS HEART¹

MONTROSE M. WOLF
UNIVERSITY OF KANSAS

¹ This manuscript was presented as an invited address to the Division of the Experimental Analysis of Behavior, American Psychological Association, Washington, D.C., September, 1976. Many valuable suggestions regarding this manuscript were made by Don Baer, Curt Braukmann, Steve Fawcett, Dean Fixsen, Bill Hopkins, Frances Horowitz, Kathi Kirigin, Jack Michael, Keith Miller, Todd Risley, Jim Sherman, and Sandra Wolf. Preparation of the manuscript was partially supported by Grants MH20030, MH13644, and MH13881 from the National Institute of Mental Health (Center for Studies of Crime and Delinquency) to the Department of Human Development and the Bureau of Child Research, University of Kansas. Reprints may be obtained from Montrose M. Wolf, Department of Human Development, University of Kansas, Lawrence, Kansas 66045.

I apologize, but I must begin making my case for subjective measurement by recounting to you my own experiences with it over the past few years. Almost a decade ago, when the field of applied behavior analysis was beginning to expand so rapidly, we were faced with the task of putting together the Journal of Applied Behavior Analysis. For a period of several months Garth Hopkins, who was our managing editor, presented us with a series of unexpected decisions to make; like: What color should the paper be? And did we need a paper that would hold together for two thousand years or were we willing to live with a shelf-life of only a thousand years? And so on.

Just a couple of days before we were scheduled to go to press with our very first issue, Garth called with one more question. "What is the purpose of the Journal of Applied Behavior Analysis?", he asked. He said we needed to put a description of the purpose on the inside front cover, as one finds in other journals. He needed an answer almost immediately.

What was the purpose of our journal? It was a question that was clearly more important than the others I had been asked. So I decided to consult the Gods but, as usual, Don Baer, Don Bushell, Barbara Etzel, Vance Hall, Bill Hopkins, Judy LeBlanc, Keith Miller, Todd Risley, and Jim Sherman were not in their offices. However, I did find Don Baer in the hall. So I asked Don, "What is the purpose of JABA?" and Don said in his usual offhand but eloquent way, "It is for the publication of applications of the analysis of behavior to problems of social importance." Well, that sounded so reasonable that it had to be true. So that is what I put in the Journal and it went to press.

There was only one small problem; I wasn't sure what "social importance" meant or, worse still, how to measure it. And, as I am sure you can appreciate, the more I thought about this the more concerned I became.

The dictionary only added to my distress. According to my New Webster's Vest Pocket Dictionary (1962) importance simply meant "having value" and of course, social meant "pertaining to society". Thus, something of social importance would have to be judged by someone as having value to society.

Unfortunately, that sounded slightly subjective to me. And subjective criteria have not been very respectable in our field. We have considered ourselves a natural science, concerned about the objective measurement of natural events such as arithmetic problems worked correctly, litter picked up, sexual responses occurring, and social skills learned. We have considered ourselves to be like the other natural sciences: like physics, chemistry, and biology, which concern themselves with the objective aspects of nature and profitably abandoned the subjective dimensions of natural events sometime in their primordial past.

We have considered ourselves to be distinctly purer and more objective than most of our sister social sciences. We have looked especially askance at our colleagues in sociology, anthropology, psychiatry, and humanistic psychology because they often mix into their sciences difficult-to-digest portions of subjective measurement.

But psychologists have not always been so suspicious of subjective data. For some time, and until the first decades of this century, introspection was the basic method of psychology. As you no doubt remember from your history of psychology course, introspection is defined as the observation or examination of one's own mental, emotional, or feeling states. The subjects' verbal descriptions about sensations, private events, and feelings such as pleasantness and unpleasantness had been taken to be the primary subject matter of psychology (Boring, 1950). As a reaction against introspection in psychology and in science generally, there arose positivism from Bridgman in physics and from Comte, Mach, and Feigl in philosophy. To quote Edwin Boring (1950) about its impact:

"The movement was positivistic. It was an attempt to get back to basic data and thus to increase agreement and diminish the misunderstandings that came about from unsuspected differences in meaning. Experience [introspection] had proved unsuccessful as the scientific ultimate." (Boring, 1950)

John Watson began page one of his book Behaviorism in the following manner:

"Two opposed points of view are still dominant in American psychological thinking: introspective or subjective psychology, and behaviorism or objective psychology. Until the advent of behaviorism in 1912, introspective psychology completely dominated American university psychological life." (Watson, 1930).

B. F. Skinner, in Science and Human Behavior (1953), also argued forcefully against subjective measures of private events. He began by pointing out the implications of the discriminated operant model of language. He described how a community can reinforce and thus develop reliable verbal reporting of public events because both the community and the individual have access to these events. On the other hand, he pointed out that since the community cannot have access to private events, the use in psychology of introspective or subjective data leads to serious questions about reliability. Skinner continued,

"The layman also finds the lack of a reliable subjective vocabulary inconvenient. Everyone mistrusts verbal responses which describe private events. Variables are often operating which tend to weaken the stimulus control of such descriptions, and the reinforcing community is usually powerless to prevent the resulting distortion. The individual who excuses himself from an unpleasant task by pleading a headache cannot be successfully challenged, even though the existence of the private event is doubtful."

While defining a functional analysis for us, Skinner (1953) urged us to concentrate on the objective behavioral data in our science as in the following quotation:

"The objection to inner states is not that they do not exist, but that they are not relevant in a functional analysis.... In dealing with the directly observable data we need not refer to... the inner state...."

Having been well trained in these traditions, we all agreed that in our journal, everything would be measured in objective ways. We would avoid subjective measurement; that would be a first priority. Some of the members of the JABA Board of Editors even wanted to restrict us to using only mechanically recordable behavior in our applied research. They wanted a microswitch under every schoolroom chair and under every bed. They were even suspicious of observer measurement systems that contained reliability checks. Yet I, in a moment of haste, had committed our journal to a goal, to an ultimate criterion, to a reason for being, that was clearly and simply subjective and that we had no good way of measuring.

You can imagine what I expected. I prepared for an onslaught of abuse, invective, and ridicule from our editors and our reading audience. "Social importance? Bah! Humbug!", I thought they would say. To my surprise and relief, what happened was that people seemed pretty much to accept it. Many even seemed to know what it was. For example, JABA editors often referred to it in their reviews and used it as a basis for recommending or not recommending manuscripts for publication. The editors most frequently reported that the particular manuscripts that they had been asked to review didn't have very much of it. On the other hand, they reported that a few manuscripts had a moderate amount of it. And an occasional one or two had a lot of it. This made me feel somewhat better. Although I wasn't sure what it was or how to measure it objectively, it was clear that many of my colleagues had no trouble at all in recognizing it.

I was also fearful of criticism from our reading audience. And we did receive occasional complaints about social importance. But primarily they wanted to know why the research that appeared in JABA was not more socially important. That criticism was easy for me to live with. I just blamed our authors. If the readers had taken me to task for using a fuzzy subjective criterion like "social importance", then I would have had no excuse.

But the issue of subjective measurement continued to make my life complicated. One of the functions of a chief editor is to uphold the standards of the journal. And almost everyone in the field strongly suggests that these be maintained rigorously. Except, of course, in the special case of everyone's own manuscripts which, because of their unusual significance, merit special consideration. In any event, among the standards that I was entrusted to uphold was that of requiring objective, reliable data. Thus, you can appreciate the concern I began to feel when some of our most esteemed colleagues began submitting articles to JABA that included undisguised, blatantly subjective data.

One of the first came from, of all people, Bob Jones and Nate Azrin (1969). They had been conducting an exquisite series of experiments on the effects of rhythm and stimulus duration on stuttering behavior. They had shown, very nicely, that they could almost completely eliminate stuttering by having the stutterers synchronize their speech with a simple, regular beat. They had also developed a portable practical piece of apparatus that would present the beat tactually, and privately, thus avoiding embarrassment to the wearer. Their results indicated that they were on the verge of an important solution to stuttering. There had been one problem, however. The speech, although almost stutter-free, was complained about by listeners as sounding artificial. [The next sentence is to be read with a monotone with a distinct beat.] Apparently, they did not stutter, but they did not talk very naturally, either.

To deal with this problem, Jones and Azrin systematically explored various beat durations. Then (and this was the difficult part) they asked judges to rate the "naturalness" of the speech at various beat durations. The judges reported that the speech sounded most natural to them at between two and three seconds of beat duration.

I wanted to phone Jones and Azrin and say, "Hey you guys, do you realize what you are doing to me and the journal? Do you realize what kind of precedent you will be setting with your 'naturalness'? Why, the people in our field who are not as sophisticated as you and me and who are easily influenced will begin to think that it is possible to measure how people feel about all kinds of subjective things. I know that 'naturalness' sounds innocent enough, but think about it a moment. If you publish a measure of 'naturalness' today, why tomorrow we will begin seeing manuscripts about happiness, creativity, affection, trust, beauty, concern, satisfaction, fairness, joy, love, freedom, and dignity. Who knows where it will end? Think for just a moment. What is that going to do to us and to the field of applied behavior analysis?"

But I was sure that they would have just said that they would agree that it was going to complicate our science a bit. But if those things described by subjective labels were the things that were most important to people, then those were the things, even though they might be complex, that we should become more concerned with. After all, as an applied science of human behavior, we supposedly were dedicated to helping people become better able to achieve their reinforcers.

Well, it didn't stop with Jones and Azrin. At about the same time I received a lovely manuscript from Jim McMichael and Jeff Corey (1969) in which they reported the exciting finding that college students in a Keller-type PSI (Personalized System of Instruction) course did better on the exam than the students in a traditional lecture course. This was, of course, a very important finding, as it replicated and substantiated Keller's research. The only problem was that they also asked the students in each course how much they liked their course. The students in the PSI course rated their course a great deal higher than the students in the traditional lecture sections.

"Well," I thought to myself, "What in the world am I going to do with this one? They are asking the participants in a behavioral treatment program how much they like it. Why, of course they should like it. After all, we are doing it to them for their own good, aren't we? And even if they say they don't like it, we know what is best for them. Clearly, if the procedure is effective, it's just not important whether anyone says they like it or not. Besides, look at the precedent that it will set. Before long, those who don't appreciate the extreme risks of subjective data will start asking for feedback from the participants in their treatment programs. Who knows where that will end?"

But I felt sure that McMichael and Corey would just say that feedback from participants is not a trivial issue: that if the participants don't like the treatment then they may avoid it, or run away, or complain loudly. And thus, society will be less likely to use our technology, no matter how potentially effective and efficient it might be.

At the same time that I was having to wrestle with the problems of subjective measurement in JABA, my colleagues and I in the Achievement Place Research Project were having some problems with unsolicited subjective feedback on similar issues. Colleagues, editors, and community members were asking us about the behavioral goals that we had chosen for training the teaching-parents and the youths participating in the community-based, family-style, behavioral treatment program at Achievement Place. They would ask us: "How do you know what skills to teach? You talk about appropriate skills this and appropriate skills that. How do you know that these are really appropriate?" We, of course, tried to explain that we were psychologists and thus the most qualified judges of what was best for people. Somehow, they didn't seem convinced by that logic.

In addition, the first time we tried to replicate the Achievement Place program in another community, that community gave us feedback in a most drastic manner. Before we really knew that they had complaints about our program they had "fired" us. Finally, there were those who were challenging the importance of some of the results of the training that we were reporting. "Yes," they would say, "there are changes in the behavior, but how do we know that they are really important changes?"

The message we seemed to be getting was that "social importance" was a subjective value judgement that only society was qualified to make. If our objective was, as described in JABA, to do something of social importance, then we needed to develop better systems and measures for asking society whether we were accomplishing this objective. The suggestion seemed to be that society would need to validate our work on at least three levels:

1. The social significance of the goals. Are the specific behavioral goals really what society wants?
2. The social appropriateness of the procedures. Do the ends justify the means? That is, do the participants, caregivers and other consumers consider the treatment procedures acceptable?
3. The social importance of the effects. Are consumers satisfied with the results? All the results, including any unpredicted ones?

We have come to refer to these as judgements of social validity. It seems to us that by giving the same status to social validity that we now give to objective measurement and its reliability we will bring the consumer, that is society, into our science, soften our image, and make more sure our pursuit of social relevance.

An example from our own experience in the Achievement Place Research Project is that we were told by many communities that one of the most important characteristics of teaching-parents that they wanted was "warmth". When quizzed about "warmth", the community members indicated that they wanted teaching-parents who "know how to relate to youths". For some time, our response to this request was to disagree with them. We argued, "What you really need is someone who knows how to give and take away points at the right time." But the results of our research (Braukmann, Kirigin, and Wolf, 1976) are tending to support the community's commonsense wisdom about the importance of teaching-parents being able to "relate to youths".

Thus, in order to be responsive to our communities and to our data, one of our challenges became to try to determine the behaviors that teaching-parents need in order to "relate to their youths". "What do some people have that makes kids like them? And how were we going to find out?", we asked ourselves over and over. "Relating" appeared to be such a complex behavioral puzzle of subtle social behaviors that we were not sure how to begin our behavioral analysis. We did have the Jones and Azrin example for measuring "naturalness", and we came upon another method from, of all places, the Rogerian counselling psychologists.

Haase and Tepper published an article in the Journal of Counseling Psychology in 1972 that was a great deal of help to us. Like so many Rogerians, Haase and Tepper were interested in "empathy". They wanted to see if they could find out what nonverbal behaviors of the counsellor were involved in empathy in order to be better able to teach and evaluate counsellors in training. They set up simulated counselling situations that contained various nonverbal components, such as level of eye contact, trunk lean (forward or backward), body orientation (toward the client or rotated away from the client), distance from the client and various levels of "empathic" verbal messages. Videotaped excerpts were then presented to experienced counsellors, who rated the amount of overall empathy presented in each excerpt. It was found that eye contact, trunk lean, distance, and verbal content were all related to the judgements of empathy. One result that really seemed to surprise the authors was that the nonverbal behaviors accounted for more than twice as much of the judgements of empathy as did the verbal behaviors. A counsellor who was saying something only moderately empathic was judged to be highly empathic if he or she were also engaging in eye contact, forward trunk lean, and were positioned close to the client.

Well, it occurred to us that this model could be used to analyze the meaning of all kinds of complex and subjective verbal labels. It also looked like a way to find out what some of the behaviors were that made some teaching-parents better than others in being able to "relate to youths". Alan Willner, with Curt Braukmann, Kathi Kirigin, Dean Fixsen, Lonnie Phillips, and I (Willner et al., 1977) began to attempt to identify the interaction behaviors of teaching-parents in Achievement Place style group homes that the youths liked and didn't like. Alan Willner had several youths look at videotaped examples of a variety of teaching-parent/youth interactions and list the things that they liked and the things that they disliked. These comments were put into categories and then rated by the youths on an A, B, C, D, and F basis. The youths gave A's to the following teaching-parent behaviors: a calm, pleasant voice tone, offers to help, joking, fairness, explanations, concern, enthusiasm, politeness, and getting to the point. F's were given to the following teaching-parent behaviors: throwing objects, accusing, blaming statements, shouting, no opportunity provided to speak, insulting remarks, unfair point exchanges, and profanity. Willner then took some of the highest rated social behaviors, taught them to teaching-parent trainees, and found that youths rated these trainees much higher after the trainees received instruction in the youth-preferred behaviors.²

² Jack Michael (personal communication, 1976) has pointed out that some behaviors, identified as preferred by this method, may have acquired their reinforcing value by their usually being members of chains of behaviors. An example might be offers to help. It is possible that if offers to help were not often followed by providing help, the offers themselves would lose their reinforcing value. Similarly, behaviors described as showing concern may have the same relationship to a more complex chain of behaviors. Thus, there appears to be an important and not, as yet, well understood "sincerity" dimension that should be brought to the attention of anyone who may want to apply these findings. On the other hand, some of the behaviors identified as preferred may not be dependent on later events for their reinforcing value. Examples might be joking and explanations.

One important sidelight of Alan Willner's study was that he was not able to predict the behaviors of the teaching-parents that were going to be most liked by the youths. As a matter of fact, some of the behaviors that he thought would be most important to the youths were never mentioned by them. He still wasn't convinced. After all, maybe the youths just couldn't verbalize these subtle behaviors, which of course was a real possibility. In this case, however, he cross-validated the original behaviors by giving the youths more structured interviews, in which he included more detailed descriptions of the behaviors that he thought should also be important to them. The youths still rated those behaviors as much less important than the ones that they had earlier pointed out as important. This same outcome was found with youths who were not involved in the first set of interviews.

It has become clear to us that we cannot predict very well what many subjective labels of complex behavioral phenomena are going to mean to our judges. Nevertheless, while the task of unravelling those social behaviors that are involved in knowing how to "relate to youths" is incomplete, Alan Willner has taken us closer to that goal.

Another example of the use of the social validation method to examine the social validity of behavioral goals is a study by Neil Minkin, with Curt Braukmann, Bonnie Minkin, Gary Timbers, Barbara Timbers, Dean Fixsen, Lonnie Phillips, and me (Minkin et al., 1976). Neil Minkin wanted to determine what conversational skills of adolescent girls were relevant. He took videotapes of adolescent girls in conversations with adults and of university girls in conversations with adults. Judges from the community were then asked to rate the effectiveness of each of these girls as conversationalists. As might be expected, the community people judged the university girls to be more effective and ranked them higher. Minkin and others reviewed the videotapes of all the university and junior high-school girls several times, and determined that a composite score of three kinds of behavior correlated at the 0.84 level with the ratings given by the community representatives. (The three behaviors were: time spent talking, conversational questions, such as "What are you taking in school?", and positive feedback behaviors such as "Uh huh", "Yeah", and "Great!") In this manner it was possible to isolate many of the behaviors that the community representatives clearly were responding to when they rated overall quality of a conversation.

Another example of the social validation of behavioral goals, conducted by the Achievement Place group, was carried out by Jack Werner, with Neil Minkin, Bonnie Minkin, Dean Fixsen, Lonnie Phillips, and me (Werner et al., 1975). Police exercise a great deal of discretion in handling juvenile offenders. Less than one-fourth of those youths who come into contact with police officers and who could be taken into custody actually are taken into custody. According to Piliavin and Briar (1964), the violation per se is usually less influential in determining the choice of disposition than is the demeanor of the youth. It is often estimated that the social behaviors of the youths account for approximately 50% of all decisions regarding prejudicial handling of youths. Jack Werner wanted to identify some of the important behavioral components of youth-police interactions so that he could teach these to youths. Through informal interviews and then formal questionnaires, Werner and his colleagues identified several apparently important behaviors, including expression of cooperation, body orientation so that the youth was facing the officer, and politeness. Werner found that these behaviors could be reliably measured, thus partially solving the behavioral puzzle of what objectively measurable youth behaviors may influence police officers' decisions about custody.

So, rather than deciding by oneself the validity of the behavioral objectives of a treatment program, we can approach the specific consumer or representatives of the relevant community, and through interviews or ratings determine much more precisely what the socially significant problems are. And, based on the example of Jones and Azrin (1969) and the work of Haase and Tepper (1972), we find that we can establish the social importance or validity of complex classes of behavior that have subjective labels. By supplementing our traditional objective measures, we can determine the relationship between the objectively measured behaviors and the subjective labels. This procedure opens opportunities to explore all of the important goals that are described by subjective labels.

To summarize the method for determining goal behaviors, I quote from Minkin and his colleagues (1976):

"For example, 'affection' might be considered a complex social behavior. If the goal of a behavior analyst were to teach a parent to be more affectionate towards his or her child, it would be necessary to specify the important component behaviors of affection. Some of the components might include touching, smiling, and hugging. To validate the social importance of these behaviors, four steps might be used. First, gathering sample parent-child interactions. Second, developing reliable definitions and recording specific behaviors. Third, employing relevant judges, that is, other parents or children, to rate the sample interactions and evaluate each parent as to the amount of affection shown to the child within the interaction. The evaluation instrument might be a bi-polar rating scale with the poles labelled as to the amount of affection shown. Step four would involve correlating the ratings of the judges with a composite score of the objectively measured behaviors of the parents. The subsequent correlation coefficient would indicate the level of relationship of the specified objectively measured components of affection to the common English 'meaning' of affection as rated by the judges. Some of the important behavioral components of creativity, conversation, and affection, as well as other complex classes of social behaviors, could probably be identified through the use of these social validation procedures."

It is clear that a number of the most important concepts of our culture are subjective, perhaps even the most important. Martin Luther, as the story goes, was severely criticized for setting Protestant hymns to the popular melodies of songs and dances of the time. He replied, "Why should we let the devil have all the best tunes?" Well, why should we let the others have all of the best human goals and social problems?

A second kind of social validity that has impressed its importance on us is the social appropriateness (in terms of ethics, cost, and practicality) of the treatment procedures that we use. Again, behavior analysts are beginning to ask clients and care-givers systematically about the acceptability of their procedures. Foxx and Azrin (1972) found restitution procedures more acceptable to care-givers than timeout or shock punishment. These authors have also reported over-correction to be a re-education procedure that is acceptable to care-givers of the retarded. Janet Porterfield, Emily Herbert-Jackson, and Todd Risley (1976) recently determined that "contingent observation" (that is, having to stop playing and just watch your playmates for several seconds) was not only an effective procedure for reducing the disruptive behavior of young children in a day-care setting, it was also found to be acceptable to the care-givers and to the parents of the children.

Our own data show that ratings by the youths in Achievement Place style homes of the fairness of the program and the concern of the teaching-parents correlate very highly with the number of offenses that the youths commit while they are in treatment (Braukmann, Kirigin, and Wolf, 1976). It may be that not only is it important to determine the acceptability of treatment procedures to participants for ethical reasons, it may also be that the acceptability of the program is related to effectiveness, as well as to the likelihood that the program will be adopted and supported by others.

The third dimension of social validity is the social importance of the effects of behavioral treatment. Are consumers satisfied with the results, all of the results, including those that were unplanned? Behavioral treatment programs are designed to help someone with a problem. Whether or not the program is helpful can be evaluated only by the consumer. Behavior analysts may give their opinions, and these opinions may even be supported with empirical objective behavioral data, but it is the participants and other consumers who want to make the final decision about whether a program helped solve their problems. Many behavior analysts are beginning to validate their objective data with systematic subjective measures of consumer satisfaction.

For example, Ron Kent and Dan O'Leary (1976) found the ratings by teachers and parents of child behavior also improved when their objective data showed increases in appropriate school behavior. Karen Maloney and Bill Hopkins (1973) determined that when they modified the sentence structure of stories written by elementary school children, judges' ratings of creativity also increased. This is to be contrasted with the findings of Tom Brigham, Paul Graubard, and Aileen Stans (1972), who were also attempting to improve quality of composition of school children, and found that some contingencies that increased objective dimensions had little effect on subjective ratings of quality, while other contingencies produced increases in both objective measures and subjective ratings of story quality. Steve Fawcett and Keith Miller (1975) demonstrated that an instructional package designed to enhance public-speaking behavior was effective in producing increases in both the objectively measured public-speaking behaviors and in the audience's ratings of the quality of the performance of the trainees.

We have described the Achievement Place research of Willner, Minkin, and Werner and their colleagues, where judges were used to determine socially valid dimensions of teaching-parent/youth interaction behavior, quality of conversation components, and significant elements in youth-police interaction. In each of those studies, the outcomes were also socially validated. That is, relevant judges were also used to assess the social importance of the changes in the objectively measured behaviors. And it was found that youths rated the quality of the teaching-parents higher, members of the community rated the quality of the youths' conversations higher, and police officers rated the quality of the demeanor of the youths higher as the objectively measured behaviors increased in each case.

At the treatment program level, Curt Braukmann with Dean Fixsen, Kathi Kirigin, Elaine Phillips, Lonnie Phillips, and I (1975) described how feedback from consumers can be used to provide ongoing quality control of the dissemination of the Achievement Place treatment model. The consumers of the program, that is the youths in the program, their parents, and community members and agencies, evaluate the teaching-parents by rating their effectiveness, concern, etc. throughout the year of training and certification, and each year thereafter. It has not been possible to demonstrate experimentally the effectiveness of this feedback system by using it with some programs and not with others because of ethical considerations. But there is one important bit of data. Since this feedback was put into effect, the Achievement Place program has not been summarily "fired" from a community, as in that first attempt at replication. Also, these consumer satisfaction ratings are often highly correlated with objective measures of effectiveness (Braukmann et al., 1976).

Concern for the social validity of objective measures seems to be an issue in other social sciences as well. At the American Psychological Association meeting, Angus Campbell (1976) raised this issue about economics:

"None of us doubts that economic data have admirable qualities: the question is, How well do they represent the quality of national life? How valid are they as measures of the goodness of life in this country? The history of the last 25 years is not reassuring. During this period this country has experienced an unprecedented rise in national affluence, with a spectacular increase in average family income and an associated decline in the number of families below the poverty line. During the same period we have seen a phenomenal rise in the incidence of crime, an epidemic of various forms of public violence, a greatly increased use of drugs with associated drug abuse, a continuing increase in the number of fragmented families, a sharp drop in public confidence in elected officials, and what appears to be a substantial rise in social and political alienation. [I]... find it hard to believe that the quality of American life has been greatly enhanced during this period."

E. F. Schumacher, in his book Small is Beautiful: Economics as if People Mattered (1973), raised this same issue. He urged economists to consider what he terms the "primacy of qualitative distinctions", rather than being so concerned with objective data like the gross national product.

Recently, the Swedish medical sociologists Levi and Anderson (1975) suggested that objective measures that habitually have been used by the United Nations to assess the quality of life be supplemented by subjective measures. They proposed that the traditional objective measures of quality of life, such as education, employment, economy, housing, nutrition, etc. be given equal emphasis with subjective criteria such as "happiness, satisfaction, and gratification". Thus, applied behavior analysts are not the only applied social scientists who are being asked to validate their measures by checking with society.

Well, if social validity is such a good thing, why haven't we been doing more of it all along? Of course, the answer is that subjective data are risky data.

our treatment program, we must be very cau-
Subjective data may not have any tious because we have no adequate way of relationship to actual events. A program that is checking the reliability of the verbal report in described by its consumers as well-liked or effec- an independent way. And as Skinner pointed tive may not necessarily be either pleasant or out, verbal descriptions of private events are effective. Thus, there is the danger that subjec- open to "fictional distortion" (1959). tive data will seriously mislead us. For example, in order to influence consumer For example, Berleman, Seaberg, and Stein- evaluations, it is conceivable that some of those burn (1972) conducted a delinquency preven- being evaluated might politic their consumers tion experiment with carefully matched experi- for better ratings. Similarly, it is conceivable mental and control groups, using intensive that some of those consumers giving ratings one- to two-year treatment by social workers as might fear that they will not remain anonymous the intervention procedure. The evaluation of and be afraid that those they are rating might the effectiveness during the treatment period and retaliate in some manner. One can conceive of during the eight months following treatment in- many such possibilities, but let us remember dicated "no positive impact" on disruptive be- that the reliability of objective measurement havior in school, police contacts, or rate of systems can also be manipulated, as the excel- institutionalization. The untreated control group lent series of studies by O'Leary and Kent and performed as well or better than the experi- their colleagues (O'Leary, Kent, and Kanowitz, mental group. Yet, when asked about their ex- 1975; Kent, O'Leary, Diament, and Dietz, perience in treatment, the youths ". believed.. 1972) have demonstrated. From these studies, that their school acting-out had decreased. 
When it seems clear that the scoring behavior of ob- asked if they would participate in a similar servers can be affected by a variety of variables, service again, 89 percent of the parents re- such as experimenter feedback. We must take sponded positively, as did 94 percent of the these into consideration whenever we design a boys". measurement system that involves observers. Behavioral researchers have reported many Thus, we know that the reliability of objective examples of a lack of correspondence between measurement procedures can be influenced by client-reported data and observer-obtained data. a number of known and probably unknown Patterson (personal communication, 1974) for variables, but we continue to use these systems example, described discrepancies between paren- because they are the only way to obtain some tal reports of improvements in the child's behav- very important data, they often work, and we ior, while objective data obtained by observers feel some confidence that we are gaining a better did not support these claims. Conrad and Wincze understanding of the conditions that may dis- (1976) reported that clients undergoing orgas- tort them. mic reconditioning verbally reported favorable Similarly, we know that social validity mea- results that were not substantiated by the ob- sures can be manipulated and abused, but we jective data. cannot allow this to lead us to neglect them. Why do these discrepancies exist? One pos- Rather, we must establish that set of conditions sibility is that the contingencies of the situation under which people can be assumed to be the create distortion. Verbal behavior, clearly, is a best evaluators of their own treatment needs, manipulable behavior. And we must be sus- procedural preferences, and posttreatment sat- picious of it because we know that we will not isfaction. True, we know little about the proper always understand the contingencies operating set of conditions, but we must attempt them on it. 
When we are asking for a verbal descrip- anyway. We can expect that they will involve tion of a private event, such as satisfaction with education about options, lack of coercion, an- SOCIAL VALIDITY 213 onymity, and so on. We can study the effects of better ways of teaching people to observe their these conditions on subjective data, as O'Leary behavior and their conditions and to make more and Kent and their colleagues have studied their accurate decisions about their improvement. The effects on objective observer-dependent measure- opinion poll people often seem to be able to ment systems. And then we will be better able make excellent predictions about voting behav- to control for them. ior based on verbal report. Surely we can do A second possible explanation for subjective- as well. objective discrepancies is that the consumer is Undoubtedly, there will be further important responding to changes in some behavior or con- studies that point out to us the shortcomings of dition that we are not recording with our par- certain social validity measures, just as has been ticular objective measures. For example, the done for observer-dependent objective mea- parent may say that a child has "improved", sures. But we can't despair. After all, measure- while our behavioral measure of rate of tan- ment has been our thing. In our field, we have trums does not show a decrease. The discrepancy developed so many ingenious measurement sys- may be because the child has stopped cursing, tems. There is no doubt that we could measure which was important to the parent, but not the disruptive classroom behavior of a school measured by us, perhaps because it does not of fish, if need be. Surely, we will be able to bother us. If this lack of appropriate measure- develop measurement systems that will tell us ment is one of the factors in subjective-objective better whether or not our clients are happy discrepancies, then we must become better at with our efforts and our effects. 
setting up our measurement systems. Earlier in our history, Watson and Skinner A third possibility, and the most serious, is argued forcefully against subjective measure- that subjective measurement is impossible be- ment because they were concerned about the in- cause humans cannot judge and report their own appropriate causal roles that hypothetical in- situation accurately enough. It may be that they ternal variables, subjectively reported, were don't know when they are better or worse off. playing in social science. As a result, many of It may be that to expect a human ever to be us concluded that all subjective measurement able to report accurately when something feels was inappropriate. A new consensus seems to good or feels bad is just more than we can hope be developing. It seems that if we aspire to so- for from our confused species. But this conclu- cial importance, then we must develop systems sion is unacceptable if our goal is to design a that allow our consumers to provide us feedback responsive consumer-oriented applied social about how our applications relate to their values, science. As Levi and Anderson (1975) argued to their reinforcers. This is not a rejection of in making their case for adding subjective mea- our heritage. Our use of subjective measures sures to objective quality-of-life indicators: does not relate to internal causal variables. In- "We believe that each individual can be stead, it is an attempt to assess the dimensions of complex reinforcers in socially acceptable and assumed to be the best judge of his own practical ways. It is an evolutionary event that situation and state of well-being. The al- is occurring as a function of the contingencies ternative is some type of 'big brother' who of the applied research environment; contin- makes the evaluation for groups and na- gencies that our founders would probably say tions. 
World history provides many ex- they appreciate, if we had the nerve to ask them amples of such 'expert' or 'elitist' opinions for such subjective feedback on our behavior. being at variance with what was expected by the man in the street." REFERENCES Berleman, W. C., Seaberg, J. R., and Steinburn, T. W. Therefore, we may have to try to develop The delinquency prevention experiment of the 214 MONTROSE M. WOLF Seattle Atlantic Street Center: A final evaluation. Maloney, K. B. and Hopkins, B. L. The modifica- Social Science Review, 1972, Sept., 323-346. tion of sentence structure and its relationship to Boring, E. G. A history of experimental psychol- subjective judgements of creativity in writing. ogy. New York: Appleton-Century-Crofts, 1950. Journal of Applied Behavior Analysis, 1973, 6, Braukmann, C. J., Fixsen, D. L., Kirigin, K. A., Phil- 425-434. lips, E. A. Phillips, E. L., and Wolf, M. M. McMichael, J. S. and Corey, J. R. Contingency Achievement Place: The training and certifica- management in an introductory psychology course tion of teaching-parents. In W. S. Wood (Ed), produces better learning. Journal of Applied Be- Issues in evaluating behavior modification. Cham- havior Analysis, 1969, 2, 79-84. paign, Illinois: Research Press, 1975. Pp. 131- Minkin, N., Braukmann, C. J., Minkin, B. L., Tim- 152. bers, G. D., Timbers, B. J., Fixsen, D. L., Phillips, Braukmann, C. J., Kirigin, K. A., and Wolf, M. M. E. L., and Wolf, M. M. The social validation Achievement Place: The researchers' perspective. and training of conversation skills. Journal of Paper presented at the meeting of the American Applied Behavior Analysis, 1976, 9, 127-140. Psychological Association, Washington, D.C., New Webster's Vest Pocket Dictionary. Otten- September, 1976. heimer Publishers, Inc., 1962. Brigham, T. A., Graubard, P. S., and Stans, A. An- O'Leary, K. D., Kent, R. N., and Kanowitz, J. 
Shap- alysis of the effects of sequential reinforcement ing data collection congruent with experimental contingencies on aspects of composition. Journal hypotheses. Journal of Applied Behavior Analysis, of Applied Behavior Analysis, 1972, 5, 421-430. 1975, 8, 43-51. Campbell, Angus. Subjective measures of well- Piliavin, I. and Briar, S. Police encounters with being. American Psychologist, 1976, 31, 117-124. juveniles. American Journal of Sociology, 1964, Conrad, S. R. and Wincze, J. P. Orgasmic recondi- 70, 206-214. tioning. A controlled study of its effects upon the Porterfield, J. K., Herbert-Jackson, E., and Risley, sexual arousal and behavior of adult male homo- T. R. Contingent observation: an effective and sexuals. Behavior Therapy, 1976, 7, 155-166. acceptable procedure for reducing disruptive be- Fawcett, S. B. and Miller, L. K. Training public- havior of young children in a group setting. speaking behavior: an experimental analysis and Journal of Applied Behavior Analysis, 1976, 9, social validation. Journal of Applied Behavior 55-64. Analysis, 1975, 8, 125-136. Schumaker, E. F. Small is beautiful: economics as Foxx, R. M. and Azrin, N. A. Restitution: A if people mattered. New York: Harper & Row, method of eliminating aggressive-disruptive be- 1973. havior of retarded and brain damaged patients. Skinner, B. F. Science and human behavior. New Behaviour Research and Therapy, 1972, 10, York: Macmillan Co., 1953. 15-27. Skinner, B. F. Cumulative record. New York: Ap- Hasse, R. F. and Tepper, D. T. Nonverbal com- pleton-Century-Crofts, Inc., 1959. ponents of empathetic communication. Journal Watson, John B. Behaviorism. Chicago: The Uni- of Counseling Psychology, 1972, 19, 417-424. versity of Chicago Press, 1930. Jones, R. J. and Azrin, N. A. Behavioral engineer- Werner, J. S., Minkin, N., Minkin, B. L., Fixsen, D. ing: stuttering as a function of stimulus duration L., Phillips, E. L., and Wolf, M. M. Interven- during speech synchronization. 
Journal of Ap- tion package: An analysis to prepare juvenile plied Behavior Analysis, 1969, 2, 223-230. delinquents for encounters with police officers. Kent, R. N. and O'Leary, D. K. A controlled evalu- Criminal Justice and Behavior, 1975, 2, 5 5-83. ation of behavior modification with conduct Willner, A. G., Braukmann, C. J., Kirigin, K. A., problem children. Journal of Consulting and Fixsen, D. L., Phillips, E. L., and Wolf, M. M. Clinical Psychology, 1976, 44, 586-596. The training and validation of youth-preferred Kent, R. N., O'Leary, K. D., Diament, C., and Dietz, social behaviors with child-care personnel. Jour- A. Expectation biases in observational evalua- nal of Applied Behavior Analysis, 1977, 10, 219- tion of therapeutic change. Journal of Consulting 230. and Clinical Psychology, 1972, 42, 774-780. Levi, L. and Anderson, L. Psychosocial stress: Pop- ulation, environment, and the quality of life. Received 15 October 1976. Holliswood, N.Y.: Spectrum Press, 1975. (Final Acceptance 12 August 1977.)
