Language Testing (آزمون‌سازي زبان) - Zeinab Siami (زينب صيامي)

Zeinab Siami. Language Testing, for master's-level language majors. Tehran: Simia, 1396 (first printing). 169 pp. ISBN 978-600-177-493-5.
Publisher: Simia Educational Publishing Institute (www.simia.ir).
All rights reserved by Simia Publications; no part of this work may be reproduced in any form.

Contents
Chapter 1: Introduction
Chapter 2: Purposes of Language Tests
Chapter 3: Forms of Test Items
Chapter 4: Test Construction
Chapter 5: Interpreting Test Results
Chapter 6: Characteristics of a Good Test
Chapter 7: Test Structure
Chapter 8: Vocabulary Tests
Chapter 9: Pronunciation Tests
Chapter 10: Listening Comprehension Tests
Chapter 11: Speaking Tests
Chapter 12: Reading Comprehension Tests
Chapter 13: Writing Tests
Chapter 14: Assessing Overall (Global) Language Ability

Chapter 1: Introduction

What is evaluation?
The process of gathering information for making decisions is called evaluation.
Evaluation can be qualitative, quantitative, or both.
Qualitative evaluation is based on observations and (non)verbal descriptions, such as letters of reference or general impressions, which are subjective.
Subjective evaluation that is used as feedback to make modifications during a certain process is called formative evaluation.
Quantitative evaluation relates to objective information obtained through measurement.
When evaluation involves quantitative information, it is called summative evaluation.
Summative evaluation is for the purpose of reporting on the quality of a certain process when it has already been completed.

What is measurement?
Measurement refers to the process of quantifying the characteristics of individuals according to explicit rules and procedures.
Measurement has two requirements: first, there needs to be a set of clear objectives for measuring the attribute or property; second, the attribute or property must be quantifiable.
The teacher can only infer through measurement that learning has taken place.
In education, measurement comprises ratings, rankings, and measuring instruments called tests.
Rating and ranking involve an evaluative summary of past or present experiences for the purpose of making a final judgement.
Rating and ranking rely on the personal opinion and judgement of the teacher or rater.
A test refers to any kind of device or procedure for measuring performance or ability.

What is a test?
Any systematic procedure for eliciting information on a specific sample of an individual's or a group's behavior is called a test.
The term "quiz" refers to something short and informal that relates to the points covered in the assignment and the previous class session.
A quiz is for the purpose of helping the learners to become familiar with the format of the test that is to come next.
A test covers a greater portion of the materials that are taught in the course and is hence a longer and more carefully prepared series of items.
A test is usually used for major periods such as the middle or end of the term, i.e. the midterm test or the final test.
When a group of several comparable tests is used, it is called a battery.
The tests of a battery can be used individually or in combination.
An examination is more comprehensive and complex than a test.
An examination includes a number of specially selected tests that are employed together to predict a single ability or trait.

Why do we test?
Teachers want to be sure that the learners have understood what has been studied and to discover how much more teaching and learning is required.
Teachers want to assign grades and to award certificates of competence.
Teachers want to identify the learners' ability in order to give them the type of education that they need and that matches their level of ability.
Teachers want to determine the extent to which the learners have benefited from instruction. They would like to diagnose the learners' strengths and weaknesses.
Teachers want to identify the learners' likely performance in the future.
Teachers want to determine if the objectives of the course have been achieved and if the objectives are, in fact, attainable.
Teachers want to know how effective they themselves have been and how effective their teaching methods have been.
Tests provide learners with the incentive to study steadily. Frequent announced or unannounced quizzes motivate the learners to study systematically and stay with the class.
Tests provide learners with a sense of accomplishment.
Testing indirectly acts as a way of learning.
Tests help learners to obtain an objective, independent estimate of their progress and to compare themselves with their peers.

Who should prepare classroom tests?
Language teachers need to be quite knowledgeable in three areas: the language they teach, the procedures for test construction, and the techniques for interpreting test results.
If a teacher constructs his or her own test, it is known as a teacher-made test.
If a teacher makes use of a test that has been published for general use, it is known as a standardized test.

Chapter 2: Purposes of Language Tests

Defining the purpose of testing is necessary because the purpose for which a test is constructed directly determines its rationale, design, and use, and the interpretation of its results.

Attainment and prognosis
Tests are used for two fundamental purposes: attainment and prognosis.
Attainment tests relate to what a person can do.
Prognostic tests relate to what he or she will be able to do.

Evaluation of attainment
The purpose of attainment testing is to determine an individual's current level of ability.
Depending on measurement rationales and techniques, three different purposes are identified in this category: achievement, proficiency, and knowledge.

Achievement
A general achievement test helps teachers to know how much their students have learned during the course and how successful they themselves have been.
Sometimes teachers attempt to measure portions of the materials taught in the course. This use of achievement tests, as in midterm tests, is referred to as progress testing.
A mastery test is employed for awarding certification of competence when the candidate has satisfied the minimum requirements.
A diagnostic test is employed for the purpose of identifying what has already been learned, what has not been learned yet and why, and what needs to be taught or reviewed.
Diagnostic tests differ from achievement tests in that the former answer what the students know and should help to answer why they do (not) know something, whereas the latter answer how much the students know.
Evaluation should be carried out both during and at the end of the instructional program.

Proficiency
A proficiency test is for the purpose of measuring global competence in a language regardless of any training the testees may have had.

Knowledge
Language is a direct representation of the heritage of its speakers.
A test used for assessing knowledge of culture and literature (and, more broadly speaking, knowledge of subject-matter courses such as physics, chemistry, mathematics, and history) is called a knowledge test.

Prognostic evaluation
Prognostic evaluation is related to making predictions about the acceptance or non-acceptance of applicants to a program.
In language teaching programs, prognostic evaluation relates to selection and placement tests.

Selection
A test given for the purpose of screening applicants is called a selection test or an entrance test.
A test that is used to determine whether or not the students are ready for instruction is called a readiness test.
In a competition test, applicants are accepted according to their total scores, up to the total number of students that the universities can serve.
An aptitude test indicates the potential capacity of the learners and serves a predictive function; it does not focus on past learning.

Placement
A test that is employed for the purpose of grouping students is called a placement test.
Unlike selection tests, placement tests have no pass or fail.

Multiple purposes
A readiness test that is employed for the purpose of determining whether the students possess the prerequisite skills can, in addition, tell the test user who is in need of remedial instruction.
A placement test that is designed to sort new students into teaching groups can also help to distinguish weak students from strong ones.

Speed/power tests
A speed test aims at determining the speed with which the testees perform certain tasks.
A power test is one in which the purpose is to determine how much an individual is able to do.
In a power test, tasks are ordinarily arranged in order of increasing difficulty.

Chapter 3: Forms of Test Items

A test is a collection of items.

Subjective vs. objective items
A subjectively scored item, or subjective item, may have more than one acceptable response.
An objectively scored item, or objective item, has only one correct answer.
An objective item can be scored mechanically, by a computer or by any individual who has no competence in the field under evaluation.
Objective item forms may be divided into two classes:
- Items which the subjects must answer by selecting from among given responses are called selection forms.
- Items for which the testees must supply the answer are called supply forms.
Selection forms comprise true-false, multiple-choice, and matching forms.
Since selection forms measure recognition and comprehension, they are also called recognition or comprehension forms.
Completion and short-answer items are examples of the supply or production form.
The distinction between subjective and objective items relates to the manner of scoring only.

Short-answer item
The short-answer item involves asking the testees an open-ended question that can be answered by a word, phrase, or number.
The short-answer item form is most suitable for informal classroom testing.
The short-answer item works better with younger learners.
The short-answer item form is easy to prepare and adaptable to various topics.

Completion item
The completion form appears useful in knowledge testing.

True-false item
The true-false form comprises a statement to be judged true or false.
The true-false form of item is easy to prepare and can be answered quickly.
True-false items are appropriate for measuring the recognition of factual information.
True-false items have two limitations:
- When an item presents a false statement, we are exposing the subjects to information that is false.
- There is the matter of guessing.
In preparing true-false items, several considerations are in order:
- First, only a single point should be tested in each item.
- Second, the items should be randomly ordered to avoid response patterns that serve as strong cues.

Multiple-choice item
A multiple-choice item consists of a lead, or stem, and three or more choices, or alternatives.
The lead may be an introductory question or an incomplete statement.
A good multiple-choice item should be worded in such a way that it has only one acceptable response.
One of the advantages of multiple-choice items is that they lend themselves readily to systematic study.
Another advantage of multiple-choice items is that they can be scored clerically or by a machine.
Multiple-choice items are difficult to write and highly time-consuming.
There is the problem of guessing in multiple-choice items.
Multiple-choice items are by far the most popular objectively scored form.
Eight general directions for writing items are:
- Write each item as a separate entity; each should function as a whole and deal with only one central thought.
- The point in each item should concern fundamental concepts and purposes.
- Linguistically speaking, each item should provide the greatest economy in the use of language.
- Write your items in a positive form. Use the negative format sparingly.
- Adapt the level of each item, and in turn of the test, to the testees' level of ability and the purpose of the test.
- Use plausible distractors that are thoroughly wrong, yet attractive enough to the poorly prepared testees.
- Avoid opposites or overlapping alternatives. Likewise, avoid choices containing irrelevant cues to the correct choice.
- Prepare a defensible response that expert critics can agree on as the best choice.

Matching item
A matching item involves associating two things. It requires the testees to pair terms with definitions, dates with events, or persons with events.
The matching form has two basic shortcomings:
- It is difficult and time-consuming to construct.
- The matching item cannot be used for eliciting all types of information.

Essay item
The essay-type item measures the testees' ability to think about and produce what they know.
Essay items are not useful when the sole purpose of testing is knowledge testing.

Interview
The interview is the most popular of all oral tests.
An interview consists of a direct, face-to-face encounter between the interviewee and the examiner.
The interview test has been used for a number of purposes, such as proficiency, achievement (both general and diagnostic), and research.

Some advantages of the interview
Because of the direct interaction between the examiner and the examinees, the interview test is more humane than written tests.
It allows testing personal characteristics that would be impossible to evaluate on a written test, such as appearance, manner, personality, and speech quality.
When used in knowledge testing, the interview permits a flexibility that written tests do not.
Interviews are time-consuming, expensive, and subjective.

Overview
The use of multiple-choice items is justified when testing needs to be done extensively or repeatedly.
For a one-time test to be used with a small group, the completion form or the short-answer form is suitable.
The washback, or backwash, of a test is the effect it has on the learning and teaching that precede or follow it.

Chapter 4: Test Construction

In constructing a test, the test developer faces two tasks: deciding what to measure and deciding how to measure it.
A test constructor needs to follow four steps: planning, writing, reviewing, and pretesting.

Planning
Planning is an integral part of test development.
Planning involves defining the test purpose, preparing an outline of the test content, and deciding on the type(s) of items to be used, the difficulty level of the test, the directions to the testees, etc.
When the number of examinees is small (e.g., 20) and the test is not to be reused, the completion or composition format is quite reasonable.
Variety in item form has two advantages. First, it makes the test interesting. Second, it diversifies the tasks to make measurement of all relevant abilities possible.
There is certainly no fixed number of items in a test.
A weekly quiz should be short because it covers limited material.
The time required to complete test items will also vary according to the complexity, content, and form of the items.
For each multiple-choice item testing structure and vocabulary, half a minute is sufficient, including the time to read the directions.
For reading comprehension items, one minute for each item is adequate.
Fill-in problems can each probably be answered in one minute.

Writing
The individual who writes the items should possess four characteristics: he or she should (1) be experienced in test construction, (2) be quite knowledgeable about the content area of the test, (3) be able to use language clearly and economically, and (4) be ready to sacrifice time and energy.
The quality of a test depends on the meaningfulness, truthfulness, and relevance of the statements of ideas.
The chance of guessing the correct answer is one in two for a true-false item.

Reviewing
When test items have been written, they need to be reviewed before they are tried out.

Pretesting
A detailed item-by-item analysis of the results is technically called an item analysis.

The goals of pretesting
To identify poor or defective items that need improvement and to find nonfunctioning or implausible alternatives.
To determine the facility level and the discrimination power of each item.
To discover weaknesses in the directions and to determine the appropriate time limits for the test.
The item facility index shows the percentage of subjects who answered the item correctly.
The item discrimination index shows whether the item discriminates between the better and the poorer subjects.
Facility and discrimination indices are related both to the properties of the items and to the ability of the sample testees responding to the items.

Chapter 5: Interpreting Test Results

What is a test score?
A score that an individual testee obtains on a test, called a raw score, is not by itself interpretable.

Basic statistical concepts

Frequency distribution
When we have a rather large number of scores, interpretation becomes easier if the scores are organized into groups. Such scores are called grouped scores or grouped data.
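Grouping raw scores can be illustrated with a minimal sketch. It assumes integer scores and a class-interval width of 5; the function name `group_scores`, the width, and the sample scores are all illustrative assumptions, not values from the text.

```python
from collections import Counter

def group_scores(scores, width=5):
    """Count how many raw scores fall into each class interval of the given width."""
    # Map each score to the lower bound of its interval, e.g. 17 -> 15 when width is 5.
    counts = Counter((score // width) * width for score in scores)
    return {(low, low + width - 1): counts[low] for low in sorted(counts)}

# Made-up raw scores for illustration only.
scores = [12, 14, 15, 15, 17, 18, 18, 18, 19, 20]
for (low, high), frequency in group_scores(scores).items():
    # One asterisk per score gives a rough text version of a bar graph.
    print(f"{low}-{high}: {'*' * frequency} ({frequency})")
```

Each printed row is one group with its count, which is usually easier to read than the unsorted list of raw scores.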
The organized form of groupings is called a frequency distribution.
A frequency distribution shows the frequency of the different scores grouped together.
The data can be shown by a bar graph (also called a histogram) or by a line graph (called a frequency polygon).

Normal curve
When an infinite number of random scores are used, the resulting frequency line graph takes the form of a bell-shaped curve. It is called the normal probability curve or simply the normal curve.
When we move away from the middle to either the right or the left, the pile drops off.
The normal curve is a theoretical distribution.
A normal distribution polygon is symmetrical, meaning that it can be divided into two equal halves.
The baseline in a normal curve is divided into eight equal units marked ±1SD, ±2SD, and so on, from the zero point. These units are called standard deviations.

Measures of central tendency
The midpoint on the baseline in a normal curve is the center of the distribution.
The center of the distribution in a normal curve represents the most frequent score in the distribution.
The three most widely used measures of central tendency are the mode, the mean, and the median.
The mode is the score that occurs most frequently in a distribution of scores.
The mean refers to the arithmetic average of all the test scores.

$$\bar{X} = \frac{\sum X}{N}$$

(in which $\bar{X}$ is read as "X-bar")
• X = raw score
• ∑X = summation of raw scores
• N = total number of scores

The median refers to the midpoint in the score distribution.
In a normal curve, all three measures of central tendency are at the midpoint.
When there are many high scores, the distribution is skewed to the left (negatively skewed).
Skewness refers to a piling of scores at one end and a long tail at the other.
When the scores are not evenly distributed, the median represents a better index of centrality.
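The three measures of central tendency can be computed with Python's standard `statistics` module. This is only an illustrative sketch: the helper name `central_tendency` and both score lists are hypothetical.

```python
from statistics import mean, median, multimode


def central_tendency(scores):
    """Return the mean, the median, and the mode(s) of a list of raw scores."""
    return {
        "mean": mean(scores),        # arithmetic average: sum of X divided by N
        "median": median(scores),    # midpoint of the ordered scores
        "modes": multimode(scores),  # most frequent score(s); a list because ties are possible
    }


if __name__ == "__main__":
    # Roughly symmetrical hypothetical scores.
    print(central_tendency([10, 12, 12, 13, 15, 16, 20]))
    # Many high scores and one very low score: the tail points to the left.
    print(central_tendency([5, 17, 18, 18, 19, 19, 19, 20]))
```

In the second list the single low score pulls the mean below the median, which is why the median is the safer index of centrality for a skewed distribution.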
Measures of variability
The range is the difference between the maximum and the minimum scores.
A more stable measure of variability is the standard deviation.

$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$$

• S = standard deviation

The larger the deviations, the more varied the scores are.
Variance is the sum of the squared deviation scores divided by N − 1.

$$V = S^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$$

• X = any observed score in the sample
• $\bar{X}$ = the mean of all scores
• N = number of scores in the sample
• V = variance ($V = S^2$)

Derived scores
Percentile scores and standard scores are examples of derived scores.

Percentile scores
A percentile score describes the relative standing of a raw score in a sequence of scores.
Percentile units are not equal.

$$\text{Percentile score} = 100 \times \frac{cf}{N}$$

• N = number of scores in the distribution
• cf = cumulative frequency: the total number of cases within the score group and the one(s) below it

Standard scores
When the mean is set at 0.0 and the standard deviation at 1.0, the resulting score is called a z-score.
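The variability measures and the percentile score defined above can be sketched in Python as follows. The function names and sample scores are hypothetical, and the percentile function assumes ungrouped scores, so cf is simply the count of scores at or below the given raw score.

```python
from math import sqrt


def score_range(scores):
    """Difference between the maximum and the minimum score."""
    return max(scores) - min(scores)


def variance(scores):
    """Sum of squared deviations from the mean, divided by N - 1."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)


def standard_deviation(scores):
    """Square root of the variance."""
    return sqrt(variance(scores))


def percentile_score(scores, raw):
    """100 * cf / N, where cf counts the scores at or below the raw score."""
    cf = sum(1 for x in scores if x <= raw)
    return 100 * cf / len(scores)


if __name__ == "__main__":
    sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical raw scores
    print("range:", score_range(sample))
    print("variance:", round(variance(sample), 3))
    print("standard deviation:", round(standard_deviation(sample), 3))
    print("percentile score of 5:", percentile_score(sample, 5))
```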
A z-score is computed from the raw score, the mean, and the standard deviation:

$$z = \frac{X - \bar{X}}{S}$$

• X = raw score
• $\bar{X}$ = mean
• S = standard deviation

Correlation
The index which indicates the degree of a relationship is called the correlation coefficient.
The most appropriate way to compute a correlation coefficient for interval scores is the Pearson product-moment correlation coefficient.

$$r_{xy} = \frac{N\sum XY - \sum X \sum Y}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}}$$

Here N is the number of paired scores.
When the data are available in ordinal or ranked form, it is appropriate to employ the Spearman rank-order correlation coefficient, rho (ρ):

$$\rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)}$$

Here D is the difference between the two ranks of the same individual.

Standard error of measurement
SEM is the abbreviation for standard error of measurement.

$$SEM_x \cong \sqrt{\frac{x(n - x)}{n - 1}}$$

• $SEM_x$ = SEM for an individual's score
• n = number of items in the test
• x = individual's observed score
• ≅ = an approximation

Non-statistical factors affecting test scores
There are some factors affecting test scores that are not statistical in nature, such as the effects of guessing, practice, coaching, ceiling effects, test compromise, test method characteristics, and test taker attributes.
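Before moving on to these non-statistical factors, the z-score, correlation, and SEM formulas of the preceding subsections can be sketched in Python. All function names and sample data are hypothetical. The Spearman function assumes the ranks are already assigned and contain no ties.

```python
from math import sqrt


def z_scores(scores):
    """Convert raw scores to z-scores: (X - mean) / S, with S based on N - 1."""
    n = len(scores)
    mean = sum(scores) / n
    s = sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))
    return [(x - mean) / s for x in scores]


def pearson_r(xs, ys):
    """Pearson product-moment correlation from the raw-score formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def spearman_rho(ranks_x, ranks_y):
    """Spearman rank-order correlation: 1 - 6 * sum(D^2) / (N * (N^2 - 1))."""
    n = len(ranks_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


def sem_individual(x, n_items):
    """Approximate SEM for one testee: sqrt(x * (n - x) / (n - 1))."""
    return sqrt(x * (n_items - x) / (n_items - 1))


if __name__ == "__main__":
    print([round(z, 2) for z in z_scores([2, 4, 4, 4, 5, 5, 7, 9])])
    print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
    print(spearman_rho([1, 2, 3, 4, 5], [2, 1, 3, 5, 4]))
    print(round(sem_individual(40, 50), 3))
```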
Effects of guessing
The effects of guessing become severe when the number of choices is very limited, as in the case of true-false items, when the test is short, when the test is speeded, or when the items are poorly constructed.

$$\text{Corrected score} = R - \frac{W}{n - 1}$$

• R = right answers
• W = wrong answers
• n = number of answer choices per item

For example, 30 right and 12 wrong answers on four-choice items give a corrected score of 30 − 12/3 = 26.

Effects of practice and coaching
A test score may be improved by coaching or teaching to the test.

Test taker attributes
Individual characteristics such as cognitive style and group characteristics such as sex and ethnic background affect test performance.
Temporary characteristics of the test taker, such as emotional state and mental alertness, may affect his or her performance on a language test.

Interpretation of test results
To attain interpretable results, two ways of interpretation are identified: norm-referenced and criterion-referenced.
If we compare the score of a testee to those of others, this would be norm referencing.
If we interpret a testee's performance by comparing it to some specific criterion, this would be criterion referencing.

Chapter 6
Characteristics of a good test

For a test to display dependable results, four attributes are essential: validity, reliability, efficiency, and relevance.
Validity indicates the extent to which the test measures what we actually wish to measure.
Reliability shows how accurately and precisely the test measures what it is intended to measure.
Efficiency is concerned with the feasibility of the test in terms of economy, convenience, and interpretability of the results.
Relevance concerns the closeness of agreement between what the test measures and the function that it is used to measure.

Validity
Validity is the single most important attribute of a good test.
Validity can be thought of as the truthfulness of the measuring tool.
There are different kinds of validity, of which the following are the most common: face, content, criterion-related, and construct.
For an achievement test, content validity is the most important kind of validity.
For an aptitude test, evidence of criterion-related validity is essential.

Face validity
Face validity refers to how the test appears to the testees, to the teacher, to the administrator, and to the testing expert.

Content validity
Content validity is basically concerned with the relevance of the test items to the purpose of the test.

Criterion-related validity
Criterion-related validity refers to the extent to which test scores correlate with a relevant, reputable outside criterion.
There are two types of criterion-related validity: concurrent and predictive.
Concurrent validity is obtained by correlating test scores with the same subjects' scores on a recognized measure taken at the same time.
Predictive validity relates to the comparison of the test performance with the same subjects' scores on a criterion taken at a later date.
Criterion-related validity is the best proof of validity.
The most common procedure for reporting criterion-related validity is the Pearson product-moment correlation.
When the number of scores in the distribution is small, the Spearman rank-order method is used.
A perfect positive relationship between the test and the criterion would be represented by a coefficient of +1.0.
A perfect negative relationship is represented by −1.0, and lack of relationship by a coefficient of 0.0.

Construct validity
Construct validity refers to the extent to which a test measures a certain trait or theoretical construct.

Reliability
Reliability refers to the accuracy of measurement and the consistency of the results.
A measurement is reliable when similar results are obtained in repeated testings.
A reliability index of 1.0 indicates perfect reliability.
A reliability of 0.0 indicates that the test has no reliability.
For a teacher-made test, a reliability of 0.60 and above is adequate.

Test-retest reliability
If a test is given twice to the same subjects and it yields similar results on the two administrations, the test is reliable.
There are practice effects in addition to memory and administrative problems.
Alternate-forms reliability
Alternate forms should have different items measuring the same points, presumably equal in facility and discrimination.

Split-half reliability
For purposes of calculating the split-half reliability, a test is divided arbitrarily into two halves (e.g., odd- and even-numbered items), and two scores for each testee are obtained, one for each half.
The reliability of the whole test is corrected through the Spearman-Brown formula given below:

$$\text{Reliability of total test} = \frac{2 \times (\text{reliability of half test})}{1 + (\text{reliability of half test})}$$

Rational-equivalence reliability
Kuder-Richardson Formula 21 (K-R 21) is a simple way of calculating approximately the degree of correlation among test items.

$$r_{K\text{-}R21} = \frac{n}{n - 1}\left[1 - \frac{\bar{X} - \bar{X}^2 / n}{SD^2}\right]$$

• $\bar{X}$ = mean score on the test
• n = number of test items
• SD = standard deviation of the scores

Sources of unreliability
An individual's observed score comprises a true score and an error score.
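The Spearman-Brown and K-R 21 formulas given earlier in this section can be sketched in Python. The function names and the numbers in the usage lines (a half-test reliability of 0.60, a 50-item test with a mean of 30 and a standard deviation of 8) are hypothetical values chosen for illustration.

```python
def spearman_brown(r_half):
    """Full-test reliability from the correlation between the two half tests."""
    return 2 * r_half / (1 + r_half)


def kr21(n_items, mean_score, sd):
    """Kuder-Richardson Formula 21 from the item count, mean, and standard deviation."""
    # The fraction compares the mean-based term (mean - mean^2 / n) with the score variance.
    ratio = (mean_score - mean_score ** 2 / n_items) / sd ** 2
    return (n_items / (n_items - 1)) * (1 - ratio)


if __name__ == "__main__":
    print(round(spearman_brown(0.60), 3))
    print(round(kr21(50, 30, 8), 3))
```

K-R 21 needs only the number of items, the mean, and the standard deviation, which is why it is described as a simple approximate method.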
A true score refers to that portion of the obtained score that is not affected by random error.
Some factors contribute to test inaccuracy: characteristics of the testee(s) and characteristics of the test itself.
The temporary characteristics of the subject affect his or her performance: poor luck at guessing, problems in concentrating, poor health, lack of practice, fatigue, and the like reduce reliability.
The second source of unreliability is the characteristics of the test itself.
Scorer reliability is nearly perfect in the case of multiple-choice items.

Efficiency
Efficiency refers to the practical characteristics of a test such as costs, the amount of time it takes to construct and to administer, ease of scoring, and ease of interpreting and reporting the results.

Relevance
The concept of relevance corresponds more or less to that of content validity.
An item is relevant if it contributes to the validity of the test.
Relevance has three aspects: balance, specificity, and fairness.
If a test representatively samples all the important aspects of what needs to be tested, it is balanced.
Specificity requires the test constructor to focus on constructing items that tap special components of the content of the test.
A test that relates closely to the materials taught is fair to the testees.
Administration of a test also affects its fairness.

Chapter 7
Testing structure

Grammatical structure is the most popular component in language tests because it permeates all language skills.
Grammatical structure is easier to test than other components.
Most experts agree on what must be included in structure tests.
Grammatical structure tests for native speakers of English concentrate on the structures of the written language.
Structure tests for EFL learners are concerned with the structural patterns suitable for communicative purposes.
Only for the most advanced testees do tests of structure aim at testing knowledge of the grammatical system of formal discourse.
The scrambled procedure is popular with younger learners.
Not many points can be tested by puzzle-solving tasks, especially with intermediate and advanced learners.
Structure items in the form of short-answer questions or in supply forms are good for informal classroom tests or for tests where the number of testees is limited.

Guidelines for item preparation
The context must be meaningful and as natural as possible.
The lead should be brief, clear, and straightforward, with no vocabulary that is unfamiliar to the subjects.
The stem should provide sufficient context. There is no fixed rule regarding the length of context.
The distractors in multiple-choice items should be plausible.
Each item should have only one acceptable or clearly best answer.
The alternatives should be brief (economical) and to the point.
The options should be of similar length.
Each item should test only one point.

Summary
From the structuralists' description of language, language testing borrowed the hierarchical analysis of language.
From psychometrics, language testing borrowed the objective test form and the methodology for test development.
From psychology, language testing borrowed the idea that behaviour is the sum of its parts.

Chapter 8
Vocabulary tests

The goal of testing vocabulary is to assess the subject's knowledge of lexical items.
In the case of achievement testing, the lexical items are chosen from the instructional materials.
When testing language proficiency, selection of the lexical items is a difficult task.
Passive vocabulary relates to words that the subjects recognize in written or oral stimuli but may not use in speaking or writing.
Active vocabulary concerns words of which subjects have full command and which they use frequently in speech and writing.
Only content words (nouns, verbs, adjectives, adverbs) are included in vocabulary tests.
Function words are included in structure tests.
At the elementary level, vocabulary test items should contain basic words like the names of things.
At the intermediate level, words that are essential in oral communication should be included in lexical items.
At the advanced level, the words should be chosen from the lexicon of the written language.
The test designer has to take into account the frequency, scope, and availability of the words to be included in the test.
Consulting word lists has some limitations:
• They are often outdated.
• They are based on data collected from the written language.
• They classify words according to relative frequency rather than difficulty.
• They do not indicate the frequency of the various meanings of the words.
• They do not show the difficulty level of the words.

Guidelines for item preparation
After the lexical items have been selected, the second task of the test constructor is to determine the form of the test items.
A test item with an underlined word and four supplementary choices has three disadvantages:
• It limits testing to only one word in each test item.
• Lexical items do not lend themselves to four sensible paraphrases.
• It allows the testees to ignore the whole context and go straight to the meaning of the word being tested.
An item form that is generally very popular in vocabulary tests is the so-called standard vocabulary form.
The standard vocabulary form presents a very brief definition and asks the testees to pick one of the four choices.
The standard vocabulary form is very economical, but it has a backwash effect.

Guidelines for item construction
The context should be clear enough to provide the testees with a clear meaning.
Items should not include grammatical structures or other erroneous sources of difficulty that the testees may find hard to comprehend.
If the item being written is of the paraphrase type, the choices should be easier than the word being tested.
If the item is of the completion type, the distractors and the word being tested should be of the same level of difficulty.
The choices should be related to the same general topic or area.

Chapter 9
Pronunciation tests

Suprasegmentals are more critical for intelligibility than segmentals.

Recognition
Testing recognition of sounds, stress, and intonation can best be accomplished through multiple-choice and true-false items.

Sound discrimination
Sounds can be tested through pictures or in isolation from their referents.
Pictorial items are particularly useful with beginners and children.
An item with an oral stimulus consisting of a set of three or more words, in which the examinee has to identify the different one, is easy to prepare and administer and can be conveniently used.
Stress recognition
Stress has traditionally been tested in isolation.

Intonation recognition
Two formats are common for testing intonation:
• The examiner reads two or more sentences and asks the examinees to indicate which one is different.
• The testee hears a sentence and is asked to choose the meaning from among three or more choices.
The problems of testing the production of segmental and suprasegmental phonemes are due to the spoken response, which complicates test administration and scoring.
The best way to test one's ability to produce the phonemes of a language is through an interview test, but it is not the easiest.

Imitation
Depending on the purpose of the test and the proficiency level of the testee, vowels, diphthongs, vowel reduction, consonants, assimilation, consonant clusters, stress, and intonation can be evaluated with this method.
A limitation of the imitation procedure is that the ability to imitate a given sound right after hearing it may not match the ability to produce it with similar precision when the model is absent.
Reading aloud
In this form, the examinee reads aloud a set of words, sentences, or a passage of connected discourse.

Retelling
In this format, the examinees are asked to retell a story or an anecdote they are given to read prior to being tested.

Talking about pictures
Pictures are used to elicit verbal responses.

Guidelines for item construction and scoring
The material should represent informal spoken English with words of very high frequency.
Not all sounds or stress patterns need to be included in pronunciation tests.
Testing the phonemes of the language in isolation is far from any real-life activity.
Pictures must be simple and free from any ambiguity generated by a difference in cultural background, age, or socioeconomic status.
Production tests should be administered to examinees individually.
Scoring tests of production demands a criterion.
Chapter 10
Listening comprehension tests

Introduction
Listening comprehension is one of the most fundamental language skills an
