Chapter 14
Research on Classroom Summative Assessment
Connie M. Moss

Assessment is unquestionably one of the teacher’s most complex and important tasks. What teachers assess and how and why they assess it sends a clear message to students about what is worth learning, how it should be learned, and how well they are expected to learn it. As a result of increased influences from external high-stakes tests, teachers are increasingly working to align their classroom assessments (CAs) with a continuum of benchmarks and standards, and students are studying for and taking more CAs. Clearly, high-stakes external tests shape much of what is happening in classrooms (Clarke, Madaus, Horn, & Ramos, 2000). Teachers design assessments for a variety of purposes and deliver them with mixed results. Some bring students a sense of success and fairness, while others strengthen student perceptions of failure and injustice. Regardless of their intended purpose, CAs directly or indirectly influence students’ future learning, achievement, and motivation to learn.

The primary purpose of this chapter is to review the literature on teachers’ summative assessment practices to note their influence on teachers and teaching and on students and learning. It begins with an overview of effective summative assessment practices, paying particular attention to the skills and competencies that teachers need to create their own assessments, interpret the results of outside assessments, and accurately judge student achievement. Then, two recent reviews of summative assessment practices are summarized. Next, the chapter reviews current studies of summative CAs, illustrating common research themes and synthesizing prevailing recommendations.
The chapter closes by drawing conclusions about what we currently know regarding effective CA practices and highlighting areas in need of further research.

Setting the Context: The Research on Summative Classroom Assessments

Assessment is a process of collecting and interpreting evidence of student progress to inform reasoned judgments about what a student or group of students knows relative to the identified learning goals (National Research Council [NRC], 2001). How teachers carry out this process depends on the purpose of the assessment rather than on any particular method of gathering information about student progress. Unlike formative or diagnostic assessment, summative assessment serves to determine the student’s overall achievement in a specific area of learning at a particular time, a purpose that distinguishes it from all other forms of assessment (Harlen, 2004).

SECTION 4 Summative Assessment

The accuracy of summative judgments depends on the quality of the assessments and the competence of the assessors. When teachers choose formats (i.e., selected-response [SR], observation, essay, or oral questioning) that more strongly match important achievement targets, their assessments yield stronger information about student progress. Test items that closely align with course objectives and actual classroom instruction increase both content validity and reliability so assessors can make good decisions about the kind of consistency that is critical for the specific assessment purpose (Parkes & Giron, 2006). In assessments that deal with performance, reliability and validity are enhanced when teachers specifically define the performance (Baron, 1991); develop detailed scoring schemes, rubrics, and procedures that clarify the standards of achievement; and record scoring during the performance being assessed (Stiggins & Bridgeford, 1985).

Teachers’ Classroom Assessment Practices, Skills, and Perceptions of Competence

Teacher judgments can directly influence student achievement, study patterns, self-perceptions, attitudes, effort, and motivation to learn (Black & Wiliam, 1998; Brookhart, 1997; Rodriguez, 2004). No serious discussion of effective summative CA practices can occur, therefore, without clarifying the tensions between those practices and the assessment competencies of classroom teachers. Teachers have primary responsibility for designing and using summative assessments to evaluate the impact of their own instruction and gauge the learning progress of their students. Teacher judgments of student achievement are central to classroom and school decisions including but not limited to instructional planning, screening, placement, referrals, and communication with parents (Gittman & Koster, 1999; Hoge, 1984; Sharpley & Edgar, 1986).

Teachers can spend a third or more of their time on assessment-related activities (Plake, 1993; Stiggins, 1991, 1999).
In fact, some estimates place the number of teacher-made tests in a typical classroom at 54 per year (Marso & Pigge, 1988), an incidence rate that can yield billions of unique testing activities yearly worldwide (Worthen, Borg, & White, 1993). These activities include everything from designing paper–pencil tests and performance assessments to interpreting and grading test results, communicating assessment information to various stakeholders, and using assessment information for educational decision making. Throughout these assessment activities, teachers tend to have more confidence in their own assessments than in those designed by others. And they tend to trust their own judgments rather than information about student learning that comes from other sources (Boothroyd, McMorris, & Pruzek, 1992; Stiggins & Bridgeford, 1985). But is this confidence warranted?

The CA literature is split on teachers’ ability to accurately summarize student achievement. Some claim that teachers can be the best source of student achievement information. Effective teachers can possess overarching and comprehensive experiences with students that can result in rich, multidimensional understandings (Baker, Mednick, & Hocevar, 1991; Hopkins, George, & Williams, 1985; Kenny & Chekaluk, 1993; Meisels, Bickel, Nicholson, Xue, & Atkins-Burnett, 2001). Counterclaims present a more skeptical view of teachers as accurate judges of student achievement. Teacher judgments can be clouded by an inability to distinguish between student achievement and student traits like perceived ability, motivation, and engagement that relate to achievement (Gittman & Koster, 1999; Sharpley & Edgar, 1986).
These poor judgments can be further exacerbated when teachers assess students with diverse backgrounds and characteristics (Darling-Hammond, 1995; Martínez & Mastergeorge, 2002; Tiedemann, 2002).

A Gap Between Perception and Competence

For over 50 years, the CA literature has documented the gap between teachers’ perceived and actual assessment competence. Teachers regularly use a variety of assessment techniques despite inadequate preservice preparation or in-service professional development about how to effectively design, interpret, and use them (Goslin, 1967; O’Sullivan & Chalnick, 1991; Roeder, 1972). Many teachers habitually include nonachievement factors like behavior and attitude, degree of effort, or perceived motivation for the topic or assignment in their summative assessments. And they calculate grades without weighting the various assessments by importance (Griswold, 1993; Hills, 1991; Stiggins, Frisbie, & Griswold, 1989). When they create and use performance assessments, teachers commonly fail to define success criteria for the various levels of the performance or plan appropriate scoring schemes and procedures prior to instruction. Moreover, their tendency to record their judgments after a student’s performance rather than assessing each performance as it takes place consistently undermines the accuracy of conclusions about how each student performed (Goldberg & Roswell, 2000).

In addition to discrepancies in designing and using their own assessments, teachers’ actions during standardized testing routinely compromise the effectiveness of test results for accurately gauging student achievement and informing steps to improve it. Teachers often teach test items, provide clues and hints, extend time frames, and even change students’ answers (Hall & Kleine, 1992; Nolen, Haladyna, & Haas, 1992). Even when standardized tests are not compromised, many teachers are unable to accurately interpret the test results (Hills, 1991; Impara, Divine, Bruce, Liverman, & Gay, 1991) and lack the skills and knowledge to effectively communicate the meaning behind the scores (Plake, 1993).

Incongruities in teachers’ assessment practices have long been attributed to a consistent source of variance: A majority of teachers mistakenly assume that they possess sound knowledge of CA based on their own experiences and university coursework (Gullikson, 1984; Wise, Lukin, & Roos, 1991). Researchers consistently suggest collaborative experiences with assessments as a way to narrow the gap between teacher perceptions of their assessment knowledge and skill and their actual assessment competence. These knowledge-building experiences develop and strengthen common assessment understandings, quality indicators, and skills.
What’s more, collaboration increases professional assessment language and dispositions toward reflecting during and after assessment events to help teachers recognize how assessments can promote or derail student learning and achievement (Aschbacher, 1999; Atkin & Coffey, 2001; Black & Wiliam, 1998; Borko, Mayfield, Marion, Flexer, & Cumbo, 1997; Falk & Ort, 1998; Gearhart & Saxe, 2004; Goldberg & Roswell, 2000; Laguarda & Anderson, 1998; Sato, 2003; Sheingold, Heller, & Paulukonis, 1995; Wilson, 2004; Wilson & Sloane, 2000).

Two Reviews of Summative Assessment by the Evidence for Policy and Practice Information and Co-Ordinating Centre

Impact of Summative Assessments and Tests on Students’ Motivation for Learning

The Evidence for Policy and Practice Information and Co-Ordinating Centre (EPPI-Centre), part of the Social Science Research Unit at the Institute of Education, University of London, offers support and expertise to those undertaking systematic reviews. With its support, Harlen and Crick (2002) synthesized 19 studies (13 outcome evaluations, 3 descriptive studies, and 3 process evaluations). The review was prompted by the global standardized testing movement in the 1990s and sought to identify the impact of summative assessment and testing on student motivation to learn. While a more extensive discussion of CA in the context of motivational theory and research is presented elsewhere (see Brookhart, Chapter 3 of this volume), several conclusions from this review are worth mentioning here.

The researchers noticed that following the introduction of the national curriculum tests in England, low-achieving students tended to have lower self-esteem than higher-achieving students. Prior to the tests, there had been no correlation between self-esteem and achievement. These negative perceptions of self-esteem often decrease students’ future effort and academic success.
What’s more, the high-stakes tests impacted teachers, making them more likely to choose teaching practices that transmit information during activities that are highly structured and teacher controlled. These teaching practices and activities favor students who prefer to learn this way and disadvantage and lower the self-esteem of students who prefer more active and learner-centered experiences. Likewise, standardized tests create a performance ethos in the classroom, can become the rationale for all classroom decisions, and produce students who have strong extrinsic orientations toward performance rather than learning goals. Not only do students share their dislike for high-stakes tests, but they also exhibit high levels of test anxiety and are keenly aware that the narrow test results do not accurately represent what they understand or can do.

Not surprisingly, student engagement, self-efficacy, and effort increase in classrooms where teachers encourage self-regulated learning (SRL) and empower students with challenging choices and opportunities to collaborate with each other. In these classrooms, effective assessment feedback helps increase student motivation to learn. This feedback tends to be task involved rather than ego involved to increase students’ orientation toward learning rather than performance goals.

Impact of Summative Assessments on Students, Teachers, and the Curriculum

The second review (Harlen, 2004), which synthesized 23 studies, conducted mostly in England and the United States, involved students between the ages of 4 and 18. Twenty studies involved embedding summative assessment in regular classroom activities (i.e., portfolios and projects), and eight were either set externally or set by the teacher to external criteria. The review was focused on examining research evidence to learn more about a range of benefits often attributed to teachers’ CA practices, including rich understandings of student achievement spanning various contexts and outcomes, the capacity to prevent the negative impacts of standardized tests on student motivation to learn, and teacher autonomy in pursuit of learning goals via methods tailored to their particular students. The review also focused on the influence of teachers’ summative assessment practices on their relationships with students, their workload, and difficulties with reliability and quality.
The main findings considered two outcomes for the use of assessment for summative purposes by teachers: (1) impact on students and (2) impact on teachers and the curriculum.

Impact on Students

When teachers use summative assessments for external purposes like certification for vocational qualifications, selection for employment or further education, and monitoring the school’s accountability or gauging the school’s performance, students benefit from receiving better descriptions and examples that help them understand the assessment criteria and what is expected of them. Older students respond positively to teachers’ summative assessment of their coursework, find the work motivating, and are able to learn during the assessment process. The impact of external uses of summative assessment on students depends on the high-stakes use of the results and whether teachers orient toward improving the quality of students’ learning or maximizing students’ scores.

When teachers use summative assessments for internal purposes like regular grading for record keeping, informing decisions about choices within the school, and reporting to parents and students, nonjudgmental feedback motivates students for further effort. In the same vein, using grades as rewards and punishments both decreases student motivation to learn and harms the learning itself. And the way teachers present their CA activities may affect their students’ orientation to learning goals or performance goals.

Impact on Teachers and the Curriculum

Teachers differ in their response to their role as assessors and the approach they take to interpreting external assessment criteria. Teachers who favor firm adherence to external criteria tend to be less concerned with students as individuals.
When teacher assessment is subjected to close external control, teachers can be hindered from gaining detailed knowledge of their students.

When teachers create assessments for internal purposes, they need opportunities to share and develop their understanding of assessment procedures within their buildings and across schools. Teachers benefit from being exposed to assessment strategies that require students to think more deeply. Employing these strategies promotes changes in teaching that extend the range of students’ learning experiences. These new assessment practices are more likely to have a positive impact on teaching when teachers recognize ways that the strategies help them learn more about their students and develop more sophisticated understandings of curricular goals. Of particular importance is the role that shared assessment criteria play in the classroom. When present, these criteria exert a positive influence on students and teaching. Without shared criteria, however, there is little positive impact on teaching and a potential negative impact on students. Finally, high-stakes use of tests can influence teachers’ internal uses of CA by reducing those assessments to routine tasks and restricting students’ opportunities for learning from the assessments.

Review of Recent Research on Classroom Summative Assessment Practices

What follows is a review of the research on summative assessment practices in classrooms published from 1999 to 2011 and gathered from an Education Resources Information Center (ERIC) search on summative assessments. Studies that were featured in the Harlen and Crick (2002) or the Harlen (2004) reviews were removed. The resulting group of 16 studies investigated summative assessment practices in relation to teachers and teaching and/or students, student learning, and achievement. A comparison of the research aims across the studies resulted in three broad themes: (1) the classroom assessment (CA) environment and student motivation, (2) teachers’ assessment practices and skills, and (3) teachers’ judgments of student achievement. Table 14.1, organized by theme, presents an overview of the studies.

Theme One: Students’ Perceptions of the Classroom Assessment Environment Impact Student Motivation to Learn

Understanding student perceptions of the CA environment and their relationship to student motivational factors was the common aim of four studies (Alkharusi, 2008; Brookhart & Bronowicz, 2003; Brookhart & Durkin, 2003; Brookhart, Walsh, & Zientarski, 2006). Studies in this group examined teacher assessment practices from the students’ point of view using student interviews, questionnaires, and observations. Findings indicated that both assessment environments and student perceptions of the purposes of CAs influence students’ goals, effort, and feelings of self-efficacy.

As Brookhart and Durkin (2003) noted, even though high-profile, large-scale assessments tend to be more carefully studied and better funded, the bulk of what students experience in regard to assessment happens during regular and frequent CAs.
Investigations in this theme build on Brookhart’s (1997) theoretical model that synthesized CA literature, social cognitive theories of learning, and motivational constructs. The model describes the CA environment as a dynamic context, continuously experienced by students, as their teachers communicate assessment purposes, assign assessment tasks, create success criteria, provide feedback, and monitor student outcomes. These interwoven assessment events communicate what is valued, establish the culture of the classroom, and have a significant influence on students’ motivation and achievement goals (Ames, 1992; Brookhart, 1997; Harlen & Crick, 2003).

Teachers’ Teaching Experience and Assessment Practices Interact With Students’ Characteristics to Influence Students’ Achievement Goals

Alkharusi (2008) investigated the influence of CA practices on student motivation. Focusing on a common argument that alternative assessments are more intrinsically motivating than traditional assessments (e.g., Shepard, 2000), the study explored the CA culture of science classes in Muscat public schools in Oman. Participants included 1,636 ninth-grade students (735 males, 901 females) and their 83 science teachers (37 males, 46 females). The teachers averaged 5.2 years of teaching experience, with a range of 1 to 13.5 years. Data came from teacher and student questionnaires. Students indicated their perceptions of the CA environment, their achievement goals, and self-efficacy on a 4-point Likert scale. Teachers rated their frequency of use of various assessment practices on a 5-point Likert scale. Using hierarchical linear models to examine variations present in achievement goals, the study suggests that general principles of CA and achievement goal theory can apply to both U.S. and Oman cultures. Teachers became more aware of the “detrimental effects of classroom assessments that emphasize the importance of grades rather than learning and [focused] on public rather th

