Brave New World of Teacher Evaluation
Evaluating teachers, whether casually or more rigorously, annually or less frequently, has long been part of the job description of many a principal and assistant principal, who have often relied on occasional observations to make their judgments. What has usually resulted is an overwhelming number of “satisfactory” ratings and the infrequent “unsatisfactory” designation.
But times are changing, thanks to a spate of state laws mandating, and raising the bar for, teacher evaluations. The quest in 2009 for significant funding from the federal $4.5 billion Race to the Top program has also spurred states to up the ante and raise the standards for teacher evaluation, a requirement for qualifying for such funding. And the push for such evaluations is creating tension, as seen in the Chicago teachers’ strike in September, which revolved in part around them.
“Certainly, in teacher evaluation, we’ve seen the most dramatic change in the shortest amount of time since we’ve been tracking it,” notes Sandi Jacobs, the vice president and managing director for state policy at the National Council on Teacher Quality (NCTQ). That organization has kept a careful eye on the sudden growth of state evaluation laws.
According to NCTQ numbers as of last July, the number of states mandating the annual evaluation of teachers has increased from 15 to 23 in just the past three years. (And 43 states now require annual reviews of all new teachers.)
What’s more, the NCTQ survey notes, 22 states have made measures of student growth a significant part of the evaluation of all teachers, even veterans. Thirteen states, meanwhile, have identified those measures, from state- and district-created standardized tests to end-of-course exams, as the “preponderant criterion” in teacher evaluation, weighted at 50 percent or more.
“States are prioritizing student achievement,” observes Jacobs, although she adds, “no state has taken the approach that student achievement scores are the only thing that matters.” In almost 20 states, teacher evaluation results may be grounds for dismissal after two or three consecutive years. “The bottom line is that these evaluations have to impact hiring, tenure, and professional development,” Jacobs says. “If they don’t, we haven’t done much for children.”
So what’s a district to do?
There’s no shortage of answers as school systems from Los Angeles and Denver to Toledo, Ohio, and Syracuse, N.Y., have rolled up their sleeves, worked hand in hand with teachers’ unions, and hammered out an evolving set of evaluation programs.
Building the Danielson Framework
Educators seem to be drawn to perhaps the most widely adopted approach, one being implemented by almost a dozen states, including Arkansas, Pennsylvania, Wisconsin, and Delaware, and a growing number of districts, New York City, Chicago, Houston, and Syracuse among them. It’s an extensive framework developed by educational author and consultant Charlotte Danielson. Sixteen years ago, Danielson wrote one of the earliest books on teacher evaluation, titled Enhancing Professional Practice: A Framework for Teaching.
This framework, which depends mainly on classroom observation, covers 22 components divided among four domains: Planning and Preparation (including areas such as “Demonstrating Knowledge of Content and Pedagogy” and “Designing Student Assessments”); Classroom Environment (including “Managing Classroom Procedures” and “Establishing a Culture for Learning”); Instruction (such as “Using Questioning and Discussion Techniques” and “Using Assessment in Instruction”); and Professional Responsibilities (“Participating in a Professional Community,” for example).
Districts need to ensure that school administrators are well trained as classroom observers. “These are high-stakes evaluations for teachers that put a heavy burden on the system to do it well,” Danielson emphasizes.
Danielson also says she’s concerned that a growing number of state laws may evaluate teachers simplistically, especially through standardized testing. “Legislators don’t understand how hard it is to do this well,” she complains. “If we only get better at labeling teachers as underperforming or performing satisfactorily, we won’t have done much for students.”
Part of the problem, she continues, comes from the gold rush of the federal Race to the Top program. “Everybody’s been in a hurry to implement a system by the day after tomorrow.” In contrast, Danielson points to Arkansas, which hired her as a consultant in 2009 and has taken a deliberate approach to implementing her framework as the state model. After developing the teacher evaluation program over two years, the state piloted it in about a dozen schools in the 2011-2012 academic year, and added another dozen this year.
“Those schools are truly guinea pigs,” admits Karen Cushman, the state’s assistant education commissioner for human resources and licensure, adding that a rollout in all districts will take place in the 2013-2014 academic year. “We’re trying to listen and involve all parties. There’s hard work ahead, but it’s going to be for the benefit of the kids.”
Danielson’s idea of rigor goes well beyond the teacher being evaluated. Starting with “a clear definition of what teaching is,” she issues a call for evaluators, whether administrators or peers, to be trained extensively in recognizing those defining factors. “We need to train teachers and principals so they are on the same page, and also to conduct an assessment of those preparing evaluations to ensure they know what they’re doing,” she argues.
Along those lines, Danielson has created an online program in collaboration with the educational company Teachscape to train teachers and evaluators alike. The program includes an online proficiency assessment, approximately 20 hours of training, and more than 100 master-scored videos, which are analyzed using Danielson’s framework, with an emphasis on what works in those lessons and what doesn’t. “It really changes the culture in the school,” she says. “You have teachers and principals talking about instruction. That’s been so rare.”
Support for teachers and administrators needs to go beyond online interventions, Danielson adds, and she calls for a fleet of professional development specialists to provide the needed support. While that approach may sound expensive, Danielson notes that recently, the Houston Independent School District simply converted 130 existing positions in its professional development department to focus specifically on teachers after their evaluations, with a keen eye toward making improvements.
Los Angeles Unified School District Superintendent John Deasy has been committed to the Danielson Framework for more than a decade, ever since he held the same position in the Coventry (R.I.) Public Schools. “Early on it was the only thing out there,” he recalls, adding that the framework still works for him. “It’s very concise, and it provides a clear picture of what’s good teaching, so teachers and administrators are working side by side.”
While other districts and states have reconciled using this framework together with measures of student growth, that’s yet to happen in Los Angeles. A state law requiring that such measures be included in teacher evaluations has been the subject of a court battle, and Deasy is still in separate negotiations with the district’s teachers’ union, which has so far opposed that approach.
Change in Ohio
Toledo Public Schools has had to face its own challenges over the past year as it attempts to integrate Ohio’s new requirement, that measures of student growth count for 50 percent of teacher evaluations, with the district’s three-decades-old, tried-and-true approach, in which classroom observations will count for nearly another 50 percent.
In the past, explains Toledo Public Schools Chief Academic Officer Jim Gault, those observations focused on areas such as teaching strategies, classroom discipline, and professionalism toward students and parents. New teachers also would be assigned to an internship program that provided mentoring, frequent classroom visits and review of lesson plans by more experienced tenured teachers—all of which figured into whether the teachers would be renewed. More experienced teachers who were not passing muster would receive professional development and could be put into the internship program for extra support.
But under the system that Toledo is implementing for the 2013-2014 school year, all classrooms will receive walkthroughs by administrators every three or four weeks, Gault says, and 50 percent of the teacher evaluation will depend on measurable student growth. “We did not use actual data until now,” Gault says. “And the biggest challenge is getting teachers to realize they do have a significant impact on students,” as measured by achievement results.
Toledo will retain the internship program on which it has depended for decades. “Statistics have shown that teachers in the program are likely to be successful,” Gault points out. Of the hybrid program the district will be deploying next year, he says, “We’re excited about it. We think it’s going to improve the quality of teacher practice.”
Denver Takes a LEAP
In Denver, meanwhile, the teacher evaluation program is undergoing a major overhaul. “We had a pretty traditional evaluation process and support system, but the pieces weren’t connected. Teacher development wasn’t necessarily connected to evaluation,” says Chief Academic Officer Susana Cordova. “And pretty much everybody (of the almost 4,500 teachers in the district) was rated ‘satisfactory.’ ”
That nearly unanimous designation, she adds, did not lend itself to teacher or school improvement. In contrast, Denver’s homegrown LEAP program—which stands for Leading Effective Academic Practice and started in the 2010-11 academic year—has redefined the criteria for evaluation and established four levels of teacher performance: not meeting expectations, approaching expectations, effective, and distinguished. “We really wanted to design and develop a system that answered questions for administrators such as ‘How do you raise the bar?’ ‘How do you provide supports to reach teachers?’ and ‘How do you dramatically increase access to those supports?’ ” Cordova says.
Besides classroom observations by school administrators, “we have a whole group of (specially trained) peer observers, teachers whose full-time job is to do just that,” Cordova says, adding that all are well versed in the content areas of the teachers they are observing. “I think it’s a positive that classroom teachers are being observed by those with similar content experience, particularly when the (observed) teachers are more specialized. They very much appreciate it.”
From there, a series of support services kicks in, Cordova explains. “For instance, if a teacher gets feedback from an observation that he or she is not connecting with African American students in terms of culture and climate, that teacher receives (face-to-face) coaching and is also directed to online resources.”
Fueled by a $10 million grant from the Bill and Melinda Gates Foundation, the program launched at 16 pilot schools in 2010-2011, expanded to 127 (or 4 percent) of the district’s schools during the last academic year, and has reached all of them this year. The Gates seed money notwithstanding, reforming teacher evaluation has not come cheaply. The almost $4 million cost of the 45 full-time peer observers comes straight from the district’s general fund.
Turning to Multiple Measures
Whereas Denver’s former evaluation approach consisted primarily of observation, the new system integrates Colorado’s new requirement that 50 percent of teacher evaluations be based on student growth, as in Toledo. To meet that requirement, the district uses multiple measures of student achievement, from state assessments and standardized district tests to exams at the end of individual courses and portfolios of student work. “We’re working on a balance of the standardized and alternative,” Cordova says.
Also included in the overall evaluation are student perception surveys, which cover questions such as “Does the teacher care about control in the classroom?” or “How does the teacher show that he or she cares about my learning?”
Districts around the country are scrambling to employ alternative measures to assess the so-called non-tested classes, from language arts and math in the early elementary years to foreign languages, physical education, and art. While student performances or portfolios in some of these courses can add a valuable dimension, many districts are using assessments geared to younger students, such as the Kindergarten Readiness Assessment—Literacy (KRAL) and the STAR assessments of reading, math, and early literacy in elementary grades.
Dan Weisberg, the executive vice president for performance management at The New Teacher Project, adds that, for all evaluation systems and at all grade levels, it’s important not to overlook highly rated teachers in the process. “We’ve heard from these teachers that they preferred to be in an environment where they were consistently challenged to improve,” Weisberg reports.
Some districts see their new evaluation systems as works in progress. In the Syracuse (N.Y.) City School District, the program was built around a K-12 deployment of the Danielson Framework last year, but the district is switching to its own Syracuse Teaching and Learning Framework to better address the Common Core curriculum in English language arts and math in K-5, says Chief Academic Officer Laura Kelley.
Denver’s program is also evolving. Teachers and principals are encouraged to give feedback, Cordova says, adding that she is negotiating with teachers as to what a teacher remediation program would include. “We’ve collaborated with the union every step along the way,” she says, “from their representatives being at the table in the initial development to meeting with them regularly now.”