As the teaching profession continues to draw attention, a growing number of school districts from New York to California are adopting "value-added" measures of teaching quality to award bonuses or even tenure. And two competitive federal grants are spurring them on.
Bill Gates recently suggested that districts end teacher pay increases based on seniority and master's degrees and instead reward effective teachers who take on large classes or needy schools. And preliminary findings of a study financed by the Bill and Melinda Gates Foundation, released in December, indicated that teacher effectiveness can be reliably estimated by gauging students' progress on standardized tests. Also in December, the National Education Association announced that it is creating a national, independent commission to study the profession and make recommendations on maximizing teacher effectiveness.
But researchers who support value-added measures advise caution. The ratings, which use a statistical method to judge teachers' contributions to their students' test scores, can't compensate for an otherwise weak teacher evaluation system, they say.
Researchers say individual value-added ratings, while informative, can be skewed by random events outside a teacher's control, such as student illness or a neighborhood crisis. "It's not a perfect, unfailing measure," explains Daniel McCaffrey, a RAND Corp. researcher and author of several studies on value-added assessment. "It should be pooled with other information and the judgment of people who know what's going on in a classroom."
The federal government will only award its Race to the Top grant to states that permit using student achievement data to evaluate teachers, even while experts continue to debate whether student test scores can be used legitimately for that purpose.
Meanwhile, journalists have been accused of adding fuel to the controversy by publishing value-added rankings of individual teachers. The Los Angeles Times did so last summer, sparking outrage from the teachers' union. Then, last October, the union representing New York City Department of Education teachers sought an injunction to keep the district from releasing 12,000 value-added ratings to the press.
To measure a teacher's effectiveness, value-added models find the difference between students' expected and actual test score growth, considering that students learn at different rates. The first value-added model, designed in 1982 by University of Tennessee statistician William Sanders, calculated students' expected growth using their previous standardized test scores. Other methods, like the one used in New York City, also consider student characteristics such as poverty level, English proficiency and class size.
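The basic arithmetic described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the Sanders model or the New York City model: it assumes expected scores come from a single straight-line fit of current scores on prior scores, and it ignores student characteristics entirely. The function names and the six sample records are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def value_added(records):
    """Average gap between actual and expected scores, per teacher.

    records: list of (teacher, prior_score, current_score) tuples.
    Assumption: expected score is predicted from the prior score alone.
    """
    intercept, slope = fit_line(
        [prior for _, prior, _ in records],
        [current for _, _, current in records],
    )
    gaps = {}
    for teacher, prior, current in records:
        expected = intercept + slope * prior
        gaps.setdefault(teacher, []).append(current - expected)
    return {teacher: mean(values) for teacher, values in gaps.items()}


# Hypothetical data: two teachers whose students had the same prior scores.
records = [
    ("A", 50, 58), ("A", 60, 67), ("A", 70, 78),
    ("B", 50, 52), ("B", 60, 63), ("B", 70, 72),
]
scores = value_added(records)
print(scores)
```

In this made-up data, teacher A's students beat their expected scores and teacher B's fall short, so A's rating comes out positive and B's negative. Models that also adjust for poverty level, English proficiency or class size would add those factors to the prediction step.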
The fewer students in a calculation, the more likely a teacher's value-added score is due to chance events, such as having a class with an unusually disruptive or helpful student, researchers say. As a result, a teacher's score can fluctuate widely from year to year. For this reason, researchers typically recommend that scores be averaged over several years.
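A small simulation can illustrate why small classes and single-year scores are noisy. The sketch below is an assumption-laden toy, not any district's method: every teacher has the same true effect of zero, each student contributes random luck with a standard deviation of 10 points (an arbitrary figure), and a teacher's yearly score is simply the class average of that luck.

```python
import random
from statistics import mean, pstdev


def yearly_score(true_effect, class_size, rng, student_noise_sd=10.0):
    """One year's estimate: the true effect plus the class's average luck."""
    luck = [rng.gauss(0, student_noise_sd) for _ in range(class_size)]
    return true_effect + mean(luck)


def spread(class_size, years, trials=2000, seed=0):
    """Standard deviation of scores across many simulated teachers.

    Each simulated teacher has a true effect of zero, so any spread
    in the results is due to chance alone.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        per_year = [yearly_score(0.0, class_size, rng) for _ in range(years)]
        estimates.append(mean(per_year))
    return pstdev(estimates)


one_year_small = spread(class_size=20, years=1)
one_year_large = spread(class_size=100, years=1)
three_year_small = spread(class_size=20, years=3)
print(one_year_small, one_year_large, three_year_small)
```

Under these assumptions the chance-driven spread shrinks roughly with the square root of the number of student scores involved, which is why both larger classes and multi-year averages produce steadier ratings.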
Economist Sean Corcoran of New York University, who studied value-added systems in the New York City Department of Education and the Houston Independent School District, thinks value-added measures can be useful to schools, but not as part of a teacher's formal evaluation, in part because of their yearly fluctuation. "At best, they can tell you who the consistently high and low performers are," Corcoran says. "But for the vast majority of teachers, you learn almost nothing from them."
Certain student characteristics also influence teachers' ratings, says Corcoran. For example, teachers of gifted students tend to have lower ratings because high-achieving kids already score high on standardized tests and, thus, make smaller gains. Eric Hanushek, a Stanford University economist, supports the use of value-added measures for formal evaluation, although he agrees that they don't distinguish among the many teachers who fall in the middle range.
But being able to identify the most and least effective teachers on an objective measure like their value-added ratings is valuable information that is difficult to get any other way, he says. Hanushek adds that research, including his own, has found that good teaching can wipe out achievement deficits of low-income students. Eliminating the worst teachers would have a similar effect, he notes. Replacing the bottom 6 to 10 percent with average teachers, he says, would be enough to make U.S. students leaders in math and science.
Advocates and critics of value-added measures tend to agree on one point: traditional teacher evaluation is seriously flawed. "Most teachers, even in their first or second year, are being told they're doing a fantastic job," says Timothy Daly, president of The New Teacher Project (TNTP), a national nonprofit.
In 2009, the organization released a study that revealed that only 1 percent of teachers in 12 cities, including Chicago and Denver, received unsatisfactory ratings. Teachers should be evaluated instead through ongoing classroom observations with measurable criteria, TNTP argued in an October report, Teacher Evaluation 2.0. For example, administrators should ask, "Did students' work indicate that they met the lesson objectives?" Measurable evidence of student learning, including value-added data if available, must be the primary basis on which teachers are judged, Daly insists.
Linda Darling-Hammond, an education professor at Stanford University, agrees that many districts evaluate teachers on criteria that are irrelevant to student learning: Is the class quiet? Are the bulletin boards neat? But she rejects value-added scores as too variable from year to year to provide useful feedback.
In her October report for the Center for American Progress, she advocated for a national evaluation system that would rate teachers on practices shown by research to improve student learning, such as collaborating to improve instruction. "We have a lot of evidence that what most improves student achievement is teachers working together," she says.
Linking Pay to Achievement
Two federal grants are funding new systems to link educator pay to student achievement. The Teacher Incentive Fund has awarded 95 grants since 2006, mainly to districts. The Race to the Top competition awarded $4 billion in grants to 11 states and the District of Columbia in 2010. But the grants can also fund well-rounded, research-based teacher evaluation systems, according to Darling-Hammond.
While federal programs don't encourage that, they don't discourage it either, she observes. So far, little research exists on the impact of pay-for-performance systems for educators. A three-year study released in September by Vanderbilt University found that bonus pay alone without other supports had no impact on student learning.
Hanushek of Stanford doubts that bonus pay would motivate teachers, as "the vast majority are working as hard as they can," but based on research on other industries, he believes it could attract new people to the teaching profession over time. Several districts that recently began to use value-added data to award bonuses or tenure say they are encouraged by the results so far.
In New York City, value-added assessments have encouraged public school principals to deny tenure to a larger percentage of teachers, according to Joanna Cannon, a district accountability administrator. New York first released value-added reports in 2009 and permitted their use for tenure decisions in 2010. Under pressure from the teachers' union, the state legislature barred the use of student achievement data for teacher evaluation in 2008 but repealed the law in 2010 to make the state eligible for a Race to the Top grant.
New York state won the grant in August after the teachers union agreed that students' state and district test results would count for 40 percent of a teacher's evaluation by 2012-2013. "Ultimately, tenure decisions will still be up to principals and won't be based strictly on [value-added] data," Cannon says. "But this is one valuable tool that we think schools should be using."
Teacher Retention Rises
Houston Independent School District adopted a value-added model in 2007 and began awarding annual bonuses of up to $10,300 to teachers based on their most recent value-added score, schoolwide performance and other criteria. Since the program began, teacher retention has steadily increased, especially among bonus winners, according to Assistant Superintendent Carla Stevens.
Within two years, the retention rate for awarded teachers rose from 84 percent to 92 percent. Meanwhile, retention of the bottom performers, who received no award, shrank from 13 percent to 2 percent. A downside to the bonus system is that only 34 percent of Houston teachers are eligible for the maximum bonus, Stevens explains.
Value-added scores are calculated from the results of state tests, which cover only core subjects and not art or foreign language, and only certain grade levels. "The non-core teachers feel they're being discriminated against," she says.
Stevens would like to see the bonus system based instead on the results of a new teacher evaluation the district is creating using a U.S. Department of Education Teacher Incentive Fund grant, which supports performance-pay plans for teachers and principals in high-need schools. Using Houston's current evaluation system, principals have rated only 3 percent of teachers unsatisfactory in any domain, she reports.
In 2007, the Winston-Salem/Forsyth County (N.C.) Schools began using value-added data provided by the state as part of its teacher evaluation system. Value-added data allows the district to check how rigorously principals evaluate teachers based on classroom observations, says Superintendent Donald Martin. If the scoring doesn't align, then the principal and administrator can discuss the situation, Martin explains.
Individual value-added ratings can fluctuate and are not always reliable, Martin acknowledges. But a score significantly below average—a "red" rating—signals a principal to look for potential problems and provide support, like in-classroom coaching. "If you have three reds in a row, we're seeing a [negative] pattern," he says.
Of 1,600 value-added scores calculated last year, 15 percent were rated "green," or high, 18 percent were red, and only 2 percent were red for a third consecutive year. In September, the district won an incentive fund grant, which will pay for bonuses for high value-added scores.
Martin says it's too early to know if that will lead to higher achievement. "The jury's out, but we have three years and $20 million to test it," he says. "We'll see."
Elizabeth Duffrin is a freelance writer in Chicago.