Florida District Uses New Assessment Program to Improve Test Scores

Benchmark assessments allow teachers to target intervention efforts.

Faced with mounting pressure to improve standardized test scores, school districts across the country are seeking ways to focus intervention efforts on the students at greatest risk of performing poorly on those high-stakes tests. The challenge is how to reliably identify those students.

Now, one Florida district, Orange County Public Schools, is using a new assessment program that’s been shown to accurately predict how students will do on the FCAT, the annual standardized tests taken by all Florida students in grades 3 through 10.

“We’re using an assessment program that appears to have good predictive ability in terms of FCAT scores,” says Lee Baldwin, the district’s senior director for accountability, research and assessment. He explains that the system gives teachers early warning about which students may not earn a proficient score on the FCAT, as well as detailed information about each of those students’ academic weaknesses.

“There is so much riding on the FCAT that we needed to have information that would allow us to look at how well students would do on the test before they took it,” Baldwin says. “There had been a lot of attempts at solutions to this, but none were really working until we came up with this new formative assessment program.”

The heart of the program, now in its second year, is a series of benchmark tests created by The Princeton Review, of New York City. The Princeton Review was hired to create items and assessments that would accurately mirror the FCAT. “We felt comfortable that The Princeton Review had the best proposal,” Baldwin says. “We felt we could trust them to produce a high-quality assessment.”

Orange County Public Schools needed three benchmark assessments in reading and three in math for grades 3-10, for a total of 48 tests, each of which would have 25 questions. In addition, the district’s plan called for development of a series of five-question “mini-assessments” that teachers could use at any time to test student progress on specific standards. However, before the first test question could be written, a team of educators from Orange County and editors from The Princeton Review worked for several weeks to establish guidelines and standards for the assessments.

Once test creation began, the district’s review team checked each question and provided feedback on content development and standards alignment. The final assessments were delivered to the district in a format that could be integrated seamlessly with a testing platform purchased from another vendor.

The tests were first administered in the 2005-06 school year. A detailed analysis is underway, but Baldwin says early signs about the program’s effectiveness are very encouraging. “It’s too early to say whether it’s cause-and-effect or coincidence, but I can say that our FCAT results last year were universally better than they’d ever been before, and our school grades last year were also better than ever.”

According to Baldwin, the assessment program created by The Princeton Review for his district could easily serve as a model for other districts. “I’ve gotten quite a few phone calls from colleagues across the state who want to know about our experience,” he says.

Recently, a statistical analysis of The Princeton Review’s test questions by an independent consultant showed that the assessments are correlated so closely with Florida’s state learning standards that they can accurately predict which students are likely to underperform on the FCAT.

“Of course,” says Baldwin, “our goal is to arm our teachers with this information so they can then prove the predictions wrong.”

For more information, please call 800-Review2 or visit