Jeffrey Solochek, with Kathleen McGrory, asked some experts to comment on Florida’s chaotic start-up of its new FSA tests, which are administered by the American Institutes for Research (AIR). From the Tampa Bay Times:
Last year, Kansas leaders threw out results from their trial run of a statewide exam. Their situation was, in many ways, parallel to Florida’s.
“Our first week we had problems that were really our fault,” tied to server errors, said Marianne Perie, director of the University of Kansas Center for Educational Testing and Evaluation. “Then we had a period of time when we got hit by a DDOS.”
Eventually, the state had a time of trouble-free testing.
But multiple studies of student performance led experts to recommend throwing out the scores.
One analysis showed that students who had problems with test access skipped an average of 15 percent of the questions, while those who had no difficulties skipped 1 percent. Another review matched students with similar academic and demographic characteristics, and found the children who tested while problems were occurring no longer compared to the ones who tested later.
“Bad scores are worse than no scores,” Perie said, explaining why she recommended losing the scores.
As their name implies, standardized tests rely on uniform conditions in order to get usable information. The standards are considered a foundation of fairness, comparability and integrity in scoring.
The fits and starts of last week’s 8th, 9th and 10th grade FSA writing test delivered something far from uniform. This Kansas anecdote disturbingly mirrors Florida’s first week.
“It’s what we don’t know that kills us,” said Scott Marion, associate director of the National Center for the Improvement of Educational Assessment.
Not knowing the full extent of the problems, and how they affected students, makes it difficult to analyze their effects, he said. Comparing the scores becomes problematic, making the data questionable.
“If you get to take your test and you get to go through it seamlessly, and I’m going through it and I don’t, do you have an advantage? I think you probably do,” Marion said. “That becomes an issue for comparability.”
Florida could face that problem.
Indeed so. And here’s why the FLDOE and AIR shouldn’t be allowed to investigate themselves, either.
The Florida Department of Education has yet to address how it will handle results from this round of testing. Meanwhile, a growing number of critics have called upon state leaders to at least abandon attaching consequences to the scores, given the swirl of uncertainties.
Perie agreed that the state should make no decisions until it can run the statistics and determine if problems are real.
“They have to be able to compare (the first week) to a clean testing window,” she said, adding that the evaluation must be conducted by an independent firm, not the state or its testing vendor.
At the same time, she added, “there’s a difference between statistical analysis and credibility.”
Perie referred to Oklahoma’s 2014 testing cycle, during which vendor software did not work properly. A review uncovered no issues with the results, she said, but the public outcry prompted officials to throw out some student scores and fire the vendor.
“At a certain point, even if you don’t like to give up on accountability, you’re risking the credibility of the system,” Marion said.
The FLDOE and AIR are far too invested in ramming these test results through to be trusted. And far too many Republican legislators utter “accountability” with the same fervor as “Hail Mary.” Such a cartel of self-interest and hubris hardly inspires faith in a justifiably skeptical Florida.