Future of Schools

Schools, board members ask: Are Indiana's grades fair?

Schools across Indiana received their report cards today, with the state rating the highest scorers an “A” and the lowest laggards an “F,” terminology well known to the schoolchildren they serve.

But unlike the grade a student receives from a teacher, the state’s grades are not based on daily interactions and observation but on complex mathematical formulas.

This year, perhaps more than ever before, there are reasons to ask: is your school’s grade fair?

“I think it’s worth looking at,” board member Andrea Neal said. “I’m very uncomfortable with the formula.”

Two big problems plagued this year’s grading results: test administration and unpredictability.

Testing woes

Almost 80,000 ISTEP online test takers in May experienced glitches that caused their screens to freeze, or otherwise slowed or stopped their exams. Some schools with widespread glitches have raised concerns that their grades were adversely affected.

ISTEP is the backbone of Indiana’s accountability system. Student test scores in grades three through eight are central to judging students, teachers and schools. The number of students who pass and the amount of growth they make over the prior year help determine a teacher’s raise and job security and a school’s A to F grade.

But after last spring’s testing problems, many Indiana educators have raised questions about whether they and their schools can be fairly judged on this year’s scores.

Christel House Academy today was the first school to push back on its grade when it received an F after more than five years of A grades. School officials said their data showed more than 90 percent of students whose grades went from passing to failing had faced online testing trouble. About 40 percent of Christel House’s test takers faced online glitches, but the school’s appeal was denied.

State board member Dan Elsener said the school could now appeal to the state board.

“There’s some reason there’s an anomaly here,” he said. “There’s a whole cohort of schools that don’t like the grades they got because of testing interruptions.”

Elsener said there was little the board could do but approve the grades, despite concerns they might not be accurate for all schools, because an outside consultant determined that very few students were so affected by the glitches that their tests were invalid. That’s the only advice they have to go on, Elsener said.

State Superintendent Glenda Ritz said the education department double-checked the data, going student by student to be certain any tests that should have been invalidated were not included in the school’s results. If all of the scores counted toward the grade were valid, then the state must affirm the grade, she said.

Claire Fiddian-Green of the Center for Education and Career Innovation, an education agency created by Gov. Mike Pence that has often been at odds with Ritz, this time backed her up.

“I’m comfortable they conducted a thorough process with all the right steps,” Fiddian-Green said.

Yet, there is every reason to believe student scores can be affected by unexpected interruptions, said Cynthia Roach, director of research, evaluation and assessment for Indianapolis Public Schools. Just 1,400 tests were invalidated because the state’s consultant determined they were adversely affected, but Roach believes there were probably more students who should have had higher scores.

“It’s almost impossible for a student to take a test and score higher than what they know,” she said. “But it’s very easy to score lower than what they know. Everything affects kids.”

Even the consultant who evaluated the testing problems for Indiana last summer acknowledged that there was no way to definitively identify all students who likely would have had higher scores, Roach said.

Roach told the story of one IPS principal who reported a huge problem with frozen computer screens during ISTEP testing that plagued the students in her school’s gifted class. Afterward, questions lingered about how the school’s grade could have been affected, even though those students were likely to pass either way.

“They did fine but was the freeze enough to affect their ability to get high growth?” Roach said. “Who knows?”

Unpredictability

When schools make drastic swings, such as from A one year to F the next or from F to A, a common explanation is that there have also been big changes in the school, such as an influx of new students or heavy turnover of teachers, Roach said.

But in some cases, Indiana schools that have seen none of those sorts of changes are unable to explain sudden reversals of fortune. With more than a handful of schools making such big shifts, even Ritz wonders whether the problem is with the system, not the schools.

“A good system will show you have a school improving or a school not improving but not extremes like we are currently seeing in the current model,” she said.

Among the big swings this year are three schools that went from an A or B to an F and 25 schools that went from an F to an A or B. In fact, the oddities of Indiana’s current A to F formula, forged under former state Superintendent Tony Bennett, have Ritz pining for a planned overhaul.

The legislature earlier this year ordered the universally disliked growth measure junked and mandated a new system be created in 2014. Ritz, one of the current system’s critics, said the new system should eliminate most big shifts in school grades.

Stopping short of saying this year’s grades can’t be trusted, Ritz focused on a future with new grading rules.

“We’ve had many schools where we have a fluctuation between two, three or four letter grades, up or down,” she said. “I am very excited we are going to be implement, not this year but next year, a new A to F system. We are working toward that, an entire new system for A to F.”

A growth measure in the grade calculation aims to identify which schools did the best job of getting students to raise their test scores. It matches up pools of kids with similar backgrounds who scored about the same on prior tests and ranks them by the progress they made over the previous year. Those with the biggest gains earn extra points for their schools. But from the beginning, a wide range of critics, including some of Bennett’s closest allies, said the measure was too complicated and worried that it could produce unfair results.

In 2011, the first year letter grades were instituted, Indiana followed a fairly basic formula for grading schools. It required at least 60 percent of students in a school to pass both math and English on ISTEP and high school tests in order to earn at least a D. Grades went up to a C at 70 percent, a B at 80 percent and an A at 90 percent. Schools that saw their passing rates improve enough from the prior year could get extra credit and potentially move up to a higher grade.

In 2012, Bennett scrapped that system, adding in new factors that aimed to measure “college and career readiness” that included the growth model, based on Colorado’s grading system.

But even if they know a new grading scheme is on the way, some board members remain uneasy with this year’s grades.

Neal said a Gary principal called her frantically late Thursday, certain that errors had caused his school’s grade to drop, but his appeal was denied.

Neal said she’d rather the state simply report each school’s state test passing rates and how much they improved over the prior year, avoiding the difficulties of explaining how the grades were determined.

“I don’t feel it’s working for all schools,” she said.

Neal pointed to Park Tudor, an expensive and highly regarded private school in Indianapolis, which received a D grade despite 100 percent of its graduates going on to college and a slew of academic honors, as another example of a strange report card result.

Park Tudor spokeswoman Cathy Chapelle said its grade, too, was in error.

“The assessment grade reflects issues of reporting and communication, not of academic performance,” Chapelle said in a statement. “In fact, our academic standards and results are among the highest in the state. In 2013 alone, 201 Park Tudor students in grades 9-12 took a total of 490 Advanced Placement exams; 62% of the exams earned a score of 4 or 5 and over 87% earned a score of 3 or higher.”

Chapelle did not elaborate on what the school meant by “reporting and communication” or how it could have influenced Park Tudor’s grade.

If schools like Christel House and Park Tudor decide to appeal to the state board, would they prevail? Elsener was not encouraging, suggesting the best strategy might be just to move on.

“I think I’d say this year was a hiccup,” he said. “You have to decide where to put your best investment of time.”

A high-stakes evaluation

The Gates Foundation bet big on teacher evaluation. The report it commissioned explains how those efforts fell short.

PHOTO: Brandon Dill/The Commercial Appeal
Sixth-grade teacher James Johnson leads his students in a gameshow-style lesson on energy at Chickasaw Middle School in 2014 in Shelby County. The district was one of three that received a grant from the Gates Foundation to overhaul teacher evaluation.

Barack Obama’s 2012 State of the Union address reflected the heady moment in education. “We know a good teacher can increase the lifetime income of a classroom by over $250,000,” he said. “A great teacher can offer an escape from poverty to the child who dreams beyond his circumstance.”

Bad teachers were the problem; good teachers were the solution. It was a simplified binary, but the idea and the research it drew on had spurred policy changes across the country, including a spate of laws establishing new evaluation systems designed to reward top teachers and help weed out low performers.

Behind that effort was the Bill and Melinda Gates Foundation, which backed research and advocacy that ultimately shaped these changes.

It also funded the efforts themselves, specifically in several large school districts and charter networks open to changing how teachers were hired, trained, evaluated, and paid. Now, new research commissioned by the Gates Foundation finds scant evidence that those changes accomplished what they were meant to: improve teacher quality or boost student learning.  

The 500-plus page report by the Rand Corporation, released Thursday, details the political and technical challenges of putting complex new systems in place and the steep cost — $575 million — of doing so.

The post-mortem will likely serve as validation to the foundation’s critics, who have long complained about Gates’ heavy influence on education policy and what they call its top-down approach.

The report also comes as the foundation has shifted its priorities away from teacher evaluation and toward other issues, including improving curriculum.

“We have taken these lessons to heart, and they are reflected in the work that we’re doing moving forward,” the Gates Foundation’s Allan Golston said in a statement.

The initiative did not lead to clear gains in student learning.

At the three districts and four California-based charter school networks that took part in the Gates initiative — Pittsburgh; Shelby County (Memphis), Tennessee; Hillsborough County, Florida; and the Alliance College-Ready, Aspire, Green Dot, and Partnerships to Uplift Communities networks — results were spotty. The trends over time didn’t look much better than those at similar schools in the same state.

Several years into the initiative, there was evidence that it was helping high school reading in Pittsburgh and at the charter networks, but hurting elementary and middle school math in Memphis and among the charters. In most cases there were no clear effects, good or bad. There was also no consistent pattern of results over time.

A complicating factor here is that the comparison schools may also have been changing their teacher evaluations, as the study spanned from 2010 to 2015, when many states passed laws putting in place tougher evaluations and weakening tenure.

There were also lots of other changes going on in the districts and states — like the adoption of Common Core standards, changes in state tests, the expansion of school choice — making it hard to isolate cause and effect. Studies in Chicago, Cincinnati, and Washington D.C. have found that evaluation changes had more positive effects.

Matt Kraft, a professor at Brown who has extensively studied teacher evaluation efforts, said the disappointing results in the latest research couldn’t simply be chalked up to a messy rollout.

These “districts were very well poised to have high-quality implementation,” he said. “That speaks to the actual package of reforms being limited in its potential.”

Principals were generally positive about the changes, but teachers had more complicated views.

From Pittsburgh to Tampa, Florida, the vast majority of principals agreed at least somewhat that “in the long run, students will benefit from the teacher-evaluation system.”

Source: RAND Corporation

Teachers in district schools were far less confident.

When the initiative started, a majority of teachers in all three districts tended to agree with the sentiment. But several years later, support had dipped substantially. This may have reflected dissatisfaction with the previous system — the researchers note that “many veteran [Pittsburgh] teachers we interviewed reported that their principals had never observed them” — and growing disillusionment with the new one.

Majorities of teachers in all locations reported that they had received useful feedback from their classroom observations and changed their habits as a result.

At the same time, teachers in the three districts were highly skeptical that the evaluation system was fair — or that it made sense to attach high-stakes consequences to the results.

The initiative didn’t help ensure that poor students of color had more access to effective teachers.

Part of the impetus for evaluation reform was the idea, backed by some research, that black and Hispanic students from low-income families were more likely to have lower-quality teachers.  

But the initiative didn’t seem to make a difference. In Hillsborough County, inequity expanded. (Surprisingly, before the changes began, the study found that low-income kids of color actually had similar or slightly more effective teachers than other students in Pittsburgh, Hillsborough County, and Shelby County.)

Districts put in place modest bonuses to get top teachers to switch schools, but the evaluation system itself may have been a deterrent.

“Central-office staff in [Hillsborough County] reported that teachers were reluctant to transfer to high-need schools despite the cash incentive and extra support because they believed that obtaining a good VAM score would be difficult at a high-need school,” the report says.

Evaluation was costly — both in terms of time and money.

The total direct cost of all aspects of the program, across several years in the three districts and four charter networks, was $575 million.

That amounts to between 1.5 and 6.5 percent of district or network budgets, or a few hundred dollars per student per year. About half of that money came from the Gates Foundation.

The study also quantifies the strain of the new evaluations on school leaders’ and teachers’ time as costing upwards of $200 per student, nearly doubling the price tag in some districts.

Teachers tended to get high marks on the evaluation system.

Before the new evaluation systems were put in place, the vast majority of teachers got high ratings. That hasn’t changed much, according to this study, which is consistent with national research.

In Pittsburgh, in the initial two years, when evaluations had low stakes, a substantial number of teachers got low marks. That drew objections from the union.

“According to central-office staff, the district adjusted the proposed performance ranges (i.e., lowered the ranges so fewer teachers would be at risk of receiving a low rating) at least once during the negotiations to accommodate union concerns,” the report says.

Morgaen Donaldson, a professor at the University of Connecticut, said the initial buy-in followed by pushback isn’t surprising, pointing to her own research in New Haven.

To some, aspects of the initiative “might be worth endorsing at an abstract level,” she said. “But then when the rubber hit the road … people started to resist.”

More effective teachers weren’t more likely to stay teaching, but less effective teachers were more likely to leave.

The basic theory of action of evaluation changes is to get more effective teachers into the classroom and keep them there, while getting less effective ones out or helping them improve.

The Gates research found that the new initiatives didn’t get top teachers to stick around any longer. But there was some evidence that the changes made lower-rated teachers more likely to leave. Less than 1 percent of teachers were formally dismissed from the places where data was available.

After the grants ran out, districts scrapped some of the changes but kept a few others.

One key test of success for any foundation initiative is whether it is politically and financially sustainable after the external funds run out. Here, the results are mixed.

Both Pittsburgh and Hillsborough have ended high-profile aspects of their program: the merit pay system and bringing in peer evaluators, respectively.

But other aspects of the initiative have been maintained, according to the study, including the use of classroom observation rubrics, evaluations that use multiple metrics, and certain career-ladder opportunities.

Donaldson said she was surprised that the peer evaluators didn’t go over well in Hillsborough. Teachers unions have long promoted peer-based evaluation, but district officials said that a few evaluators who were rude or hostile soured many teachers on the concept.

“It just underscores that any reform relies on people — no matter how well it’s structured, no matter how well it’s designed,” she said.

First Person

With roots in Cuba and Spain, Newark student came to America to ‘shine bright’

PHOTO: Patrick Wall
Layla Gonzalez

This is my story of how we came to America and why.

I am from Mallorca, Spain. I am also from Cuba, because of my dad. My dad is from Cuba and my grandmother, grandfather, uncle, aunt, and so on. That is what makes our family special — we are different.

We came to America when my sister and I were little girls. My sister was three and I was one.

The first reason why we came here to America was for a better life. My parents wanted to raise us in a better place. We also came for better jobs and better pay so we can keep this family together.

We also came here to have more opportunities — they do call this country the “Land Of Opportunities.” We came to make our dreams come true.

In addition, my family and I came to America for adventure. We came to discover new things, to be ourselves, and to be free.

Moreover, we also came here to learn new things like English. When we came here we didn’t know any English at all. It was really hard to learn a language that we didn’t know, but we learned.

Thank God that my sister and I learned quickly so we can go to school. I had a lot of fun learning and throughout the years we do learn something new each day. My sister and I got smarter and smarter and we made our family proud.

When my sister Amira and I first walked into Hawkins Street School I had the feeling that we were going to be well taught.

We have always been taught by the best even when we don’t realize. Like in the times when we think we are in trouble because our parents are mad. Well we are not in trouble, they are just trying to teach us something so that we don’t make the same mistake.

And that is why we are here to learn something new each day.

Sometimes I feel like I belong here and that I will be alright. Because this is the land where you can feel free to trust your first instinct and to be who you want to be and smile bright and look up and say, “Thank you.”

As you can see, this is why we came to America and why we can shine bright.

Layla Gonzalez is a fourth-grader at Hawkins Street School. This essay is adapted from “The Hispanic American Dreams of Hawkins Street School,” a self-published book by the school’s students and staff that was compiled by teacher Ana Couto.