Transcription of the episode “High stakes: Teaching to the tests in K-12”

[00:00:15] Jon M: I’m Jon Moscow.

[00:00:16] Amy H-L: And I’m Amy Halpern-Laff. Welcome to Ethical Schools. Today we’ll continue our conversation with Harry Feder, Executive Director of FairTest, a national nonprofit that advocates for equitable and transparent educational testing. Harry came to FairTest after a long career in public education in New York City as a teacher, researcher, education researcher, and advocate. Last time we spoke about college admissions tests, particularly the SAT. Today we’ll focus on high stakes tests in K-12. Welcome back, Harry.

[00:00:47] Harry F: Thanks, great to be here.

[00:00:50] Jon M: How did testing and test preparation become such a huge part of the K-12 experience?

[00:00:56] Harry F: Well, testing was always there. We all took some tests in grade school and high school. There were even some standardized tests. I remember taking something called the Iowa test or something like that in the fifth grade or in the third grade just to see how we did versus everybody else. But the current emphasis on testing and high stakes testing, I would say, begins in the Reagan era, or with the Reagan era, with the 1983 report, A Nation at Risk, where there was, and this is a consistent trope: “Our systems are failing. Our kids don’t know anything.” If you do media searches going back decades, you’ll see this over and over. But this time, and this is supported by Bill Bennett, who is Reagan’s education secretary, “we have to do something to make sure that students are achieving.” And it took the next 20 years. There was something in the Clinton administration also, but this coalesced nationally at the state level in the 90s. There are some states that are adding high stakes tests. Massachusetts starts this MCAS test, which is actually now maybe decoupled, at least, from high stakes consequences. That’s a later conversation.

The law that was passed in 2000 was No Child Left Behind. This is maybe the last piece of truly bipartisan legislation passed by the Congress because it was the brainchild, or the coming together, of the pre-9/11 administration of George W. Bush and Senator Kennedy. They basically said from the left’s perspective, from the Senator Kennedy perspective, the reason to have lots of standardized tests is to shine a light on inequity in school, to show what we basically knew: that students in poor districts, students of racial minorities, underrepresented minorities, perform worse on these tests. The idea is if you make school systems raise test scores, that will cause resources to be funneled into the places where they’re needed most. So it created the system that each kid had to be tested in math and English every year from grades 3 through 8 and once in high school and also tested in science in high school, and that schools, teachers, and students will be judged on these test scores and that schools and districts need to show something called “adequate yearly progress.” And in 10 years, everybody will be perfect, which was ludicrous of course, and the problem with this was that states basically had to ramp up their testing programs. Tests started to take up a lot of time, test preparation started to take up a lot of time, and there began to be a bubbling up of complaints from teachers, from parents, particularly, interestingly, in suburban school districts. Long Island in New York is a good example of this, where parents start complaining. “Hey, what happened to school? All our kids are doing is getting ready for these tests. They spend weeks and weeks taking these batteries of tests.” And then the schools are rated and ranked based on test scores.

There were provisions then for the states to come in. A school or district that didn’t make adequate yearly progress over time could be designated as in need of improvement, and thus the state would swoop in with some improvement plan. And this was all based on test scores.

The other interesting thing this did is, because now suddenly you have this trove of data about test scores of schools and districts, in swoops the real estate industry, and now websites like Zillow and Trulia start grading. They basically take test scores, rank school districts, and say, here’s… Now, of course, as we know, it’s a self-fulfilling prophecy because the districts that are wealthy, that funnel a lot of property tax money into their school systems, have higher test scores, whether they do test prep or not. It’s a socioeconomic thing. And so that maintains the, if you will, quality and segregation of suburban neighborhoods. This is less of a problem in an urban district where all the money gets mixed together.

Now, this backlash eventually created a call for reform, and so in 2016, the end of the Obama administration, the current law, ESSA, the Every Student Succeeds Act, took away the high stakes consequences. The whole “adequate yearly progress” system got thrown out, but the mandates for each kid to be tested twice in grades 3 through 8 and then again in high school remained.

States didn’t just suddenly abandon their testing system, so standardized testing is still… There are a lot of efforts at reform, some good, some bad. There’s backlash to high stakes consequences. There was a time when 27 states had a graduation test, and now we’re down to nine, and we may soon be down to seven, which we can talk about. As with all things social policy, there’s a pendulum swing, but this is where the testing regime comes from.

[00:06:35] Amy H-L: So, Harry, how much autonomy do states and districts have in terms of how and when kids are tested?

[00:06:42] Harry F: Oh, that’s a good question. They actually had less autonomy under No Child Left Behind. They have more autonomy now, but states and districts are required to test kids once in each grade level and every kid. So you can’t do things like what the NAEP does. The NAEP is the National Assessment of Educational Progress. It’s a test that’s given nationally, randomly. So not every kid takes it. You don’t know when it’s coming either, so you can make the argument that it’s a true measure of what’s really going on. And it’s designed to show trends. I will say from the FairTest perspective, I don’t have that much of a problem with NAEP, because it’s the heat check, as you will, that’s the thermostat. If it shows that after COVID the scores went down, people engage in what we call misNAEPery. They use the NAEP scores for the wrong conclusion. No, we don’t need more testing. What we need is to get kids back in school. Obviously, Zoom school, no good. So the states have some flexibility in what they use as their math and English test. They can develop their own product as some states do. California has the California test. There were consortia that were created that were linked to the Common Core, PARCC and Smarter Balanced Assessment. Those tests still exist. Some states are still using a Smarter Balanced test for their ESSA accountability test at the high school level. About half the states use the SAT or the ACT as their accountability test. So kids have to take those tests.

The current problem with testing time is not just those summative tests at the end of the year, but now there are all sorts of what we call interim assessments or formative assessments. There are 13 states now that have gotten dispensation from the federal government to do something called through-year testing. Instead of one test at the end, you get four over the course of the year. The argument is you get more information more quickly that the teacher can use. The truth of the matter is if you’re using a commercial product produced by the NWEA… There’s a RAND study. Administrators like them, but a lot of teachers and parents and kids don’t because they can’t really use them and they’re divorced from what actually goes on in the classroom. They’re not teacher-created assessments or even district-created assessments. They’re off-the-shelf assessments. So states and localities do have lots of flexibility other than this mandate of once for every kid, math, English, three through eight.

[00:09:31] Jon M: As you’ve mentioned, things have changed from No Child Left Behind to ESSA. But in general, what would you say the consequences for students, teachers, and schools are for both low test scores and, on the other hand, what are the benefits for high test scores at grade levels?

Harry F: Before I answer that question, I just want to go back to Amy’s question. I want to add one thing to that. In ESSA, there are two provisions that are supposed to allow for assessment innovation. There’s something called the innovative assessment demonstration authority that a state can buy into and say, “We want to try something different than your standard off-the-shelf assessment.” The problem is there’s a requirement in that provision that the replacement has to be comparable. New Hampshire tried to do a statewide system of performance assessment in the beginning, and that fell apart because of the comparability requirement: they weren’t allowed to do enough of the teacher-driven, innovative stuff that makes performance assessment really work. And then there was also a political change in New Hampshire, which killed the whole thing. And so very few states now participate in that. Three, I think, because it’s too complicated. It’s too much red tape. We can’t really innovate.

But there’s also a smaller program called CGSA, which is Competitive Grants for State Assessments. The grants are small. They’re $1 million to $3 million, which for a state like New York isn’t a lot of money, but New York, for example, has used CGSA to do something called the PLAN Pilot, which is to find schools that are interested in doing performance-based assessments, career and technical education, and to have those assessments be developed in lieu of the more standard standardized test. So there’s some additional innovation that is incentivized, but it’s not enough. And the rules make it hard to do.

Now, Jon, your question, which is the consequences. So, obviously, the first consequence for a district and school is both reward and stigma, if you will. So if the test scores are low, then it’s a school with low test scores. Does that make it a bad school? Not necessarily. Right, because you have to look at the socioeconomic background of the kids. You have to look at where they are in their process, curriculum. There’s a lot of things. But once you are labeled… and this goes to what I was talking about before, it’s how the real estate sites use the test scores.

The other thing is, there’s lots of evidence that when standardized tests become too important, you wind up quote-unquote “teaching to the test,” and that crowds out the subjects that aren’t subject to standardized tests – social studies, history, art, music, all those things. And this was another complaint in the No Child Left Behind days, that in particular, in schools of more limited means, there goes the art teacher. Why? Because we need another reading specialist. Now, kids need to know how to read. I’m not suggesting that that’s not important. It’s very important. It’s crucial. But there was definitely a sense that the things that weren’t tested in that way weren’t counted and thus didn’t get any attention. And that’s just bad for education generally.

And how you perform on standardized tests is not indicative of your overall capacity and intelligence. We have become a system where success on these tests has become the coin of the realm for being deemed smart or dumb, and that has very deep consequences for the psyche of a student. It also creates a lot of pressure on teachers and schools to get kids to do well. Not surprisingly, we got cheating scandals and things like that, like the one in Atlanta in the beginning, because of the pressure. What is it, Campbell’s law, right? If you attach lots of consequences to something, you are essentially going to incentivize gaming systems. And just recently, there were AP tests floating around for sale. Why? Because AP test scores have become really important for the college admissions process. So, as soon as you attach those consequences, you’re basically saying we’re not doing education for the sake of education. We’re doing education for the sake of ranking and sorting you. And that’s consistent with the ethos of post-Reagan America. And this is where we see it with young people.

[00:14:23] Amy H-L: How does this impact teaching as a profession? Both teacher satisfaction and retention?

[00:14:29] Harry F: Yeah. Low satisfaction and low rates of retention. Standardized tests done in this way deprofessionalize teaching. The next thing is you get scripted curricula. You’re telling a teacher that teaching is no longer a craft or a profession, but rather you are somebody working on the assembly line. At its most dystopian place, it becomes a Tayloristic view of education. Because if you ever walk into, as I had the “pleasure” of doing, a no-excuses charter school, I won’t name the brand, the class time is literally scripted by the minute. Straight out of Henry Ford and Frederick Winslow Taylor, efficiency models for the industrial economy. Why? Because the end product is we have to get those test scores up. And the way to do it is to regulate students in this way that is very controlled. It’s interesting. Lenin was a big fan of Frederick Winslow Taylor because it was a way to industrialize. It is man as machine, as part of this collective, so we have to make sure that you play your part. And that’s the part of teachers. There is very little discretion. There’s also very little trust. Part of the problem with testing being so central to the education story is we do it in part because we don’t trust teachers to do their job. All professions are regulated. Lawyers are regulated. Doctors are regulated. You have to get certifications. But it’s different because we don’t micromanage. Now, there are some elements of ranking hospitals. But your blood pressure is a real number that indicates physical forces. A test score does not give you a magic window into everything that a kid is doing. It’s different. It’s palpably different.

[00:16:32] Jon M: I wanted to go back just for a second to the consequences question. And I don’t know if this is specific to New York, or perhaps to some of the other big cities or not, but if you could talk a little bit about the ways that test scores are used for middle school and high school admissions.

[00:16:53] Harry F: Well, that’s an admissions question. When test scores become the coin of the realm for middle school admissions or high school admissions, you wind up getting segregation. And that’s been true in New York City. It’s true in Chicago. And it’s not just that residential segregation breeds education segregation. As soon as you create a gifted and talented school, lo and behold, that school becomes wealthier, whiter, all of those things. So that’s one thing, and there is another argument. Do we really want to separate people in that way? Can’t you have an integrated school, integrated in ability as well, and just have different differentiation for different kinds of students? I confess, I am a product of a standardized test in some ways, because I went to the Bronx High School of Science, which you can only get into by scoring X on a test. Now this was a long time ago. And the way we viewed these things was different than the way we view it now. I barely prepared for that test, number one. And now there’s an entire testing industry that floats around, Kumon or whatever else it is, where kids basically engage in test prep to get into these schools, both middle school and high school. The other thing is, I recognize that while I was lucky, there were a lot of people who would have done great at Bronx Science, but they didn’t have the test score. They might have shown their intelligence in lots of ways other than a timed test. There are people who test well, and there are people who don’t test as well. Human intelligence, I believe, and I guess that’s why I do what I do, should not be reduced to a single instrument. Now, it may be that, if you’re going to Bronx Science, you need to have some minimal proficiency in the ability to do some mathematics. Okay, fine. But you have to look at other things as well.

So I think that using it as a sole instrument in particular, or a major instrument, has wound up creating both segregation and a lack of opportunity for lots of people who would benefit from the opportunity. We view seats in middle school and high school, which are certainly products of socioeconomics, how you’re doing in the 4th grade, or personal chemistry even, I don’t know, we view that as being already this scarce resource, which is preposterous. We talked about that in our college admissions discussion, but it’s even worse at this stage because everybody needs an adequate education and algebra or whatever, even if you’re never going to learn calculus.

[00:19:51] Jon M: You’ve written and spoken about the relationship between commercial testing companies and textbook companies. Would you explain that?

[00:20:01] Harry F: Sure. There are synergistic things that go on with tests and then materials for the test. Just use the Common Core as an example. If Smarter Balanced has this contract with Pearson, right, then okay, here are the tests. And now, oh, if you want the kids to do well on the test, here is the book that goes with the test. It’s obvious, I think, because assessment drives instruction, assessment drives curriculum. What you test, you have to teach or try to teach. And now we have computer technology. There are standardized tests that are what is called iterative, that depending on whether you get a question right or not, you basically get one question versus a different question. So it calibrates levels of difficulty. And so those companies – Cambridge Associates, Pearson, NWEA – they’re making a lot of money off this stuff. And what they’ll say is, okay, in order to get kids to do well, we also have this packet of materials on how to teach fractions. So here, use this, too, and districts buy lots of curriculum and textbooks writ large. The word textbook doesn’t necessarily mean that big fat book anymore. There are a lot of materials, if you will, but they’re commercially produced and generated. And you could say, well, a teacher can do what she wants with that stuff, going back to your deprofessionalizing question. You’re basically telling the trial lawyer, here’s the script for the trial. We’ve tested it out. This works. But there’s a jury there of humans, and the good trial lawyer knows that I maybe need to leave some things out for this jury. So, yes, the teacher still has the capacity to monkey around with the materials.

But as soon as you say there’s a test at the end, and our results are going to matter because the principal has a meeting with the superintendent, and then the superintendent has a meeting with their supervisor, and then we have to show our test scores to the Board, and all the people in the district who own homes want their property values to be maintained and thus they need to see high test scores, you see where the incentive structure goes. It’s not, “What do you feel like learning today, Jonny?” That’s not part of the equation. Jonny may not be in the mood for fractions on Monday. Jonny may be in the mood for a science experiment. I don’t know. I’m not saying that there isn’t stuff you have to teach and all that, but the craft, the profession, I’ll use the word profession, of teaching gets undermined when the bottom line…

What is the purpose of a corporation? To make money for its shareholders. Nothing else. We can talk about the ethical obligations of a corporation, but under the law, if the corporation is not making money for its shareholders, it’s violating its fiduciary duty and can be sued under securities law. It’s called shareholder derivative actions. I used to defend them sometimes. And that’s what we’ve created here. We’ve said if you don’t get the test scores up, you fail. But a school is not producing pints of Ben & Jerry’s ice cream. It’s different. First of all, you don’t get to pick your ingredients. They’re given to you. And it’s about human development, which is a much more complex process than is both captured and incentivized by standardized tests.

[00:23:42] Amy H-L: You have a fascinating paper on the FairTest website about PISA, the Program for International Student Assessment. The line that really stood out to me is “America’s problem on PISA is poverty and inequality, not curriculum and instruction.” Would you explain what it is that PISA tests and why US scores haven’t been higher?

[00:24:05] Harry F: Well, first of all, if you factor out poverty, the US does pretty damn well. I think the paper makes that point. If you compare our 10 percent of socioeconomics with the equivalent socioeconomic strata in Sweden, which is more like 80 percent of the population, we do just fine. We’re in the top five percent of everything. I will say this about PISA. PISA itself, if you listen to Andreas Schleicher lately, who’s the OECD [Organization for Economic Cooperation and Development] dude who oversees PISA, they’re moving in a direction to try to measure things that aren’t classically standardized test stuff. They’re trying to measure things like creativity and deeper thinking and problem-solving. It’s hard to do in a standardized setting, but they have a new part of the PISA in which the US didn’t participate because our school systems are so diffuse. It would be impossible to say the US does X. We’re just too big and diverse. Even a place like France, if you walk into an Ecole in Lyon and an Ecole in Paris in the same week, you’ll probably hear a similar lesson because there’s a mandated curriculum. But PISA generally measures math, English, and, I think, science. And I think there’s a social studies thing too. It’s a standardized test. The two shining stars on the PISA tests have been Singapore, where everything is small and controlled, and interestingly, Finland, where the teaching profession is held in tremendously high regard. To be a teacher is to be a university professor, to be a doctor, or to be a lawyer. And they actually don’t start kids in school till they’re 7 and there’s all sorts of… but Finland’s scores have been going down. Why? They’ve been going down since everybody got an iPhone. My friend Sam Abrams has written about this.

[00:26:09] Jon M: We’ve interviewed Sam.

[00:26:10] Harry F: Oh, good. We taught together, same school, and he’s wonderful. I’ll just put in a plug. He’s just moved to start a new Institute in connection with NEPC at University of Colorado, Boulder, the International Partnership for the Study of Educational Privatization. He’s leaving Teachers College to do that. So you might want to talk to him again soon. So that’s the argument, that our scores are low because of poverty and inequality, because the poor kids don’t do as well on these tests. And we’ve got a lot of poor kids.

[00:26:36] Amy H-L: So what does FairTest propose as an alternative to the current testing system in K-12?

[00:26:42] Harry F: Yeah, so a couple of things. We’re big proponents of what I would call broadly performance-based assessments, ways that kids can demonstrate their learning through problem-solving, designing experiments. Sometimes it’s called project-based learning. Now, that’s not to say they don’t take tests and quizzes and learn facts or things. It’s just how do they demonstrate their knowledge in a larger way. And there are ways where you can standardize the skills and the levels that you are assessing through well-designed rubrics. It’s more difficult than a fill-in-the-bubbles test to do that. And teachers and schools can take the overall standards and rubrics and design things that fit the needs and levels of their kids and allow them to succeed and demonstrate what they know. And there are lots of places that do this. New York has a group of schools called the New York Performance Standards Consortium. You can go to their website; they have very well-established rubrics in all the major disciplines. Kids have to present. Another thing that it assesses is oral skill, which doesn’t get assessed on any of these standardized tests, and which last time I checked was a big part of life. So there’s that element to it.

Massachusetts has a group of districts called the Massachusetts Consortium for Innovative Educational Assessments. They’re developing common tasks. They have this task bank. Oh, you want to do 5th grade science? Here’s something that was developed in Lowell, Massachusetts on how to test the porousness of rock. You could do that. Anaheim, California is a district that has committed itself to performance-based assessment. New Mexico, Colorado. There are places that have graduation capstones which assess kids on their final project. So that, I think, should be the major part of educational assessment.

The other thing I would keep is NAEP so we can make sure that districts and schools are actually capturing skills, et cetera, et cetera. In the earlier grades, you still want teachers to use standardized tests to, let’s say, determine whether a kid is dyslexic. In a diagnostic way, I think standardized tests absolutely have a role there, done selectively and judiciously. Under no circumstances should there be high stakes consequences attached to any of this, and I think that’s the bottom line. That’s when we misuse standardized tests as opposed to using them as diagnostic, evaluative, and thermostat-like tools. I think we also have to change ESSA, and there’s a bill out there that hasn’t gotten anywhere, the More Teaching, Less Testing Act, which would basically change the formula to allow states to do matrix sampling, not test every kid every year. So just reduce the amount of testing, but still get the information that you need to see how different subgroups are doing and to maintain the sunlight on the inequalities.

[00:30:01] Amy H-L: Thank you, Harry Feder of FairTest.

[00:30:04] Harry F: Thanks guys. Pleasure to be here.

[00:30:06] Jon M: And thank you, listeners. Check out our new video series, What Would YOU Do?, a collaboration with Professor Meira Levinson of the Harvard Graduate School of Education and EdEthics. Go to our website, ethicalschools.org, and click video. The goal of the series is not to provide right answers, but to illustrate a variety of ethical viewpoints.

If you found this podcast worthwhile, please share it with a friend or colleague. Subscribe wherever you get your podcasts and give us a rating or a review. This helps others to find the show. Check out our website for more episodes and articles and to subscribe to our monthly emails. We post annotated transcripts of our interviews to make them easy to use in workshops or classes.

Contact us at hosts@ethicalschools.org. We’re on Facebook, Instagram, and Threads. Our editor and social media manager is Amanda Denti. Until next week.
