Weinberg: Myths about standardized testing

Steven Weinberg, a retired Oakland teacher and regular Education Report blogger, tells us what standardized tests can’t measure, in his view, and why. -Katy

In my last 10 years working for Oakland Unified School District, I spent considerable time investigating the California Standards Tests and their results to help my school make sense of the data the tests generated. During that time I became aware of a number of myths that have been built up about these tests, many propagated by the state or the test makers themselves.

Knowing the facts about these tests is important for drawing reasonable conclusions from their results and for making sound educational decisions for the future.

I know that most readers of this blog are already fairly sophisticated about the nature of standardized testing, but the results of these tests are so often misused that it is worth taking some time to review these misconceptions.

Myth 1: The California Standards Tests (CSTs) measure what teachers are supposed to teach.

Fact: Not everything measured on the CSTs is included in the standards. While the CST questions are linked to standards that they purportedly cover, they often require students to know information that is not included in the standards. Many test questions include above-grade-level vocabulary. Others require background knowledge, not included in the standards, that only some students will be familiar with because of their family backgrounds and life experiences.

In addition, important sections of the California Standards (e.g. oral language, writing at most grade levels, critical thinking, being able to conduct experiments) do not lend themselves to standardized testing, so those standards are never tested at all.

Myth 2: The CSTs were designed and scored by the State of California Department of Education.

Fact: The CSTs are written by out-of-state private testing companies under contract with the State Department of Education. The tests are even shipped out-of-state to be graded. Furthermore, these companies have often helped write the laws which control the bidding by which the contracts were awarded, thus helping them maintain a monopoly on test preparation.

Myth 3: The CSTs are carefully prepared.

Fact: The CSTs, like most of the standardized tests created in recent years, were constructed under timelines enacted by politicians who did not understand what was required to produce a quality test. The New York Times investigated this problem and concluded the tests were thrown together to meet political demands. The California High School Exit Exam, for example, had “impossible, unrealistic time lines” according to a testing company executive familiar with the process quoted in a New York Times story.

Myth 4: The CSTs are carefully examined to avoid errors.

Fact: Every year the CSTs contain mistakes, but the confidentiality rules required of anyone involved with the testing process prevent those mistakes from being publicized, or even reported. I have personally tried on five occasions to notify the state of errors on the test and have been told each time that I could not report them, even to the State Department of Education itself, without violating the confidentiality rules. Since the state refused to let me tell them where the mistakes were, some of those errors were repeated in the following years.

Myth 5: Individual student scores accurately measure a student’s achievement.

Fact: The tests are designed to give a broad picture of student achievement across a large number of students, not to be an accurate measure of any individual student’s achievement. Test makers expect that all students will guess at the answers to some questions and that the scores of those who make lucky guesses will be balanced out by the scores of students who make unlucky guesses, but this only holds true if the group is large enough. That is why the test scores for groups of students under 50 and schools smaller than 100 are not given the same weight as the scores for larger groups. The state and the testing companies state that no educational decisions should be based on a single test score. The CDE information booklet for school districts and staffs says the tests may be used to “help make decisions about student placement, promotion, retention or other considerations related to student achievement. These test scores should never be used by themselves to make…important decisions.” (p. 13, emphasis added)
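
The arithmetic behind the group-size point can be sketched roughly as follows. This is only an illustration, not the state's or the test makers' actual method: the 20 guessed questions, the four answer choices, and the assumption that every student guesses blindly and independently are hypothetical figures chosen to show the pattern.

```python
import math

def guess_noise(num_guessed, num_choices, group_size=1):
    """Standard deviation, in questions, of the luck component of a
    group's average score when every student guesses blindly on
    num_guessed questions with num_choices options each.
    Assumes (hypothetically) that all guesses are independent."""
    p = 1.0 / num_choices                         # chance a blind guess is right
    per_student_var = num_guessed * p * (1 - p)   # binomial variance for one student
    return math.sqrt(per_student_var / group_size)

# Hypothetical figures: 20 guessed questions, 4 answer choices.
for size in (1, 50, 100, 1000):
    print(size, round(guess_noise(20, 4, size), 2))
```

Under these assumptions, luck alone moves a single student's score by about two questions in either direction, while for a group of 50 the average moves by only about a quarter of a question, and the effect keeps shrinking as the group grows.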

Myth 6: Measuring the change in test scores of a group of students as they move from grade to grade is a good way to measure teacher or school quality.

Fact: The CSTs were not designed to allow for accurate comparisons from one grade level to another. The state itself says: “The CSTs should never be used for the following purposes: To monitor the progress of cohorts of students as they move through the grades. (Differences in state content standards tested between the grades, differences in performance level setting, and other factors prohibit cohort tracking with CST results.)”

Over the years, a consistent statewide dip in test scores in certain grades indicates that some tests (eighth-grade English, for example) are significantly harder than the tests taken the year before or the year after, so teachers of that grade will always appear to be doing a poor job. In spite of this fact, Oakland Unified has included these types of comparisons in its school rating procedure. Even worse, performance-pay advocates are advancing plans that ignore the limitations of our present tests and attempt to tie teacher pay or job retention to changes in student test scores from one grade to another.

I will have more to say about the limits and defects of these tests in future postings. Meanwhile, others might have their own comments to add.

Katy Murphy

Education reporter for the Oakland Tribune. Contact me at kmurphy@bayareanewsgroup.com.

  • Nextset

    I agree with Mr. Weinberg on all of this.

    It is insane to try to tie teacher pay to the testing. The tests only reflect the quality of the students who took them; they cannot “score” the quality of the teaching they recently received.

    To the extent they are time pressure tests they are heavily influenced by the cognitive power of the student. Brighter student, faster processing speed, better score. You can do the same thing with the DMV testing. Yes you do need to have a clue about the subject matter. But given similar exposure to the material, these tests quickly separate the brights from the dulls. Public School Teachers are currently not specifically paid based on whether they are assigned bright or dull students.

    Some people want to change that. They want to tie pay to student scoring. Maybe this will happen. Maybe I could make an argument for or against this.

    The issue reminds me of the Auto Insurance Proposition on the ballot. It will cut prices to the professional class & upper/middle class that maintain continuous auto insurance, while really screwing ($1000 yr increase?) the proletariat who only periodically carry auto insurance. It will probably pass. Proles don’t vote and this is a light turnout dominated by people who do maintain continuous coverage who want the lowest prices. What to do??

    Do we try in this society to maintain policy to keep the proles afloat or do we just cut them adrift and let the chips fall where they may?

    This really does tie into this thread. Computer assisted testing and statistical/actuarial scoring is now the primary way we quickly identify the proletariat and apply the whip. This article says what the tests don’t measure – what it isn’t saying is what the tests do measure quickly, cheaply and accurately.

    Prole status.

    Brave New World!

  • Peter


    Yes, the tests are sometimes misused. However, I use them, along with the grades that our children get. Among other things, they are a bulwark against grade inflation, and a way for prospective parents to assess a school’s student body.

    Isn’t it useful to have a common standard? Isn’t it true that, on average, the CST/STAR show that the better students do better on the tests than the struggling students? And if there’s a major disconnect, then there should be a conversation between parent and school?

    What do you conclude, Steven? Do away with standardized tests? Adopt a national standard? Other?

  • Sue

    Every time the subject of testing comes up, I get another good laugh. All the way through my schooling – from kindergarten to completion of my college degree – every single teacher would have been thrilled to have their pay tied to my test scores. If their pay was tied to my classmates’ scores, not many teachers would have been so happy about it, though.

    My older son passed the CAHSEE with nice scores, 427 on math and 409 on English, but his CST testing was always at/near the bottom. He was allowed accommodations for the CAHSEE, but not for the CST. If the instructional aide who supported his CAHSEE testing could get his pay tied to that CAHSEE score, it would probably double. And older son had a 3.67 GPA in his junior year and a 4.00 in his senior year of high school.

    Younger son – he takes to testing like I did, and his teachers would probably also be thrilled to tie their pay to his scores. But, his last two report cards have had multiple D’s and F’s, a few C’s and maybe a B in P.E. He’s likely to be in summer school, or he’ll be repeating 7th grade next year.

    Kids’ abilities are all over the place, stronger or weaker in different areas. Testing only gives a snapshot of one skill – test-taking – on the day the test was administered. Some people just have a gift for taking tests (I took CLEP tests for 30 hours of general requirements for my degree – and generally spent only a couple of hours preparing with a basic text on each subject), and some people have other skills. Those other skills are usually much more valuable in the real world.

    But why should the state, or the legislators, and especially the test creators’ profits, be affected by the truth about testing?

  • Hot R

    So? We are way past challenging the efficacy of standardized tests. Every federal and state program is geared to closing the achievement gap based on standardized test scores. The tests are imperfect, but not useless. What do the SATs measure? They were originally designed to keep Jewish kids out of Ivy League schools… Nonetheless they are the single most important test young people take. What do AP exams measure? They consist of tests never actually given in a college classroom, but for which a high school student gets college credit. CST tests have shown that Oakland is far behind neighboring districts. Gee, do you think Steven is trying to show that the tests aren’t well designed and thus detract from the value of their measurement? Despite Weinberg’s claims, each standard measured by the CST is “knowable” and thus teachable, as each California textbook is geared to this at each level, and test taking can be “taught.” The only myth is why this is still controversial.

  • Sue

    Fortunately, Hot R, we aren’t “past challenging the efficacy of standardized tests” – not yet, anyway.

    Older son took the SAT because his college of choice required it, but his score had zero influence on his college admission – again, fortunately, since we didn’t have the time to arrange for Spec. Ed. accommodations (getting college testing accommodations requires jumping through a *LOT* of hoops!), and his SAT score without accommodations was 740.

    My point is that we do look at tests, but they aren’t the only criterion used when evaluating a student’s potential.

    If test scores were a reliable predictor of future success, I’d be the CEO of the (major financial firm) company I work for, and the CEO would probably be sitting where I am, making only a five-figure annual salary.

    There are *some* test-taking skills and strategies that I could teach anyone. But there’s also a lot that just “happens” because of the funny way my brain is wired. Someone whose brain isn’t wired the same way can never learn that part – any more than a person who is colorblind can learn to distinguish red from green.

    Yes, those of us with good test-taking abilities can use our talent to our advantage. But if I could trade my testing-ability for my younger sister’s people-abilities, at least for a few days, we’d both be happier. I’d be able to negotiate the salary I’d like to have, and she’d be able to pass her Professional Engineering Exams. Passing those exams would raise her salary, too, since her employer uses a different pay scale depending on PE certification status.

    Hmmm – if she and I were both dishonest folks, I’d probably offer to take the P.E. exam for her. I don’t have even a bachelor’s in engineering, let alone her Master’s degree, but I think I could spend a couple of days with her engineering textbooks, and then get a passing score on the exam. I’m just wired that way.

    Heck, if I were that sort of person, I could have made a good income for a decade or so taking SATs and other sorts of tests for rich kids who didn’t want to have to get up so early on a Saturday morning, but wanted a good enough score for admission to their Ivy League schools.

    Colleges know about people like me – that’s why they look at more than an SAT score. But it’s awfully funny (to me, anyway), how often people who aren’t as smart as college admissions staff, will let a test score overrule their better judgement. I’m not complaining, of course, just laughing!

  • Nextset

    Hot R: How was the SAT designed to keep Jewish kids out of Ivy League??

    Jewish kids have a (huge) advantage on tests such as the SAT. They did start to overwhelm the Ivy League schools after WWII to the point that the schools put in place Jewish Quotas. That lasted about as long as it took for the name changing to take effect (John Kerry, Madeleine Albright, etc etc…).

  • Steven Weinberg

    Peter, You raise some interesting points. As a parent I used standardized test scores in just the way you described. I was happy when the results confirmed my impressions that my sons were doing well, but I don’t think I ever learned anything meaningful from them.
    There is a place for standardized tests, and that is when they are used as part of a diagnostic process for helping a student with learning disabilities, but the battery of tests needed for that process is far more intensive than the CSTs, and those tests require a trained person to analyze them.
    You mention the tests being a useful way for parents who are trying to pick a school to assess the student bodies of various possibilities. You could probably get all the information you need by just looking at the demographic section of the API report and ignoring the scores completely. That gives you plenty of information about the student body. However, no parent should select a school without visiting first and observing the classes.
    Finally you asked what I would do about standardized tests. First, I would try to make sure that everyone understands their limits. That is the main reason I wrote this article. Secondly, I think any standardized tests that are used need to be error-free, and the state needs to allow teachers or students who discover errors to report them. Thirdly, the high stakes connected to these tests need to be scaled back and a more holistic approach taken to evaluating schools and students.

  • Steven Weinberg

    Hot R, you are correct that standardized tests are being used widely by the state and federal government, but that is hardly a reason to stop challenging their efficacy, and I am certainly not alone in doing so. Daniel Koretz, Professor of Education at Harvard, explains in his excellent book “Measuring Up” that when standardized tests are used for high stakes measures they lose their validity, and he presents data to support that assertion. Debbie Meier, founder of New York’s famous Central Park East School, has a chapter in her book “In Schools We Trust” entitled “Why Tests Don’t Test What We Think They Do.” Todd Farley, whose book “Making the Grades” I reviewed on this site earlier this year, describes the ridiculous lack of professionalism in the standardized test grading industry.
    In future postings I hope to show you some of the questions from the CSTs (I am, of course, limited to those questions the state has seen fit to release, and they generally don’t release their biggest screw-ups), and then you can tell me if everything on them is “teachable.”
    Finally, my motive in criticizing these tests is not to defend Oakland Unified. The problems of this district were well known before there were any CSTs. I dedicated 40 years of my life to improving education in Oakland, and the sad truth is that the CSTs, which have taken so much of the time and energy of Oakland teachers, have not helped to make those improvements.

  • David B. Cohen (http://accomplishedcaliforniateachers.wordpress.com)

    Excellent review of the issues, Steven. The chorus of testing experts, social policy experts, educational research and measurement experts, economists and teachers couldn’t be much clearer on this issue, and yet politicians bow to the will of the people. We have to keep shouting, loudly, until politicians begin to hear us above the sound of those who substitute intuition for facts.

    Regarding the errors and defects on the tests themselves, the confidentiality agreement is essentially coercive. Those of us closest to the situation are required to carry out the testing, required to sign agreements, and prohibited from airing the truth that would reveal more convincingly how poorly some of these test items are designed.

  • Steven Weinberg

    I appreciate the praise from David Cohen. He is a National Board Certified Teacher from Palo Alto High School. He and 2009 California Teacher of the Year Alex Kajitani wrote an excellent column for the Sacramento Bee on the same issue. Here is the link: http://www.sacbee.com/2009/09/03/2156879/david-b-cohen-and-alex-kajitani.html

  • Fancy_Nancy

    Thanks for the education campaign…

    If enough people would put their actions where their mouths are perhaps it would be easy to “shut up” and “put up” like forfeiting taking any Federal or State money as payoffs to maintain the “High Stakes Testing Economies…” that have supported many people’s current and previous (6 Digit) salaries, including yours…The same people who hide behind Non-Profits providing “PD” and other items, purportedly, as well as the people with the authority to issue such “professional services contracts…” and other “contracts…” who are no better than any other quasi-Government contractor on the hustle for kickbacks or other things that never are prosecuted or otherwise enforced…

    “put-up…or SHUT-UP!!!” Oh…I forgot, how unprofessional in the same circles of people on the hustle…

  • Jenna

    Mr. Weinberg: I agree with you that testing is not the answer. However, I do not know what to do about teachers who use the tests to teach content and who have stated they would not teach the curriculum without the tests.

    Specifically, we have elementary teachers who would not teach children to read until they are 8 years old because they personally believe that is the age that is most developmentally appropriate, yet in third grade classrooms, a student who cannot read at age 8 is at a disadvantage for learning the material that needs to be learned.

    My older son’s science teacher personally believes that too much is required to be taught in a year. He only teaches the content because of the benchmark tests.

    What would you do to ensure that ALL students are taught the content, not just what a teacher feels they should teach? I am not worried about my sons, as I will send them to summer, fall and spring camps to make up the deficit, but I am worried about the parents who cannot or will not make up the deficit.

  • Steven Weinberg

    Jenna, As I have said in a different posting on this site, I do believe that there are too many standards at each grade level to cover well and reducing the number of standards would improve education for all students. (http://www.ibabuzz.com/education/2009/11/06/too-many-standards-too-little-time/)
    I am not an expert on elementary level benchmark tests, but I am pretty sure that even without those tests a teacher would not be allowed to skip teaching reading in the first or second grade.

  • Ms. McLaughlin

    Mr. Weinberg, I am so glad you raised the issue of the tests being thrown together in a hurry. When teachers find errors in the booklets, my suggestion would be to write the CEO of Educational Testing Service, which publishes the tests:

    Kurt Landgraf
    ETS Corporate Headquarters
    Rosedale Road
    Princeton, NJ 08541 USA

    Send your feedback via paper snail mail, and mark it urgent.

    Publishing companies exist to make money. It is true that some of them are lobbying heavily for all these tests. (Be on the lookout for new ones that you haven’t heard of yet.) The overall market for bound books is shrinking, so expanding the testing market is one of the publishers’ remaining avenues for corporate growth.

    And since many publishing companies are seeing decreased revenues, they will indeed cut corners on time, fact checking, and other editorial necessities if nobody demands better quality. They also know that the students are a captive audience who may not notice any errors and probably won’t stand up for their own testing needs. It’s up to us to advocate for our kids and keep the publishers honest.

  • Steven Weinberg

    Based on my discussions with the California Department of Education, I’m afraid they would consider that a violation of the security agreement teachers sign, and might take punitive action against the teacher.