Monday, February 13, 2012

In Defense of Quality

If you cannot learn to love real art, at least learn to hate sham art.

This, from William Morris.

By “real art,” let’s say James Lesesne Wells, for example, and by “sham art,” let’s say Thomas Kinkade.

I’d like to apply this sentiment to the work of writing, and more specifically, to the work of test content development. Quite frankly, I’m mystified (a phrase I borrow from an assistant district attorney with whom I used to work when I was on a very different career path, and who was in the habit of using this phrase to sharpen his tongue as he prepared to slice me up for having done something with which he disagreed) by not only the deep and devastatingly obvious diminution of quality in test content in the past few years, but also by the failure of people in this silo of the industry to recognize this trend.

Quite frankly, it breaks my heart.  As silly as it sounds. But when you love, you expose yourself to the risk of heartbreak. Again, I turn to William Morris, who said, “Give me love and work – these two only.” I’ve been doing this work for 19 years now; though I got into it thinking it was a temporary rope to keep me out of the quicksand until I found my magic circle niche place on solid ground, I think we can all agree it’s become a long-term relationship.

If this decline in quality were limited to newcomers to the business, we could propose that them entry-level young’uns [*sigh*] are poorly educated and ill-equipped to express themselves except via texting, which you can certainly see in their editorial comments (which are lamentably rich in acronyms and emoticons, and which betray an unfortunate juvenile fondness for excess punctuation and all caps in directions, which cannot help but set one’s back up, however patient one might be, and anyone who knows me knows that overly patient I be not).

But no, all we content dev folk – ELA, math, science, and social studies, not one is immune, no, not one – have noticed, and we do talk about it, and the conversation and all the various repetitions and iterations of the same conversation bore and horrify us so that we are reduced to shaking our heads and turning our attention to some vision of an oasis, such as the cocktail that awaits the end of the day.

Back in the day, when I worked at Great Big Huge Test Publishing Company, I went to a mandatory training on root cause analysis. We used the fishbone chart. As trainings go, it was all right. Certainly better than the one at which I was accused of not doing my work and letting my teammates pick up my slack because I failed to participate in the assembling of a puzzle, which failure actually had a lot to do with my abysmally poor spatial intelligence and equally poor vision (since corrected through the wonders of Lasik surgery) and little to do with my work ethic, which, as it happens, is about as Puritan as a work ethic might be. You can take the girl out of the working class, but you can’t take the working class out of the girl. But I digress.

If we performed a root cause analysis on the wreckage of the Good Ship Quality, what would we find?

To answer that we’d have to go back to the beginning. When I started as a content editor, I was dedicated to one project. That project was my one, my only, my all in all. It was the same for my co-workers. That was the early 90s. In most states, large-scale tests were restricted to reading, writing, and math, and were administered at three or four grades (usually something like 4, 6, 8, and 10, or 5, 7, and 11).

Five years later, it was a whole different and bigger but not necessarily better ballgame. More states were testing more grades, and NCLB loomed on the horizon. As a supervisor, I was responsible for five projects. No one on my team was solely dedicated to any one project; each person, from editor to supervisor, worked on several.

I had a meeting with my manager that went like this:

Manager: [peering at her clipboard] All right, so you have State V, State W, State X, and State Y.
Me: And State Z.
Manager: Oh, I forgot about Z. Right. State Z. [scribbles a note on her clipboard]
Me: What is the order of priority?
Manager: [pause] They’re all priorities.
Me: With five states, mistakes are going to be made. It’s impossible to supervise five projects of this scale. Which state is going to be the mistake state?

Test publishing companies couldn’t handle the workload. Companies that had never done any testing smelled the money and jumped into the fray. All companies got hiring fever. By then, I was a content development manager hiring entry-level candidates at more than twice my starting salary as an editor (and did that ever sting, I tell you what).

But the equation for meeting a production deadline is

workload = workers × time

If you have less time, you need more workers. Fewer workers, you need more time. I am no math expert, but this equation I know.
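If you like your arithmetic spelled out, here is a back-of-the-envelope sketch of that trade-off (the item counts and the per-writer pace are invented for illustration, not industry figures):

```python
# Back-of-the-envelope version of "workload = workers x time":
# hold the workload and the per-writer pace fixed, and staffing
# must scale inversely with the schedule.

def writers_needed(items, weeks, items_per_writer_per_week=4):
    """Writers required to draft a fixed item count in a given number of weeks."""
    return items / (weeks * items_per_writer_per_week)

# 200 items at a (made-up) pace of 4 items per writer per week:
print(writers_needed(200, 10))  # 10-week schedule -> 5.0 writers
print(writers_needed(200, 5))   # compress to 5 weeks -> 10.0 writers
```

Halve the schedule and you must double the staff; the budget rarely agrees.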

Deadlines got more and more compressed, development cycles shrank, and everyone started skipping steps. Real training gave way to on-the-job training, which really means sink-or-swim training. New hires were handed the comprehensive binder containing lists of processes and procedures, which binders were relegated to shelves in cubicles because no one had time to read them. Early field tests were cast aside. Sometimes all field tests. There were fewer internal reviews. The few remaining reviews were performed by overworked and/or underexperienced staff—and you can actually determine which is which (and which is both) when you see the editorial feedback coming out of such reviews.[1]

Another significant factor may be a corollary to the Peter Principle. The most highly skilled, knowledgeable, and experienced line staff keep getting promoted to management, where they may be doing a fantastic job, but their spots are filled either by new hires or old hands who are left behind (how can I say this delicately? Their remaining behind may not always be by their own choice). Combine this with the absence of training, and it’s a chaos cocktail.

Not to mention the dependence on freelancers. Companies started laying people off and then rehiring them as subcontractors. For some, it’s a win-win—the company don’t have to pay your benefits, and you get to work at home in your pajamas—but it do mean there are a heck of a lot of people at their keyboards writing test questions who have experience neither in education nor in publishing, let alone assessment, which some of us choose to believe is both an art and a science.

There is value in enduring years of slogging through the entire publishing cycle from first draft through bluelines over and over and over again. There is value in having logged many hours in the company of small children struggling to read. There is value in meeting with what the industry calls the stakeholders—teachers, administrators, community leaders, DOE officials. There is value in stretching to accommodate the demands of the stakeholders. There is value in educating oneself about the history and practices of one’s profession. Those learn-to-play-the-piano-in-10-minutes books aside, there is no shortcut to attaining mastery in anything.

There are so many facets to what we do in assessment content development, and when one’s experience is restricted to one tiny mirrored triangle of the great big disco ball, well, that creates a problem because one hasn’t constructed a greater context which allows for greater meaning to inform and guide the work. When the work is simply writing questions for a paycheck and meaning goes out the window, the questions get lamer and lamer, by which I mean trivial, superficial, and plagued by error.

However, the purpose of identifying a problem is not to castigate wrongdoers, nor to enjoy that most basic human pleasure of being right, but to use such identification to find a solution.

The answers are probably as clear to you as they are to me:

1. Only hire content developers (freelance or in-house, I have no axe to grind here) who either have a proven track record of providing high-quality work or who have the capability (combination of education, writing skills, content area expertise, intelligence, creativity, and persnicketiness) to learn how to do the work well.
2. Provide not only adequate but excellent training.
3. Employ senior content development personnel [*ahem* not naming any names] to review items and provide specific instructional feedback to writers.
4. Budget sufficient time and money for the given project.

[1] Overworked but highly experienced people skim text, which forces their brains to fill in the gaps. Which means erroneous assumptions and conclusions drawn from limited evidence, resulting in unnecessary, ill-advised edits. Subtleties and fine distinctions are impossible to detect when skimming. Underexperienced people often restrict their scope of what’s acceptable to their own narrow band of direct experience, and then reject what lies in the outer darkness of their ignorance. This is bad enough on its own, but they will then assume a pedantic tone and lecture the writer for having written items that exceed the demands of the specifications.


  1. Very well said, Leslie. I'd be interested to hear more of the conversation, and what else you see as the future of testing & assessment.

  2. Thanks, Erin. More to come. Please do weigh in with your thoughts as you feel inspired.