Ah, fatal words! Too late in moving here, too late in arriving there, too late in coming to this decision, too late in starting with enterprises, too late in preparing.

These first guidelines of the CCSSO/TILSA Quality Control Checklist for Item Development and Test Form Construction should be considered in the early stages of planning, long before item writing assignments are made:
1A. Each item should assess content standard(s) as specified in the test blueprint or assessment frameworks.
2A. Items must measure appropriate thinking skills as specified in the test blueprint or assessment frameworks.
3A. Items should be written at appropriate cognitive levels and reading levels according to the item specifications guidelines.
A test blueprint identifies the skills and/or knowledge to be assessed, provides the item-to-skill distribution, and specifies item formats.
Let's think about creating a blueprint to assess writing at grade 6. We'll base the blueprint on the Common Core State Standards.
In the CCSS, English conventions are addressed in the language standards, and what we might call writing strategies and application are addressed in the writing standards. The language standards could be assessed with a variety of formats: standalone or passage-dependent multiple choice items, standalone or passage-dependent technology-enhanced items, or as one component of an extended-constructed-response item.
Here is a writing standard:
1. Write arguments to support claims in an analysis of substantive topics or texts, using valid reasoning and relevant and sufficient evidence.
a. Introduce precise claim(s), distinguish the claim(s) from alternate or opposing claims, and create an organization that establishes clear relationships among claim(s), counterclaims, reasons, and evidence.
b. Develop claim(s) and counterclaims fairly, supplying evidence for each while pointing out the strengths and limitations of both in a manner that anticipates the audience’s knowledge level and concerns.
c. Use words, phrases, and clauses to link the major sections of the text, create cohesion, and clarify the relationships between claim(s) and reasons, between reasons and evidence, and between claim(s) and counterclaims.
d. Establish and maintain a formal style and objective tone while attending to the norms and conventions of the discipline in which they are writing.
Generally the above standard would be assessed with an extended-constructed-response item, because multiple-choice items and short constructed-response items don't allow students sufficient opportunity to demonstrate the ability to "write arguments to support claims...." However, the subskills may be (and frequently are) assessed with multiple-choice items; this is more common at the district or classroom level than at the state level. You might see a question that addresses W.1.a by asking the student to choose the best opposing claim for a given argument. Such multiple-choice items may help teachers isolate specific areas in which a student needs instruction and support.
Here are language standards:
1. Demonstrate command of the conventions of standard English grammar and usage when writing or speaking.
e. Recognize variations from standard English in their own and others’ writing and speaking, and identify and use strategies to improve expression in conventional language.*
2. Demonstrate command of the conventions of standard English capitalization, punctuation, and spelling when writing.
All of the above language skills may be assessed with multiple-choice questions. These could be standalone, or could offer a stimulus: an editing passage with embedded errors. Language items were discussed in more detail in a previous post.
For our imaginary grade 6 writing test, we might decide to use multiple measures in order to obtain as much information as possible in as many different ways as we can. We'll create a blueprint that specifies a combination of item formats: x number of multiple-choice and technology-enhanced items, along with one extended constructed response to a writing prompt. This response will be scored with a holistic rubric that addresses organization, style and voice, and conventions. The test blueprint would specify the standards and subskills to be assessed, along with the number of items and item formats for each standard or subskill.
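A blueprint like the one just described is, at bottom, a small table: standards, skills, formats, and item counts. The sketch below shows one hypothetical way to represent it; the standard codes, item counts, and formats are invented for illustration, not actual blueprint values.

```python
# Hypothetical sketch of a grade 6 writing test blueprint.
# Standard codes, item counts, and formats are illustrative, not real values.
blueprint = [
    {"standard": "L.6.1", "skill": "grammar and usage",
     "format": "multiple-choice", "items": 8},
    {"standard": "L.6.2", "skill": "capitalization, punctuation, spelling",
     "format": "technology-enhanced", "items": 6},
    {"standard": "W.6.1", "skill": "write arguments to support claims",
     "format": "extended constructed response", "items": 1},
]

# The item-to-skill distribution falls out of the table directly.
total_items = sum(row["items"] for row in blueprint)
print(total_items)  # 15
```

Keeping the blueprint in a structured form like this makes it easy to check, at a glance or programmatically, that every standard in the framework is covered and that the item counts add up to the intended test length.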
In our blueprint, we may also use Bloom's Taxonomy or Norman Webb's Depth of Knowledge Guide to determine the cognitive level for each item. Although the cognitive levels of some skills are relatively simple to determine, based on what is required from students, some skills may be addressed at multiple levels of cognitive complexity.
We may instead indicate the cognitive levels, item difficulty, content or domain limits, and reading levels in the item specifications, as suggested in the CCSSO/TILSA checklist.
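One entry in such an item-specifications document might carry fields like the following. This is a hypothetical sketch; all field names and values are invented for illustration.

```python
# Hypothetical sketch of one entry in an item-specifications document.
# All field names and values are invented for illustration.
item_spec = {
    "standard": "L.6.2.b",           # spell correctly
    "format": "multiple-choice",
    "cognitive_level": "DOK 1",      # Webb's Depth of Knowledge: recall
    "target_difficulty": "medium",
    "domain_limits": "grade-level words; no homonyms or esoteric words",
    "reading_level": "grade 6",
}
```

Writing the specifications down at this level of detail is what lets multiple item writers produce comparable items for the same standard.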
In a typical statewide high-stakes assessment program, the decisions that inform the development of a test blueprint and item specifications are made by committees, which is as it should be, and committees should include classroom teachers. Committees often include other stakeholders, e.g., business leaders who may be asked to identify skills and knowledge necessary in the workplace.
Once all of that preparation is complete, item development begins.
Now let's say we've received an assignment to write those multiple-choice language items and that ECR writing prompt. We've read all of the project documentation and support materials; we have the item specifications in front of us.
It is a truth universally acknowledged that a test item should target one and only one skill or bit of content knowledge. Each item should have one big idea; every part of the item should support that focus.
If we were going to write a multiple-choice item for L.2.b, our big idea would be how to spell grade-level-appropriate words. We might write an item that looks like this:
Which word is spelled correctly?
This item clearly targets one skill: correctly spell grade-level-appropriate words. The stem tells the student exactly what to do. The item is phrased simply and concisely. The content is neutral; there are no highly charged words. All of the answer choices are grade 6 words (according to EDL Core Vocabularies); all are words likely to be known to grade 6 students and are words that are significant to academic content areas. There are no tricky, esoteric, or rare words. The answer choices appear in a logical order (here we use alpha order). All of the distractors address common spelling mistakes: using s instead of c, using e instead of a, and writing phonetically. None of the words are homonyms and so none are context-dependent; each of these words has one correct spelling.
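Some of the guidelines above are mechanical enough to check automatically. The sketch below shows a few such checks; the field names, the check list, and the sample item are all invented for illustration and are not the item discussed above.

```python
# Hypothetical sketch: automated sanity checks mirroring a few of the
# item-quality guidelines above. Field names and the sample item are invented.
def check_item(item):
    issues = []
    choices = item["choices"]
    # Answer choices should appear in a logical order (here, alpha order).
    if choices != sorted(choices, key=str.lower):
        issues.append("answer choices are not in alphabetical order")
    # The keyed (correct) answer must actually appear among the choices.
    if item["key"] not in choices:
        issues.append("keyed answer is missing from the choices")
    # There must be exactly one correct answer.
    if choices.count(item["key"]) != 1:
        issues.append("item must have exactly one keyed answer")
    return issues

sample = {
    "stem": "Which word is spelled correctly?",
    "choices": ["sertain", "certen", "certain", "surtain"],
    "key": "certain",
}
print(check_item(sample))  # ['answer choices are not in alphabetical order']
```

Checks like these can't judge whether a distractor reflects a plausible student error, of course; that part remains the item writer's craft.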
Here is a poor item addressing the same skill:
Which word is written correctly?
This item has multiple flaws. First, the big idea is not specified in the stem; the student doesn't know what to do until reading the answer choices. The answer choices are not grade-level-appropriate; "machine" is a grade 2 word, while "anchor" is grade 3. The word "rabbi" may not be familiar to grade 6 students. Answer choice A ("musheenz") is plural, while the other answer choices are singular. Answer choice A also offers mistakes that are unlikely to be made by students at the targeted grade level. The answer choices do not appear in any logical order. Finally, the correct response is a type of weapon.
As bad as this item is, though, we could make it even worse by
- increasing the reading load by burying the spelling words in sentences and offering four sentences as the answer choices;
- obscuring the targeted skill by adding in other types of conventions errors, such as mistakes in capitalization and punctuation;
- using homonyms, or words that are spelled differently depending on the context;
- using above-grade-level vocabulary.
Item writing is both an art and a science. There's so much to consider, even in writing the simplest spelling item.