Measure What Can Be Measured

I'm in DC for a joyous occasion, and even so--even here, in the midst of the joy, even in the thrill of riding the Metro and the suspense of wondering if this be the ride during which I become irretrievably lost in the bowels of the city--I'm thinking about readability.

In assessment publications, readability, or the grade at which we might reasonably expect a student to be able to read a given text, is determined by one or more of these measures:

Because of the Common Core Standards, one can no longer talk about readability without text complexity pulling up a chair to the table. Which is all to the good.

Here is an explanation of Lexile by Jason Turner at MetaMetrics (by posting this, I intend neither to promote nor disparage Lexile, simply to offer information). I'd like to clarify that when Turner identifies Lexile as a measure of text complexity, he means that it is a quantitative measure--there is no readability formula capable of providing a qualitative measure. 

This is such a vital point that I feel compelled to repeat that there is no readability formula capable of providing a qualitative measure. 

Readability formulae can neither interpret nor analyze meaning. No theme, motif, trope, metaphor, symbol--no beauty, no lyricism--can be interpreted or analyzed by a readability formula, which greatly diminishes the likelihood that such a formula could provide an accurate measure of literary text. Readability formulae can count.

What is counted--word frequency, word length, sentence length--may vary from formula to formula. It doesn't seem to matter much what is counted; like the yellow, green, orange, and blue lines of the DC Metro trains that all pull up to L'Enfant Plaza, the readability formulae tend to arrive at the same conclusions: "No one of the quantitative measures performed significantly differently than the others in predicting student outcomes."

These formulae, therefore, are most useful in evaluating the suitability of informational text for students at a certain grade level, much less useful in evaluating the suitability of literary text, and not at all useful for evaluating the suitability of poetry or drama.

Here is an explanation published by the Council of Chief State School Officers and the National Governors Association:

There will be exceptions to using quantitative measures to identify the grade band; sometimes qualitative considerations will trump quantitative measures in identifying the grade band of a text, particularly with narrative fiction in later grades. Research showed more disagreement among the quantitative measures when applied to narrative fiction in higher complexity bands than with informational text or texts in lower grade bands. Given this, preference should sometimes be given to qualitative measures when evaluating narrative fiction intended for students in grade 6 and above. For example, some widely used quantitative measures rate the Pulitzer Prize-winning novel Grapes of Wrath as appropriate for grades 2–3. This counterintuitive result emerges because works such as Grapes often express complex ideas or mature themes in relatively commonplace language (familiar words and simple syntax), especially in the form of dialogue that mimics everyday speech. Such quantitative exceptions for narrative fiction should be carefully considered, and exceptions should be rarely exercised with other kinds of text. It is critical that in every ELA classroom students have adequate practice with literary non-fiction that falls within the quantitative band for that grade level. To maintain overall comparability in expectations and exposure for students, the overwhelming majority of texts that students read in a given year should fall within the quantitative range for that band.

It seems clear, then, that for literary text we must rely on a second opinion. And yet, from what I see, reviewers (content editors at test publishing companies, panels of teachers convened to evaluate test materials) continue to rely primarily on quantitative formulae rather than qualitative considerations when deciding to accept or reject literary passages. Just as we can all agree that The Grapes of Wrath is hardly a book for second- and third-graders, I think we can all agree that this is a mistake.

I understand why and how this mistake is made. Unless trained in the study of literature (which is and has been a bona fide field of study for centuries upon centuries), who feels qualified to make decisions about a literary work? And yet, in the immortal words attributed to Galileo, here we must find some way to "make measurable what cannot be measured."

The good news is that there are many people who are so trained. Why not add their voices to the mix? This would surely help us remain true to the goal of creating the best fit possible between reader and text.

And now, I'd like to return to the beginning and wish Kia and Roy, the two most brilliant people I know, two who are blessed with every gift of intellect and heart, two who are not just brilliant writers but purely lovely human beings, every possible happiness. The Donne poem is from the ceremony.

The Good-Morrow
by John Donne

I WONDER by my troth, what thou and I
Did, till we loved? were we not wean'd till then ?
But suck'd on country pleasures, childishly ?
Or snorted we in the Seven Sleepers' den ?
'Twas so ; but this, all pleasures fancies be;
If ever any beauty I did see,
Which I desired, and got, 'twas but a dream of thee.

And now good-morrow to our waking souls,
Which watch not one another out of fear;
For love all love of other sights controls,
And makes one little room an everywhere.
Let sea-discoverers to new worlds have gone;
Let maps to other, worlds on worlds have shown;
Let us possess one world; each hath one, and is one.

My face in thine eye, thine in mine appears,
And true plain hearts do in the faces rest;
Where can we find two better hemispheres
Without sharp north, without declining west?
Whatever dies, was not mix'd equally;
If our two loves be one, or thou and I
Love so alike that none can slacken, none can die.

Donne, John. Poems of John Donne. Vol. I. E. K. Chambers, ed. London: Lawrence & Bullen, 1896.

NOTE: Just for fun, here are the Flesch-Kincaid measures for both the John Donne poem and this blog post.
Flesch-Kincaid measure for "The Good-Morrow": 2.1
Flesch-Kincaid measure for "Measure What Can Be Measured": 12.0
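
For anyone curious about how little a formula like this actually looks at, here is a minimal sketch of the Flesch-Kincaid grade level in Python. The weights (0.39, 11.8, 15.59) are the standard published ones; everything else is my own assumption for illustration. In particular, count_syllables is a rough vowel-group heuristic and the sentence splitter simply breaks on end punctuation, so this sketch will not necessarily reproduce the numbers above, which came from a different tool.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption, not a dictionary lookup):
    count runs of consecutive vowels as syllables."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("ride"), except in "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Naive sentence split on ., !, and ? (abbreviations will fool it).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally joined by apostrophes ("wean'd").
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


if __name__ == "__main__":
    print(flesch_kincaid_grade("And makes one little room an everywhere."))
```

Notice what the function never touches: what the words mean. It sees only how many words, sentences, and syllables there are, which is why short words in short sentences score low no matter what they say.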

NOTE THE SECOND: I listened to this bit with Martha Nussbaum on the value of the humanities and wanted to share.

