I wrote the other day about the need for a Technology Reading Ease Index. I’ve been thinking about how it might work, so here are my current thoughts.
The problem I am looking to solve is that I want to be able to rate the readability of different technology vendors’ marketing copy. As I explained here, traditional readability indexes do a good job of rating the complexity of text, but they do not recognise the domain knowledge that the author expects a reader to possess in order to understand it. So I’ve decided to build something on top of the existing readability indexes that takes into account the frequency and density of the technological acronyms that are used.
This technology reading ease index will, I’m sure, go through a few iterations before it’s fit for purpose. For example, in addition to looking for acronyms, I think I could also look for specific technological terms. But for the moment I’ll simply look for acronyms.
So here is a list of design criteria for a readability index that captures the traditional readability measures (word/syllable complexity, word length, etc.) as well as the degree of technical knowledge and domain expertise required to understand a specific text.
- Must take a representative amount of text, with a minimum length of, say, 300 words
- Should count the total number of acronyms found
- Should consider the number of different acronyms that are found
- Should ignore the length of the acronym itself (I don’t believe that there is a correlation between the number of letters and the complexity of the acronym)
- Should use the density of acronyms found in the text (acronyms per 100 words or something like that) as the basis of the rating
- The index should be able to be used separately from other readability indexes as well as in addition to them
- The index should not try to compute a ‘grade years’ equivalence as other generic readability indexes do. Instead it should produce a rating that needs to be interpreted.
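As a rough illustration of the criteria above, here is a minimal Python sketch of the counting step. It is only a starting point under some loud assumptions: an "acronym" is taken to be any standalone run of two or more capital letters (so an all-caps word like NOTE would be miscounted, and a plural like "APIs" would be missed), the 300-word minimum is the figure suggested above, and the function name `acronym_stats` and its return keys are made up for this example. It deliberately stops at the raw numbers and does not attempt to turn them into a rating.

```python
import re
from collections import Counter

# Assumption: an acronym is a standalone token of 2+ capital letters.
# Length is deliberately ignored beyond that, per the criteria above.
ACRONYM_RE = re.compile(r"\b[A-Z]{2,}\b")
# Assumption: a "word" is any run of letters, digits or apostrophes.
WORD_RE = re.compile(r"[A-Za-z0-9']+")
MIN_WORDS = 300  # the "say, 300 words" minimum from the list above


def acronym_stats(text, min_words=MIN_WORDS):
    """Return raw acronym counts and density for a piece of text."""
    words = WORD_RE.findall(text)
    if len(words) < min_words:
        raise ValueError(
            f"need at least {min_words} words, got {len(words)}"
        )

    acronyms = ACRONYM_RE.findall(text)
    counts = Counter(acronyms)

    return {
        "words": len(words),
        "total_acronyms": len(acronyms),      # every occurrence
        "distinct_acronyms": len(counts),     # different acronyms only
        "density_per_100_words": 100.0 * len(acronyms) / len(words),
    }


if __name__ == "__main__":
    sample = "the API uses REST and the API returns JSON today " * 30
    print(acronym_stats(sample))
```

Because the function returns plain numbers, it can be used on its own or alongside a generic readability score, which keeps the interpretation of the result as a separate decision.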
So ideally I will reach a rating that describes the amount of domain-specific knowledge the reader would need to possess. By combining that with a generic readability index such as Flesch-Kincaid, Gunning Fog or Coleman-Liau, I will reach an overall rating for readability and domain knowledge.
The question is how to construct the result and how to interpret it. I’ll keep you posted.
P.S. The image in the top right was created by wordle.net and shows an acronym soup.