A couple of days ago, Greg Lanier posted an application of quantitative analysis to the acquisition of Greek vocabulary for New Testament reading. He runs through some statistics about the number of words in the GNT, the number of hapax legomena, the number of words one has to acquire to reach 80% of the word occurrences, etc. It is all very interesting and informative, and there are graphs. I love graphs! His point is to encourage students to focus on learning core vocabulary. His numbers and graphs help to make this task less daunting than it might appear at first. His summary is thus:
- Investing significant time up front to acquire the 50+ words pays significant dividends.
- There is a tremendously “long tail” whereby going from 90% to 100% of words by occurrence (that is, ability to read that percentage of the NT without relying on a dictionary) requires learning >80% of the total vocabulary! That’s a huge step!
- Perhaps the best idea is to focus on the 882 words that get you to 90%.
882 words doesn’t seem like all that much . . . if that were all there was to it! I always think I am giving students too rosy an outlook when I tell them how few words they have to memorize to read a majority of the NT. Inflected languages actually require students to memorize a handful of words when they are learning “one” word. For instance, it feels dishonest to tell students they only have to learn one word in the definite article to have nearly 20K (19,867 to be exact) word occurrences under their belt. That one little word actually has 17 different forms, 17 different words to memorize, if we’re being honest. And some of those words pull double duty (or triple, in the case of plural genitive).

It sounds less daunting to think we only have to memorize 882 words to get to 90% of the GNT, but in reality it is much more than that.
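For anyone who wants to check the count of 17 without a grammar in hand, here is a minimal sketch in Python. It assumes the standard paradigm of the article (four cases, three genders, two numbers) as found in any introductory grammar; the names `SINGULAR`, `PLURAL`, and `tally_forms` are my own inventions for illustration, not anything from Greg's post.

```python
import unicodedata
from collections import Counter

# Paradigm of the definite article: case -> (masculine, feminine, neuter).
# Typed in from the standard paradigm; an assumption of this sketch.
SINGULAR = {
    "nom": ("ὁ", "ἡ", "τό"),
    "gen": ("τοῦ", "τῆς", "τοῦ"),
    "dat": ("τῷ", "τῇ", "τῷ"),
    "acc": ("τόν", "τήν", "τό"),
}
PLURAL = {
    "nom": ("οἱ", "αἱ", "τά"),
    "gen": ("τῶν", "τῶν", "τῶν"),
    "dat": ("τοῖς", "ταῖς", "τοῖς"),
    "acc": ("τούς", "τάς", "τά"),
}

def tally_forms(*paradigms):
    """Count how many paradigm cells each spelled form fills."""
    counts = Counter()
    for paradigm in paradigms:
        for cells in paradigm.values():
            for form in cells:
                # Normalize so visually identical accents compare equal.
                counts[unicodedata.normalize("NFC", form)] += 1
    return counts

counts = tally_forms(SINGULAR, PLURAL)
total_cells = sum(counts.values())
distinct_forms = len(counts)
shared = {form: n for form, n in counts.items() if n > 1}

print(f"{distinct_forms} distinct forms fill {total_cells} paradigm cells")
print("forms doing double or triple duty:", shared)
```

The tally treats a "word" as a distinct spelling, which is exactly the sense I am arguing a student experiences it in. Counting by paradigm cell instead would make the number larger still.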
I’m by no means a statistician and I do not claim the same abilities in quantitative analysis as Greg. So you will not be getting any charts from me. But I want to explore my point a bit further. Using Greg’s post, the list provided by the Institute of Biblical Greek, a calculator, and a grammar to check my paradigms, I want to take a look at the 20 “unique” words that occur 900 times or more in the GNT.
![Screen Shot 2015-01-29 at [Jan 29] 9.04](http://www.runningheads.net/wp-content/uploads/2015/01/Screen-Shot-2015-01-29-at-Jan-29-9.04-1024x486.png)
According to the traditional spiel about Greek vocabulary learning, one would only need to memorize 20 words to account for 64,486 (somebody check my calculator work!) word occurrences out of a total of 138,150 words in the GNT. That’s 20 words for over 46% of the GNT! Sounds pretty good, doesn’t it? Of course, you would have 5400 more words to learn to acquire the roughly 53% outstanding, but let’s focus on the surmountable part for now. Here’s the rub, though. As I said earlier, there are actually 17 different, truly unique words that make up what most vocabulary lists put down as only one word in their listing of the definite article. In fact, if I am counting right, there are only 9 words in this list of 20 that do not inflect, decline, morph, augment, presto-change-o somehow. Wait, strike that. There are a few of those prepositions that will end differently depending on the word they precede. When I set out on this blog post, I had in mind to try to count the actual number of words hidden away inside this list of 20, but the idea of trying to count all the verbal forms of a couple of the words overwhelmed me, and I have actual editing work to do today. So, I never did make use of the Greek grammar I had on my desk in order to count all the different forms/words. Still, the point I want to make explicitly now is that learning enough Greek vocabulary to account for 90% of the GNT is a good deal more than learning 882 words.
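Since I asked somebody to check my calculator work, here is a small sketch that does the division in Python. It uses only the two figures quoted above (64,486 occurrences for the top 20 words and 138,150 words in the GNT); the variable names are hypothetical, and the sketch does not re-add the individual frequencies, which appear only in the screenshot.

```python
# Figures as quoted in this post; neither is re-derived here.
top20_occurrences = 64_486   # occurrences covered by the 20 most frequent words
total_words = 138_150        # total word occurrences in the GNT

share = top20_occurrences / total_words   # fraction covered by the top 20
remaining = 1 - share                     # fraction still outstanding

print(f"Top 20 words cover {share:.1%} of the GNT")
print(f"Still outstanding: {remaining:.1%}")
```

The same two lines of arithmetic work for any other cutoff in the frequency list, provided you trust the occurrence counts you feed in.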
And before pedants start putting me in my place about what a “word” is or about what we mean by “learning” a vocabulary word or about how memorizing a few paradigms makes it so that one need only memorize one word to have all 17+ forms (AKA unique words!) at hand, let me appeal to my Merriam-Webster dictionary for an example or two. You will find there an entry for the indefinite article ‘a’ on one page, and an entry for ‘an’ on another page. One could argue that it is just one word with different forms, but there are two separate words for English-language learners to memorize! Yes, I understand that if English-language learners memorized a few rules (easier rules than Greek since English inflects only minimally), then they would only have to memorize ‘dog’ to get ‘dogs’ and ‘inflect’ to get ‘inflected’ and ‘inflecting’ and ‘inflects’ and ‘inflection’ and so on. I would say, however, that learning these rules only makes learning related words easier. It is not that they’ve memorized one word. It’s that they’ve memorized several related words with the assistance of standard rules. And it is a hard argument to make that English-language learners are memorizing only ONE word when they learn ‘go’ and ‘went’ or ‘goose’ and ‘geese’ or any number of irregular word sets. Greek is filled with irregular words of all sorts. It is a hard argument to make that Greek-language learners are memorizing ONE word when they learn all of the crazy forms of εἰμί.
I don’t mean to be discouraging of students learning Greek vocabulary. When I taught, I had a reputation for requiring more than the normal number of vocabulary words. And now in retrospect I see that I was actually assigning many more words than were listed. I don’t regret it. My students might. I don’t. I’m writing all of this as a way to think out loud, as it were, about how we “market” vocabulary learning to students. Is it fair to say only 882 words and you’ve got 90% of the GNT? Most students, I would guess, hear the word “words” and assume it is something like looking up an unfamiliar word in Merriam-Webster. They have no idea that Greek is just not that easy.