How many words do I need to know? The 95/5 rule in language learning, Part 1/2

A very common question that people ask when starting to study a foreign language is “How many words do I need to know in order to be conversationally fluent for everyday talk in X language?” This is a very good question, and one that I will try to answer to the best of my ability in part 2 of this post. But first of all, let me ask you this: Have you ever wondered how many words there are in your language? In fact, this is the wrong question, since it has no single sensible answer. Why is that? Simply put, it’s impossible to count the number of words in a language, because it’s so hard to decide what actually counts as a word.

For example, it is said that the word “set” in English has 464 definitions in the Oxford English Dictionary. Would we count a word with multiple definitions as one single word, or would we count each definition as an individual word? And what about phrasal verbs, such as “set up,” “set about,” “set apart,” and so on? Or what about so-called open compound words like “hot dog,” “ice cream,” and “real estate”? Lastly, if you consider the plural and singular forms of words and different verb conjugations, together with different endings, prefixes and suffixes, you will quickly understand the difficulty in counting the number of words in a language.

So the question really should be: Do you know how many words there are in your language’s largest dictionary? I wanted to get a rough idea of the number of words in some of the world’s major languages, and to compare this number to the number of words used 90 to 95% of the time in everyday life and in common news articles, so this is a question I spent quite a bit of time searching for answers to. And I’m sure you are curious too.

As I said before, many language learners wonder how many words they will have to learn before gaining intermediate or advanced fluency in a given foreign language, and I will answer that question in part 2 of this post. After doing quite a bit of research, I did manage to find the number of words in the major dictionaries of the world’s major languages, and you will find those numbers there too. But hey, don’t stop reading here, because I have some other important stuff to discuss!

The Pareto Principle and Language Learning

Italian economist Vilfredo Pareto

So what is the purpose of my “research”? Well, some of you might have heard about the Pareto Principle, also known as the 80-20 rule. If you’d like to learn more about this, I encourage you to check out my post that partly deals with this subject. In a nutshell, though, the Pareto Principle is as follows: after having observed numerous phenomena ranging from land ownership to pea pods, Italian engineer and philosopher Vilfredo Federico Damaso Pareto came up with what became known as Pareto’s Law: for many events, roughly 80% of the effects come from 20% of the causes. In other words, in the context of work or study, 20% of the effort brings in 80% of the results.

In the context of language learning, then, I wanted to find out the approximate percentage of words you would have to learn to understand 90 to 95% of the most commonly used words in everyday life. Why 90 to 95% of the most commonly used words? Simply put, this is the rough amount of comprehension needed in order to understand what is being said quite well in a language. Plus, by understanding this much of the vocabulary, you’ll be able to guess the remaining 5 to 10% of words that you do not know simply through context. The numbers are not exactly the same as the 80-20 rule, as you’ll see in my next post, but the principle is similar: only a small fraction of your efforts will bring in the biggest results.
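
To make the idea concrete, here is a small sketch of how one could measure this on any text: count how often each word appears, then see how many of the most frequent words are needed to cover a given share of the text. The function name `words_needed` and the toy sentence are my own inventions for illustration only. A real estimate would need a large body of text, and the real numbers are the subject of part 2.

```python
from collections import Counter

def words_needed(tokens, target=0.95):
    """Return (n, vocabulary_size): how many of the most frequent
    distinct words are needed to cover `target` of all tokens."""
    counts = Counter(tokens)
    total = len(tokens)
    covered = 0
    # Walk through the words from most to least frequent,
    # adding up how much of the text they account for.
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total >= target:
            return rank, len(counts)
    return len(counts), len(counts)

# Toy text, far too small to mean anything (illustration only).
tokens = "the cat sat on the mat and the dog sat on the rug".split()
needed, vocabulary = words_needed(tokens, target=0.5)
print(f"{needed} of {vocabulary} distinct words cover half of the text")
```

Even in this tiny example, a few very frequent words account for a large share of everything that is written.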

This is very important, because after having reached a level of understanding high enough in a language, I believe it’s time to drop the dictionary and to truly start (or continue at an increasing speed) learning “inductively”, through context and through good guesswork. You do that every day in your own language, since nobody knows the meaning of every single word in their language (wait until you see the number of words that the Oxford English Dictionary defines!), very far from it in fact; so why not do the same in a foreign language?

Developing Good Guessing Skills

A few weeks ago I read an article in The Telegraph entitled “Learning a foreign language: five most common mistakes.” It’s a short and rather informative article, so I encourage you to give it a quick read. One of the most common mistakes that the author listed in there was that of “Rigid Thinking”. The excerpt is worth quoting at length:

Linguists have found that students with a low tolerance of ambiguity tend to struggle with language learning.

Language learning involves a lot of uncertainty – students will encounter new vocabulary daily, and for each grammar rule there will be a dialectic exception or irregular verb. Until native-like fluency is achieved, there will always be some level of ambiguity.

The type of learner who sees a new word and reaches for the dictionary instead of guessing the meaning from the context may feel stressed and disoriented in an immersion class. Ultimately, they might quit their language studies out of sheer frustration. It’s a difficult mindset to break, but small exercises can help. Find a song or text in the target language and practice figuring out the gist, even if a few words are unknown.

[bold emphasis mine]

Rigid thinking is in fact extremely common among language learners, and extremely uncommon when it comes to your native language! After all, do you really reach for a dictionary often when reading in your native language? My guess is, not so often, even if, I am sure, you do not know the meaning of several words you come across (especially in novels, where the descriptive vocabulary is very literary and uncommon at times).

Yet good guessing skills are truly important when it comes to acquiring a foreign language, for the simple reason that it’s not possible (and even if it were, it would be highly impractical) to learn every single definition of even a single word (such as “set”) in English. If you can’t learn all the definitions of a single word in a given language, why even imagine the need to learn the definition of every single word you come across?! What happens is that you will eventually learn words through repeated exposure, in different contexts, at different places. This is called assimilation. And this is your aim when acquiring a foreign language. Check out my post entitled “Memory Tip #4: Learn From Context” for more info on how to use context to also help with memorization.

Let me give you this example sentence: “We put a tremendous amount of effort to finish this project, and we finally succeeded.” Now, let’s say that you understand everything here except for the word “tremendous”. Chances are you can get a rough idea of the meaning of “tremendous” through the context given here. You understand about 93% of this sentence (14 words out of 15), and the remaining 7% or so can be understood contextually. Keywords include “effort”, “project”, and “finally succeeded”, and through guesswork, it’s not that hard to come up with a meaning that will be similar to what you would find in a dictionary. If you couldn’t guess the meaning of the word “tremendous,” by the way, it simply means “a lot”, “a great amount”.
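
If you want to check the share of known words in the example sentence yourself, a few lines of Python will do. This is only a rough sketch: the `coverage` function and the simple way it splits a sentence into words are assumptions I made for illustration, not a standard measure of comprehension.

```python
import re

def coverage(sentence, unknown_words):
    """Return the share of words in `sentence` that are known (0.0 to 1.0)."""
    # Lowercase everything and keep only runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", sentence.lower())
    unknown = {w.lower() for w in unknown_words}
    known = [w for w in words if w not in unknown]
    return len(known) / len(words)

sentence = ("We put a tremendous amount of effort to finish this project, "
            "and we finally succeeded.")
share = coverage(sentence, ["tremendous"])
print(f"{share:.1%} known, {1 - share:.1%} left to guess from context")
```

Run on the sentence above with “tremendous” as the only unknown word, it counts 14 known words out of 15.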

Assimilating the Language

So the point I’m trying to make here is that if you can achieve a 95% understanding of the most common words found in a given language, it will become possible to acquire the remaining unknown words contextually, by a process called assimilation (the Assimil method is built around a similar philosophy). Now, of course, simply knowing words does not amount to a perfect understanding of what you hear or read, since grammar, idioms, figures of speech and so on are also involved, and these can be serious barriers to understanding. You could very well know every single word in a sentence and still not understand what is being said because of unfamiliarity with these aspects of the language. Nevertheless, most of the time, by knowing 90 to 95% of the words in a sentence, and by being provided with sufficient context, you should have very few problems understanding and communicating in the language, especially if you are learning a language that belongs to the same language family as your mother tongue.

—————————-

So we’ll stop here for today’s post. In part 2, we’ll take a look at the number of words contained in the major dictionaries of some of the most widely spoken languages. We will also try to answer the question “how many words should I ‘learn’ before gaining conversational competency in a given foreign language?”

Have you enjoyed this article? Have you learned something out of it? A “like” on Facebook means a great deal to me. And if you want to be even more awesome, like, super awesome, share this article with your friends on Facebook and Twitter!

By Sam Gendreau

  • Han

    Thanks for your article.
    I have enjoyed reading this article.
    I totally agree with what you wrote.

    However, I sometimes feel it not easy for me to guess the word that I don’t know in sentence I am reading.
    I think I is better for me to find key word so that I understand what I am reading.

    As mentioned already, I should not find the word that is not key word, like “tremendous”..
    Anyway, thanks again.
    I want to let you know I have really been enjoying reading your article since I came to know you are in Korea.

    I am looking forward to part 2.
    thanks…

    • http://www.lingholic.com sgendreau

      Thanks for the kind words Han!

      Your English is fantastic by the way! Where did you learn it?

  • Nanushka

    Hi Sam,
    I loved the article. In facts, it tells what I’ve experienced hhh
    I used to look at the dictionary for every single word I came across. It took me hours to read and understand a short text!! I was trying to “absorb” the definitions … That was overworking and fruitless :<
    Then, I let it down, and found out a new method … read read read, listen listen listen … by the end, some words/idioms/expressions just sticked in my mind, at that moment, I took the dictionary, try to find the definitions, and to match them with what I understood/guessed from the context … and guess what, it's always close to the definition given by the dictionary, and it stays in my mind for ever, I even begin to use it spontaneously ;)
    I can't wait to read the second part :)

    • http://www.lingholic.com sgendreau

      Yes, your experience seems to be similar to what many language learners experience. Eventually, you have to give up the dictionary and just read read read, and listen listen listen.

      Thanks for the kind words :)

  • Alfredo

    hi Sam!!

    Great article as always :)
    It´s fantastic that you show us this things. By the way, I have a question, from where can I found the words to learn? from the dictionary? I´ve came to Greece to work and I would like to learn the language :)

    Thanks!!

    • http://www.lingholic.com sgendreau

      Hi Alfredo! Usually, you don’t need to “look” for the words to learn. You will simply acquire them through exposure, by reading and listening and speaking as much as you can in the language. In the beginner stages this means working mostly with a textbook and, perhaps, with a tutor. But once you reach an intermediate level you’ll start getting exposed to native content, such as blogs, news stories, books, podcasts, and so on.

      In any case, you can check out the “Wiktionary Frequency List”, which is a wonderful resource for finding the most common words in a given language. For Greek, follow this link to access the 5000 most used Greek (Ελληνικά) words based on contents of http://www.opensubtitles.org. However, I would not recommend that anyone learn words from this kind of list. It’s always much better to learn from context!

  • Evan Jacop

    Really useful article , I’d like to thank you

    • http://www.lingholic.com/ lingholic

      The pleasure is mine! Thanks for your comment, Evan.

  • Wai

    Good point, but this just proves why languages like Chinese or Japanese that use characters that are unreadable are the hardest languages to learn. You can guess the meaning of the words you don’t know but you won’t really learn a new word if you don’t even learn the pronunciation of it.

    • http://www.lingholic.com/ lingholic

      Hi Wai.

      Well, of course, this pre-supposes that the only way to learn a language is through reading. Plenty of people learn languages by listening and speaking them, in which case knowing characters is not a necessity.

      I’d also like to add that technology and the internet have made it considerably easier to read languages such as Chinese or Japanese because it’s easy to hover over a word to see the pronunciation. Lots of apps are really useful in this respect.

  • sid

    intelligent article