How many words do I need to know? The 95/5 rule in language learning, Part 1/2

A very common question that people ask when starting the study of a foreign language is “How many words do I need to know in order to be conversationally fluent for everyday talk in X language?” This is a very good question, and one that we will try to answer in part 2 of this post, but first of all, let me ask you this: Have you ever wondered how many words there are in your language? Well, this is the wrong question, in fact, since there is no single sensible answer to this question. Why is that? Simply put, it’s impossible to count the number of words in a language, because it’s so hard to decide what actually counts as a word.

For example, it is said that the word “set” in English has 464 definitions in the Oxford English Dictionary. Would we count a word with multiple definitions as one single word, or would we count each definition as an individual word? And what about phrasal verbs, such as “set up,” “set about,” “set apart,” and so on? Or what about so-called open compound words like “hot dog,” “ice cream,” and “real estate”? Lastly, if you consider the plural and singular forms of words, different verb conjugations, together with different endings, prefixes and suffixes, you will quickly understand the difficulty in counting the number of words in a language.

So the question really should be: Do you know how many words there are in your language’s largest dictionary? Since I wanted to get a rough idea of the number of words in some of the world’s major languages, and compare this number to the average number used 90 to 95% of the time in everyday life and in common news articles, this is a question I spent quite a bit of time researching. And I’m sure you are curious too.

As I said before, many language learners wonder how many words they will have to learn before gaining intermediate or advanced fluency in a given foreign language, and I will answer that question in part 2. After doing quite a bit of research, I did manage to find the number of words in the major dictionaries of the world’s major languages, numbers which you will also find in part 2 of this post. But hey, don’t stop reading here, because I have some other important stuff to discuss!

The Pareto Principle and Language Learning

Italian economist Vilfredo Pareto

So what is the purpose of my “research”? Well, some of you might have heard about the Pareto Principle, also known as the 80-20 rule. If you’d like to learn more about this, I encourage you to check out my post that partly deals with this subject. In a nutshell, though, the Pareto Principle is as follows: after having observed numerous phenomena ranging from land ownership to pea pods, Italian engineer and philosopher Vilfredo Federico Damaso Pareto came up with what became known as Pareto’s Law: for many events, roughly 80% of the effects come from 20% of the causes. In other words, in the context of work or study, 20% of the efforts bring in 80% of the results.

In the context of language learning, then, I wanted to find out the approximate percentage of words you would have to learn to understand 90 to 95% of the most commonly used words in everyday life. Why 90 to 95% of the most commonly used words? Simply put, this is the rough amount of comprehension needed in order to understand what is being said quite well in a language. Plus, by understanding this much of the vocabulary, you’ll be able to guess the remaining 5 to 10% of words that you do not know simply through context. The numbers are not exactly the same as the 80-20 rule, as you’ll see in my next post, but the principle is similar: only a small fraction of your efforts will bring in the biggest results.

This is very important, because after having reached a level of understanding high enough in a language, I believe it’s time to drop the dictionary and to truly start (or continue at an increasing speed) learning “inductively”, through context and through good guesswork. You do that every day in your own language, since nobody knows the meaning of every single word in their language (wait until you see the number of words that the Oxford English Dictionary defines!), very far from it in fact; so why not do the same in a foreign language?

Developing Good Guessing Skills

A few weeks ago I read an article in The Telegraph entitled “Learning a foreign language: five most common mistakes.” It’s a short and rather informative article, so I encourage you to give it a quick read. One of the most common mistakes that the author listed in there was that of “Rigid Thinking”. The excerpt is worth quoting at length:

Linguists have found that students with a low tolerance of ambiguity tend to struggle with language learning.

Language learning involves a lot of uncertainty – students will encounter new vocabulary daily, and for each grammar rule there will be a dialectic exception or irregular verb. Until native-like fluency is achieved, there will always be some level of ambiguity.

The type of learner who sees a new word and reaches for the dictionary instead of guessing the meaning from the context may feel stressed and disoriented in an immersion class. Ultimately, they might quit their language studies out of sheer frustration. It’s a difficult mindset to break, but small exercises can help. Find a song or text in the target language and practice figuring out the gist, even if a few words are unknown.

[bold emphasis mine]

Rigid thinking is in fact extremely common among language learners, and extremely uncommon when it comes to your native language! After all, do you really reach for a dictionary often when reading in your native language? My guess is, not so often, even if, I am sure, you do not know the meaning of several words you come across (especially in novels, where the descriptive vocabulary is very literary and uncommon at times).

Yet good guessing skills are truly important when it comes to acquiring a foreign language, for the simple reason that it’s not possible (and even if it were, it would be highly impractical) to learn every single definition of even a single word (such as “set”) in English. If you can’t learn the definitions of a single word in a given language, why even imagine the need to learn the definition of every single word you come across?! What happens is that you will eventually learn words through repeated exposure, in different contexts, at different places. This is called assimilation. And this is your aim when acquiring a foreign language. Check out my post entitled “Memory Tip #4: Learn From Context” for more info on how to use context to also help with memorization.

Let me give you this example sentence: “We put a tremendous amount of effort to finish this project, and we finally succeeded.” Now, let’s say that you understand everything here except for the word “tremendous”. Chances are you can get a rough idea of the meaning of “tremendous” through the context given here. You understand about 93% of this sentence (14 words out of 15), and the remaining 7% or so can be understood contextually. Keywords include “effort”, “project”, and “finally succeeded”, and through guesswork, it’s not that hard to come up with a meaning that will be similar to what you would find in a dictionary. If you couldn’t guess the meaning of the word “tremendous,” by the way, it simply means “a lot”, “a great amount”.
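If you want to check the share of known words in a sentence yourself, here is a minimal Python sketch. The function name known_word_coverage and the simple way it splits the sentence into words are my own assumptions for illustration, not a standard tool.

```python
import re

def known_word_coverage(sentence, unknown_words):
    """Return the fraction of words in `sentence` that are NOT in `unknown_words`."""
    # Rough tokenization (an assumption): lowercase, keep runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", sentence.lower())
    unknown = {w.lower() for w in unknown_words}
    known = [w for w in words if w not in unknown]
    return len(known) / len(words)

sentence = ("We put a tremendous amount of effort to finish this project, "
            "and we finally succeeded.")
coverage = known_word_coverage(sentence, ["tremendous"])
print(f"{coverage:.1%}")  # 14 known words out of 15 -> 93.3%
```

A single unknown word weighs heavily in a short sentence like this one. In a longer text, one unknown word costs you far less comprehension.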

Assimilating the Language

So the point I’m trying to make here is that if you can achieve a 95% understanding of the most common words found in a given language, it will become possible to acquire the remaining unknown words contextually, by a process called assimilation (the Assimil method is built around a similar philosophy). Now, of course, simply knowing words does not equal a perfect understanding of what you hear or read, since there are also grammar, idioms, figures of speech, and so on involved in the language, and these can be formidable barriers to understanding. You could very well know every single word in a sentence and still not understand what is being said because of unfamiliarity with these aspects of the language. Nevertheless, most of the time, by knowing 90 to 95% of the words in a sentence, and by being provided with sufficient context, you should have very few problems understanding and communicating in the language, especially if you are learning a language that is part of the same language family as that of your mother tongue.
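For the curious, the idea of "how many of the most common words cover X% of a text" can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name words_needed_for_coverage is hypothetical, and the tiny word list is a made-up toy example, not real frequency data. With a real text in your target language you would pass in its full list of words instead.

```python
from collections import Counter

def words_needed_for_coverage(tokens, target=0.95):
    """Return (n, vocabulary_size), where n is how many of the most frequent
    distinct words are needed to cover at least `target` of all tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    covered = 0
    # Walk through the distinct words from most to least frequent.
    for rank, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered >= target * total:
            return rank, len(counts)
    return len(counts), len(counts)

# Toy example (made-up text, for illustration only).
tokens = "the cat sat on the mat and the dog sat on the rug".split()
needed, vocabulary = words_needed_for_coverage(tokens, target=0.6)
print(needed, "of", vocabulary, "distinct words cover 60% of the text")
```

In this toy text, 3 of the 8 distinct words already account for more than 60% of everything written, which is the same principle described above at a much smaller scale.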


So we’ll stop here for today’s post. In part 2, we will take a look at the number of words contained in the major dictionaries of some of the most widely spoken languages. We will also try to answer the question “how many words should I ‘learn’ before gaining conversational competency in a given foreign language?”

Click here to read part 2 of this article.

By Lingholic

  • Han

    Thanks for your article.
    I have enjoyed reading this article.
    I totally agree with what you wrote.

    However, I sometimes feel it not easy for me to guess the word that I don’t know in sentence I am reading.
    I think I is better for me to find key word so that I understand what I am reading.

    As mentioned already, I should not find the word that is not key word, like “tremendous”..
    Anyway, thanks again.
    I want to let you know I have really been enjoying reading your article since I came to know you are in Korea.

    I am looking forward to part 2.

    • Thanks for the kind words Han!

      Your English is fantastic by the way! Where did you learn it?

  • Nanushka

    Hi Sam,
    I loved the article. In facts, it tells what I’ve experienced hhh
    I used to look at the dictionary for every single word I came across. It took me hours to read and understand a short text!! I was trying to “absorb” the definitions … That was overworking and fruitless :<
    Then, I let it down, and found out a new method … read read read, listen listen listen … by the end, some words/idioms/expressions just sticked in my mind, at that moment, I took the dictionary, try to find the definitions, and to match them with what I understood/guessed from the context … and guess what, it's always close to the definition given by the dictionary, and it stays in my mind for ever, I even begin to use it spontaneously 😉
    I can't wait to read the second part :)

    • Yes, your experience seems to be similar to what many language learners experience. Eventually, you have to give up the dictionary and just read read read, and listen listen listen.

      Thanks for the kind words :)

  • Alfredo

    hi Sam!!

    Great article as always :)
    It´s fantastic that you show us this things. By the way, I have a question, from where can I found the words to learn? from the dictionary? I´ve came to Greece to work and I would like to learn the language :)


    • Hi Alfredo! Usually, you don’t need to “look” for the words to learn. You will simply acquire them through exposure, by reading and listening and speaking as much as you can in the language. In the beginner stages this means working mostly with a textbook and, perhaps, with a tutor. But once you reach an intermediate level you’ll start getting exposed to native content, such as blogs, news stories, books, podcasts, and so on.

      In any case, you can check out the “Wiktionary Frequency List” which is a wonderful resource to find the most common words in a given language. For Greek, follow this link to access the 5000 most used Greek (Ελληνικά) words based on contents of However, I would not recommend that anyone learn words from this kind of list. It’s always much better to learn from context!

  • Evan Jacop

    Really useful article , I’d like to thank you

    • The pleasure is mine! Thanks for your comment, Evan.

  • Wai

    Good point, but this just proves why languages like Chinese or Japanese that use characters that are unreadable are the hardest languages to learn. You can guess the meaning of the words you don’t know but you won’t really learn a new word if you don’t even learn the pronunciation of it.

    • Hi Wai.

      Well, of course, this pre-supposes that the only way to learn a language is through reading. Plenty of people learn languages by listening and speaking them, in which case knowing characters is not a necessity.

      I’d also like to add that technology and the internet have made it considerably easier to read languages such as Chinese or Japanese because it’s easy to hover over a word to see the pronunciation. Lots of apps are really useful in this respect.

  • sid

    intelligent article

  • If you didn’t interrupt yourself so many times to plug your other articles, I would have finished reading before commenting. Seriously – let the piece speak for itself. If it’s good, readers will naturally look for your others. Just really annoyed me.

    Also. Those arrows on the sides (mobile) are driving me nuts. They are so sensitive, they are easy to hit when scrolling. I really wanted to read your article, but it’s like you’re going out of your way to repel the reader. I see you even plug yourself again at the end of the article! Please. Stop. Ugh. I’m going to go redo my Google “how many words / “learn language” search.

  • Jaehaerys I

    Hey Sam,

    It’s a great article, I liked so much. In my point of view and own experience is a waste of energy always look at the dictionary new words that you don’t know(sometimes neither the dictionary knows the word), I’ve been doing this for a long time in everything that a I read, like you said, sometimes you know every words in a sentence and don’t know the meaning, haha it’s really true, so I’ll try not always open the dictionary and search for something that I can understand for context.

    Thank you and take care.

  • seeker

    good shit

  • Mark

    Fascinating. I was surprised that English wasn’t #1. We have such a lofty opinion of our language, us English speakers.

    • Herbertificus

      And for good reason — it IS the best language in the history of the human race. The mishmash of conquests, clashing histories and culturo-linguistic convolutions that created the English language is both a story AND a result that has no equal in all of human history.

      Let us all, on bended knee, praise heaven for the four major historical events that resulted in the creation and ascendance of Englysch . . .

      The invasion of the Angles and the Saxons. (Foundational language.)

      The invasion of the Vikings. (Vocabulary and grammer.)

      The invasion of Bill and his Frenschies. (Vocabulary. ) (Special shoutout to true badass king Harold Godwinson, who unknowingly sacrificed his life to allow the largest single extragenic force in the historical development of Englysch. Though your reign was short, your worthiness was equal to that of Bill, but Cosmic Linguistic Manifest Destiny won the day.)

      The Bubonic Plague. (Re-ascendance of Englysch.)

      The first and predominant settling of America by the Englysch rather than by any other nation. (Creation of the future globally dominant culture, economy, technological and industrial powerhouse . . . and therefore the globally dominant language.)

  • miso

    typo in your article: “you get can a rough idea”

  • Herbertificus

    Hay . . . I mean . . . HEY, Lingholic (too many dadgum homonyms in English), have you ever seen the documentary about the development of the English language called “The Adventure Of English?” It was written and narrated by the chancellor of Leads College, Melvyn Bragg. First aired around 2001 – 2003. It is fantastic. A definate Must See. I’ve watched it about four times through, and I’ve watched the first four episodes, covering the Anglo-Saxons through the Tudor period, about 10 more times. It really IS just stupendously interesting. My son is sick of hearing me repeat a passage from William Of Nassyngton’s “Speculum Vitae,” which is quoted in the doc:

    Latyn can no one speak I trowe
    But those who it from school do know
    And somme know Frensche but not Latyn
    Who are used to court and dwellyn therein
    And somme know Latyn — though just in part
    Whose use of Frensche is . . . less than art
    And somme who understonde Englysch
    Neither Latyn know nor Frensche
    But unlettered or learned, olde or yonge
    Alle understonden the Englysch tongue.

    (circa 1325)

    I don’t know about you, but I laughed hysterically at the line, “less than art.” Isn’t that a trope called “litotes?” Understatement?

    Sorry. I had to paused and laugh hysterically some more. Actually, the “less than art” part is the work of a modern translator — possibly Bragg himself.

    I assume you’ve read “The Mother Tongue,” by Bill Bryson ? Fantastically interesting and entertaining.

    Now I’m going to have to read all of your articles. Basically, discovering your website amounts to a homework assignment ! There’s just too much to learn in life. Is it possible to make a Faustian deal with the devil to be able to read and learn for twenty years and not have it count against your lifespan?