How to Learn Vocabulary Fast: The Power of Zipf’s Law

The human brain is wired to look for shortcuts. In language learning, one such shortcut is figuring out how to learn vocabulary fast. As I found out, you can get by very well with a frightening Russian accent and no knowledge of grammar – but, alas, not without words. So every time I attack a new language, I start by learning its lexicon. I purposely ignore all other aspects of the language until I have grasped the basics of its vocabulary.

Obviously, such an approach puts you in a very awkward position, and you want to get past this stage as quickly as possible. There are three questions to ask in order to radically cut the time spent memorizing vocabulary:

  1. What words should I learn?
  2. In what order?
  3. How should I learn them?

These three questions should be dealt with in the above order. The reason is simple. No mnemonic technique or spaced repetition app will save you much time if you mindlessly memorize every new word you encounter. That is counterproductive, as you will soon see.

What’s productive is to follow the good old Pareto principle: 20% of the effort brings 80% of the results, and not only in economics. The same goes for language learning, because human language is built this way.

So if you wonder how to learn vocabulary fast, you want to define what’s worth learning in the first place. Here is how.

Zipf’s Law

In the 1930s, the American linguist George Kingsley Zipf was working on the distribution of words in natural languages when he noticed a curious phenomenon. Just over a hundred English words account for almost half of both spoken and written language. The linguist dived into the topic and came up with a power law that continues to amaze both linguists and statisticians:

The frequency of a word is inversely proportional to its frequency rank.

If you, like me, have a blind spot in this area and mathematics slipped out of your grasp long ago, let me put it more simply. The most frequent word (it’s “the” in English) shows up about twice as often as the second most frequent word (“of”) and about three times as often as the third (“and”). Similarly, the second word occurs about twice as often as the fourth; that one, about twice as often as the eighth; and so on.
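For readers who prefer to see the arithmetic, here is a minimal sketch of the idealized law, in which the frequency of a word is the top word's frequency divided by the word's rank. The function name `zipf_frequency` and its parameters are made up for this illustration, and real word counts only follow the pattern approximately.

```python
def zipf_frequency(rank, top_frequency=1.0, exponent=1.0):
    """Idealized Zipf frequency of the word at `rank` (1 = most frequent).

    `top_frequency` is the frequency of the rank-1 word. `exponent` is 1.0
    in the classic form of the law; both are illustrative assumptions.
    """
    if rank < 1:
        raise ValueError("rank starts at 1")
    return top_frequency / rank ** exponent


# Relative frequencies for ranks 1, 2, 4 and 8: each is half of the previous one.
for rank in (1, 2, 4, 8):
    print(rank, zipf_frequency(rank))
```

With the default settings, doubling the rank halves the frequency, which is all that "inversely proportional to rank" means.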

The real-life application of Zipf’s law is quite impressive. If you were to take any English book and count the frequency of each word, you would find that the 135-138 most frequent words account for a good half of the book. Same story with spoken language: 50% of our conversations are made up of just two hundred words.
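You can run this kind of count yourself. The sketch below is one possible way to do it, not the method behind the figures above: it splits a text into lowercase words with a simple regular expression (an assumption that ignores many tokenization subtleties) and reports how many of the most frequent words are needed to cover a given share of the text. The function name `words_needed_for_coverage` is hypothetical.

```python
import re
from collections import Counter


def words_needed_for_coverage(text, target=0.5):
    """Return how many of the most frequent words cover `target` of all word tokens."""
    # Runs of letters, optionally joined by apostrophes ("don't", "l’eau").
    tokens = re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text.lower())
    if not tokens:
        return 0
    counts = Counter(tokens)
    covered = 0
    # Walk down the frequency list until the running total reaches the target share.
    for n, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered / len(tokens) >= target:
            return n
    return len(counts)


sample = "the cat and the dog and the bird"
print(words_needed_for_coverage(sample))  # "the" and "and" make up 5 of 8 tokens
```

To try it on a whole book, read the file into a string first and pass that string in. The exact number you get will depend on the book and on how words are split.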

Zipf’s law is not limited to English. The same holds true for many other languages. So if you want to know how to learn vocabulary fast, master the items that give you access to the heart and soul of your target language. How?

Get your hands on a frequency list.

Frequency lists


That was the only thought that came to my mind as I was looking at a Spanish frequency list for the first time. How can this help me learn Spanish to any degree?

And indeed, what can you do with these words as a learner of Spanish, English or French:

10 Most Frequent Words

Spanish    English    French
de         the        de
la         of         la
que        and        le
el         to         et
en         a          en
y          in         l’
a          that       les
los        it         est
se         is         des
del        was        du

Seriously, what the hell is de?

The reason this de is number one in the list is precisely that it has one hundred and seven different usages that absolutely cannot be explained without reference to the Routledge Dictionary of Language and Linguistics. Similarly, how would you explain what “the” is to a learner of English? This word just wouldn’t make sense to native speakers of languages that don’t even have definite articles (I can testify to that as a native speaker of Russian).

So why do these seemingly meaningless words break all frequency records?

Three levels of frequency

The frequency lists of different languages are essentially of the same nature. The same structural pattern reveals itself over and over again as Zipf’s curve progresses from its upper to its middle to its lower part.

Words that make up the upper portion of the curve have basically the same function in all languages where Zipf’s law applies. The same goes for the middle and the lower segments. So what are those functions?

Upper segment – function words

The upper part of the curve comprises function words: determiners, prepositions, auxiliaries, and conjunctions. All these words are there only to serve as syntactic cement, so to speak. They help us put a sentence together in a grammatical manner, and it would be hard to build phrases without them.

Middle segment – basic concepts

If we were to weed out the most frequent (read: function) words from a language, we would hit the first content words. These are words that have concrete semantic meaning and refer to a specific concept in your mind. Time. Like. More. People. Middle segment words refer to these basic categories and concepts essential to human nature.

At the same time, the middle segment is context-dependent. Words that form this part of Zipf’s curve tend to change depending on the text or the topic of conversation. Words like “scuba”, “dive” or “buoyancy” can turn out to be surprisingly frequent if you’re reading the PADI Divemaster manual. But not so in a book about finance, for example.

Lower segment – low-frequency words

Finally, everything that is not a function word or a basic concept falls into the third category: low-frequency words. The larger the language corpus, the lower their relative frequency. This is because they neither carry a syntactic function nor hold a broad semantic meaning. Such words have a rather restricted application. If items like “thing” can creep into literally any sentence, how often would you utter something like “mouthwatering” or “onomatopoeia”?

By this point, the mystery of how to learn vocabulary fast should begin to unravel. The 20% of effort that brings 80% of the results here is learning words from the upper and middle segments of the frequency curve.

How to learn vocabulary fast

To be as efficient as possible in vocabulary learning, it is useful to think like a minimalist. Here, the fewer words you have to learn consciously, the better.

The main reason is that vocabulary learning is simply boring. Each word requires multiple repetitions, and you will have to put your brain through numerous recall sessions before something sticks. People – la gente. When – cuando. I go – voy. How exciting! Nevertheless, you can’t avoid such “artificial” learning at the first stage because you need something to start with.

Your goal, however, is to pass this stage as quickly as possible and switch on your subconscious, incidental learning mechanism. This way, your brain acquires vocabulary automatically while you read, listen or do other fun things in your target language.

It can be done in three steps.

Step 1: Conscious Learning

Start by memorizing just the 150-200 most frequent words. (Lexiteria has free 200-word frequency lists in 40 languages.) Typically, these are the words that make up roughly half of any book in the language. For the first 1-2 weeks, this is the best investment of your time.

As you learn these words, they become accessible to automatic processing and activation. This means that every time you hear or see one of them, your brain recognizes it and you recall the word’s meaning. The more often you encounter a word, the more effortless its recognition becomes. Similarly, the more often you use a word or a phrase in your speech, the easier it is to use it on the next occasion.

So… it makes sense to start using them, right?

Step 2: Transition

Once you have nailed the first two hundred words, it’s time to shift towards immersion. Spend the next 2-3 weeks learning another 200-300 words from the frequency list, but supplement your vocabulary diet with a daily dose of input. Find a podcast, a YouTube channel, or a blog in your target language and commit to listening to, watching, or reading it for 30 minutes a day.

It’s okay if you don’t understand a great deal. The idea is to get your brain used to hearing, seeing and trying to make sense of the language. Plus, with this routine, you get into a natural spaced repetition cycle that lets you refresh the vocabulary you have already mastered.

Step 3: Immersion

Your next step is to put your flashcards down and open yourself to as much target language input as possible. With this goal in mind, I usually buy one of my favorite books that is available in that language, both in paperback and in audiobook format. Then, each morning, I read and listen to it simultaneously for one hour.

It’s an extremely effective method for gaining vocabulary en masse. When you read and listen to large amounts of foreign language input, your brain automatically collects statistics on word frequency. If a word is frequent enough, you can’t fail to notice it. Once I see that a certain word shows up in the text over and over again, I quickly look up its definition and continue reading.

As a consequence, I end up not learning but acquiring the middle segment words without much conscious effort on my part. Needless to say, going over my favorite novels makes the process very pleasant.
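If you read on a screen, the habit of looking up only the words that keep coming back can be imitated with a few lines of code. This is a rough sketch under simple assumptions, not a description of what the brain does: the function name `frequent_unknown_words`, the `min_count` threshold of three, and the crude letters-only tokenization are all choices made for this illustration.

```python
import re
from collections import Counter


def frequent_unknown_words(text, known_words, min_count=3):
    """List words not yet known that occur at least `min_count` times, most frequent first."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    known = {word.lower() for word in known_words}
    # Count only the words that are missing from the learner's known list.
    counts = Counter(token for token in tokens if token not in known)
    return [(word, count) for word, count in counts.most_common() if count >= min_count]


chapter = "El buzo baja. El buzo sube. El buzo nada con la boya."
print(frequent_unknown_words(chapter, known_words=["el", "la", "con"]))
# [('buzo', 3)]
```

Here `known_words` would be the frequency list you memorized in the first two steps, and whatever the function returns is a reasonable shortlist of words worth a quick dictionary lookup.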

Zipf’s law is a very powerful tool for those who want to know how to learn vocabulary fast. It frees up your hands and allows you to learn a language the natural way. Obviously, you can still work through a deck of flashcards with the 5000 most frequent French words, if you feel like doing so. But because of the frequency effect, you would learn the same 5000 words by finishing several books. So which do you prefer?

