Chinese Vocabulary by Frequency
Mandarin Chinese words ranked by corpus frequency: top 100, top 500, and top 3,000.
The lists are based on the SUBTLEX-CH corpus (Cai & Brysbaert 2010), which was derived from ~33 million words of Chinese film and television subtitles.
Coverage Statistics
| List | Items | Approximate Text Coverage |
|---|---|---|
| Top 100 words | 100 | ~65% of everyday text |
| Top 500 words | 500 | ~75% of conversation |
| Top 1,000 characters | 1,000 | ~90% of common written text |
| Top 3,000 words | 3,000 | ~90% of standard modern text |
| Top 10,000 words | 10,000 | ~99%+ of modern text |
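Key insight: The top 1,000 characters account for roughly 90% of the characters in common written text. Recognizing a character is not the same as knowing every word built from it, so real comprehension trails that figure. Learning 1,000 characters is achievable in 6–12 months of focused study.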
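Coverage figures like those above are cumulative shares of running text: the summed counts of the N most frequent items divided by the total token count. A minimal Python sketch of that calculation follows. It uses a toy token list rather than SUBTLEX-CH data, and the function name `coverage_at` and the sample tokens are illustrative assumptions.

```python
from collections import Counter


def coverage_at(counts, n):
    """Return the fraction of all tokens covered by the n most frequent items."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top_counts = sorted(counts.values(), reverse=True)[:n]
    return sum(top_counts) / total


# Toy token stream (hypothetical, not taken from SUBTLEX-CH).
tokens = ["的", "我", "的", "是", "我", "的", "人", "不", "的", "我"]
counts = Counter(tokens)

for n in (1, 2, 5):
    print(f"top {n}: {coverage_at(counts, n):.0%} of tokens")
```

Run against real frequency counts, the same function shows why the curve flattens: each additional block of lower-ranked items adds a smaller share of text than the block before it.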
Top 100 Most Frequent Words (Ranks 1–50 Shown)
| Rank | Chinese | Pinyin | English | Notes |
|---|---|---|---|---|
| 1 | 的 | de | possessive/attributive particle | Most common character in Chinese |
| 2 | 一 | yī | one; a | Tone shifts: yí before 4th tone, yì before 1st–3rd tones |
| 3 | 是 | shì | to be | Copula linking nouns; not normally used before adjectives |
| 4 | 不 | bù | not, no | Changes to bú before 4th tone |
| 5 | 了 | le | completion/change aspect | NOT a past tense marker |
| 6 | 人 | rén | person, people | |
| 7 | 我 | wǒ | I, me | |
| 8 | 在 | zài | at, in; to be at | |
| 9 | 有 | yǒu | to have; there is/are | |
| 10 | 他 | tā | he, him | Also 她 (she), 它 (it) |
| 11 | 这 | zhè | this | 这个, 这里 |
| 12 | 中 | zhōng | middle; China | 中国, 中文 |
| 13 | 大 | dà | big, large | |
| 14 | 来 | lái | to come | |
| 15 | 上 | shàng | up, above; on | Also: to go to |
| 16 | 国 | guó | country, nation | 中国, 美国 |
| 17 | 为 | wéi/wèi | for; to be (formal) | |
| 18 | 以 | yǐ | with; by means of | Formal |
| 19 | 到 | dào | to arrive; to | Direction/resultative |
| 20 | 说 | shuō | to say, speak | |
| 21 | 和 | hé | and; with | |
| 22 | 时 | shí | time; when | |
| 23 | 地 | dì/de | earth; adverb particle | |
| 24 | 出 | chū | to exit, come out | |
| 25 | 就 | jiù | then; precisely; as soon as | Very versatile adverb |
| 26 | 你 | nǐ | you | |
| 27 | 年 | nián | year | |
| 28 | 着 | zhe | ongoing state aspect | |
| 29 | 那 | nà | that | |
| 30 | 要 | yào | want; will; need | Context-dependent |
| 31 | 会 | huì | can; will; meeting | |
| 32 | 去 | qù | to go | |
| 33 | 都 | dōu | all; both | |
| 34 | 没 | méi | not (negates 有 and completed actions) | 没有 = don't have; haven't (done) |
| 35 | 也 | yě | also, too | |
| 36 | 对 | duì | correct; facing; toward | |
| 37 | 里 | lǐ | inside; unit of distance | |
| 38 | 可 | kě | can; may | 可以, 可能 |
| 39 | 后 | hòu | after, behind | |
| 40 | 很 | hěn | very | Often a neutral linker before adjectives, with little "very" meaning |
| 41 | 什么 | shénme | what | |
| 42 | 我们 | wǒmen | we, us | |
| 43 | 生 | shēng | to be born; life | |
| 44 | 自 | zì | self; from | 自己, 自然 |
| 45 | 行 | xíng/háng | to walk; OK; row | |
| 46 | 做 | zuò | to do, make | |
| 47 | 这个 | zhège | this one | |
| 48 | 看 | kàn | to look, see | |
| 49 | 只 | zhǐ | only | Also: measure word (zhī) |
| 50 | 知道 | zhīdào | to know (a fact) | |
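The tone change noted for 不 in the table (bù becomes bú before a 4th-tone syllable) is a simple rule that can be written as a lookup. The sketch below is only an illustration: the function name `bu_pinyin` and the convention of passing tones as numbers 1 to 5 (5 for neutral) are assumptions made for this example.

```python
def bu_pinyin(next_tone):
    """Return the pinyin of 不 given the tone (1-5, 5 = neutral) of the next syllable."""
    if next_tone not in (1, 2, 3, 4, 5):
        raise ValueError("tone must be an integer from 1 to 5")
    # 不 is 4th tone (bù) by default and rises to 2nd tone (bú) before another 4th tone.
    return "bú" if next_tone == 4 else "bù"


print(bu_pinyin(4), "shì")  # 不是: 是 is 4th tone
print(bu_pinyin(3), "hǎo")  # 不好: 好 is 3rd tone
```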
Full top-500 and top-3000 lists will be generated by the chinese vocab frequency CLI command.
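The CLI itself is not shown here. As a rough illustration of the ranking step that any such tool needs, the sketch below sorts word counts in descending order and emits the first N rows in the table style used on this page. The function name `top_n` and the sample counts are hypothetical; real input would come from the SUBTLEX-CH frequency files.

```python
def top_n(freqs, n):
    """Rank words by descending count (ties broken by the word itself) and keep the first n."""
    ranked = sorted(freqs.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, count) for rank, (word, count) in enumerate(ranked[:n], start=1)]


# Made-up counts for illustration only.
sample = {"的": 50, "是": 30, "我": 40, "知道": 5}

print("| Rank | Chinese | Count |")
print("|---|---|---|")
for rank, word, count in top_n(sample, 3):
    print(f"| {rank} | {word} | {count} |")
```

Sorting on `(-count, word)` keeps the output deterministic when two words share a count, which matters if the lists are regenerated and compared between runs.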
Data Sources
- SUBTLEX-CH: Cai, Q. & Brysbaert, M. (2010). SUBTLEX-CH: Chinese Word and Character Frequencies Based on Film Subtitles. PLoS ONE, 5(6), e10729. Download available from Ghent University.
- Jun Da Character Frequency: Chinese Character Frequency List for character-level (not word-level) data
- CC-CEDICT: Definitions and translations from the community dictionary