
Day 8: CNY Eve in Seoul

Happy Year of the Pig

Despite a relaxing weekend in the boonies, I moved into Seoul proper in search of 热闹 (re4nao5, “hot & noisy”, i.e. bustling). I have a dorm room booked for the entirety of the Spring Festival (as determined by the official holiday in China). The place is more cramped and smelly, but it is the cheapest accommodation yet. I did discover on my way out of the last guesthouse that Ctrip was charging me nearly 50% more than the base rate I would get paying cash. This presents something of a pickle: I don’t want to be overpaying, but paying for accommodation from my Chinese bank account may be the only way I can ever get my money out of China.

Planning to feast in the evening, I wanted to skip lunch, but after arriving at the hostel, my stomach started growling something fierce. I set out to find some food in the Hongdae neighborhood where I was staying and was surprised to see just how many shops, restaurants, and cafes had paper signs on their windows announcing their closure for a number of days. The lunar new year (새해 “saehae”) is only a single day public holiday in Korea, and I’ve heard conflicting reports as to whether it is a big deal or not.

“Spicy Mixed Noodles,” not spicy, but cold

Like many a hostel in Seoul, the Bird’s Nest is mostly staffed by an assortment of “volunteers” who are friendly enough, but it was difficult to get more than a couple words out of any of the guests. It’s funny how much the vibe of youth hostels varies from country to country.

Towards nightfall, I started researching food options in the area, wanting to do something fairly extravagant, even if I was dining alone. I settled on a pork bbq all-you-can-eat place and headed through the Hongik University night market strip to reach it. The cluster of lanes with restaurants, bars, food stalls, and street musicians certainly checks the box for an exciting atmosphere, though it was hard to tell if the holidays had any effect or if it was a typical Monday night. At the restaurant I asked for a table and was informed that I would have to pay for two portions. I decided to have bbq another night. This conformed with a vague notion I have that anywhere you are cooking at the table has a two-person minimum.

I settled for a Japanese curry house with tonkatsu (a breaded fried pork chop) ordered one notch down from the spiciest available and topped with roasted garlic flakes. I wouldn’t exactly call it settling, though doing Japanese food in Korea to celebrate Chinese New Year is a bit of a stretch. At least I had pork in honor of the Year of the Pig.

Back at the hostel, the common areas were full of people eating, drinking, and playing cards. It was like a completely different hostel from the one I saw in the afternoon. I played cards for a couple hours, stuck to water, and was in bed well before midnight.


Lunch (Mixed noodles with miso): 5,500 W
Curry Set: 13,000 W
Hostel (6 nights): 367 RMB
Total: 479 RMB
(USD 71)

Running Total: 2720 RMB (USD 403)
Daily Average: 340 RMB (USD 50.4)
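
The bookkeeping behind these two lines is simple enough to sketch in Python. This is only an illustration using the RMB figures quoted on this blog (2241 RMB is the running total from the Day 6/7 entry); the function name is made up.

```python
def update_totals(previous_total_rmb, day_total_rmb, days_elapsed):
    """Return (running_total, daily_average) after adding one day's spending."""
    running_total = previous_total_rmb + day_total_rmb
    daily_average = running_total / days_elapsed
    return running_total, daily_average


# Previous running total (Day 6/7), today's total, and the day count (Day 8).
running_total, daily_average = update_totals(2241, 479, 8)
print(running_total, daily_average)
```

Because the six-night hostel bill is booked entirely on the day it was paid, the average jumps on payment days and drifts down afterwards.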

Day 6/7: Taking it easy

I would say it was a lazy weekend, but that means I just spent all my time on my phone, in books, and on my computer. I spent two hours on Saturday morning catching up on the news of the week, and I think I will try to enforce a schedule where I only read up on the news once a week (rather than spending 30-60 minutes every morning). Nevertheless, it is disturbing to see Donald Trump on the news broadcasts every time one passes a TV. I obviously can’t understand what is being said, but when it switches from him shaking hands with Kim Jong Un to stock video of nuclear explosions, it leaves one with a lingering doubt of WTF is happening?!

I spent Saturday afternoon in a coffee shop near the subway station (where it is a bit more developed) and had some snacky food on my way back to the guesthouse apartment.

A Hazy Shade of Winter

Sunday was a rainy day so I didn’t even bother going outside. After a productive morning, I took two hours to finish Season 2 of Luke Cage (I know, I’m way behind on TV), and got back to studying. They say you never master a language until you start dreaming in it. I call BS on that. I never dream in Chinese, but I was definitely dreaming in Korean last night. Having spent close to 40 hours in study this past week, I’d be surprised if my brain wasn’t trying to sort itself out.

When I finally wandered out to get some dinner, I looked at all the restaurants in the vicinity, but I wasn’t too hungry and none of them really called to me. Instead, I went to check out a supermarket to see if I could work up an appetite and had a head-smacking realization when I saw the ready-to-eat salads in the produce section. I even bought a little bottle of dressing to carry with me so I keep eating fresh greens.

Is it racist to say “Oriental” when describing salad dressing?

I’ve managed to save a lot of money these two days by fully availing myself of cereal and ramen noodles. It also doesn’t hurt that every time I step on the scale in the morning, I’m shocked at how high the number is. I definitely need to make salads an integral part of my diet.


Latte (Holly’s Coffee): 4,800 W
Set Meal: 8,000 W
Total: 12,800 W
Supermarket Salad: 5,400 W
Grand Total: 18,200 W
(110.1 RMB)
(16.3 USD)

Running Total: 2241 RMB (USD 332.34)
Daily Average: 320 RMB (USD 47.47)

I’m expecting my expenses to spike around Chinese New Year, but otherwise the daily average should keep ticking down as I figure out a groove.

WOTD: Gimbap

Korean sushi?

I don’t really have much to say about 김밥 (gimbap or kimbab), other than that it is probably one of the first words I learned in Korean and the cause of my initial awareness of Korean morphemes. 밥 (bap) is the word for cooked rice (as in bibimbap) and although 김 (gim) is the same sound as in kimchi, it actually means “laver” (the proper name for that seaweed wrapping). Also, gimbap is quite delicious.

Though it looks like a sushi roll, there is usually no fish and definitely no raw fish in there. The pictured rolls both contain scrambled egg, ham, pickled radish, some other veggies, and their respective titular ingredients: tomato and chili pepper. The chili pepper gimbap was way better than the tomato one, even though the tomato one is the place’s signature item.

My Chinese Learning Story, Part 2

Previously, I anecdotally recalled my earliest encounters with Chinese (outside the context of Crouching Tiger, Hidden Dragon) and covered the critical first year in China. I did forget to mention a few important aspects of my learning program.

In the year between my first visit and ultimately moving to China, I did buy some Chinese learning PC software from a discount book store. Though I spent quite some time doing the vocabulary drills, the whole experience slipped my mind because of how ultimately stupid the endeavor was. The software didn’t teach “blue,” it taught “deep blue” and “light blue;” the food section focused entirely on Western foods like toast and pizza.

Furthermore, I should give credit where credit is due. I got a significant boost from a PC-based dictionary which both included animated stroke order of characters and had a handwriting lookup function. This is what helped me figure out writing on my own. (Mind you, these were pre-smartphone and pre-Pleco days.) The creator of the dictionary, Pablo, also made a useful character learning game called Pingrid.

You can find them both here. The latest version is fairly different from what I played 10 years ago, but the basic idea of having to find a target character given its pinyin and/or English definition (drawn from CEDICT) from a set of characters really helps one build sight recognition. Spaced repetition was involved and there was an incentive to find the character quickly. One could change the fonts and even select random fonts to really train visual recognition. The best part was probably the ability to upload your own vocabulary list and the system could handle up to 2-character long words. I highly recommend Pingrid for anyone looking to improve their reading speed.

Recognizing characters became a game I played in my daily life. From menus to bus stops, shop signs to the subtitles on televisions, I was on a constant quest to see what I recognized and was encouraged by the fact that, from one week to the next, I would find new characters and words had lodged themselves into my brain.

In China, it wasn’t just a matter of self-studying some textbooks, listening to tons of podcasts, and interacting in my daily life; I also invested a lot of time in building vocabulary lists. For every textbook chapter and podcast, I took notes of the vocabulary words used, collecting everything in a master Excel file. There were three main purposes to this. First, to see if I had already “learned” the word from a different source (or even a previous chapter). Secondly, to index the words by the characters contained within them. This allowed me to notice that a particular character was being used in multiple words and build up an internal semantic representation of the characters. For example, in an early lesson of Short Term Spoken Chinese, the vocabulary list included 书包 (shu1bao1, “book bag”) and 包子 (bao1zi, “steamed stuffed bun”). It was obvious that 包 was contributing to the meaning of both words, and as I came across more words with 包, my understanding of the character slowly shifted. [I would be uploading the Excel files and/or screen shots, but those are on an external hard drive in a suitcase in Beijing.] Finally, and possibly most importantly, writing characters out helps one remember them. I only occasionally put any effort into handwriting, but typing pinyin into an IME which brings up a list of homophones is a process that forces one to really focus on the character as written in a textbook and match it to the screen. I took the typing task even further by transcribing the dialogues (textbook only) into a Word document.
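
For anyone who would rather script this than maintain a spreadsheet, here is a minimal Python sketch of the "index words by the characters they contain" idea. It is not the Excel workbook described above; the function name and the two-word vocabulary list are only for illustration.

```python
from collections import defaultdict


def build_character_index(words):
    """Map each character to the list of words that contain it."""
    index = defaultdict(list)
    for word in words:
        # dict.fromkeys keeps each character once, in order of appearance,
        # so a word with a repeated character is only listed once.
        for char in dict.fromkeys(word):
            index[char].append(word)
    return index


vocab = ["书包", "包子"]  # the two example words from the paragraph above
index = build_character_index(vocab)
print(index["包"])
```

Looking up 包 returns both words, which is exactly the "this character keeps showing up" signal described above. Checking whether a word has already been collected is a plain membership test against the list.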

Anyways, eventually I ran out of Short Term Spoken Chinese textbooks that could be purchased at Book City, and being a cheapskate I didn’t want to invest a lot of money in other textbooks. So, I got a library card. That allowed me to access lots of 对外汉语 books (Chinese as a Foreign Language), including both other series of textbooks and various graded readers. I became voracious, and was willing to put in a little time skimming over books that were too easy for me. I found that quite helpful because it was a review/consolidation of the language in novel forms, and yet one occasionally came across new words to jot down.

At some point, I bought a used iPhone from a friend and discovered Pleco and a bunch of new podcasts. Though the phone was stolen a month later, I was hooked on the even more efficient studying tool and picked up an iPod Touch in Hong Kong.

As my listening and reading comprehension grew, I did dive into Chinese media, watching several television series, including 《家有儿女》、《男人帮》、《喜羊羊与灰太狼》、《北京青年》、《北京爱情故事》、《康熙微服私访记》以及《还珠格格》。I gave up on 《蜗居》 for being too melodramatic (not that the other serials weren’t soapy), and only watched about half of the episodes of 《我爱我家》, which for some reason didn’t have subtitles. I’m not sure how well the links will work outside of China, and it’s crazy to see how the internet companies are now so diligent about protecting IP. They were all free to watch back in the day; now most of the streaming sites want you to pay for a membership in order to watch a 20-year-old TV show.

And things continued mostly like that for the 4 years I lived in Shenzhen. Though I became busier with work, I would always spend at least one morning or afternoon a week in a coffee shop self-studying because it was a pastime I enjoyed.

Day 5: Moving to the ‘burbs

Modern Korean living room

I thought my Airbnb was in the “suburbs,” but I was wrong. I moved to a guesthouse (also in Incheon) that is practically in the countryside. I mean that quite literally: there are fields in between clusters of high rise apartments. I went for a run in the afternoon, but there was nowhere to go… and it started snowing.

The guesthouse is a converted apartment. It’s quite spacious and the only other guests are a father/daughter pair from Brazil (of Korean descent). Portuguese is a weird language; it often sounds like they are speaking Russian, or French, or something I can’t quite put my finger on.

Nothing exciting to report on the travel front. Spent most of the day working on the website (you may notice that I’ve moved all the Chinese content to a separate subdomain [zh.linguistese.com]).

Good news for the guesthouse is that breakfast (cereal) is included and there is an entire cabinet full of ramen, so I can be extra cheap for a couple of days.


Gimbab (Lunch): 6,500 W
Guesthouse (3 nights): 270 RMB
Total: 309.4 RMB
(USD 45.88)

Running Total: 2131 RMB (USD 316)
Daily Average: 426 RMB (USD 63.2)

Duolingo

I’ve been wanting to talk about Duolingo for a while because I have a lot of feelings regarding its teaching methodology. I tried it first a few years ago, but quickly burned out for three reasons: a) the sentences were weird, b) the grading was tough on spelling/formatting, and c) there was no instruction. So, one had to start a lesson and make mistakes to at least get a sentence level translation, but the 3-strikes-and-you-are-out system made learning a real chore. When I started up again over a year ago, it was clear they had worked out a lot of the flaws, and the “relaunch” last spring shows the company is continually trying to optimize the system. Nevertheless, there are some major flaws with Duolingo:

  1. Overreliance on full sentence translations. Sometimes the question bank pulls up a word-word pair or a word-image pair, but it seems the vast majority of Duolingo’s question bank involves full sentences. While the context allows greater flexibility in learning expressions (when context is needed for disambiguation) and this allows for testing of grammar (S-V agreements, conjugation, etc), sometimes it can be useful just to drill the smaller elements of a language. For a mobile-first learning platform, full sentences involve a lot of typing on a phone keyboard. Maybe I’m just old, but that’s a pain in the neck. On the other hand, when selecting from the word bank, there is usually only one grammatical sentence that can be formed (I’m thinking translating from target language to source language), so one could essentially ignore the target language.
  2. Limited question bank. Not only are most vocabulary items stuck within the context of a sentence, they only exist in one or two sentences. The main exceptions are the earlier levels where you drill “he sits, she sits, the dog sits, the duck sits, … ad nauseam.” One needs to see words in a variety of contexts to really master them. However, after a few times through the set of questions in a particular lesson, one has memorized the meanings of all the possible sentences and can hack it.
  3. Rigid lesson structures. Duolingo segregates its question bank into discrete lesson units. So topic A has 2-6 lessons, and each lesson draws from its own question bank. Because the current crown system requires repeating all the lessons in the topic up to 11 times, one can quickly overlearn. For example, I know from memory that Japanese Hiragana 1 Lesson 1 has dog, cat, and bird. If I do a random review, and come across bird, then I automatically know the next questions are going to be about dogs and cats.
  4. Weak on Asian languages*. Having thoroughly explored Duolingo’s Korean and Japanese units as well as briefly playing around with Chinese, I can attest that Duolingo doesn’t understand the need for a stronger drilling of the language basics. I started with Japanese because I wanted to improve my recognition of hiragana and katakana, but because of the rigid lesson structures and poor overall design, I learn groups of 4-5 characters in discrete chunks, only needing to distinguish among the characters in that chunk (instead of against ones that are visually similar and easily confused). When testing character to romanization, Duolingo says the character out loud, undercutting the point of training sight recognition (and complains if you try muting your phone). Finally, there are a bunch of characters that aren’t even taught in the early sections or on their own at all, only being mixed into full sentences further down the lesson tree.
  5. No sense of learner’s vocabulary level. Though Duolingo offers the option to “test out” of chunks of lessons and touts the use of spaced repetition, the system really doesn’t have a clue what words I know and what words I don’t, meaning I am sometimes repeatedly answering questions that are way too easy, which is both boring and a waste of time. As far as testing out goes, it merely unlocks chunks of levels, so one could theoretically start learning midway through the tree. But the gamified design still incentivizes one to max out the earlier lessons.

Despite all this, I’ve been sticking with Duolingo for over a year. Firstly, because I do find it quite useful with Spanish and German. Secondly, despite its flaws, one does get exposed to the target language and learning does occur. Thirdly, most of these flaws are common to every language learning app/software I’ve seen.

*I have seen Lingodeer presented as an alternative to Duolingo with a better treatment of Asian languages. I’ve only used it a little so far, but other than substituting a deer for an owl, it seems to be similarly structured.

WOTD: 炸酱面

Would you like some noodles with your sauce?

炸酱面(zha2jiang4 mian4, “fried sauce noodles”), aka “noodles with fried bean sauce” is one of those dishes that I strongly associate with Beijing, though its origins lie in Shandong. Aficionados of Chinese cuisine would note that Beijing food (and all cooking in northeast China) is rooted in 鲁菜 (Lu3cai4 “Shandong cuisine”), one of the 8 great schools of Chinese cooking.

I’ve always loved proper 炸酱面 because the noodles are thick and chewy, and remind me of freshly made spätzle, while the freshly shredded cucumber and radish (or even more veggies if you get lucky) offset the rich and salty sauce.

Zhajiang mian is also a Korean dish (in a sense). Alternately called 자장면 or 짜장면 (jajangmyeon / jjajangmyeon), it is clearly the same dish linguistically. In culinary terms, jajangmyeon is its own dish in Korea. The sweeter sauce is full of onions and seafood (instead of just salty beans and pork scraps).

As a certain podcast host would say, always read the plaque. Interestingly enough, Incheon claims Jajangmyeon as one of its specialties, though it gives credit to the influx of Chinese workers in 1884. Apparently, they would make the sauce in China and carry it across on merchant ships, serving it on freshly made noodles.

Day 4: Around Incheon

My Airbnb was a small apartment I would describe as off in the suburbs, but I really can’t tell whether Incheon really has a central part. The city is somewhat shaped like a starfish, and if you include the islands, then it’s a starfish that just lost a fight to a shark. In any case, it was a comfortable little apartment with heat coming through the floor. I kept expecting the host to show up at some point as I was sleeping in the library, but I never met her.

In the morning I set off to explore, taking a long and not very scenic walk towards the Incheon City Hall and the green strip with the Arts and Music Hall. I bumped into a couple of missionaries on the way and saw a demonstration on the steps of the city hall, both sights I’ve long grown accustomed to not seeing.

Why is Chinatown always on a hill?

There wasn’t much to do around the central strip, so I hopped on the subway to head back towards the area around the port and main train station. More of a historic district, there is a well maintained Chinatown, a Japan town, and lots of monuments to various historic firsts, such as the opening of US-Korea relations in December 1882, or the introduction of Christianity to Korea in April 1885 (by Americans, of course). Being Incheon, a statue of General MacArthur was to be expected, but I found him overshadowed by the memorial to the student volunteer army.

My first order of business in Chinatown was lunch, and Korean style Chinese food is both insanely expensive and strange.

Helter Skelter

On the other side of Chinatown was some sort of “fairy tale village,” where several streets fully decorated the buildings, benches, and sidewalks with imagery from fairy tales (i.e. knock off Disney). I imagine it would be quite bustling in the summer, but there was a stinging wind coming in off the water and I’m not big into selfies.

I wandered over to the Sipo Culture street, which mainly seemed to be an area with an above average density of shops, restaurants, and cafes. I imagine Korea has a per capita coffee shop count that would make Howard Schultz proud, and I settled on an indie place with a Chilean theme. Since the espressos were so much cheaper than yesterday’s mainstream chain, I got two to justify my long working stint.

In the evening, I headed out to grab some food in the Sipo International Market, but it was bitterly cold and the market was already on the wane. I settled for a “loaf of bread” the shape of a basketball, which I felt satisfied the need for 特色 (special-ness) because I got to watch the production process through a window. Flat tortillas were inserted into a kiln to cook, and somehow puffed up.


Lunch (Chinatown): 7,000 W
Espresso x2 (Cafe de Santiago): 5,000 W
Bread Balloon: 3,000 W
Total: 15,000 W
(90 RMB)
(USD 13.37)

Running Total: 1822 RMB (USD 270.75)
Daily Average: 455.5 RMB (USD 67.69)

Man, I was really hoping spending so little one day would bring down the average more than that. I am starting to worry that my coffee habit is too expensive (never mind the fact that I have 3 bags of coffee with me).
Soju may be a more affordable writing juice.

Is Korean hard?

I’ve got this crazy notion that Korean is an easy language to learn. It’s probably just hubris, but I’d like to explore the idea. Mind you, I consider Chinese an easy language, so I may be peculiar to begin with.

At this point, I’ve finished reviewing the first half of the Korean textbook, almost finished reviewing Talk To Me in Korean Level 1, and have 82 crowns in Duolingo. (For more on these, see here). Also, when it occurred to me to download a Korean dictionary on my phone, I went ahead and picked up a couple other language learning apps to check out.

Why Korean should be hard

  • Politeness levels. Korean has several levels of speaking/writing ranging from very informal to very formal. Not only does one have to learn entire sets of grammar for each level, but it is also necessary to use them appropriately depending on who you are speaking to.
  • Grammar. Like Chinese, Korean does tend to be “telegraphic,” i.e. leaving a lot of information (especially about the subject of the sentence) to context. However, Korean is highly inflected with the particles attached to the ends of nouns to indicate whether they are the subject, theme, object, location, etc. and endings to verbs reflecting not only politeness level but also tense and aspect.
  • Pronunciation. Korean phonology is quite different from English, with more explicit use of unaspirated consonants and a ton of vowels that are rather difficult to distinguish between. Like Japanese, Korean has a fuzzy boundary between “l” and “r” as represented by the letter “ㄹ”. (btw, I’m not familiar with any part of China where there is confusion between “l” and “r”, so if you are going to make racist jokes, at least get the stereotype correct).
  • Numbers. Korean uses two sets of numbers: Native Korean numbers and Sino-Korean numbers which were borrowed from China back when “Chinese” sounded more like Cantonese. The native Korean numbers are crazy like English where there is only a murky connection between “two”, “twelve”, and “twenty.” (Unlike Chinese which is uber-logical with the “two”, “ten-two”, and “two-ten”). Some of the numbers also change when paired with measure words. (I believe Japanese also does this, and Chinese only does it for the number 2). Also, measure words!
  • Phonological Rules and Coarticulation. So, unlike Chinese where a syllable is a syllable, Korean, despite being written in blocks of syllables, has complicated rules where phonemes shift to the next syllable or change depending on whether they are followed by another syllable or what the next syllable starts with. These rules are even reflected in the grammar, where the particles used depend on the sound structure of the word or there are direct spelling changes in the conjugation to make it easier to pronounce. Most European languages do this too, but when it’s your native language you don’t think about it.
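
To make the last point concrete, here is a small Python sketch of the best-known case: the topic particle is 은 after a final consonant and 는 after a vowel. It relies on the way precomposed Hangul syllables are laid out in Unicode (starting at U+AC00, with 28 possible finals per initial-vowel pair, the first of which is "no final"). The function names are mine, and this covers only the basic alternation, not the many other sound rules.

```python
HANGUL_BASE = 0xAC00  # first precomposed syllable, 가
HANGUL_LAST = 0xD7A3  # last precomposed syllable, 힣
FINALS_PER_BLOCK = 28  # slot 0 means "no final consonant"


def has_final_consonant(word):
    """True if the last Hangul syllable of the word ends in a consonant."""
    code = ord(word[-1])
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        raise ValueError("last character is not a precomposed Hangul syllable")
    return (code - HANGUL_BASE) % FINALS_PER_BLOCK != 0


def topic_particle(word):
    """Pick 은 after a final consonant, 는 after a vowel."""
    return "은" if has_final_consonant(word) else "는"


for noun in ["김밥", "커피", "우유"]:
    print(noun + topic_particle(noun))
```

김밥 ends in ㅂ and takes 은, while 커피 and 우유 end in vowels and take 는. The same consonant-or-vowel check drives the 이/가 and 을/를 pairs.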

Why it is “easy”

Despite all that, I still feel that Korean will be easy to learn. This is primarily due to the fact that Korean not only has an alphabet, but it is a beautifully designed alphabet that seriously only takes a day to learn. (Compare that to the half a semester in college dedicated to learning the Arabic alphabet). Once you have an alphabet you can wander the streets looking at signs and playing your own personal version of “Hooked on Phonics” and this is where the “Korean is easy” notion really kicks in. I’m honestly astounded by how many loan words are floating around, both from English and Chinese. Maybe I’ve been in China too long, where loan words are highly “sinified”, but I really didn’t expect to see so much English. Consider the following vending machine:

How about a cuppa?

A quick little Korean lesson first: 커피 (keo pi) is coffee (“p” usually substitutes for “f”). The top row says 밀크커피 (mil keu coffee), 설탕 커피 (seol tang coffee), 크림 커피 (keu rim coffee), and 블랙 커피 (beul laek coffee). So seoltang is the Korean word for sugar, but the other three just sound out milk, cream, and black. I know for a fact that Korean has its own words for milk and black. In fact, milk is available on the bottom row, where it is properly written as 우유 (woo yoo).

I think a lot of the English loan words are just market-level stuff, but the Chinese goes deeper. Riding on the subway, every stop is written in Hangeul (Korean script), hanzi (Chinese characters, though sometimes I feel they are pulling double duty as kanji also), and English. With the hanzi and Korean syllables aligned, one quickly notices that more than 90% of the subway stop names sound a lot like the Chinese, e.g. yongsan instead of longshan, dongdaemun instead of dongdamen. I’m just guessing here, but I think there was an analog of the Norman invasion (which infused English with French vocabulary) between China and Korea.

So, to answer the question: is Korean easy? If you already speak English and Chinese, maybe. Being “literate” allows one to engage with much more of the language much quicker. Not only can one shortcut a lot of vocabulary by recognizing loan words, but the higher volume of linguistic input also powers incidental learning.

I have also heard that speaking Korean makes it much easier to learn Japanese.

How many characters do you need to know?

This really depends on what you mean by “know” and what you intend to do with your knowledge of Chinese characters.

Reading the Newspaper

There is a common factoid bandied about in the Chinese education world, that of the tens of thousands of characters out there, one only really needs to know about 2,000 (or 2,500 or 3,000) in order to read a newspaper. This is 废话 (fei4hua4 “garbage talk”). Firstly, what on earth do you want to read a Chinese newspaper for? Secondly, newspapers are hard. They are full of proper nouns, e.g. names of countries, world leaders, and companies, and either overly rely on abbreviations or get cute with wordplay. So, frankly unless “knowing” a character implies a deep knowledge of the morphemes it represents (i.e. a broad knowledge of the words it appears in and/or the ability to interpret it in new contexts), then any of the cited numbers of characters is going to be insufficient to read a Chinese newspaper with full comprehension.

Though one often encounters the cited factoid when someone is trying to sell you some newfangled way to learn a bunch of essential hanzi and master Chinese literacy, there is at least a logical basis to the myth.

Frequency Analysis

Firstly, frequency analysis of hanzi pretty consistently finds that a “smallish” set of hanzi cover a significant proportion of a given text or corpus. Chinese scholar Zhou Youguang compiled several character frequency studies to find that the first 1,000 most common characters have a 90% coverage, the first 2,400 most common characters have a 99% coverage, etc. (see table below)

# Characters    1000    2400    3800     5200      6600
% Coverage      90%     99%     99.9%    99.99%    99.999%

So, let’s say you know those 2,400 characters and are reading some text. Well, on average 1 in 100 characters is going to be unfamiliar. That really does not sound so bad, but again, recognizing a character doesn’t do much good if you don’t know the word it is part of. By way of personal anecdote, though I have a pretty extensive vocabulary (both in terms of raw characters and words), I’m constantly running into things that make me want to check the dictionary.
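
To put numbers on "1 in 100", here is a rough Python sketch. The first function measures coverage of a text by a set of known characters; the other two turn a coverage rate into expected unknown characters and the chance that a whole sentence contains no unknown character. The function names are invented for this example, and the last one assumes unknown characters are spread independently through the text, which real text certainly violates.

```python
def coverage(text, known_chars):
    """Fraction of Chinese character tokens in text that are in known_chars."""
    hanzi = [c for c in text if "\u4e00" <= c <= "\u9fff"]
    if not hanzi:
        return 0.0
    return sum(c in known_chars for c in hanzi) / len(hanzi)


def expected_unknown(coverage_rate, text_length):
    """Expected number of unfamiliar characters in a text of given length."""
    return text_length * (1 - coverage_rate)


def chance_all_known(coverage_rate, sentence_length):
    """Chance a sentence has no unknown character (independence assumed)."""
    return coverage_rate ** sentence_length


print(coverage("书包包子", {"书", "包"}))  # 3 of the 4 characters are known
print(expected_unknown(0.99, 1000))
print(chance_all_known(0.99, 20))
```

At 99% coverage a 1,000-character article still has about ten unfamiliar characters, and under the independence assumption roughly one 20-character sentence in five contains at least one of them.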

So which exactly are the most frequent characters? It depends on the corpus of texts you look at. I’d recommend the frequency list put together by Jun Da, since the scope far exceeds any reasonable expectations of vocabulary, covering 9933 hanzi (there are a handful of characters that appear in their traditional form as well as some which aren’t currently part of modern Chinese).

Lists of Common Characters

Xing Hongbing lists 15 “Common Character Lists” ranging from 1928 to 1985 and 5 “General Purpose Character Lists” created between 1965 and 1987, the most recent of which make up a system of 2,500 + 1,000 + 3,500 characters that pretty much encompass everything you are likely to run into unless you have a penchant for archaeology. Though construction of the common character lists took frequency into consideration, much like Special English, there is an attempt to balance raw frequency against utility. For example, among the first 1,000 most frequent characters on Jun Da’s list, there are 7 characters not among the “《现代汉语常用字表》常用字(2500)” [Modern Chinese Commonly Used Characters]. Those seven hanzi (i.e. 尔、伊、谓、诺、伦、俄、洛) are not particularly obscure, but they aren’t exactly high priority characters (unless you are from Russia, in which case you need 俄 on the first day); they are reserved for the upper levels of HSK test preparation (if at all). Nevertheless, all seven of those characters are in the second list: “《现代汉语常用字表》次常用字(1000)” [Modern Chinese Secondary Commonly Used Characters].
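
This kind of comparison is just a filter of one list against another. A minimal Python sketch, with tiny stand-in samples instead of the real lists (the actual frequency list and the 2,500/1,000 lists would have to be loaded from their own files), and a made-up function name:

```python
def missing_from(frequency_list, reference_set):
    """Characters in the frequency list that the reference set lacks,
    kept in frequency order."""
    return [c for c in frequency_list if c not in reference_set]


# Stand-in samples for illustration only, not the real lists.
frequent_sample = ["的", "一", "是", "尔", "俄"]
common_sample = {"的", "一", "是"}
secondary_sample = {"尔", "俄"}

not_common = missing_from(frequent_sample, common_sample)
not_in_either = missing_from(not_common, secondary_sample)
print(not_common)
print(not_in_either)
```

With the full lists loaded, the first call would reproduce the seven characters above and the second would come back empty, since all seven are in the secondary list.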

The two sets of commonly used characters are completely contained within the “《现代汉语通用字表》(7000)” [Modern Chinese Characters for General Purposes]. This list is actually part of the centralized push to standardize characters in the People’s Republic of China. If you come across the booklet, there are about 10 pages dedicated to listing out common variants of characters that are no longer “permitted” in printing official documents.

Educational Lists

The old HSK came with a fairly comprehensive list of characters and words divided into four levels. The test prep materials listed out approximately 800, 800, 700, and 600 characters to master (2900 total). Unfortunately, the new HSK not only cut the vocabulary range down significantly, it completely scrapped the idea of a character list separate from the recommended vocabulary words.

Ministry of Education standards for compulsory education (e.g. primary and junior high school) specify that students should recognize 1,600 characters by 2nd grade, 2500 characters by 4th grade, 3,000 characters by 6th grade, and about 3,500 characters by 9th grade (requirements for writing from memory are lower). As for what those characters are, the MOE published its own list of 3,500 “Common Characters for Chinese Language Courses” also following a 2500/1000 character split. This list mainly overlaps with the Common Chinese Character list described above, but there are differences. Interestingly enough, the list highlights 300 characters out of the 2500 set which should be taught first, but otherwise there are no suggestions as to the order in which the characters should be taught.

Conclusion

I don’t know how many characters you should know, but there is always more to know. If you are curious about the overlaps between the various character lists, I have a handy Excel sheet here.

See more resources


Sources:

周有光 (1992) 《中国语文纵横谈》,人民教育出版社。
[Zhou Youguang, 1992. Discussing the Length and Breadth of Chinese. People’s Education Press.]
referenced via —
邢红兵(2007)《现代汉字特征分析与计算研究》,商务印书馆。
[Xing Hongbing, 2007, Characteristic Analysis and Computational Research of Chinese Characters. The Commercial Press.]