My Chinese Learning Story, Part 2

Previously, I anecdotally recalled my earliest encounters with Chinese (outside the context of Crouching Tiger, Hidden Dragon) and covered the critical first year in China. I did forget to mention a few critical aspects of my learning program.

In the year between my first visit and ultimately moving to China, I did buy some Chinese learning PC software from a discount book store. Though I spent quite some time doing the vocabulary drills, the whole experience slipped my mind because of how ultimately stupid the endeavor was. The software didn’t teach “blue,” it taught “deep blue” and “light blue;” the food section focused entirely on Western foods like toast and pizza.

Furthermore, I should give credit where credit is due. I got a significant boost from a PC-based dictionary which included both animated stroke order for characters and a hand-drawn lookup function. This is what helped me figure out writing on my own. (Mind you, these were the pre-smartphone and pre-Pleco days.) The creator of the dictionary, Pablo, also made a useful character learning game called Pingrid.

You can find them both here. The latest version is fairly different from what I played 10 years ago, but the basic idea of having to find a target character given its pinyin and/or English definition (drawn from CEDICT) from a set of characters really helps one build sight recognition. Spaced repetition was involved and there was an incentive to find the character quickly. One could change the fonts and even select random fonts to really train visual recognition. The best part was probably the ability to upload your own vocabulary list and the system could handle up to 2-character long words. I highly recommend Pingrid for anyone looking to improve their reading speed.

Recognizing characters became a game I played in my daily life. From menus to bus stops, shop signs to the subtitles on televisions, I was on a constant quest to see what I recognized, and I was encouraged by the fact that from one week to the next, new characters and words had lodged themselves in my brain.

In China, it wasn’t just a matter of self-studying some textbooks, listening to tons of podcasts, and interacting in my daily life; I also invested a lot of time in building vocabulary lists. For every textbook chapter and podcast, I took notes of the vocabulary words used, collecting everything in a master Excel file. There were three main purposes to this. First, to see if I had already “learned” the word from a different source (or even a previous chapter). Second, to index the words by the characters contained within them. This allowed me to notice when a particular character was being used in multiple words and to build up an internal semantic representation of the characters. For example, in an early lesson of Short Term Spoken Chinese, the vocabulary list included 书包 (shu1bao1, “book bag”) and 包子 (bao1zi, “steamed stuffed bun”). It was obvious that 包 was contributing to the meaning of both words, and as I came across more words with 包, my understanding of the character slowly shifted. [I would be uploading the Excel files and/or screenshots, but those are on an external hard drive in a suitcase in Beijing.] Finally, and possibly most importantly, writing characters out helps one remember them. I only occasionally put any effort into handwriting, but typing pinyin into an IME, which brings up a list of homophones, is a process that forces one to really focus on the character as written in a textbook and match it to the screen. I took the typing task even further by transcribing the dialogues (textbook only) into a Word document.
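The character index described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the original Excel workbook: the function name `index_by_character` is hypothetical, and the sample list holds just the two words from the example.

```python
from collections import defaultdict


def index_by_character(words):
    """Map each character to the list of words that contain it."""
    index = defaultdict(list)
    for word in words:
        # set() so a word with a repeated character is listed only once
        for char in set(word):
            index[char].append(word)
    return dict(index)


# Sample vocabulary: the two words from the lesson mentioned above.
vocab = ["书包", "包子"]
index = index_by_character(vocab)
print(index["包"])  # ['书包', '包子']
```

Looking up a character then returns every word collected so far that uses it, which is the same grouping a sorted spreadsheet column would give.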

Anyways, eventually I ran out of Short Term Spoken Chinese textbooks that could be purchased at Book City, and being a cheapskate I didn’t want to invest a lot of money in other textbooks. So, I got a library card. That allowed me to access lots of 对外汉语 books (Chinese as a Foreign Language), including both other series of textbooks and various graded readers. I became voracious, and was willing to put in a little time skimming over books that were too easy for me. I found that quite helpful because it was a review/consolidation of the language in novel forms, and yet one occasionally came across new words to jot down.

At some point, I bought a used iPhone from a friend and discovered Pleco and a bunch of new podcasts. Though the phone was stolen a month later, I was hooked on the even more efficient study tool and picked up an iPod Touch in Hong Kong.

As my listening and reading comprehension grew, I did dive into Chinese media, watching several television series, including 《家有儿女》、《男人帮》、《喜羊羊与灰太狼》、《北京青年》、《北京爱情故事》、《康熙微服私访记》, and 《还珠格格》. I gave up on 《蜗居》 for being too melodramatic (not that the other serials weren’t soapy), and only watched about half of the episodes of 《我爱我家》, which for some reason didn’t have subtitles. I’m not sure how well the links will work outside of China, and it’s crazy to see how the internet companies are not so diligent about protecting IP. They were all free to watch back in the day; now most of the streaming sites want you to pay for a membership in order to watch a 20-year-old TV show.

And things continued mostly like that for the 4 years I lived in Shenzhen. Though I became busier with work, I would always spend at least one morning or afternoon a week in a coffee shop self-studying because it was a pastime I enjoyed.

WOTD: 炸酱面

Would you like some noodles with your sauce?

炸酱面(zha2jiang4 mian4, “fried sauce noodles”), aka “noodles with fried bean sauce” is one of those dishes that I strongly associate with Beijing, though its origins lie in Shandong. Aficionados of Chinese cuisine would note that Beijing food (and all cooking in northeast China) is rooted in 鲁菜 (Lu3cai4 “Shandong cuisine”), one of the 8 great schools of Chinese cooking.

I’ve always loved proper 炸酱面 because the noodles are thick and chewy, and remind me of freshly made spätzle, while the freshly shredded cucumber and radish (or even more veggies if you get lucky) offset the rich and salty sauce.

Zhajiang mian is also a Korean dish (in a sense). Alternately called 자장면 or 짜장면 (jajangmyeon / jjajangmyeon), it is clear we are talking about the same dish linguistically. In culinary terms, jajangmyeon is its own dish in Korea. The sweeter sauce is full of onions and seafood (instead of just salty beans and pork scraps).

As a certain podcast host would say, always read the plaque. Interestingly enough, Incheon claims Jajangmyeon as one of its specialties, though it gives credit to the influx of Chinese workers in 1884. Apparently, they would make the sauce in China and carry it across on merchant ships, serving it on freshly made noodles.

How many characters do you need to know?

This really depends on what you mean by “know” and what you intend to do with your knowledge of Chinese characters.

Reading the Newspaper

There is a common factoid bandied about in the Chinese education world, that of the tens of thousands of characters out there, one only really needs to know about 2,000 (or 2,500 or 3,000) in order to read a newspaper. This is 废话 (fei4hua4 “garbage talk”). Firstly, what on earth do you want to read a Chinese newspaper for? Secondly, newspapers are hard. They are full of proper nouns, e.g. names of countries, world leaders, and companies, and either overly rely on abbreviations or get cute with wordplay. So, frankly unless “knowing” a character implies a deep knowledge of the morphemes it represents (i.e. a broad knowledge of the words it appears in and/or the ability to interpret it in new contexts), then any of the cited numbers of characters is going to be insufficient to read a Chinese newspaper with full comprehension.

Though one often encounters the cited factoid when someone is trying to sell you some newfangled way to learn a bunch of essential hanzi and master Chinese literacy, there is at least a logical basis to the myth.

Frequency Analysis

Firstly, frequency analysis of hanzi pretty consistently finds that a “smallish” set of hanzi cover a significant proportion of a given text or corpus. Chinese scholar Zhou Youguang compiled several character frequency studies to find that the first 1,000 most common characters have a 90% coverage, the first 2,400 most common characters have a 99% coverage, etc. (see table below)

# Characters | 1,000 | 2,400 | 3,800 | 5,200 | 6,600
% Coverage | 90% | 99% | 99.9% | 99.99% | 99.999%

So, let’s say you know those 2,400 characters and are reading some text. Well, on average 1 in 100 characters is going to be unfamiliar. That really does not sound so bad, but again, recognizing a character doesn’t do much good if you don’t know the word it is part of. By way of personal anecdote, though I have a pretty extensive vocabulary (both in terms of raw characters and words), I’m constantly running into things that have me want to check the dictionary.
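For anyone who wants to check coverage on their own reading material, here is a minimal sketch in Python. It is not the method used in the studies cited above: the function name `coverage` and the toy sample string are assumptions for illustration, and it simply counts what share of a text is made up of that text's own most frequent characters.

```python
from collections import Counter


def coverage(text, top_n):
    """Share of the characters in `text` covered by its `top_n` most frequent characters."""
    # Ignore whitespace; everything else (including punctuation) is counted.
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for _, n in counts.most_common(top_n))
    return covered / total


# Toy sample, far too short to say anything about real Chinese text.
sample = "我的书包和我的包子"
print(coverage(sample, 3))
```

Run over a large corpus with `top_n` set to 1,000 or 2,400, the same calculation produces the kind of percentages shown in the table.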

So which exactly are the most frequent characters? It depends on the corpus of texts you look at. I’d recommend the frequency list put together by Jun Da, since the scope far exceeds any reasonable expectations of vocabulary, covering 9933 hanzi (there are a handful of characters that appear in their traditional form as well as some which aren’t currently part of modern Chinese).

Lists of Common Characters

Xing Hongbing lists 15 “Common Character Lists” ranging from 1928 to 1985 and 5 “General Purpose Character Lists” created between 1965 and 1987, the most recent of which make up a system of 2,500 + 1,000 + 3,500 characters that pretty much encompasses everything you are likely to run into unless you have a penchant for archaeology. Though construction of the common character lists took frequency into consideration, much like Special English, there is an attempt to balance raw frequency against utility. For example, among the first 1,000 most frequent characters on Jun Da’s list, there are 7 characters not among the “《现代汉语常用字表》常用字(2500)” [Modern Chinese Commonly Used Characters]. Those seven hanzi (尔、伊、谓、诺、伦、俄、洛) are not particularly obscure, but they aren’t exactly high priority characters (unless you are from Russia, in which case you need 俄 on the first day) and are reserved for the upper levels of HSK test preparation (if at all). Nevertheless, all seven of those characters are in the second list: “《现代汉语常用字表》次常用字(1000)” [Modern Chinese Secondary Commonly Used Characters].
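Checking which high-frequency characters are missing from a common character list is a simple set comparison. The sketch below is hedged accordingly: `missing_from` is a hypothetical helper, and the two short lists are toy stand-ins rather than the real frequency list or the real 2,500-character list.

```python
def missing_from(frequent, common):
    """Characters in the frequency-ranked list that the common list lacks, kept in rank order."""
    common_set = set(common)
    return [ch for ch in frequent if ch not in common_set]


# Toy stand-ins, NOT the real lists.
top_by_frequency = ["的", "一", "是", "尔", "俄"]
common_list = ["的", "一", "是", "人"]
print(missing_from(top_by_frequency, common_list))  # ['尔', '俄']
```

With the full lists loaded from files, the same call would return the characters that one list ranks highly but the other leaves out.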

The two sets of commonly used characters are completely contained within the “《现代汉语通用字表》(7000)” [Modern Chinese Characters for General Purposes]. This list is actually part of the centralized push to standardize characters in the People’s Republic of China. If you come across the booklet, there are about 10 pages dedicated to listing out common variants of characters that are no longer “permitted” in printing official documents.

Educational Lists

The old HSK came with a fairly comprehensive list of characters and words divided into four levels. The test prep materials listed out approximately 800, 800, 700, and 600 characters to master (2900 total). Unfortunately, the new HSK not only cut the vocabulary range down significantly, it completely scrapped the idea of a character list separate from the recommended vocabulary words.

Ministry of Education standards for compulsory education (e.g. primary and junior high school) specify that students should recognize 1,600 characters by 2nd grade, 2500 characters by 4th grade, 3,000 characters by 6th grade, and about 3,500 characters by 9th grade (requirements for writing from memory are lower). As for what those characters are, the MOE published its own list of 3,500 “Common Characters for Chinese Language Courses” also following a 2500/1000 character split. This list mainly overlaps with the Common Chinese Character list described above, but there are differences. Interestingly enough, the list highlights 300 characters out of the 2500 set which should be taught first, but otherwise there are no suggestions as to the order in which the characters should be taught.

Conclusion

I don’t know how many characters you should know, but there is always more to know. If you are curious about the overlaps between the various character lists, I have a handy Excel sheet here.

See more resources


Sources:

周有光 (1992) 《中国语文纵横谈》,人民教育出版社。
[Zhou Youguang, 1992. Discussing the Length and Breadth of Chinese. People’s Education Press.]
referenced via —
邢红兵(2007)《现代汉字特征分析与计算研究》,商务印书馆。
[Xing Hongbing, 2007, Characteristic Analysis and Computational Research of Chinese Characters. The Commercial Press.]

WOTD: 部队火锅

Before…

It’s weird coming to Korea when one is only familiar with the food through the context of China. I know the names of several classic dishes in both English and Korean, but have no idea what they would be called on an English menu.

部队火锅 (bu4dui4 huo3guo1 “army hot pot”) is a dish whose name I always assumed the Chinese had taken some liberties in translating. The Korean is 부대찌개 (budae jjigae), so one would be forgiven for thinking that 部队 (bu4dui4) is just a sound loan, while 火锅 (huo3guo1) is a slight improvement over jjigae (a general term for stews) since it is cooked on the table. I always figured the “army” aspect of it described the way that a lot of people could eat from the same pot.

In truth, the 部队 refers to American troops, who apparently introduced the key ingredients during the Korean War. Though the ingredients vary depending on the “flavor” you order, they almost invariably include hot dogs and Spam (or other off-brand processed “ham”). America’s contribution to Korean cuisine. I didn’t realize how popular Spam is here, but I have seen several stores selling gift baskets of Spam.

…and after.

This particular pot was kimchi flavored of course.


WOTD: 理发

Getting a haircut on the street near the Worker’s Gymnasium

Getting one’s hair cut in a foreign language is hard, especially when there is a swirling mass of overlapping vocabulary. To begin with, how does one even say hair? Chinese distinguishes between hair on one’s head and hair on one’s body. The first is 头发 (tou2fa, literally “head hair”) while the second is 毛 (mao2, “hair/feather/down”). Don’t get 发 (fa4 “hair”) confused with 发 (fa1 “send out”); the two characters are only the same because of simplification. Also, note that fa4 loses its tone in the 头发 construction.

To cut one’s hair, the operative word is 剪 (jian3 “scissors*”) and one could say 剪头发 (jian3 tou2fa). However, one doesn’t usually just cut one’s hair in China. It is pretty standard to get a 洗剪吹 (xi3 jian3 chui1 “wash, cut, blow”) where you get a shampoo first, haircut, rinse, and blow dry as a package deal. In smaller shops, you can save a buck by opting for a haircut only: 单剪 (dan1 jian3 “single cut”).

Gender also comes into play in Chinese between the pair of words 理发 (li3fa4 “tidy hair”) and 美发 (mei3fa4 “beautiful hair”), much like the distinction between a barbershop and a beauty salon. Both words can add a 师 (shi1 “master”) to the end to refer to the person holding the scissors (e.g. 理发师,美发师) or a 店 (dian4 “shop”) to the end to refer to the room where it happens (e.g. 理发店,美发店). Prices in salons tend to rely on the “experience level” of a hairdresser, and there is a whole lexicon of terms giving them important-sounding titles, which I won’t go into here, because I always seek out the cheapest options.

In parts of China (even Beijing) where old people still make up much of the community, you may find an old barber with a pair of shears in a public park or on the street side. Give him a try.


*To refer to scissors, one needs to add 刀 (dao1; “knife”) to the end. By itself, 剪 typically functions as a verb meaning “to cut as scissors do,” so the full term 剪刀 (jian3dao1) could be literally thought of as “knife which cuts like scissors,” i.e. scissors.

WOTD: Bags

My worldly possessions

I was working on Korean in Duolingo and I came across the following:

공공칠가방
gong-gong-chil ga-bang

I immediately recognized the root 가방 as “bag” (which in my semantic space is centered around the prototypical schoolbag) and since it was a Numbers lesson, the first three syllables correspond to 0-0-7. So, a “James Bond bag,” or a briefcase.

Chinese, unfortunately, is not so creative in its description of a briefcase. The two main ways of saying it are 公文包 (gong1wen2bao1) and 皮包 (pi2bao1). 包 works in Chinese as a general term for bags or anything with wrapping (e.g. see the Oscar-nominated animated short Bao), and the modifiers work by describing what is stored in the bag (公文, “public documents” or briefs if you will) or what the bag is usually made of (皮, “skin” i.e. leather).

I really like the word 皮包 (pi2bao1) because it is also used in the compound word 皮包公司 (pi2bao1 gong1si1, “briefcase company”). As one can imagine, a company based out of a briefcase may not be the most reliable, so it refers to fly-by-night operations. Do be careful with 皮包, however: if you reverse the order of the characters, you refer to a portion of the male anatomy that is removed in a circumcision.

包 (bao1) is an extremely productive character, making up 288 words in my master list of Chinese words. Ironically, however, 包 is not used to describe luggage, which is 行李 (xing2li5, “travel plum(?)”). 李 is a strange character. Its base meaning is plum, it is a very common surname, and it shows up in the word 行李.


WOTD: 钱

Yuan, Yen, Won

Qian2 (钱, “money”) does not exactly merit “Word-of-the-Day” status, being a high frequency character and level one HSK word, but it has been on my mind a lot recently with 6 visits to the bank in the past week in order to convert some RMB into other currencies.

If you don’t already know, Chinese people refer to their money as “the People’s Money,” i.e. 人民币 (ren2min2bi4, “People Money”), much like it is the People’s Republic of China, the People’s Liberation Army, and the People’s Park. The money is denominated in yuan (元, second tone, “dollars”), jiao (角, third tone, literally “horn” but meaning a tenth of a yuan), and fen (分, first tone, “fraction” and meaning one hundredth of a yuan). Paper currency comes in denominations of 1, 5, 10, 20, 50, and 100 yuan and all feature the Great Helmsman*. Exchange rates fluctuate, but a “Pink Mao” is generally worth about 15 bucks American. There are 1 yuan coins and an assortment of coins and tiny paper bills for 1 and 5 jiao, as well as fen, although fen are seldom used. It’s cheaper for a supermarket to round down than to deal with things worth a fraction of a penny.
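The yuan/jiao/fen relationship (1 yuan = 10 jiao = 100 fen) is easy to express as arithmetic. The sketch below uses hypothetical function names, and rounding down to the nearest jiao is an assumption about how a till might drop the fen, not a documented rule.

```python
def split_amount(total_fen):
    """Break an amount given in fen into (yuan, jiao, fen)."""
    yuan, rest = divmod(total_fen, 100)  # 100 fen to the yuan
    jiao, fen = divmod(rest, 10)         # 10 fen to the jiao
    return yuan, jiao, fen


def round_down_to_jiao(total_fen):
    """Drop the leftover fen (assumed rounding rule), returning an amount in fen."""
    return total_fen - total_fen % 10


print(split_amount(1234))        # (12, 3, 4)
print(round_down_to_jiao(1234))  # 1230
```

Keeping amounts as whole numbers of fen avoids the floating-point surprises that come with storing 12.34 directly.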

China’s neighbors to the east also use yuan, though in their language systems it gets rendered won and yen, respectively. Korean Won are simply called 韩元 (han2yuan2, “Korea Dollar”, e.g. 원) and Japanese Yen are simply called 日元 (ri4yuan2, “Japan Dollar”, e.g. 円) in Chinese. (Pro-tip: 円 is in your Chinese IME under “yuan.”) Though US dollars and Euros also get the yuan treatment, not all currencies are “yuan.”

So, as for why I went to the bank nearly every day for a week: China has extremely onerous controls on foreign currency. Non-citizens are limited to converting the equivalent of USD 500 per day. The self-service “smart” terminals that some banks have introduced (e.g. ICBC) are basically limited to Chinese ID card holders, meaning one has to wait up to an hour to see a teller, after which it takes another 20 minutes to make several photocopies and pictures of your passport and have a manager sign off on the transaction. Oh, and they make you wait two days before releasing the cash.**


*The current design of renminbi was introduced in 1999, and though anti-counterfeiting measures have been added, the basic design is consistent. See Lethal Weapon 4, which revolves around counterfeiting Chinese yuan, for a glimpse of the previous design.
**Actual conditions depend on the city and bank.

My Chinese Learning Story

Part One

Before China, I had absolutely no interest in learning Chinese. Long ago, I was looking to learn a non-European foreign language and my university offered two: Chinese and Arabic. I chose Arabic. Ironically, I am fluent in Chinese today while Arabic remains inscrutable.

Prior to visiting China the first time, my mom asked me if I intended to learn any. My answer was an emphatic no. I was going to be backpacking with a friend, who ostensibly was already fluent, and I figured there should be enough English-speakers in the touristy parts of Xi’an, Beijing, and Shanghai.

Nevertheless, it is hard not to pick up at least a few words when surrounded by the language, and it certainly helps to have a non-native speaker modeling the essentials of basic communication. I seem to recall learning “ma3 ma3 hu1 hu1” (horse horse tiger tiger, i.e. so-so) as my first word in Chinese, then getting the pronouns “ni3” (you), “wo3” (I), “ta1” (he/she/it) on the second day. Afterwards, there wasn’t much conscious effort to get a handle on: good, bad, have, don’t have, want, don’t want, this, that, etc. Since my traveling companion went around telling every street peddler and beggar that I was a rich man, I was forced to scramble to spit out a “mei2you3” or “bu2yao4.” I did get my hands on a pair of phrase books during the travels, which exposed me to a fair amount of vocabulary and sentence structure. (The Lonely Planet phrase book used such a weird romanization system.) Nevertheless, during the solo portions of my travels, it was extremely difficult to get around. I recall getting a taste for Chinese characters, learning “大” (da4 “big”) from the highway mileage signs between Kunming and Dali and getting North, South, East, and West figured out from the street signs in grid-like Xi’an.

I had caught the China bug, so even back in the U.S., I put a little effort into Chinese language podcasts (i.e. ChinesePod), but I recall even the Newbie level was too hard for me and I didn’t make any progress on Chinese.

When I moved to China, I would assess my Chinese level as pretty close to zero. I could count to ten and say hello, but had no clue what anyone was saying to me. Since I didn’t have any actual job to occupy my days, I spent quite a few hours a day studying Chinese, typically from a borrowed introductory textbook (Short-term Spoken Chinese) and podcasts (which I began to systematically and obsessively listen to).

My first month in Shenzhen, I went to a particular Lanzhou Pulled Noodle restaurant, which had both the picture menu on the wall and the one-page laminated text-only menu. I worked my way down the text-only menu blindly, trusting in the fact that a Muslim restaurant wouldn’t serve me any pig brains. I copied the name of the dish I ordered into a notebook and used my best judgement of what came out to try to decipher the Chinese characters. Though I was quickly able to distinguish 大 (da4 “big”) from 小 (xiao3 “small”) and 牛 (niu2 “cow”) from 羊 (yang2 “sheep”), I recall being utterly flummoxed by 面 (mian4), which had so, so many entries in the dictionary, e.g.: “face, side, surface, aspect, top, classifier for …”

I spent a lot of time finding places to hang out in, and having the same basic conversations over and over again. Even understanding only 50% of what was said to me, it was pretty easy to guess that people were asking “what country am I from”, “am I married”, and other questions of that ilk. I’m certain I gave some funny answers when I completely missed the mark.

Alcohol also served a major role in the early learning process, both in trying to flirt with the waitresses at a bar owned by a friend of mine and in drunkenly conversing with clients from Hong Kong in a combination of broken English and broken Mandarin.

As I found odd jobs which required a lot of commuting around the city, podcasts (mostly ChinesePod and ChineseLearnOnline, though I eventually added PopupChinese) became an essential part of the learning experience. I invested the time on my computer to edit the 10+ minute long lessons down to the approx. 30 sec. long content cores and built playlists to shuffle through them. I could “review” 50 lessons in a half hour long bus ride.

Time passed and I kept plugging away on the Chinese, moving on to the next textbook in the series. Within about 6 months, I was confident enough to do a short solo travel (3 days). Within about 9 months, I took a longer solo travel into the backwoods (2 weeks). The following year, I took and passed the Intermediate level HSK (old edition). I don’t recall how much, if any, writing was required for that.


WOTD: 冬泳

A warm sunny day in Beijing

Swimming (游泳 “you2yong3”) is one of those words composed essentially of “swim-swim” to refer to the activity in general. However, the two characters break off to form rather interesting collocations on their own.

游 (which I can never remember how to write without looking at it) has a much broader semantic space than “swimming.” Its original meaning has something to do with rivers, which can still be seen in the words for upstream and downstream (上游 “shang4you2” and 下游 “xia4you2” respectively), while its uses in 游戏 (you2xi4 “game”) and 旅游 (lv3you2 “travel”) occur an order of magnitude more often than 游泳.

泳, on the other hand, is pretty much exclusively related to swimming and is a sticky morpheme (meaning it shouldn’t be showing up alone). If you are watching the Olympics on CCTV Sports, you’ll see it show up in the names of the various swimming styles, such as 仰泳 (yang3yong3, face up-swim, i.e. backstroke), 蛙泳 (wa1yong3, frog-swim, i.e. breaststroke), and 蝶泳 (die2yong3, butterfly-swim, i.e. butterfly stroke); in swimming accessories, such as 泳衣 (yong3yi1, swim-clothes) and 泳帽 (yong3mao4, swim-hat); and for special types of swimming, such as the word of the day.

冬泳 (dong1 yong3, winter-swim) is literally what the hanzi suggest it means: swimming in the winter. It’s quite a popular activity in China, among old men, who swear by the daily ritual as a way to stave off colds.

WOTD: 馅儿

Yummy

馅 (xian4 or xian4r) is the general term for the filling or stuffing in all kinds of dumplings (such as the 水饺 [shui3 jiao3, boiled dumplings] pictured above). The word is typically erhua‘d, meaning that it is pronounced like “xiar”.

An interesting thing about 馅 is that the meaning has drifted to encompass ground meat as sold in the supermarket. I am fortunate to live near a supermarket where the butcher section has ready-to-buy ground beef and ground mutton, which I find myself buying fairly regularly to make chili, meatballs, and tacos.

The word for “grind” is 磨 (mo2) and that works for grinding grains, spices, and coffee, as well as meat. If you need something to be ground on-demand, you’d ask for it to be 磨成粉 (mo2 cheng2 fen3) or 磨成馅儿 (mo2 cheng2 xian4r) if the end result is a powder or meat filling respectively.