Duoyinzi (多音字, duo1yin1zi4), or polyphonic characters, are hanzi that have more than one reading (i.e. pronunciation). It’s bad enough that there are so many hanzi to learn, but the existence of duoyinzi makes learning Chinese that much more of a headache.* According to my research, duoyinzi constitute about 12.6% of Chinese characters (out of roughly 9,000 simplified hanzi in modern use). However, if frequency is taken into consideration, duoyinzi account for over 44% of hanzi occurrences, meaning that a typical duoyinzi shows up far more often in running text than a typical single-reading character.
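The gap between the two percentages comes from counting distinct characters in one case and occurrences in the other. Here is a minimal sketch of that arithmetic; the function name and the toy counts are made up for illustration and are not the data behind the 12.6% and 44% figures.

```python
def type_and_token_share(freq, polyphonic):
    """Return (share of distinct characters, share of occurrences) that are polyphonic.

    freq: dict mapping each character to its occurrence count in some corpus.
    polyphonic: set of characters known to have more than one reading.
    """
    total_types = len(freq)
    total_tokens = sum(freq.values())
    poly_types = sum(1 for ch in freq if ch in polyphonic)
    poly_tokens = sum(n for ch, n in freq.items() if ch in polyphonic)
    return poly_types / total_types, poly_tokens / total_tokens


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
toy_freq = {"的": 50, "和": 20, "参": 5, "鸟": 10, "树": 5, "猫": 5, "桌": 3, "椅": 2}
toy_polyphonic = {"的", "和", "参"}

type_share, token_share = type_and_token_share(toy_freq, toy_polyphonic)
print(f"{type_share:.1%} of characters, {token_share:.1%} of occurrences")
```

In this toy corpus the three duoyinzi are 3 of 8 distinct characters but 75 of 100 occurrences, which is the same kind of skew as in the real figures.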
Most duoyinzi have only two valid pronunciations, though a handful of them have more. The worst offenders are 和 and 嗯, which have 7 and 8 readings, respectively.
I tend to group duoyinzi into three categories. The first consists of those whose alternate pronunciation is the same except that the tone is neutralized. This is most common in reduplications such as 妈妈 (ma1ma5, “mother”), 爸爸 (ba4ba5, “father”), 哥哥 (ge1ge5, “older brother”), and 姐姐 (jie3jie5, “older sister”), as well as in word-final bound morphemes such as 子 in 杯子 (bei1zi5, “cup”), 桌子 (zhuo1zi5, “table”), and 椅子 (yi3zi5, “chair”). A linguist could argue that these are online phonological changes to the base pronunciation. However, there are cases where the neutral tone serves as a lexical distinction: 东西 means “east and west” when pronounced dong1xi1, but “thing/stuff” when pronounced dong1xi5.
The second category of duoyinzi involves tonal changes more generally. For example, 兴 can be read xing1 or xing4, meaning “prosper” or “excitement” respectively; 为 can be read wei2 or wei4, acting as a verb or preposition (and meaning “act” or indicating purpose) respectively; and 数 can be read as shu4 or shu3 meaning “number” or “to count” respectively. One could posit a theory that there is an underlying relationship between the two words in which the tones serve to distinguish variations in meaning. These are the hardest duoyinzi to keep straight.
The final category of duoyinzi is the easiest because the pronunciations are the furthest apart. One classic example is 参差 (cen1ci1, “uneven”), in which both characters depart from their typical readings (参 can1, “join,” and 差 cha4, “differ”). 参 also has the reading shen1 when referring to ginseng (人参 ren2shen1), and 差 also has the reading chai1 for going on an errand.
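The three categories above can be told apart mechanically by comparing two readings written in numbered pinyin. The following is a rough sketch rather than a linguistic tool; it assumes every reading ends in a tone digit from 1 to 5, with 5 marking the neutral tone as in the examples above, and the function names are invented for illustration.

```python
def split_reading(reading):
    """Split numbered pinyin such as 'xing4' into ('xing', 4)."""
    return reading[:-1], int(reading[-1])


def classify(reading_a, reading_b):
    """Name the category that a pair of readings for one hanzi falls into."""
    syllable_a, tone_a = split_reading(reading_a)
    syllable_b, tone_b = split_reading(reading_b)
    if syllable_a != syllable_b:
        return "different syllable"   # third category, e.g. can1 vs. cen1
    if tone_a == tone_b:
        return "same reading"         # not a duoyinzi pair at all
    if 5 in (tone_a, tone_b):
        return "neutral tone"         # first category, e.g. xi1 vs. xi5
    return "tone change"              # second category, e.g. wei2 vs. wei4


print(classify("xi1", "xi5"))    # 西 in 东西
print(classify("wei2", "wei4"))  # 为
print(classify("can1", "cen1"))  # 参
```

A pair such as cha4 and chai1 for 差 also lands in the third category, since the sketch treats any difference in the syllable itself as "different syllable" regardless of tone.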
So, what is one to do about duoyinzi? Other than just memorizing them, I think it may be helpful when learning new vocabulary (with the correct pronunciations) to check whether one has already learned a particular hanzi with a different reading. Highlighting the difference between existing knowledge and new knowledge should both raise awareness and reduce confusion. For the learner with HSK aspirations, I’ve pulled out all of the duoyinzi in the 5,000-word vocabulary set into a handy cheat sheet here.
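For anyone who keeps a vocabulary list on a computer, the check described above is easy to automate. This is only a sketch under stated assumptions: each word is stored with one numbered-pinyin reading per character, and the function name and data layout are hypothetical rather than taken from any existing flashcard tool.

```python
def find_new_readings(known, word, readings):
    """Flag characters in `word` already learned with a different reading.

    known: dict mapping a hanzi to the set of readings seen so far (updated in place).
    word: the new vocabulary item, e.g. "人参".
    readings: one numbered-pinyin reading per character, e.g. ["ren2", "shen1"].
    Returns a list of (hanzi, new_reading, previously_seen_readings) tuples.
    """
    flagged = []
    for hanzi, reading in zip(word, readings):
        seen = known.get(hanzi, set())
        if seen and reading not in seen:
            flagged.append((hanzi, reading, sorted(seen)))
        # Record the reading so later words are compared against it too.
        known.setdefault(hanzi, set()).add(reading)
    return flagged


known_readings = {}
print(find_new_readings(known_readings, "参加", ["can1", "jia1"]))   # nothing learned yet
print(find_new_readings(known_readings, "人参", ["ren2", "shen1"]))  # flags 参
```

The second call flags 参 because it was first learned as can1 and now appears as shen1, which is exactly the moment to stop and note the difference.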
*I have been informed that the usage of kanji in Japanese (which are based on a set of 1,000+ hanzi) is even wilder in its layering of multiple pronunciations.