I want to make a post covering phonosemantics and components (or what people call radicals) in Chinese, which is practically essential when you’re finally on that character grind in the thousands.
Introduction
Chinese characters are not thousands of unrelated glyphs whose meaning and pronunciation you have to memorise one by one. Rather, most characters are built from a couple hundred “components”.
These components can be standalone characters, corrupted components from the several thousand years of change, or meaningless parts inserted for simplification purposes (which is why it’s worthwhile to learn - at least to read - both character sets - no politics should come into play here!).
In an oversimplified sense, you can think of two main categories of Chinese characters: pictographic/ideographic (象形 and 指事) and phono-semantic (形声). Of course there are actually six categories, but that’s not important here.
Ideograms and Pictograms
These are the first characters you learn in Chinese, the 一、二、三、上、下 etc representing ideas, and 人、火、目、日、月 etc representing things. These, unfortunately, you’ll have to just remember.
There are a few other characters which I would call un-disassemblable, like 我, which might originally have been composed of components but should be treated as one object now. (Sidenote: many beginners mix up 我 and 找, whilst natives see them as entirely different, because natives immediately see 找 as 扌〔手〕+戈 whilst 我 is just 我.)
There are also ideograms made of multiple smaller characters, which give no hint of the sound but explain the meaning perfectly.
休xiu1 means rest: a man 亻〔人〕 resting against a tree 木. 森林sen1lin2 is forest, made up of many trees; 尖jian1 is small 小 on top of big 大, meaning sharp; and 尘*chen2 is small 小 earth 土, for dust. 双*shuang1 is two 又 (meaningless here), for a pair. 泪*lei4, tears, is water 氵〔水〕 plus eye 目, and… you get the idea.
*These characters are simplified – 塵->尘 which idk what the trad was on about; 雙(two birds 隹 in one hand 又)->双; 淚->泪.
Phono-semantic Characters
The vast majority of characters, however, are not ideo- or pictograms, but rather are “phono-semantic” characters – characters with a phonetic sound-hint component, and a semantic meaning-hint component.
Usually, the phonetic part appears on the right of a left/right character, and on either half of a top/bottom character, but this is not a rule, and spotting it takes some experience, because there are a lot more phonetic components than semantic ones.
when it goes well
A classic example is based on 马〔馬〕 ma3 (horse), with the characters 吗ma (question particle)、妈ma1 (mother)、骂ma4 (berate). Note how they all contain 马, horse (on the right in 吗 and 妈, at the bottom in 骂). It’s there for the “ma” sound, and the characters don’t have anything to do with horses, except of course for your mother. A 口kou3 (mouth) component on the left usually indicates onomatopoeia, exclamations, or something to do with the mouth or speech. The 女nv*3 (woman) component indicates, well, a woman, or things to do with women. The two mouths 口口 on top of 骂 – I think the imagery should be vivid enough :)
*v is a replacement for u-umlaut, and is also what you’d type in a pinyin IME
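If you want to turn the typed form used in this post (ma3, nv3) into proper accented pinyin, here is a minimal Python sketch. The function name `numbered_to_marked` is my own, and it assumes one syllable at a time plus the usual mark-placement rule (a or e if present, the o of "ou", otherwise the last vowel).

```python
import unicodedata

# Combining diacritics for tones 1-4 (macron, acute, caron, grave).
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiouü"


def numbered_to_marked(syllable: str) -> str:
    """Turn one typed syllable such as 'nv3' into accented pinyin."""
    # 'v' stands in for u-umlaut, as on a pinyin IME.
    syllable = syllable.replace("v", "ü")
    tone = 0
    if syllable and syllable[-1] in "12345":
        tone = int(syllable[-1])
        syllable = syllable[:-1]
    if tone not in TONE_MARKS:
        return syllable  # neutral tone: no mark
    lower = syllable.lower()
    # Placement: 'a' or 'e' if present, the 'o' of 'ou', else the last vowel.
    if "a" in lower:
        idx = lower.index("a")
    elif "e" in lower:
        idx = lower.index("e")
    elif "ou" in lower:
        idx = lower.index("o")
    else:
        positions = [i for i, ch in enumerate(lower) if ch in VOWELS]
        if not positions:
            return syllable
        idx = positions[-1]
    # Compose vowel + combining mark into a single accented letter.
    marked = unicodedata.normalize("NFC", syllable[idx] + TONE_MARKS[tone])
    return syllable[:idx] + marked + syllable[idx + 1:]


print(" ".join(numbered_to_marked(s) for s in ["nv3", "ma", "ma1", "ma4"]))
```

Syllables typed without a digit (like the neutral-tone 吗ma) pass through unchanged.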
Other examples of nice phono-semantics are 主zhu3 (lord) leading to 住zhu4 (reside), 驻zhu4 (halt/garrison), 注zhu4 (concentrate), 柱zhu4 (pillar), etc. 蜘蛛zhi1zhu1 (spider) is the 虫 bug radical plus 知zhi1 and 朱zhu1.
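One way to keep these straight while studying is to store each character as a semantic part plus a phonetic part. Below is a rough Python sketch of that idea using only the examples above. The `Compound` class and the helper names are made up for illustration, and the decompositions are typed in by hand, not pulled from any dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    char: str
    semantic: str  # meaning-hint component
    phonetic: str  # sound-hint component
    pinyin: str    # numbered pinyin; neutral tone has no digit


# Readings of the phonetic components on their own.
PHONETIC_READING = {"马": "ma3", "主": "zhu3", "知": "zhi1", "朱": "zhu1"}

COMPOUNDS = [
    Compound("吗", "口", "马", "ma"),
    Compound("妈", "女", "马", "ma1"),
    Compound("骂", "口口", "马", "ma4"),
    Compound("住", "亻", "主", "zhu4"),
    Compound("驻", "马", "主", "zhu4"),
    Compound("注", "氵", "主", "zhu4"),
    Compound("柱", "木", "主", "zhu4"),
    Compound("蜘", "虫", "知", "zhi1"),
    Compound("蛛", "虫", "朱", "zhu1"),
]


def strip_tone(pinyin: str) -> str:
    """Drop the trailing tone digit, if any."""
    return pinyin.rstrip("12345")


def hint_holds(c: Compound) -> bool:
    """True when the character sounds like its phonetic part, ignoring tone."""
    return strip_tone(c.pinyin) == strip_tone(PHONETIC_READING[c.phonetic])


def series(phonetic: str) -> list[str]:
    """All listed characters that share one phonetic component."""
    return [c.char for c in COMPOUNDS if c.phonetic == phonetic]


for c in COMPOUNDS:
    print(c.char, c.semantic, "+", c.phonetic, c.pinyin, hint_holds(c))
```

Note that 马 shows up both as a phonetic (in 妈) and as a semantic (in 驻), which is exactly why you need to know which role a component is playing.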
when it goes less well
Chinese is a many-thousands-year-old family of languages, and pronunciation changes faster than writing. Just as English speakers today can’t read Old English, in Chinese the drift shows up as phonetic parts going out of whack.
Consistent Weirdness
Take 工gong1 (work), which is the phonetic in 红hong2 (red), 虹hong2 (rainbow), 巩gong3 (consolidate; itself the phonetic of 恐kong3, fear), and 江jiang1, river- huh wait, what? Well, the “j” sound is a relatively recent Mandarin thing, and they’d all be pronounced something akin to “k(i)ang” in older topolects / Middle Chinese.
It is usually fairly consistent - 失shi1 (lose) often gives a “die” reading, such as in 跌die1 (stumble), 迭die2 (again and again), and 铁*tie3 (iron), but there are always exceptions. So remember the rule, remember the less intuitive rule, and then try to remember the exceptions.
Inconsistent Weirdness
Take 主zhu3 again. One common character that contains it is 往, which is actually pronounced wang3. Or take the common surname 冯, which contains 马ma3 but is read Feng2. And so on. Which is why the phonetic hint is just that: a hint.
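To put a rough number on how good a hint is, you can split each pinyin syllable into initial and final and compare. This is a naive Python sketch under my own assumptions: it goes purely by pinyin spelling (so it knows nothing about historical sound changes), it treats y and w as initials, and the names `split_syllable` and `grade` are invented for the example.

```python
# Pinyin initials, two-letter ones first so 'zh' wins over 'z'.
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def split_syllable(pinyin: str) -> tuple[str, str]:
    """Split numbered pinyin into (initial, final), dropping the tone digit."""
    base = pinyin.rstrip("12345")
    for initial in INITIALS:
        if base.startswith(initial):
            return initial, base[len(initial):]
    return "", base


def grade(phonetic_reading: str, char_reading: str) -> str:
    """Grade how well a phonetic component predicts a character's reading."""
    p_initial, p_final = split_syllable(phonetic_reading)
    c_initial, c_final = split_syllable(char_reading)
    if (p_initial, p_final) == (c_initial, c_final):
        return "same syllable"
    if p_final == c_final:
        return "same final"
    return "no match"


# (phonetic, its reading, character, its reading), all taken from this post.
EXAMPLES = [
    ("主", "zhu3", "住", "zhu4"),
    ("工", "gong1", "红", "hong2"),
    ("工", "gong1", "江", "jiang1"),
    ("主", "zhu3", "往", "wang3"),
    ("马", "ma3", "冯", "feng2"),
]

for phonetic, p_reading, char, c_reading in EXAMPLES:
    print(phonetic, "->", char, grade(p_reading, c_reading))
```

By this crude measure 失shi1 -> 跌die1 also comes out as "no match", even though the "die" reading is consistent across that family, so treat the grade as a first pass only.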
Semantic Weirdness
Noticed how 虹 (rainbow) has a 虫 radical? Yeah I don’t know why this is either. 貌mao4 has nothing to do with cats 豸, but has it anyway. Sometimes, you gotta just take it as is.
when it goes less well due to simplification
The main goal of Simplified Chinese is just that: simplification. Sometimes simplification is pretty good, such as the consistent 金->钅, 言->讠, 門->门, or when simplifying phonetic components like 識->识 (戠zhi2->只zhi3), 鐵->铁 (something->失"die"), 藝->艺 (埶yi4->乙yi3, which is one of the most drastic simplifications).
However, sometimes it is bad, because it is, frankly, rather lazy.
Mostly consistent inconsistency due to simplification
In terms of semantic radicals, the left of 狗 is the 犭〔犬〕 dog radical, 豕 (as in 豬) is the pig radical, and 豸 (as in 貓) is the cat radical. However, in a lot of cases, these have been merged into the first one, the simplest one, such that it now effectively means “animal”. 狗 is unchanged, 豬->猪, and 貓->猫. This is, however, not entirely consistent, as 豹bao4 is unchanged, but it’s mostly good enough.
In terms of phonetic radicals, one important one to note is 虫chong2 (bug). 虫 traditionally isn’t really used as a phonetic component. Another character, 蜀shu3, an old name of a kingdom in Sichuan, is. But since 蜀 is complex and has 虫 in it, for many, many characters it was simplified to just 虫. Examples are 烛zhu2 (candle), 独du2 (alone), and 浊zhuo2 (turbid), BUT 屬shu3 (belongs to) was simplified to 属, and 镯zhuo2 (bracelet) kept its 蜀 altogether.
I’ve not been mentioning the semantic components, because they should be somewhat self explanatory - also they’re usually unchanged.
Inconsistency due to simplification
For the sake of simplification, certain characters had components removed entirely, like 廣guang3 (wide) having 黃huang2 removed becoming 广, 厰/廠 to 厂, 醫 to 医, and 聲 to 声, which often removes a phonetic component, putting these in the “inseparable” category.
Other characters were replaced with “vulgar forms”, most coming from historical cursive / calligraphy, with a few being entirely constructed.
Note: 听ting1 (listen) is a Yuan-dynasty vulgar form of 聽, and is not 口 mouth + 斤 catty, but rather 口 mouth + a corrupted form of 厅ting1, hall, which itself is a vulgar (and simplified) form of 廳, ironically containing 聽 as a phonetic.
These vulgar forms more often than not employed empty components, things which are just meaningless scratches there for the sake of it - and these are probably the most annoying to remember.
Take 雚guan4, 堇jin3, and the right side of 漢han4 (this character minus the 氵 doesn’t really exist on its own). These were all replaced with the empty component 又, but not in every character that has them.
For han: 漢han4->汉, 嘆tan4->叹, 難nan2->难, 艱jian1->艰; it’s very consistent
For guan: 觀guan1->观, 歡huan1->欢, but 灌guan4, 罐guan4 are unchanged
For jin: 僅jin3->仅, but 谨jin3〔謹〕 and 勤qin2 keep their 堇
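The three families above can be tallied with a small table. This is just an illustrative Python sketch: the family labels and the hand-written "parts" column are my own assumptions, since a character like 汉 is a single code point and `"又" in "汉"` would simply be false.

```python
from collections import defaultdict

# (family, traditional, simplified, hand-written parts of the simplified form)
ENTRIES = [
    ("han", "漢", "汉", "氵又"),
    ("han", "嘆", "叹", "口又"),
    ("han", "難", "难", "又隹"),
    ("han", "艱", "艰", "又艮"),
    ("guan", "觀", "观", "又见"),
    ("guan", "歡", "欢", "又欠"),
    ("guan", "灌", "灌", "氵雚"),
    ("guan", "罐", "罐", "缶雚"),
    ("jin", "僅", "仅", "亻又"),
    ("jin", "謹", "谨", "讠堇"),
    ("jin", "勤", "勤", "堇力"),
]


def summarize(entries):
    """Per family: (characters that got 又, total characters listed)."""
    counts = defaultdict(lambda: [0, 0])
    for family, _trad, _simp, parts in entries:
        counts[family][1] += 1
        if "又" in parts:  # the old phonetic was swapped for the empty 又
            counts[family][0] += 1
    return {family: tuple(pair) for family, pair in counts.items()}


for family, (replaced, total) in summarize(ENTRIES).items():
    print(f"{family}: {replaced}/{total} replaced with 又")
```

Only the characters listed in this post are counted, so the ratios say nothing about the families as a whole.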
又 is a very popular one, with 树shu4〔樹〕, 邓Deng4〔鄧〕 etc also being 又-replaced. 乂 is also popular, with my most hated character, 赵Zhao4〔趙〕, as well as 风feng1〔風〕, 冈gang1〔岡〕, etc.
Still, I can get not wanting to write 18 strokes for 難.
Conclusion
Anyway, that’s a “brief” overview of roughly how it works. I’m obviously not a linguist, and have oversimplified sometimes to make things easier to understand, but this is roughly the entirety of how phono-semantics (but mostly phonetics) works with regard to Simplified Chinese. In Traditional Chinese you wouldn’t have to worry about the simplification inconsistencies but 憂鬱的臺灣烏龜 so potato potato.