• neutron
    9 months ago

    If we’re being really pedantic, the last part in Korean is counted with different units:

    • 각 as precomposed character: 1자 (unit ja for CJK characters)
    • 각 (ㄱㅏㄱ) as decomposable components: 3자모 (unit jamo for Hangul components)

    So we could have separate implementations of length() that count such cases by different criteria… But I wouldn’t expect non-speakers of Korean to know all of this.
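
For what it's worth, here is a rough Python sketch of what two such length() variants might look like. The names `length_ja` and `length_jamo` are made up for illustration, and the sketch assumes Unicode normalization is an acceptable stand-in for the two counting rules: NFC for precomposed syllables, NFD for conjoining jamo.

```python
import unicodedata


def length_ja(text: str) -> int:
    """Count code points after NFC, so a precomposed syllable counts as 1."""
    return len(unicodedata.normalize("NFC", text))


def length_jamo(text: str) -> int:
    """Count code points after NFD, so a syllable counts once per jamo."""
    return len(unicodedata.normalize("NFD", text))


print(length_ja("각"))    # 1
print(length_jamo("각"))  # 3
```

This is not Hangul-specific: NFD also splits accented Latin letters into base letter plus combining mark, so a real implementation would probably want to restrict the jamo count to the Hangul syllable range. Also, the standalone compatibility jamo (ㄱㅏㄱ typed as three separate letters) are not joined by NFC, so they count as 3 either way.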

    Plus, what about Chinese characters? Are we supposed to count 人 as one, but 仁 as either one (character) or two (radicals)? It only gets more complicated.