I got this from the video (linked), where he talks about the differences between Mandarin and Japanese. It mainly involves 漢字, which are used in both languages, but there are real differences (beyond simplified vs. traditional characters) in both printed and handwritten forms, kind of like this:

I mean, it’s literally the same word in both languages, but they’re different if you look closely. No wonder people mistake Japanese for CHINESE! It’s infuriating when you know what Kanji actually looks like and how it’s pronounced. The issue is that the characters for the most part share the same Unicode code points between Japanese & Mandarin.

Mainly talking about word processors and how Kanji / Hanzi are encoded on computers: what ends up happening is that they overlap, and what you see depends on the font used (Japanese has its own font set while Mandarin uses a completely different one). Look at it like this: which language is the “A”-looking letter from (shown below)?

Both use the same font, but have the stroke on top facing different directions indicating they are from different languages. However, if you don’t pay close attention to the letters in a word, you can get confused about which language it belongs to. That’s the kind of crap that happens when discussing Kanji or Hanzi.

For example, the code point for 軍 is U+8ECD, which is used in both Japanese and Mandarin. It’s basically the same reason the letter “Í” (U+00CD) appears in multiple languages: Spanish, Portuguese, Icelandic, Hungarian, and Czech all share U+00CD, so you may get mixed results in any of those languages.
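
To make the shared code point idea concrete, here is a minimal Python sketch using only the standard library. The helper name `describe` is made up for illustration. It shows that U+8ECD and U+00CD each have a single Unicode identity no matter which language the surrounding text is in; the glyph shape you actually see is chosen by the font.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# The same code point is used whether the text is Japanese or Chinese;
# which glyph shape appears is decided by the font, not by the encoding.
han = "\u8ECD"      # the Han character mentioned above
i_acute = "\u00CD"  # I with acute, shared by Spanish, Portuguese, Icelandic, ...

print(describe(han))
print(describe(i_acute))
```

Nothing in the string itself records whether it is Japanese or Chinese, which is why software usually needs a separate language hint to pick the right font.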

  • Hapankaali@lemmy.world
    10 days ago

    Both use the same font, but have the stroke on top facing different directions indicating they are from different languages.

    Languages using the Latin alphabet use varying sets of diacritics, often to introduce ways to express sounds that may not have been present in (Vulgar) Latin. A particular diacritic or diacritic-letter combination can generally not be associated with any specific Latin-script-based language, of which there are many hundreds if not thousands (depending on where one draws the line between language and dialect). An interesting example is Vietnamese, a tonal language using the Latin alphabet, which uses a large number of diacritics to express tonality.

  • 𝕱𝖎𝖗𝖊𝖜𝖎𝖙𝖈𝖍@lemmy.world
    10 days ago

    If the question is whether the Latin languages use letters differently: yes, every language is different?

    I can speak for English, Spanish, and French. English is a bastard language with more exceptions than rules, as we all know. Spanish mostly uses accents as a pronunciation guide with some exceptions, whereas French accents change the letter sound more consistently. Both can change the meaning. French uses ç but Spanish doesn’t, and Spanish uses ñ but French doesn’t. French is much more contextual.

    Sp: “Si llego a tiempo” (if I arrive in time) vs “Sí, llegó a tiempo” (yes, he arrived in time). Same general sound, different emphasis.

    Fr: «Bon mais sale» (good but dirty) vs «bon maïs salé» (good salted corn). Wildly different sounds and even syllable counts.

    • HobbitFoot
      10 days ago

      To add on to that, I feel like letter combinations also yield wildly different sounds in different languages. For instance, “ll” in English sounds like an “l” while in Spanish it sounds like a “y”.

      • In French, «queue» rhymes with euh, «clown» rhymes with spoon, and «comment» sounds like c’mon. The Spanish ñ is closest to the French gn («mignon») and English ny/ni (“canyon”, “onion”). Don’t get me started on the R sounds.

      • quediuspayu@lemmy.dbzer0.com
        10 days ago

        …it sounds like a “y”

        That is called yeísmo and is characteristic of some dialects; the traditional pronunciation is /ʎ/.

  • FishFace@piefed.social
    9 days ago

    It’s not really clear what your question is. Certainly the pronunciation of the same letter is different between different languages using the same (or nearly the same) alphabet. In some cases, the same pronunciation is indicated by different letters: Swedish and Norwegian use ö and ø, respectively, both to indicate the sound [ø].

    The modern letter j is descended from the Latin letter i. Latin itself did not distinguish them. This is a somewhat similar situation to the various languages using Han characters, where while the characters are broadly speaking the same, there are some variations which carry meaning.

    These differences don’t suffer from the problems of Han unification within Unicode. Because the alphabets are much smaller than the set of Chinese characters, it is no problem to encode every single small difference separately. This means that even letters which are clearly “the same”, like the Greek letter “omicron” and the Latin letter named (in English) “oh”, actually have different code points.
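
As a quick illustration of the omicron point, a small Python sketch (standard library only; the `label` helper is hypothetical) shows that the two look-alike letters are distinct characters as far as Unicode is concerned:

```python
import unicodedata

latin_o = "\u006F"        # Latin small letter o
greek_omicron = "\u03BF"  # Greek small letter omicron


def label(ch: str) -> str:
    """Format a character as 'U+XXXX NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in (latin_o, greek_omicron):
    print(label(ch))

# Visually near-identical in most fonts, but never equal as strings.
print(latin_o == greek_omicron)
```

This is the opposite design choice from Han unification, where visually similar characters across languages were merged into one code point.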

    One last semi-related point: handwriting varies quite widely between different countries using the Latin alphabet, meaning that while on the internet letters may seem to be identical, they can have wildly different shapes!

  • Owl@mander.xyz
    8 days ago

    é and è make different sounds in French, but everybody can notice the difference

    On the other side of things, in Hungarian accents are completely vertical

  • blackbrook@mander.xyz
    10 days ago

    Yes, different languages use different sets of characters, most of them overlapping in the case of Latin-derived characters. It would be vastly less convenient if there were a completely different set (with mostly the same characters) for every single language. And what about variants over time and region and dialect? So it is much better to model them as subsets of the same larger character set. Note that À and Á are different characters with different Unicode code points.
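
For what it's worth, the À vs. Á distinction is easy to check. The sketch below (Python standard library; `decompose` is an illustrative helper name) uses NFD normalization to split each precomposed letter into a base letter plus a combining accent, and the accents differ:

```python
import unicodedata


def decompose(ch: str) -> list[str]:
    """Return the code points of a character after NFD decomposition."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]


a_grave = "\u00C0"  # A with grave accent
a_acute = "\u00C1"  # A with acute accent

print(decompose(a_grave))  # base letter A followed by a combining grave accent
print(decompose(a_acute))  # base letter A followed by a combining acute accent
```

Both share the base letter U+0041, so the "larger character set" view still lets software relate them when needed.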

    The human world is messy with lots of inconsistencies and irregularities, particularly from pre-computer times because humans just deal with them and tolerate mistakes. This is a challenge for modelling properly in computer systems. It does not surprise me that there are even bigger challenges around this for Kanji / Hanzi.