An Introduction to Writing Systems and Unicode

(r12a.github.io)

43 points | by mariuz 3 days ago

4 comments

siruwastaken 5 minutes ago

Are there any developments toward font file standards that can cover the full theoretical code space of Unicode? I've always heard that a single font is limited to a subset of that space.
ks2048 3 hours ago

This site has long been a gem for Unicode and language-related topics. It would be just as good to link to the top level:
https://r12a.github.io/

- mostafah 3 hours ago
  
  Richard is amazing. I briefly worked with him while volunteering on a W3C text layout requirements document. He cares deeply about writing systems, and he has been doing so much valuable work in this space.
vishnuharidas 1 hour ago

Also: UTF-8 Playground: https://utf8-playground.netlify.app
ovciokko 3 hours ago

The text in the images labeled as Simplified Chinese does not really conform to the standard glyph shapes of hanzi as defined by the government of China; the glyphs look more like the standard Japanese kanji shapes.

- mbrubeck 2 hours ago
  
  Can you clarify which characters you're talking about? I don't see any examples of Japanese-specific kanji in the simplified Chinese examples.
  For example, the first image uses 沟 and 时 forms that are found only in simplified Chinese. In both Japanese and traditional Chinese, these are written 溝 and 時.
  The images also correctly use the Chinese forms of 統/统. The Japanese form [0] differs from both and does not appear in these images.
  请 as shown in the image is similarly used only in simplified Chinese, not Japanese. In Japanese, the traditional Chinese form is normally used in handwriting, and an alternate form of the 訁 radical (different from either of the Chinese forms) is often used in printed text.
  [0]: https://en.wiktionary.org/wiki/%E7%B5%B1#Japanese
- dhosek 2 hours ago
  
  One of the big complaints about Han unification in Unicode is that simplified and traditional forms share the same code points, so displaying simplified vs. traditional forms is left to the font.
  
  - renhanxue 1 hour ago
    
    That's not really accurate. An overwhelming majority of the simplified characters have had their own code points in Unicode ever since 1.0. Some more details here: https://r12a.github.io/scripts/chinese/
    
    - dhosek 19 minutes ago
      
      That’s good to know. I’ve heard the complaint offered many times, but don’t have the necessary language skills to know otherwise.
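
The claim in the thread that simplified and traditional forms mostly have separate code points can be checked directly for the characters mentioned above. This is only a minimal sketch: the names `PAIRS`, `code_point`, and `compare_pairs` are made up for illustration, and the four pairs are just the ones quoted in the comments, not a survey of the whole repertoire.

```python
import unicodedata

# Simplified/traditional pairs quoted in the thread (illustrative sample only).
PAIRS = [("沟", "溝"), ("时", "時"), ("统", "統"), ("请", "請")]


def code_point(ch: str) -> str:
    """Format a single character as U+XXXX."""
    return f"U+{ord(ch):04X}"


def compare_pairs(pairs):
    """Return (simplified, its code point, traditional, its code point, distinct?)."""
    return [
        (simp, code_point(simp), trad, code_point(trad), ord(simp) != ord(trad))
        for simp, trad in pairs
    ]


if __name__ == "__main__":
    for simp, simp_cp, trad, trad_cp, distinct in compare_pairs(PAIRS):
        # NFKC normalization leaves each form unchanged, so the two forms
        # stay distinct characters at the encoding level.
        unchanged = unicodedata.normalize("NFKC", simp) == simp
        print(simp, simp_cp, trad, trad_cp, "distinct:", distinct, "NFKC unchanged:", unchanged)
```

For these pairs the two forms are separate characters, so choosing between them is a matter of which code point the text contains rather than which font renders it. Regional glyph differences for a single shared code point, as in the Japanese vs. Chinese shapes discussed above, are a different matter and cannot be seen from code points alone.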