Add Hans and Hant scripts; improve Hani sample #289

Open
wants to merge 1 commit into
base: main

valadaptive
Contributor

This is a somewhat hacky fix to allow properly loading fallbacks for Chinese scripts on Windows.

The "Hans" and "Hant" (Simplified and Traditional Chinese) script identifiers were missing, and the "Hani" sample included a character that is covered by Yu Gothic UI, Windows' default Japanese font, which only contains kanji.

This means that requesting a "Hani" (all Han ideographs) fallback would select a Japanese-only font that's missing some ideographs that seem to be used in Chinese.

I do not actually speak Chinese; I found these characters by looking for gaps in Yu Gothic UI's coverage that Microsoft JhengHei UI fills. Someone more familiar with Chinese and/or Japanese could probably do a better job finding sample characters that are used only in Chinese and not Japanese.

We need to put the Chinese-only characters first in the "Hani" sample because of the way the DirectWrite fallback handling works: we call IDWriteFontFallback::MapCharacters repeatedly, advancing the "head" of the string one character at a time, until we get a match. However, the documentation for MapCharacters says that it actually works by returning "the font that should be used to render the first mappedLength characters of the text". That is, it returns a font that you can use to render the first bit of the string; you then advance and call it again on the next bit, repeating until all characters are mapped. With the way we do things, only one character in each sample (with preference given to earlier ones) needs to match in order for a fallback font to be selected.
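To make the difference between the two call patterns concrete, here is a minimal sketch using a toy model rather than the real DirectWrite API. Everything in it is hypothetical: `Font`, `map_characters`, `first_match`, `map_all`, and `toy_fonts` are illustrative names, the letters `a`, `b`, `c` stand in for ideographs, and the order of the font list stands in for the system's fallback priority.

```rust
/// Toy stand-in for a font: a name plus the characters it covers.
struct Font {
    name: &'static str,
    coverage: &'static str,
}

fn covers(font: &Font, c: char) -> bool {
    font.coverage.contains(c)
}

/// Toy model of the documented contract: report how many leading characters
/// of `text` are mapped, and the font that renders them (`None` if no font
/// covers the first character).
fn map_characters<'a>(text: &[char], fonts: &'a [Font]) -> (usize, Option<&'a Font>) {
    let Some(&first) = text.first() else {
        return (0, None);
    };
    match fonts.iter().find(|font| covers(font, first)) {
        Some(font) => (text.iter().take_while(|&&c| covers(font, c)).count(), Some(font)),
        None => (
            text.iter().take_while(|&&c| !fonts.iter().any(|f| covers(f, c))).count(),
            None,
        ),
    }
}

/// The pattern described above: advance the head one character at a time and
/// stop at the first font returned, so only one sample character has to match.
fn first_match(sample: &str, fonts: &[Font]) -> Option<&'static str> {
    let chars: Vec<char> = sample.chars().collect();
    (0..chars.len()).find_map(|head| map_characters(&chars[head..], fonts).1.map(|font| font.name))
}

/// The pattern the documentation describes: consume `mapped_length` characters
/// per call until the whole string is split into (run, font) pairs.
fn map_all(sample: &str, fonts: &[Font]) -> Vec<(String, Option<&'static str>)> {
    let chars: Vec<char> = sample.chars().collect();
    let mut runs = Vec::new();
    let mut head = 0;
    while head < chars.len() {
        let (len, font) = map_characters(&chars[head..], fonts);
        let len = len.max(1); // always make progress
        runs.push((chars[head..head + len].iter().collect(), font.map(|f| f.name)));
        head += len;
    }
    runs
}

/// Hypothetical fallback list: a Japanese-only font listed ahead of a font
/// that also covers the Chinese-only character `c`.
fn toy_fonts() -> Vec<Font> {
    vec![
        Font { name: "jp-only", coverage: "ab" },
        Font { name: "zh", coverage: "abc" },
    ]
}
```

In this model, `first_match("abc", &toy_fonts())` picks `jp-only` because the shared character comes first, while `first_match("cab", &toy_fonts())` picks `zh`. That is the reason for leading the sample with characters the Japanese font lacks.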

If things worked more like the fontconfig backend seems to, where there's a whole list of fallback families, this would be fine, but the DirectWrite backend only ever returns one fallback, which may not cover all characters in the given sample string. Returning a list of fallbacks would be a less hacky longer-term solution.

This ensures that the samples contain characters *not* included in
Yu Gothic UI, Windows' default Japanese font, which only contains kanji.
waywardmonkeys requested a review from dfrg on March 2, 2025
@xorgy
Member

xorgy commented Mar 3, 2025

I think this is an improvement.
Font selection in Han text is difficult, and to get a good result you need to have locale priority information from the environment.
You are correct about fallback; usually you will be fine, but users who encounter mixed text or text with obscure characters will expect fallback to work for individual characters. I receive text with characters that are only found in one font on my system (TW-Kai-Ext-B), and mixing fonts is the expected fallback behavior.

For example, the idiom 歹鬼𤆬頭 includes characters that are available in TW-Kai (standard), but the third character is in TW-Kai-Ext-B; or on GitHub with Chromium on my machine, three of the four characters come from Segoe UI, and the third one comes from Noto Sans TC.
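As a rough illustration of per-character fallback, the sketch below assigns each character the first family in a list that covers it and merges neighbours that share a family into runs. It is not how any particular backend works: `Family`, `split_into_runs`, and `example_families` are hypothetical names, and the coverage strings only encode the TW-Kai split described above, not the fonts' real coverage.

```rust
/// Hypothetical fallback family: a name plus the characters it covers.
struct Family {
    name: &'static str,
    coverage: &'static str,
}

/// Give each character the first family in `fallbacks` that covers it, then
/// merge consecutive characters that share a family into a single run.
fn split_into_runs(text: &str, fallbacks: &[Family]) -> Vec<(String, Option<&'static str>)> {
    let mut runs: Vec<(String, Option<&'static str>)> = Vec::new();
    for c in text.chars() {
        let family = fallbacks
            .iter()
            .find(|f| f.coverage.contains(c))
            .map(|f| f.name);
        if let Some((run, prev)) = runs.last_mut() {
            if *prev == family {
                run.push(c);
                continue;
            }
        }
        runs.push((c.to_string(), family));
    }
    runs
}

/// Assumed coverage, taken only from the example above: the standard font has
/// three of the four characters and the extension font has the third one.
fn example_families() -> Vec<Family> {
    vec![
        Family { name: "TW-Kai", coverage: "歹鬼頭" },
        Family { name: "TW-Kai-Ext-B", coverage: "𤆬" },
    ]
}
```

Under these assumptions, `split_into_runs("歹鬼𤆬頭", &example_families())` yields three runs: 歹鬼 in TW-Kai, 𤆬 in TW-Kai-Ext-B, and 頭 in TW-Kai again.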
