Compound tone diacritics iii #956

eggrobin · 2024-10-21T19:37:30Z

L2/24-232, Unicode request for compound tone diacritics III

[UTC-181-C34] Consensus: Provisionally assign 7 code points U+1ADE..U+1ADF and U+1AEC..U+1AF0 to combining diacritical marks as described in L2/24-232. [Ref: 2.7 in L2/24-228]

[185-C40] Consensus: UTC accepts for encoding in Unicode 18.0 the following 321 Arabic, Armenian, Bengali, Cuneiform, Devanagari, Hebrew, Kana, Khitan, Latin, Mongolian, Phonetic and other symbol characters for which code points have previously been assigned:

Arabic (39 characters—ref. 180-C22, 180-C26): 10EC9..10ECF, 10ED9..10EEE, 10EF0..10EF9
Armenian (3 characters—ref. 179-C46): 0558, 058B..058C
Bengali (1 character—ref. 180-C30): 0984
Cuneiform numerals (12 characters—ref. 182-C3): 1246F, 12475..1247F
Devanagari (1 character—ref. 182-C5): 11B0A
Hebrew (1 character—ref. 182-C4): 05C8
Kana (7 characters—ref. 180-C6, 182-C31, 183-C54, 184-C38): 1B123..1B125, 1B126, 1B127..1B128, 1B168
Khitan (5 characters—ref. 184-C5): 18CD6..18CDA
Latin (54 characters—ref. 181-C8, 181-C10, 182-C6, 182-C7, 182-C8, 182-C9, 183-C8): 2E60..2E63, A7DD, A7E2, AB6C..AB6D, 1DF57..1DF59, 1DF5A..1DF66, 1DF67, 1DF68..1DF81, 1DFCD..1DFCF
Mongolian (1 character—ref. 178-C30): 1879
Phonetic (114 characters—ref. 179-C55, 179-C59, 179-C60, 180-C32, 180-C33, 180-C34, 180-C35, 180-C36, 180-C37, 181-C33, 181-C34, 181-C35, 181-C36, 181-C45, 183-C10): 1ADE..1ADF, 1AEC..1AF0, 208F, 209D..209F, 107BB..107BF, 1DF1F..1DF24, 1DF2B..1DF2C, 1DF2D..1DF3A, 1DF3B..1DF3D, 1DF3E..1DF3F, 1DF40..1DF56, 1DFD0, 1DFD1, 1DFD2..1DFD7, 1DFD8..1DFE8, 1DFE9..1DFF2, 1DFF3..1DFF4, 1DFF5..1DFF9, 1DFFA..1DFFF
Symbols (81 characters—ref. 178-C31, 178-C36, 178-C37, 180-C38, 180-C39, 180-C40, 181-C38, 181-C39, 181-C40, 182-C10, 182-C11, 183-C12, 183-C13, 184-C18): 20C2, 1CEF1..1CEF5, 1D127..1D128, 1D1EB..1D1F6, 1D1F7..1D1FE, 1D1FF, 1D250..1D255, 1D256..1D25A, 1D25B..1D25F, 1D260, 1D261, 1D262..1D27F, 1D280..1D281, 1F1AE, 1F7DA
Tangut (2 characters—ref. 183-C7, 184-C4): 18D1F..18D20
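
The per-script counts and the total of 321 in the consensus above can be checked mechanically. The following is a minimal sketch, not part of unicodetools: the `ACCEPTED` table and the `count_code_points` and `mismatches` helpers are hypothetical names introduced here, and the ranges are copied by hand from the list above.

```python
# Hypothetical helper, not part of unicodetools: checks the arithmetic in 185-C40.
# Each entry is (label, stated count, ranges copied from the consensus text).
ACCEPTED = [
    ("Arabic", 39, "10EC9..10ECF, 10ED9..10EEE, 10EF0..10EF9"),
    ("Armenian", 3, "0558, 058B..058C"),
    ("Bengali", 1, "0984"),
    ("Cuneiform numerals", 12, "1246F, 12475..1247F"),
    ("Devanagari", 1, "11B0A"),
    ("Hebrew", 1, "05C8"),
    ("Kana", 7, "1B123..1B125, 1B126, 1B127..1B128, 1B168"),
    ("Khitan", 5, "18CD6..18CDA"),
    ("Latin", 54, "2E60..2E63, A7DD, A7E2, AB6C..AB6D, 1DF57..1DF59, "
                  "1DF5A..1DF66, 1DF67, 1DF68..1DF81, 1DFCD..1DFCF"),
    ("Mongolian", 1, "1879"),
    ("Phonetic", 114, "1ADE..1ADF, 1AEC..1AF0, 208F, 209D..209F, 107BB..107BF, "
                      "1DF1F..1DF24, 1DF2B..1DF2C, 1DF2D..1DF3A, 1DF3B..1DF3D, "
                      "1DF3E..1DF3F, 1DF40..1DF56, 1DFD0, 1DFD1, 1DFD2..1DFD7, "
                      "1DFD8..1DFE8, 1DFE9..1DFF2, 1DFF3..1DFF4, 1DFF5..1DFF9, "
                      "1DFFA..1DFFF"),
    ("Symbols", 81, "20C2, 1CEF1..1CEF5, 1D127..1D128, 1D1EB..1D1F6, "
                    "1D1F7..1D1FE, 1D1FF, 1D250..1D255, 1D256..1D25A, "
                    "1D25B..1D25F, 1D260, 1D261, 1D262..1D27F, 1D280..1D281, "
                    "1F1AE, 1F7DA"),
    ("Tangut", 2, "18D1F..18D20"),
]


def count_code_points(spec: str) -> int:
    """Count the code points in a comma-separated list of 'XXXX' or 'XXXX..YYYY' items."""
    total = 0
    for item in spec.split(","):
        first, _, last = item.strip().partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start  # a single code point is a range of one
        total += end - start + 1
    return total


def mismatches(accepted):
    """Return (label, stated, actual) for every entry whose stated count is wrong."""
    return [
        (label, stated, count_code_points(spec))
        for label, stated, spec in accepted
        if count_code_points(spec) != stated
    ]


if __name__ == "__main__":
    print("mismatches:", mismatches(ACCEPTED))
    print("total:", sum(count_code_points(spec) for _, _, spec in ACCEPTED))
```

The same `count_code_points` call applied to `"1ADE..1ADF, 1AEC..1AF0"` gives the 7 code points that UTC-181-C34 provisionally assigned.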


markusicu · 2025-11-21T19:41:47Z

unicodetools/data/ucd/dev/LineBreak.txt

+1ABF..1AEA     ; CM # Mn    [44] COMBINING LATIN SMALL LETTER W BELOW..COMBINING UPWARDS ARROW ABOVE
 1AEB           ; GL # Mn         COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
+1AEC..1AF0     ; CM # Mn     [5] COMBINING CARON-ACUTE..COMBINING DOUBLE COMMA ABOVE


When are COMBINING DOUBLE things lb=CM vs. GL?

https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5B%3AName%3D%2FCOMBINING+DOUBLE%2F%3A%5D&g=lb&i=

When they span two base characters, they are GL. When they just go over one base character, they are CM. The naming intersection is a little unfortunate.

Ok, and looking at L2/24-232 the two new double accents below go under just one base character. Thanks!
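
To make the CM/GL distinction in this thread concrete, here is a minimal sketch that parses the three LineBreak.txt lines from the diff hunk above (with the leading `+` diff markers stripped) and looks up a code point's class. The `parse_line_break` and `line_break_class` helpers are hypothetical names for illustration, not unicodetools APIs, and the `"XX"` fallback is an assumption made for this sketch; the real file declares its defaults via `@missing` lines.

```python
import re

# The three data lines from the diff hunk above, without the "+" diff markers.
SAMPLE = """\
1ABF..1AEA     ; CM # Mn    [44] COMBINING LATIN SMALL LETTER W BELOW..COMBINING UPWARDS ARROW ABOVE
1AEB           ; GL # Mn         COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1AEC..1AF0     ; CM # Mn     [5] COMBINING CARON-ACUTE..COMBINING DOUBLE COMMA ABOVE
"""

# "XXXX" or "XXXX..YYYY", then ";", then the property value; comments are ignored.
LINE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")


def parse_line_break(text):
    """Return a list of (start, end, value) tuples for each data line in text."""
    ranges = []
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # blank line, comment, or anything this sketch does not handle
        start = int(match.group(1), 16)
        end = int(match.group(2) or match.group(1), 16)
        ranges.append((start, end, match.group(3)))
    return ranges


def line_break_class(code_point, ranges, default="XX"):
    """Look up the Line_Break value for code_point; default is an assumption of this sketch."""
    for start, end, value in ranges:
        if start <= code_point <= end:
            return value
    return default


RANGES = parse_line_break(SAMPLE)

if __name__ == "__main__":
    # U+1AEB spans two base characters (GL); U+1AF0 sits over one base character (CM).
    for cp in (0x1AEB, 0x1AF0):
        print(f"U+{cp:04X} lb={line_break_class(cp, RANGES)}")
```

A linear scan is enough for three ranges; for the full file a sorted list with `bisect` would be the more usual choice.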

markusicu · 2025-11-21T19:46:42Z

Should we add the NamesList.txt annotations that L2/24-232 proposes? Or leave those to @Ken-Whistler ?

1AF0 COMBINING DOUBLE COMMA ABOVE
→ 0313 COMBINING COMMA ABOVE
→ 02EE MODIFIER LETTER DOUBLE APOSTROPHE

eggrobin · 2025-11-21T22:01:39Z

> Should we add the NamesList.txt annotations that L2/24-232 proposes? Or leave those to @Ken-Whistler ?

We leave the names list to @Ken-Whistler.

markusicu · 2025-11-21T22:32:31Z

Ok, and looking at L2/24-232 the two new double accents below go under just one base character. Thanks!

eggrobin added 4 commits October 21, 2024 21:21

UnicodeData.txt lines from L2/24-232

d5741a6

lb=CM

5f1d87f

Inherited

a138dcc

Regenerate UCD

2a53e24

eggrobin added data-for-new pipeline-recommended-to-UTC labels Oct 21, 2024

eggrobin added 3 commits October 21, 2024 21:48

Diacritic

4be3735

a test

64d4853

Regenerate UCD

76d0b66

eggrobin added pipeline-provisionally-assigned and removed pipeline-recommended-to-UTC labels Nov 7, 2024

eggrobin added 2 commits November 21, 2025 19:08

Merge remote-tracking branch 'la-vache/main' into compound-tone-diacr…

8d695df

…itics-III

Ignore IDNA2008_Category

cd50fe6

eggrobin added pipeline-18.0 and removed pipeline-provisionally-assigned labels Nov 21, 2025

Merge remote-tracking branch 'la-vache/main' into compound-tone-diacr…

4a55689

…itics-III

eggrobin marked this pull request as ready for review November 21, 2025 19:18

eggrobin requested a review from markusicu November 21, 2025 19:18

markusicu reviewed Nov 21, 2025


markusicu approved these changes Nov 21, 2025


eggrobin merged commit 9109e95 into unicode-org:main Nov 21, 2025
19 of 20 checks passed
